Direct Updates
Now that we have a graph, we can directly update it with the add_node() and add_edge() functions.
Adding nodes
To add a node we need a unique id to represent it and an update timestamp to specify when in the history of your data this node addition took place. In the example below we add node 10 at timestamp 1.
Info
If your data doesn't have any timestamps, don't fret! You can just set a constant value like 1 for all additions into the graph.
from raphtory import Graph
g = Graph()
v = g.add_node(timestamp=1, id=10)
print(g)
print(v)
Output
Graph(number_of_nodes=1, number_of_edges=0, number_of_temporal_edges=0, earliest_time=1, latest_time=1)
Node(name=10, earliest_time=1, latest_time=1)
Printing the graph and the returned node shows that the update was successful and that the earliest and latest times have been updated.
Adding edges
All graphs in Raphtory are directed, meaning edge additions must specify a timestamp (the same as for add_node()), the source node the edge starts from and the destination node the edge ends at.
As an example, below we add an edge to the graph from 15 to 16 at timestamp 1.
Info
You will notice in the output that the graph reports two nodes as well as the edge. This is because Raphtory automatically creates the source and destination nodes at the same time if they do not yet exist in the graph. This keeps the graph consistent and avoids hanging edges.
from raphtory import Graph
g = Graph()
e = g.add_edge(timestamp=1, src=15, dst=16)
print(g)
print(e)
Output
Graph(number_of_nodes=2, number_of_edges=1, number_of_temporal_edges=1, earliest_time=1, latest_time=1)
Edge(source=15, target=16, earliest_time=1, latest_time=1)
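The automatic node creation described above can be pictured with a small toy model. This is only a sketch in plain Python, assuming nothing about Raphtory's internals: the ToyGraph class and its attributes are hypothetical names used for illustration, and only the add_node() and add_edge() argument names mirror the examples in this guide.

```python
class ToyGraph:
    """Toy stand-in for a directed temporal graph; NOT Raphtory's implementation."""

    def __init__(self):
        self.nodes = {}  # node id -> sorted list of update timestamps
        self.edges = {}  # (src, dst) -> sorted list of update timestamps

    def add_node(self, timestamp, id):
        history = self.nodes.setdefault(id, [])
        history.append(timestamp)
        history.sort()

    def add_edge(self, timestamp, src, dst):
        # Record both endpoints first, so an edge can never point at a missing node.
        self.add_node(timestamp, src)
        self.add_node(timestamp, dst)
        history = self.edges.setdefault((src, dst), [])
        history.append(timestamp)
        history.sort()


toy = ToyGraph()
toy.add_edge(timestamp=1, src=15, dst=16)
print(len(toy.nodes), len(toy.edges))  # prints "2 1"
```

In this sketch an edge update also counts as an update to both endpoints, which is consistent with the node latest_time values shown in the outputs on this page.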
Accepted ID types
The data you want to use for node IDs may not always be integers; they can often be unique strings like a person's username or a blockchain wallet hash. As such, add_node() and add_edge() will also accept strings for their id, src and dst arguments.
Below you can see we are adding two nodes to the graph, User 1 and User 2, and an edge between them.
from raphtory import Graph
g = Graph()
g.add_node(timestamp=123, id="User 1")
g.add_node(timestamp=456, id="User 2")
g.add_edge(timestamp=789, src="User 1", dst="User 2")
print(g.node("User 1"))
print(g.node("User 2"))
print(g.edge("User 1", "User 2"))
Output
Node(name=User 1, earliest_time=123, latest_time=789)
Node(name=User 2, earliest_time=456, latest_time=789)
Edge(source=User 1, target=User 2, earliest_time=789, latest_time=789)
Warning
Note: A graph can index nodes by either integers or strings, not both at the same time. This means, for example, you cannot have User 1 (a string) and 200 (an integer) as ids in the same graph.
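One way to respect the restriction in the warning above is to check your ids before loading them. The helper below is a minimal sketch in plain Python; check_id_types is a hypothetical name and is not part of Raphtory.

```python
def check_id_types(ids):
    """Return the single id type in use (int or str), or raise if the ids are mixed.

    Hypothetical pre-ingestion helper; not a Raphtory function.
    """
    kinds = {type(i) for i in ids}
    unsupported = kinds - {int, str}
    if unsupported:
        names = sorted(k.__name__ for k in unsupported)
        raise TypeError(f"unsupported id types: {names}")
    if len(kinds) > 1:
        raise TypeError("ids mix integers and strings; pick one type per graph")
    return kinds.pop() if kinds else None


print(check_id_types(["User 1", "User 2"]))  # prints "<class 'str'>"

try:
    check_id_types(["User 1", 200])
except TypeError as err:
    print(err)
```

If a dataset does mix the two, converting every id with str() before ingestion is one simple way to make it uniform.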
Accepted timestamps
While integer-based timestamps (like in the above examples) can represent both logical time and epoch time, datasets often have their timestamps stored in human-readable formats or special datetime objects. As such, add_node() and add_edge() can accept integers, datetime strings and datetime objects interchangeably.
Below we can see node 10 being added into the graph at 2021-02-03 14:01:00 and 2021-01-01 12:32:00. The first timestamp is kept as a string, with Raphtory internally handling the conversion, and the second has been converted into a Python datetime object before ingestion. This datetime object can also have a timezone, with Raphtory storing everything internally in UTC.
from raphtory import Graph
from datetime import datetime
g = Graph()
g.add_node(timestamp="2021-02-03 14:01:00", id=10)
# Create a python datetime object
datetime_obj = datetime(2021, 1, 1, 12, 32, 0, 0)
g.add_node(timestamp=datetime_obj, id=10)
print(g)
print(g.node(id=10).history())
print(g.node(id=10).history_date_time())
Output
Graph(number_of_nodes=1, number_of_edges=0, number_of_temporal_edges=0, earliest_time=1609504320000, latest_time=1612360860000)
[1609504320000, 1612360860000]
[datetime.datetime(2021, 1, 1, 12, 32, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 2, 3, 14, 1, tzinfo=datetime.timezone.utc)]
In our output we can see the history of node 10 contains the two times at which we have added it into the graph (maintained in ascending order), returned both as Unix epoch integers (in milliseconds) and as datetime objects.
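To see where the integers in the output above come from, the conversion can be reproduced with the Python standard library alone. This is a sketch, not Raphtory's own parsing code: to_epoch_millis is a hypothetical helper, and it assumes that timestamps without a timezone are interpreted as UTC, which is what the output above suggests.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string or a datetime to Unix epoch milliseconds."""
    if isinstance(value, str):
        value = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    if value.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        value = value.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (value - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2021-02-03 14:01:00"))             # prints 1612360860000
print(to_epoch_millis(datetime(2021, 1, 1, 12, 32, 0)))   # prints 1609504320000
```

A timezone-aware datetime is shifted to the same instant in UTC by the subtraction, so 13:32 at UTC+1 and 12:32 at UTC give the same integer.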
Properties
Alongside the structural update history, Raphtory can maintain the changing value of properties associated with nodes and edges. Both the add_node() and add_edge() functions have an optional parameter properties which takes a dictionary of key-value pairs to be stored at the given timestamp.
The graph itself may also have its own global properties via the add_property() function, which takes only a timestamp and a properties dictionary.
Properties can consist of primitives (Integer, Float, String, Boolean, Datetime) and structures (Dictionary, List, Graph). This allows you to store basic values as well as build complex hierarchical models, depending on your use case.
In the example below we are using all of these functions to add a mixture of properties to a node, an edge and the graph.
Warning
Please note that once a property key is associated with one of the above types for a given node/edge/graph, attempting to add a value of a different type under the same key will result in an error.
from raphtory import Graph
from datetime import datetime
g = Graph()
# Primitive type properties added to a node
g.add_node(
    timestamp=1,
    id="User 1",
    properties={"count": 1, "greeting": "hi", "encrypted": True},
)
g.add_node(
    timestamp=2,
    id="User 1",
    properties={"count": 2, "balance": 0.6, "encrypted": False},
)
g.add_node(
    timestamp=3,
    id="User 1",
    properties={"balance": 0.9, "greeting": "hello", "encrypted": True},
)
# Dictionaries and Lists added to a graph
g.add_property(
    timestamp=1,
    properties={
        "inner data": {"name": "bob", "value list": [1, 2, 3]},
        "favourite greetings": ["hi", "hello", "howdy"],
    },
)
datetime_obj = datetime.strptime("2021-01-01 12:32:00", "%Y-%m-%d %H:%M:%S")
g.add_property(
    timestamp=2,
    properties={
        "inner data": {
            "date of birth": datetime_obj,
            "fruits": {"apple": 5, "banana": 3},
        }
    },
)
# Graph property on an edge
g2 = Graph()
g2.add_node(timestamp=123, id="Inner User")
g.add_edge(timestamp=4, src="User 1", dst="User 2", properties={"i_graph": g2})
# Printing everything out
v = g.node(id="User 1")
e = g.edge(src="User 1", dst="User 2")
print(g)
print(v)
print(e)
Output
Graph(number_of_nodes=2, number_of_edges=1, number_of_temporal_edges=1, earliest_time=1, latest_time=4, properties=Properties({favourite greetings: [hi, hello, howdy], inner data: {date of birth: 2021-01-01 12:32:00, fruits: {apple: 5, banana: 3}}}))
Node(name=User 1, earliest_time=1, latest_time=4, properties=Properties({greeting: hello, count: 2, encrypted: true, balance: 0.9}))
Edge(source=User 1, target=User 2, earliest_time=4, latest_time=4, properties={i_graph: Graph(num_nodes=1, num_edges=0)})
Info
You will see in the output that when we print these, only the latest values are shown. The older values haven't been lost; in fact, the history of all of these different property types can be queried, explored and aggregated, as you will see in Property Queries.
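The two behaviours described in this section (one fixed type per key, and a printed view that shows only the latest value while the history is kept) can be illustrated with a toy model. The sketch below is plain Python and makes no claim about how Raphtory stores properties; PropertyHistory and its methods are hypothetical names.

```python
class PropertyHistory:
    """Toy model of temporal properties on one entity; NOT Raphtory's implementation."""

    def __init__(self):
        self._types = {}    # key -> type of the first value stored under that key
        self._history = {}  # key -> list of (timestamp, value) updates

    def add(self, timestamp, properties):
        # Validate the whole batch first so a bad update changes nothing.
        for key, value in properties.items():
            expected = self._types.get(key, type(value))
            if type(value) is not expected:
                raise TypeError(
                    f"{key!r} holds {expected.__name__}, got {type(value).__name__}"
                )
        for key, value in properties.items():
            self._types.setdefault(key, type(value))
            self._history.setdefault(key, []).append((timestamp, value))

    def history(self, key):
        """All updates for a key, ordered by timestamp."""
        return sorted(self._history.get(key, []), key=lambda update: update[0])

    def latest(self):
        """Only the most recent value per key, similar to the printed output above."""
        return {key: self.history(key)[-1][1] for key in self._history}


user = PropertyHistory()
user.add(1, {"count": 1, "greeting": "hi", "encrypted": True})
user.add(2, {"count": 2, "balance": 0.6, "encrypted": False})
user.add(3, {"balance": 0.9, "greeting": "hello", "encrypted": True})

print(user.latest())
print(user.history("count"))  # prints [(1, 1), (2, 2)]
```

Using the same three updates as the node in the example above, latest() yields count 2, greeting hello, encrypted True and balance 0.9, while history() still returns every earlier value. Adding a string under the count key would raise a TypeError in this sketch.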
Constant Properties
Alongside the temporal properties which have a value history, Raphtory also provides constant properties which have an immutable value. These are useful when you know a value won't change, or when you are adding metadata to your graph that does not belong to a specific point in time in the given context. To add these into your model, the graph, node and edge have the add_constant_properties() function, which takes a single properties dict argument.
You can see in the example below three different constant properties being added to the graph, node and edge.
from raphtory import Graph
from datetime import datetime
g = Graph()
v = g.add_node(timestamp=1, id="User 1")
e = g.add_edge(timestamp=2, src="User 1", dst="User 2")
g.add_constant_properties(properties={"name": "Example Graph"})
v.add_constant_properties(
    properties={"date of birth": datetime.strptime("1990-02-03", "%Y-%m-%d")},
)
e.add_constant_properties(properties={"data source": "https://link-to-repo.com"})
print(g)
print(v)
print(e)
Output
Graph(number_of_nodes=2, number_of_edges=1, number_of_temporal_edges=1, earliest_time=1, latest_time=2, properties=Properties({name: Example Graph}))
Node(name=User 1, earliest_time=1, latest_time=2, properties=Properties({date of birth: 1990-02-03 00:00:00}))
Edge(source=User 1, target=User 2, earliest_time=2, latest_time=2, properties={data source: https://link-to-repo.com})
Edge Layers
If you have worked with other graph libraries you may be expecting two calls to add_edge() between the same nodes to generate two distinct edge objects. As we have seen above, in Raphtory, these calls will append the information together into the history of a single edge.
These edges can be exploded to interact with all updates independently (as you shall see in Exploded edges), but Raphtory also allows you to represent totally different relationships between the same nodes via edge layers.
The add_edge() function takes a second optional parameter, layer, allowing the user to name the type of relationship being added. All calls to add_edge() with the same layer value will be stored together, allowing them to be accessed separately or merged with other layers as required.
You can see this in the example below where we add five updates between Person 1 and Person 2 across the layers Friends, Co Workers and Family. When we query the history of the weight property on the edge we initially get all of the values back. However, after applying the layers() graph view we only return updates from Co Workers and Family.
from raphtory import Graph
g = Graph()
g.add_edge(
    timestamp=1,
    src="Person 1",
    dst="Person 2",
    properties={"weight": 10},
    layer="Friends",
)
g.add_edge(
    timestamp=2,
    src="Person 1",
    dst="Person 2",
    properties={"weight": 13},
    layer="Friends",
)
g.add_edge(
    timestamp=3,
    src="Person 1",
    dst="Person 2",
    properties={"weight": 20},
    layer="Co Workers",
)
g.add_edge(
    timestamp=4,
    src="Person 1",
    dst="Person 2",
    properties={"weight": 17},
    layer="Friends",
)
g.add_edge(
    timestamp=5,
    src="Person 1",
    dst="Person 2",
    properties={"weight": 35},
    layer="Family",
)
unlayered_edge = g.edge("Person 1", "Person 2")
layered_edge = g.layers(["Co Workers", "Family"]).edge("Person 1", "Person 2")
print(unlayered_edge.properties.temporal.get("weight").values())
print(layered_edge.properties.temporal.get("weight").values())
Output
[10, 13, 20, 17, 35]
[20, 35]
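The effect of the layer filter can be reproduced with a few lines of plain Python. This is only a conceptual sketch of the idea, not how Raphtory implements layers: the updates list and the weights() function are hypothetical, with the data copied from the example above.

```python
# (timestamp, layer, weight) for each update between Person 1 and Person 2
updates = [
    (1, "Friends", 10),
    (2, "Friends", 13),
    (3, "Co Workers", 20),
    (4, "Friends", 17),
    (5, "Family", 35),
]


def weights(updates, layers=None):
    """Weight values in timestamp order, optionally restricted to the given layers."""
    ordered = sorted(updates, key=lambda update: update[0])
    return [
        weight
        for _, layer, weight in ordered
        if layers is None or layer in layers
    ]


print(weights(updates))                            # prints [10, 13, 20, 17, 35]
print(weights(updates, {"Co Workers", "Family"}))  # prints [20, 35]
```

All five updates belong to one edge; selecting layers only changes which of them are visible.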