Querying the graph over time

The first set of view functions we will look at are for traveling through time: viewing the graph as it was at a specific point, or between two points (applying a time window). For this, Raphtory provides four functions: at(), window(), expanding() and rolling().

At

The at() function takes a time argument in epoch (integer) or datetime (string/datetime object) format and can be called on a graph, vertex, or edge. It returns an equivalent Graph View, Vertex View or Edge View which includes all updates between the beginning of the graph's history and the provided time (inclusive of the time provided).

The returned object has all of the same functions as its unfiltered counterpart and passes the view criteria on to any entities it returns. For example, if you call at() on a graph and then call vertex(), this returns a Vertex View filtered to the time passed to the graph.

An example of this can be seen in the code below where we print the degree of Lome on the full dataset, at 9:07 on the 14th of June and at 12:17 on the 13th of June.

Info

You will see below that graph.at().vertex() and graph.vertex().at() are synonymous.

We also introduce two new time functions here, start() and end(), which return the time range a view is filtered to, if one has been applied. The example uses their datetime variants, start_date_time() and end_date_time(). In the last line of the example we print the start, earliest time, latest time and end of the vertex to show you how these differ.

v = g.vertex("LOME")

print(f"Across the full dataset {v.name()} interacted with {v.degree()} other monkeys.")

v_at = g.vertex("LOME").at("2019-06-14 9:07:31")
print(
    f"Between {v_at.start_date_time()} and {v_at.end_date_time()}, {v_at.name()} interacted with {v_at.degree()} other monkeys."
)

v_at_2 = g.at(1560428239000).vertex("LOME")  # 13/06/2019 12:17:19 as epoch
print(
    f"Between {v_at_2.start_date_time()} and {v_at_2.end_date_time()}, {v_at_2.name()} interacted with {v_at_2.degree()} other monkeys."
)

print(
    f"Window start: {v_at_2.start_date_time()}, First update: {v_at_2.earliest_date_time()}, Last update: {v_at_2.latest_date_time()}, Window End: {v_at_2.end_date_time()}"
)

Output

Across the full dataset LOME interacted with 18 other monkeys.
Between 2019-06-13 09:50:00 and 2019-06-14 09:07:31.001000, LOME interacted with 9 other monkeys.
Between 2019-06-13 09:50:00 and 2019-06-13 12:17:19.001000, LOME interacted with 5 other monkeys.
Window start: 2019-06-13 09:50:00, First update: 2019-06-13 09:52:00, Last update: 2019-06-13 11:01:00, Window End: 2019-06-13 12:17:19.001000
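
The window end in the output above is one millisecond later than the time passed to at(). A minimal sketch of why, assuming timestamps are UTC epoch milliseconds and that an inclusive at(t) is reported with an exclusive end of t + 1: the helpers `epoch_ms_to_datetime` and `updates_at` are hypothetical names used for illustration only and are not part of Raphtory.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_datetime(t_ms):
    """Convert a millisecond epoch timestamp to a UTC datetime (assumed unit)."""
    return EPOCH + timedelta(milliseconds=t_ms)


def updates_at(update_times_ms, t_ms):
    """Hypothetical stand-in for at(): keep updates up to and including t_ms."""
    return [t for t in update_times_ms if t <= t_ms]


t = 1560428239000  # the epoch value passed to g.at() above
print(epoch_ms_to_datetime(t))      # the requested time
print(epoch_ms_to_datetime(t + 1))  # exclusive end implied by an inclusive at()

# Made-up update times: one before t, one exactly at t, one just after t.
toy_updates = [t - 60_000, t, t + 1]
print(updates_at(toy_updates, t))   # the update exactly at t is kept
```

The second printed value carries the same trailing .001000 as the window end shown in the output, which is consistent with the inclusive behaviour described for at().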

Window

The window() function works the same as the at() function, but allows you to set a start time as well as an end time (inclusive of start, exclusive of end).

This is useful for digging into specific ranges of the history that you are interested in, for example a given day within your data, filtering out everything outside this range. An example of this can be seen below, where we look at the number of times Lome interacts with Nekke within the full dataset and for the single day between the 13th of June and the 14th of June.

Info

We use datetime objects in this example, but it would work exactly the same with string dates and epoch integers.

from datetime import datetime

start_day = datetime.strptime("2019-06-13", "%Y-%m-%d")
end_day = datetime.strptime("2019-06-14", "%Y-%m-%d")
e = g.edge("LOME", "NEKKE")
print(
    f"Across the full dataset {e.src().name()} interacted with {e.dst().name()} {len(e.history())} times"
)
e = e.window(start_day, end_day)
print(
    f"Between {e.start_date_time()} and {e.end_date_time()}, {e.src().name()} interacted with {e.dst().name()} {len(e.history())} times"
)
print(
    f"Window start: {e.start_date_time()}, First update: {e.earliest_date_time()}, Last update: {e.latest_date_time()}, Window End: {e.end_date_time()}"
)

Output

Across the full dataset LOME interacted with NEKKE 41 times
Between 2019-06-13 00:00:00 and 2019-06-14 00:00:00, LOME interacted with NEKKE 8 times
Window start: 2019-06-13 00:00:00, First update: 2019-06-13 10:18:00, Last update: 2019-06-13 15:05:00, Window End: 2019-06-14 00:00:00
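
As a rough illustration of the equivalent epoch bounds for this window, the sketch below converts the two datetime objects to epoch milliseconds and checks the half-open rule (start included, end excluded). It assumes naive datetimes are interpreted as UTC, which is an assumption about Raphtory rather than something stated here; `to_epoch_ms` and `in_window` are hypothetical helpers.

```python
from datetime import datetime, timezone


def to_epoch_ms(naive_dt):
    """Interpret a naive datetime as UTC (an assumption) and return epoch ms."""
    return int(naive_dt.replace(tzinfo=timezone.utc).timestamp() * 1000)


def in_window(t_ms, start_ms, end_ms):
    """Half-open membership test: start is included, end is excluded."""
    return start_ms <= t_ms < end_ms


start_day = datetime.strptime("2019-06-13", "%Y-%m-%d")
end_day = datetime.strptime("2019-06-14", "%Y-%m-%d")
start_ms, end_ms = to_epoch_ms(start_day), to_epoch_ms(end_day)
print(start_ms, end_ms)

print(in_window(start_ms, start_ms, end_ms))  # an update exactly at the start
print(in_window(end_ms, start_ms, end_ms))    # an update exactly at the end
```

Under that assumption, passing the two printed integers to window() should select the same day as the datetime objects.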

Expanding

If you have data covering a large period of time, or have many time points of interest, it is quite likely you will find yourself calling at() over and over. If there is a pattern to these calls, say you are interested in how your graph looks every morning for the last week, you can instead utilise expanding().

expanding() will return an iterable of views as if you called at() from the earliest time to the latest time at increments of a given step.

The step can be given as a simple epoch integer, or a natural language string describing the interval. The latter is converted into an iterator of datetimes, handling corner cases such as varying month lengths and leap years.

Within the string you can reference years, months, weeks, days, hours, minutes, seconds and milliseconds. These can be singular or plural, and the string can include 'and', spaces, and commas to improve readability.
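
To make the string format concrete, here is a minimal sketch of how such an interval string could be parsed into a Python timedelta. This is not Raphtory's parser: `parse_interval` is a hypothetical helper, and it deliberately covers only fixed-length units (weeks down to milliseconds), since months and years need calendar arithmetic that timedelta does not provide.

```python
import re
from datetime import timedelta

# Singular unit name -> timedelta keyword argument.
_UNITS = {
    "week": "weeks",
    "day": "days",
    "hour": "hours",
    "minute": "minutes",
    "second": "seconds",
    "millisecond": "milliseconds",
}


def parse_interval(text):
    """Parse strings such as '2 days, 3 hours and 6 seconds' into a timedelta."""
    total = timedelta()
    # Each match is a number followed by a unit word; 'and', commas and
    # spaces are skipped because they are never preceded by a number.
    for amount, unit in re.findall(r"(\d+)\s*([a-z]+)", text.lower()):
        key = unit.rstrip("s")  # accept singular and plural forms
        if key not in _UNITS:
            raise ValueError(f"unsupported unit in this sketch: {unit}")
        total += timedelta(**{_UNITS[key]: int(amount)})
    return total


print(parse_interval("1 week"))
print(parse_interval("2 days, 3 hours, 12 minutes and 6 seconds"))
```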

In the code below, we can see some examples of this. We first increment through the full history of the graph a week at a time. This creates four views, and for each we ask how many monkey interactions it has seen. You will notice the start time does not change, but the end time increments by 7 days each view.

The second example shows the complexity of increments Raphtory can handle, stepping by 2 days, 3 hours, 12 minutes and 6 seconds each time. We have additionally bounded this expansion with a window between the 13th and 23rd of June to demonstrate how these views may be chained.

print(
    f"The full range of time in the graph is {g.earliest_date_time()} to {g.latest_date_time()}\n"
)

for expanding_g in g.expanding("1 week"):
    print(
        f"From {expanding_g.start_date_time()} to {expanding_g.end_date_time()} there were {expanding_g.num_temporal_edges()} monkey interactions"
    )

print()
start_day = datetime.strptime("2019-06-13", "%Y-%m-%d")
end_day = datetime.strptime("2019-06-23", "%Y-%m-%d")
for expanding_g in g.window(start_day, end_day).expanding(
    "2 days, 3 hours, 12 minutes and 6 seconds"
):
    print(
        f"From {expanding_g.start_date_time()} to {expanding_g.end_date_time()} there were {expanding_g.num_temporal_edges()} monkey interactions"
    )

Output

The full range of time in the graph is 2019-06-13 09:50:00 to 2019-07-10 11:05:00

From 2019-06-13 09:50:00 to 2019-06-20 09:50:00 there were 789 monkey interactions
From 2019-06-13 09:50:00 to 2019-06-27 09:50:00 there were 1724 monkey interactions
From 2019-06-13 09:50:00 to 2019-07-04 09:50:00 there were 2358 monkey interactions
From 2019-06-13 09:50:00 to 2019-07-11 09:50:00 there were 3196 monkey interactions

From 2019-06-13 00:00:00 to 2019-06-15 03:12:06 there were 377 monkey interactions
From 2019-06-13 00:00:00 to 2019-06-17 06:24:12 there were 377 monkey interactions
From 2019-06-13 00:00:00 to 2019-06-19 09:36:18 there were 691 monkey interactions
From 2019-06-13 00:00:00 to 2019-06-21 12:48:24 there were 1143 monkey interactions
From 2019-06-13 00:00:00 to 2019-06-23 16:00:30 there were 1164 monkey interactions
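
The start and end bounds printed above can be reproduced with a few lines of plain Python. The generator below is only a sketch inferred from that output, not Raphtory's implementation: `expanding_bounds` is a hypothetical name, the step is a fixed timedelta, and the stopping rule (stop once a view's end reaches the exclusive end of the data) is an assumption that happens to match both examples.

```python
from datetime import datetime, timedelta


def expanding_bounds(start, end_exclusive, step):
    """Yield (start, end) pairs with a fixed start and an end growing by step."""
    end = start + step
    while True:
        yield start, end
        if end >= end_exclusive:  # this view already covers all the data
            break
        end += step


# Full history: earliest and latest times as printed in the output above.
earliest = datetime(2019, 6, 13, 9, 50)
latest = datetime(2019, 7, 10, 11, 5)
weekly = list(
    expanding_bounds(earliest, latest + timedelta(milliseconds=1), timedelta(weeks=1))
)
for view_start, view_end in weekly:
    print(f"From {view_start} to {view_end}")

# Windowed history: the 13th to the 23rd of June, with the compound step.
step = timedelta(days=2, hours=3, minutes=12, seconds=6)
windowed = list(expanding_bounds(datetime(2019, 6, 13), datetime(2019, 6, 23), step))
for view_start, view_end in windowed:
    print(f"From {view_start} to {view_end}")
```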

Rolling

Where at() has expanding(), window() has rolling(). This function returns an iterable of views, incrementing by a window size and only including the history from inside the window period (inclusive of start, exclusive of end). This allows you to easily extract daily or monthly metrics.

For example, below we take the code from expanding and swap out the function for rolling(). In the first loop we can see both the start date and end date increase by seven days each time, and the number of monkey interactions sometimes decreases as older data is dropped from the window.

print("Rolling 1 week")
for expanding_g in g.rolling(window="1 week"):
    print(
        f"From {expanding_g.start_date_time()} to {expanding_g.end_date_time()} there were {expanding_g.num_temporal_edges()} monkey interactions"
    )

Output

Rolling 1 week
From 2019-06-13 09:50:00 to 2019-06-20 09:50:00 there were 789 monkey interactions
From 2019-06-20 09:50:00 to 2019-06-27 09:50:00 there were 935 monkey interactions
From 2019-06-27 09:50:00 to 2019-07-04 09:50:00 there were 634 monkey interactions
From 2019-07-04 09:50:00 to 2019-07-11 09:50:00 there were 838 monkey interactions

Alongside the window size, rolling() takes an optional step argument which specifies how far along the timeline it should increment before applying the next window. By default this is the same as the window, allowing all updates to be analysed exactly once in non-overlapping windows.

If, however, you would like to have overlapping or fully disconnected windows, you can set a step smaller or greater than the given window size. For example, in the code below we add a step of two days. You can see in the output that the start and end dates increment by two days each view, but are always seven days apart.

print("\nRolling 1 week, stepping 2 days (overlapping window)")
for expanding_g in g.rolling(window="1 week", step="2 days"):
    print(
        f"From {expanding_g.start_date_time()} to {expanding_g.end_date_time()} there were {expanding_g.num_temporal_edges()} monkey interactions"
    )

Output

Rolling 1 week, stepping 2 days (overlapping window)
From 2019-06-08 09:50:00 to 2019-06-15 09:50:00 there were 377 monkey interactions
From 2019-06-10 09:50:00 to 2019-06-17 09:50:00 there were 387 monkey interactions
From 2019-06-12 09:50:00 to 2019-06-19 09:50:00 there were 698 monkey interactions
From 2019-06-14 09:50:00 to 2019-06-21 09:50:00 there were 711 monkey interactions
From 2019-06-16 09:50:00 to 2019-06-23 09:50:00 there were 787 monkey interactions
From 2019-06-18 09:50:00 to 2019-06-25 09:50:00 there were 797 monkey interactions
From 2019-06-20 09:50:00 to 2019-06-27 09:50:00 there were 935 monkey interactions
From 2019-06-22 09:50:00 to 2019-06-29 09:50:00 there were 735 monkey interactions
From 2019-06-24 09:50:00 to 2019-07-01 09:50:00 there were 794 monkey interactions
From 2019-06-26 09:50:00 to 2019-07-03 09:50:00 there were 603 monkey interactions
From 2019-06-28 09:50:00 to 2019-07-05 09:50:00 there were 747 monkey interactions
From 2019-06-30 09:50:00 to 2019-07-07 09:50:00 there were 820 monkey interactions
From 2019-07-02 09:50:00 to 2019-07-09 09:50:00 there were 860 monkey interactions
From 2019-07-04 09:50:00 to 2019-07-11 09:50:00 there were 838 monkey interactions
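
The relationship between window and step can be sketched in plain Python. The generator below is inferred from the two outputs above and is not Raphtory's implementation: `rolling_bounds` is a hypothetical name, and it assumes each view ends at the earliest time plus a multiple of the step, starts one window before that end, and that iteration stops once an end reaches the exclusive end of the data.

```python
from datetime import datetime, timedelta


def rolling_bounds(earliest, end_exclusive, window, step=None):
    """Yield (start, end) pairs of fixed width window, advancing by step."""
    step = window if step is None else step  # default: non-overlapping windows
    end = earliest + step
    while True:
        yield end - window, end
        if end >= end_exclusive:  # the final window reaches the end of the data
            break
        end += step


earliest = datetime(2019, 6, 13, 9, 50)
end_exclusive = datetime(2019, 7, 10, 11, 5) + timedelta(milliseconds=1)

non_overlapping = list(rolling_bounds(earliest, end_exclusive, timedelta(weeks=1)))
overlapping = list(
    rolling_bounds(earliest, end_exclusive, timedelta(weeks=1), timedelta(days=2))
)

print(len(non_overlapping), "views with the default step")
print(len(overlapping), "views with a two day step")
print("First overlapping view:", overlapping[0])
```

Because the first view ends one step after the earliest time, a step smaller than the window places the first start before the earliest update, which matches the 2019-06-08 start in the output above.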

As a small example of how useful this can be, in the following segment we plot the daily unique interactions of Lome via matplotlib in only 10 lines!

Info

We have to recreate the graph in the first section of this code block so that the output can be rendered as part of the documentation. Please ignore this.

# mkdocs: render
###RECREATION OF THE GRAPH SO IT CAN BE RENDERED
import matplotlib.pyplot as plt
import pandas as pd
from raphtory import Graph

edges_df = pd.read_csv(
    "data/OBS_data.txt", sep="\t", header=0, usecols=[0, 1, 2, 3, 4], parse_dates=[0]
)
edges_df["DateTime"] = pd.to_datetime(edges_df["DateTime"]).astype("datetime64[ms]")
edges_df.dropna(axis=0, inplace=True)
edges_df["Weight"] = edges_df["Category"].apply(
    lambda c: 1 if (c == "Affiliative") else (-1 if (c == "Agonistic") else 0)
)

g = Graph.load_from_pandas(
    edges_df=edges_df,
    src="Actor",
    dst="Recipient",
    time="DateTime",
    layer_in_df="Behavior",
    props=["Weight"],
)

###ACTUAL IMPORT CODE
importance = []
time = []
for rolling_lome in g.vertex("LOME").rolling("1 day"):
    importance.append(rolling_lome.degree())
    time.append(rolling_lome.end_date_time())

plt.plot(time, importance, marker="o")
plt.xlabel("Date")
plt.xticks(rotation=45)
plt.ylabel("Daily Unique Interactions")
plt.title("Lome's daily interaction count")
plt.grid(True)