Saving and loading graphs
The fastest way to ingest a graph is to load one from Raphtory's on-disk format using the
load_from_file() function on the graph. This requires first ingesting the data via one of the prior methods and saving the resulting graph with
save_to_file(), but it means that for large datasets you do not need to parse the data every time you run a Raphtory script.
This is similar to pickling and can drastically reduce ingestion time, especially if your datasets require a lot of preprocessing.
In the example below we ingest the edge dataframe from the last section, save the resulting graph, and reload it into a second graph. Both graphs are printed to show that they contain the same data.
Because Raphtory is under active development, a saved graph is not guaranteed to be compatible across versions.
    from raphtory import Graph
    import pandas as pd

    edges_df = pd.read_csv("data/network_traffic_edges.csv")
    edges_df["timestamp"] = pd.to_datetime(edges_df["timestamp"]).astype(
        "datetime64[ms, UTC]"
    )

    g = Graph()
    g.load_edges_from_pandas(
        edge_df=edges_df,
        src_col="source",
        dst_col="destination",
        time_col="timestamp",
        props=["data_size_MB"],
        layer_in_df="transaction_type",
    )
    g.save_to_file("/tmp/saved_graph")
    loaded_graph = Graph.load_from_file("/tmp/saved_graph")
    print(g)
    print(loaded_graph)
    Graph(number_of_edges=7, number_of_vertices=5, number_of_temporal_edges=7, earliest_time="1693555200000", latest_time="1693557000000")
    Graph(number_of_edges=7, number_of_vertices=5, number_of_temporal_edges=7, earliest_time="1693555200000", latest_time="1693557000000")
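In a script that runs repeatedly, one way to get the benefit of the saved format is a "load if present, otherwise build and save" pattern. The sketch below is a minimal, library-agnostic illustration of that idea, not part of Raphtory: load_or_build and its build, load and save parameters are hypothetical names introduced here, and the Raphtory wiring in the trailing comment assumes only the Graph.load_from_file() and save_to_file() calls shown in this section.

```python
from pathlib import Path
from typing import Callable, TypeVar

G = TypeVar("G")


def load_or_build(
    path: str,
    build: Callable[[], G],
    load: Callable[[str], G],
    save: Callable[[G, str], None],
) -> G:
    """Return the object saved at `path`, building and saving it first if needed."""
    if Path(path).exists():
        # Fast path: a previous run already saved the result.
        return load(path)
    # Slow path: parse the raw data once, then save it for later runs.
    obj = build()
    save(obj, path)
    return obj


# Hypothetical wiring for Raphtory (not executed here). build_graph stands for
# whatever ingestion code you already have and must return a Graph:
#
#   g = load_or_build(
#       "/tmp/saved_graph",
#       build=build_graph,
#       load=Graph.load_from_file,
#       save=lambda graph, p: graph.save_to_file(p),
#   )
```

Because a saved graph may not load under a different Raphtory version, deleting the saved file after an upgrade forces the helper to rebuild it from the raw data on the next run.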