Exporting to Pandas dataframes
As we ingested our data from a set of dataframes, let's start by looking at how to convert a graph back into them. For this, Raphtory provides two functions, to_vertex_df() and to_edge_df(), as part of the export module. As the names suggest, these extract information about the vertices and edges respectively.
Vertex Dataframe
To explore the use of to_vertex_df(), we first call the function passing only our graph. As we don't set any flags, this exports the full history, which can be seen within the printed dataframe. The properties column for ServerA has been extracted and printed separately so that it can be seen in full.
To demonstrate some of the flags which can be utilised, we call to_vertex_df() again, this time disabling the update and property history. You can see in the second set of prints that now only the most recent property values are present and the update_history column has been removed.
import raphtory.export as ex
# Export every vertex with its full update and property history (the default)
df = ex.to_vertex_df(traffic_graph)
print("Dataframe with full history:")
print(f"{df}\n")
print("The properties of ServerA:")
print(f"{df[df['id'] == 'ServerA'].properties.iloc[0]}\n")
# Export again, keeping only the latest property values and dropping update_history
df = ex.to_vertex_df(
    traffic_graph, include_update_history=False, include_property_histories=False
)
print("Dataframe with no history:")
print(f"{df}\n")
print("The properties of ServerA:")
print(f"{df[df['id'] == 'ServerA'].properties.iloc[0]}\n")
Output
Dataframe with full history:
id ... update_history
0 ServerA ... [1693555200000, 1693555500000, 1693556400000]
1 ServerB ... [1693555200000, 1693555500000, 1693555800000, ...
2 ServerC ... [1693555500000, 1693555800000, 1693556400000, ...
3 ServerD ... [1693555800000, 1693556100000, 1693557000000]
4 ServerE ... [1693556100000, 1693556400000, 1693556700000]
[5 rows x 3 columns]
The properties of ServerA:
{'_id': 'ServerA', 'server_name': 'Alpha', 'datasource': 'data/network_traffic_edges.csv', 'hardware_type': 'Blade Server', 'OS_version': [(1693555200000, 'Ubuntu 20.04')], 'uptime_days': [(1693555200000, 120)], 'primary_function': [(1693555200000, 'Database')]}
Dataframe with no history:
id properties
0 ServerA {'_id': 'ServerA', 'server_name': 'Alpha', 'ha...
1 ServerB {'uptime_days': 45, 'server_name': 'Beta', 'pr...
2 ServerC {'primary_function': 'File Storage', '_id': 'S...
3 ServerD {'uptime_days': 60, 'server_name': 'Delta', 'd...
4 ServerE {'server_name': 'Echo', 'OS_version': 'Red Hat...
The properties of ServerA:
{'_id': 'ServerA', 'server_name': 'Alpha', 'hardware_type': 'Blade Server', 'uptime_days': 120, 'datasource': 'data/network_traffic_edges.csv', 'OS_version': 'Ubuntu 20.04', 'primary_function': 'Database'}
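To make the difference between the two ServerA printouts concrete, here is a minimal plain-Python sketch that collapses a history-style properties dictionary into its latest values. It does not call Raphtory and is not how the library does this internally. The helper latest_values is hypothetical, and it assumes that a temporal property is a list of (timestamp, value) tuples while anything else is a constant property, which matches the output shown above.

```python
# Hypothetical helper, not part of Raphtory: collapse property histories
# into their most recent values.
def latest_values(properties):
    """Return a copy of `properties` with each history replaced by its latest value."""
    collapsed = {}
    for name, value in properties.items():
        # Assumption: a temporal property is a non-empty list of
        # (timestamp, value) tuples; everything else is kept unchanged.
        is_history = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, tuple) and len(item) == 2 for item in value)
        )
        if is_history:
            # Pick the value attached to the largest timestamp.
            _, latest = max(value, key=lambda item: item[0])
            collapsed[name] = latest
        else:
            collapsed[name] = value
    return collapsed


# ServerA's properties as printed in the full-history export above.
server_a_history = {
    "_id": "ServerA",
    "server_name": "Alpha",
    "datasource": "data/network_traffic_edges.csv",
    "hardware_type": "Blade Server",
    "OS_version": [(1693555200000, "Ubuntu 20.04")],
    "uptime_days": [(1693555200000, 120)],
    "primary_function": [(1693555200000, "Database")],
}

print(latest_values(server_a_history))
```

Applied to ServerA, this yields the same key/value pairs as the export with include_property_histories=False, although the key order may differ.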
Edge Dataframe
Exporting to an edge dataframe via to_edge_df() works exactly the same as to_vertex_df(). By default this will export the full update history for each edge, split by edge layer. The flags to remove the history are once again available, as well as a flag to explode the edges and view each update individually.
In the example below, we first create a subgraph of the monkey interactions, selecting some monkeys we are interested in (ANGELE and FELIPE). This isn't a required step, but it demonstrates that the export also works on graph views.
We then call to_edge_df() on the subgraph, setting no flags. In the output you can see the update/property history for each interaction type (layer) between ANGELE and FELIPE.
Finally, we call to_edge_df() again, turning off the property history and exploding the edges. In the output you can see each interaction which occurred between ANGELE and FELIPE.
Info
For the exploded export, we have further reduced the graph to only one layer (Grunting-Lipsmacking) to reduce the output size.
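import raphtory.export as ex
# Restrict the graph to the two monkeys of interest (a graph view)
subgraph = monkey_graph.subgraph(["ANGELE", "FELIPE"])
# Default export: one row per edge and layer, with full histories
df = ex.to_edge_df(subgraph)
print("Interactions between Angele and Felipe:")
print(f"{df}\n")
# Keep a single layer, then export one row per individual update
grunting_graph = subgraph.layer("Grunting-Lipsmacking")
df = ex.to_edge_df(grunting_graph, explode_edges=True, include_property_histories=False)
print("Exploding the grunting-Lipsmacking layer")
print(df)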
Output
Interactions between Angele and Felipe:
src ... update_history
0 ANGELE ... [1560419400000, 1560419400000, 1560419460000, ...
1 ANGELE ... [1560422580000, 1560441780000, 1560441780000, ...
2 ANGELE ... [1560855660000]
3 ANGELE ... [1560526320000, 1560855660000, 1561042620000]
4 ANGELE ... [1562253540000]
5 ANGELE ... [1561720320000]
6 FELIPE ... [1560419460000, 1560419520000, 1560419580000, ...
7 FELIPE ... [1562321580000]
8 FELIPE ... [1560526320000, 1561972860000, 1562253540000]
9 FELIPE ... [1561110180000]
10 FELIPE ... [1562057520000]
11 FELIPE ... [1560526260000, 1562253540000, 1562321580000]
12 FELIPE ... [1560526320000]
13 FELIPE ... [1562253540000]
14 FELIPE ... [1562057520000, 1562671200000]
[15 rows x 5 columns]
Exploding the grunting-Lipsmacking layer
src dst layer properties update_history
0 ANGELE FELIPE Grunting-Lipsmacking {'Weight': 1} 1560526320000
1 ANGELE FELIPE Grunting-Lipsmacking {'Weight': 1} 1560855660000
2 ANGELE FELIPE Grunting-Lipsmacking {'Weight': 1} 1561042620000
3 FELIPE ANGELE Grunting-Lipsmacking {'Weight': 1} 1560526320000
4 FELIPE ANGELE Grunting-Lipsmacking {'Weight': 1} 1561972860000
5 FELIPE ANGELE Grunting-Lipsmacking {'Weight': 1} 1562253540000
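The update_history values in these exports are large integers. As a rough sketch, assuming they are Unix epoch timestamps in milliseconds (which is consistent with the magnitudes printed above, but is an assumption here), they can be turned into readable UTC datetimes with the Python standard library. The helper ms_to_utc is hypothetical, and the rows are copied by hand from the exploded output rather than read from the dataframe.

```python
from datetime import datetime, timezone


def ms_to_utc(timestamp_ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # Assumption: the export stores times as milliseconds since 1970-01-01 UTC.
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


# (src, dst, update_history) copied from the exploded output above.
exploded_rows = [
    ("ANGELE", "FELIPE", 1560526320000),
    ("ANGELE", "FELIPE", 1560855660000),
    ("ANGELE", "FELIPE", 1561042620000),
    ("FELIPE", "ANGELE", 1560526320000),
    ("FELIPE", "ANGELE", 1561972860000),
    ("FELIPE", "ANGELE", 1562253540000),
]

for src, dst, timestamp_ms in exploded_rows:
    print(f"{src} -> {dst} at {ms_to_utc(timestamp_ms).isoformat()}")
```

When working on the exploded dataframe directly, pandas' pd.to_datetime(df["update_history"], unit="ms") should give an equivalent conversion for the whole column, under the same millisecond assumption.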