Quickstart

The Python client is designed to mirror the GDS Cypher API in Python code. It translates each Python call into the corresponding Cypher query and runs that query on the Neo4j server over a Neo4j Python driver connection.
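
To make the Python-to-Cypher correspondence concrete, here is a minimal sketch of how a call might map to a Cypher `CALL` statement. The helper `to_cypher_call` is hypothetical and not part of the client. It is only an illustration, and the real client may build and parameterize its queries differently.

```python
import json


def to_cypher_call(procedure, graph_name, **config):
    """Build a Cypher CALL string for a GDS procedure (hypothetical helper)."""
    # Cypher map literals use unquoted keys; values are rendered as JSON
    # literals here, which is enough for numbers, booleans, and strings.
    entries = ", ".join(f"{key}: {json.dumps(value)}" for key, value in config.items())
    return f"CALL {procedure}({json.dumps(graph_name)}, {{{entries}}})"


# Roughly what `gds.pageRank.mutate(G, tolerance=0.5, mutateProperty="rank")`
# corresponds to on the Cypher side, for a graph named "neo4j-offices".
query = to_cypher_call(
    "gds.pageRank.mutate", "neo4j-offices", tolerance=0.5, mutateProperty="rank"
)
print(query)
```

The Python method path (`gds.pageRank.mutate`) matches the Cypher procedure name, and the keyword arguments become the procedure's configuration map.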

Import and setup

Use the Neo4j URI and credentials according to your setup.

from graphdatascience import GraphDataScience

# For example, in a local setup `NEO4J_URI` would be "neo4j://127.0.0.1:7687".
gds = GraphDataScience(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))

The GraphDataScience object requires the Neo4j database to be available when it is constructed, and it uses the database named neo4j by default. If that database does not exist or you want to use a different one, pass the database keyword parameter:

gds = GraphDataScience(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD), database="my-db")

You can also change the database after creating the GraphDataScience object:

gds.set_database("my-db")
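
The names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` in the snippets above are placeholders. One way to supply them is through environment variables, sketched below. The variable names, the defaults, and the `load_connection_settings` helper are assumptions made for illustration, not conventions of the client.

```python
import os


def load_connection_settings(env=os.environ):
    """Collect connection settings from environment variables (hypothetical helper)."""
    password = env.get("NEO4J_PASSWORD")
    if password is None:
        raise RuntimeError("NEO4J_PASSWORD must be set")
    return {
        # Assumed default: a local Neo4j instance.
        "uri": env.get("NEO4J_URI", "neo4j://127.0.0.1:7687"),
        "auth": (env.get("NEO4J_USER", "neo4j"), password),
        # None leaves the choice of database to the client's default.
        "database": env.get("NEO4J_DATABASE"),
    }


# Usage (requires a running Neo4j server with GDS installed):
# from graphdatascience import GraphDataScience
# settings = load_connection_settings()
# gds = GraphDataScience(
#     settings["uri"], auth=settings["auth"], database=settings["database"]
# )
```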

The Python client has dedicated support for Aura Graph Analytics.

This example shows how to instantiate the GraphDataScience object using an Aura API key pair and AuraDB connection information.

from graphdatascience.session import DbmsConnectionInfo, GdsSessions, AuraAPICredentials, SessionMemory

sessions = GdsSessions(api_credentials=AuraAPICredentials(AURA_API_CLIENT_ID, AURA_API_CLIENT_SECRET))

# `NEO4J_URI` has the format "neo4j+s://xxxxxxxx.databases.neo4j.io".
# The credentials are for the AuraDB instance.
gds = sessions.get_or_create(
    session_name="my-session",
    memory=SessionMemory.m_4GB,
    db_connection=DbmsConnectionInfo(NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD),
)

If you are connecting the client to an AuraDS instance, pass aura_ds=True to have the recommended non-default Python driver settings applied automatically:

from graphdatascience import GraphDataScience

# Configures the driver with AuraDS-recommended settings.
# `NEO4J_URI` has the format "neo4j+s://xxxxxxxx.databases.neo4j.io:7687".
gds = GraphDataScience(
    NEO4J_URI,
    auth=(NEO4J_USER, NEO4J_PASSWORD),
    aura_ds=True
)

Additional checks

Check the version of the GDS library running on the server:

print(gds.server_version())
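
A version check is often used to guard code that needs a minimum GDS release. The sketch below assumes that the string form of the value returned by `gds.server_version()` looks like "2.6.0". The helpers `parse_version` and `meets_minimum` are hypothetical and not part of the client.

```python
import re


def parse_version(text):
    """Turn a dotted version string such as "2.6.0" into a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def meets_minimum(server_version, minimum):
    """Return True if `server_version` is at least `minimum`."""
    return parse_version(str(server_version)) >= parse_version(minimum)


# Usage (requires a connected client):
# if not meets_minimum(gds.server_version(), "2.5.0"):
#     raise RuntimeError("This example assumes GDS 2.5.0 or later")
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "2.10.0" would sort before "2.9.0".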

Check if the GDS library running on the server has an enterprise license:

print(gds.is_licensed())

Usage example

The following example shows how to use the GraphDataScience object to:

  1. Run a Cypher query to populate the Neo4j database.

  2. Create a graph projection.

  3. Run an algorithm on the graph.

  4. Inspect the updated graph.

# Create a minimal example graph.
# The method returns a Pandas `DataFrame`.
gds.run_cypher(
  """
  CREATE
    (m: City {name: "Malmö"}),
    (l: City {name: "London"}),
    (s: City {name: "San Mateo"}),
    (m)-[:FLY_TO]->(l),
    (l)-[:FLY_TO]->(m),
    (l)-[:FLY_TO]->(s),
    (s)-[:FLY_TO]->(l)
  """
)

# Create an in-memory graph called `neo4j-offices` and
# a `G_office` object representing the projected graph.
G_office, project_result = gds.graph.project("neo4j-offices", "City", "FLY_TO")

# Run the `mutate` mode of the PageRank algorithm.
mutate_result = gds.pageRank.mutate(G_office, tolerance=0.5, mutateProperty="rank")

# Inspect the node properties of the projected graph
# via the graph object to confirm that a new property has been created.
assert G_office.node_properties("City") == ["rank"]

With an Aura Graph Analytics session, create the example graph in the same way, then project it with a Cypher query:

# Create an in-memory graph called `my-graph` and
# a `G_office` object representing the projected graph.
# The Cypher query must contain the `gds.graph.project.remote()` function
# to project the graph into the GDS Session.
G_office, project_result = gds.graph.project(
    graph_name="my-graph",
    query="""
    CALL () {
        MATCH (from:City)
        OPTIONAL MATCH (from)-[r:FLY_TO]->(to:City)
        RETURN from AS source, r AS rel, to AS target, {} AS sourceNodeProperties, {} AS targetNodeProperties
    }
    RETURN gds.graph.project.remote(source, target, {
      sourceNodeProperties: sourceNodeProperties,
      targetNodeProperties: targetNodeProperties,
      sourceNodeLabels: labels(source),
      targetNodeLabels: labels(target),
      relationshipType: type(rel),
      relationshipProperties: properties(rel)
    })
    """,
)

# Run the `mutate` mode of the PageRank algorithm.
mutate_result = gds.pageRank.mutate(G_office, tolerance=0.5, mutateProperty="rank")

# Inspect the node properties of the projected graph
# via the graph object to confirm that a new property has been created.
assert G_office.node_properties("City") == ["rank"]

You can also use one of the datasets that come with the library to get started. See the Datasets chapter for more on this.

Close open connections

# Close any open connections in the underlying Neo4j driver's connection pool
gds.close()

The close method is also called automatically when the GraphDataScience object is deleted.
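
If you prefer not to rely on object deletion, `contextlib.closing` from the standard library calls `close()` on any object when the `with` block exits, even if an error occurs. The sketch below uses a stand-in class, `FakeClient`, so that it runs without a server. The assumption is that a real GraphDataScience object could be wrapped in the same way, since it exposes a `close` method.

```python
from contextlib import closing


class FakeClient:
    """Stand-in for a GraphDataScience object (hypothetical, for illustration)."""

    def __init__(self):
        self.closed = False

    def close(self):
        # The real method closes the driver's open connections.
        self.closed = True


client = FakeClient()
with closing(client) as gds:
    # Run queries and algorithms here.
    pass

# `close()` has been called by the time the block exits.
print(client.closed)
```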

With an Aura Graph Analytics session, delete the session when you are done:

# Delete the session to tear down the associated remote instance
# and close the connection.
sessions.delete(session_name="my-session")

# Alternatively, use `gds.delete()`.