sbml

SBMLClient(uri='neo4j://localhost:7687', neo4j_user='neo4j', neo4j_password='neo4j', database='neo4j', create_db=True, drop_if_exists=False, reaction_groups=True)

Bases: Neo4jClient

In addition to wrapping the Neo4j driver, this class provides helper methods for creating nodes and relationships in the graph from the data in an SBML file.

Parameters:

Name Type Description Default
uri str

URI of the Neo4j server. Defaults to neo4j://localhost:7687. For more details, see neo4j.driver.Driver.

'neo4j://localhost:7687'
neo4j_user str

Neo4j user. Defaults to neo4j.

'neo4j'
neo4j_password str

Neo4j password. Defaults to neo4j.

'neo4j'
database str

Name of the database. Defaults to neo4j.

'neo4j'
create_db bool

Whether to create the database. See setup_graph_db.

True
drop_if_exists bool

Whether to drop the database if it already exists.

False

Attributes:

Name Type Description
driver

neo4j.Neo4jDriver or neo4j.BoltDriver.

database

str, name of the database to use.

available_node_labels

tuple of strings indicating the possible node labels in the graph.

Parameters:

Name Type Description Default
create_db bool

If False, the database is not created. This is useful when running on Neo4j AuraDB, where database creation is not allowed.

True
drop_if_exists bool

Passed to create() as its force argument. See create.

False
reaction_groups bool

Assume all group nodes are groups of Reaction nodes. This assumption greatly speeds up the creation of the graph.

True
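The parameters above can be assembled as in the following minimal sketch. It only builds the keyword arguments and does not connect to a server. The helper name `sbml_client_kwargs`, the environment variable names, and the AuraDB host are assumptions made for illustration; only the keyword names and defaults come from the signature documented here.

```python
import os
from typing import Any


def sbml_client_kwargs(uri: str, *, managed: bool) -> dict[str, Any]:
    """Build keyword arguments matching the SBMLClient signature.

    ``managed`` is a hypothetical flag: set it for hosted services such as
    AuraDB, where creating a database is not allowed.
    """
    return {
        "uri": uri,
        # Environment variable names are assumptions, not part of the library.
        "neo4j_user": os.environ.get("NEO4J_USER", "neo4j"),
        "neo4j_password": os.environ.get("NEO4J_PASSWORD", "neo4j"),
        "database": "neo4j",
        "create_db": not managed,   # skip database creation on managed hosts
        "drop_if_exists": False,    # keep any existing database
        "reaction_groups": True,    # treat all group nodes as groups of reactions
    }


local = sbml_client_kwargs("neo4j://localhost:7687", managed=False)
# Placeholder host; substitute the URI of your own instance.
aura = sbml_client_kwargs("neo4j+s://example.databases.neo4j.io", managed=True)

# client = SBMLClient(**aura)  # the import path depends on how the package is installed
```

Reading credentials from the environment keeps the default password out of scripts that target a shared or hosted database.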
Source code in db/sbml.py
def __init__(  # nosec B107 - default password is okay here
    self,
    uri: str = "neo4j://localhost:7687",
    neo4j_user: str = "neo4j",
    neo4j_password: str = "neo4j",
    database: str = "neo4j",
    create_db: bool = True,
    drop_if_exists: bool = False,
    reaction_groups: bool = True,
):
    """Create Neo4j database.

    Args:
        create_db: If False, does not create the database. This is useful
            for running on neo4j AuraDB when database creation is not allowed.
        drop_if_exists: See :meth:`.create`.
        reaction_groups: Assume all ``group`` nodes are groups of ``Reaction``
            nodes. This assumption greatly speeds up the creation of the
            graph.
    """
    super().__init__(uri, neo4j_user, neo4j_password, database)

    self.available_node_labels = set()
    # Create database
    if create_db:
        self.create(force=drop_if_exists)

    self.reaction_groups = reaction_groups

create_nodes(desc, nodes, query, batch_size=1000, progress_bar=False)

Create nodes in batches with the given label and properties.

For Compartment nodes, simply create them with given properties.

Each Compound node is linked to its Compartment node. If it has related RDF nodes, these are also created and linked to the Compound node.

GeneProduct nodes don't have relationships to Compartment nodes, but they are linked to corresponding RDF nodes.

Parameters:

Name Type Description Default
desc str

Label of the node in log and progress bar.

required
nodes list[dict[str, Any]]

List of properties of the nodes.

required
query str

Cypher query to create the nodes.

required
batch_size int

Number of nodes to create in each batch.

1000
progress_bar bool

Show progress bar for slow queries.

False
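The batching behaviour can be tried without a database, as in the minimal sketch below. `RecordingClient` and this `chunk` implementation are hypothetical stand-ins, and the Cypher query, with its `Compartment` label and `id`/`props` keys, is an assumed example rather than a query taken from the library. What does come from the source is that each batch is passed to the query as the `batch_nodes` parameter, so the query is expected to read `$batch_nodes`.

```python
from collections.abc import Iterator
from typing import Any

Node = dict[str, Any]


def chunk(items: list[Node], size: int) -> Iterator[list[Node]]:
    """Hypothetical stand-in for the chunk helper used in db/sbml.py."""
    for start in range(0, len(items), size):
        yield items[start : start + size]


class RecordingClient:
    """Stand-in client that records writes instead of contacting Neo4j."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, list[Node]]] = []

    def write(self, query: str, **params: Any) -> None:
        # Store the query and the batch it was called with.
        self.calls.append((query, params["batch_nodes"]))

    def create_nodes(self, nodes: list[Node], query: str, batch_size: int = 1000) -> None:
        # One write per batch; each batch is bound to $batch_nodes in the query.
        for batch in chunk(nodes, batch_size):
            self.write(query, batch_nodes=batch)


# Assumed example query: unwind the batch and merge one node per entry.
QUERY = """
UNWIND $batch_nodes AS n
MERGE (c:Compartment {id: n.id})
ON CREATE SET c += n.props
"""

nodes = [{"id": f"c{i}", "props": {"name": f"compartment {i}"}} for i in range(5)]
client = RecordingClient()
client.create_nodes(nodes, QUERY, batch_size=2)
batch_sizes = [len(batch) for _, batch in client.calls]
```

With five nodes and a batch size of two, the last batch is a partial one, which is why the batch count is the ceiling of `len(nodes) / batch_size`.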
Source code in db/sbml.py
def create_nodes(
    self,
    desc: str,
    nodes: list[dict[str, Any]],
    query: str,
    batch_size: int = 1000,
    progress_bar: bool = False,
):
    """Create nodes in batches with the given label and properties.

    For ``Compartment`` nodes, simply create them with given properties.

    Each ``Compound`` node is linked to its ``Compartment`` node. If it
    has related ``RDF`` nodes, these are also created and linked to the
    ``Compound`` node.

    ``GeneProduct`` nodes don't have relationships to ``Compartment``
    nodes, but they are linked to corresponding ``RDF`` nodes.

    Args:
        desc: Label of the node in log and progress bar.
        nodes: List of properties of the nodes.
        query: Cypher query to create the nodes.
        batch_size: Number of nodes to create in each batch.
        progress_bar: Show progress bar for slow queries.
    """
    logger.info(f"Creating {desc} nodes")

    if progress_bar:
        it = tqdm(
            chunk(nodes, batch_size),
            desc=desc,
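            # Ceiling division, so a final partial batch is counted
            # (assumes chunk yields the remainder as its last batch).
            total=(len(nodes) + batch_size - 1) // batch_size,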
        )
    else:
        it = chunk(nodes, batch_size)

    for batch in it:
        self.write(query, batch_nodes=batch)

    logger.info(f"Created {len(nodes)} {desc} nodes")

sbml_to_graph(parser)

Populate Neo4j database with SBML data. The process is as follows:

1. Parse the SBML file. All parsing errors are logged as warnings.

2. Create the database and constraints.

3. Feed the SBML file into the database. This will populate Compartment, Reaction, Compound, GeneProduct, GeneProductSet, GeneProductComplex, and RDF nodes.

Nodes are created for each SBML element using MERGE statements: https://neo4j.com/docs/cypher-manual/current/clauses/merge/#merge-merge-with-on-create

Source code in db/sbml.py
def sbml_to_graph(self, parser: SBMLParser):
    """Populate Neo4j database with SBML data. The process is as follows:

    #. Parse the SBML file. All parsing errors are logged as warnings.
    #. Create the database and constraints.
    #. Feed the SBML file into the database. This will populate
       ``Compartment``, ``Reaction``, ``Compound``, ``GeneProduct``,
       ``GeneProductSet``, ``GeneProductComplex``, and ``RDF`` nodes.

    Nodes are created for each SBML element using ``MERGE`` statements:
    https://neo4j.com/docs/cypher-manual/current/clauses/merge/#merge-merge-with-on-create
    """
    # Read SBML file
    model: Model = parser.read_sbml(parser.sbml_file).getModel()

    # Compartments
    if cpts := (model.getListOfCompartments()):
        compartments = parser.collect_compartments(cpts)
        self._compartments_to_graph(compartments)

    # Compounds, i.e. metabolites, species
    if cpds := model.getListOfSpecies():
        compounds = parser.collect_compounds(cpds)
        self._compounds_to_graph(compounds)

    # Reactions
    if rxns := (model.getListOfReactions()):
        reactions = parser.collect_reactions(rxns)
        self._reactions_to_graph(reactions)

        # Gene products
        self._gene_products_to_graph(model, parser, rxns)

        # Dummy reverse reactions for easier queries
        rev_reactions = parser.collect_reverse_reactions(reactions)
        self._reverse_reactions_to_graph(rev_reactions)

    # Groups, i.e. related reactions in SBML
    self._groups_to_graph(model, parser)