Building the Simplified Database
The simplified database is built as part of the Reactome Release Pipeline. The build runs late in the pipeline, after the orthoinference, external resource links, and stable identifier information have been added to the test_reactome_XX database. Relationships between pathways, reaction-like events, physical entities, and identifiers external to Reactome are then extracted from test_reactome_XX to create test_reactome_XX_dn, which is made available for download on the public Reactome website.
The standalone script that creates the simplified database is on reactomerelease, named simplified_database.pl, in the /usr/local/gkb/scripts/release/simplified_database directory. The source code is also available here.
The script can be run with the -help argument to see the default source database, source host, and simplified database name that will be used and the option flags to override them.
The current version of the simplified database can be accessed from our download page or directly here.
Database and Table Structure
The simplified database consists of eight tables that describe Reactome pathways, reactions, and physical entities, as well as their relationships to each other and to external reference databases.
Entity Relationship Diagram for the Reactome Simplified Database
The Pathway Table contains records of all pathways in Reactome. Each record is composed of the pathway's database identifier (primary key), display name, species, and stable identifier.
The ReactionLikeEvent Table contains records of all reaction-like events in Reactome. Each record consists of the event's database identifier (primary key), display name, species, class (one of the following: BlackBoxEvent, Depolymerisation, FailedReaction, Polymerisation, Reaction), and stable identifier.
The PhysicalEntity Table contains records of all molecules and collections of molecules in Reactome. Each record consists of the same fields as in the ReactionLikeEvent table, but the possible classes are EntityWithAccessionedSequence, GenomeEncodedEntity, SimpleEntity, Polymer, OtherEntity, DefinedSet, CandidateSet, OpenSet, and Complex.
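As a quick way to see how the classes listed above are distributed, a query along the following lines could count physical entities per class. This is only a sketch: it assumes the class field is stored in a column literally named class, which the description above does not confirm.

```sql
-- Count PhysicalEntity records per class.
-- Assumption: the class field lives in a column named "class".
SELECT p.class, COUNT(*) AS entityCount
FROM PhysicalEntity p
GROUP BY p.class
ORDER BY entityCount DESC;
```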
The Pathway_To_ReactionLikeEvent Table contains records of which pathways contain which events. Each record has the database identifier of a pathway and the database identifier of an event. This table can be used to link information between the Pathway and ReactionLikeEvent tables.
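The linking role of this table can be sketched with a query that lists the events recorded for one pathway. The column names pathwayId and reactionLikeEventId are taken from the example query later on this page; the displayName and stableId columns on ReactionLikeEvent are assumed to be named as they are on Pathway, and the stable identifier in the WHERE clause is only an illustrative placeholder.

```sql
-- List the reaction-like events linked to one pathway.
-- 'R-HSA-109581' is an illustrative placeholder; substitute any pathway stable ID.
SELECT rle.stableId, rle.displayName
FROM Pathway pa
JOIN Pathway_To_ReactionLikeEvent pr ON pr.pathwayId = pa.id
JOIN ReactionLikeEvent rle ON rle.id = pr.reactionLikeEventId
WHERE pa.stableId = 'R-HSA-109581'
ORDER BY rle.displayName;
```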
The ReactionLikeEvent_To_PhysicalEntity Table contains records of which events contain which physical entities. Each record has the database identifier of an event and the database identifier of a physical entity. This table can be used to link information between the ReactionLikeEvent and PhysicalEntity tables.
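In the same spirit, the following sketch lists the physical entities recorded for a single event. The linking columns reactionLikeEventId and physicalEntityId come from the example query later on this page; the stableId and displayName columns are assumed to exist under those names, and the stable identifier shown is a hypothetical placeholder, not a real record.

```sql
-- List the physical entities linked to one reaction-like event.
-- 'R-HSA-0000000' is a hypothetical placeholder stable ID.
SELECT pe.id, pe.displayName
FROM ReactionLikeEvent rle
JOIN ReactionLikeEvent_To_PhysicalEntity r ON r.reactionLikeEventId = rle.id
JOIN PhysicalEntity pe ON pe.id = r.physicalEntityId
WHERE rle.stableId = 'R-HSA-0000000'
ORDER BY pe.displayName;
```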
The PathwayHierarchy Table contains records capturing the parent to child relationships among pathways. Each record has the database identifier of a parent pathway and the database identifier of a child pathway.
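A parent-to-child lookup over this table might look like the sketch below. The column names parentPathwayId and childPathwayId are hypothetical, since this page does not give the actual names; the DESCRIBE statement shows the real ones, which should be substituted before running the query. The stable identifier is an illustrative placeholder.

```sql
-- Inspect the actual column names first.
DESCRIBE PathwayHierarchy;

-- Direct child pathways of one parent pathway.
-- Assumption: parentPathwayId and childPathwayId are hypothetical column names.
SELECT child.stableId, child.displayName
FROM Pathway parent
JOIN PathwayHierarchy h ON h.parentPathwayId = parent.id
JOIN Pathway child ON child.id = h.childPathwayId
WHERE parent.stableId = 'R-HSA-109581'
ORDER BY child.displayName;
```

This returns only one level of the hierarchy; walking the full tree would require repeating the join or using a recursive query where the database server supports it.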
The PhysicalEntityHierarchy Table contains records capturing the parent to child relationships among physical entities. This relationship is either between a complex and its components or a set and its members/candidates. Each record has the database identifier of a parent physical entity (i.e. a set or complex) and the database identifier of a child physical entity.
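The complex-to-component case can be sketched as follows. As with the pathway hierarchy, the column names parentPhysicalEntityId and childPhysicalEntityId are hypothetical and should be checked with DESCRIBE, and the class column name is assumed from the field description above.

```sql
-- Inspect the actual column names first.
DESCRIBE PhysicalEntityHierarchy;

-- Show a sample of complexes with their direct components.
-- Assumption: parentPhysicalEntityId, childPhysicalEntityId, and class
-- are hypothetical column names.
SELECT parent.displayName AS complexName, child.displayName AS componentName
FROM PhysicalEntityHierarchy h
JOIN PhysicalEntity parent ON parent.id = h.parentPhysicalEntityId
JOIN PhysicalEntity child ON child.id = h.childPhysicalEntityId
WHERE parent.class = 'Complex'
LIMIT 20;
```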
The Id_To_ExternalIdentifier Table contains records of external database identifiers which apply to pathways, events, and physical entities in Reactome. Each record consists of the Reactome instance database identifier, the reference database name (unavailable in some records), the reference database identifier, and a description of the reference database identifier.
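To see which reference databases are represented, a summary query such as the following sketch may be useful. It relies only on the referenceDatabase column used in the example query later on this page.

```sql
-- Count external identifiers per reference database.
SELECT e.referenceDatabase, COUNT(*) AS identifierCount
FROM Id_To_ExternalIdentifier e
GROUP BY e.referenceDatabase
ORDER BY identifierCount DESC;
```

Records without a reference database name are grouped together in a single row, shown as NULL or as an empty string depending on how the build stores them.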
Linking External Gene or Protein IDs
The query below finds all human pathways associated with ENSEMBL gene IDs:
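-- Human pathways linked to ENSEMBL gene identifiers
SELECT e.externalIdentifier, pa.stableId, pa.displayName
FROM PhysicalEntity p, Id_To_ExternalIdentifier e,
     ReactionLikeEvent_To_PhysicalEntity r,
     Pathway_To_ReactionLikeEvent pr, Pathway pa
-- external identifier -> physical entity -> reaction-like event
WHERE e.id = p.id AND p.id = r.physicalEntityId
-- reaction-like event -> pathway
AND pa.id = pr.pathwayId
AND r.reactionLikeEventId = pr.reactionLikeEventId
AND p.species = 'Homo sapiens'
AND e.externalIdentifier LIKE 'ENSG00%'
AND e.referenceDatabase = 'ENSEMBL'
ORDER BY pa.stableId;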
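The same chain of joins can be narrowed to a single gene. The sketch below uses explicit JOIN syntax and DISTINCT so that each pathway appears once even when several physical entities or events link it to the gene. The identifier ENSG00000141510 is only an illustrative input; all table and column names are the ones used in the query above.

```sql
-- Human pathways linked to one ENSEMBL gene identifier.
-- 'ENSG00000141510' is an illustrative input; substitute the gene of interest.
SELECT DISTINCT pa.stableId, pa.displayName
FROM Id_To_ExternalIdentifier e
JOIN PhysicalEntity p ON p.id = e.id
JOIN ReactionLikeEvent_To_PhysicalEntity r ON r.physicalEntityId = p.id
JOIN Pathway_To_ReactionLikeEvent pr ON pr.reactionLikeEventId = r.reactionLikeEventId
JOIN Pathway pa ON pa.id = pr.pathwayId
WHERE e.referenceDatabase = 'ENSEMBL'
  AND e.externalIdentifier = 'ENSG00000141510'
  AND p.species = 'Homo sapiens'
ORDER BY pa.stableId;
```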
Additional example queries will be added periodically.