bkms_react
get_bkms_tarball(filepath, extract=True)
The file is available at: https://bkms.brenda-enzymes.org/download.php.
The compressed file (Reactions_BKMS.tar.gz) contains the table in
tab-separated format (readable in Excel or OpenOffice). The table combines
current data from BRENDA (release 2021.2, only reactions with naturally
occurring substrates), MetaCyc (version 24.5), and SABIO-RK (07/02/2021) with
KEGG data downloaded on 23 April 2012. More recent KEGG data cannot be offered
because that would require a KEGG license agreement.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | The path to store the downloaded file. | required |
| `extract` | `bool` | Extract the file. | `True` |
Source code in parser/bkms_react.py
read_bkms(filepath, clean=True)
Read the BKMS-react table and prepare it for further processing.
The table contains stray ^M (carriage return) characters in some rows. These
characters won't break pandas, but they make the parsed table longer than
expected. Set the clean parameter to remove them before parsing.
BRENDA takes EC numbers as identifiers, so we need entries with non-empty
EC_Number and Reaction_ID_MetaCyc columns. There is a
Reaction_ID_BRENDA column that's sometimes non-empty when the EC number
is missing, but the field is not documented, and it's not clear how it can be
mapped to an entry in the BRENDA text file.
To be extra conservative, we also require a non-empty Reaction_ID_KEGG
column. Only reactions in MetaCyc/BioCyc that are associated with matching
EC numbers and KEGG IDs will be annotated.
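The filtering rule above can be sketched with pandas as shown below. The column names come from the description; the function name `keep_annotatable`, the treatment of whitespace-only cells as empty, and the sample identifiers are assumptions made for illustration.

```python
import pandas as pd

# Columns that must all be non-empty for a row to be kept.
REQUIRED_COLUMNS = ["EC_Number", "Reaction_ID_MetaCyc", "Reaction_ID_KEGG"]


def keep_annotatable(df):
    """Keep rows where every required column holds a non-empty value."""
    mask = pd.Series(True, index=df.index)
    for column in REQUIRED_COLUMNS:
        values = df[column]
        # Treat missing values and blank strings alike as "empty".
        mask &= values.notna() & (values.astype(str).str.strip() != "")
    return df.loc[mask].reset_index(drop=True)


# Made-up rows: only the first one has all three identifiers.
example = pd.DataFrame(
    {
        "EC_Number": ["1.1.1.1", None, "2.7.1.1", "3.1.1.1"],
        "Reaction_ID_MetaCyc": ["RXN-1", "RXN-2", "", "RXN-4"],
        "Reaction_ID_KEGG": ["R00001", "R00002", "R00003", None],
    }
)
filtered = keep_annotatable(example)
print(filtered)
```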
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | The path to the BKMS-react table. | required |
| `clean` | `bool` | Remove the ^M characters. | `True` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | A pandas dataframe: |