SBML: What Is It About?
- 1. SBML: What Is It About?
Michael Hucka, Ph.D.
Department of Computing + Mathematical Sciences
California Institute of Technology
Pasadena, CA, USA
Email: mhucka@caltech.edu Twitter: @mhucka
HCLS Systems Biology, June 2012
- 2. Outline
General background and motivations
Brief summary of SBML features
Annotations, connections and semantics
SBML development today
Acknowledgments
- 5. One example of a type of model represented in SBML
[Figure: model diagram with simulation output]
Tyson et al. (1991), PNAS 88(16):7328–32
- 9. SBML = Systems Biology Markup Language
Format for representing computational models of biological processes
• Data structures + usage principles + serialization to XML
Neutral with respect to modeling framework
• E.g., ODE, stochastic systems, etc.
Development started in 2000, with first specification distributed in 2001
• XML was still relatively new, RDF even more so
- 10. A lingua franca for software (not humans)
- 11. Basic SBML concepts are fairly simple
The process is central
• Called a “reaction” in SBML
• Participants are pools of entities (species)
Models can further include:
• Other constants & variables
• Compartments
• Explicit math
• Discontinuous events
• Unit definitions
• Annotations
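As a rough illustration of how these concepts serialize to XML, here is a minimal sketch using only Python's standard library. It builds one compartment, two species pools, and one reaction. The element and attribute names follow SBML Level 3 Version 1 Core as I understand it; the identifiers (`example`, `c`, `A`, `B`, `A_to_B`) are hypothetical, and the output has not been checked with an SBML validator.

```python
import xml.etree.ElementTree as ET

# Namespace URI of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_minimal_model() -> ET.Element:
    """Build one compartment, two species pools, and one reaction A -> B."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "3", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "example"})  # hypothetical id

    # Species pools are located in compartments.
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "c", "constant": "true"})

    species = ET.SubElement(model, "listOfSpecies")
    for sid in ("A", "B"):
        ET.SubElement(species, "species", {
            "id": sid,
            "compartment": "c",
            "hasOnlySubstanceUnits": "false",
            "boundaryCondition": "false",
            "constant": "false",
        })

    # The process ("reaction") is central: it consumes A and produces B.
    reactions = ET.SubElement(model, "listOfReactions")
    rxn = ET.SubElement(reactions, "reaction",
                        {"id": "A_to_B", "reversible": "false", "fast": "false"})
    reactants = ET.SubElement(rxn, "listOfReactants")
    ET.SubElement(reactants, "speciesReference",
                  {"species": "A", "stoichiometry": "1", "constant": "true"})
    products = ET.SubElement(rxn, "listOfProducts")
    ET.SubElement(products, "speciesReference",
                  {"species": "B", "stoichiometry": "1", "constant": "true"})
    return sbml


if __name__ == "__main__":
    print(ET.tostring(build_minimal_model(), encoding="unicode"))
```

In practice a library such as libSBML would normally be used instead of hand-built XML; the sketch only shows how few concepts are needed for a basic model.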
- 13. Species pools are located in compartments
[Diagram: compartments c and n; species pools protein A, protein B, gene, mRNAn, mRNAc]
- 16. Reaction/process rates can be (almost) arbitrary formulas
[Diagram: same network; reactions labeled with rate formulas f1(x) through f5(x)]
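One common way to interpret such a network is as a set of ODEs, where each species changes by the sum of (stoichiometry × rate) over the reactions it takes part in. The sketch below assumes that interpretation, ignores compartment volumes, and uses a fixed-step Euler update purely for illustration. The two rate formulas and all names are hypothetical, not taken from the slides.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
# A reaction here is (net stoichiometry per species, rate formula f(x)).
Reaction = Tuple[Dict[str, int], Callable[[State], float]]


def derivatives(state: State, reactions: List[Reaction]) -> State:
    """Return d(species)/dt as the sum of stoichiometry * rate over reactions."""
    dxdt = {name: 0.0 for name in state}
    for stoichiometry, rate in reactions:
        v = rate(state)  # the rate may be any formula of the current state
        for name, coeff in stoichiometry.items():
            dxdt[name] += coeff * v
    return dxdt


def euler_step(state: State, reactions: List[Reaction], dt: float) -> State:
    """Advance the state by one fixed Euler step (illustrative, not robust)."""
    dxdt = derivatives(state, reactions)
    return {name: state[name] + dt * dxdt[name] for name in state}


# Hypothetical example: a mass-action step and a saturable reverse step.
EXAMPLE_REACTIONS: List[Reaction] = [
    ({"A": -1, "B": +1}, lambda s: 0.5 * s["A"]),
    ({"B": -1, "A": +1}, lambda s: 2.0 * s["B"] / (1.0 + s["B"])),
]

if __name__ == "__main__":
    state = {"A": 1.0, "B": 0.0}
    for _ in range(5):
        state = euler_step(state, EXAMPLE_REACTIONS, dt=0.1)
    print(state)
```

Because SBML is neutral about the modeling framework, the same reaction list could equally be handed to a stochastic simulator; the ODE reading is only one choice.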
- 17. “Rules”: equations expressing relationships in addition to the reaction system
[Diagram: same network; rules g1(x), g2(x), ... shown alongside reactions f1(x) through f5(x)]
- 18. “Events”: discontinuous actions triggered by system conditions
[Diagram: same network with rules g1(x), g2(x), ... and reactions f1(x) through f5(x)]
Event1: when (...condition...), do (...assignments...)
Event2: when (...condition...), do (...assignments...)
...
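The "when (condition), do (assignments)" pattern can be sketched as follows. This assumes the usual SBML reading that an event fires when its trigger condition changes from false to true, and it leaves out delays, priorities, and the ordering of simultaneous events. The `Event` class and `apply_events` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]


@dataclass
class Event:
    """when trigger(state) becomes true, do the listed assignments."""
    trigger: Callable[[State], bool]
    assignments: Dict[str, Callable[[State], float]]


def apply_events(state: State, events: List[Event],
                 previous: List[bool]) -> Tuple[State, List[bool]]:
    """Fire every event whose trigger went from False to True.

    `previous` holds each trigger's value at the previous time point.
    Returns the updated state and the current trigger values.
    """
    new_state = dict(state)
    current = []
    for event, was_true in zip(events, previous):
        is_true = event.trigger(state)
        if is_true and not was_true:
            # Evaluate assignments on the pre-event state, then apply them.
            for name, formula in event.assignments.items():
                new_state[name] = formula(state)
        current.append(is_true)
    return new_state, current


if __name__ == "__main__":
    # Hypothetical event: reset x to 0 whenever it rises above 1.
    reset = Event(trigger=lambda s: s["x"] > 1.0,
                  assignments={"x": lambda s: 0.0})
    state, flags = apply_events({"x": 1.2}, [reset], [False])
    print(state, flags)
```

A simulator would call `apply_events` after each integration step, passing the trigger values returned by the previous call.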
- 19. Annotations: machine-readable semantics and links to other resources
[Diagram: same network, rules, and events, with annotation callouts]
“This is identified by GO id # ...”
“This is an enzymatic reaction with EC # ...”
“This is a transport into the nucleus ...”
“This compartment represents the nucleus ...”
“This event represents ...”
- 23. Scope of SBML encompasses many types of models
Today: spatially homogeneous models
• Metabolic network models
• Signaling pathway models
• Conductance-based models
• Neural models
• Pharmacokinetic/dynamics models
• Infectious diseases
Find examples in BioModels Database: http://biomodels.net/biomodels
Coming: SBML Level 3 packages to support other types
• E.g.: Spatially inhomogeneous models, also qualitative/logical
- 24. SBML Level 1 vs. Level 2 vs. Level 3
SBML Level 1                        | SBML Level 2                           | SBML Level 3
predefined math functions           | user-defined functions                 | user-defined functions
text-string math notation           | MathML subset                          | MathML subset
reserved namespaces for annotations | no reserved namespaces for annotations | no reserved namespaces for annotations
no controlled annotation scheme     | RDF-based controlled annotation scheme | RDF-based controlled annotation scheme
no discrete events                  | discrete events                        | discrete events
default values defined              | default values defined                 | no default values
monolithic                          | monolithic                             | modular
- 31. SBML provides syntax and only limited semantics
Raw models alone are insufficient
• Low info content
• No standard identifiers
Need standard schemes for machine-readable annotations
• For authorship, publication info
• For links to other data resources
• For semantics of mathematics
Need common guidelines for minimal model quality and content
(Slide callouts: “Defined by SBML” and “Defined by MIRIAM”)
- 35. Linking SBML elements to external resources
[Figure: annotation template with callouts]
Annotated element, e.g.: species, compartment, reaction, parameter
Relationship qualifier: chosen from specific list at http://sbml.org/miriam/qualifiers (e.g.: bqbiol:isPartOf)
Data references: taken from public list at http://sbml.org/miriam
In SBML Level 2–3, MIRIAM annotations are restricted to this specific form and must appear inside <annotation> elements.
(Other RDF can appear elsewhere in <annotation>)
- 36. Example
<species metaid="metaid_0000009" id="species_3" compartment="c_1">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" >
      <rdf:Description rdf:about="#metaid_0000009">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A15996"/>
            <rdf:li rdf:resource="urn:miriam:kegg.compound:C00044"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
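A tool can read this kind of annotation with any namespace-aware XML parser. The sketch below uses Python's standard library to pull out (qualifier, resource) pairs from the species example; the function name `miriam_annotations` is hypothetical, and for brevity it handles only `bqbiol` qualifiers and does not check that `rdf:about` matches the element's `metaid`.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# The species example from the slide, embedded so the sketch is self-contained.
SPECIES_XML = """
<species metaid="metaid_0000009" id="species_3" compartment="c_1">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#metaid_0000009">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A15996"/>
            <rdf:li rdf:resource="urn:miriam:kegg.compound:C00044"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""


def miriam_annotations(element: ET.Element) -> list:
    """Return (qualifier name, resource URI) pairs from an element's annotation."""
    pairs = []
    for description in element.iter(f"{{{RDF}}}Description"):
        for qualifier in description:
            # Keep only children in the biology-qualifiers namespace.
            if not qualifier.tag.startswith(f"{{{BQBIOL}}}"):
                continue
            name = qualifier.tag.split("}", 1)[1]
            for li in qualifier.iter(f"{{{RDF}}}li"):
                pairs.append((name, li.get(f"{{{RDF}}}resource")))
    return pairs


if __name__ == "__main__":
    for qualifier, resource in miriam_annotations(ET.fromstring(SPECIES_XML)):
        print(qualifier, resource)
```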
- 37. Example: data references
In the species example, the rdf:resource values on the <rdf:li> elements are the data references:
urn:miriam:obo.chebi:CHEBI%3A15996
urn:miriam:kegg.compound:C00044
- 38. Example: relationship qualifier
In the species example, the <bqbiol:is> element is the relationship qualifier.
- 40. Resolving resource identifiers
For linking to data, need:
• Globally unique, unambiguous identifiers
• ... that are persistent despite resource changes (e.g., changed URLs)
• ... that are maintained by the community
MIRIAM Registry provides data & identifiers.org provides resolvable URIs
• Unlike URNs, can type identifiers.org URI in a web browser
Example:
• EC Code entry #1.1.1.1
- MIRIAM URN: urn:miriam:ec-code:1.1.1.1
- identifiers.org URI: http://identifiers.org/ec-code/1.1.1.1
Developed by Nicolas Le Novère, Camille Laibe, Nick Juty @ EBI
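The relation between the two identifier forms in the EC example (entry 1.1.1.1) can be sketched as a simple string rewrite. This assumes the namespace part (here `ec-code`) is the same in both schemes, as it is in that example, and that percent-encoded characters in the URN are decoded for the URI; the MIRIAM Registry remains the authority on the actual mapping. The function name is hypothetical.

```python
from urllib.parse import unquote


def miriam_urn_to_identifiers_org(urn: str) -> str:
    """Rewrite urn:miriam:<namespace>:<id> as http://identifiers.org/<namespace>/<id>."""
    prefix = "urn:miriam:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    # Split only on the first colon: the identifier itself may contain colons.
    namespace, sep, identifier = urn[len(prefix):].partition(":")
    if not sep or not namespace or not identifier:
        raise ValueError(f"malformed MIRIAM URN: {urn!r}")
    return f"http://identifiers.org/{namespace}/{unquote(identifier)}"


if __name__ == "__main__":
    print(miriam_urn_to_identifiers_org("urn:miriam:ec-code:1.1.1.1"))
```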
- 42. SBML Level 3: Supporting more categories of models
[Diagram: Packages W, X, Y, and Z built on SBML Level 3 Core, with dependencies indicated]
A package adds constructs & capabilities
Models declare which packages they use
• Applications tell users which packages they support
Package development can be decoupled
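How a model "declares which packages it uses" is not spelled out on the slide. To my knowledge, SBML Level 3 does it with an XML namespace per package on the <sbml> element plus a `required` attribute in that namespace. Under that assumption, an application could check a document against the packages it supports roughly as follows; the function names and the example document are hypothetical.

```python
import xml.etree.ElementTree as ET

COMP_NS = "http://www.sbml.org/sbml/level3/version1/comp/version1"
LAYOUT_NS = "http://www.sbml.org/sbml/level3/version1/layout/version1"

# Hypothetical document that uses two packages, one of them required.
EXAMPLE_DOC = f"""
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:comp="{COMP_NS}" xmlns:layout="{LAYOUT_NS}"
      level="3" version="1" comp:required="true" layout:required="false">
  <model id="example"/>
</sbml>
"""


def declared_packages(sbml_text: str) -> dict:
    """Map each declared package namespace URI to its `required` flag."""
    root = ET.fromstring(sbml_text)
    suffix = "}required"
    packages = {}
    for key, value in root.attrib.items():
        # Namespaced attributes appear as "{namespace-uri}localname".
        if key.startswith("{") and key.endswith(suffix):
            packages[key[1:-len(suffix)]] = (value == "true")
    return packages


def unsupported_required(sbml_text: str, supported: set) -> list:
    """List required packages that the application does not support."""
    return sorted(uri for uri, required in declared_packages(sbml_text).items()
                  if required and uri not in supported)


if __name__ == "__main__":
    print(declared_packages(EXAMPLE_DOC))
    print(unsupported_required(EXAMPLE_DOC, supported={LAYOUT_NS}))
```

An application that finds an unsupported required package can then tell the user it cannot interpret the model correctly, while packages marked as not required can be ignored safely.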
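- 46. Growing ecosystem of standards to improve reproducibility
[Table of standards, most cells shown as logos on the slide]
Columns: Model, Procedures, Results
Rows: Representation format; Minimal info requirements; Semantics (mathematical); Semantics (other)
Surviving cell text: SBRML (representation format); ? (minimal info requirements); annotations (other semantics, all three columns)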
- 48. People on SBML Team & BioModels.net Team
SBML Team: Michael Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Nicolas Rodriguez, Linda Taddeo, Akiya Jouraku, Akira Funahashi, Kimberley Begley, Bruce Shapiro, Andrew Finney, Ben Bornstein, Ben Kovitz, Hamid Bolouri, Herbert Sauro, Jo Matthews, Maria Schilstra
Visionaries: Hiroaki Kitano, John Doyle
BioModels.net Team: Nicolas Le Novère, Camille Laibe, Nicolas Rodriguez, Nick Juty, Vijayalakshmi Chelliah, Stuart Moodie, Sarah Keating, Maciej Swat, Lukas Endler, Chen Li, Harish Dharuri, Lu Li, Enuo He, Mélanie Courtot, Alexander Broicher, Arnaud Henry, Marco Donizelli
- 49. We ♥ our funding agencies
National Institute of General Medical Sciences (USA)
European Molecular Biology Laboratory (EMBL)
ELIXIR (UK)
Beckman Institute, Caltech (USA)
Keio University (Japan)
JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
JST ERATO-SORST Program (Japan)
International Joint Research Program of NEDO (Japan)
Japanese Ministry of Agriculture
Japanese Ministry of Educ., Culture, Sports, Science and Tech.
BBSRC (UK)
National Science Foundation (USA)
DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
Air Force Office of Scientific Research (USA)
STRI, University of Hertfordshire (UK)
Molecular Sciences Institute (USA)
- 50. Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010
A huge thank you to the community
- 51. URLs
SBML http://sbml.org
BioModels Database http://biomodels.net/biomodels
identifiers.org http://identifiers.org
MIRIAM http://biomodels.net/miriam
MIASE http://biomodels.net/miase
SED-ML http://biomodels.net/sed-ml
SBO http://biomodels.net/sbo
SBRML http://tinyurl.com/sbrml
SBGN http://sbgn.org
- 52. I’d like your feedback!
You can use this anonymous form:
http://tinyurl.com/mhuckafeedback
- 54. Computational modeling has gained broad appeal
Metabolic networks: Fung et al. A synthetic gene-metabolic oscillator. Nature 2005; Herrgård et al. A consensus
yeast metabolic network reconstruction obtained from a community approach to systems biology. Nat Biotechnol
2008
Signalling pathways: Bray et al. Receptor clustering as a cellular mechanism to control sensitivity. Nature 1998; Bhalla
and Iyengar. Emergent properties of signaling pathways. Science 1998; Schoeberl et al. Computational modeling of the
dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat Biotechnol 2002;
Hoffmann et al. The IκB-NF-κB signaling module: temporal control and selective gene activation. Science 2002; Smith et al.
Systems analysis of Ran transport. Science 2002; Bhalla et al. MAP kinase phosphatase as a locus of flexibility in a
mitogen-activated protein kinase signaling network. Science 2002; Nelson et al. Oscillations in NF-κB Signaling Control
the Dynamics of Gene Expression. Science 2004; Werner et al. Stimulus specificity of gene expression programs
determined by temporal control of IKK activity. Science 2005; Sasagawa et al. Prediction and validation of the distinct
dynamics of transient and sustained ERK activation. Nat Cell Biol 2005; Basak et al. A fourth IkappaB protein within the
NF-κB signaling module. Cell 2007; McLean et al. Cross-talk and decision making in MAP kinase pathways. Nat Genet
2007; Ashall et al. Pulsatile Stimulation Determines Timing and Specificity of NF-κB-Dependent Transcription. Science
2009; Becker et al. Covering a broad dynamic range: information processing at the erythropoietin receptor. Science 2010
Gene regulatory networks: McAdams and Shapiro. Circuit simulation of genetic networks. Science 1995; Yue et al.
Genomic cis-regulatory logic: Experimental and computational analysis of a sea urchin gene. Science 1998; Von Dassow
et al. The segment polarity network is a robust developmental module. Nature 2000; Elowitz and Leibler. A synthetic
oscillatory network of transcriptional regulators. Nature 2000; Shen-Orr et al. Network motifs in the transcriptional
regulation network of Escherichia coli. Nat Genet 2002; Yao et al. A bistable Rb-E2F switch underlies the restriction point.
Nat Cell Biol 2008; Friedland et al. Synthetic gene networks that count. Science 2009
Pharmacometrics models: Labrijn et al. Therapeutic IgG4 antibodies engage in Fab-arm exchange with endogenous
human IgG4 in vivo. Nat Biotechnol 2009
Physiological models: Noble. Modeling the heart from genes to cells to the whole organ. Science 2002; Izhikevich and
Edelman. Large-scale model of mammalian thalamocortical systems. PNAS 2008
Infectious diseases: Perelson et al. HIV-1 dynamics in vivo: Virion clearance rate, infected cell life-span, and viral
generation time. Science 1996; Nowak and Bangham. Population dynamics of immune responses to persistent viruses. Science 1996
- 56. General features of the survey
Online, implemented using commercial survey website
28 questions
• Mix of multiple choice and fill-in-the-blank
85 responses by July 2011
• Removed incomplete responses
• 81 software tools left
Avoided “corrections” to data
- 57. Purposes of the software systems
Question: Which of the following categories best describe your software? (Check all that apply.)
Total number of software tools per category:
Simulation software: 42
Analysis s/w (in addition, or instead of, simulation): 40
Creation/model development software: 31
Visualization/display/formatting software: 31
Utility software (e.g., format conversion): 23
Data integration and management software: 16
Repository or database: 14
Framework or library (for use in developing s/w): 13
S/w for interactive env. (e.g., MATLAB, R, ...): 13
Annotation software: 11
- 62. Mathematical frameworks
Question: Regardless of whether your software provides simulation capabilities, what modeling frameworks does the package support when working with SBML files?
Total number of software tools per framework:
Ordinary differential equations (ODE): 54
Discrete stochastic simulation: 28
Discontinuous event handling: 25
Differential-algebraic equations (DAE): 17
Logical/Boolean networks: 11
Delay-differential equations (DDE): 9
Partial differential equations (PDE): 8
None of the above, or other framework (e.g.: FBA): 20
- 63. Other supported standards
Question: Which other standards does your software support?
Total number of software tools supporting other standards (warning: different scale, 0 to 20):
MIRIAM: 16
SBO: 14
SBGN: 13
BioPAX: 6
CellML: 3
SED-ML: 3
MFAML: 1
PNML: 1
SBOL: 1
- 64. Availability of software
Fees for academics: free 98%, fee-based 2%
Fees for non-academics: free 90%, fee-based 10%
Is source code available? Code available 79%, not available 21%
Editor's Notes
- computational methods, simulation, analysis are all an integral part
- Must weave solutions using different methods & tools
- a format that’s not any particular software system’s internal format, but could act as a lingua franca that allows different software tools to exchange model definitions via this intermediate format, SBML
- compatible: either a count of things, or an extensive property such as concentration or density; incompatible: activity level
- These models all have the common feature that they involve unitary entities related by equations that determine their quantities. There are no spatial effects, and also, all flat models -- single file. These are limitations and we’re working toward expanding SBML to remove those limitations.
- Evolution of features took time & practical experience
- a lot of tools work only at this level, but other tools such as virtual cell and databases that are more sophisticated need additional information about a model
- suppose you have a large number of models. it might be interesting to find out how they are related. this might not be obvious from the notes in the models or information provided by the creators -- the authors might not have known about other models that deal with a similar topic
- additional work to address more of what’s missing has been something that a growing community of people has been actively working on for many years
- corrections: added missing dependencies on software
- This question included an “other”, but only 3 checked it
- Question had “other”. Most common response: FBA. 2nd most common: not applicable (for conv tools)