-
Notifications
You must be signed in to change notification settings - Fork 28
Setting up models
In the folder /Examples
we provide some showcase applications that can be used as a starting point for new modeling projects.
Section 1 describes how a model can be set up and Section 2 how experimental data can be linked to the model.
Please note that the software make heavy use of identifiers for all model and data quantities, for instance for dynamic variables, inputs, parameters, observables, ... It is very important to keep these identifiers consistent throughout one project.
A model definition file is a plain text file with the ending .def
located in the subfolder /Models
of the current MATLAB working directory. The illustration used in the following stem from the example applications in the /examples
subfolder of the main code folder: Becker_2010
and Bachmann_2011
. The model definition files are structured in various section:
DESCRIPTION
...
PREDICTOR
...
COMPARTMENTS
...
STATES
...
INPUTS
...
REACTIONS (or ODES)
...
DERIVED
...
OBSERVABLES
...
ERRORS
...
SUBSTITUTIONS (optional)
...
CONDITIONS
...
Each section will be explained in the following. Lines in the model definition file can be commented out using //
.
Can be used to store meta information about the model as quoted lines of text, e.g.
DESCRIPTION
"Model of Epo receptor internalization"
"Assembled by RG Timmer & RG Klingmüller"
"Date: 2010"
"Version: 1.0"
Is used to define the independent variable / the predictor. For dynamics model, this is usually time, e.g.
PREDICTOR
t T "min" "time" 0 100
- The 1st argument specifies the unique identifier (here
t
) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit type (here
T
= time), the unit itself (here"min"
), and a label used for plots (here"time"
). - The 5th and 6th argument specify the range of the independent variable (here 0-100 minutes).
Can be used to define compartments in a model, for instance cytosol and nuclear compartment in the cell
COMPARTMENTS
cyt V "pl" "vol." 0.4
nuc V "pl" "vol." 0.275
...
The concentration dynamics of the model, in particular the transport reactions, are solved with respect to the size of the compartments.
- The 1st argument specifies the unique identifiers (here
cyt
andnuc
) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
V
= volume), the units itself (here"pl"
), and labels used for plots (here"vol."
). These labels do not affect the calculations. - The 5th argument specifies the numerical size of the compartments. Omitting the 5th argument will expose a parameter called
vol_#
where # is the compartment name. This parameter can then be estimated or specified in a condition. Note that by default, the parameter is set to constant. If you wish to estimate it, change it to fitted usingarSetPars
.
If no compartments are required, this section should be left empty. Note that in D2D, rates are always specified relative to the origin of the species. If an in- and out rate of a compartment should be equal and is estimated using a single parameter (e.g. to model passive diffusion), it is important to correct the kinetic rates by the appropriate volumes.
This section describes the dynamic variables / states that evolve over time as will be specified in the REACTIONS / ODES section later.
STATES
STAT5 C "nM" "conc." cyt 1 "STAT5" 1
pSTAT5 C "nM" "conc." cyt 1 "phospho STAT5" 1
npSTAT5 C "nM" "conc." nuc 1 "nuclear phospho STAT5" 1
...
- The 1st argument specifies the unique identifiers (here
STAT5
,pSTAT5
andnpSTAT5
) that can be used in the mathematical expressions (reactions / ODE system) later. Initial conditions for each dynamic variable will be generated as free parameters using the prefixinit_
. - The 2nd-4th argument specify the unit types (here
C
= concentration), the unit itself (here"nM"
), and a plain text used for plots (here"conc."
). These labels do not affect the calculations. - The 5th argument specify the compartment where the state lives in (here either
cyt
ornuc
). This can be left empty if no compartments were specified above. If the latter arguments are to be used but no compartment is present it is possible to create andefault
compartment in the section before in which all state live in. - The 6th argument is a flag if the state should be showed in plots (0=no, 1=yes), default is yes.
- The 7th argument a clear text label for plotting.
- The 8th argument specifies if the state is strictly positive (0=no, 1=yes), default is no, but for inherently non-negative states yes is recommended.
In this section input function, usually depending on the independent variable, can be specified. The input function can be used in the mathematical expression later, such as the reaction rate equations or the observables. Please note that the input do not belong to a specific compartment. The user has to ensure plausibility of the usage of the input function himself.
INPUTS
Epo C "units/cell" "conc." "k1*exp(-k2*t)"
SAv C "units/cell" "conc." "k3"
...
- The 1st argument specifies the unique identifiers (here
Epo
andSAv
) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
C
= concentration), the unit itself (here"units/cell"
), and a plain text used for plots (here"conc."
). - The 5th argument specify the mathematical expression of the input function (here
"k1*exp(-k2*t)"
is an exponential decay and"k3"
is a constant with unknown value).
Sometimes the input function should account for a change in the value of a dynamic variable, see the bolus injection example. Input estimation is described here
Step functions, splines and user defined functions can be used as mathematical expression of the input function.
Step functions can be defined by:
-
"step1(t, level1, switch_time, level2)"
giving a simple step function at the valueswitch_time
of the independent variable (usually time). -
"step2(t, level1, switch_time1, level2, switch_time2, level3)"
giving a double step function at the valuesswitch_time1/2
of the independent variablet
.
Discontinuous input require events for proper integration, see here and here
Smooth step functions can be defined by:
-
"smoothstep1(t, level1, switch_time, level2, smoothness)"
giving a smooth step function at the valueswitch_time
of the independent variable (usually time). -
"smoothstep2(t, level1, switch_time1, level2, switch_time2, level3, smoothness)"
giving a smooth double step function at the valuesswitch_time1/2
of the independent variablet
.
The smoothness parameter controls the smoothness of the step. For large values, the step becomes increasingly sigmoidal.
Bolus injections can be defined by:
-
"bolus(t, amount, time_point, duration)"
giving a bolus injection at valuetime point
of the independent variablet
. This corresponds to a Gaussian curve with areaamount
and standard deviationduration
.
Cubic splines with 3, 4, 5 or 10 knots can be defined by:
"spline3(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, q_initial_slope_constraint, initial_slope)"
"spline4(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, q_initial_slope_constraint, initial_slope)"
"spline5(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, q_initial_slope_constraint, initial_slope)"
"spline10(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, t_knot6, p_knot6, t_knot7, p_knot7, t_knot8, p_knot8, t_knot9, p_knot9, t_knot10, p_knot10, q_initial_slope_constraint, initial_slope)"
Here, t
denotes the independent variable, t_knot1-5
indicate the locations of the spline knots (these should be fixed to a numeric value), p_knot1-5
denote the spline parameters (these can either be fixed or left as free parameters to be estimated) and q_initial_slope_constraint
is a flag (0
= no, 1
= yes) that indicates if the slope of the spline at the first knot is to be constrained to the value given in initial_slope
(both values should be fixed to a numeric value). The spline parameter p_knot is defined as the value u(t_knot) of the spline at t_knot. If the spline should be constrained to positive values use the functions spline_pos3
, spline_pos4
and spline_pos5
the same way as described above. An example using splines is given in Examples/Swameye_PNAS2003
.
Monotonic splines with 3, 4, 5 or 10 knots can be defined by:
"monospline3(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3)"
"monospline4(t, t_knot1, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4)"
"monospline5(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5)"
"monospline10(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, t_knot6, p_knot6, t_knot7, p_knot7, t_knot8, p_knot8, t_knot9, p_knot9, t_knot10, p_knot10)"
Monotonic are splines which are monotonic between knots. Their syntax is similar to that of the regular splines. Monotonic splines have as an extra advantage that they do not suffer from over and undershoot behaviour as much as the regular cubic splines, considering that the interpolating function between two knots is guaranteed to be monotonic. They are useful for modelling input sources which are not noisy, or inputs requiring a large number of knots. Setting the first and last two knot parameters to the same value guarantees that the monotonic spline remains constant outside the spline range. This means that monotonic splines can be combined into splines comprising of more than 10 knots (see Examples/LongSplines
).
For an example comparing the spline types see Examples/Splines
and Examples/LongSplines
(using splines > 10 knots). Note, when the splines are slow, invoke ar.config.turboSplines = 1
, to use a more performant spline implementation.
-
"inputspline(t, N, [t_knot1, t_knot2, ... t_knotN], [p_knot1, p_knot2, ... p_knotN])"
Theinputspline
is a special version of the monotonic spline. It allows for a spline of arbitrary length, but cannot be fitted (no parameters are allowed in the time or parameter series).
Custom input function can be defined in the c-files arInputFunctionsC.c
and arInputFunctionsC.h
.
Often, an input consists of transient and sustained parts. Such behaviour can be implemented by the following expression:
"gif_amp_trans*(1-exp(-t/gif_timescale_sust))*exp(-t/(gif_timescale_trans)) + gif_amp_sust*(1-exp(-t/gif_timescale_sust))"
The function has three parameters, two amplitudes (gif_amp_trans
and gif_amp_sust
) and two time scales (gif_timescale_trans
and gif_timescale_sust
), that encode the transient and sustained parts.
This section describes the ODE system that determines the time evolution of the states. One can either specify REACTIONS
or ODES
directly.
The ODE system can be specified as biochemical reaction network.
REACTIONS
Epo + EpoR -> Epo_EpoR CUSTOM "k1 * Epo * EpoR" "label1"
STAT5 -> pSTAT5 CUSTOM "k2 * STAT5" "label2"
pSTAT5 -> npSTAT5 CUSTOM "k3 * pSTAT5" "label3"
...
- The 1st argument describes the reaction in a simple notation (here
Epo + EpoR -> Epo_EpoR
is a complex formation,STAT5 -> pSTAT5
a protein modification, andpSTAT5 -> npSTAT5
a transport reaction between compartments). - The 2nd argument is a keyword that specifies the reaction type,
CUSTOM
orMASSACTION
.CUSTOM
is the recommended usage. - In case of a
CUSTOM
reaction, the 3rd argument specifies the mathematical expression of the reaction rate equation (here"k1 * Epo * EpoR"
,"k2 * STAT5"
and"k3 * pSTAT5"
). In case of aMASSACTION
reaction, the mathematical expression in the 3rd argument will be just one parametername
that will be the rate constant corresponding to this reaction. In case of aMASSACTION
reaction, one can use<->
in the 1st argument to specify a reversible reaction. In this case, there will be two parameters with suffixname_1
for the forward andname_2
for the back reaction. - The 4th argument (optional) specifies a label for the reaction used for plotting.
In contrast to the REACTIONS notation, the ordinary differential equations can be specified directly:
ODES
"mathematical expression 1"
"mathematical expression 2"
"mathematical expression 3"
...
where these mathematical expressions define the right-hand sides of the ODEs. Take care that there are as many ODEs as dynamic variables in the model.
In this section variable that are derived from the dynamic and input variables can be defined. Derived variables can be used in the OBSERVABLES and REACTIONS sections.
DERIVED
Epo_ext C "pM" "conc." "Epo + dEpo_e"
Epo_int C "pM" "conc." "Epo_EpoR_i + dEpo_i"
...
- The 1st argument specifies the unique identifiers (here
Epo_ext
andEpo_int
) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
C
= concentration), the unit itself (here"pM"
), and a plain text used for plots (here"conc."
). - The 5th argument specify the mathematical expression of the input function (here
"Epo + dEpo_e"
and"Epo_EpoR_i + dEpo_i"
).
At this point of the model definition files one can specify the default model observables and their corresponding error model (Section 1.9). While the section in the model definition file are optional the corresponding sections in the data definition file (see Section 2.3) are mandatory and override the default specifications given here.
OBSERVABLES
tSTAT5_au C "au" "conc." 1 1 "scale_tSTAT5 * (STAT5 + pSTAT5)"
pSTAT5_au C "au" "conc." 1 1 "offset_pSTAT5 + scale_pSTAT5 * pSTAT5"
...
- The 1st argument specifies the unique identifiers (here
tSTAT5_au
andpSTAT5_au
). Please note that this identifier should be the same as the corresponding column header in the data sheet. Also, Observable identifiers are not allowed to be the same as dynamic variable identifiers. - The 2nd-4th argument specify the unit types (here
C
= concentration), the unit itself (here"au"
), and a plain text used for plots (here"conc."
). - The 5th argument corresponds to a flag (0=no, 1=yes) that indicates if the maximal value in the data sheet of the respective observable should be rescaled to one.
- The 6th argument corresponds to a flag (0=no, 1=yes) that indicates if both the raw data form the spread sheet and the model observables should be compared on a log-10 scale. This is frequently used for concentration data that contains log-normal measurement noise. The transformation is applied after rescaling in case of argument 5 applies.
- The 7th argument specify the mathematical expression of the observable function (here
"scale_tSTAT5 * (STAT5 + pSTAT5)"
and"offset_pSTAT5 + scale_pSTAT5 * pSTAT5"
. It will be places inside the log-10 of argument in case of argument 6 applies.
In the likelihood function, the measurement noise is modeled as normal or log-normal distribution, see 2.2 argument 6. The magnitude of the measurement noise, i.e. the standard deviation of the normal distribution (or standard deviation of the normal distribution in the log-space for log-normally distributed noise), can be implemented as parameterized function
ERRORS
tSTAT5_au "sd_STAT5_au"
pSTAT5_au "sd_STAT5_au"
...
- The 1st argument corresponds to an unique identifiers as given in the observables section (here
tSTAT5_au
andpSTAT5_au
). It is mandatory to specify the measurement noise for each observable in the data definition file. - The 2nd argument specify the mathematical expression of the noise function (here
"sd_STAT5_au"
and"sd_STAT5_au"
indicates that both measurements have the same measurement noise). Relative measurement noise can be implemented by using the observable identifier, e.g."sqrt(sd_STAT5_abs^2 + sd_STAT5_rel^2 * tSTAT5_au^2)"
.
Sometimes, instead of using a parametrized function for the measurement noise, one like to directly specify the amount of noise calculated from replicates in the data sheet. Put these value in the data sheet with column header same as the observable name but with the addition _std
(here e.g. tSTAT5_au_std
and pSTAT5_au_std
). Please note that these values will be interpreted as standard deviation of the data not as variance! An example for this usage can be found in the example application /Example/Swameye_PNAS2003
.
The value of ar.config.fiterrors
controls how experimental error enter the calculations:
ar.config.fiterrors = -1
: Only experimental error bars used for fitting.
ar.config.fiterrors = 0
: Use experimental errors by default and revert to the error model for data points that have no experimental error specified (This is the default).
ar.config.fiterrors = 1
: Only the error model is used for fitting. Experimental errors specified in the data sheet are ignored.
The visual output is controlled by the following flags:
ar.config.fiterror = 0
: Data is plotted as fitted.
ar.config.fiterror = -1
: Plot prediction bands as calculated by PPL.
ar.config.fiterror = -2
: Do not plot errors
and
ar.config.ploterrors = 0
: Observables are plotted as fitted.
ar.config.ploterrors = 1
: Data uncertainty is plotted as error bar.
ar.config.ploterrors = 2
: Only error bands are plotted.
In order to use multiple modes, set ar.config.useFitErrorMatrix = 1
and specify the different modes in ar.config.fiterrors_matrix(model index, data index)
. Accordingly, the different plotting modes can be specified in ar.config.ploterrors_matrix(model index, data index)
.
In this section identifiers can be specified which can subsequently be used in model conditions. They will be iteratively substituted in the expressions in the CONDITIONS sections.
SUBSTITUTIONS
mySteadyState "kpro/myDegradation"
myDegradation "2*kdeg"
...
- The 1st argument specifies a unique identifier, which may not have the same name as any model parameter
- The 2nd argument specifies a mathematical expression that will replace located instances of the identifier. Note that these will be evaluated repeatedly until there are no further substitutions. Hence, in this example
mySteadyState
will equate to0.5 * kpro / kdeg
. Cyclic substitutions are not allowed.
In this section conditions in term of the model parameters can be specified. All expression that are not uniquely defined above as either compartments, states or inputs will be treated as free parameters.
CONDITIONS
kD "koff/kon"
init_Epo_EpoR "0"
STAT5ActJAK2 "STAT5ActJAK2/init_EpoRJAK2"
...
- The 1st argument specifies a target parameter that occurred in the model definition above.
- The 2nd argument specifies a mathematical expression that will replace the target parameter in the model (here
"koff/kon"
,"0"
and"STAT5ActJAK2/init_EpoRJAK2"
). This can be any mathematical expression of composed of new and old parameters. Take care not to implement recursive conditions. All replacements are executed once and sequentially, so order matters.
In this optional section the default settings for parameter can be specified. Please note that the intension of this section is to save final parameter values for documentation purpose. During work in progress parameter values should be manipulate "soft" in the MATLAB workspace variables.
PARAMETERS
#ar.pExternLabels ar.pExtern ar.qFitExtern ar.qLog10Extern ar.lbExtern ar.ubExtern
init_PX 0 1 0 0 1000
...
In some cases, you may want to experiment with including different aspects in a single model. It is trivial to set up multiple .def
files, which in some cases may be an appropriate approach. Sometimes however, it can be nice to maintain a single model file and selectively comment / uncomment sections of the model. For this, a pre-processor was included in the model parser. The preprocessor can be used to incorporate basic compile time logic by defining or un-defining labels using #define and #undefine. Code blocks can then be bracketed by #ifdef, #else, #endif statements in order to selectively activate/deactivate specific parts of the model. An example use:
#define HAS_HILL
#ifdef HAS_HILL
// This code will be evaluated
#else
// This code will not be evaluated
#end
#undefine HAS_HILL
#ifdef HAS_HILL
// This code will not be evaluated
#else
// This code will be evaluated
#end
#ifndef HAS_HILL
// This code will be evaluated
#else
// This code will not be evaluated
#end
//#define TEST
#ifdef TEST
// This code will not be evaluated
#end
Similar to the model definition file experiments have data definition files that contain experiment specific information. A data definition file is a plain text file with the ending .def
located in the subfolder /Data
of the current MATLAB working directory. Each data set, stored in the subfolder /Data
as .xls
or .csv
file, must have its own data definition file with the same name but the suffix .def
. The data definition files are structured in various section:
DESCRIPTION
...
PREDICTOR (or PREDICTOR-DOSERESPONSE)
...
INPUTS
...
OBSERVABLES
...
ERRORS
...
SUBSTITUTIONS (optional)
...
CONDITIONS
...
Sections DESCRIPTION
and INVARIANTS
are same as in the model definition file, see 1.1 and 1.7. The remaining sections will be explained in the following. As in the model definition file lines in the data definition file can be commented out using //
.
Is used to define the independent variable / the predictor. For the data definition file there are two modes available, PREDICTOR
and PREDICTOR-DOSERESPONSE
. For using the PREDICTOR
mode see description in Section 1.2, the independent variable in this case is usually time. In the PREDICTOR-DOSERESPONSE
mode one can use an existing parameter
as additional independent variable
PREDICTOR-DOSERESPONSE parameter
t T "min" "time" 0 100
In this mode, there has to be a column named parameter
in the data sheet, besides the first column that encodes the standard independent variable (here time t
). Please note that formally this alternative mode does not change the results, however the plot of the model fit to the data will be modified. For instance if parameter
corresponds to the concentration of a certain input, the result will the a dose response plot.
Each experiment can have a different input function, e.g. instance implementing a different treatment scheme. Therefore, in the data definition file the input can be redefined for this specific experiment.
INPUTS
Epo "10 + sin(k1*t)"
SAv "1 + 0.1*t"
...
- The 1st argument corresponds to an unique identifiers as given in the model definition file (here
Epo
andSAv
). - The 2nd argument specify the mathematical expression of the input function (here
"10 + sin(k1*t)"
and"1 + 0.1*t"
), see 1.5 for more detailed explanation.
At this point of the model definition files one can specify observables and their corresponding error model that override the default specifications from the model definition file (see Sections 1.8 and 1.9).
Like in the model definition file, see 1.10 for detailed explanation, experiment specific substitutions can be specified.
Like in the model definition file, see 1.11 for detailed explanation, experiment specific conditions can be specified.
CONDITIONS
init_Epo_EpoR "10"
...
Please note that these "data" conditions are applied to the model after the "model" conditions have been applied.
Like in the model definition file, see 1.12 for detailed explanation, the default settings for parameter can be specified also on the level of the data definition file, i.e. for the additional parameter that might be introduced there.
- Installation and system requirements
- Setting up models
- First steps
- Advanced events and pre-equilibration
- Computation of integration-based prediction bands
- How is the architecture of the code and the most important commands?
- What are the most important fields of the global variable ar?
- What are the most important functions?
- Optimization algorithms available in the d2d-framework
- Objective function, likelhood and chi-square in the d2d framework
- How to set up priors?
- How to set up steady state constraints?
- How do I restart the solver upon a step input?
- How to deal with integrator tolerances?
- How to implement a bolus injection?
- How to implement washing and an injection?
- How to implement a moment ODE model?
- How to run PLE calculations on a Cluster?