The core framework parses microbial physiology data and organizes it into an object-oriented, database-driven hierarchy. The data schema is based on the workflow of designing experiments and the associated data analysis process.
This is an overview of the models:
| Model | Function |
|---|---|
| TrialIdentifier | Describes a trial (time, analyte, strain, media, etc.) |
| AnalyteData | Time, data points and vectors for quantified data (g/L product, OD, etc.) |
| SingleTrial | All analytes for a given unit (e.g. a tube, well on plate, bioreactor, etc.) |
| ReplicateTrial | Contains a set of SingleTrials with replicates grouped to calculate statistics |
| Experiment | All of the trials performed on a given date |
| Project | Groups of experiments with overall goals |
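To make the nesting in the table concrete, here is a minimal sketch in plain Python of how single trials might be grouped into a replicate trial to compute statistics. The class names `SingleTrialSketch` and `ReplicateTrialSketch`, the `od600` field, and the readings are all made up for illustration; they are not the impact classes or real data, and the framework's own implementation may differ.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev


# Hypothetical stand-ins for illustration only; these are NOT the impact classes.
@dataclass
class SingleTrialSketch:
    replicate_id: int
    od600: list  # one (made-up) biomass reading per time point


@dataclass
class ReplicateTrialSketch:
    single_trials: list = field(default_factory=list)

    def mean_and_std(self):
        """Return per-time-point mean and sample standard deviation across replicates."""
        # Transpose so that each column holds every replicate's reading at one time point.
        columns = list(zip(*(trial.od600 for trial in self.single_trials)))
        return [mean(c) for c in columns], [stdev(c) for c in columns]


replicates = ReplicateTrialSketch([
    SingleTrialSketch(1, [0.05, 0.10, 0.21]),
    SingleTrialSketch(2, [0.05, 0.12, 0.19]),
    SingleTrialSketch(3, [0.05, 0.11, 0.20]),
])
avg, std = replicates.mean_and_std()
print(avg)
print(std)
```

An Experiment or Project would then simply hold collections of such replicate groups, one level further up the hierarchy.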
TrialIdentifier class
Before importing any data, a description should be generated. Data is described based on the strain (organism, plasmids, etc.), the media (salts, carbon sources, nitrogen sources, etc.) and the environment (temperature, labware, shaking speed, etc.). A trial identifier is everything required to uniquely identify a point of data.
The three fundamental units of a trial identifier are the Strain,
Media and Environment classes.
| Model | Function |
|---|---|
| Strain | Describes the organism being characterized (e.g. strain, knockouts, plasmids, etc.) |
| Media | Describes the medium used to characterize the organism (e.g. M9 + 0.02% glc_D) |
| Environment | The conditions and labware used (e.g. 96-well plate, 250RPM, 37C) |
We will build a trial identifier with all its components and use it to describe some data.
In [12]:
import impact as impt
from importlib import reload
reload(impt)
strain = impt.Strain()
strain.name = 'LMSE001'
strain.plasmids.append(impt.Plasmid(name='pTrc99a'))
print(strain)
media = impt.Media()
media.add_component('IPTG',concentration= 20,unit='ng/mL')
print(media)
env = impt.Environment(labware=impt.Labware(name='96MTP'),
shaking_speed = 250,
temperature = 37)
print(env)
LMSE001+pTrc99a
20g/L IPTG
96MTP 250RPM 37C
With these fundamental units, we can construct a trial identifier.
In [13]:
ti = impt.TimeCourseIdentifier(strain=strain,media=media,environment=env)
print(ti)
strain: LMSE001+pTrc99a, media: 20g/L IPTG, env: 96MTP 250RPM 37C, analyte: None, rep: -1
We see the strain, media and environment set and some empty values for analyte and replicate. Let’s fill in the missing values required to fully describe the analyte.
In [14]:
ti.analyte_name = 'glc__D'
ti.analyte_type = 'substrate'
ti.replicate_id = 1
print(ti)
strain: LMSE001+pTrc99a, media: 20g/L IPTG, env: 96MTP 250RPM 37C, analyte: glc__D, rep: 1
We can now use this trial identifier to build objects with experimental data.
AnalyteData
TimePoint and TimeCourse
Time points are rarely used directly, but are included in order to flatten data into a relational database. They can either be created and added individually, or a time vector and data vector can be provided and the associated time points will be generated automatically.
In [15]:
substrate = impt.Substrate()
# Add each time point individually
for t, data in zip([0,1,2,3,4,5],[0,1,2,3,4,5]):
    tp = impt.TimePoint(trial_identifier=ti,time=t,data=data)
    substrate.add_timepoint(tp)
# or, add the vectors
substrate = impt.Substrate(trial_identifier=ti,time_vector=[0,1,2,3,4,5],data_vector=[0,1,2,3,4,5])
print(substrate.pd_series)
0 0
1 1
2 2
3 3
4 4
5 5
dtype: int64
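The idea of flattening a time course into individual time points for a relational database can be sketched with the standard library alone. This is only an illustration under stated assumptions: `TimePointSketch`, `expand`, and the `time_point` table layout are hypothetical names chosen here, not the impact classes or the framework's actual schema.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical stand-in for illustration; this is NOT impt.TimePoint.
@dataclass
class TimePointSketch:
    analyte_name: str
    time: float
    data: float


def expand(analyte_name, time_vector, data_vector):
    """Turn paired time and data vectors into one record per time point."""
    if len(time_vector) != len(data_vector):
        raise ValueError("time_vector and data_vector must have the same length")
    return [TimePointSketch(analyte_name, t, d)
            for t, d in zip(time_vector, data_vector)]


points = expand('glc__D', [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5])

# Each time point becomes one row, which is what makes the data relational.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE time_point (analyte_name TEXT, time REAL, data REAL)')
conn.executemany('INSERT INTO time_point VALUES (?, ?, ?)',
                 [(p.analyte_name, p.time, p.data) for p in points])
rows = conn.execute('SELECT time, data FROM time_point ORDER BY time').fetchall()
print(rows)
conn.close()
```

Storing one row per time point keeps the table schema fixed regardless of how many samples a trial contains, while the vector form remains more convenient for analysis.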
Here we instantiated a Substrate object because we are dealing with a
substrate. This differentiation allows impact to choose the appropriate
model for the data, as well as calculate features. Any time series data
can be imported as an impt.TimeCourse, but additional details can be
extracted if a specific data type is chosen.
| Analyte type | Function |
|---|---|
| Substrate | An analyte which is consumed |
| Product | An analyte which is produced |
| Biomass | A measurement of the biomass concentration |
| Reporter | A reporter, such as fluorescence from gfp or mCherry |
In [25]:
import numpy as np
import impact.plotting as implot
def exp_growth(t):
    X0 = 0.05
    mu = 0.1
    return X0 * np.exp(mu*t)
# def production(t, product_yield, biomass_concentration):
# rate =
x = np.linspace(0,20,20)
y = exp_growth(x)
implot.plot([implot.go.Scatter(x=x,y=y)])
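As an example of the kind of feature that could be extracted from a biomass time course, the sketch below recovers the growth rate from the synthetic exponential curve above by fitting a straight line to ln(X) against t. It is written in plain Python rather than with impact, and `fit_growth_rate` is a hypothetical helper introduced here; it is not necessarily how the framework calculates its features.

```python
import math


def exp_growth(t, x0=0.05, mu=0.1):
    """Same synthetic model as above: X(t) = X0 * exp(mu * t)."""
    return x0 * math.exp(mu * t)


def fit_growth_rate(times, biomass):
    """Least-squares fit of ln(X) = ln(X0) + mu * t; returns (x0, mu).

    Hypothetical helper for illustration, not part of impact.
    """
    logs = [math.log(x) for x in biomass]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    mu = sxy / sxx                        # slope of the log-linear fit
    x0 = math.exp(l_mean - mu * t_mean)   # intercept, transformed back
    return x0, mu


times = [20 * i / 19 for i in range(20)]  # same grid as np.linspace(0, 20, 20)
biomass = [exp_growth(t) for t in times]
x0_fit, mu_fit = fit_growth_rate(times, biomass)
print(round(x0_fit, 4), round(mu_fit, 4))
```

Because the synthetic data is noise-free, the fit returns the X0 and mu used to generate it; with measured data the log-linear fit would only be appropriate over the exponential phase.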
SingleTrial
ReplicateTrial
Experiment
Project