Predict gene knockout strategies

cameo offers two ways of predicting knockout targets: evolutionary algorithms (OptGene) or mixed-integer linear programming (OptKnock).

If you’re running this notebook on try.cameo.bio, things might run very slowly because we cannot provide access to the proprietary CPLEX solver on a public webserver. Furthermore, Jupyter kernels might crash and restart due to memory limitations on the server.

from cameo import models
# Load the E. coli iJO1366 model and record the wild-type growth and acetate fluxes.
model = models.bigg.iJO1366
wt_solution = model.solve()
growth = wt_solution.fluxes["BIOMASS_Ec_iJO1366_core_53p95M"]
acetate_production = wt_solution.fluxes["EX_ac_e"]
from cameo import phenotypic_phase_plane
# Plot acetate production against growth and mark the wild-type point.
p = phenotypic_phase_plane(model, variables=['BIOMASS_Ec_iJO1366_core_53p95M'], objective='EX_ac_e')
p.plot(points=[(growth, acetate_production)])

OptGene

OptGene is an approach to search for gene or reaction knockouts that relies on evolutionary algorithms[1]. The following image from the authors summarizes the OptGene workflow.

In every iteration we keep the 50 best individuals, so a run produces a library of targets rather than a single solution.
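To make the idea of keeping the best individuals concrete, here is a minimal, self-contained sketch of an evolutionary search over knockout sets. It is not cameo's implementation: the function `evolve_knockouts`, the reaction names `R1` to `R6`, the scores in `TOY_SCORES`, and the mutation scheme are all hypothetical and chosen only for illustration. In OptGene the fitness would come from simulating the knockout strain, not from a lookup table.

```python
import random


def evolve_knockouts(candidates, fitness, max_size=3, population=20,
                     generations=30, archive_size=50, seed=0):
    """Toy evolutionary search over knockout sets (not cameo's implementation)."""
    rng = random.Random(seed)

    def random_individual():
        return frozenset(rng.sample(candidates, rng.randint(1, max_size)))

    def mutate(individual):
        genes = set(individual)
        # Drop one knockout if the set is full, or with probability 0.5 otherwise.
        if len(genes) >= max_size or rng.random() < 0.5:
            genes.remove(rng.choice(sorted(genes)))
        # Add one candidate; drawing an existing member leaves the set unchanged.
        genes.add(rng.choice(candidates))
        return frozenset(genes)

    archive = {}  # knockout set -> fitness
    best = []
    current = [random_individual() for _ in range(population)]
    for _ in range(generations):
        for individual in current:
            archive[individual] = fitness(individual)
        # Keep only the best `archive_size` individuals seen so far.
        best = sorted(archive.items(), key=lambda item: item[1],
                      reverse=True)[:archive_size]
        archive = dict(best)
        # Breed the next generation by mutating members of the archive.
        parents = [individual for individual, _ in best]
        current = [mutate(rng.choice(parents)) for _ in range(population)]
    return best


# Hypothetical per-reaction scores standing in for a simulated phenotype.
TOY_SCORES = {"R1": 0.1, "R2": -0.3, "R3": 0.6, "R4": 0.0, "R5": 0.4, "R6": -0.1}


def toy_fitness(knockouts):
    # Reward high-scoring knockouts, with a small penalty per knockout.
    return sum(TOY_SCORES[r] for r in knockouts) - 0.05 * len(knockouts)


library = evolve_knockouts(sorted(TOY_SCORES), toy_fitness)
for knockouts, score in library[:3]:
    print(sorted(knockouts), round(score, 2))
```

The returned list plays the role of the target library: it is sorted by fitness and never holds more than `archive_size` entries.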

from cameo.strain_design.heuristic.evolutionary_based import OptGene
optgene = OptGene(model)
result = optgene.run(target="EX_ac_e",
                     biomass="BIOMASS_Ec_iJO1366_core_53p95M",
                     substrate="glc__D_e",
                     max_evaluations=5000,
                     plot=False)
Starting optimization at Fri, 17 Jun 2016 15:01:57
Finished after 00:01:48
result

OptGene Result

  • Simulation: fba
  • Objective Function: $$bpcy = \frac{(BIOMASS\_Ec\_iJO1366\_core\_53p95M * EX\_ac\_e)}{EX\_glc\_\_D\_e}$$
  | reactions | genes | size | fva_min | fva_max | target_flux | biomass_flux | yield | fitness
0 | (UM4PL, PGCD, CPH4S, UM3PL, HEPT4, ATPS4rpp) | ((b4233, b2913, b2765, b3738, b3623),) | 5.0 | 0.0 | 14.976296 | 13.006011 | 0.388364 | 1.300601 | 0.505107
1 | (PGCD, ARBabcpp, ACNAMt2pp, ATPS4rpp) | ((b1900, b2913, b1033, b3738, b3224),) | 5.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
2 | (PGCD, ATPS4rpp, ALAt4pp, GLYt4pp) | ((b2913, b0007, b1612, b1033, b3738), (b2913, ... | 5.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
3 | (HPYRI, PGCD, MALDDH, QUINDH, ATPS4rpp) | ((b1800, b0508, b2913, b3738, b1692),) | 5.0 | 0.0 | 14.976296 | 14.471909 | 0.388364 | 1.447191 | 0.562037
4 | (PGCD, QUINDH, ATPS4rpp) | ((b4024, b2913, b1033, b3738, b1692),) | 5.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
5 | (ALAt4pp, PGCD, GLYt4pp, QUINDH, ATPS4rpp) | ((b4226, b2913, b0007, b3738, b1692),) | 5.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
6 | (PGCD, DDGLK, CPH4S, QUINDH, ATPS4rpp) | ((b3738, b2913, b2765, b3526, b1692),) | 5.0 | 0.0 | 14.976296 | 14.471909 | 0.388364 | 1.447191 | 0.562037
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
41 | (PGCD, CINNDO, ATPS4rpp, PPPNDO, ARGDCpp) | ((b2913, b1033, b2938, b4226, b2542, b3738, b2... | 7.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
42 | (PGCD, SUCptspp, EDA, ACMUMptspp, AGMt2pp, ATP... | ((b2913, b0007, b2429, b3738, b1850, b0433, b1... | 7.0 | 0.0 | 14.976296 | 14.597554 | 0.388364 | 1.459755 | 0.566916
43 | (FUCtpp, PGCD, G6PDA, QUINDH, ATPS4rpp, NO3R1bpp) | ((b2913, b1033, b2204, b1692, b3738, b0678, b2... | 7.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
44 | (HKNDDH, CITL, PGCD, FRD3, FRD2, SUCptspp, ACM... | ((b4154, b2913, b2429, b3738, b0614, b3517, b0... | 7.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
45 | (PGCD, HCYSMT2, QUINDH, HCYSMT, ATPS4rpp, ALAt... | ((b2913, b0007, b1033, b1692, b3738, b0261, b2... | 7.0 | 0.0 | 14.976296 | 14.304026 | 0.388364 | 1.430403 | 0.555517
46 | (PGCD, FRD3, FRD2, SUCptspp, ACMUMptspp, HKNTD... | ((b4154, b2913, b2429, b3738, b0614, b1101, b0... | 7.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833
47 | (PGCD, QUINDH, ATPS4rpp) | ((b2913, b1033, b4226, b1692, b3738, b4244, b3... | 8.0 | 0.0 | 14.976296 | 14.801391 | 0.388364 | 1.480139 | 0.574833

48 rows × 9 columns
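The fitness column can be checked by hand against the bpcy objective shown above. The sketch below assumes a glucose uptake rate of 10, which is not stated in the output but is implied by the table, since yield equals target_flux divided by 10 in every row. The helper `bpcy` is a hypothetical stand-in written for this check, not cameo's own function, and it takes the uptake as a positive magnitude.

```python
def bpcy(biomass_flux, product_flux, substrate_uptake):
    """Biomass-product coupled yield: (biomass * product) / substrate uptake."""
    if substrate_uptake <= 0:
        raise ValueError("substrate uptake must be a positive magnitude")
    return biomass_flux * product_flux / substrate_uptake


# Assumption: uptake magnitude inferred from yield = target_flux / uptake.
GLUCOSE_UPTAKE = 10.0

# biomass_flux and target_flux taken from rows 0 and 1 of the result table.
row0 = bpcy(0.388364, 13.006011, GLUCOSE_UPTAKE)
row1 = bpcy(0.388364, 14.801391, GLUCOSE_UPTAKE)
print(round(row0, 6), round(row1, 6))
```

Under that assumption, both values agree with the fitness column for rows 0 and 1 up to rounding of the displayed fluxes.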

result.plot(0)
result.display_on_map(0, "iJO1366.Central metabolism")

OptKnock

OptKnock uses a bi-level mixed integer linear programming approach to identify reaction knockouts[2]:

\[
\begin{aligned}
\underset{y_j}{\text{maximize}} \quad & v_{chemical} && (\mathbf{OptKnock}) \\
\text{subject to} \quad & \underset{v_j}{\text{maximize}} \quad v_{biomass} && (\mathbf{Primal}) \\
& \text{subject to} \quad
\begin{bmatrix}
\sum_{j=1}^{M} S_{ij} v_j = 0, \\
v_{carbon\_uptake} = v_{carbon\_target}, \\
v_{atp} \ge v_{atp\_main}, \\
v_{biomass} \ge v_{target\_biomass}, \\
v_j^{min} \cdot y_j \le v_j \le v_j^{max} \cdot y_j, \quad \forall j \in \boldsymbol{M}
\end{bmatrix} \\
& y_j \in \{0, 1\}, \quad \forall j \in \boldsymbol{M} \\
& \sum_{j \in \boldsymbol{M}} (1 - y_j) \le K
\end{aligned}
\]
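The role of the binary variables y_j in the formulation above can be illustrated without a solver. The sketch below only shows how a fixed choice of y_j changes the flux bounds and how the knockout budget K is checked. It does not solve the bi-level problem. The function `apply_knockouts` and the reaction names `R1` to `R3` are hypothetical and are not part of cameo.

```python
def apply_knockouts(bounds, y, max_knockouts):
    """Return flux bounds after applying binary knockout indicators.

    bounds: dict mapping reaction id -> (v_min, v_max)
    y:      dict mapping reaction id -> 1 (active) or 0 (knocked out)
    """
    if any(y[r] not in (0, 1) for r in bounds):
        raise ValueError("every y_j must be 0 or 1")
    # Knockout budget: sum over j of (1 - y_j) must not exceed K.
    n_knockouts = sum(1 - y[r] for r in bounds)
    if n_knockouts > max_knockouts:
        raise ValueError("too many knockouts: %d > %d" % (n_knockouts, max_knockouts))
    # v_min * y_j <= v_j <= v_max * y_j: y_j = 0 pins the flux to zero,
    # y_j = 1 leaves the original bounds untouched.
    return {r: (lo, hi) if y[r] else (0.0, 0.0) for r, (lo, hi) in bounds.items()}


# Hypothetical three-reaction example with one knockout allowed (K = 1).
toy_bounds = {"R1": (-10.0, 10.0), "R2": (0.0, 1000.0), "R3": (0.0, 1000.0)}
toy_y = {"R1": 1, "R2": 0, "R3": 1}
knocked_out = apply_knockouts(toy_bounds, toy_y, max_knockouts=1)
print(knocked_out)
```

In the full method the solver chooses the y_j values itself, which is what makes the problem a mixed-integer program.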
from cameo.strain_design.deterministic.linear_programming import OptKnock
optknock = OptKnock(model, fraction_of_optimum=0.1)

Running OptKnock with multiple knockouts can take hours or even days, so this example allows only a single knockout.

result = optknock.run(max_knockouts=1, target="EX_ac_e", biomass="BIOMASS_Ec_iJO1366_core_53p95M")
result

OptKnock:

  • Target: EX_ac_e
  | reactions | size | EX_ac_e | biomass | fva_min | fva_max
0 | {ATPS4rpp} | 1.0 | 13.942943 | 0.402477 | 0.0 | 14.187817
result.plot(0)
result.display_on_map(0, "iJO1366.Central metabolism")

References

[1] Patil, K. R., Rocha, I., Förster, J., & Nielsen, J. (2005). Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6, 308. doi:10.1186/1471-2105-6-308

[2] Burgard, A. P., Pharkya, P., & Maranas, C. D. (2003). OptKnock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnology and Bioengineering, 84(6), 647-657.