Simulating E. coli Using Tinkercell

Computer-aided design is opening a door for scientists and aspiring scientists worldwide.

Heya Desai
9 min read · Jan 25, 2021

*In this guide, I go over some of the basic concepts underlying the topics discussed in this article.*

The development of synthetic biology software tools has opened a new door for everyone from researchers to biotech enthusiasts. We now have the ability to simulate and design life right on our computers.

These new frameworks are absolutely game-changing for everyone in the biological engineering space. They have removed many of the constraints posed by the lack of access to laboratory equipment and physical tools at all times.

Whether we are testing a hypothesis or experimenting with and analyzing existing or even new biological systems, computer-aided design (more commonly known as CAD) is making these strenuous processes more effective. The design and programming of genetic circuits even has the potential to become almost automated.

For us learners, when we truly understand the underlying mechanisms that explain how organisms work, and how they can be manipulated, we can solve problems regarding complex cell behavior in more systematic ways. We can then bring in engineering approaches and pre-existing design tools to develop the resources that allow us to work with and reprogram elements of nature.


If you aren’t familiar with genetic or synthetic biological circuits, you might be having some not-so-pleasant flashbacks to the electricity unit in your high school science class. However, I’m not talking about traditional electrical circuits, but about ones that operate in eukaryotic and prokaryotic cells.

But before we dive in, let’s get an idea of what exactly these systems are.

What are Genetic Circuits?

Rather than being made up of electrical components, genetic circuits are composed of gene expression parts, encoded in DNA and expressed as RNA or protein, that work together to facilitate the interaction and response between cells. This allows them to carry out defined and novel functions, which contribute towards establishing more and more feasible applications of synthetic biology. These range from smart therapeutics to biomining, engineered crops, and commensal soil organisms.

Modelling the Lac Operon in E. coli

Before going through with the design, build, and test cycle and designing synthetic circuits, we need to learn.

Molecular cloning, the set of techniques we use to study, modify, and even introduce genes into new hosts, has E. coli to thank for its advancement and widespread use. The bacterium has become the preferred host for introduced DNA sequences and their protein products since it’s easy to culture in labs. Additionally, due to its rapid growth and ability to express proteins, it’s an ideal host for protein production as well.


So, I decided to simulate the lac operon present in E. coli. One of the characteristics that have made the organism relatively cheap to grow is that it doesn’t have specific feeding requirements. It’s able to obtain usable energy from a variety of sources, one of them being lactose, when glucose isn’t available.

Aside from being easy to grow in a lab, it’s recently been used for several experimental and research purposes because of its reliable reporter system, which can track and monitor the products made by the lacZ gene.

https://www.livescience.com/64436-e-coli.html

For my project, I used Tinkercell, a CAD tool that integrates both a programming API and a visual interface. But before I explain the simulation, we must go through some key concepts, starting with what an operon is.

Operons in Bacteria

An operon is a group of genes with related functions that are controlled by a single promoter and transcribed as a single mRNA. How much of an operon is transcribed (transcription being the process of copying a gene’s DNA to make an RNA molecule) is controlled by regulatory DNA sequences, which act as binding sites for regulatory proteins.


These regulatory proteins are called transcription factors. They are encoded in the bacterium’s genome and either promote or prevent transcription, turning genes “on” or “off”.

  • RNA polymerase is the main enzyme for transcription. When this enzyme binds to a promoter sequence, transcription begins.
  • Activators are transcription factors that INCREASE a gene’s transcription. They help RNA polymerase bind to the promoter.
  • Repressors are transcription factors that DECREASE a gene’s transcription. They serve as roadblocks so that the RNA polymerase does not bind to the promoter.
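
To make the effect of activators and repressors concrete, here is a minimal sketch that uses Hill functions, a common textbook way to describe how a transcription factor changes a gene's transcription rate. The function names and the parameters (`max_rate`, `k`, `n`) are illustrative assumptions, not values from my Tinkercell model.

```python
def activated_rate(activator, max_rate=1.0, k=1.0, n=2):
    """Transcription rate that rises as the activator concentration rises."""
    return max_rate * activator**n / (k**n + activator**n)


def repressed_rate(repressor, max_rate=1.0, k=1.0, n=2):
    """Transcription rate that falls as the repressor concentration rises."""
    return max_rate * k**n / (k**n + repressor**n)


# Compare both rates at a few (arbitrary) transcription factor levels.
for level in (0.0, 1.0, 10.0):
    print(level, round(activated_rate(level), 3), round(repressed_rate(level), 3))
```

With more activator the rate climbs towards `max_rate`, and with more repressor it drops towards zero, which is the "on or off" behavior described above in a smoother, graded form.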

Regulation of Gene Expression

Transcription is the first level of control and plays a significant role in how gene expression is regulated since it restricts how much mRNA is produced. Gene expression itself is the production of RNA and proteins when a gene is turned on. However, it is crucial that this process is regulated because expressing every gene continuously would use a significant amount of energy.

We can use the analogy of a bathtub to explain this. If we leave the faucet running hot water at all times, along with the other implications of this, it will use a large amount of energy since water is heated with natural gas or a heat pump. In addition, we would need a large amount of space so that the water doesn’t overflow.

Similarly, when DNA is transcribed, it is unwound and hence occupies more space. So if genes were being expressed constantly, our cells would have to be much bigger.

A Bistable System — The Lac Operon

The lac operon is inducible and has all of the previously mentioned characteristics of an operon, and is only present in prokaryotic cells. In this specific operon, the genes encode proteins that let the bacterium use lactose as an energy source.

Humans’ main source of energy comes from carbohydrates, which are converted to glucose. While we don’t often compare E. coli to humans, glucose is also its preferred energy source. It’s an energy-efficient fuel since it requires a significantly lower amount of energy to break down compared to other sugars.

But what happens when glucose isn’t available? E. coli resorts to the next best meal, lactose, provided that it’s available. It’s important to note that glucose must be unavailable for RNA polymerase to bind efficiently to the lac operon promoter.

Under these circumstances, levels of cAMP increase. cAMP is a signaling molecule in E. coli that binds to CAP, the catabolite activator protein. CAP then binds to DNA and helps RNA polymerase bind to the lac operon promoter, initiating transcription.
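
The two conditions above (lactose present, glucose absent) can be sketched as a small truth table. This is a simplified textbook view of the lac operon rather than anything taken from Tinkercell; the function name and the three output labels are assumptions made for illustration.

```python
def lac_operon_state(glucose_present: bool, lactose_present: bool) -> str:
    """Return a coarse expression level for the lac operon."""
    # Textbook simplification: without lactose, the repressor stays bound
    # to its binding site and blocks RNA polymerase.
    repressor_bound = not lactose_present
    # Low glucose -> high cAMP -> CAP binds DNA and helps RNA polymerase.
    cap_active = not glucose_present

    if repressor_bound:
        return "off"
    return "high" if cap_active else "low"


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose}, lactose={lactose}: {lac_operon_state(glucose, lactose)}")
```

Only the combination of no glucose and available lactose gives strong expression, which matches the description of E. coli switching to its "next best meal".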


lacZ, lacY, and lacA are the three genes in the lac operon. They encode proteins that help E. coli use lactose. The gene you will see in the simulation below is lacZ; it encodes β-galactosidase, an enzyme that catalyzes the breakdown of lactose into two monosaccharides. This is where the grade 9 chemistry skills come into play. Lactose is a disaccharide, so β-galactosidase (the bacterial counterpart of the lactase enzyme) splits it into glucose and galactose. As a result, the sugars become usable sources of energy for the E. coli.

Now we can take a look at the first component of my simulation to get an understanding of how the lac operon looks once assembled in Tinkercell:

I’ve included additional subtitles in green to display which processes are taking place in the system. Here is a guide explaining each of the biological regulations and reactions that take place in this operon.

Biological Parts in the Lac Operon

Each biological part (a sequence of DNA) encodes a biological function, and parts are assembled to make devices for living cells like E. coli. In a sense, they can be compared to cells, since they are single functional units that cannot be divided any further.

Think about a recipe. The more ingredients you have, the more complicated it becomes. In the same way, basic parts (2 or more) assembled together create complex composite parts.

  • Promoter: DNA sequence where transcription of a gene by RNA polymerase initiates.
  • Repressor Binding Site: DNA sequence involved in repression, that is, inhibiting or decreasing gene expression.
  • Ribosome Binding Sequence: Sequence in mRNA that ribosomes can bind to and subsequently begin translation (protein synthesis).
  • Protein Coding Sequence: Encodes the amino acid sequence for the protein of interest, such as an enzyme or repressor.
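
One way to picture basic parts being assembled into a composite part is as a small data structure. The sketch below is only an illustration: the `Part` class, the `assemble` helper, and the part names are hypothetical and are not Tinkercell's API or entries from a parts registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    kind: str  # "promoter", "repressor_binding_site", "rbs", or "cds"


def assemble(*parts):
    """Join two or more basic parts, in order, into a composite part."""
    if len(parts) < 2:
        raise ValueError("a composite part needs at least two basic parts")
    return tuple(parts)


# Hypothetical names for the parts of a lacZ expression device.
lac_device = assemble(
    Part("lac promoter", "promoter"),
    Part("lac operator", "repressor_binding_site"),
    Part("lacZ RBS", "rbs"),
    Part("lacZ", "cds"),
)
print(" -> ".join(part.kind for part in lac_device))
```

Keeping each part immutable (`frozen=True`) mirrors the idea that a basic part is a single unit that is reused, not modified, when building larger devices.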

Once we learn the role each part plays in the system, recognizing that it is bistable helps us understand how the initial and final lactose values in E. coli come to be. Many experimental studies have analyzed the bistability of the lac operon to determine, among other things, the effect of lactose metabolism.

It’s been well established that if the intracellular level of lactose starts low, it remains low, and if there is a sufficient level at the beginning, lactose will upregulate its own intake. This study, submitted to Frontiers in Systems Biology, found that this positive feedback loop is responsible for the system’s bistability.
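
Bistability from positive feedback can be illustrated with a toy model: a single variable `x` (standing in for intracellular lactose) that promotes its own uptake and is also diluted or consumed. The equation and every parameter value below are assumptions chosen to make the two stable states visible; they are not taken from the study or from my Tinkercell model.

```python
def simulate(x0, basal=0.05, beta=1.0, k=0.5, n=4, gamma=1.0,
             dt=0.01, steps=10_000):
    """Euler integration of dx/dt = basal + beta*x^n/(k^n + x^n) - gamma*x."""
    x = x0
    for _ in range(steps):
        uptake = basal + beta * x**n / (k**n + x**n)  # positive feedback term
        x += dt * (uptake - gamma * x)                # uptake minus loss
    return x


low = simulate(0.1)   # starts below the threshold
high = simulate(0.8)  # starts above the threshold
print(f"start 0.1 -> settles near {low:.2f}")
print(f"start 0.8 -> settles near {high:.2f}")
```

The same equation with the same parameters ends up in two different steady states depending only on where it starts, which is what "bistable" means here.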

Using Tinkercell, I was also able to generate various simulations, both deterministic and stochastic. A model is classified as one or the other depending on the level of approximation introduced in the simulation.

cel1 represents E. coli

This is an example of a deterministic simulation that can be produced using the corresponding ‘Parameters’ and ‘Initial Values’. It resembles a typical graph many of us are used to creating in a digital spreadsheet. While we don’t know the output in advance, we know the parameter values and initial conditions. Hence there is no degree of randomness. The same output is always produced when the same set of parameters is chosen.
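
As a rough idea of what a deterministic simulation does under the hood, here is a minimal sketch, assuming a simple production-and-decay model for one molecule (for example an mRNA). The function name, the rate constants, and the Euler method are illustrative choices, not what Tinkercell necessarily uses.

```python
def deterministic_run(k_make=2.0, k_decay=0.1, m0=0.0, dt=0.01, steps=10_000):
    """Euler integration of dm/dt = k_make - k_decay * m."""
    m = m0
    trajectory = [m]
    for _ in range(steps):
        m += dt * (k_make - k_decay * m)
        trajectory.append(m)
    return trajectory


first = deterministic_run()
second = deterministic_run()
print("identical runs:", first == second)
print("final level:", round(first[-1], 2), "steady state:", 2.0 / 0.1)
```

Running it twice with the same parameters and initial value gives exactly the same curve, approaching the steady state `k_make / k_decay`.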

cel1 represents E. coli

This is where it gets a bit more fascinating. Stochastic simulations are methods to account for the heterogeneity in biological systems that was previously overlooked when computational tools were not as widely available. Biological systems in nature are discrete and random, and since stochastic models can convey this through the simulations, they can describe small systems more faithfully than deterministic models, usually at a higher computational cost.
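
For comparison, here is a minimal sketch of a stochastic simulation using the Gillespie algorithm, a standard method for simulating discrete, random reaction events. It assumes a simple production-and-decay model with made-up rate constants; the function name and parameters are illustrative and not taken from Tinkercell.

```python
import random


def gillespie_run(seed, k_make=2.0, k_decay=0.1, t_end=100.0):
    """Simulate single production/decay events; return the count at t_end."""
    rng = random.Random(seed)
    t, m = 0.0, 0
    while True:
        rate_make = k_make
        rate_decay = k_decay * m
        total = rate_make + rate_decay
        t += rng.expovariate(total)          # random waiting time to next event
        if t > t_end:
            return m
        if rng.random() * total < rate_make:
            m += 1                           # one molecule produced
        else:
            m -= 1                           # one molecule degraded


finals = [gillespie_run(seed) for seed in range(200)]
print("a few individual runs:", finals[:5])
print("average over 200 runs:", sum(finals) / len(finals))
```

Each run (each simulated cell) ends at a different whole number of molecules, while the average over many runs stays close to the deterministic steady state of `k_make / k_decay = 20`.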


The Importance of Mathematical Modelling

Mathematical modelling plays a crucial role in the efficient and rational design of robust and complex synthetic biology systems since it serves as a formal link between the conception and physical realisation of a biological circuit.

— Synthetic Biology — A primer

Science is ever-changing and evolving, and in that regard, synthetic biology is a relatively new discipline. The more we discover, the more we’ll have to learn.

The Engineering Design Cycle


We can use one or several of the hundreds of CAD tools available for in silico (on a computer) work to design, build, analyze, evaluate, and improve our models of redesigned biological systems. Curated and standardized databases of composable models, as well as the Registry of Standard Biological Parts, are amazing resources to consider when you’re ready to make your own mathematical models. I’d also recommend reading Synthetic Biology — A primer throughout this journey.

Rather than just displaying unfamiliar data, simulations and models can be used to find the answers to the questions we have about our natural world.

After all, the world we want to live in and the biological systems we want to design will be created by the questions we ask.

It may take several months or several years, but the possibilities that come from the hard work are endless. With the assistance of computational tools, a future where E. coli senses and destroys cancer cells, cells are programmed as factories to produce protein therapeutics, or commodity chemicals are produced from renewable biomass is more achievable.

Which ideas are you going to turn into reality?

Thank you for reading! If you enjoyed my article, subscribe to my monthly newsletter to keep up with my progress, find out when I create similar content, and receive insight into everything I’ve been up to! In the meantime, let’s connect on LinkedIn.

Make sure to check out my YouTube video where I dive deeper into CAD itself!
