E. coli expression systems involve many variables but are one of the most common and efficient methods for protein expression.
E. coli is by far the most popular protein expression platform. The days when pharmaceutical manufacturers had to refine large quantities of plant or animal tissues and fluids to obtain the desired protein are practically gone. Now, by selecting the proper vector and the proper host, manufacturers can produce large quantities of highly purified protein at a much lower cost. There are, however, a great many considerations that go into finding the right combination of vector and host. A few of the most important are detailed here.
The nature of the overall recombinant protein expression system will affect which strain of E. coli is the most appropriate.
Though there are many options for protein expression (bacteria, yeast, and unicellular algae, to name just a few), by far the most popular is E. coli. This is because under ideal conditions it can double every 20 minutes, and it can be transformed with plasmids in as little as five minutes. Its genome is well understood, and a great deal of technology has been developed to make working with it even more predictable. There are a number of strains of E. coli, each with certain advantages and disadvantages. Most strains, however, have been developed for very specific purposes. When performing initial experiments, the two main strain families worth testing are BL21 and K-12.
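To put the 20-minute doubling time in perspective, the short sketch below computes ideal exponential growth. It assumes unlimited nutrients and no lag phase, which real cultures do not have, and the function name and parameters are purely illustrative.

```python
def cells_after(minutes, start_cells=1.0, doubling_time_min=20.0):
    """Ideal exponential growth: N(t) = N0 * 2 ** (t / doubling_time).

    Assumes constant doubling time (no lag phase, no nutrient limits).
    """
    return start_cells * 2 ** (minutes / doubling_time_min)


# One hour is three doublings of a single cell.
print(cells_after(60))       # 8.0
# Eight hours is 24 doublings.
print(cells_after(8 * 60))   # 16777216.0
```

In practice growth slows well before such numbers are sustained, so the result is best read as an upper bound on what the doubling time alone would allow.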
The BL21 strains carry genetic modifications that make them easy to work with in a protein expression system. Specifically, they lack certain proteases (notably Lon and OmpT) that break down foreign and extracellular proteins, so the desired protein is less likely to be digested before it can be harvested. BL21 strains are by far the most commonly used for protein expression, but derivatives of the K-12 strain are also common, as they have many modifications that stabilize plasmids. Which host strain is right for a given protein depends on how the host interacts with the plasmid itself, and plasmids have many variable regions.
E. coli expression vectors have many variable regions, and the exact composition will depend on the desired protein expression system.
A plasmid vector, used to introduce the desired gene into the host organism for expression, has many variable regions, but four are of essential concern in building a protein expression system: the replicon, the promoter, the selection marker, and the affinity tag (along with the means of removing it).
The replicon comprises the origin of replication and its associated control elements, and the type of replicon a plasmid contains determines its copy number, that is, how many copies of the plasmid a host cell maintains. Replicons of the same or a similar type compete for the same replication machinery within the cell, so when multiple expression vectors are introduced into one host, they should carry replicons of different types. A higher copy number of the desired gene does not necessarily mean higher protein production, however, as plasmid replication and expression place a metabolic burden on the host that may slow its growth.
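The compatibility rule described above can be sketched as a simple lookup. The grouping table here is an illustrative assumption rather than an authoritative reference (pMB1 and ColE1 origins are commonly treated as one group, with p15A and pSC101 as separate ones), and the vector names are hypothetical.

```python
from itertools import combinations

# Illustrative assumption: replicon -> compatibility group label.
REPLICON_GROUP = {
    "pMB1": "ColE1-like",
    "ColE1": "ColE1-like",
    "p15A": "p15A",
    "pSC101": "pSC101",
}


def competing_pairs(vectors):
    """Return pairs of vector names whose replicons share a group.

    `vectors` maps a (hypothetical) vector name to its replicon type.
    """
    clashes = []
    for (name_a, rep_a), (name_b, rep_b) in combinations(vectors.items(), 2):
        if REPLICON_GROUP[rep_a] == REPLICON_GROUP[rep_b]:
            clashes.append((name_a, name_b))
    return clashes


plan = {"vector_A": "pMB1", "vector_B": "ColE1", "vector_C": "p15A"}
print(competing_pairs(plan))  # [('vector_A', 'vector_B')]
```

Any pair reported here would be expected to compete for replication resources; a real design should check the replicon annotations of the actual vectors in use.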
The promoter is a DNA sequence that directs the host to transcribe the desired gene for protein synthesis. One of the most common promoters is the lac promoter, which is induced in the presence of lactose but repressed in the presence of glucose. It is, however, prone to expressing without an inducer present, and this "leakiness" may be unacceptable in certain expression systems. Hybrid promoters have been engineered to minimize these kinds of drawbacks.
Plasmids are not taken up by every host cell, and they are not always passed on when they are, given that they impose an unnecessary metabolic burden on the host. Determining which cells have taken up the vector and which have not would be an involved process without a selection marker. In E. coli expression systems, antibiotic resistance genes are generally used as selection markers in the expression vector. The culture is grown on a selective medium, so only the cells carrying the plasmid are able to grow. Given concerns about antibiotic resistance, a more recent practice is to remove an essential gene from the E. coli genome and place it in the vector, so that only cells retaining the vector survive.
Finally, once the protein is produced, it is necessary to detect it and purify it. An affinity tag solves both of these problems at once: it encodes a protein or short amino acid chain that is fused to the desired protein. The specific tag depends on the desired protein, but generally speaking these tags must eventually be removed through either enzymatic or chemical cleavage, as they can affect protein folding and function.
The correct combination of all these variables can only be found through experimentation.
With so many variable regions in the expression vector, and so many options for a host, the most efficient combination to scale up can only be determined through experimentation. Small-scale screens can be performed in 96-well plates, and robotic high-throughput protocols exist that allow an individual to test as many as 1,000 conditions in a week.
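As a rough illustration of how quickly such a screen grows, the sketch below enumerates every combination of a few factors and counts the 96-well plates required. The factor lists are hypothetical placeholders (the tag names, temperatures, and inducer levels are not from any specific protocol), and the layout ignores control wells.

```python
from itertools import product
from math import ceil

# Hypothetical screening factors; substitute the real options under test.
strains = ["BL21", "K-12 derivative"]
promoters = ["lac", "hybrid"]
tags = ["tag_A", "tag_B", "no_tag"]
temperatures_c = [18, 25, 37]
inducer_levels = ["low", "medium", "high"]


def build_screen(*factors):
    """Full factorial design: one condition per combination of factor values."""
    return list(product(*factors))


def plates_needed(n_conditions, wells_per_plate=96, replicates=1):
    """Number of plates needed, one well per condition per replicate."""
    return ceil(n_conditions * replicates / wells_per_plate)


conditions = build_screen(strains, promoters, tags, temperatures_c, inducer_levels)
print(len(conditions))                               # 108
print(plates_needed(len(conditions)))                # 2
print(plates_needed(len(conditions), replicates=3))  # 4
```

Even five small factors already exceed a single plate, which is why a full factorial design is often trimmed to the combinations most likely to matter before scaling up.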
Though a great many variables exist even within the realm of E. coli expression systems, keeping these main considerations in mind allows the search for the optimal combination to be carried out efficiently and cost-effectively.