Combined /omics approaches to understand and control library enriched microbial cell factories

Lead Research Organisation: University of Sheffield
Department Name: Chemical & Biological Engineering


The understanding of dynamic biological behaviour needed for bioprocess development cannot be obtained solely from /omic studies at a single level, since each of these tells only part of the story. We will therefore implement an analytical technique based on several different plasmid-based genomic libraries (from two bacteria, Escherichia coli and Campylobacter jejuni) expressed in E. coli, coupled with measurements at the microarray (messenger RNA) and proteome (protein complement of the cell) scales, to understand and improve the bioprocess for producing secreted glycosylated proteins. Production of these proteins is very important to the pharmaceuticals industry, since nearly three quarters of proteins of human therapeutic importance (whether already released or in clinical or preclinical development) are glycosylated.
From the simplest to the most complex organisms, the process of transferring information from the genome to make proteins is universal and central to life. Understanding and quantifying this process is essential for scientific advancement. With such information it will become possible to manipulate organisms to achieve a desired biotechnological goal, such as the production of proteins for medicinal purposes or the replacement of synthetic chemicals. A genome sequence is the code that programs the way an organism functions. Sequencing the genome provides a database of information for identifying genes and assigning their potential functions, and allows similar genes to be compared across species. When genes switch on to start a biological function, a message is generated that eventually makes a protein. Experimental technologies that exploit this message information across thousands of genes have been developed, such as SCALEs (multi-Scale Analysis of Library Enrichment). This field of information is known as transcriptomics.
Transcriptomics alone, however, cannot predict the dynamic biological behaviour needed for future biotechnology development, since this approach tells only part of the story. The information missing from SCALEs is how these gene messages are used. What is needed is an integrated study of the message from the genome together with the production of proteins. To achieve this, we will also implement an analytical technique, similar to SCALEs, that concentrates on the proteins rather than the genome. It is important to examine the protein complement of the organism (known as the proteome), because the observed physical health and behaviour of an organism is determined by the interaction of its genome with the environment, and this interaction is directly due to the proteins rather than to the genome and its subsequent message (the transcriptome). Using our technique (called MLPPTM), which studies proteins, together with experimental techniques such as SCALEs, which study the message from the genome, we will be able to provide an integrated study that generates a deeper knowledge of which proteins help give a cell certain properties. In this case, we seek to understand which proteins give a cell an enhanced ability to generate glycosylated proteins (those with a linked oligosaccharide). This is important because the majority of proteins used in human health applications are glycoproteins. Bacteria have not generally been thought of as being able to produce these proteins, and so more complicated organisms (e.g. cells from mammals) have been used instead. Bacteria are simpler to understand and grow faster and more cheaply, and so would be very attractive if they could be designed to produce glycosylated proteins properly.
The integrated transcriptomic and proteomic techniques to be implemented here, which examine E. coli containing overexpression libraries, will allow us, when successful, to improve glycosylated protein production in a bacterium and set the scene for future efficient bioprocesses for making therapeutic proteins.

Technical Summary

This project aims to apply a genome-wide, multiscale approach for functional genomics to improve the production of recombinant proteins in Escherichia coli, and to take this approach further to begin to understand how to improve the production of glycosylated proteins. We will integrate data obtained from DNA microarray inverse metabolic engineering tools such as SCALEs (multi-Scale Analysis of Library Enrichment) with data obtained from high-throughput quantitative shotgun proteomics methods (building on 8-plex isobaric mass tag technology, iTRAQ), since proteomics is a level closer to the functional understanding of a phenotype. We will analyse the data using a multivariate approach. We will then seek to move beyond a simple statement of whether the transcriptomic and proteomic data are concordant or discordant, and instead ask how they can be interpreted in the context of biological pathways, in particular those related to recombinant protein synthesis of the model glycoprotein. Implementing /omic-based tools and using the resulting data is necessary to provide a systems-level understanding of an organism, so that a deeper functional understanding allows bioprocess engineers to take advantage of findings in the biosciences and translate them into valuable processes and products for UK bioprocessing businesses. As an exemplar project, we ultimately seek to improve the production in E. coli of glycosylated recombinant proteins such as the N-glycoprotein AcrA. It has been demonstrated that this protein can be produced in E. coli following the transfer of the N-glycosylation system from Campylobacter jejuni into E. coli cells.
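As an illustration only, the "concordant or discordant" comparison mentioned above could be sketched as follows. This is a minimal sketch, not the project's analysis pipeline: it assumes each gene has one log2 fold change from the microarray data and one from the proteomics data, and the function names, the threshold of 1.0 and the gene labels are all hypothetical.

```python
from typing import Dict, Tuple


def classify_concordance(mrna_lfc: float, protein_lfc: float,
                         threshold: float = 1.0) -> str:
    """Label one gene by comparing its mRNA and protein log2 fold changes.

    The threshold (assumed here to be 1.0, i.e. a two-fold change) decides
    whether a level counts as "changed"; it is an illustrative choice.
    """
    mrna_changed = abs(mrna_lfc) >= threshold
    protein_changed = abs(protein_lfc) >= threshold

    if not mrna_changed and not protein_changed:
        return "unchanged"
    if mrna_changed and protein_changed:
        # Both levels changed: concordant only if they moved the same way.
        same_direction = (mrna_lfc > 0) == (protein_lfc > 0)
        return "concordant" if same_direction else "discordant"
    # Exactly one level changed.
    return "mrna_only" if mrna_changed else "protein_only"


def classify_genes(measurements: Dict[str, Tuple[float, float]],
                   threshold: float = 1.0) -> Dict[str, str]:
    """Apply classify_concordance to a {gene: (mrna_lfc, protein_lfc)} map."""
    return {
        gene: classify_concordance(mrna, protein, threshold)
        for gene, (mrna, protein) in measurements.items()
    }


# Made-up values for hypothetical genes, purely to show the call pattern.
example = {
    "gene_a": (2.1, 1.8),    # up at both levels
    "gene_b": (1.5, -1.2),   # up in mRNA, down in protein
    "gene_c": (0.2, -0.1),   # no notable change
    "gene_d": (1.4, 0.3),    # change seen in mRNA only
}
labels = classify_genes(example)
```

A per-gene label like this is only the "simple statement" the summary wants to go beyond; a pathway-level interpretation would group these labels by pathway membership, which is outside the scope of this sketch.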
Description We have determined factors that are involved in making specialist therapeutic proteins (drugs) in E. coli. We have used a range of different analytical tools to determine important aspects of cell metabolism that can be harnessed to improve both the amount and the form of a target protein. This will be useful for the biopharmaceuticals sector.
Exploitation Route In general the tools can be used to optimise cellular metabolism to make protein products.
Sectors Chemicals, Healthcare, Manufacturing, including Industrial Biotechnology
Description Advanced Life Science Research Technology Initiative
Amount £406,531 (GBP)
Funding ID BB/M012166/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom of Great Britain & Northern Ireland (UK)
Description BBSRC / EPSRC / Innovate UK IB Catalyst Round 1 Early Stage Translation
Amount £301,286 (GBP)
Funding ID BB/M018288/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom of Great Britain & Northern Ireland (UK)
Start 01/2015 
End 12/2018
Description BBSRC BRIC2
Amount £296,457 (GBP)
Funding ID BB/K011200/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom of Great Britain & Northern Ireland (UK)
Start 11/2013 
End 10/2016
Description Synthetic Biology IKC
Amount £4,990,071 (GBP)
Funding ID EP/L011573/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Academic/University
Country United Kingdom of Great Britain & Northern Ireland (UK)