Experimental Determination and System-Level Analysis of Essential Genes in E. coli MG1655

S.Y. Gerdes1,*, M.D. Scholle1,*, J.W. Campbell1, G. Balazsi2, E. Ravasz3, M.D. Daugherty1, A.L. Somera2, N.C. Kyrpides1, I. Anderson1, M.S. Gelfand1, A. Bhattacharya1, V. Kapatral1, M. D'Souza1, M.V. Baev1, F. Mseeh1, M.Y. Fonstein1, R. Overbeek1, A.-L. Barabasi3, Z.N. Oltvai2 and A.L. Osterman1

I. Supplementary Table S1 (Excel format, PDF, TXT format).
II. Supplementary Table S2 (PDF, TXT format).
III. Results: additional illustrations and analysis (PDF).
IV. Supplementary Table S6 (PDF, TXT format).
V. Experimental and Analytical Procedures:
1. Genetic footprinting procedure (PDF).
2. Assessment of conditional gene essentiality based on genetic footprinting data (PDF).

Supplementary Table S2 (PDF, TXT format)

Comments on Table S2: Essential protein-coding E. coli ORFs determined by genetic footprinting under conditions of aerobic logarithmic cell growth in rich medium. Genes are sorted to highlight discrepancies between our data and two other sources: the gene essentiality data of Blattner and colleagues and the PEC database.

In this table, essential E. coli genes are organized into three main groups:
Set A: genes consistently essential between this study and PEC data
Set B: genes found essential in our study, but non-essential in PEC data
Set C: genes essential in this study and unassigned in PEC data

Each group is further divided into two subsets:
subset 1 contains genes for which viable knockouts were obtained by Blattner and colleagues;
subset 2 contains genes for which no viable knockouts were obtained.
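
The grouping rules above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build Table S2: the function name, the status strings, and the boolean flag are hypothetical, and the sketch covers only genes already found essential in this study (Sets A, B, C).

```python
def classify_essential_gene(pec_status: str, blattner_viable_knockout: bool) -> str:
    """Return a label such as 'A1' or 'C2' for a gene found essential in this study.

    pec_status: hypothetical encoding of the PEC call for the gene,
        one of 'essential', 'non-essential', or 'unassigned'.
    blattner_viable_knockout: True if a viable knockout was obtained by
        Blattner and colleagues, False if none was obtained.
    """
    # The main set is determined by the PEC assignment.
    set_by_pec = {
        "essential": "A",      # consistent between this study and PEC
        "non-essential": "B",  # essential here, non-essential in PEC
        "unassigned": "C",     # essential here, unassigned in PEC
    }
    if pec_status not in set_by_pec:
        raise ValueError(f"unknown PEC status: {pec_status!r}")

    # The subset is determined by the knockout data.
    subset = "1" if blattner_viable_knockout else "2"
    return set_by_pec[pec_status] + subset
```

For example, a gene that PEC lists as non-essential and for which no viable knockout was obtained would be labelled B2 under this encoding.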

In addition, E. coli ORFs are shown that are essential in the PEC database but non-essential, undefined, or ambiguous in our study (Sets D, E, F). Comparative analysis of the three lists of essential genes produced by vastly different approaches (statistical summary, PDF) demonstrates that discrepancies between our data and PEC are not random. They largely reflect the difference between the total number of essential genes detected in this study (629) and the number compiled in PEC (216; note that the PEC list is also incomplete). In other words, 40-50% of the "essential" genes determined by genetic footprinting are in fact genes that impart a substantial fitness advantage but are not absolutely required for cell viability. Comparison with the gene essentiality data of Blattner et al. yields a similar estimate, considering that this data set is also currently incomplete. The datasets of Blattner et al. and PEC disagree for only 12.5% (27/216) of ORFs, suggesting that both of these sources predominantly list true essential genes.
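
As a quick arithmetic check of the counts quoted above, the raw ratios can be recomputed as follows. The variable and function names are illustrative only. The 40-50% figure is the authors' estimate, which allows for the incompleteness of the PEC list, and is not re-derived here.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Counts taken directly from the text above.
footprinting_essential = 629      # essential genes detected in this study
pec_essential = 216               # essential genes compiled in PEC
blattner_pec_disagreements = 27   # ORFs on which Blattner et al. and PEC disagree

# Disagreement between Blattner et al. and PEC, relative to the PEC list.
disagreement_pct = percent(blattner_pec_disagreements, pec_essential)

# Size of the PEC list relative to the footprinting list, and the remainder.
pec_share_pct = percent(pec_essential, footprinting_essential)
excess_pct = percent(footprinting_essential - pec_essential, footprinting_essential)

print(f"Blattner/PEC disagreement: {disagreement_pct:.1f}%")      # 12.5%
print(f"PEC list as share of footprinting list: {pec_share_pct:.1f}%")  # 34.3%
print(f"Footprinting-only excess: {excess_pct:.1f}%")             # 65.7%
```

The raw excess is an upper bound on the fraction of footprinting hits that are absent from PEC, since the PEC list is itself stated to be incomplete.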