Bimodal protein solubility distribution revealed by an aggre


Edited by George H. Lorimer, University of Maryland, College Park, MD, and approved January 26, 2009 (received for review November 23, 2008)



Protein folding often competes with intermolecular aggregation, which in most cases irreversibly impairs protein function, as exemplified by the formation of inclusion bodies. Although it has been empirically determined that some proteins tend to aggregate, the relationship between protein aggregation propensities and primary sequences remains poorly understood. Here, we individually synthesized the entire ensemble of Escherichia coli proteins by using an in vitro reconstituted translation system and analyzed their aggregation propensities. Because the reconstituted translation system is chaperone-free, we could evaluate the inherent aggregation propensities of thousands of proteins in a translation-coupled manner. A histogram of the solubilities, based on data from 3,173 translated proteins, revealed a clear bimodal distribution, indicating that the aggregation propensities are not evenly distributed across a continuum. Instead, the proteins can be categorized into 2 groups, soluble and aggregation-prone proteins. The aggregation propensity is most prominently correlated with the structural classification of proteins, implying that the prediction of aggregation propensity requires structural information about the protein.

Keywords: cell-free translation; protein aggregation; protein folding

The unique native structure of a protein is encoded in its amino acid sequence (1). However, protein folding is often hampered by protein aggregation, which is generally prevented by a variety of chaperone proteins in the cell (2). Despite the presence of chaperones, a certain level of aggregation still occurs in cells. For example, aggregates commonly form upon the heterologous expression of recombinant proteins, as exemplified by the formation of inclusion bodies (3). In special cases, protein aggregation can lead to the formation of ordered aggregates, known as amyloid fibrils, which are closely associated with many severe neurodegenerative diseases in mammals (4, 5).

Understanding the mechanism underlying aggregate formation is required for progress in a wide variety of protein sciences. However, the relationship between protein aggregation propensities and primary sequences remains poorly understood. Because it is empirically known that some proteins tend to aggregate, several groups systematically studied the effects of mutations in proteins of interest that caused the formation of insoluble aggregates (6–9). Subsequently, the information on the mutations has been used to build prediction tools for protein aggregation, most of which were developed for amyloid formation (10–13). However, the application of the prediction tools has so far been restricted to a narrow range of proteins because of the lack of sufficient data on aggregation. To overcome this limitation, a database on the propensity of a given protein to aggregate would be an invaluable resource for understanding the nature of protein aggregation.

In our previous studies, we developed a method to evaluate the solubility of individual proteins using a cell-free translation system (14–16). The cell-free translation system, named PURE, is a reconstituted system that contains only the essential Escherichia coli factors responsible for protein synthesis (17, 18). In this study, we performed a comprehensive analysis, in which the complete E. coli ORF library (ASKA library) (19) was translated in the PURE system under the same conditions. Because the PURE system is chaperone-free (14, 17), we could evaluate the inherent aggregation propensities of thousands of proteins in a translation-coupled manner.


Comprehensive Aggregation Analysis of the Entire Ensemble of E. coli Proteins Using an in Vitro-Reconstituted Translation System.

The ASKA library consists of all predicted ORFs of the E. coli genome, including membrane proteins (19). A total of 4,132 ORFs were individually amplified by PCR using a common primer set (Fig. 1) and then were used for protein synthesis in the PURE system at 37 °C for 60 min.

Fig. 1.

Schematic illustration of the experiment. Each ORF in the ASKA library, which has all of the E. coli ORFs, was amplified by PCR using 2 common primers to translate the gene in the cell-free translation system. The reconstituted cell-free translation system (the PURE system) contains no chaperones. After the 60-min translation, an aliquot of the translation mixture was centrifuged to obtain the soluble fraction. The uncentrifuged (Total) and supernatant (Sup) fractions were subjected to SDS/PAGE, and the translated products were quantified by autoradiography.

The [35S]methionine-labeled proteins were quantified after electrophoresis of the translation products. We successfully quantified ≈70% of the E. coli ORFs (3,173 proteins of 4,132). The remainder could not be quantified because of insufficient translation or trouble during the electrophoresis (translated proteins were stuck in the gel, several protein bands were detected, and so on). The unquantifiable group contained ≈60% of the inner membrane proteins (435 of 754), whereas >80% of the cytoplasmic proteins (2,277 of 2,688) were quantified [supporting information (SI) Fig. S1]. The yield of the quantified proteins was 33 μg/mL on average but ranged broadly from the detection limit to a maximum of ≈100 μg/mL (Fig. S2A), even though we used common primers, which resulted in a common N-terminal flanking sequence in all of the ORFs, and performed the translation under the same conditions.

The propensity for protein aggregation was examined by a centrifugation assay (14, 15). An aliquot of the translation mixture was centrifuged. The ratio of the protein in the supernatant fraction, obtained after centrifugation of the translation mixture, to the uncentrifuged total protein was defined as the solubility, the index of the aggregation propensity (a representative experiment is shown in Fig. 1). The SD of the solubilities was 8.8% on average, and the highest SD was 25%, based on data from 33 randomly chosen proteins (Fig. S3).
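The solubility index defined above is a simple ratio, and a minimal Python sketch may make the arithmetic concrete. The function names, the band intensities, and the choice of the sample standard deviation are assumptions made for illustration; the text does not state how the SD was computed.

```python
from statistics import mean, stdev


def solubility_percent(sup_intensity, total_intensity):
    """Return the supernatant/total band-intensity ratio as a percentage."""
    if total_intensity <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * sup_intensity / total_intensity


def summarize_replicates(pairs):
    """Mean and sample SD of solubility over (sup, total) replicate pairs."""
    values = [solubility_percent(sup, total) for sup, total in pairs]
    # Sample SD is an assumption; a single replicate has no spread to report.
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


# Hypothetical band intensities (arbitrary units) for three replicates.
replicates = [(42.0, 100.0), (45.0, 100.0), (39.0, 100.0)]
average, spread = summarize_replicates(replicates)
print(f"solubility = {average:.1f}% (SD {spread:.1f}%)")
```

Because the index is a ratio of two measured intensities, noise in either band can push an individual value slightly below 0% or above 100%; the sketch does not correct for that.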

Bimodal Distribution of Protein Solubility.

A histogram of the individual solubilities, based on data from 3,173 translated proteins, showed a clear bimodal, rather than normal Gaussian, distribution (Fig. 2A), indicating that the aggregation propensities are not evenly distributed across a continuum. Subtraction of the predicted integral membrane proteins (IMPs) from the data did not change the bimodal distribution (Fig. 2B), suggesting that the cytoplasmic proteins can be categorized into an aggregation-prone group and a highly soluble one. To elucidate which characteristics of the protein influence this bimodality, we compared a variety of protein properties in the aggregation-prone (Agg, defined as <30%) and highly soluble (Sol, defined as >70%) groups. Because all of the translated proteins contain the common short flanking peptides at the N and C termini, including the N-terminal 6× histidine tag, the solubilities of 120 randomly chosen cytoplasmic proteins were analyzed with their endogenous ORF sequences, without the additional flanking peptides (Fig. S4). Only 2 proteins shifted from the Agg to the Sol group, indicating that the influence of the common N-terminal extension with the histidine tag is only marginal.
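The grouping rule (Agg below 30%, Sol above 70%) and the histogram binning can be sketched in Python as follows. The label "Intermediate", the 10% bin width, the clamping of values to the 0–100% range, and the example values are assumptions for illustration and are not taken from the data.

```python
from collections import Counter

AGG_MAX = 30.0  # solubility below this value -> aggregation-prone (Agg)
SOL_MIN = 70.0  # solubility above this value -> highly soluble (Sol)


def classify(solubility):
    """Assign a protein to the Agg, Sol, or (hypothetical) Intermediate group."""
    if solubility < AGG_MAX:
        return "Agg"
    if solubility > SOL_MIN:
        return "Sol"
    return "Intermediate"


def histogram(solubilities, bin_width=10):
    """Count values per bin of the given width over the 0-100% range."""
    n_bins = 100 // bin_width
    counts = Counter()
    for value in solubilities:
        clamped = min(max(value, 0.0), 100.0)  # assumption: clamp noisy values
        index = min(int(clamped // bin_width), n_bins - 1)  # 100% joins last bin
        counts[index] += 1
    return [counts[i] for i in range(n_bins)]


# Hypothetical solubilities (%), chosen only to exercise the boundaries.
example = [5, 12, 28, 30, 55, 70, 71, 88, 95, 100]
print([classify(value) for value in example])
print(histogram(example))
```

A bimodal data set would show high counts in the first and last few bins and low counts in the middle, which is the shape reported for Fig. 2A.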

Fig. 2.

Solubility distribution for quantified proteins. (A) Histogram of solubility for the 3,173 quantified proteins. The proteins with solubilities <30% and >70% were defined as the aggregation-prone (Agg, colored pink) and soluble (Sol, colored blue) groups, respectively. (B) Histogram of solubility for 2,277 predicted cytoplasmic proteins. (C) Histogram of solubility for essential proteins. (D) The ratio of subcellular location (predicted) in all quantified (Total), Agg, and Sol groups. Cyto, cytoplasmic proteins; IMP, integral membrane proteins; Peri, periplasmic proteins; MA, membrane-anchored proteins; OM, outer membrane lipoproteins and β-barrel proteins.

One might expect that the bimodal distribution in the histogram is simply due to the difference in the synthesized yield of proteins, because it has been generally believed that higher protein concentrations generate more protein aggregates. However, this is not the case, because there is no apparent correlation between the solubilities and the yields (Fig. S2B).

We then extracted the proteins essential for cell viability (Fig. 2C). The bimodality in the distribution was the same as that in the total and cytoplasmic protein groups (Fig. 2 A and B), but we found that the essential proteins tended to be enriched in the high-solubility group (Fig. 2C). This result suggests that the essential proteins might have evolved to be soluble because of their irreplaceable properties. In addition to the essentiality, we categorized the data according to the protein functions and ranked them by solubility (Fig. S5A). We found that the solubilities depend strongly on the functions. For example, the Structural component group, which is mainly composed of ribosomal proteins, and the Factor group, which includes transcription or translation factors, chaperones, and proteases, showed a strong bias to the high-solubility group. In contrast, the proteins in the Transporter group tended to be aggregation-prone. Regarding the oligomeric states of the proteins, preliminary analysis shows that heterooligomers seem to be aggregation-prone (Fig. S5B), although we cannot say the tendency is statistically significant because the database on the oligomeric states is incomplete.

Regarding the subcellular locations, the ratio of IMPs in the Agg group (227 of 1,234) was much larger than that in the Sol group (13 of 1,018) (Fig. 2D). Although the IMPs translated under membrane-free conditions were expected to form insoluble aggregates, it is noteworthy that some portion of the IMPs was soluble. There was no remarkable difference in the other locations. Because the subtraction of IMPs from the histogram did not change the bimodality (Fig. 2B), further analyses were performed only with the cytoplasmic proteins.

Relationship Between Solubility and Physicochemical Properties.

Next, we compared the physicochemical properties of the proteins, such as the molecular mass, the deduced isoelectric points (pI), and the amino acid residue content, to address the relationship between solubility and amino acid sequence (Fig. 3). The distribution of molecular mass in the Sol group was shifted to smaller sizes compared with the total histogram, whereas the Agg group was shifted to slightly larger sizes than the total distribution (P < 0.01, Fig. 3A and Table S1). Regarding the isoelectric points, we observed an enrichment of low-pI (5–7) proteins in the high-solubility distribution, whereas the aggregation-prone proteins showed a somewhat broader pI distribution (ranging from 5 to 10) (Fig. 3B). We then tested whether the amino acid residue content affected the solubility and found that proteins with higher contents of negatively charged residues (Asp and Glu) tended to be soluble (Fig. 3C and Table S1). Proteins with higher contents of aromatic residues (Phe, Tyr, and Trp) were slightly biased toward being aggregation-prone (Fig. S6 and Table S1). The differences in the histograms suggested that Asp/Glu-rich and/or aromatic-poor proteins tend to be soluble. In contrast, no significant difference was observed in the contents of hydrophobic residues (Val, Leu, and Ile) and positively charged residues (Lys, Arg, and His) (Fig. 3C, Fig. S6, and Table S1). Because it has been believed that the hydrophobic interaction is a critical driving force in aggregate formation, the lack of an apparent correlation with the hydrophobic residue content was unexpected. Other attempts to detect a relationship between the solubility and the hydrophobicity also failed, including a well-known hydropathy plot analysis (20, 21), which shows clusters of hydrophobic residues in the primary amino acid sequences, and several analyses of alternating hydrophobic-polar patterns. We note that Gln/Asn-rich sequences including polyglutamine repeats, which tend to form amyloid fibrils, are very rare in the E. coli ORFs (22).
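The residue-content comparison rests on the fraction of each residue group in a sequence. A minimal Python sketch of that calculation follows, using the four groups named in the text; the function and dictionary names are hypothetical, and the example input is the N-terminal flanking peptide given in Materials and Methods rather than a real ORF.

```python
# Residue groups as listed in the text (one-letter amino acid codes).
RESIDUE_GROUPS = {
    "negative": "DE",      # Asp, Glu
    "positive": "KRH",     # Lys, Arg, His
    "aromatic": "FYW",     # Phe, Tyr, Trp
    "hydrophobic": "VLI",  # Val, Leu, Ile
}


def group_content(sequence):
    """Return the fraction of residues in each group for one sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {
        name: sum(seq.count(residue) for residue in residues) / len(seq)
        for name, residues in RESIDUE_GROUPS.items()
    }


# N-terminal flanking peptide from Materials and Methods (17 residues).
for name, fraction in group_content("MRGSHHHHHHTDPALRA").items():
    print(f"{name}: {fraction:.3f}")
```

Applied to every quantified ORF, fractions of this kind could then be binned separately for the Agg and Sol groups to compare their distributions.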

Fig. 3.

Correlation between solubility and physicochemical properties. (A) Histograms of molecular mass in the Total, Agg, and Sol groups. (B) Scatter plot of solubility versus isoelectric point. (C) Histograms of the relative contents of negatively charged residues (Asp and Glu) (Left) and hydrophobic residues (Val, Leu, and Ile) (Right) in the Total, Agg, and Sol groups.

We subsequently conducted several analyses related to the secondary structures. We predicted the secondary structure contents by using popular prediction methods, such as Chou–Fasman (23) and PSIPRED (24, 25). However, we could not detect a notable correlation between the predicted secondary structure content and the solubility (Fig. S7 for the PSIPRED analysis).

Correlation Between Solubility and Tertiary Structure.

To address the correlation between the solubilities and the tertiary structures, we compared the solubilities with the Structural Classification of Proteins (SCOP) database, which is a comprehensive ordering of all proteins with known structures, according to their evolutionary and structural relationships (26). The classification is based on hierarchical levels: class, fold, superfamily, and family. Superfamilies and families are defined as having a common fold if their proteins have the same major secondary structures in the same arrangement and with the same topological connections. Most of the folds are assigned to one of the following structural classes: all-α (SCOP class a), all-β (class b), α/β (class c), and α+β (class d). Except for the all-β (class b) proteins, the bimodality of the histograms was maintained, although the distribution of class c was slightly biased toward aggregation-prone (Fig. 4A), roughly confirming that the secondary structures did not correlate with the aggregation propensities. We then categorized the SCOP folds into solubility groups and found that some of the SCOP folds were strongly biased in their solubilities (Fig. 4B and Table S2). For example, in the periplasmic binding protein-like II fold (SCOP fold: c94) group, which is largely dominated by DNA-binding transcriptional regulator proteins, 83% of the members were low-solubility proteins (35 of 42 assigned proteins), whereas only 1 protein was in the soluble group (Table S2). Other low-solubility folds included the PLP-dependent transferases fold (c67), DNA/RNA-binding 3-helical bundle fold (a4), TIM β/α-barrel fold (c1), and P-loop containing nucleoside triphosphate hydrolases (c37) (P < 0.01, Table S2). For the highly soluble folds, we assigned the Flavodoxin-like fold (c23), OB-fold (b40), and Thioredoxin fold (c47) (P < 0.01, Table S2).

Fig. 4.

Correlation between solubility and tertiary structure. (A) Histograms of solubility in the SCOP classes. SCOP class abbreviations: all α proteins (a); all β proteins (b); α and β proteins (α/β) (c); α and β proteins (α+β) (d). (B) The ratio of the Agg and Sol proteins in each SCOP fold. Details of each fold and the assigned number of proteins with statistical significance (P values) in each fold are described in Table S2. (C) Histograms of solubility for the GroEL substrate proteins. The classification of the substrates is according to Kerner et al. (27), in which Classes I, II, and III are spontaneously foldable, chaperone-dependent (but partially GroEL-dependent), and obligate GroEL/ES-dependent substrates, respectively.

In the above analyses, we noticed that the low-solubility folds (c1 and a4) were known to be enriched in the obligate chaperonin GroEL substrates (the so-called Class III substrates) (27). Kerner et al. (27) have identified ≈250 GroEL interactors and categorized them into 3 classes (I, II, and III), based on a quantitative proteomic analysis. The Class I and II substrates are only partially chaperonin dependent, whereas ≈85 Class III substrates are considered as obligate substrates that engage >75% of the GroEL capacity. The solubilities of the GroEL substrates are shown in the histograms (Fig. 4C). Notably, ≈60% of the Class III substrates were in the Agg group (44 of 74), indicating that the Class III substrates are extremely aggregation-prone. In contrast, the Class I substrates tended to be soluble. This analysis suggests that GroEL preferentially binds the aggregation-prone proteins in vivo.

Attempts to Predict the Aggregation Propensity.

Finally, we tested whether our data could be predicted by several recently developed web tools for protein aggregation. We chose the TANGO (10), AGGRESCAN (12), and PASTA programs (13). However, none of the tools tested showed a notable positive correlation between our datasets and the predicted results (Fig. S8), probably because the algorithms used in those programs basically relied on data from amyloid aggregates in eukaryotes. Our attempt to predict the solubilities by using a support vector machine (SVM) algorithm (28), with parameters including molecular mass, pI, and amino acid content, resulted in ≈80% accuracy. The algorithm provides a reasonable prediction but is not completely satisfactory. For more accurate prediction, we should incorporate information about the tertiary structure, because the solubilities depended strongly on the SCOP folds. A combination of 3-dimensional structure prediction with other physicochemical properties might improve the solubility prediction.


In this article, we conducted a global aggregation analysis of the whole set of E. coli proteins, coupled with a reconstituted cell-free translation system [the PURE system (17)]. The aggregation propensities of >3,000 proteins, evaluated under chaperone-free conditions, showed that the proteins were categorized into 2 groups, soluble and aggregation-prone. In addition, statistical analysis revealed that some structural classes of proteins were strongly biased in their aggregation propensities.

Several caveats should be stated regarding the interpretation of our data. First, because our aggregation analysis completely depends on the centrifugation, other conditions such as higher-speed centrifugation might change the shape of the histogram. Thus, there is a possibility that the soluble fractions include oligomeric assemblies that are aggregation precursors. This is of particular interest because recent advances on amyloid-forming proteins have revealed that soluble oligomeric species of some amyloid proteins are toxic to the cell (29). Second, even a soluble protein does not always have the native structure. Some might be soluble in an unstructured state. Indeed, we have previously shown that a fraction of the proteins produced in the chaperone-free PURE system is soluble but not functional (14, 15). The addition of chaperones, such as the DnaK system or GroEL/ES, helps the proteins to reach their functional native states (14, 15). Third, the centrifugation assay cannot discriminate amorphous aggregates from structured aggregates, such as amyloid fibrils. A recent study by Wang et al. (30), which showed that bacterial inclusion bodies can contain amyloid-like structures, raises the possibility that the insoluble aggregates in our assay contain amyloid-like structures, although it has been assumed that bacteria have few amyloid-forming proteins (e.g., ref. 22).

Nevertheless, the data presented here provide a unique viewpoint for protein science. The most important finding in our analyses is that, in terms of their solubility, proteins belong to 2 subgroups. Because the proteins tested are basically soluble in the cell, mainly because of the assistance of chaperones, the bimodal property was revealed by the use of the PURE system, a reconstituted cell-free translation system that lacks chaperones. This hidden bimodal solubility of the proteins prompted us to imagine the evolution of protein folding in the cell: The aggregation-prone groups might have evolved to fold correctly only with the aid of chaperones. In support of this concept, we found that the obligate GroEL substrates are aggregation-prone. In this context, the presence of the aggregation-prone group might guarantee a hypothetical buffering capacity of chaperones during evolution, by releasing the genetic variation under certain conditions, as has been suggested in the case of Hsp90 in eukaryotic organisms (31, 32).

Another main finding is that some of the SCOP folds are strongly biased in their aggregation propensity. In particular, the presence of aggregation-prone folds is apparently paradoxical because aggregate formation should occur before the completion of folding. The apparent correlation between some SCOP folds and the aggregation tendency suggests that folding intermediates fall into 2 classes, aggregation-prone and soluble. Then, what is the difference between the aggregation-prone and soluble intermediates? Regarding this point, the competition between correct folding and aggregate formation, known as kinetic partitioning in protein folding (33, 34), should be considered. Because the kinetic partitioning is closely related to the folding kinetics itself, understanding the mechanism of aggregate formation would require a detailed mechanism of protein folding. Our approach using the PURE system has the potential to enable a global analysis of the folding kinetics, providing a unique insight into the kinetic partitioning.

Finally, our approach using the PURE system provides an invaluable resource for a broad range of protein sciences, including protein folding prediction, protein design, folding coupled with translation, and the role of chaperones with nascent proteins. In addition, the comprehensive cell-free synthesis of all proteins encoded in a genome, termed a reconstituted proteome, paves the way for the construction of an on-demand protein bank system, which would be useful for a variety of protein research, including emerging synthetic biology (35), in the future.

Materials and Methods

E. coli ORF Library.

The ASKA library (19, 36) was originally provided by Hirotada Mori (Nara Institute of Science and Technology, Nara, Japan), and the purified ASKA library (37) plasmid set was kindly provided by Tomoaki Matsuura (Osaka University, Osaka, Japan). A total of 4,132 ORFs were individually amplified by PCR using the ASKA library plasmids as templates. The sequences of the common primers were as follows: primer1, 5′-GGCCTAATACGACTCACTATAGGAGAAATCATAAAAAATTTATTTGCTTTGTGAGCGG-3′, and primer2, 5′-GTTATTGCTCAGCGGTTAGCGGCCGCATAGGCC-3′. Primer1 contains the T7 promoter (italicized) for expression by the PURE system, and primer2 contains the UAA stop codon (italicized).

Cell-Free Protein Synthesis and Protein Aggregation Assay.

The method for the evaluation of the protein aggregation propensity was based on a previously reported method (14–16), with several modifications. Each ORF was translated with N- and C-flanking regions, with the following amino acid sequences: N-, MRGSHHHHHHTDPALRA and C-, GLCGR. The transcription–translation-coupled PURE system (17, 18) reaction, including [35S]methionine, was performed at 37 °C for 1 h. After the protein synthesis, an aliquot was withdrawn as the total fraction, and the remainder was centrifuged at 21,600 × g for 30 min. Both the total and supernatant fractions were separated by SDS/PAGE, and the band intensities were quantified by autoradiography. The ratio of the supernatant to the total protein was defined as the solubility, the index of protein aggregation tendency.

Data Analyses.

All analyses, except for those described in Fig. 2 A, C, and D and Figs. S1 and S2, were performed with quantified cytoplasmic proteins. The amino acid sequence, subcellular location, type of gene product, and SCOP fold information were obtained from GenoBase. The SCOP fold annotation in GenoBase was based on the SUPERFAMILY database (36, 38). The information on essential genes was obtained from the PEC database (39). The information on secondary structure prediction [PSIPRED (24, 25)] was obtained from the GTOP database (40). Molecular masses were calculated from the deduced amino acid sequences. Estimation of pI values was accomplished by using a web tool (41). For the prediction of protein aggregation from the amino acid sequence, 3 programs [TANGO (10), PASTA (13), and AGGRESCAN (12)] were obtained from their web sites.
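As an illustration of how a molecular mass can be calculated from a deduced amino acid sequence, the following Python sketch sums average residue masses and adds one water molecule for the free termini. The mass table holds commonly tabulated average values and is an assumption; the text does not state which table or tool was actually used, and the function name is hypothetical.

```python
# Average residue masses in Da (amino acid minus one water), assumed values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015  # added once for the N-terminal H and C-terminal OH


def molecular_mass(sequence):
    """Return the average molecular mass (Da) of an unmodified polypeptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER_MASS


# Mass contributed by the N-terminal flanking peptide from the assay section.
print(f"{molecular_mass('MRGSHHHHHHTDPALRA'):.1f} Da")
```

The sketch ignores post-translational modifications and any processing of the initiator methionine, so it gives only the mass implied by the sequence itself.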

Prediction by SVM Algorithm.

SVM (28) performance was analyzed with the Agg and Sol proteins (1,599 samples). The SVM classifier was trained on 1,000 randomly chosen samples, using molecular mass, pI values, and the ratio of each amino acid as features. The prediction accuracy was calculated with the remaining 599 samples. The calculation was performed by using the ksvm function in the kernlab package for R.
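The evaluation scheme, a random 1,000/599 hold-out split followed by an accuracy calculation, can be sketched in Python as below. This is not the SVM itself: the original analysis used kernlab in R, whereas the sketch substitutes a deliberately simple nearest-class-mean classifier as a stand-in. The synthetic single-feature data, the seeds, and all function names are assumptions for illustration only.

```python
import random


def holdout_split(samples, n_train, seed=0):
    """Shuffle a copy of the samples and split it into train and test sets."""
    if not 0 < n_train < len(samples):
        raise ValueError("n_train must be between 1 and len(samples) - 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(predict, test_set):
    """Fraction of (features, label) pairs whose label is predicted correctly."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)


def fit_nearest_mean(train_set):
    """Stand-in classifier (not an SVM): pick the class with the closest mean."""
    sums, counts = {}, {}
    for (value,), label in train_set:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    return lambda features: min(means, key=lambda lab: abs(features[0] - means[lab]))


# Synthetic data: one hypothetical feature (e.g., an Asp+Glu fraction) per sample.
rng = random.Random(1)
samples = []
for i in range(1599):
    label = "Sol" if i % 2 == 0 else "Agg"
    center = 0.14 if label == "Sol" else 0.10  # made-up class centers
    samples.append(((rng.gauss(center, 0.02),), label))

train, test = holdout_split(samples, n_train=1000)
model = fit_nearest_mean(train)
print(f"train={len(train)} test={len(test)} accuracy={accuracy(model, test):.2f}")
```

Keeping the 599 test samples out of training is what makes the reported accuracy an estimate of performance on unseen proteins rather than a measure of fit.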


We thank Hirotada Mori and Tomoaki Matsuura for the gift of the ASKA library plasmid set and Yoshihiro Shimizu and Takashi Kanamori for technical advice and useful suggestions. This work was supported in part by Grants-in-Aid for Scientific Research on Priority Area 17049009, 19037007, and 19058002 (to H.T.) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.


1To whom correspondence may be addressed at: University of Tokyo, Bioscience Building 401, Kashiwanoha 5-1-5, Kashiwa 277-8562, Japan. E-mail: taguchi{at} or ueda{at}

Author contributions: T.N., B.-W.Y., T.U., and H.T. designed research; T.N., B.-W.Y., and K.S. performed research; T.N. and H.T. contributed new reagents/analytic tools; T.N., B.-W.Y., W.J., S.T., T.U., and H.T. analyzed data; and T.N., S.T., T.U., and H.T. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at


1. Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181:223–230.
2. Hartl FU, Hayer-Hartl M (2002) Molecular chaperones in the cytosol: From nascent chain to folded protein. Science 295:1852–1858.
3. Ventura S, Villaverde A (2006) Protein quality in bacterial inclusion bodies. Trends Biotechnol 24:179–185.
4. Dobson CM (2003) Protein folding and misfolding. Nature 426:884–890.
5. Chiti F, Dobson CM (2006) Protein misfolding, functional amyloid, and human disease. Annu Rev Biochem 75:333–366.
6. Chiti F, et al. (2002) Kinetic partitioning of protein folding and aggregation. Nat Struct Biol 9:137–143.
7. Chiti F, Stefani M, Taddei N, Ramponi G, Dobson CM (2003) Rationalization of the effects of mutations on peptide and protein aggregation rates. Nature 424:805–808.
8. Williams AD, et al. (2004) Mapping abeta amyloid fibril secondary structure using scanning proline mutagenesis. J Mol Biol 335:833–842.
9. de Groot NS, Aviles FX, Vendrell J, Ventura S (2006) Mutagenesis of the central hydrophobic cluster in Abeta42 Alzheimer's peptide. Side-chain properties correlate with aggregation propensities. FEBS J 273:658–668.
10. Fernandez-Escamilla AM, Rousseau F, Schymkowitz J, Serrano L (2004) Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat Biotechnol 22:1302–1306.
11. Tartaglia GG, Cavalli A, Pellarin R, Caflisch A (2005) Prediction of aggregation rate and aggregation-prone segments in polypeptide sequences. Protein Sci 14:2723–2734.
12. Conchillo-Sole O, et al. (2007) AGGRESCAN: A server for the prediction and evaluation of “hot spots” of aggregation in polypeptides. BMC Bioinformatics 8:65.
13. Trovato A, Seno F, Tosatto SC (2007) The PASTA server for protein aggregation prediction. Protein Eng Des Sel 20:521–523.
14. Ying BW, Taguchi H, Ueda H, Ueda T (2004) Chaperone-assisted folding of a single-chain antibody in a reconstituted translation system. Biochem Biophys Res Commun 320:1359–1364.
15. Ying BW, Taguchi H, Kondo M, Ueda T (2005) Co-translational involvement of the chaperonin GroEL in the folding of newly translated polypeptides. J Biol Chem 280:12035–12040.
16. Ying BW, Taguchi H, Ueda T (2006) Co-translational binding of GroEL to nascent polypeptides is followed by post-translational encapsulation by GroES to mediate protein folding. J Biol Chem 281:21813–21819.
17. Shimizu Y, et al. (2001) Cell-free translation reconstituted with purified components. Nat Biotechnol 19:751–755.
18. Shimizu Y, Kanamori T, Ueda T (2005) Protein synthesis by pure translation systems. Methods 36:299–304.
19. Kitagawa M, et al. (2005) Complete set of ORF clones of Escherichia coli ASKA library (a complete set of E. coli K-12 ORF archive): Unique resources for biological research. DNA Res 12:291–299.
20. Hopp TP, Woods KR (1981) Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci USA 78:3824–3828.
21. Kyte J, Doolittle RF (1982) A simple method for displaying the hydropathic character of a protein. J Mol Biol 157:105–132.
22. Michelitsch MD, Weissman JS (2000) A census of glutamine/asparagine-rich regions: Implications for their conserved function and the prediction of novel prions. Proc Natl Acad Sci USA 97:11910–11915.
23. Chou PY, Fasman GD (1974) Conformational parameters for amino acids in helical, beta-sheet, and random coil regions calculated from proteins. Biochemistry 13:211–222.
24. Jones DT (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292:195–202.
25. Bryson K, et al. (2005) Protein structure prediction servers at University College London. Nucleic Acids Res 33:W36–W38.
26. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: A structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540.
27. Kerner MJ, et al. (2005) Proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli. Cell 122:209–220.
28. Noble WS (2006) What is a support vector machine? Nat Biotechnol 24:1565–1567.
29. Haass C, Selkoe DJ (2007) Soluble protein oligomers in neurodegeneration: Lessons from the Alzheimer's amyloid beta-peptide. Nat Rev Mol Cell Biol 8:101–112.
30. Wang L, Maji SK, Sawaya MR, Eisenberg D, Riek R (2008) Bacterial inclusion bodies contain amyloid-like structure. PLoS Biol 6:e195.
31. Rutherford SL, Lindquist S (1998) Hsp90 as a capacitor for morphological evolution. Nature 396:336–342.
32. Queitsch C, Sangster TA, Lindquist S (2002) Hsp90 as a capacitor of phenotypic variation. Nature 417:618–624.
33. Jaenicke R (1995) Folding and association versus misfolding and aggregation of proteins. Philos Trans R Soc London Ser B 348:97–105.
34. King J, Haase-Pettingell C, Robinson AS, Speed M, Mitraki A (1996) Thermolabile folding intermediates: Inclusion body precursors and chaperonin substrates. FASEB J 10:57–66.
35. Channon K, Bromley EH, Woolfson DN (2008) Synthetic biology through biomolecular design and engineering. Curr Opin Struct Biol 18:491–498.
36. Riley M, et al. (2006) Escherichia coli K-12: A cooperatively developed annotation snapshot—2005. Nucleic Acids Res 34:1–9.
37. Kazuta Y, et al. (2008) Comprehensive analysis of the effects of Escherichia coli ORFs on protein translation reaction. Mol Cell Proteomics 7:1530–1540.
38. Madera M, Vogel C, Kummerfeld SK, Chothia C, Gough J (2004) The SUPERFAMILY database in 2004: Additions and improvements. Nucleic Acids Res 32:D235–D239.
39. Kato J, Hashimoto M (2007) Construction of consecutive deletions of the Escherichia coli chromosome. Mol Syst Biol 3:132.
40. Kawabata T, et al. (2002) GTOP: A database of protein structures predicted from genome sequences. Nucleic Acids Res 30:294–298.
41. Sillero A, Maldonado A (2006) Isoelectric point determination of proteins and other macromolecules: Oscillating method. Comput Biol Med 36:157–166.