Rapid evolution of Cse4p-rich centromeric DNA sequences in c


Edited by John A. Carbon, University of California, Santa Barbara, CA, and approved October 24, 2008 (received for review September 30, 2008)

1S.P. and J.T. contributed equally to this work.


Abstract

The Cse4p-containing centromere regions of Candida albicans have unique and different DNA sequences on each of the eight chromosomes. In a closely related yeast, C. dubliniensis, we have identified the centromeric histone, CdCse4p, and shown that it is localized at the kinetochore. We have identified putative centromeric regions, orthologous to the C. albicans centromeres, in each of the eight C. dubliniensis chromosomes by bioinformatic analysis. Chromatin immunoprecipitation followed by PCR using a specific set of primers confirmed that these regions bind CdCse4p in vivo. As in C. albicans, the CdCse4p-associated core centromeric regions are 3–5 kb in length and show no sequence similarity to one another. Comparative sequence analysis suggests that the Cse4p-rich centromere DNA sequences in these two species have diverged faster than other orthologous intergenic regions and even faster than our best estimated “neutral” mutation rate. However, the location of the centromere and the relative position of Cse4p-rich centromeric chromatin in the orthologous regions with respect to adjacent ORFs are conserved in both species, suggesting that centromere identity is not solely determined by DNA sequence. Unlike known point and regional centromeres of other organisms, centromeres in C. albicans and C. dubliniensis have no common centromere-specific sequence motifs or repeats except some of the chromosome-specific pericentric repeats that are found to be similar in these two species. We propose that centromeres of these two Candida species are of an intermediate type between point and regional centromeres.

chromatin | chromosome segregation | kinetochore | nucleosome | pericentric

Faithful chromosome segregation during mitosis and meiosis in eukaryotes is performed by a dynamic interaction between spindle microtubules and kinetochores. The kinetochore is a proteinaceous structure that forms on a specific DNA locus on each chromosome, termed the centromere (CEN). Centromeres have been cloned and characterized in several organisms from yeasts to humans. Interestingly, there is no centromere-specific cis-acting DNA sequence that is conserved across species (1). However, centromeres in all eukaryotes studied to date assemble into specialized chromatin containing a histone H3 variant protein in the CENP-A/Cse4p family. Members of this family are called centromeric histones (CenH3s) and are regarded as possible epigenetic markers of CEN identity (1, 2). The Saccharomyces cerevisiae centromere, the most intensively studied budding yeast centromere, is a well-defined, short (125-bp) region (hence called a “point” centromere) and consists of two conserved consensus sequences (centromere DNA elements, CDEs), CDEI (8 bp) and CDEIII (25 bp), separated by CDEII, a 78- to 86-bp nonconserved AT-rich (>90%) “spacer” sequence (3). CDEI is not absolutely necessary for mitotic centromere function (4). Retention of a portion of CDEII is essential for CEN activity, but changes in length or base composition of CDEII cause only partial inactivation (4, 5). The S. cerevisiae CenH3, ScCse4p, has been shown to bind to a single nucleosome containing the nonconserved CDEII and to flanking CDEI and CDEIII regions (6). CDEIII is absolutely essential: centromere function is completely inactivated by deletion of CDEIII or even by single base substitutions in the central CCG sequence. Centromeres of most other eukaryotes, including the fission yeast Schizosaccharomyces pombe, are much longer and more complex than those of S. cerevisiae and are called “regional” centromeres (3). The centromeres of S. pombe are 40–110 kb in length and organized into distinct classes of repeats that are further arranged into a large inverted repeat. The nonrepetitive central region, also known as the central core (cc), contains a 4- to 7-kb nonhomologous region that is not conserved in all three chromosomes (3). The CenH3 homolog in S. pombe, Cnp1p, binds to the central core and the inner repeats (7). However, the central domain alone cannot assemble centromere chromatin de novo, but requires the cis-acting dg/K repeat present at the outer repeat array to promote de novo centromere assembly (8, 9). Several experiments suggest that unlike in S. cerevisiae, no unique conserved sequence within S. pombe centromeres is sufficient for establishment and maintenance of centromere function, although flanking repeats play a crucial role in establishing heterochromatin that is important for centromere activity (10).

Several lines of evidence suggest that primary DNA sequence may not be the only determinant of CEN identity in regional centromeres. Studies in a pathogenic budding yeast, Candida albicans, containing regional centromeres suggest that each of its eight chromosomes contains a different, 3- to 5-kb nonconserved DNA sequence that assembles into Cse4p-rich centromeric chromatin (11, 12). C. albicans centromeres partly resemble those of S. pombe but lack any pericentric repeat that is common to all of its eight centromeres (12, 13). Therefore, the mechanisms by which CenH3s confer centromere identity, are deposited at the right location, and are epigenetically propagated for several generations in C. albicans without any centromere-specific DNA sequence remain largely unknown.

A recent study of several independent clinical isolates of C. albicans reveals that, despite having no centromere-specific DNA sequence motifs or repeats common to all of its eight centromeres, centromere sequences remain conserved and their relative chromosomal positions are maintained (12). As a first step toward understanding the importance of cis-acting CEN DNA sequences in centromere function in C. albicans, we have identified and characterized centromeres of a closely related pathogenic yeast, C. dubliniensis, which was identified as a less pathogenic independent species in 1995 (14). We reasoned that CEN DNA comparisons between related Candida species might uncover properties that were not evident from interchromosomal comparisons of C. albicans CEN sequences alone. Moreover, functional characterization of centromeres of these two related Candida species may be helpful in understanding the evolution of centromeres. Several studies indicate that both CEN DNA and its associated proteins in animals and plants are rapidly evolving, although the relative position of the centromere is maintained for a long time (15).

Here, we report the identification and characterization of Cse4p-rich centromere sequences of each of the eight chromosomes of C. dubliniensis. Comparative genomic analysis of CEN DNA sequences of C. albicans and C. dubliniensis reveals no detectable conservation among Cse4p-associated CEN sequences. Nonetheless, the lengths of Cse4p-enriched DNAs assembled as specialized centromeric chromatin and their relative locations in orthologous regions have been maintained for millions of years. A genomewide analysis also reveals that centromeres are probably the most rapidly evolving genomic loci in C. albicans and C. dubliniensis.

Results

Synteny of Centromere-Adjacent Genes Is Maintained in C. albicans and C. dubliniensis.

C. albicans and C. dubliniensis diverged ∼20 million years ago from a common ancestor (12). Gene synteny (collinearity) is maintained almost throughout the genome in these two organisms. Therefore, we examined potential orthologous CEN regions in C. dubliniensis by identifying ORFs of C. dubliniensis with homology to CEN-proximal ORFs of C. albicans. C. dubliniensis homologs of C. albicans ORFs that are adjacent to centromere regions were identified by BLAST analysis of the C. dubliniensis genome database available at the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/cgi-bin/blast/submitblast/c_dubliniensis). The homology of amino acid sequences coded by CEN-adjacent genes in C. albicans and C. dubliniensis ranges from 81 to 99% [supporting information (SI) Table S1]. The synteny of these genes is maintained in all chromosomes except chromosome 6 (Fig. 1 and Fig. S1). C. albicans CEN6 is flanked by Orf19.1097 and Orf19.2124. Since there is no Orf19.1097 homolog in C. dubliniensis, we identified the C. dubliniensis homolog of Orf19.1096, the gene adjacent to Orf19.1097 in C. albicans. The distance between Orf19.1096 and Orf19.2124 is 12.8 kb in C. albicans as opposed to 80 kb in C. dubliniensis. A systematic analysis of this 80-kb region of C. dubliniensis reveals that two paracentric inversions followed by an insertion between the Orf19.1096 homolog and its downstream region occurred in C. dubliniensis at the left arm of the orthologous pericentric region as compared to C. albicans (Fig. S1).

Fig. 1.

Orthologous Cse4p-rich centromere regions in C. albicans and C. dubliniensis. On the basis of BLAST analysis, the putative homologs of C. albicans CEN-adjacent ORFs in C. dubliniensis were identified. Chromosome numbers are shown on the left (R through 7). The top line for each chromosome denotes C. albicans centromere regions and the bottom line corresponds to the orthologous regions in C. dubliniensis. The dotted and cross-hatched boxes correspond to Cse4p-binding regions in C. albicans (12) and C. dubliniensis, respectively (see text and Table S2). Only one homolog is shown for each chromosome of C. albicans and C. dubliniensis. ORFs and the direction of transcription of corresponding ORFs are shown by open arrows. Only those ORFs that have homologs in both C. albicans and C. dubliniensis are shown. The number on the top of each arrow corresponds to the C. albicans assembly 19 ORF numbers (for example, orf19.600 is shown as 600). The lengths of CEN-containing intergenic regions of C. albicans and orthologous regions in C. dubliniensis are shown. This analysis was done on the basis of Assembly 20 of the Candida albicans Genome Database and the present version (May 16, 2007) of the Candida dubliniensis Genome Database.

The Centromeric Histone Protein of C. dubliniensis (CdCse4p) Is Localized at the Kinetochore.

CenH3 proteins in the Cse4p/CENP-A family have been shown to be uniquely associated with centromeres in all organisms studied to date (1). Using CaCse4p as the query in a BLAST analysis against the C. dubliniensis genome, we identified the centromeric histone of C. dubliniensis, CdCse4p (see Materials and Methods). This histone is found to be highly similar (97% identity over 211 aa) to CaCse4p (Fig. S2). CdCSE4 codes for a 212-aa-long predicted protein with a C-terminal (amino acid residues 110–212) histone-fold domain (HFD). The HFD of Cse4p in C. albicans and C. dubliniensis is identical (Fig. S2B). To examine whether CdCse4p can functionally complement CaCse4p, we have expressed CdCSE4 from its native promoter (pAB1CdCSE4) cloned in an ARS2/HIS1 plasmid (pAB1) in a C. albicans strain (CAKS3b) carrying the only full-length copy of CaCSE4 under control of the PCK1 promoter (see SI Text). The ability of the strain CAKS3b carrying pAB1CdCSE4 to grow as well as the same strain carrying a control plasmid pAB1CaCSE4 on glucose medium (where endogenous CaCSE4 expression is suppressed) suggests that CdCse4p can complement CaCse4p function and hence that CdCSE4 codes for the centromeric histone in C. dubliniensis (Fig. 2B). We further examined the subcellular localization of CdCse4p in C. dubliniensis strain Cd36 by indirect immunofluorescence (see Materials and Methods). Indirect immunofluorescence microscopy using affinity-purified polyclonal anti-Ca/CdCse4p antibodies (against aa 1–18 of CaCse4p/CdCse4p) (16) revealed bright dot-like signals in all cells. The dots always colocalized with nuclei stained with DAPI (Fig. 2C). Each bright dot-like signal represents a cluster of 16 centromeres. Unbudded G1 cells exhibited one dot per cell, while large-budded cells at later stages of the cell cycle exhibited two dots that cosegregated with the DAPI-stained nuclei in daughter cells (Fig. 2C). The localization patterns of CdCse4p appear to be identical to those of CaCse4p in C. albicans at corresponding stages of the cell cycle (16). Coimmunostaining of fixed Cd36 cells with anti-tubulin and anti-Ca/CdCse4p antibodies showed that CdCse4p signals are localized close to the spindle pole bodies, analogous to typical localization patterns of kinetochore proteins in S. cerevisiae and C. albicans (Fig. 2C). Together, these results strongly suggest that CdCse4p is the authentic centromeric histone of C. dubliniensis.

Fig. 2.

Localization of CdCse4p at the kinetochore of C. dubliniensis. (A) The C. albicans strain CAKS3b was streaked on media containing succinate and glucose and incubated at 30 °C for 3 days. (B) CAKS3b was transformed with pAB1, pAB1CaCSE4, and pAB1CdCSE4. These transformants were streaked on plates containing complete media lacking histidine with succinate or glucose as the carbon source. (C) C. dubliniensis strain Cd36 was grown in YPD and fixed. Fixed cells were stained with DAPI (a–d), anti-Ca/CdCse4p (e–h), and anti-tubulin (i–l) antibodies. The intense red dot-like CdCse4p signals were observed in unbudded cells (e) and in budded cells at different stages (f–h). Corresponding spindle structures are shown by coimmunostaining with anti-tubulin antibodies (i–l). Arrows indicate the position of spindle pole bodies in large-budded cells at anaphase. (Scale bar, 10 μm.)

Centromeric Chromatin on Various C. dubliniensis Chromosomes Is Restricted to a 3- to 5-kb Region.

We performed standard chromatin immunoprecipitation (ChIP) assays with anti-Ca/CdCse4p antibodies to assay for enrichment of CdCse4p on putative CEN regions (orthologous to C. albicans CENs) in C. dubliniensis strain Cd36 (see Materials and Methods). The immunoprecipitated DNA sample was analyzed by PCR using a specific set of primers designed from the putative CEN sequences (Table S2). These regions are, indeed, found to be associated with CdCse4p (Fig. 3 and Fig. S3). This ChIP–PCR analysis precisely localized the boundaries of CdCse4p binding to a 3- to 5-kb region on each chromosome (Fig. 3). However, as mentioned earlier, the homologs of two genes adjacent to the CEN6 region in C. albicans are 80 kb apart in chromosome 6 of C. dubliniensis because of chromosome rearrangement (Fig. S1). Since other CEN regions of C. dubliniensis are present in ORF-free regions that are >3 kb, we first identified all of the intergenic regions ≥3 kb to find CEN6 in this 80-kb region. The ChIP–PCR analysis using specific primers from such regions delimited Cse4p binding to a 3.6-kb region that is adjacent to the C. albicans Orf19.2124 homolog in C. dubliniensis (Fig. 3 and Fig. S3; not all ChIP data are shown). Thus, we have successfully identified CdCse4p-rich CEN regions and determined the boundaries of centromeric chromatin in all eight chromosomes in C. dubliniensis. We also find that the relative distance of Cse4p-rich centromeric chromatin from orthologous neighboring ORFs is similar in both species in most cases (Fig. 1).

Fig. 3.

Two evolutionarily conserved key kinetochore proteins, CdCse4p (CENP-A homolog) and CdMif2p (CENP-C homolog), bind to the same regions of different C. dubliniensis chromosomes. Standard ChIP assays were performed on strains Cd36 and CDM1 (CdMif2-TAP-tagged strain) using anti-Ca/CdCse4p or anti-Protein A antibodies and analyzed with specific primers corresponding to putative centromere regions of C. dubliniensis to PCR amplify DNA fragments (150–300 bp) located at specific intervals as indicated (Table S2). Graphs show relative enrichment of CdCse4p (blue lines) and CdMif2p (red lines) that mark the boundaries of centromeric chromatin in various C. dubliniensis chromosomes. PCR was performed on total, immunoprecipitated (+Ab), and beads-only control (−Ab) ChIP DNA fractions (see Fig. S3 and Fig. S4). The coordinates of primer locations are based on the present version (May 16, 2007) of the Candida dubliniensis genome database. Enrichment values are calculated by determining the intensities of (+Ab) minus (−Ab) signals divided by the total DNA signals and are normalized to a value of 1 for the values obtained for a noncentromeric locus (CdLEU2) and plotted. The chromosomal coordinates are marked along the x-axis while the enrichment values are marked along the y-axis. Black arrows show the location of ORFs and arrowheads indicate the direction of transcription.
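The enrichment calculation described in this legend can be sketched in Python as follows. This is only an illustrative sketch: the function names, argument names, and band intensities are hypothetical and are not taken from the study; only the arithmetic ((+Ab) minus (−Ab), divided by total, normalized to the noncentromeric control locus) follows the legend.

```python
def raw_enrichment(plus_ab, minus_ab, total):
    """Return ((+Ab) - (-Ab)) / total for one PCR amplicon."""
    if total <= 0:
        raise ValueError("total DNA signal must be positive")
    return (plus_ab - minus_ab) / total


def relative_enrichment(locus, control):
    """Normalize a locus to the control locus so that the control equals 1.

    Both arguments are (plus_ab, minus_ab, total) intensity tuples.
    """
    control_value = raw_enrichment(*control)
    if control_value == 0:
        raise ValueError("control locus shows no signal above background")
    return raw_enrichment(*locus) / control_value


# Hypothetical band intensities (arbitrary units), for illustration only.
leu2 = (12.0, 2.0, 100.0)        # noncentromeric control locus
cen_probe = (50.0, 10.0, 100.0)  # amplicon inside a putative CEN region
print(relative_enrichment(cen_probe, leu2))
```

With this normalization, the control locus itself always scores 1, so values well above 1 indicate enrichment relative to noncentromeric DNA.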

The Evolutionarily Conserved Kinetochore Protein CENP-C Homolog in C. dubliniensis, CdMif2p, Binds Preferentially to CdCse4p-Associated DNA.

Proteins in the CENP-C family have been shown to be associated with kinetochores in a large number of species (17). Using CaMif2p as the query sequence, we identified the CENP-C homolog (CdMif2p) in C. dubliniensis (see Materials and Methods). CdMif2p shows 77% identity and 5% similarity to CaMif2p in a 516-aa overlap. CdMIF2 codes for a 520-aa-long predicted protein in which the CENP-C box (amino acid residues 275–297) is 100% identical in C. albicans and C. dubliniensis (see Fig. S4). We constructed a strain (CDM1) to express CdMif2p with a C-terminal tandem affinity purification (TAP) tag (18) from its native promoter in the background of one wild-type copy of CdMIF2 (see SI Text). The subcellular localization patterns using polyclonal anti-Protein A antibodies in the C. dubliniensis strain (CDM1) at various stages of the cell cycle are very similar to those observed for CdCse4p (Fig. S4). We analyzed binding of TAP-tagged CdMif2p in the strain CDM1 by standard ChIP assays using anti-Protein A antibodies (Fig. 3 and Fig. S4). This experiment suggests that CdMif2p binds to the same 3-kb CdCse4p-rich region of three different chromosomes (chromosomes 1, 3, and 7) in C. dubliniensis (Fig. 3 and Fig. S4). Binding of two different evolutionarily conserved kinetochore proteins, CdCse4p and CdMif2p, at the same regions strongly implies that these regions are centromeric.

Comparative Sequence Analysis Between C. albicans and C. dubliniensis Reveals That Cse4p-Rich Centromere Regions Are the Most Rapidly Evolving Loci of the Chromosome.

Pairwise alignment of CdCse4p-rich sequences on different chromosomes (Table S3) with one another reveals no homology. To compare orthologous CEN regions of C. albicans and C. dubliniensis, we performed pairwise alignments using Sigma (19) and DIALIGN2 (20). These programs assemble global alignments from significant gapless local alignments. Sigma detects no homology in Cse4p-binding regions. DIALIGN2, with default parameters, reports a small degree of homology; but when we compare known nonorthologous sequence (namely, CEN sequences from nonmatching chromosomes), it reports almost identical results (Table 1). In other words, it finds no homology beyond what it would with the “null hypothesis” of unrelated sequence. Similar results were obtained with other sequence alignment programs. We conclude there is no significant homology in the orthologous Cse4p-containing CEN regions in C. albicans and C. dubliniensis, even though the CEN regions are flanked by orthologous, syntenous ORFs. However, neighboring (pericentric) ORF-free regions, located between the Cse4p-binding regions and CEN-adjacent ORFs, do exhibit a higher degree of homology compared to Cse4p-rich regions. We count mutation rates only in aligned blocks (ignoring insertions and deletions); DIALIGN2 aligns 68% of these regions, with a mutation rate of 36%, while Sigma aligns 37% of the regions, with a mutation rate of 22% in aligned regions. Much of the conservation occurs toward the outer ends of these regions, that is, near the bounding ORFs.

Table 1.

Comparison of mutation rates in Cse4p-binding and other genomic noncoding regions in C. albicans and C. dubliniensis

To estimate a “neutral” DNA mutation rate, we identified 2,653 putative gene orthologs of C. albicans in C. dubliniensis (see Materials and Methods). We aligned these genes with T-Coffee (21) and measured the synonymous mutation rates, using seven codons that are “fully degenerate” in the third position (the first 2 bases determine the coded amino acid). A naïve count of the third-position mutation rate yields 27%. Correcting for genomewide codon biases yields 42%, an upper-boundary estimate for the neutral rate of DNA mutation between these two yeasts (see Materials and Methods). This rate corresponds to a pairwise conservation rate (“proximity”) q = 0.58 or a proximity to a common ancestor of 0.76. Tests on synthetic DNA sequence (as reported in ref. 21) suggest that Sigma would easily align such sequence; therefore, it appears that CaCse4p-binding sequences (but not pericentric regions) have diverged faster than expected from the neutral point-mutation rate in these yeasts.
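The two proximity values quoted above are numerically consistent with a simple relation, sketched below in Python. The relation itself is an assumption made here for illustration (pairwise proximity q = 1 − mutation rate, and proximity to the common ancestor taken as √q, as would hold if both lineages diverged equally and independently); the text does not spell out the derivation, and the function name is hypothetical.

```python
import math


def proximities(pairwise_mutation_rate):
    """Return (pairwise proximity q, assumed proximity to the common ancestor).

    Assumption: the two lineages diverged equally and independently, so the
    pairwise proximity is the product of two equal per-lineage proximities.
    """
    if not 0.0 <= pairwise_mutation_rate <= 1.0:
        raise ValueError("mutation rate must lie between 0 and 1")
    q = 1.0 - pairwise_mutation_rate
    return q, math.sqrt(q)


q, to_ancestor = proximities(0.42)  # 0.42 is the upper-bound rate from the text
print(round(q, 2), round(to_ancestor, 2))
```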

We also identified 309 homologous intergenic regions in these species that were between 1,000 and 5,000 bp long (comparable in length with the Cse4p-binding regions). We aligned these regions with Sigma and DIALIGN2 and measured mutation rates in aligned regions only (ignoring insertions and deletions). Sigma aligned 56% of the input intergenic sequence, with a mutation rate of 17%; DIALIGN2 aligned 89% of the input sequence, with a mutation rate of 29%. This rate is less than our estimated neutral mutation rate of 42%, suggesting constraints on the evolution of intergenic DNA sequences. Although pericentric regions evolve slower than the neutral rate determined above, they have a smaller fraction of conserved blocks and a greater mutation rate than intergenic sequences.
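The two quantities reported for each aligner (fraction of sequence aligned, and mutation rate within aligned positions only) can be illustrated with a minimal Python sketch. It assumes a pairwise alignment given as two equal-length rows in which "-" marks unaligned or gapped positions; this input format, the function name, and the definition of the aligned fraction (aligned bases over all bases in both sequences) are assumptions for illustration and do not reproduce the actual Sigma or DIALIGN2 output formats.

```python
def aligned_block_stats(row_a, row_b, gap="-"):
    """Return (aligned_fraction, mutation_rate) for a pairwise alignment.

    Columns containing a gap in either row are ignored when counting
    mismatches, mirroring "aligned regions only (ignoring insertions
    and deletions)".
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    aligned = mismatches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue  # insertion/deletion column: not counted
        aligned += 1
        if a != b:
            mismatches += 1
    total_bases = sum(ch != gap for ch in row_a) + sum(ch != gap for ch in row_b)
    aligned_fraction = 2 * aligned / total_bases if total_bases else 0.0
    mutation_rate = mismatches / aligned if aligned else float("nan")
    return aligned_fraction, mutation_rate


# Toy alignment, for illustration only.
print(aligned_block_stats("ACGT--ACGT", "ACTTGGAC--"))
```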

Interestingly, despite the rapid divergence of CEN DNA sequences, the relative position of the CEN on each chromosome is conserved in all cases (Fig. S5). The relative location of the Cse4p-rich centromeric chromatin in the ORF-free region is also similar in both species (Fig. 1). Although we find no homology among Cse4p-binding regions in matching chromosomes, some of the ORF-free pericentric regions have repeated segments, both within the same species and across the two species (Fig. S6 and Table S4). These repeats are mostly singles and in some cases flank a core region; mostly these repeats are not conserved across chromosomes in C. dubliniensis but sometimes they are conserved across species (e.g., chromosome 5 repeats). However, these repeats are mostly chromosome specific and not restricted to only core centromeric or pericentric regions. These results strongly suggest that mechanisms other than the DNA sequence of Cse4p-bound regions, such as specific chromatin architecture, determine centromere identity in these species. The role of pericentric regions in determining centromere identity remains unclear.

Discussion

We have identified and characterized the core CdCse4p-rich centromeric DNA sequences of all eight chromosomes of C. dubliniensis. Two important evolutionarily conserved kinetochore proteins, CdCse4p and CdMif2p, are shown to be bound to these regions. Each of these CEN regions has a unique and different DNA sequence composition without any strong sequence motifs or centromere-specific repeats that are common to all of the eight centromeres and has A-T content similar to that of the overall genome. In these respects they are remarkably similar to CEN regions of C. albicans (11, 12). Although genes flanking corresponding CENs in these species are syntenous, the Cse4p-binding regions show no significant sequence homology. They appear to have diverged faster than other intergenic sequences of similar length and even faster than our best estimated neutral mutation rate for ORFs.

A study, based on computational analysis of centromere DNA sequences and kinetochore proteins of several organisms, indicates that point centromeres have probably derived from regional centromeres and appeared only once during evolution (22). The core Cse4p-rich regions of C. albicans and C. dubliniensis are intermediate in length between the point S. cerevisiae-like centromeres and the regional S. pombe centromeres. The characteristic features of point and regional yeast centromeres are the presence of consensus DNA sequence elements and repeats, respectively, organized around a nonhomologous core CenH3-rich region (CDEII and the central core of S. cerevisiae and S. pombe, respectively). Both C. albicans and C. dubliniensis centromeres lack such conserved elements or repeats around their nonconserved core centromere regions in each chromosome. On the basis of these features, we propose that these Candida species possess centromeres of an “intermediate” type between point and regional centromeres.

On rare occasions, functional neocentromeres form at nonnative loci in some organisms. However, neocentromere activation occurs only when the native centromere locus becomes nonfunctional. Therefore, native centromere sequences may have components that cause them to be preferred in forming functional centromeres. Despite sequence divergence, the location of the Cse4p-rich regions in orthologous regions of C. albicans and C. dubliniensis has been maintained for millions of years. We also observe homology in orthologous pericentric regions in a pairwise chromosome-specific analysis in these two species. Moreover, several short stretches of DNA sequences are found to be common in pericentric regions of some, but not all, C. albicans and C. dubliniensis chromosomes. Both in budding and in fission yeasts, pericentric regions contain conserved elements that are important for CEN function. In the absence of any highly specific sequence motifs or repeats in these regions, it is possible that specific histone modifications at more conserved pericentric regions facilitate the formation of a specialized three-dimensional common structural scaffold that favors centromere formation in these Candida species.

It is an enigma that, despite their conserved function and conserved neighboring orthologous regions, core centromeres evolve so rapidly in these closely related species. Satellite repeats, which constitute most of the Arabidopsis centromeres, have been shown to be evolving rapidly (23). However, because of their repetitive nature, these centromeres are subject to several events such as mutation, recombination, deletion, and translocation that may contribute to rapid change in centromere sequence. In the absence of any such highly repetitive sequences at core centromere regions of C. albicans and C. dubliniensis, such accelerated evolution is particularly striking. It is important to mention that a very recent report based on comparison of chromosome III of three closely related species of S. paradoxus suggests that the centromere seems to be the fastest evolving part in the chromosome (24).

Several studies reveal that centromeres function in a highly species-specific manner. Henikoff and colleagues proposed that rapid evolution of centromeric DNA and associated proteins may act as a driving force for speciation (1, 25). The rapid change in centromere sequence we observed in these two closely related Candida species may contribute to the generation of functional incompatibility of centromeres and thereby facilitate speciation. These two Candida species are both parasitic and clonally propagated. It is possible that the lack of recombination at the centromere and the more constant environment that a parasite finds itself in relative to a free-living organism may contribute to the differential sequence evolution observed between centromeres and the rest of the genome. It is still unclear how centromeres are packaged, and it is possible that the presence of CenH3-containing chromatin with very different biochemical properties from bulk chromatin (26) provides less protection from random mutation. To understand the mechanisms of centromere formation in the absence of specific DNA sequence cues, it will be important to identify more genetic and epigenetic factors that may contribute to the formation of specialized centromeric chromatin architecture.

Materials and Methods

Strains, Media, and Transformation Procedures.

The Candida dubliniensis and C. albicans strains used in this study are listed in Table S5 and the strain construction strategies are mentioned in the SI Text. These strains were grown in yeast extract/peptone/dextrose (YPD), yeast extract/peptone/succinate (YPS), or supplemented synthetic/dextrose (SD) minimal media at 30 °C as described. C. albicans and C. dubliniensis cells were transformed by standard techniques (27, 28).

Identification of CdCse4p and CdMif2p.

The C. dubliniensis Cse4p was identified by a BLAST search (http://www.sanger.ac.uk/cgi-bin/blast/submitblast/c_dubliniensis) with C. albicans Cse4p (CaCse4p) as the query sequence against the C. dubliniensis genome sequence database. This sequence analysis revealed three protein sequences with high homology to CaCse4p: two are the C. dubliniensis putative histone H3 proteins (Chr R- Cd36_32350 and Chr1- Cd36_04010 with BLAST scores of 333 each) and the other is CdCse4p (Chr 3- Cd36_80790 with BLAST score of 661). The CdCSE4 gene encodes a putative 212-aa-long protein with 100% identity in the C-terminal histone-fold domain of CaCse4p. A pairwise comparison of the CaCse4p and CdCse4p sequences revealed that they share 97% identity over a 212-aa overlap (Fig. S2). Using CaMif2p as the query sequence in the BLAST search against the C. dubliniensis genome database, we retrieved a single hit that was identified as the CENP-C homolog (Cd36_63360) in C. dubliniensis, showing 77% identity in a 516-aa overlap with CaMif2p. The CdMIF2 gene codes for a putative 520-aa-long protein with a conserved CENP-C box required for centromere targeting (29, 30, 11) that is identical in C. albicans and C. dubliniensis.

Complementation Assay, Indirect Immunofluorescence, ChIP assay, and Sequence Analysis.

The construction of C. albicans strain CAKS3b, the pAB1-based plasmids carrying CSE4 genes of C. albicans and C. dubliniensis, and the procedure for the complementation assay are described in the SI Text. Intracellular CdCse4p or CdMif2p was visualized by indirect immunofluorescence microscopy as described previously (16). ChIP assays were performed as described before (11). Details of the indirect immunofluorescence, ChIP procedure, and WU-BLAST 2.0 analysis to identify CdCENs and flanking ORFs are available in the SI Text.

Homology Detection and Mutation Rate Measurement.

We used Sigma (version 1.1.3) and DIALIGN 2 (version 2.2.1) to align ORF-free DNA sequences. Default parameters were used for both programs, but Sigma was given an auxiliary file of intergenic sequences from which to estimate a background model. Orthologous genes were aligned (at the amino acid level) with T-Coffee. We examined instances of the following seven codons where the first two positions were conserved in both species: GTn (valine), TCn (serine), CCn (proline), ACn (threonine), GCn (alanine), CGn (arginine), and GGn (glycine) (n, any nucleotide). Third-position mutations here do not change the amino acid. (Leucine was ignored because of a variant codon in these species.) A naïve count of mutation rates in the third position yields 0.27. Taking into consideration genomewide bias for each codon (details are in SI Text), an upper-bound mutation rate of 0.42 was obtained.
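The naïve third-position count described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes the two orthologous coding sequences are already aligned codon-by-codon, in frame and without gaps, and it does not attempt the genomewide codon-bias correction (whose details are in the SI Text). The function name and toy sequences are hypothetical.

```python
# First two bases of the seven codon families listed in the text, in which
# any third base encodes the same amino acid (leucine is deliberately absent).
FOURFOLD_PREFIXES = {"GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def naive_third_position_count(cds_a, cds_b):
    """Return (changed_sites, examined_sites) over fully degenerate codons.

    A codon pair is examined only if the first two bases are identical in
    both species and belong to one of the seven listed families.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be codon-aligned and of equal length")
    changed = examined = 0
    for i in range(0, len(cds_a) - 2, 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        if codon_a[2] not in "ACGT" or codon_b[2] not in "ACGT":
            continue  # skip ambiguous bases
        examined += 1
        if codon_a[2] != codon_b[2]:
            changed += 1
    return changed, examined


# Toy sequences, for illustration only: GTA/GTC, GCC/GCC, TTA/TTG, ACG/ACA.
changed, examined = naive_third_position_count("GTAGCCTTAACG", "GTCGCCTTGACA")
print(changed, examined)
```

Summing the two counts over all ortholog pairs and dividing changed by examined sites would give a naïve rate of the kind quoted above.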

Data Availability.

The coordinates of the C. dubliniensis ORFs mentioned in Table S2 were obtained by BLAST analysis of the C. dubliniensis genome database (www.sanger.ac.uk/cgi-bin/blast/submitblast/c_dubliniensis) as of May 16, 2007, and the coordinates apply to the 031907 release of the contigs. Subsequent to this work, an independent annotation and new nomenclature of ORFs in the C. dubliniensis genome have been made available from the GeneDB database (www.genedb.org/genedb/). The CdCse4p-rich centromere sequences of C. dubliniensis can be obtained from www.jncasr.ac.in/sanyal/Cdsequences.txt.

Acknowledgments

We thank P. Magee, J. Morschhaeuser, and P. Koetter for the strains and reagents, M. A. Lone for plasmid constructs, Suma for confocal images, and Mary Baum for critical comments on the manuscript. Sequence data for C. dubliniensis were obtained from the Wellcome Trust Sanger Institute website at http://www.sanger.ac.uk/cgi-bin/blast/submitblast/c_dubliniensis. This work was supported by a research grant from the Department of Science and Technology, Government of India (SR/SO/BB-24/2007) and by Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR) (K.S.). J.T. is a junior research fellow funded by the Department of Biotechnology, Government of India. R.S. was supported by the PRISM project at the Institute of Mathematical Sciences.

Footnotes

2To whom correspondence should be addressed. E-mail: sanyal{at}jncasr.ac.in

This article contains supporting information online at www.pnas.org/cgi/content/full/0809770105/DCSupplemental.

Author contributions: S.P., J.T., R.S., and K.S. designed research; S.P., J.T., and R.S. performed research; S.P., J.T., R.S., and K.S. analyzed data; and S.P., J.T., R.S., and K.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

© 2008 by The National Academy of Sciences of the USA

References

1. Henikoff S, Ahmad K, Malik HS (2001) The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293:1098–1102.
2. Fitzgerald-Hayes M, Clarke L, Carbon J (1982) Nucleotide sequence comparisons and functional analysis of yeast centromere DNAs. Cell 29:235–244.
3. Clarke L (1998) Centromeres: proteins, protein complexes, and repeated domains at centromeres of simple eukaryotes. Curr Opin Genet Dev 8:212–218.
4. Cumberledge S, Carbon J (1987) Mutational analysis of meiotic and mitotic centromere function in Saccharomyces cerevisiae. Genetics 117:203–212.
5. Gaudet A, Fitzgerald-Hayes M (1987) Alterations in the adenine-plus-thymine-rich region of CEN3 affect centromere function in Saccharomyces cerevisiae. Mol Cell Biol 7:68–75.
6. Furuyama S, Biggins S (2007) Centromere identity is specified by a single centromeric nucleosome in budding yeast. Proc Natl Acad Sci USA 104:14706–14711.
7. Takahashi K, Chen ES, Yanagida M (2000) Requirement of Mis6 centromere connector for localizing a CENP-A-like protein in fission yeast. Science 288:2215–2219.
8. Marschall LG, Clarke L (1995) A novel cis-acting centromeric DNA element affects S. pombe centromeric chromatin structure at a distance. J Cell Biol 128:445–454.
9. Baum M, Ngan VK, Clarke L (1994) The centromeric K-type repeat and the central core are together sufficient to establish a functional Schizosaccharomyces pombe centromere. Mol Biol Cell 5:747–761.
10. Cleveland DW, Mao Y, Sullivan KF (2003) Centromeres and kinetochores: from epigenetics to mitotic checkpoint signaling. Cell 112:407–421.
11. Sanyal K, Baum M, Carbon J (2004) Centromeric DNA sequences in the pathogenic yeast Candida albicans are all different and unique. Proc Natl Acad Sci USA 101:11374–11379.
12. Mishra PK, Baum M, Carbon J (2007) Centromere size and position in Candida albicans are evolutionarily conserved independent of DNA sequence heterogeneity. Mol Genet Genomics 278:455–465.
13. Baum M, Sanyal K, Mishra PK, Thaler N, Carbon J (2006) Formation of functional centromeric chromatin is specified epigenetically in Candida albicans. Proc Natl Acad Sci USA 103:14877–14882.
14. Sullivan DJ, Westerneng TJ, Haynes KA, Bennett DE, Coleman DC (1995) Candida dubliniensis sp. nov.: phenotypic and molecular characterization of a novel species associated with oral candidosis in HIV-infected individuals. Microbiology 141:1507–1521.
15. Talbert PB, Bryson TD, Henikoff S (2004) Adaptive evolution of centromere proteins in plants and animals. J Biol 3:18.1–18.17.
16. Sanyal K, Carbon J (2002) The CENP-A homolog CaCse4p in the pathogenic yeast Candida albicans is a centromere protein essential for chromosome transmission. Proc Natl Acad Sci USA 99:12969–12974.
17. Copenhaver GP (2004) Who's driving the centromere? J Biol 3:17.
18. Corvey C, et al. (2005) Carbon source-dependent assembly of the Snf1p kinase complex in Candida albicans. J Biol Chem 280:25323–25330.
19. Siddharthan R (2006) Sigma: multiple alignment of weakly-conserved non-coding DNA sequence. BMC Bioinformatics 7:143.
20. Morgenstern B (1999) DIALIGN2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15:211–218.
21. Notredame C, Higgins D, Heringa J (2000) T-Coffee: a novel method for multiple sequence alignments. J Mol Biol 302:205–217.
22. Meraldi P, McAinsh AD, Rheinbay E, Sorger PK (2006) Phylogenetic and structural analysis of centromeric DNA and kinetochore proteins. Genome Biol 7:R23.1–R23.21.
23. Hall SE, Kettler G, Preuss D (2003) Centromere satellites from Arabidopsis populations: maintenance of conserved and variable domains. Genome Res 13:195–205.
24. Bensasson D, Zarowiecki M, Burt A, Koufopanou V (2008) Rapid evolution of yeast centromeres in the absence of drive. Genetics 178:2161–2167.
25. Malik HS, Henikoff S (2002) Conflict begets complexity: the evolution of centromeres. Curr Opin Genet Dev 12:711–718.
26. Dalal Y, Wang H, Lindsay S, Henikoff S (2007) Tetrameric structure of centromeric nucleosomes in interphase Drosophila cells. PLoS Biol 5:1798–1809.
27. Burgers PM, Percival KJ (1987) Transformation of yeast spheroplasts without cell fusion. Anal Biochem 163:391–397.
28. Hull CM, Johnson AD (1999) Identification of a mating type-like locus in the asexual pathogenic yeast Candida albicans. Science 285:1271–1275.
29. Yu HG, Hiatt EN, Dawe RK (2000) The plant kinetochore. Trends Plant Sci 5:543–547.
30. Suzuki N, et al. (2004) CENP-B interacts with CENP-C domains containing Mif2 regions responsible for centromere localization. J Biol Chem 279:5934–5946.