A. nidulans assembly information

Of the 248 contigs in the original Broad assembly, 231 have been assembled into supercontigs and chromosomes using data from the Broad Website and from John Clutterbuck's A. nidulans Linkage Map Website. The following pages were used from the Broad site:
The following pages were used from John Clutterbuck's site:

Assembly into Supercontigs

The original Broad assembly placed contigs within supercontigs in a head-to-tail orientation (e.g. contigs 1 to 21 in supercontig 1). In general this has not changed, but John Clutterbuck has demonstrated that the sequences of many of these contigs actually overlap. These overlaps are shown within ContigView and, in some instances, mean that two genes have been annotated where the overlap shows that only one gene is actually present. (For instance, contigs 102 and 103 in supercontig 7 overlap by 1885 bp, and gene AN6008.3 is in fact just the 3' end of gene AN6009.3.) Because of the way Ensembl works, only the genes from the 5' contig are shown in the overlapping region within ContigView. The 'missing' genes are still present in the database and can be searched for and viewed within GeneView.

In addition, by using sequence overlaps, John Clutterbuck has been able to position many previously unassembled contigs (58 in total) within the supercontigs. Where no sequence overlap was found, we have assumed that there is a sequence gap and have assigned an arbitrary gap of 1000 bp between the contigs to indicate this. In four cases, actual gap sizes are given, as John Clutterbuck was able to demonstrate that a previously cloned genomic sequence overlapped two contigs. (For example, AF497720, containing the mnpA gene, overlaps contigs 40 and 41 but leaves a 193 bp gap between the two contigs in supercontig 2.) There are 17 supercontigs at the end of this process.

Assembly into Chromosomes

Using cloned genetic markers, it has been possible to assign supercontigs to linkage groups (LG) on the genetic map and also to confirm that the contig order is correct (see, however, the separate section on LG V). Fourteen of the supercontigs form chromosomal arms.
Only three of these supercontigs (11, 14 and 15) have the telomeric simple sequence (TTAGGG) at one end, allowing the orientation of the arm to be validated independently of the genetic map. Using the methodology from Li et al., Mark Farman was able to link supercontigs to contigs containing the telomeric simple sequence and thus confirm the orientation of most of the remaining supercontigs. If these supercontigs represent chromosomal arms, then the centromere would be expected at the other end. All of them, except supercontigs 2 and 12, have a few kilobases of A+T-rich sequence at their extreme ends, which is indicative of the start of the centromeric region. For supercontig 16, the presence of approx. 4.5 kb of A+T-rich sequence at the end of contig 173 was the only piece of evidence used to orientate this arm. Chromosomal arms have been displayed with an arbitrary centromeric gap of 50 kb.

Assembly of Linkage Group V

Various data indicate that the original assembly of supercontig 6 from contigs 87 to 98 was incorrect. John Clutterbuck was able to show that contig 216 overlaps contig 91 and, as contig 216 ends in the telomeric repeat, this indicated that the supercontig should be broken at this point and inverted. After this inversion, the cloned genetic markers are in the correct order relative to the genetic map. Contig 98 ends in approx. 5 kb of A+T-rich sequence, which indicates that the centromere lies at this end and that contigs 91 to 98 therefore represent the entire right arm of LG V (supercontig 6b). The first 4 kb of contig 87 also contains A+T-rich sequence, indicating that this contig lies on the other side of the centromere. The resulting inversion of contigs 87 to 90 (supercontig 6a) also means that the genetic markers within these contigs are now in the correct order. The end of contig 90 contains part of a ribosomal DNA repeat, which explains the assembly break at this point.
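The two sequence signatures used above to orientate arms (the TTAGGG telomeric repeat and A+T-rich sequence at a contig end) can be illustrated with a minimal Python sketch. It is not the procedure actually used for this assembly. The function names, the window size, the 70% A+T threshold and the requirement of three tandem repeat units are all assumptions chosen for illustration, since this page gives no numerical cut-offs.

```python
# Illustrative only: thresholds and names are assumptions, not values from this page.
TELOMERE_UNIT = "TTAGGG"  # telomeric simple sequence named in the text


def at_fraction(seq):
    """Fraction of bases in seq that are A or T (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_telomere_repeat(seq, min_units=3):
    """True if seq contains min_units tandem copies of TTAGGG on either strand."""
    target = TELOMERE_UNIT * min_units
    seq = seq.upper()
    return target in seq or reverse_complement(target) in seq


def classify_end(seq, window=4000, at_threshold=0.7, min_units=3):
    """Label the last `window` bases of seq.

    Returns "telomere" if the tandem repeat is present, "AT-rich" if the
    A+T fraction reaches at_threshold, and "unclassified" otherwise.
    """
    end_seq = seq[-window:]
    if has_telomere_repeat(end_seq, min_units):
        return "telomere"
    if at_fraction(end_seq) >= at_threshold:
        return "AT-rich"
    return "unclassified"
```

To examine the start of a contig rather than its end, the same function could be applied to the reverse complement of the sequence.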
Supercontig 12 forms the rest of this arm (the one containing supercontig 6a), based on its containing appropriate cloned genetic markers, the absence of any A+T-rich sequence at either end and the link to a telomere at the contig 146 end. An arbitrary gap of 200 kb has been shown between supercontigs 6a and 12 to indicate the position of the rDNA repeat region.

Ribosomal DNA repeat region

Most of the ribosomal repeat region is missing from the assembly. Only a small portion of a repeat unit was found in contig 90. This includes the last 576 bp of the 28S gene (which is over 3 kb in size) and the first 564 bp of the non-transcribed and externally transcribed spacers. It appears, therefore, that the ribosomal DNA repeat region starts at this end with a small 3' fragment of a repeat unit.

Unassembled contigs

Seventeen contigs have not been assigned to a supercontig or chromosomal arm. They can be viewed as separate supercontigs within ContigView. Only four of the contigs have been linked into supercontigs containing more than one contig (184 and 185 in supercontig 27, and 217 and 218 in supercontig 59). All but contig 226 are repetitive in nature, and seven contain A+T-rich sequence, suggesting that they are located in pericentromeric regions.
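The placement conventions described under Assembly into Supercontigs (a sequence overlap, a measured gap, or an arbitrary 1000 bp gap when neither is known) can be summarised in a short Python sketch. This illustrates the coordinate arithmetic only and is not the code used to build the assembly. The `Join` class, the `layout` function and the contig names and lengths in the usage note are hypothetical; only the 1000 bp default gap and the 1885 bp and 193 bp figures come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_GAP = 1000  # arbitrary gap (bp) used when no overlap was found


@dataclass
class Join:
    """How a contig relates to the contig placed before it (hypothetical type)."""
    overlap: int = 0           # bp shared with the previous contig, 0 if none
    gap: Optional[int] = None  # measured gap in bp, None if unknown


def layout(contigs: List[Tuple[str, int]],
           joins: List[Join]) -> List[Tuple[str, int, int]]:
    """Place contigs head-to-tail and return (name, start, end) tuples.

    Coordinates are 1-based and inclusive. `contigs` holds (name, length)
    pairs; joins[i] describes the junction between contigs[i] and
    contigs[i + 1].
    """
    if contigs and len(joins) != len(contigs) - 1:
        raise ValueError("need exactly one join per adjacent contig pair")
    placed = []
    pos = 1  # next free coordinate in the supercontig
    for i, (name, length) in enumerate(contigs):
        if i > 0:
            join = joins[i - 1]
            if join.overlap > 0:
                pos -= join.overlap   # step back over the shared sequence
            elif join.gap is not None:
                pos += join.gap       # use the measured gap size
            else:
                pos += DEFAULT_GAP    # unknown: assume the arbitrary gap
        start, end = pos, pos + length - 1
        placed.append((name, start, end))
        pos = end + 1
    return placed
```

For example, with four hypothetical contigs A to D of 10000, 8000, 5000 and 6000 bp, `layout([("A", 10000), ("B", 8000), ("C", 5000), ("D", 6000)], [Join(overlap=1885), Join(gap=193), Join()])` places B so that it shares 1885 bp with A, leaves 193 bp between B and C, and leaves the arbitrary 1000 bp between C and D.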