DNA Sequence Analysis

  • Date: 28.12.15
  • DNA marker, sequencing, alignment and conversion tools

Sequence Marker

Alignment

Background

Algorithms and Software

  • Clustal
  • MUSCLE
    • MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW. MUSCLE can align hundreds of sequences in seconds. Most users learn everything they need to know about MUSCLE in a few minutes—only a handful of command-line options are needed to perform common alignment tasks.
    • Edgar, R. C. (2004a). MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5(1), 113. http://doi.org/10.1186/1471-2105-5-113
    • Edgar, R. C. (2004b). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–7. http://doi.org/10.1093/nar/gkh340
  • MAFFT
    • MAFFT is a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods, including L-INS-i (accurate; for alignments of fewer than about 200 sequences) and FFT-NS-2 (fast; for alignments of fewer than about 30,000 sequences).
    • Katoh, K., Misawa, K., Kuma, K., & Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059–3066. http://doi.org/10.1093/nar/gkf436
    • Katoh, K., Kuma, K., Toh, H., & Miyata, T. (2005). MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research, 33(2), 511–8. http://doi.org/10.1093/nar/gki198
    • Katoh, K., & Toh, H. (2008). Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics, 9(4), 286–98. http://doi.org/10.1093/bib/bbn013
    • Katoh, K., Asimenos, G., & Toh, H. (2009). Multiple alignment of DNA sequences with MAFFT. In Bioinformatics for DNA sequence analysis (pp. 39–64). Springer.
    • Katoh, K., & Frith, M. C. (2012). Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics (Oxford, England), 28(23), 3144–6. http://doi.org/10.1093/bioinformatics/bts578
    • Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–80. http://doi.org/10.1093/molbev/mst010
  • T-Coffee
    • T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle...) into one unique alignment (M-Coffee). T-Coffee can align Protein, DNA and RNA sequences. It is also able to combine sequence information with protein structural information (3D-Coffee/Expresso), profile information (PSI-Coffee) or RNA secondary structures (R-Coffee).
    • Notredame, C., Higgins, D. G., & Heringa, J. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology, 302(1), 205–17. http://www.sciencedirect.com/science/article/pii/S0022283600940427
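
The method choices described for MAFFT and MUSCLE above can be sketched in code. The following Python sketch is illustrative only: it picks a MAFFT strategy from the approximate sequence-count ranges quoted above and builds the argument list without running anything. The function names and the `--auto` fallback for very large inputs are assumptions of this example, and the MUSCLE helper assumes the 3.x command line (`-in`/`-out`), which changed in MUSCLE 5.

```python
from typing import List


def count_fasta_records(fasta_text: str) -> int:
    """Count sequences in FASTA text by counting header lines."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def mafft_command(n_sequences: int, input_path: str) -> List[str]:
    """Build a MAFFT argument list based on the number of input sequences.

    The thresholds (about 200 and about 30,000) are the rough guidance
    quoted in the MAFFT description, not hard limits.
    """
    if n_sequences < 200:
        # L-INS-i: accurate, iterative refinement with local pairwise alignment.
        return ["mafft", "--localpair", "--maxiterate", "1000", input_path]
    if n_sequences < 30000:
        # FFT-NS-2: fast progressive method without iterative refinement.
        return ["mafft", "--retree", "2", "--maxiterate", "0", input_path]
    # Assumption of this sketch: beyond both ranges, let MAFFT pick a strategy.
    return ["mafft", "--auto", input_path]


def muscle_v3_command(input_path: str, output_path: str) -> List[str]:
    """Build a basic MUSCLE 3.x argument list (assumes the 3.x option names)."""
    return ["muscle", "-in", input_path, "-out", output_path]


if __name__ == "__main__":
    example = ">seq1\nACGT\n>seq2\nACGA\n>seq3\nACGC\n"
    n = count_fasta_records(example)
    print(" ".join(mafft_command(n, "input.fasta")))
    print(" ".join(muscle_v3_command("input.fasta", "aligned.fasta")))
```

To actually run an alignment, the resulting list can be passed to `subprocess.run`. MAFFT writes the alignment to standard output, so the output has to be captured or redirected to a file.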

Format Conversion

  • ALTER
    • Homepage: http://sing.ei.uvigo.es/ALTER/
    • Publication: Glez-Peña, D. et al. (2010). ALTER: program-oriented conversion of DNA and protein alignments. Nucleic Acids Research, 38, W14–8. http://nar.oxfordjournals.org/content/38/suppl_2/W14
    • Description: "ALTER is an open web-based tool to transform between different multiple sequence alignment formats. The originality of ALTER lies in the fact that it focuses on the specifications of mainstream alignment and analysis programs rather than on the conversion among more or less specific formats. In addition, ALTER is capable of identify and remove identical sequences during the transformation process. Besides its user-friendly environment, ALTER allows access to its functionalities in a programmatic way through a Representational State Transfer web service."
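
The idea of converting an alignment while dropping identical sequences can be illustrated with a small local sketch. This is not ALTER's implementation and does not use its web service. It is a minimal Python example that assumes the input is an aligned FASTA file and the target is relaxed sequential PHYLIP. All function names are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (sequence name, aligned sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (name, sequence) pairs; the name is the first header word."""
    records: List[Record] = []
    name = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            words = line[1:].split()
            # Fall back to a generated name if the header is empty.
            name = words[0] if words else f"seq{len(records) + 1}"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def collapse_identical(records: List[Record]) -> List[Record]:
    """Keep only the first record of each distinct sequence (case-insensitive)."""
    seen = set()
    kept: List[Record] = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept.append((name, seq))
    return kept


def to_relaxed_phylip(records: List[Record]) -> str:
    """Write records as relaxed sequential PHYLIP: a count/length header, then one line per sequence."""
    if not records:
        raise ValueError("no sequences to write")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")
    width = max(len(name) for name, _ in records)
    lines = [f"{len(records)} {length}"]
    lines += [f"{name.ljust(width)}  {seq}" for name, seq in records]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    fasta = ">s1\nACG-T\n>s2\nACG-T\n>s3 sample\nAC\nGTT\n"
    print(to_relaxed_phylip(collapse_identical(parse_fasta(fasta))), end="")
```

Strict PHYLIP limits names to ten characters, which is one reason a program-oriented converter needs to know which analysis program will read the output. This sketch sidesteps that by using the relaxed variant.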