Alignment statistic for identifying related protein sequences

G. William Moore, Morris Goodman

    Research output: Contribution to journal › Article › peer-review

    23 Scopus citations


    Closely related proteins show an obvious kinship by having numerous matching amino acids in their aligned sequences. Kinship between anciently separated proteins requires a statistical evaluation to rule out fortuitous similarities. A simple statistic is developed that assumes equal probability for all codon pairs, and a table of critical values for amino acid sequence alignments of length 200 or less is presented. Applying this statistic to V and C regions of immunoglobulin chains, aligned on the basis of shared features of three-dimensional structure, provides evidence that the V and C sequences descended from a common ancestor. Similarly, the distant evolutionary relationship of dehydrogenases, flavodoxin, and subtilisin, suggested by structural alignments, is verified. On the other hand, the statistic does not verify a common evolutionary origin for the heme-binding pocket in globins and cytochrome b5. Empirical evidence from the distribution of minimum mutation distance (MMD) values of amino acid pairs in comparisons of misaligned polypeptide chains, and from Monte Carlo trials of sequences aligned with arbitrary gaps, supports the validity of the statistic.
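The abstract does not spell out how the statistic is computed, so the following is only a minimal sketch of the underlying quantity, the minimum mutation distance between two amino acids, taken here to mean the smallest number of nucleotide differences between any codon of one and any codon of the other. It assumes the standard genetic code, skips gapped positions, and treats "equal probability for all codon pairs" as uniform sampling over the 61 sense codons; these choices and all function names are illustrative assumptions, not the authors' procedure or their table of critical values.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1). Codons are enumerated with
# bases in the order T, C, A, G at each position; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the sense codons by the amino acid they encode.
CODONS_BY_AA = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        CODONS_BY_AA.setdefault(aa, []).append(codon)


def hamming(codon1, codon2):
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon1, codon2))


def mmd(aa1, aa2):
    """Minimum mutation distance between two amino acids (0 to 3)."""
    return min(
        hamming(c1, c2)
        for c1 in CODONS_BY_AA[aa1]
        for c2 in CODONS_BY_AA[aa2]
    )


def alignment_mmd(seq1, seq2, gap="-"):
    """Sum of MMD over aligned positions; positions with a gap are skipped.

    Returns (total MMD, number of positions compared). Skipping gaps is an
    assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    total = compared = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        total += mmd(a, b)
        compared += 1
    return total, compared


def mmd_distribution_uniform_codon_pairs():
    """Per-position MMD distribution when both codons are drawn uniformly
    from the 61 sense codons (an assumed reading of the null model)."""
    sense = [aa for aa in CODON_TABLE.values() if aa != "*"]
    counts = Counter(mmd(a1, a2) for a1, a2 in product(sense, repeat=2))
    n = sum(counts.values())
    return {d: counts[d] / n for d in sorted(counts)}
```

For example, `alignment_mmd("MF-W", "ML-Y")` returns `(3, 3)`: Met/Met contributes 0, Phe/Leu 1, the gapped column is skipped, and Trp/Tyr contributes 2. A significance test in the spirit of the abstract would compare such an observed total with the total expected by chance for an alignment of the same length.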

    Original language: English (US)
    Pages (from-to): 121-130
    Number of pages: 10
    Journal: Journal of Molecular Evolution
    Issue number: 2
    State: Published - 1977


    Keywords

    • Dehydrogenases
    • Evolutionary relationship
    • Minimum mutation distance
    • Significance test
    • Structural alignments
    • V and C immunoglobulin sequences

    ASJC Scopus subject areas

    • Agricultural and Biological Sciences (all)
    • Agricultural and Biological Sciences (miscellaneous)
    • Ecology, Evolution, Behavior and Systematics
    • Biochemistry, Genetics and Molecular Biology (all)
    • Biochemistry
    • Genetics
    • Molecular Biology
    • Genetics (clinical)

