Sequence Logos

A sequence logo is a graphical representation of aligned sequences. At each position, the height of each residue is proportional to its frequency at that position, and the total height of all the residues is proportional to the conservation (information content) of the position (TD Schneider & RM Stephens, "Sequence Logos: A New Way to Display Consensus Sequences", NAR 18:6097-6100 (1990)).
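The relationship between frequency, information content and letter height can be sketched in a few lines of Python. This is a minimal illustration, not the makelogo implementation: it assumes a four-letter DNA alphabet, ignores gap characters, and omits the small-sample correction described by Schneider & Stephens. The function name `logo_heights` is hypothetical.

```python
import math
from collections import Counter

def logo_heights(aligned, alphabet="ACGT"):
    """Return, for each column, a dict mapping residue -> letter height in bits.

    Simplified sketch: no small-sample correction, and characters that are
    not in `alphabet` (such as gaps) are ignored.
    """
    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    heights = []
    for col in range(len(aligned[0])):
        counts = Counter(seq[col] for seq in aligned)
        total = sum(counts[r] for r in alphabet)
        if total == 0:
            heights.append({})  # column holds only gaps or unknown characters
            continue
        freqs = {r: counts[r] / total for r in alphabet if counts[r] > 0}
        # Shannon entropy of the column, in bits
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy  # total stack height for this column
        # Each letter takes a share of the stack equal to its frequency
        heights.append({r: f * info for r, f in freqs.items()})
    return heights

if __name__ == "__main__":
    for column in logo_heights(["ACGT", "ACGA", "ACCA", "AGTA"]):
        print(column)
```

A fully conserved column reaches the maximum stack height (2 bits for DNA), while a column in which all residues are equally frequent has a height of zero.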

Tom Schneider's Sequence Logos site


Logos from Related Sequences, Blocks, and Multiple Alignments

Blocks can be displayed as logos to examine sequence conservation. Start with a set of related sequences or a multiple alignment in Blocks, Clustal or FASTA-alignment format. For a set of related sequences, either get a global ClustalW multiple alignment from the EBI Clustal or the BCM Search Launcher multiple sequence alignment site, from which blocks are then made, or use Block Maker to make blocks directly. Copy and paste the alignment, in Clustal format (include the word 'CLUSTAL' from the heading) or FASTA-alignment format, into the Multiple Alignment Processor window and choose 'Submit the sequences'. The processor carves out blocks from fully ungapped regions that are at least 10 residues wide and provides an automatic link for making logos from the full set of blocks. Get Blocks (for Blocks/Prints Database entries) and Block Maker (which uses Motif or Gibbs sampling) provide blocks that are ready to go. In each case, there are links to view logos or other displays.
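The step of carving blocks out of fully ungapped regions can be illustrated with a short Python sketch. It is an assumption about how such a scan might work, not the Multiple Alignment Processor's actual code: the function name `ungapped_blocks`, the gap characters '-' and '.', and the half-open column ranges it returns are all chosen for illustration.

```python
def ungapped_blocks(alignment, min_width=10, gap_chars="-."):
    """Return (start, end) column ranges, end exclusive, in which no
    sequence has a gap and the run is at least `min_width` columns wide."""
    n_cols = len(alignment[0])
    blocks = []
    start = None  # first column of the current ungapped run, if any
    for col in range(n_cols + 1):
        # Treat the position past the last column as gapped to close the final run
        gapped = col == n_cols or any(seq[col] in gap_chars for seq in alignment)
        if not gapped:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks

if __name__ == "__main__":
    toy_alignment = [
        "ACDEFGHIKLMN-PQRSTVWYACDEF",
        "ACDEFGHIKLMNAPQ-STVWYACDEF",
    ]
    for start, end in ungapped_blocks(toy_alignment):
        print(start, end, [seq[start:end] for seq in toy_alignment])
```

In the toy alignment, the two-column run between the gaps is discarded because it is narrower than the 10-residue minimum.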

The logo for a block is computed from the position-specific scoring matrix (PSSM) that is used to score the block against a query sequence. The PSSM is based on sequence-weighted counts of each amino acid in each column of the block, normalized by dividing by the expected frequency of each amino acid in a protein sequence database (S. Henikoff, J. G. Henikoff, W. J. Alford & S. Pietrokovski, "Automated construction and graphical presentation of protein blocks from unaligned sequences", Gene-COMBIS, Gene 163 (1995) GC 17-26).
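As a rough illustration of that normalization, the Python sketch below turns sequence-weighted counts into observed-to-expected ratios for each column. It is not the Blocks server's code: the function name `odds_matrix` is hypothetical, the sequence weights are taken as given rather than computed, and the three-letter alphabet with its background frequencies is a made-up toy example rather than real database values.

```python
def odds_matrix(block, weights, background):
    """For each column, return residue -> (weighted frequency / expected frequency).

    `block` is a list of equal-length ungapped sequences, `weights` gives one
    weight per sequence, and `background` maps each residue to its expected
    frequency. A residue missing from `background` raises KeyError.
    """
    total_weight = sum(weights)
    matrix = []
    for col in range(len(block[0])):
        weighted = {aa: 0.0 for aa in background}
        for seq, w in zip(block, weights):
            weighted[seq[col]] += w  # sequence-weighted count
        matrix.append({
            aa: (weighted[aa] / total_weight) / background[aa]
            for aa in background
        })
    return matrix

if __name__ == "__main__":
    # Toy inputs: hypothetical weights and background frequencies
    toy_block = ["AC", "AC", "AG"]
    toy_weights = [1.0, 1.0, 2.0]
    toy_background = {"A": 0.5, "C": 0.25, "G": 0.25}
    for column in odds_matrix(toy_block, toy_weights, toy_background):
        print(column)
```

A ratio above 1 means the residue appears in that column more often than expected by chance, and a ratio below 1 means it appears less often.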

Tom Schneider's makelogo program is used to create the sequence logo from the PSSM. The logo is in PostScript or PDF format. The amino acids in the logo are colored according to their chemical and physical characteristics.


Page last modified January 15, 1998