Sequence Logo Graphs
CLC's workbenches offer a number of alignment-specific view options in the Alignment info preference group. One of these options displays a "sequence logo". The sequence logo shows the information content of every position in the alignment as residues or nucleotides stacked on top of each other.
The sequence logo provides a more detailed view of the alignment than the conservation view.
Each position of the alignment, and consequently of the sequence logo, presents the sequence information as a score computed from Shannon entropy [Schneider and Stephens, 1990]. The height of the individual letters represents the sequence information content at that particular position of the alignment.
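As an illustration of how such a score can be derived, the following minimal sketch computes the information content of a single ungapped alignment column and the resulting letter heights. It is not the workbench's implementation: the function name `column_information` and the `alphabet_size` parameter are hypothetical, and the small-sample correction described by Schneider and Stephens is left out for brevity.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Return (information content in bits, {residue: letter height}).

    Assumptions: `column` is a string of residues without gaps, and
    `alphabet_size` is 4 for nucleotides or 20 for proteins. No
    small-sample correction is applied.
    """
    counts = Counter(column)
    total = sum(counts.values())
    freqs = {residue: n / total for residue, n in counts.items()}

    # Shannon entropy (uncertainty) of the observed residue frequencies.
    entropy = -sum(p * math.log2(p) for p in freqs.values())

    # Information content: maximum possible entropy minus observed entropy.
    info = math.log2(alphabet_size) - entropy

    # Each letter's height is its frequency times the column's information.
    heights = {residue: p * info for residue, p in freqs.items()}
    return info, heights


if __name__ == "__main__":
    info, heights = column_information("AAAAAAAGGG")
    print(f"information: {info:.3f} bits")
    for residue, height in sorted(heights.items()):
        print(f"  {residue}: {height:.3f}")
```

A fully conserved nucleotide column yields the maximum of 2 bits, while a column with all four nucleotides in equal proportion yields 0 bits, so its letters have no height in the logo.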
A sequence logo is also a much better visualization tool than a simple consensus sequence. Consider, for instance, an alignment in which a particular residue is found at one position in 70% of the sequences. If a consensus sequence were defined, it would typically display only that single residue with 70% coverage, and the residues found in the remaining 30% of the sequences would not be visible.
In the figure above, an ungapped alignment of 11 E. coli start codons, including flanking regions, is shown.
In this example, a consensus sequence would display only ATG as the start codon at position 1, whereas the sequence logo shows that GTG is also allowed as a start codon.
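The difference between a consensus sequence and the per-column residue frequencies that underlie a logo can be sketched as follows. The alignment `toy` is invented for illustration (it is not the E. coli data from the figure), and the helper names `consensus` and `column_frequencies` are hypothetical.

```python
from collections import Counter


def consensus(alignment):
    """Return the most frequent residue of each column as one string."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*alignment)
    )


def column_frequencies(alignment):
    """Return one {residue: relative frequency} dict per alignment column."""
    frequencies = []
    for column in zip(*alignment):
        counts = Counter(column)
        frequencies.append({r: n / len(column) for r, n in counts.items()})
    return frequencies


# Hypothetical toy alignment: 7 sequences with ATG and 3 with GTG.
toy = ["ATG"] * 7 + ["GTG"] * 3

if __name__ == "__main__":
    print(consensus(toy))              # the consensus keeps only the majority residue
    print(column_frequencies(toy)[0])  # the first column still records the minority G
```

The consensus of this toy alignment reports only the majority codon, whereas the frequencies of the first column retain the 30% of sequences that start with G, which is the information a sequence logo makes visible.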
These options are available:
- Foreground color. Colors the letters using a gradient according to the information content of the alignment column.
- Background color. Sets a background color of the residues using a gradient in the same way as described above.
- Graph on/off. Displays sequence logo at the bottom of the alignment.
- Height. Specifies the height of the sequence logo graph.
- Color. The sequence logo can be displayed in black or in Rasmol colors. For protein alignments, a polarity color scheme is also available, in which hydrophobic residues are shown in black, hydrophilic residues in green, acidic residues in red and basic residues in blue.























