Dot plots

Dot plots provide a powerful visual comparison of two sequences. They can also be used to compare regions of similarity within a single DNA or protein sequence. Each axis of the plot represents one of the sequences. Wherever a match is found between the two sequences, a dot is drawn at the corresponding position. Plotting two identical sequences against each other therefore results in a diagonal line. Plotting a sequence against itself can also be used to find repeats within a sequence of interest.
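The idea can be illustrated with a minimal Python sketch. This is not the workbench's own implementation; the function names `dot_plot` and `render` are made up for illustration, and a "match" is assumed here to mean simple residue identity.

```python
def dot_plot(seq_x, seq_y):
    """Return a grid where grid[i][j] is True when seq_y[i] equals seq_x[j].

    seq_x runs along the horizontal axis, seq_y along the vertical axis.
    """
    return [[a == b for b in seq_x] for a in seq_y]


def render(grid):
    """Draw the grid as text: '*' for a dot, '.' for an empty cell."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in grid)


# Plotting a sequence against itself: the main diagonal is fully dotted,
# and off-diagonal dots mark residues that recur elsewhere in the sequence.
grid = dot_plot("GATTACA", "GATTACA")
print(render(grid))
```

Off-diagonal dots that line up into shorter diagonals would indicate repeated stretches; isolated off-diagonal dots are chance matches of single residues.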

After choosing which DNA or protein sequences to compare, set the following parameters:

  • Distance correction (only applicable for protein sequences): To account for evolutionary substitutions between amino acids, a distance correction measure can be used when calculating the dot plot. The distance correction matrices take into account the likelihood of one amino acid changing to another. Available distance matrices:
    - BLOSUM45
    - BLOSUM62
    - BLOSUM80
    - PAM30
    - PAM60

  • Window size: A residue-by-residue comparison (a window size of 1) produces many chance matches, because the number of different residues is small, and therefore a very noisy background. A residue-by-residue comparison can also be very time consuming and computationally demanding. Increasing the window size reduces the background noise and makes the dot plot “smoother”.
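The effect of these two parameters can be sketched in Python. The sketch below is an assumption about how a windowed comparison might work, not the workbench's actual algorithm: the names `windowed_dot_plot`, `score`, `threshold`, and `toy_score` are hypothetical, and `toy_score` is a made-up stand-in for a substitution matrix, not real BLOSUM or PAM values.

```python
def windowed_dot_plot(seq_x, seq_y, window=3, score=None, threshold=None):
    """Compare every window of seq_y with every window of seq_x.

    A dot (True) is placed when the summed per-residue score of the two
    windows reaches the threshold. By default the score is 1 for identical
    residues and 0 otherwise, and every position in the window must match.
    """
    if score is None:
        score = lambda a, b: 1 if a == b else 0
    if threshold is None:
        threshold = window
    rows = len(seq_y) - window + 1
    cols = len(seq_x) - window + 1
    grid = []
    for i in range(rows):
        row = []
        for j in range(cols):
            total = sum(score(seq_y[i + k], seq_x[j + k]) for k in range(window))
            row.append(total >= threshold)
        grid.append(row)
    return grid


def count_dots(grid):
    return sum(cell for row in grid for cell in row)


# Window size 1 is a residue-by-residue comparison and gives a noisy plot;
# window size 3 keeps only the diagonal for this sequence.
noisy = windowed_dot_plot("GATTACA", "GATTACA", window=1)
smooth = windowed_dot_plot("GATTACA", "GATTACA", window=3)
print(count_dots(noisy), count_dots(smooth))


# Hypothetical similarity score: identical residues, or any two of the
# aliphatic residues I, L and V, count as a match.
def toy_score(a, b):
    aliphatic = {"I", "L", "V"}
    return 1 if a == b or (a in aliphatic and b in aliphatic) else 0


similar = windowed_dot_plot("LIV", "VLI", window=3, score=toy_score)
```

With the identity score, "LIV" and "VLI" would produce no dot; with a score that treats related amino acids as similar, the same pair does, which is the purpose of the distance correction for protein sequences.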

When viewing a dot plot, the usual CLC workbench maneuvering actions are available:

  • Zoom in
  • Zoom out
  • Move around within a zoomed area

In addition, you can modify the gradient of the dots. The upper gradient and the lower gradient can be changed independently, and the dot plot is adjusted accordingly.
