18.12.2 Reporting the DIPs
When you click Next, you will be able to specify how the DIPs
should be reported:
- Annotate reference sequence(s). This will add an
annotation for each DIP to the reference sequences in the input.
- Annotate consensus sequence(s). This will add an
annotation for each DIP to the consensus sequences in the input.
In either case, DIP annotations contain the following
information:
- Reference position. The first position of the DIP
in the reference sequence.
- Consensus position. The first position of the DIP
in the consensus sequence.
- Variation type. Will be "DIP" or "Complex
DIP", depending on the value of the maximum expected variations
setting and the actual number of variations found at the DIP site.
- Length. The length of the DIP. Note that only small deletions and insertions are found. This is because the DIP detection is based on the alignment of the reads generated by the mapping process, and the mapping only allows a few insertions/deletions (see Map reads to reference for information on how to map reads to a reference).
- Reference. The residues found in the reference sequence (either gaps for insertions or bases for deletions).
- Variants. The number of variants among the reads.
- Allele variation. The variations found in the reads
at the DIP site. Contains only those variations whose frequency is
at least that specified by the minimum variant frequency setting.
- Frequencies. The frequencies of the variations,
both absolute (counts) and relative (percentage of coverage).
- Coverage. The number of valid reads completely
covering the DIP site.
- Variant numbers and frequencies. The information from the allele variations, frequencies and counts is also split apart and reported for each variant individually (variant #1, #2, etc., depending on the ploidy setting).
- Overlapping annotations. Indicates whether the DIP is
covered, in part or in whole, by an annotation. The
annotation's type and name will be displayed. For annotated
reference sequences, this information can be used to tell if
the DIP is found in e.g. a coding or non-coding region of the
genome. Note that annotations of type
Variation and Source are not reported.
- Amino acid change. If the reference sequence
is annotated with ORF or CDS annotations, the DIP
detection will also report whether the DIP changes the amino
acid sequence resulting from translation, and, if so, whether
the change involves frame-shifting.
- Create table. This will create a table showing all the
DIPs found. The table will provide a valuable overview, whereas the
annotations are useful for detailed inspection of a DIP, and also if
the annotated sequences are used for further analysis in the
CLC Genomics Workbench.
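To illustrate how the Frequencies, Coverage and Allele variation fields described above relate to each other, the following Python sketch computes relative frequencies as a percentage of coverage and drops variations that fall below a minimum variant frequency. It is not the Workbench's own implementation: the function name, the allele strings and the assumption that the threshold is given in percent are all hypothetical.

```python
def variant_frequencies(counts, coverage, min_frequency_pct):
    """Return {variation: (count, percent_of_coverage)} for reported variations.

    counts            -- hypothetical mapping of variation string to read count
    coverage          -- number of valid reads completely covering the DIP site
    min_frequency_pct -- assumed to be a percentage, e.g. 10.0 for 10 %
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    reported = {}
    for variation, count in counts.items():
        percent = 100.0 * count / coverage  # relative frequency
        if percent >= min_frequency_pct:
            reported[variation] = (count, percent)
    return reported


# Made-up DIP site covered by 40 valid reads; "--" stands for a 2-base deletion.
example = variant_frequencies(
    {"--": 22, "AT": 16, "A-": 2}, coverage=40, min_frequency_pct=10.0
)
print(example)
```

In this made-up example the third variation is seen in only 2 of 40 reads (5 %), so it would not appear among the reported allele variations.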
Figure 18.100 shows the result of a DIP
detection output as annotations on the reference sequence. The DIP
detection found the DIPs of figure 18.98.
Figure 18.100: DIPs detected within a coding region.
The DIPs occur within a coding region (identified by the long yellow
annotation) and you can see that they both shift the frame of the
translation, since their sizes are not divisible by 3. Placing your
mouse on the annotations will reveal detailed information about the
DIPs as shown in figure 18.101.
Figure 18.101: A DIP annotation with detailed information.
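The frame-shift reasoning above (a DIP inside a coding region shifts the reading frame when its length is not divisible by 3) can be written as a one-line check. This is only a sketch of the rule as stated in the text; the function name is hypothetical, and it does not account for anything else the Workbench may consider when reporting amino acid changes.

```python
def shifts_frame(dip_length):
    """Return True if an insertion or deletion of this length shifts the frame."""
    if dip_length <= 0:
        raise ValueError("DIP length must be a positive number of bases")
    # A codon is 3 bases, so only lengths that are multiples of 3 keep the frame.
    return dip_length % 3 != 0


for length in (1, 2, 3, 4, 6):
    print(length, shifts_frame(length))
```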
The same information is also recorded in the table output. An example
of a table is shown in figure 18.102.
Figure 18.102: A table of DIPs.
In addition to the information shown as annotations, the table also includes the name of the mapping. Since the table can include DIPs for many references, you need to know which mapping each DIP belongs to. The table can be Exported as a CSV file (comma-separated values) and imported into e.g. Excel. Note that the CSV export includes all the information in the table, regardless of filtering and what has been chosen in the Side Panel. If you only want to use a subset of the information, simply select and Copy the information. The columns in the SNP and DIP tables have been synchronized to enable merging in a spreadsheet.
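Because the SNP and DIP tables share the same columns, exported CSV files can also be stacked outside a spreadsheet. The Python sketch below shows one way this might be done with the standard csv module. The column headers ("Mapping", "Reference position", "Variation type", "Length") and the sample rows are assumptions made for illustration; check the header row of your own export before relying on them.

```python
import csv
import io


def merge_tables(snp_csv_text, dip_csv_text):
    """Stack two CSV exports with identical headers and sort them by position."""
    snp_reader = csv.DictReader(io.StringIO(snp_csv_text))
    dip_reader = csv.DictReader(io.StringIO(dip_csv_text))
    snp_rows = list(snp_reader)
    dip_rows = list(dip_reader)
    if snp_reader.fieldnames != dip_reader.fieldnames:
        raise ValueError("column headers differ; the tables cannot be stacked")
    merged = snp_rows + dip_rows
    # Assumed column names: group by mapping, then order along the reference.
    merged.sort(key=lambda row: (row["Mapping"], int(row["Reference position"])))
    return merged


# Made-up exports standing in for real files.
SNP_CSV = (
    "Mapping,Reference position,Variation type,Length\n"
    "sample mapping,120,SNP,1\n"
    "sample mapping,480,SNP,1\n"
)
DIP_CSV = (
    "Mapping,Reference position,Variation type,Length\n"
    "sample mapping,305,DIP,2\n"
)

merged = merge_tables(SNP_CSV, DIP_CSV)
for row in merged:
    print(row["Reference position"], row["Variation type"])
```

With real exports, the two strings would be replaced by the contents of the exported files, read for example with open(path, newline="").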
Note that if you make a split view of the table and the mapping, you will be able to browse through the DIPs by clicking in the table. This will cause the view to jump to the position of the DIP.