CLC Genomics Workbench

Paired end reads – duplications and inversions

CLC Genomics Workbench includes a number of graphical options for identifying genomic duplications and inversions when sequencing produces paired-end reads.

Finding duplications

One way to identify duplications is to examine the Non-specific matches graph, which shows double matches, i.e. reads that match more than once on the reference sequence.

Screenshot 1: A rise in the Non-specific matches.

The logic is that if a read matches more than once on the reference sequence, that part of the sequence has most likely been duplicated.

Screenshot 2: Non-specific matches are shown in yellow.
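The Workbench presents this graphically. Outside the tool, the same reasoning can be sketched in a few lines of Python. This is only an illustration, not how the Workbench computes its graph: the input format (a mapping from read name to match positions), the function name nonspecific_windows, and the window and min_reads parameters are all assumptions made for the example.

```python
from collections import Counter


def nonspecific_windows(read_hits, window=100, min_reads=3):
    """Return start positions of windows rich in multi-matching reads.

    read_hits is a hypothetical input: a dict mapping each read name to
    the list of reference positions where that read matches.
    """
    counts = Counter()
    for positions in read_hits.values():
        if len(positions) > 1:  # read matches more than once: non-specific
            for pos in positions:
                # Assign the match to the window that contains it.
                counts[(pos // window) * window] += 1
    return sorted(start for start, n in counts.items() if n >= min_reads)


# Toy data: three reads match both around 500 and around 2500.
hits = {
    "r1": [120],
    "r2": [510, 2530],
    "r3": [540, 2560],
    "r4": [590, 2580],
    "r5": [1800],
}
print(nonspecific_windows(hits))  # [500, 2500]
```

The two reported windows correspond to the two copies of the candidate duplication, which is the same signal as a rise in the Non-specific matches graph.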

In the case of smaller duplications, the Paired-ends distance increases because some of the reads are matched to the other copy of the duplication.

Screenshot 3: Paired-ends distance increases.
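As a rough illustration of the distance signal, the sketch below flags read pairs whose distance exceeds an expected maximum. The tuple layout of the input, the function name long_distance_pairs, and the expected_max threshold are assumptions for this example; they are not Workbench settings.

```python
def long_distance_pairs(pairs, expected_max):
    """Return (name, distance) for pairs farther apart than expected.

    pairs is a hypothetical input: a list of tuples
    (name, start_of_first_read, end_of_second_read) in reference coordinates.
    """
    flagged = []
    for name, start, end in pairs:
        distance = end - start
        if distance > expected_max:
            flagged.append((name, distance))
    return flagged


# Toy data: p3 spans far more than the other pairs.
pairs = [("p1", 100, 350), ("p2", 400, 640), ("p3", 900, 1900)]
print(long_distance_pairs(pairs, expected_max=400))  # [('p3', 1000)]
```

A cluster of flagged pairs in one region would correspond to the increase seen in the Paired-ends distance graph.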

Finding inversions

In the Single paired-ends reads graph, inversions may be identified by looking for two consecutive peaks.

Screenshot 4: Two peaks in the Single paired-ends reads graph.

This pattern characterizes inversions because the first peak starts where the reverse part of the paired-ends reads no longer matches the reference sequence.

Screenshot 5: Zooming in: Just before the inversion, only the forward reads match.

Scrolling further along the contig, we can see the starting point of the inverted region. This is where the forward reads end. At the same point, you will see a new pattern: a combination of reverse and paired-ends reads.

Screenshot 6: The inversion starts where the reads shift from green (forward) to a combination of red and blue (reverse and paired-ends) reads.

The forward counterparts of the reverse reads have no match because of the inversion. The paired-ends reads, in turn, have been reversed compared to the other paired-ends reads in the contig. This is not visible in the user interface, but it is a conclusion you can draw from the pattern of the other reads.

Scrolling to the end of the inversion, you will see a pattern similar to the one at the beginning, only mirrored: forward reads kick in at the end of the inversion, and reverse reads take over when we get back to a "normal" sequence.

Screenshot 7: The inversion ends where the reads shift from green (forward) to a combination of red and blue (reverse and paired-ends) reads.
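The two-peak signature described above can also be looked for programmatically. The following is a minimal sketch, assuming a per-position count of single (unpaired) reads is available as a plain list; the function names find_peaks and candidate_inversions and the threshold and max_gap parameters are hypothetical and do not correspond to any Workbench feature.

```python
def find_peaks(coverage, threshold):
    """Return (start, end) index ranges where coverage >= threshold."""
    peaks, start = [], None
    for i, value in enumerate(coverage):
        if value >= threshold and start is None:
            start = i  # a peak begins
        elif value < threshold and start is not None:
            peaks.append((start, i - 1))  # the peak ended at the previous index
            start = None
    if start is not None:  # a peak runs to the end of the list
        peaks.append((start, len(coverage) - 1))
    return peaks


def candidate_inversions(coverage, threshold, max_gap):
    """Return pairs of consecutive peaks separated by at most max_gap."""
    peaks = find_peaks(coverage, threshold)
    return [(a, b) for a, b in zip(peaks, peaks[1:]) if b[0] - a[1] <= max_gap]


# Toy single-read counts per position, with two nearby peaks.
single_reads = [0, 1, 0, 6, 7, 5, 1, 0, 0, 1, 6, 8, 6, 0, 1]
print(candidate_inversions(single_reads, threshold=5, max_gap=10))
# [((3, 5), (10, 12))]
```

Each reported pair of peaks only marks a region worth inspecting; the read-level pattern shown in the screenshots is still needed to confirm that it is an inversion.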