CLC Assembly Cell – Latest Improvements

 
 

CLC Assembly Cell 4.1

Release date: March 19, 2013

New features

  • A new tool for extracting a random subset of reads called clc_sample_reads
  • clc_remove_duplicates, a tool for identifying and removing duplicated reads in datasets, is now out of beta and a part of the Assembly Cell

Improvements

  • The number of N's output by the de novo assembler has been further reduced and now N's primarily occur in assemblies when the scaffolder is used.
  • Improved the performance of the Duplicate Removal tool to make it scale to larger datasets.
  • clc_cas_to_sam now outputs information on the number and types of mismatches.

Bug fixes

  • Fixed crashes when outputting contigs in the de novo assembler when paired reads were used as input.
  • Fixed an issue where circular contigs were extended too much by the de novo assembler.
  • Fixed a crash in clc_cas_to_sam.
  • Various small bug fixes.

CLC Assembly Cell 4.0.13

Release date: February 25, 2013

Bug fixes

  • Fixed a bug causing clc_cas_to_sam to crash.

CLC Assembly Cell 4.0.12

Release date: January 23, 2013

Bug fixes

  • Fixed a bug that caused the de novo assembler to crash or go into an infinite loop when outputting contigs generated from paired reads.
  • Fixed compatibility issues for the clc mapping viewer on Windows 64-bit platforms.

CLC Assembly Cell 4.0.11

Release date: January 8, 2013

Bug fixes

  • Fixed read mapper errors.
  • Fixed de novo assembly error.

CLC Assembly Cell 4.0

Release date: December 5, 2012

New features

New de novo assembler

  • Scaffolding is integrated into the assembly. This means better resolution of contigs and insertion of N's when two contigs cannot be joined in sequence but there is pair information that connects them.
  • Automatic paired distance estimation: Using the -e option, the de novo assembler will estimate the fragment size of your paired data.
  • Improved use of unpaired reads for resolving ambiguities in the de Bruijn graph.
  • Various improvements of the assembly quality.
  • New parameter for specifying the maximum bubble size. There is a default value which is automatically calculated based on the input data.
  • New white paper with benchmarks and results from quality control.
  • Bug fix: Fixed a bug in the de novo assembler which caused an increased number of N's in the results, because the sequence of the read that spanned contigs was not looked up correctly. The de novo assembler now produces far fewer N's for low coverage assemblies.

New read mapper

  • Greatly improved mapping speed (see the white paper for more details on speed and quality).
  • Support for complex genomes with many repeats
  • The previous read mapper is still included as a legacy version to allow color space mapping which is not supported in the new mapper.
  • The forward only mode of the clc_mapper now also works for paired reads.

Updated naming of tools

We have updated the names of the tools to be more consistent, and to reflect the use of "mapping" rather than "assembly" throughout the software. We have provided a helper script to assist updating existing scripts based on the old naming scheme. Read more here.

Licensing

A new license tool is included that will make it very easy to:
  • Download a license based on a license order ID. This would previously require some email exchange with CLC bio but can now be done in one go without involvement of CLC bio.
  • Request and download an evaluation license directly.

Furthermore, the license is now checked to confirm that it is valid for the particular version of the CLC Assembly Cell.

A new restriction has been added for running the CLC Assembly Cell on large computers: a system with more than 64 cores (hyper-threaded cores) cannot run with a static license. In this case, a network license is needed.
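
As a rough illustration of the 64-core restriction, the sketch below checks the logical core count of the current machine. It is not part of the CLC Assembly Cell or its license tool. The function name needs_network_license is hypothetical, and the sketch assumes that "cores" means the logical (hyper-threaded) cores reported by the operating system.

```python
import os

# Limit taken from the release note: more than 64 cores rules out a static license.
STATIC_LICENSE_MAX_CORES = 64


def needs_network_license(logical_cores: int) -> bool:
    """Hypothetical helper: True when the core count exceeds the static-license limit."""
    return logical_cores > STATIC_LICENSE_MAX_CORES


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs and may return None if undetermined.
    cores = os.cpu_count()
    if cores is None:
        print("Could not determine the number of cores.")
    elif needs_network_license(cores):
        print(f"{cores} cores detected: a network license would be needed.")
    else:
        print(f"{cores} cores detected: within the static license limit.")
```

A system with exactly 64 cores is treated as within the limit here, since the restriction is worded as "more than 64 cores".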

Adapter trim

You can now trim adapters from sequencing reads prior to assembly or mapping. Read more here.

Miscellaneous

  • Added support for read group information in castosam.
  • Added support for non-specific reads in castosam.
  • Added progress reporting to castosam and samtocas.
  • sort_pairs now auto-detects input files and supports SOLiD paired-end and Ion Torrent files.
  • Proper out-of-memory error messages are shown if a tool runs out of memory.
  • Various bug fixes.

CLC Assembly Cell 3.2.2

Release date: April 11, 2011

Bug fixes

  • Fixed problem with read mapping on computers with Japanese Windows.

Older releases

For a complete list of older releases, visit the CLC Assembly release archive.