
Enancio's lossless genomic compression software, Lena for fastq

Because the integrity of your data is what matters most, Enancio’s first product is Lena for fastq.

Lena for fastq features:

100% lossless compression

Compression with Lena is completely lossless: it preserves all the information, byte for byte. The format embeds two checksums: one validates that the decompressed file is identical to the original data, and the other detects data corruption that occurred during transmission or storage. If corruption is detected, the file format pinpoints the location in the file where the error occurred.
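The two-checksum idea can be illustrated with a minimal Python sketch. It is not Lena's actual file format, which this page does not describe: zlib stands in for the compressor, and the block size, the CRC32 per compressed block, the SHA-256 of the original data, and the names `pack` and `unpack` are all assumptions chosen for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen for illustration


def pack(data: bytes) -> dict:
    """Compress data in blocks, recording both kinds of checksum."""
    blocks = []
    for start in range(0, len(data), BLOCK_SIZE):
        payload = zlib.compress(data[start:start + BLOCK_SIZE])
        # Checksum of the stored (compressed) bytes: detects corruption
        # in transit or at rest, and identifies the affected block.
        blocks.append({"payload": payload, "crc": zlib.crc32(payload)})
    # Checksum of the original bytes: confirms the round trip is lossless.
    return {"blocks": blocks, "sha256": hashlib.sha256(data).hexdigest()}


def unpack(archive: dict) -> bytes:
    """Verify each block, decompress, then verify the whole result."""
    out = bytearray()
    for index, block in enumerate(archive["blocks"]):
        if zlib.crc32(block["payload"]) != block["crc"]:
            offset = index * BLOCK_SIZE
            raise ValueError(f"corrupt block {index} (original offset {offset})")
        out += zlib.decompress(block["payload"])
    if hashlib.sha256(out).hexdigest() != archive["sha256"]:
        raise ValueError("decompressed data differs from the original")
    return bytes(out)
```

Keeping a checksum per stored block is what makes it possible to report where the damage is, rather than only that the file as a whole is bad.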

High compression ratio

On data generated by the latest NovaSeq sequencer, Lena typically produces files more than 5 times smaller than the equivalent gzipped file. On other sequencers, the compression ratio mainly depends on how the quality values are encoded in the fastq file. With a HiSeq X Ten, the typical compression ratio is about 3x compared to gzipped data.
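Because these ratios are stated relative to gzipped files rather than to raw fastq, the arithmetic is worth spelling out. The following is a small sketch; the function names and the file sizes in the comments are hypothetical, not measurements.

```python
def ratio_vs_gzip(gzip_size: float, lena_size: float) -> float:
    """Compression ratio of a Lena file relative to its gzipped counterpart."""
    return gzip_size / lena_size


def projected_size(gzip_size: float, ratio: float) -> float:
    """Expected Lena file size for a given gzipped size and ratio."""
    return gzip_size / ratio


# Hypothetical example: a 50 GB gzipped file at a 5x ratio would take
# projected_size(50.0, 5.0) GB, i.e. one fifth of the gzipped size.
```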

Lena compression performance

Ultra-high speed

The code has been thoroughly optimized for fast compression and decompression without sacrificing compression ratio. Speed is crucial for easy integration into an existing workflow and for faster file transfers.

Compression and decompression are between 2.5 and 5 times faster than gzip, when measured against pigz, the multi-threaded implementation of gzip. Actual execution time may be limited by disk I/O speed. However, with a fast SSD, or when decompressing on the fly in memory to feed a data stream to analysis software, decompression reaches up to 2.1 GB/s on an 8-CPU system.

Comparison of gzip and Lena compression speed

Comparison of gzip and Lena decompression speed

Streamlined integration

When a compressed fastq.lena file is needed for a computation, e.g. for mapping with BWA, there is no need to decompress it to disk first. Instead, the file can be decompressed on the fly and fed directly to the analysis software that expects fastq input. This greatly reduces disk reads and writes and achieves much better performance.
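The on-the-fly pattern described above amounts to connecting the decompressor's standard output to the analysis tool's standard input, so no intermediate fastq file is ever written. A minimal Python sketch follows; the helper `stream_into` is hypothetical, and since Lena's command-line interface is not documented on this page, both commands are passed in by the caller.

```python
import subprocess
from typing import Sequence


def stream_into(decompress_cmd: Sequence[str], consumer_cmd: Sequence[str]) -> int:
    """Pipe the decompressor's stdout straight into the consumer's stdin.

    Returns 0 if both processes succeed, otherwise a non-zero exit status.
    """
    producer = subprocess.Popen(decompress_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so the producer is notified (SIGPIPE on
    # POSIX systems) if the consumer exits early.
    producer.stdout.close()
    consumer_status = consumer.wait()
    producer_status = producer.wait()
    return consumer_status or producer_status
```

With gzip as a stand-in decompressor, `stream_into(["gzip", "-dc", "reads.fastq.gz"], ["wc", "-l"])` counts the lines of a gzipped file without writing the decompressed data to disk; the file name here is a placeholder.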
