noodles is a suite of Rust libraries for bioinformatics file formats such as BAM, VCF, and FASTQ. It targets specification compliance and offers optional asynchronous I/O and faster DEFLATE handling behind feature flags, so it can be pulled into research or data-processing projects one format at a time.
The libraries handle I/O for a range of bioinformatics file formats and comply with the relevant specifications where applicable. Currently supported formats are BAM 1.6, BCF 2.2, BED, BGZF, CRAM 3.0/3.1, CSI, FASTA, FASTQ, GFF3, GTF 2.2, htsget 1.3, refget 2.0, SAM 1.6, tabix, and VCF 4.3/4.4.
Key Features
- Modular Architecture: noodles is split into multiple crates by file format, so users can include only the components their project needs.
- Experimentation Ready: noodles is still considered experimental, but early versions are available for use in projects.
- Top-level Meta Crate: Simplify dependencies by using the top-level noodles crate and enabling a feature for each desired format. For instance, to add BAM support:
  cargo add noodles --features bam
- Flexible Imports: Each enabled feature can be imported by name, which keeps code organized. For example:
  use noodles::bam;
- Asynchronous I/O Support: Enable the optional async feature flag to use asynchronous I/O across a range of file formats, including BAM, BCF, and VCF.
- High-Performance Compression with libdeflate: Enable libdeflate to speed up encoding and decoding of DEFLATE streams for formats such as BGZF and CRAM.
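The meta crate and feature flags described above can also be declared directly in a project manifest instead of through cargo add. The fragment below is a minimal sketch, not a recommended configuration: it assumes the async and libdeflate flags are exposed by the top-level noodles crate under exactly those names, and that bgzf is the feature name for BGZF support. The wildcard version is a placeholder only.

```toml
# Cargo.toml (sketch; every feature name other than "bam" is an assumption
# inferred from the feature list above, so check the crate's documentation)
[dependencies]
# "*" is a placeholder: pin the noodles release you actually build against.
# "bam" and "bgzf" select formats; "async" and "libdeflate" are the optional flags.
noodles = { version = "*", features = ["bam", "bgzf", "async", "libdeflate"] }
```

With a manifest like this, each enabled format is imported the same way as before, for example use noodles::bam;.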
Usage Examples
To try noodles, run the examples found within the individual crates. After cloning the repository, execute the following to list the available examples:
cargo run --release --example
Each listed example can then be run by name, with any program arguments appended, such as:
cargo run --release --example bam_write > sample.bam
cargo run --release --example bam_read_header sample.bam
For developers interested in bioinformatics data processing, noodles offers a solution that combines performance with ease of use, making it a valuable addition to any Rust-based bioinformatics toolkit.