What is LZ4 compression?

Last updated: April 1, 2026

Quick Answer: LZ4 is a lossless data compression algorithm designed for very fast compression and decompression speeds, prioritizing performance over maximum compression ratio.

Overview

LZ4 is a fast, lossless data compression algorithm created by Yann Collet that prioritizes compression and decompression speed over achieving maximum compression ratios. It operates on the principle of replacing redundant data sequences with shorter references, enabling rapid processing of large data volumes in real-time applications.

Technical Mechanism

LZ4 belongs to the LZ77 family of dictionary-based compressors. It treats a sliding window of recently processed data as its dictionary and looks for earlier occurrences of the bytes at the current position. When a match is found, the repeated sequence is replaced with a compact reference made up of a distance back into the window (the offset) and a match length; bytes with no match are emitted unchanged as literals. The encoder favors finding a match quickly, typically through a hash-table lookup, over finding the longest possible match, which is where its exceptional speed comes from.
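The match-finding idea can be sketched in a few lines of Python. This is a toy illustration, not the LZ4 reference implementation: the names find_sequences, expand, MIN_MATCH, and WINDOW are hypothetical, a Python dict stands in for the fixed-size hash table a real encoder would use, and the output is a list of tuples rather than the LZ4 byte format.

```python
MIN_MATCH = 4      # assumed minimum match length worth encoding
WINDOW = 65535     # assumed maximum distance a reference may reach back

def find_sequences(data: bytes):
    """Greedy single-candidate match search, in the spirit of a fast LZ77 encoder."""
    table = {}     # 4-byte prefix -> most recent position where it was seen
    seqs = []      # list of (literals, offset, match_length)
    anchor = 0     # start of literals not yet emitted
    i = 0
    n = len(data)
    while i + MIN_MATCH <= n:
        key = data[i:i + MIN_MATCH]
        cand = table.get(key)
        table[key] = i
        if cand is not None and i - cand <= WINDOW:
            # Take the first candidate and extend it; no search for a longer match.
            length = MIN_MATCH
            while i + length < n and data[cand + length] == data[i + length]:
                length += 1
            seqs.append((data[anchor:i], i - cand, length))
            i += length
            anchor = i
        else:
            i += 1
    seqs.append((data[anchor:], 0, 0))  # trailing literals, no match
    return seqs

def expand(seqs) -> bytes:
    """Rebuild the original bytes from (literals, offset, match_length) tuples."""
    out = bytearray()
    for literals, offset, length in seqs:
        out += literals
        start = len(out) - offset
        for k in range(length):          # byte-by-byte so overlapping copies work
            out.append(out[start + k])
    return bytes(out)

seqs = find_sequences(b"abcdabcdabcdXYZ")
# seqs == [(b"abcd", 4, 8), (b"XYZ", 0, 0)]
```

Only one earlier position is checked per step, so the search cost stays constant per byte; the price is that a longer match elsewhere in the window may be missed.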

Speed and Compression Characteristics

LZ4 achieves compression speeds of roughly 250-500 MB/s and decompression speeds of 1-2 GB/s or more per core, depending on hardware and input data, making it among the fastest compression algorithms available. Its compression ratio is weaker than DEFLATE's: on mixed data LZ4 output is typically around half the original size, where DEFLATE reaches roughly a third, but LZ4 is dramatically faster in both directions. This speed-versus-ratio tradeoff makes LZ4 well suited to performance-critical applications.
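Throughput and size figures like these are easy to measure on your own data. The sketch below is a minimal harness, with two assumptions stated up front: the helper name measure and the sample input are made up for illustration, and zlib (DEFLATE) at two levels stands in for the codecs because the Python standard library does not ship LZ4. It shows how "percent of original size" and "compression ratio" relate, and how a faster setting trades against a smaller output.

```python
import time
import zlib

def measure(compress, data: bytes, repeats: int = 3) -> dict:
    """Time a compress callable and report size and throughput figures."""
    best = float("inf")
    out = b""
    for _ in range(repeats):
        t0 = time.perf_counter()
        out = compress(data)
        best = min(best, time.perf_counter() - t0)
    return {
        "size_percent": 100.0 * len(out) / len(data),   # output as % of original
        "ratio": len(data) / len(out),                  # original / compressed
        "mb_per_s": len(data) / 1e6 / best if best > 0 else float("inf"),
    }

# Hypothetical sample: repetitive log-like text, which compresses very well.
sample = b"timestamp=1700000000 level=INFO msg=request served\n" * 20000

fast = measure(lambda d: zlib.compress(d, 1), sample)    # fastest DEFLATE level
small = measure(lambda d: zlib.compress(d, 9), sample)   # slowest, tightest level

for name, result in (("level 1", fast), ("level 9", small)):
    print(f"{name}: {result['size_percent']:.2f}% of original, "
          f"ratio {result['ratio']:.1f}, {result['mb_per_s']:.0f} MB/s")
```

Any callable that maps bytes to bytes can be passed as compress. If the third-party python-lz4 package is installed, its lz4.frame.compress function can be passed in the same way to compare LZ4 against DEFLATE on the same input.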

Primary Use Cases

LZ4 is widely used in big data systems such as Apache Hadoop for compressing data blocks, in NoSQL databases such as Cassandra for compressing on-disk tables, and in streaming protocols where decompression speed matters more than maximum compression. Many systems apply LZ4 at several layers: network transmission, in-memory caching, and disk storage.

Ecosystem and Adoption

The LZ4 library is available under the BSD 2-Clause license and has been implemented or wrapped in numerous languages including C, Java, Python, Go, Rust, JavaScript, and C#. Major projects including Apache Kafka support LZ4 as a built-in compression codec, demonstrating wide industry adoption in systems that prioritize throughput and latency over storage efficiency.

Related Questions

What is the difference between LZ4 and gzip compression?

LZ4 prioritizes speed, compressing at roughly 250-500 MB/s at the cost of lower compression ratios, while gzip uses DEFLATE to achieve better ratios at slower speeds. Choose LZ4 for real-time systems where latency matters most, and gzip for archival and downloads where storage or bandwidth savings matter more than latency.

How does LZ4 compare to Snappy compression?

LZ4 and Snappy are both fast compression algorithms with similar design goals, but LZ4 typically decompresses faster (1-2 GB/s vs 500-900 MB/s) and achieves similar or slightly better compression ratios, making it increasingly preferred in modern systems.

What are dictionary-based compression algorithms?

Dictionary-based algorithms in the LZ77 family, including LZ4, identify repeated byte sequences and replace them with references into a sliding-window dictionary of recently seen data. This removes redundancy; performance depends on the dictionary (window) size, the speed of the match search, and how efficiently match distances and lengths are encoded.
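To make the distance/length encoding concrete, here is a minimal decoder sketch for a raw LZ4 block, written from the published block format description: each sequence starts with a token byte whose high 4 bits give the literal count and whose low 4 bits give the match length minus 4, followed by the literals and a 2-byte little-endian offset. The function name lz4_block_decode and the hand-built test block are illustrative assumptions; the sketch skips the frame header, checksums, and the bounds hardening a production decoder needs.

```python
def lz4_block_decode(block: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header, no checksums)."""
    out = bytearray()
    i, n = 0, len(block)

    def read_extra(length: int, pos: int):
        # A 4-bit field equal to 15 means more length bytes follow;
        # keep adding bytes until one is below 255.
        if length == 15:
            while True:
                b = block[pos]
                pos += 1
                length += b
                if b != 255:
                    break
        return length, pos

    while i < n:
        token = block[i]
        i += 1
        # High nibble: number of literal bytes to copy straight through.
        lit_len, i = read_extra(token >> 4, i)
        out += block[i:i + lit_len]
        i += lit_len
        if i >= n:
            break  # the final sequence carries literals only
        # Match distance: 2 bytes, little-endian, counted back from the output end.
        offset = block[i] | (block[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")
        # Low nibble: match length minus the 4-byte minimum.
        match_len, i = read_extra(token & 0x0F, i)
        match_len += 4
        start = len(out) - offset
        for k in range(match_len):   # byte-by-byte so the copy may overlap itself
            out.append(out[start + k])
    return bytes(out)

# Hand-built block: 3 literals "abc", a match (offset 3, length 9), then 5 literals.
block = bytes([0x35]) + b"abc" + bytes([0x03, 0x00]) + bytes([0x50]) + b"hello"
decoded = lz4_block_decode(block)   # b"abcabcabcabchello"
```

The match in this example is longer than its offset, so the copy reads bytes it has just written. That overlap is how a short reference can expand a repeating pattern, and it is why the decoder is little more than a loop of memory copies.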
