What is lzma

Last updated: April 1, 2026

Quick Answer: LZMA is a lossless data compression algorithm known for achieving very high compression ratios, commonly used in 7z archives and applications prioritizing compression over processing speed.

Key Facts

Overview

LZMA (Lempel-Ziv-Markov Chain) is a lossless data compression algorithm developed by Igor Pavlov specifically designed to achieve maximum compression ratios. LZMA is significantly more effective at reducing file sizes than commonly used algorithms like deflate, making it ideal for situations where storage space or bandwidth is expensive and compression time is not critical.

Compression Methodology

LZMA combines two compression techniques: dictionary-based compression that identifies repeated byte sequences and replaces them with references, and range encoding for efficient entropy coding. The algorithm can use extremely large dictionary windows up to 4 GB, allowing it to find distant matches for redundant patterns. Range encoding is more efficient than Huffman coding at further compressing the encoded data.

Compression Performance Metrics

LZMA typically achieves compression ratios of 30-60% of original file size, surpassing deflate compression which reaches 30-50%. This superior compression comes at a cost: compression is much slower, potentially requiring seconds or minutes for large files. Decompression, however, remains reasonably fast, making LZMA well-suited for compression-once distribution scenarios.

7z Archive Format Integration

LZMA is the primary compression method in the 7z archive format, which has gained significant adoption for achieving superior compression ratios. The 7z format can contain multiple files with optional encryption and is widely available on Windows through 7-Zip, on Unix/Linux through p7zip, and on macOS. This format represents one of LZMA's most visible applications.

Modern Applications

LZMA is used extensively in Linux distributions for compressing kernel images and large software packages, in backup solutions where storage efficiency matters, and in scenarios with bandwidth constraints. The XZ utility provides LZMA compression on Unix systems, and language bindings exist for Python and other languages, making LZMA accessible across diverse development environments and use cases.

Related Questions

What is the difference between LZMA and LZ4?

LZMA prioritizes maximum compression ratio (30-60%) with slower processing, while LZ4 prioritizes speed with 250-500 MB/s compression and lower ratios. Choose LZMA for archival and LZ4 for real-time applications.

What is the 7z file format?

7z is an open-source archive format using LZMA compression by default, supporting multiple files, optional encryption, and achieving superior compression ratios compared to ZIP. It's widely used on Windows via 7-Zip and on Unix systems.

What is range encoding in compression?

Range encoding is an entropy encoding technique treating compressed data as continuous probability ranges rather than discrete symbols. It's more efficient than Huffman coding and is used in LZMA's second compression stage to further reduce file size.

Sources

  1. Wikipedia - LZMA CC-BY-SA-4.0
  2. XZ Utils - LZMA Implementation Public Domain