Open-source data compression tool: xz
2013-05-24 11:11:45 阿炯

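xz is a lossless data compression file format that uses the LZMA compression algorithm (LZMA2 in current versions). Like gzip and bzip2, the xz tool accepts several input files at once, but by convention it never packs more than one file into a single compressed file. Instead, xz is normally applied to a file that is itself an archive, such as one created by tar or cpio. xz has been used by GNU coreutils since version 7.1, and it is included as a compression package in Fedora (since Fedora 12), Arch Linux, FreeBSD, Slackware Linux, and CRUX. Some Linux and BSD distributions also compress the software packages they offer for download in this format.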


XZ Utils is free general-purpose data compression software with high compression ratio. XZ Utils were written for POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.

The core of the XZ Utils compression code is based on LZMA SDK, but it has been modified quite a lot to be suitable for XZ Utils. The primary compression algorithm is currently LZMA2, which is used inside the .xz container format. With typical files, XZ Utils create 30 % smaller output than gzip and 15 % smaller output than bzip2.
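The size comparison above can be tried on any input with Python's standard library, whose lzma module is built on liblzma. The following is only a rough sketch: the sample data is a hypothetical, highly repetitive text invented for illustration, so the ratios it prints will not match the "typical files" figures quoted above.

```python
import bz2
import gzip
import lzma

# Hypothetical sample input, chosen only for illustration.
sample = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(5000)
)

# Compress the same bytes with each format at its default settings.
sizes = {
    "gzip": len(gzip.compress(sample)),
    "bzip2": len(bz2.compress(sample)),
    "xz": len(lzma.compress(sample)),  # default container is .xz (LZMA2)
}

for name, size in sizes.items():
    print(f"{name:>5}: {size} bytes ({size / len(sample):.1%} of original)")
```

Results depend heavily on the input, so it is worth running the comparison on data that resembles the real workload.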

XZ Utils consist of several components:
 liblzma is a compression library with API similar to that of zlib.
 xz is a command line tool with syntax similar to that of gzip.
 xzdec is a decompression-only tool smaller than the full-featured xz tool.

A set of shell scripts (xzgrep, xzdiff, etc.) has been adapted from gzip to ease viewing, grepping, and comparing compressed files. Emulation of command line tools of LZMA Utils eases transition from LZMA Utils to XZ Utils.

While liblzma has a zlib-like API, liblzma doesn't include any file I/O functions. A separate I/O library is planned, which would abstract handling of .gz, .bz2, and .xz files with an easy to use API.
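To give a feel for the zlib-like, streaming style of interface described above, here is a minimal sketch using Python's lzma module, which wraps liblzma. It is not the C API itself; the helper name compress_chunks and the chunk contents are hypothetical. As with liblzma, the compressor only transforms bytes in memory, and any file I/O is left to the caller.

```python
import lzma


def compress_chunks(chunks, preset=6):
    """Compress an iterable of bytes chunks into one .xz stream (in memory)."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    out = []
    for chunk in chunks:
        # compress() may return b"" while the encoder is still buffering input.
        out.append(compressor.compress(chunk))
    out.append(compressor.flush())  # finish the stream and emit the footer
    return b"".join(out)


# Hypothetical input chunks, e.g. pieces read from a socket or a file.
chunks = [b"first part, ", b"second part, ", b"third part"]
xz_bytes = compress_chunks(chunks)

decompressor = lzma.LZMADecompressor()
restored = decompressor.decompress(xz_bytes)
print(restored)
```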

The .xz file format is a container format for compressed streams. There are no archiving capabilities, that is, the .xz format can hold only a single file just like the .gz and .bz2 file formats used by gzip and bzip2, respectively.

Compared to a few other popular stream compression formats, the .xz format provides a couple of advanced features. At the same time, it has been kept simple enough to be usable in many embedded systems.

Features

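Stream-based: Compressed files are easy to create or decompress through a pipe. Like .gz and .bz2 files, the .xz format cannot archive multiple files by itself. To handle several files, combine it with the archiver tar, which produces a compressed file with the extension .tar.xz or .txz.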
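The tar plus xz combination can be sketched with Python's tarfile module, which supports xz compression through the "w:xz" and "r:xz" modes. The file names and contents below are hypothetical, and the archive is kept in memory so the example has no side effects.

```python
import io
import tarfile

# Hypothetical file names and contents, used only for illustration.
files = {"a.txt": b"alpha\n", "b.txt": b"beta\n"}

# tar bundles the files; xz compresses the resulting single stream.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
    for name, data in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))

# Read the .tar.xz back and collect each member's contents.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:xz") as archive:
    extracted = {
        member.name: archive.extractfile(member).read()
        for member in archive.getmembers()
    }

print(sorted(extracted))
```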

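Random access: The stored data is split into independent compressed blocks, and each block is indexed. When the blocks are reasonably small, this allows limited random access to the compressed data.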

Integrity checks: CRC32, CRC64, or SHA-256 can be used to verify data integrity, and custom check methods can also be added.
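As a small sketch of how the integrity check is selected and enforced, the Python lzma module exposes the three check types as constants. The payload is hypothetical, and the byte that gets flipped is an arbitrary choice made for the demonstration.

```python
import lzma

payload = b"integrity check demo " * 100  # hypothetical data

checks = {
    "CRC32": lzma.CHECK_CRC32,
    "CRC64": lzma.CHECK_CRC64,
    "SHA-256": lzma.CHECK_SHA256,
}

# Compress once per check type and record which check the decoder reports.
detected = {}
for label, check_id in checks.items():
    if not lzma.is_check_supported(check_id):
        continue  # this liblzma build lacks the check; skip it
    blob = lzma.compress(payload, format=lzma.FORMAT_XZ, check=check_id)
    decoder = lzma.LZMADecompressor()
    if decoder.decompress(blob) == payload:
        detected[label] = decoder.check

# Flip one byte in the middle of a CRC64-protected stream and try to decode.
good = lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC64)
corrupted = bytearray(good)
corrupted[len(corrupted) // 2] ^= 0xFF
try:
    lzma.decompress(bytes(corrupted))
    corruption_detected = False
except lzma.LZMAError:
    corruption_detected = True

print(detected, corruption_detected)
```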

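Concatenation: As with .gz and .bz2 files, several compressed streams can be concatenated into one file. Decompressing it works just like decompressing an ordinary single-stream file.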
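Concatenation is easy to see in a few lines. In this sketch, two independently compressed streams are joined byte for byte and then decompressed in one call with Python's lzma.decompress, which is documented to handle the concatenation of multiple streams.

```python
import lzma

# Two independent .xz streams, each complete on its own.
first = lzma.compress(b"hello, ")
second = lzma.compress(b"world\n")

# Joining the raw bytes yields a valid multi-stream .xz file.
combined = first + second
result = lzma.decompress(combined)
print(result)
```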

Multiple filters and filter chains: Custom filters can be provided, and several filters can be combined into a filter chain that processes the data, much like the pipes used between Unix commands.
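A filter chain can be sketched with the filters argument of Python's lzma module. Here a delta filter feeds LZMA2, which is one chain the .xz format supports. The input, a run of slowly increasing 32-bit integers, is a hypothetical example picked because delta encoding suits it; the preset value is likewise just an assumption.

```python
import lzma
import struct

# Hypothetical input: 2000 consecutive 32-bit little-endian integers.
samples = struct.pack("<2000I", *range(1000, 3000))

# The chain runs in order: delta over 4-byte distances, then LZMA2.
filter_chain = [
    {"id": lzma.FILTER_DELTA, "dist": 4},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

with_delta = lzma.compress(samples, format=lzma.FORMAT_XZ, filters=filter_chain)
plain = lzma.compress(samples, format=lzma.FORMAT_XZ)

# The chain is recorded in the .xz headers, so decoding needs no extra options.
roundtrip = lzma.decompress(with_delta)
print(len(samples), len(plain), len(with_delta))
```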

Padding: The end of an .xz file can be padded with binary zeros to fill a space of a specific size, such as a block on a backup tape.
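Padding can be illustrated as follows. This is a minimal sketch under stated assumptions: BLOCK_SIZE is a hypothetical device block size, and the example uses Python's single-stream LZMADecompressor, which stops at the end of the stream and exposes the trailing zero bytes as unused_data. The .xz format requires stream padding to be a multiple of four bytes, which holds here because both the stream length and the block size are multiples of four.

```python
import lzma

BLOCK_SIZE = 512  # hypothetical block size; must be a multiple of 4

payload = b"backup payload\n" * 50  # hypothetical data
stream = lzma.compress(payload, format=lzma.FORMAT_XZ)

# Append zero bytes until the total length fills a whole number of blocks.
pad_len = -len(stream) % BLOCK_SIZE
padded = stream + b"\x00" * pad_len

# Decode one stream; whatever follows it is reported as unused_data.
decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
data = decoder.decompress(padded)
leftover = decoder.unused_data

print(len(stream), len(padded), len(leftover))
```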

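GNU tar has transparently supported the xz file format through this software since version 1.22, and 7-Zip added support for the xz file format in version 9.04 beta.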

Latest version: 5.2


Project homepage: http://tukaani.org/xz/