Rsync
2012-05-18 11:11:59 阿炯

Data synchronization - rsync
rsync (short for "remote sync", as the name suggests) is a data mirroring and backup tool for Unix-like systems.

rsync - a fast, versatile, remote (and local) file-copying tool

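rsync is a very powerful tool, and its command line offers many options. Its main features are:
 1. It can mirror entire directory trees and filesystems.
 2. It can easily preserve the original files' permissions, timestamps, symbolic links, hard links, and so on.
 3. It requires no special privileges to install.
 4. Its optimized workflow makes transfers efficient: the first synchronization copies everything, but later runs transfer only what has changed. rsync can also compress and decompress data in transit, which reduces bandwidth use.
 5. It can use rsh, ssh, or a direct socket connection as the transport.
 6. It supports anonymous transfers, which is convenient for mirroring websites.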

rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License and is currently being maintained by Wayne Davison.



rsync is a file transfer program for Unix systems. rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.
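What makes "sending just the differences" cheap is a weak checksum that can be slid along a file one byte at a time. The sketch below follows the formulation in the rsync technical report (two 16-bit sums, a and b); the function names are illustrative, and the real rsync source differs in details.

```python
M = 1 << 16  # modulus from the rsync technical report

def weak_checksum(block: bytes) -> tuple:
    """Compute the (a, b) pair for a whole block from scratch."""
    n = len(block)
    a = sum(block) % M
    # The byte at position j in the block gets weight n - j.
    b = sum((n - j) * x for j, x in enumerate(block)) % M
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> tuple:
    """Slide an n-byte window one byte to the right in constant time."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b

def combine(a: int, b: int) -> int:
    """Pack the pair into a single 32-bit value."""
    return a + (b << 16)

data = b"the quick brown fox jumps over the lazy dog"
n = 8
a, b = weak_checksum(data[:n])
for k in range(1, len(data) - n + 1):
    # Drop the byte leaving the window, add the byte entering it.
    a, b = roll(a, b, data[k - 1], data[k + n - 1], n)
    assert (a, b) == weak_checksum(data[k:k + n])
print("rolled checksum matches recomputation at every offset")
```

Because each slide costs a constant amount of work, a sender can test every byte offset of its file against the receiver's block checksums, and only needs the expensive strong hash when the weak value matches.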

Features
 can update whole directory trees and filesystems
 optionally preserves symbolic links, hard links, file ownership, permissions, devices and times
 requires no special privileges to install
 internal pipelining reduces latency for multiple files
 can use rsh, ssh or direct sockets as the transport
 supports anonymous rsync which is ideal for mirroring

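Latest version: 3.1
This release brings many performance improvements (such as a rewritten I/O layer), minor feature enhancements, and bug fixes; upgrading is recommended.

Official home page: http://rsync.samba.org/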

Network data synchronization library - libsync
libsync is a development library for network data synchronization.


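Suppose there are two computers, A and B. Computer A can access file A and computer B can access file B; the two files are very similar, and the computers are connected over a slow network. A dedupe-based data synchronization algorithm follows roughly the same flow as rsync:
 1. B splits file B into blocks of equal or unequal size using a chunking algorithm such as FSP (fixed-size partition) or CDC (content-defined chunking);
 2. For each block, B computes a weak checksum similar to rsync's and a strong MD5 checksum, and records the block length len and its offset in file B;
 3. B sends this block information to A;
 4. A splits file A into blocks of equal or unequal size using the same chunking technique, searches them against the block information received from B, and generates the delta encoding;
 5. A sends the delta encoding to B, together with the instructions for reconstructing file A;
 6. B reconstructs file A from the delta encoding and file B.
 Several key problems have to be solved in the algorithm above: file chunking, the description of chunk information, delta encoding, the description of delta-encoding information, and file synchronization.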
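The difference between the two chunking strategies named in step 1 can be illustrated with a small sketch. It is not libsync's implementation: the window-sum hash, the WINDOW and DIVISOR values, and the function names are assumptions chosen for brevity (production CDC typically uses a Rabin-style fingerprint with minimum and maximum chunk sizes).

```python
import random

WINDOW = 16    # bytes of context that decide a boundary (assumed value)
DIVISOR = 64   # roughly one boundary per 64 positions (assumed value)

def fsp_chunks(data: bytes, size: int = 64) -> list:
    """Fixed-size partition: cut every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def cdc_chunks(data: bytes) -> list:
    """Content-defined chunking: cut wherever the sum of the last
    WINDOW bytes is divisible by DIVISOR."""
    chunks, start, window_sum = [], 0, 0
    for i, byte in enumerate(data):
        window_sum += byte
        if i >= WINDOW:
            window_sum -= data[i - WINDOW]  # byte that left the window
        if i >= WINDOW - 1 and window_sum % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

def shared(old: list, new: list) -> int:
    """Number of distinct chunks that appear in both lists."""
    return len(set(old) & set(new))

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
edited = b"X" + original  # insert a single byte at the front

print("FSP chunks shared:", shared(fsp_chunks(original), fsp_chunks(edited)))
print("CDC chunks shared:", shared(cdc_chunks(original), cdc_chunks(edited)))
```

A one-byte insertion shifts every fixed-size block, so FSP finds almost nothing in common, while the CDC boundaries depend only on nearby content and line up again after the first cut point.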

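The libsync library provides three APIs, with the following prototypes:
1. int file_chunk(char *src_filename, char *chunk_filename, int chunk_algo)
Function: splits a file into chunks and generates a chunk description file.
Parameters: src_filename is the source file, chunk_filename is the chunk description file to generate, and chunk_algo is the chunking algorithm; FSP, CDC, and SB are currently supported.

2. int file_delta(char *src_filename, char *chunk_filename, char *delta_filename, int chunk_algo)
Function: delta-encodes a file using previously generated chunk description information.
Parameters: src_filename is the file to encode, chunk_filename is the chunk description file generated by file_chunk, delta_filename is the delta-encoding file to generate, and chunk_algo is the chunking algorithm.

3. int file_sync(char *src_filename, char *delta_filename)
Function: uses a delta-encoding file to synchronize the source file to the target file.
Parameters: src_filename is the base file, and delta_filename is the delta-encoding file generated by file_delta.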

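Data synchronization has two usage modes, PULL and PUSH: PULL synchronizes remote data to the local side, while PUSH synchronizes local data to the remote side. In terms of the synchronization algorithm, the main difference is where chunking and delta encoding take place. The steps of each mode are as follows.
 PULL mode:
 1. The local side chunks file A and generates the chunk description file chunk;
 2. The chunk file is uploaded to the remote server;
 3. The remote server delta-encodes file B and generates the delta-encoding file delta;
 4. The delta file is downloaded to the local side;
 5. The local side synchronizes file A to file B, which is equivalent to downloading file B into local file A.

PUSH mode:
 1. The remote server chunks file B and generates the chunk description file chunk;
 2. The chunk file is downloaded to the local side;
 3. The local side delta-encodes file A and generates the delta-encoding file delta;
 4. The delta file is uploaded to the remote server;
 5. The remote side synchronizes file B to file A, which is equivalent to uploading file A into remote file B.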
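The two flows differ only in which side runs which step. The following in-memory sketch makes that explicit; make_chunk_info, make_delta, and apply_delta are hypothetical Python stand-ins for the roles of file_chunk, file_delta, and file_sync, not bindings to libsync, and the fixed 8-byte block size is an assumption for readability.

```python
import hashlib

BLOCK = 8  # tiny block size so the example is easy to follow (assumed value)

def make_chunk_info(base: bytes) -> dict:
    """Chunk step: map each block's strong hash to its (offset, length)."""
    info = {}
    for off in range(0, len(base), BLOCK):
        block = base[off:off + BLOCK]
        info.setdefault(hashlib.md5(block).digest(), (off, len(block)))
    return info

def make_delta(target: bytes, chunk_info: dict) -> list:
    """Delta step: encode target as copies from the base plus literal bytes."""
    ops, literal, i = [], bytearray(), 0
    while i < len(target):
        # Hashing every window is slow; real tools filter with a weak
        # rolling checksum first.
        hit = chunk_info.get(hashlib.md5(target[i:i + BLOCK]).digest())
        if hit is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", hit[0], hit[1]))
            i += hit[1]
        else:
            literal.append(target[i])
            i += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops

def apply_delta(base: bytes, ops: list) -> bytes:
    """Sync step: rebuild the target from the base and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)

def pull(local_a: bytes, remote_b: bytes) -> bytes:
    chunk = make_chunk_info(local_a)      # local chunks A, uploads chunk
    delta = make_delta(remote_b, chunk)   # remote encodes B, delta is downloaded
    return apply_delta(local_a, delta)    # local turns A into B

def push(local_a: bytes, remote_b: bytes) -> bytes:
    chunk = make_chunk_info(remote_b)     # remote chunks B, chunk is downloaded
    delta = make_delta(local_a, chunk)    # local encodes A, uploads delta
    return apply_delta(remote_b, delta)   # remote turns B into A

a = b"the quick brown fox jumps over the lazy dog"
b = b"the quick brown cat jumps over the lazy dog!!"
assert pull(a, b) == b
assert push(a, b) == a
```

In both modes only the chunk description and the delta cross the network; the side that ends up with the new content is always the side that supplied the chunk description.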

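Latest version:

Project home page: http://code.google.com/p/libsync/

Data synchronization tool - cwRsync
cwRsync is a data synchronization solution for the Windows platform, in effect rsync for Windows. It packages rsync together with Cygwin.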

cwRsync is yet another packaging of Rsync and Cygwin for Windows with a client GUI. You can use cwRsync for fast remote file backup and synchronization. Rsync uses the Rsync algorithm which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand. At first glance this may seem impossible because the calculation of diffs between two files normally requires local access to both files.

Rsync normally uses ssh for communication. It requires no special privileges for installation. You must, however, have a working ssh system.

Alternatively, rsync can run in "daemon" mode, listening on a socket. This is generally used for public file distribution, although authentication and access control are available.

Cygwin is a Linux-like environment for Windows. It consists of a DLL (cygwin1.dll), which emulates substantial Linux API functionality, and a collection of tools.

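Latest version: 4.0

Project home page: https://www.itefix.no/i2/cwrsync

HTTP-based file synchronization tool - zsync
zsync is an rsync-like file synchronization tool that works over HTTP; it lets you fetch changes to a file from a remote web server.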

zsync is a file transfer program. It allows you to download a file from a remote server, where you have a copy of an older version of the file on your computer already. zsync downloads only the new parts of the file. It uses the same algorithm as rsync. However, where rsync is designed for synchronising data from one computer to another within an organisation, zsync is designed for file distribution, with one file on a server to be distributed to thousands of downloaders. zsync requires no special server software — just a web server to host the files — and imposes no extra load on the server, making it ideal for large scale file distribution.

zsync is open source, distributed under version 2 of the Artistic License. Feedback, bug reports and patches are welcome.

Features
zsync fills a gap in the technology available for large-scale file distribution. Three key points explain why zsync provides a genuinely new technique for file distribution:
 Client-side rsync: zsync uses the rsync algorithm, but runs it on the client side, thus avoiding the high server load associated with rsync.
 Rsync over HTTP: zsync provides transfers that are nearly as efficient as rsync -z or cvsup, without the need to run a special server application. All that is needed is an HTTP/1.1-compliant web server, so it works through firewalls and on shared hosting accounts, and raises fewer security worries.
 Handling for compressed files: rsync is ineffective on compressed files, unless they are compressed with a patched version of gzip. zsync has special handling for gzipped files, which enables update transfers of files which are distributed in compressed form.
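The "client-side rsync over HTTP" idea can be sketched as follows: the server publishes per-block checksums of the new file, the client checks which of those blocks already exist somewhere in its old copy, and it requests only the missing byte ranges with an HTTP Range header. This is a simplified illustration, not zsync's actual control-file format or checksum scheme; block_hashes, plan_ranges, range_header, and the 4-byte block size are assumptions.

```python
import hashlib

BLOCK = 4  # tiny block size for illustration (assumed value)

def block_hashes(data: bytes) -> list:
    """What the server would publish: one hash per block of the new file."""
    return [hashlib.sha1(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def plan_ranges(old: bytes, new_hashes: list, new_len: int) -> list:
    """Return inclusive (first, last) byte ranges of the new file
    that the client does not already hold in its old copy."""
    # Hash the old file at every byte offset so shifted blocks are found.
    have = {hashlib.sha1(old[i:i + BLOCK]).digest() for i in range(len(old))}
    ranges = []
    for idx, digest in enumerate(new_hashes):
        if digest in have:
            continue
        first = idx * BLOCK
        last = min(first + BLOCK, new_len) - 1
        if ranges and ranges[-1][1] + 1 == first:
            ranges[-1] = (ranges[-1][0], last)  # merge adjacent ranges
        else:
            ranges.append((first, last))
    return ranges

def range_header(ranges: list) -> str:
    """Format the ranges as an HTTP Range header value."""
    return "bytes=" + ",".join(f"{first}-{last}" for first, last in ranges)

old = b"AAAABBBBCCCC"
new = b"AAAAXXXXBBBBCCCC"
needed = plan_ranges(old, block_hashes(new), len(new))
print(needed)                # [(4, 7)]
print(range_header(needed))  # bytes=4-7
```

All matching work happens on the client, which is why an ordinary web server that honours range requests is enough on the other end.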

Latest version: 0.6

Project home page: http://zsync.moria.org.uk/

Graphical front end - grsync
Grsync is a graphical user interface for rsync, the data mirroring and backup tool for Unix-like systems.

Grsync is an rsync GUI (Graphical User Interface). Rsync is the well-known and powerful command line directory and file synchronization tool. Grsync makes use of the GTK libraries and is released under the GPL license, so it is open source. It doesn't need the GNOME libraries to run, but of course runs fine under GNOME. It can be effectively used to synchronize local directories and it supports remote targets as well (even though it doesn't support browsing the remote folder). Sample uses of grsync include: synchronizing a music collection with removable devices, backing up personal files to a networked drive, replicating a partition to another one, mirroring files, etc.

Features
Most commonly used rsync options available, additional options may be specified by command line switches
Saves multiple settings with customized names (no limit on number of "sessions")
Session sets can be created: run multiple sessions at once!
Can do simulation or normal execution
Captures rsync output, prints it neatly in its own window, and logs it to a file
Parses rsync output to display progress bars and other information
Highlights errors and shows them in a separate window, for better and faster control over rsync runs
Can pause rsync execution
A good number of translations available
Can run custom commands before (and stop in case of failure) and after rsync
Shell script for batch, crontab use etc. provided (grsync-batch)
Can import and export sessions on file; i.e. share your settings with people!
Can minimize to system tray (status icon)
Can run specific sessions with superuser privileges
Rsync backup made easy!
Needs rsync installed on the system (command line tool only, no need for server-side daemon) and GTK
Available for free and with sources!
Works on many Linux distributions (including Nokia Maemo), Mac OS X and Windows!

Latest version: 1.2

Project home page: http://www.opbyte.it/grsync/

Perl file synchronization script - fsync
Fsync is a Perl script for synchronizing files with remote hosts, with functionality similar to the rsync and CVS packages.

Fsync is a Perl script which allows for file synchronization between remote hosts, containing functionality similar to that of the rsync and CVS packages. Since fsync is a single Perl script, setting up file synchronization on a new machine is relatively simple. Communication between the hosts is via a socket mechanism or over an rsh (or ssh) connection, with the remote server started by rsh, by ssh or manually. The program was written with slow modem connections in mind. Fsync supports the concept of merging differences from local/remote hosts with hooks for tools to merge the trees. Fsync requires perl 5.004 or newer. This program is licensed under the GNU Public License.

Latest version: 2.1

Project home page: http://schwieters.org/fsync/