FSSYNC(1) | General Commands Manual | FSSYNC(1) |
fssync - File system synchronization tool (1-way, over SSH)
fssync -d db -r root [option...] host
fssync is a 1-way file-synchronization tool that tracks inodes and maintains a local database of files that are on the remote side, making it able to:
It aims at minimizing network traffic and synchronizing every detail of a file system:
Other features:
The main use of fssync is to prevent data loss in case of hardware failure where RAID1 is not possible (e.g. in laptops).
On Btrfs [1] file systems, fssync is a useful alternative to the btrfs send (and receive) commands, thanks to its filtering capabilities. It can be combined with Btrfs snapshotting on the destination side for a full backup solution.
Use fssync --help to get the complete list of options.
The most important thing to remember is that the local database must match exactly what's on the destination host:
See the -c option if you are unsure whether your database matches the destination directory.
First run of fssync:
An example of a wrapper around fssync, with a filter, can be found at examples/fssync_home.
fssync never descends into directories on other file systems. Inodes masked by mount points are also skipped, so the file systems mounted over them should be unmounted temporarily if you want these inodes to be synchronized. The same result can be achieved by synchronizing from a bind mount.
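A common way to stay on one file system is to compare the device number (st_dev) of each directory with that of the root. The sketch below illustrates that general technique only; it is not taken from fssync's source, and the function name walk_one_filesystem is made up for this example. Note that a bind mount of the same file system keeps the same device number, so this simple check would not detect it.

```python
import os

def walk_one_filesystem(root):
    """Yield paths under `root` without crossing into other file systems.

    Illustrative only: a directory whose st_dev differs from the root's
    is treated as a mount point and skipped entirely.
    """
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for name in dirnames:
            if os.lstat(os.path.join(dirpath, name)).st_dev == root_dev:
                kept.append(name)
        # Pruning in place stops os.walk from descending into skipped dirs.
        dirnames[:] = kept
        for name in kept + filenames:
            yield os.path.join(dirpath, name)
```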
See also the NONE cipher switching [3] patch if you don't need encryption and you want to speed up your SSH connection.
fssync maintains a single SQLite table of all directories and files that are on the remote side. Each row corresponds to a path, with its inode (on the local side), other metadata (on the remote side) and a checked flag.
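As a rough illustration, such a table could look like the sketch below. The table name, column names and types are assumptions made for this example; they are not fssync's actual schema.

```python
import sqlite3

# Hypothetical schema: names and types are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entry (
    path    TEXT PRIMARY KEY,           -- path relative to the root
    inode   INTEGER NOT NULL,           -- inode number on the local side
    meta    TEXT,                       -- metadata as known on the remote side
    checked INTEGER NOT NULL DEFAULT 0  -- 1 once handled in the current run
)
"""

def open_db(filename=":memory:"):
    """Open the illustrative database, creating the table if needed."""
    db = sqlite3.connect(filename)
    db.execute(SCHEMA)
    return db

def find_by_inode(db, inode):
    """Return the recorded path for a local inode, or None if unknown."""
    row = db.execute(
        "SELECT path FROM entry WHERE inode = ?", (inode,)
    ).fetchone()
    return row[0] if row else None
```

Looking a file up by inode rather than by path is what would let a tool of this kind recognize a known file under a new name and treat it as a rename instead of a new transfer.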
When running, fssync iterates recursively through all local directories and files, and for each path that is not ignored (see the -f option), it queries the DB to decide what to do. If the path is already checked, it is skipped immediately. When a path is synchronized, it is marked as checked. At the end, all rows that are not checked correspond to paths that no longer exist. Once they are deleted on the remote side, all checked flags are reset.
In fact, fssync does not require the database to match the destination perfectly. It tolerates some differences so that it can recover from any synchronization interrupted by a network failure, a file operation error, or anything other than an operating system crash of the local host (or something similar, like a power failure).
In most cases, this is done by the remote host, which automatically creates (or overwrites) an inode of the expected type if necessary. The only exception is that the remote will never delete a non-empty directory on its own. For the most complex cases, fssync journals the operation in the database: in case of failure, fssync will be able to recover on the next sync.
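The pass described above resembles a mark-and-sweep over the database. The following sketch illustrates the idea under simplifying assumptions: the entry table and its columns are hypothetical, local_paths (a dict mapping path to inode) stands in for the recursive walk, and the actual transfer and remote deletion are left out.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical, minimal table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entry (path TEXT PRIMARY KEY, inode INTEGER,"
        " checked INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def sync_pass(db, local_paths, ignored=lambda path: False):
    """Run one illustrative pass and return the paths to delete remotely."""
    for path, inode in local_paths.items():
        if ignored(path):
            continue
        row = db.execute(
            "SELECT checked FROM entry WHERE path = ?", (path,)
        ).fetchone()
        if row and row[0]:
            continue  # already checked: skip immediately
        # A real tool would synchronize the path with the remote here.
        db.execute(
            "INSERT OR REPLACE INTO entry (path, inode, checked)"
            " VALUES (?, ?, 1)",
            (path, inode),
        )
    # Rows still unchecked are paths that no longer exist locally.
    stale = [r[0] for r in db.execute(
        "SELECT path FROM entry WHERE checked = 0 ORDER BY path"
    )]
    # After the remote deletion, forget them and reset all flags.
    db.execute("DELETE FROM entry WHERE checked = 0")
    db.execute("UPDATE entry SET checked = 0")
    db.commit()
    return stale
```

Because the checked flag is stored in the database, an interrupted pass can resume without redoing the paths it has already handled.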
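The "create or overwrite an inode of the expected type" behaviour can be pictured with the sketch below. It is an assumption-laden illustration, not fssync's remote code: the function names are invented, only directories and regular files are handled, and the refusal to delete a non-empty directory comes from os.rmdir, which fails on such directories.

```python
import os
import stat

def ensure_directory(path):
    """Make `path` a directory, replacing a non-directory if needed."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        os.mkdir(path)
        return
    if stat.S_ISDIR(st.st_mode):
        return  # already the expected type
    os.unlink(path)  # overwrite a file, symlink, etc.
    os.mkdir(path)

def ensure_regular_file(path):
    """Make `path` a regular file, replacing another inode type if needed.

    Raises OSError if `path` is a non-empty directory: it is never
    deleted automatically.
    """
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        st = None
    if st is not None and stat.S_ISDIR(st.st_mode):
        os.rmdir(path)  # fails on a non-empty directory
    elif st is not None and not stat.S_ISREG(st.st_mode):
        os.unlink(path)
    open(path, "ab").close()  # create if missing, keep existing content
```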
A race condition means that other processes on the local host are modifying inodes that fssync is synchronizing. fssync handles any kind of race condition. In fact, fssync has nothing to do in most cases.
When a race condition happens, fssync does not guarantee that the remote data is in a consistent state. Each sync always fixes existing inconsistencies but may introduce others, so fssync is not suitable for hot backups of databases.
With Btrfs, you can get consistency by snapshotting at source side.
The idea of maintaining a local database actually comes from csync2 [4]. I was about to adopt it when I realized that I really needed a tool that always detects renames/moves of big files. That's why I see fssync as a partial rewrite of csync2, with inode tracking and without bidirectional synchronization. The local database really makes fssync & csync2 faster than the well-known rsync [5].
sqlite3(1), ssh(1)
If the DB is not corrupted and you don't want to rebuild it, you can try to update it by running fssync again as soon as possible, so that the same changes are replayed. fssync should be able to detect that all remote operations have already been performed. See also the -c and -F options.
Julien Muchembled <jm@jmuchemb.eu>