FDT – Deduplication Reimagined in OpenZFS

Deduplication in OpenZFS has been a bugbear for some years now. As data sets get larger, deduplicating them with the present DeDuplication Table (DDT) method has become even more difficult. Deduplication in OpenZFS is often derided as overwhelming and sluggish in performance.

Moreover, there is a piece of folklore, passed on again and again, about allocating 5GB of RAM for every 1TB of data to dedupe in OpenZFS. I don’t know where this “sizing” came from. It was probably derived from something Jeff Bonwick wrote back in the early days of ZFS. But there is some truth to this “rule of thumb”, which is commonly passed around in TrueNAS® circles.
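One plausible back-of-the-envelope reading of that rule, offered as an assumption rather than its confirmed origin: if each in-memory dedupe table entry costs roughly 320 bytes and the average block is 64KB, then 1TB of unique data needs about 5GB of RAM. Both figures are illustrative inputs, and the function name below is hypothetical.

```python
TIB = 2**40
KIB = 2**10
GIB = 2**30


def ddt_ram_bytes(pool_bytes, avg_block_bytes=64 * KIB, entry_bytes=320):
    """Estimate RAM needed to hold one dedupe table entry per unique block.

    avg_block_bytes and entry_bytes are assumed values for illustration;
    real pools vary widely in both.
    """
    blocks = pool_bytes // avg_block_bytes  # number of unique blocks to track
    return blocks * entry_bytes


print(ddt_ram_bytes(TIB) / GIB)  # 5.0
```

Halving the average block size doubles the estimate, which is why pools full of small blocks are where dedupe memory pressure hurts the most.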

Nevertheless, given the exponential growth of data and the advancement of processing power in modern computer systems, the OpenZFS development community decided to revamp the DDT method. Several luminaries from iXsystems™, Klara Systems and the OpenZFS community got together in mid-2023 to develop FDT, or Fast Dedupe Table. And we got to see FDT announced to the world at the most recent OpenZFS Developer Summit in November 2023.

Fast Dedupe Table (FDT)

Fast Dedupe Table (FDT) is a log-based approach to dedupe. In OpenZFS, all incoming block write I/Os are coalesced into transaction groups (TXGs), hashed and checksummed, before they are committed to persistent media.

The new implementation in FDT puts the checksums and hashes from these incoming TXGs into an append-only log structure in persistent storage, and also tracks the hashed changes in an AVL tree residing in memory. An AVL tree is a self-balancing binary search tree that is very efficient to search, which gives FDT its speed in initiating deduplication lookups and updates.

OpenZFS Fast Dedupe Table (FDT) in a nutshell

The append-only log structure works hand-in-hand with the AVL tree to accept and stage (including intelligent sorting) the hash entries that come in after the TXG writes. Then, at a certain marker, which could be a time-based trigger or a high-water mark, the entries in the log and AVL tree are flushed to the ZAP (ZFS Attribute Processor), where the actual full map of the OpenZFS blocks resides.
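The staging-and-flush flow can be pictured with a toy model. This is only a minimal sketch under loud assumptions, not the OpenZFS implementation: the class and attribute names are hypothetical, a sorted Python list kept in order with `bisect` stands in for the AVL tree, plain lists and dicts stand in for the on-disk log and the ZAP, and the only trigger modelled is a high-water mark on pending entries.

```python
import bisect
import hashlib


class DedupLogSketch:
    """Toy model: append-only log plus sorted in-memory index, flushed to a table."""

    def __init__(self, high_water_mark=4):
        self.log = []           # stand-in for the on-disk append-only log
        self.pending_keys = []  # sorted checksums; stand-in for the in-memory AVL tree
        self.pending = {}       # checksum -> reference count not yet flushed
        self.table = {}         # stand-in for the ZAP-backed dedupe table
        self.high_water_mark = high_water_mark

    def write_block(self, data: bytes) -> bool:
        """Record one block write; return True if its checksum was already known."""
        key = hashlib.sha256(data).hexdigest()
        duplicate = key in self.pending or key in self.table
        self.log.append(key)  # appended only, never rewritten in place
        if key not in self.pending:
            bisect.insort(self.pending_keys, key)  # keep staged keys sorted
            self.pending[key] = 0
        self.pending[key] += 1
        if len(self.pending_keys) >= self.high_water_mark:
            self.flush()
        return duplicate

    def flush(self):
        """Merge staged entries into the table in sorted order, then reset staging."""
        for key in self.pending_keys:
            self.table[key] = self.table.get(key, 0) + self.pending[key]
        self.pending_keys.clear()
        self.pending.clear()
        self.log.clear()


dedup = DedupLogSketch(high_water_mark=3)
for block in (b"a", b"a", b"b", b"c"):
    dedup.write_block(block)
print(len(dedup.table), len(dedup.log))  # 3 0
```

The point of the staging step is that lookups and updates hit the in-memory structure first, and the backing table is only touched in sorted batches instead of once per incoming block.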
