Deduplication in OpenZFS has been a bugbear for some years now. As data sets get larger, the present DeDuplication Table (DDT) method has become even more difficult to use. Deduplication in OpenZFS is often derided as overwhelming and sluggish in performance.
Moreover, there is a piece of common folklore, passed on and on, about allocating 5GB of RAM for every 1TB of data to dedupe in OpenZFS. I don’t know where this “sizing” came from. It was probably derived from something Jeff Bonwick wrote back in the early days of ZFS. But there is some truth to this “rule of thumb”, commonly passed around in TrueNAS® circles.
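For what it is worth, one back-of-envelope derivation that lands on that figure goes like this. It assumes roughly 320 bytes of RAM per in-core DDT entry and an average block size of 64 KiB. Both values are assumptions commonly quoted in the community, not numbers taken from this article, and the helper name `ddt_ram_bytes` is purely illustrative.

```python
# Back-of-envelope DDT sizing. Both constants are assumptions, not measurements.
BYTES_PER_DDT_ENTRY = 320      # assumed in-core size of one DDT entry
AVG_BLOCK_SIZE = 64 * 1024     # assumed average block size (64 KiB)

TIB = 1024 ** 4
GIB = 1024 ** 3


def ddt_ram_bytes(unique_data_bytes, avg_block_size=AVG_BLOCK_SIZE,
                  entry_bytes=BYTES_PER_DDT_ENTRY):
    """Estimate RAM needed to hold one DDT entry per unique block."""
    blocks = unique_data_bytes // avg_block_size
    return blocks * entry_bytes


ram_64k = ddt_ram_bytes(1 * TIB) / GIB
ram_128k = ddt_ram_bytes(1 * TIB, avg_block_size=128 * 1024) / GIB
print(f"{ram_64k} GiB of RAM per TiB at 64 KiB blocks")
print(f"{ram_128k} GiB of RAM per TiB at 128 KiB blocks")
```

Under these assumptions the estimate scales inversely with block size, so a pool dominated by small blocks would need considerably more than the rule of thumb suggests.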
Nevertheless, given the exponential growth of data, and the advancement of processing power in modern day computer systems, the OpenZFS development community has decided to revamp the DDT method. Several prominent luminaries from iXsystems™, Klara Systems and the OpenZFS community got together in mid-2023 to develop FDT, or Fast Dedupe Table. And we got to see FDT announced to the world at the most recent OpenZFS Developer Summit in November 2023.
Fast Dedupe Table (FDT)
Fast Dedupe Table (FDT) is a log-based dedupe. In OpenZFS, all incoming write block I/Os are coalesced into transaction groups (TXGs), hashed and checksummed, before they are committed to persistent media.
The new implementation in FDT puts the checksums and hashes of these incoming TXGs into an append-only log structure in persistent storage, and also tracks the hashed changes in an AVL tree residing in memory. An AVL tree is a self-balancing binary search tree that is very efficient to search, giving FDT its speed in initiating deduplication lookups and updates.
The append-only log structure works hand-in-hand with the AVL tree to accept and stage (including intelligent sorting) the hash entries coming in after the TXG writes. Then at a certain marker, which could be a time-based trigger or a high-water mark, the entries in the log and AVL tree are flushed to the ZAP (ZFS Attribute Processor), where the actual full map of the OpenZFS blocks resides.
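To make the staging idea more concrete, here is a toy model in Python. It is only a sketch under stated assumptions and not how OpenZFS implements it: a plain list stands in for the on-disk append-only log, a `bisect`-maintained sorted list stands in for the AVL tree, a dictionary stands in for the ZAP, and the class name `LogDedupTable` and the `flush_threshold` high-water mark are hypothetical.

```python
import bisect
import hashlib


class LogDedupTable:
    """Toy model: append-only log plus sorted in-memory index, flushed in batches."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold  # hypothetical high-water mark
        self.log = []          # stands in for the on-disk append-only log
        self.pending = {}      # checksum -> refcount delta not yet flushed
        self.sorted_keys = []  # sorted pending checksums (AVL tree stand-in)
        self.table = {}        # stands in for the ZAP-backed dedup table
        self.flushes = 0

    def write_block(self, data: bytes) -> bool:
        """Record one block; return True if it duplicates a known block."""
        key = hashlib.sha256(data).hexdigest()
        duplicate = key in self.pending or key in self.table
        self.log.append(key)  # sequential append, no table update yet
        if key not in self.pending:
            bisect.insort(self.sorted_keys, key)
            self.pending[key] = 0
        self.pending[key] += 1
        if len(self.pending) >= self.flush_threshold:
            self.flush()
        return duplicate

    def flush(self):
        """Apply pending entries to the table in sorted key order, then reset."""
        for key in self.sorted_keys:
            self.table[key] = self.table.get(key, 0) + self.pending[key]
        self.log.clear()
        self.pending.clear()
        self.sorted_keys.clear()
        self.flushes += 1


# Usage: five block writes, three of them carrying the same payload.
ddt = LogDedupTable(flush_threshold=3)
results = [ddt.write_block(b) for b in (b"a", b"b", b"a", b"c", b"a")]
print(results, "flushes:", ddt.flushes)
```

In this sketch the table is touched once per batch rather than once per write, and always in sorted key order, which is the amortization the log-based design is after.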
FDT creates a halfway house to amortize the performance impacts that have haunted the previous DDT method, by delaying the write IOPS involved in updating the hash tables in the ZAP. Intelligent sorting further improves the lookups and searches of hash entries (read IOPS) that check whether a block is unique or has existed before.
These changes in FDT reduce both the read I/O and write I/O amplification issues experienced with the existing DDT method. At the same time, by sorting entries sequentially in a more orderly manner, FDT also ensures that hash entry lookups are less erratic and more efficient.
Performance Results
Early testing results have been really promising. The Fast Dedupe Table (FDT) paper, presented at AsiaBSDCon 2024 on March 23rd, shared these exciting numbers. The write performance table is shown below.
The results from the table are extremely exciting. An 800+% improvement! Mind blowing!
What else
Other than keeping the AVL tree and the log updated and efficient, there are other maintenance features introduced in FDT. The objective is to keep FDT dedupe performing well. Some of the features discussed are a dedupe quota, reducing the size of the ZAP through ZAP shrinking, pruning of entries, and pre-loading of the FDT table in the event of a system reboot.
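As a rough illustration of how a quota and pruning could keep the table small, here is a toy Python sketch. The behavior modeled is an assumption for illustration only: the quota is counted in entries, new unique blocks are simply left untracked once the quota is reached, and pruning drops entries that are referenced only once. The names `QuotaDedupTable`, `max_entries` and `prune_unique` are hypothetical and do not correspond to OpenZFS interfaces.

```python
class QuotaDedupTable:
    """Toy dedup table with an entry quota and pruning of unique entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # hypothetical quota, counted in entries
        self.refcounts = {}             # checksum -> reference count

    def add(self, key):
        """Return True if the block is tracked (new entry or deduplicated)."""
        if key in self.refcounts:
            self.refcounts[key] += 1    # existing entry: still deduplicates
            return True
        if len(self.refcounts) >= self.max_entries:
            return False                # quota reached: block stays untracked
        self.refcounts[key] = 1
        return True

    def prune_unique(self):
        """Drop entries referenced only once; return how many were removed."""
        unique = [k for k, n in self.refcounts.items() if n == 1]
        for k in unique:
            del self.refcounts[k]
        return len(unique)


# Usage: a quota of two entries, then a prune pass.
t = QuotaDedupTable(max_entries=2)
tracked = [t.add(k) for k in ("x", "y", "z", "x")]
removed = t.prune_unique()
print(tracked, "pruned:", removed, "remaining:", t.refcounts)
```

The point of the sketch is that entries which never found a duplicate are the cheapest ones to give up, since removing them costs no space savings already achieved.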
You can learn more from Allan Jude’s presentation at the OpenZFS Developer Summit in the video below.
iXsystems™ was truly pumped when FDT was made available in the TrueNAS® SCALE nightly train release in March 2024. The announcement, lovingly, was on February 13th, 2024 and iX called it the “Valentine’s Gift to the OpenZFS and TrueNAS® communities”.
There is indeed a lot of love shared in the thriving OpenZFS community. As FDT continues to be developed and its quality enhanced, we hope to see FDT become the default deduplication method in a future version of OpenZFS 2.3. This is truly a new era for data deduplication in OpenZFS.