
To Use Dedup Or Not?

juddyjacob
Starter
Posts: 48
Joined: 07 Sep 2012 03:01
Location: Leonardo New Jersey

To Use Dedup Or Not?

#1

Post by juddyjacob » 16 Jan 2013 05:32

I have a backup server using five 2 TB disks in a raidz1 pool. Currently only 16% of the volume is in use, but I'm thinking ahead, as it will fill up quickly. It has 16 GB of memory, plus a 25 GB swap partition on an SSD shared with the operating system. I have heard so many different opinions on whether to use dedup or not, so I am carefully weighing my decision. Like everyone, I don't want to end up in a situation where I need to reload the server and start over. I have run a command to estimate the size of the dedup table and need a little assistance deciphering the output:

rbnas0:~# zdb -S tank1
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     370K   45.0G   41.4G   41.6G     370K   45.0G   41.4G   41.6G
     2    59.6K   7.20G   4.92G   5.10G     148K   17.9G   12.0G   12.5G
     4     715K   87.5G   79.2G   79.8G    4.25M    533G    482G    486G
     8     195K   24.0G   21.3G   21.5G    2.30M    291G    258G    260G
    16     225K   28.0G   22.2G   22.6G    4.14M    528G    418G    427G
    32    2.92K    351M    323M    325M     128K   15.0G   13.8G   13.9G
    64      145   10.9M   4.08M   4.73M    12.5K    965M    357M    415M
   128      139   7.55M   3.61M   4.23M    20.4K   1.03G    500M    594M
   256       35   1.60M    622K    805K    11.0K    533M    216M    273M
   512       10    302K   75.5K    128K    6.80K    207M   49.4M   84.8M
    1K        4    257K   13.5K   38.3K    4.60K    302M   15.7M   44.4M
 Total    1.53M    192G    169G    171G    11.4M   1.40T   1.20T   1.21T

dedup = 7.27, compress = 1.17, copies = 1.01, dedup * compress / copies = 8.38

1.53M entries x 320 bytes = 489.6 MB (current estimated table size)
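For anyone else trying to decipher this output, here is a rough sketch of where the summary numbers come from. The 320 bytes per DDT entry is the commonly quoted in-core entry size and is an approximation, not an exact figure; the totals are copied from the zdb output above.

```python
# Reproduce the dedup ratio and the table-size estimate from the
# "Total" row of the simulated DDT histogram above.
GIB = 1024 ** 3
TIB = 1024 ** 4

alloc_dsize = 171 * GIB    # Total allocated DSIZE: what would be written with dedup on
ref_dsize = 1.21 * TIB     # Total referenced DSIZE: the logical data in the pool
entries = 1.53e6           # Total allocated blocks: unique blocks = DDT entries

# zdb reports 7.27; this recomputation from the rounded totals lands close
dedup_ratio = ref_dsize / alloc_dsize

# Rule of thumb: each in-core DDT entry costs roughly 320 bytes of RAM
ddt_ram_bytes = entries * 320

print(f"dedup ratio ~ {dedup_ratio:.2f}")
print(f"estimated DDT size ~ {ddt_ram_bytes / 1e6:.1f} MB")
```

Note that this ~490 MB is only the table at today's 16% utilization; it grows with the number of unique blocks, so a full pool would need several gigabytes of the 16 GB of RAM just for the DDT.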

Can you tell me roughly what compression ratio I can expect? And how will the table size grow as the dataset grows?

Thank you for all and any input!
x64-full on Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz : Supermicro X10SRL-F : 130926MiB ECC Ram: 8x4TB RZ2 : 19TB Usable

ku-gew
Advanced User
Posts: 173
Joined: 29 Nov 2012 09:02
Location: Den Haag, The Netherlands

Re: To Use Dedup Or Not?

#2

Post by ku-gew » 16 Jan 2013 23:30

At 2 GB of RAM per TB of data, you are too limited. Don't dedupe.
As a ZFS expert somewhere else wrote: think carefully before enabling dedup, and then don't enable it. Use compression instead.

Unless you have a server with a lot of repetitive load and data (virtualized machines, and not only 2-5 of them but at least 20-50), do not dedupe.
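Applying that rule of thumb to the original poster's pool shows why. A back-of-the-envelope sketch, assuming a raidz1 of five 2 TB disks yields roughly 8 TB of usable space (the 2 GB/TB figure is this post's rule of thumb, not an official ZFS requirement):

```python
# Rough RAM budget check for dedup on the poster's hardware.
usable_tb = (5 - 1) * 2        # raidz1 of five 2 TB disks: one disk of parity, ~8 TB usable
ram_needed_gb = usable_tb * 2  # the "2 GB RAM per TB of data" rule of thumb
installed_gb = 16

print(f"RAM suggested for dedup on a full pool: {ram_needed_gb} GB")
print(f"RAM installed: {installed_gb} GB")
```

With the pool full, dedup would want all 16 GB of installed RAM for itself, leaving nothing for the ARC or the rest of the system, which is why "too limited" is a fair verdict even though the numbers nominally match.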
HP Microserver N40L, 8 GB ECC, 2x 3TB WD Red, 2x 4TB WD Red
XigmaNAS stable branch, always latest version
SMB, rsync

misterredman
Forum Moderator
Posts: 184
Joined: 25 Jun 2012 13:31
Location: Switzerland

Re: To Use Dedup Or Not?

#3

Post by misterredman » 17 Jan 2013 10:16

NAS1: Pentium E6300 - Abit IP35Pro - 4GB RAM - Backup of NAS2
NAS2: Core 2 Quad Q9300 - Asus P5Q-EM - 8GB RAM
pyload - flexget - tvnamer - subsonic - owncloud - crashplan - plex media server

juddyjacob
Starter
Posts: 48
Joined: 07 Sep 2012 03:01
Location: Leonardo New Jersey

Re: To Use Dedup Or Not?

#4

Post by juddyjacob » 22 Jan 2013 07:07

Thank you for your input; I have chosen not to use the dedup feature. I did read that article. It's actually the same article where I learned the command to estimate the size of the dedup table. Though it's a little unclear where it speaks of the deduplication compression rate: I can't quite figure out how much dedup would compress my data from the output I previously posted. It would be nice to know, but after all I don't plan on using it. Maybe one day I'll build a small pool with a ton of RAM and test it out. Currently that's not in my budget, however.
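For the record, the zdb -S output posted earlier already answers the "how much would it compress" question: the pool references about 1.21 TiB of data that dedup would store in about 171 GiB. A small sketch using those two totals:

```python
# Space savings implied by the simulated DDT totals posted above.
GIB = 1024 ** 3
TIB = 1024 ** 4

referenced = 1.21 * TIB   # logical data the pool references (Total referenced DSIZE)
allocated = 171 * GIB     # what would actually be written with dedup on (Total allocated DSIZE)

savings = 1 - allocated / referenced
print(f"space saved by dedup ~ {savings:.0%}")
```

That works out to roughly 86% saved, consistent with the reported dedup ratio of 7.27 (1 - 1/7.27 is about 0.86). Whether that saving is worth the RAM cost is the whole debate of this thread.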
x64-full on Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz : Supermicro X10SRL-F : 130926MiB ECC Ram: 8x4TB RZ2 : 19TB Usable
