Dedup not working?
Added by Matt Hannon about 1 year ago
I'm running into an issue with ZFS deduplication where it doesn't appear to be deduplicating. My test was to perform a full backup of a system, without compression, to one ZFS directory, then perform the full backup again, also without compression, to a second ZFS directory in the same pool, and check what dedup ratio I got. Currently I have a dedup ratio of 1.00x. I've tested this with two programs, Symantec BackupExec System Restore and StorageCraft Shadowprotect. Any ideas?
login as: zemeron
Using keyboard-interactive authentication.
Password:
Last login: Wed Mar 3 18:31:10 2010 from 172.16.1.63
zemeron@nexenta:~$ zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
dedup    99.5G  14.1G  85.4G  14%  1.00x  ONLINE  -
syspool  49.8G  1.27G  48.5G   2%  1.00x  ONLINE  -
zemeron@nexenta:~$ zfs get dedup dedup/besr
NAME        PROPERTY  VALUE  SOURCE
dedup/besr  dedup     on     local
zemeron@nexenta:~$ zfs get dedup dedup/besr2
NAME         PROPERTY  VALUE  SOURCE
dedup/besr2  dedup     on     local
Replies
RE: Dedup not working? - Added by Christian o about 1 year ago
I haven't dared to try dedup yet, mostly because verify wasn't available. I believe it is cryptographically difficult to find a block that matches a given SHA256 hash, but I don't believe that rules out accidental collisions: SHA256 will generate collisions, and then my data would be corrupted.
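To put a rough number on the accidental-collision worry, here is a minimal sketch of the birthday bound in Python. It assumes SHA256 behaves like a uniformly random function, and the pool size and block size are illustrative assumptions (1 PiB of unique data in 128 KiB blocks, 128 KiB being the ZFS default recordsize), not figures from this thread.

```python
from fractions import Fraction

HASH_BITS = 256           # SHA-256 digest size in bits
BLOCK_SIZE = 128 * 1024   # assumed block size (ZFS default recordsize)
POOL_BYTES = 2 ** 50      # hypothetical: 1 PiB of unique data


def collision_upper_bound(n_blocks, hash_bits=HASH_BITS):
    """Birthday (union) bound: P(any two of n blocks share a hash)
    is at most n*(n-1)/2 divided by 2**hash_bits."""
    return Fraction(n_blocks * (n_blocks - 1), 2 * 2 ** hash_bits)


n = POOL_BYTES // BLOCK_SIZE          # number of blocks in the hypothetical pool
bound = collision_upper_bound(n)
print(f"{n} blocks, collision probability <= {float(bound):.3e}")
```

Under those assumptions the bound comes out below 2^-190, so an accidental collision is far less likely than other sources of corruption. It says nothing about deliberately constructed collisions, which is a separate question.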
If dedup is enabled but not deduplicating, either it is a bug or the data is:
- offset differently, so it is not aligned across blocks
- encrypted
- interleaved with backup attributes, such as timestamps of when the backup was performed, which would make every block different
Have you tried simply copying the same data to a different directory ("cp x x.dup")?
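The alignment point can be illustrated with a small Python sketch. This is not how ZFS is implemented; it only mimics the idea of hashing fixed-size, aligned blocks and counting unique ones. The 128 KiB block size is the ZFS default recordsize, while the payload and the two header lengths are made-up values for illustration.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # assumed block size; 128 KiB is the ZFS default recordsize


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size, aligned block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def dedup_ratio(*files, block_size=BLOCK_SIZE):
    """Blocks referenced divided by unique blocks stored, across all files."""
    hashes = [h for f in files for h in block_hashes(f, block_size)]
    return len(hashes) / len(set(hashes))


payload = os.urandom(8 * BLOCK_SIZE)  # stand-in for the backed-up data
header_a = b"A" * 512                 # hypothetical header of one length
header_b = b"B" * 700                 # hypothetical header of a different length

identical = dedup_ratio(payload, payload)
shifted = dedup_ratio(header_a + payload, header_b + payload)
print(f"identical copies: {identical:.2f}x")
print(f"shifted payload:  {shifted:.2f}x")
```

The two identical copies share every block, while the same payload behind headers of different lengths shares none, because every block boundary falls at a different position in the payload.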
RE: Dedup not working? - Added by Initial O about 1 year ago
I was under the impression that dedup would only work within a single zfs directory, not the entire pool.
Try copying the contents of besr into besr2?
RE: Dedup not working? - Added by Matt Hannon about 1 year ago
Tried copying the contents into a single directory, and still no dedup on the two sequential full system backups. Cleared out the contents, then copied one of the system backup files to both locations and got a dedup ratio of 2.00x. So dedup is working, at least in general.
After doing some looking around, I'm thinking that the backup header size is variable and is throwing the offset off. So it appears that there isn't any bug, but it does mean that I won't be able to use Nexenta, or ZFS dedup in general, as an alternative to Data Domain-like devices.
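One way to check the offset theory on real backup files is to count how many aligned blocks two images actually have in common. The Python sketch below is only a rough diagnostic, assuming a 128 KiB block size (it should match the dataset's recordsize); the function names are made up for this example.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed; set this to the dataset's recordsize


def aligned_block_digests(stream, block_size=BLOCK_SIZE):
    """Return the set of SHA-256 digests of the stream's fixed-size aligned blocks."""
    digests = set()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        digests.add(hashlib.sha256(block).digest())
    return digests


def shared_block_fraction(stream_a, stream_b, block_size=BLOCK_SIZE):
    """Fraction of stream_b's distinct aligned blocks that also occur in stream_a."""
    a = aligned_block_digests(stream_a, block_size)
    b = aligned_block_digests(stream_b, block_size)
    return len(a & b) / len(b) if b else 0.0
```

Both arguments are binary file objects, for example the two full backup images opened with `open(path, "rb")`. A result near 0.0 for two backups of the same system would be consistent with shifted or rewritten data rather than a dedup bug.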