Dedup not working?

Added by Matt Hannon about 1 year ago

I'm running into an issue with ZFS deduplication where it doesn't appear to be deduplicating. My test was to perform a full backup of a system without compression to one ZFS directory, perform the same full backup again, also without compression, to a second ZFS directory in the same pool, and check what dedup ratio I got. Currently I have a dedup ratio of 1.00x. I've tested this with two programs, Symantec BackupExec System Restore and StorageCraft ShadowProtect. Any ideas?

login as: zemeron
Using keyboard-interactive authentication.
Password:
Last login: Wed Mar  3 18:31:10 2010 from 172.16.1.63
zemeron@nexenta:~$ zpool list
NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
dedup    99.5G  14.1G  85.4G    14%  1.00x  ONLINE  -
syspool  49.8G  1.27G  48.5G     2%  1.00x  ONLINE  -
zemeron@nexenta:~$ zfs get dedup dedup/besr
NAME        PROPERTY  VALUE          SOURCE
dedup/besr  dedup     on             local
zemeron@nexenta:~$ zfs get dedup dedup/besr2
NAME         PROPERTY  VALUE          SOURCE
dedup/besr2  dedup     on             local

Replies

RE: Dedup not working? - Added by Christian o about 1 year ago

I haven't dared to try dedup yet, mostly because verify wasn't available. I believe it is cryptographically difficult to find a block that matches a given SHA256 hash, but I don't believe the opposite holds: SHA256 will still generate collisions between different blocks, and then my data would be corrupted.
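For a rough sense of scale on the accidental-collision worry, here is a minimal sketch of the standard birthday-bound approximation. It assumes SHA256 behaves like a uniformly random 256-bit function and says nothing about deliberately crafted collisions. The 1 PiB pool size and 128 KiB block size are made-up inputs for illustration, not figures from this thread.

```python
def collision_probability(num_blocks: int, hash_bits: int = 256) -> float:
    """Birthday-bound approximation: p ~= n * (n - 1) / 2^(bits + 1).

    Only meaningful while the result is much smaller than 1.
    """
    return num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)


# Assumed example: 1 PiB of unique data stored as 128 KiB blocks.
blocks = (2 ** 50) // (128 * 1024)  # 2**33 blocks
p = collision_probability(blocks)
print(f"{blocks} blocks -> collision probability about {p:.3e}")
```

Under those assumptions the estimate comes out at roughly 2^-191, so whether that risk is acceptable without verify is a judgment call rather than something this arithmetic settles.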

If dedup is enabled and not deduplicating, either it is a bug or the data is

  • offset differently, so it is not aligned across blocks
  • encrypted
  • interleaved with backup attributes, such as timestamps of when the backup was performed, which would make every block different.
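
The alignment point can be illustrated with a small simulation. This is only a sketch of fixed-size block hashing, not ZFS's actual code path: the 128 KiB block size is an assumption (it matches the default ZFS recordsize), and the random payload and the 100- and 101-byte headers are invented for the demonstration.

```python
import hashlib
import os

BLOCK = 128 * 1024  # assumed block size; 128 KiB is the default ZFS recordsize


def block_hashes(data: bytes, block_size: int = BLOCK) -> list:
    """Hash each fixed-size block of data, starting at offset 0."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def dedup_ratio(*streams: bytes, block_size: int = BLOCK) -> float:
    """Total blocks divided by unique blocks across all streams."""
    hashes = [h for s in streams for h in block_hashes(s, block_size)]
    return len(hashes) / len(set(hashes))


payload = os.urandom(8 * BLOCK)  # stand-in for the backed-up data

# Two byte-identical copies: every block appears twice.
aligned = dedup_ratio(payload, payload)

# Same payload behind headers that differ in length by one byte:
# every block boundary falls on different payload bytes.
shifted = dedup_ratio(b"H" * 100 + payload, b"H" * 101 + payload)

print(aligned, shifted)  # 2.0 1.0
```

A one-byte shift is enough to make every block hash differ, even though the payloads are identical.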

Have you tried simply copying the same data to a different directory ("cp x x.dup")?

RE: Dedup not working? - Added by Initial O about 1 year ago

I was under the impression that dedup would only work within a single zfs directory, not the entire pool.

Try copying the contents of besr into besr2?

RE: Dedup not working? - Added by Matt Hannon about 1 year ago

Tried copying the contents into a single directory, and still no dedup on the two sequential full system backups. Cleared out the contents, then copied one of the system backup files to both locations and got a dedup ratio of 2.00x. So dedup is working, at least in general.

After doing some looking around, I'm thinking that the backup header size is variable and is throwing the offset off. So it appears that there isn't any bug, but it does mean that I won't be able to use Nexenta, or ZFS dedup in general, as an alternative to Data Domain-like devices.
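
One way to check the variable-header theory without touching the pool is to hash the backup files in fixed-size blocks and count how many blocks repeat. The sketch below is an approximation, not what ZFS does internally: it assumes a 128 KiB block size, ignores compression and metadata, and the function name `estimate_dedup_ratio` is hypothetical.

```python
import hashlib
import sys
from collections import Counter


def estimate_dedup_ratio(paths, block_size=128 * 1024):
    """Estimate a block-level dedup ratio (total blocks / unique blocks).

    Each file is read in fixed-size blocks starting at offset 0, which
    mimics aligned, fixed-block dedup. block_size is an assumption.
    """
    counts = Counter()
    for path in paths:
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                counts[hashlib.sha256(block).digest()] += 1
    total = sum(counts.values())
    return total / len(counts) if counts else 1.0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the backup image files to compare as command-line arguments.
    print(f"estimated dedup ratio: {estimate_dedup_ratio(sys.argv[1:]):.2f}x")
```

If two sequential full backups come out near 1.00x here while two copies of the same file come out at 2.00x, that would be consistent with the offset explanation.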