Hello,
What is the current best practice for keeping your data redundant?
Classic RAID 5 is considered risky with big hard disks because of unrecoverable read errors (UREs), which are statistically quite likely to occur somewhere when reading entire large disks in full, and which can prevent a successful rebuild.
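To put a number on it (a back-of-envelope sketch with assumed figures: a consumer URE spec of one error per 10^14 bits read, and a 4-disk array of 4 TB disks, so a rebuild must read the three surviving disks in full):

    # Assumed figures: URE rate of 1 per 1e14 bits; 3 surviving 4 TB disks
    # = 9.6e13 bits that must all be read correctly during the rebuild.
    python3 -c 'p = 1 - (1 - 1e-14)**(3 * 4e12 * 8); print(f"{p:.0%}")'
    # prints roughly 62%: better-than-even odds of at least one URE

Whether real disks actually behave as badly as their spec-sheet URE rate is another question, but the spec-sheet math alone is sobering.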
I'm reading up on many things, such as SnapRaid, ZFS, Greyhole, ...
SnapRaid seems interesting, with benefits (every data disk stays readable on its own) and downsides (redundancy is not instant but snapshot-based). One benefit of the snapshot approach is actually that you can recover accidentally deleted data, as long as SnapRaid has not been synchronized since the deletion.
But I have a bit of a problem getting my head around ZFS. Basically you have disks, which you group in some redundant way (mirror, raidz) into vdevs, and then you combine several vdevs into a ZFS pool. If I understand correctly, the redundancy lives within each vdev: lose one whole vdev and you lose the entire pool.
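For concreteness, here is the layout as I understand it (a sketch only, with hypothetical device names):

    # Create a pool with a single raidz1 vdev of three disks:
    zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc
    # Add a second raidz1 vdev; ZFS then stripes data across both vdevs:
    zpool add tank raidz1 /dev/sdd /dev/sde /dev/sdf
    # Each vdev must survive on its own: if either vdev loses two of its
    # three disks, the whole pool is gone.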
So why would one add multiple vdevs to a ZFS pool? Is drive pooling the only reason?
And how is raidz1 better than RAID 5: doesn't it suffer from the same rebuild risk?
I like DrivePool on Windows, and I suspect Greyhole comes closest to it on Linux... The main attraction of these systems (to me) is that in an emergency you can just plug a disk into another system and access the data. It basically seems like an intelligent form of mirroring, so it does waste more storage.
Any comments?
Jörg