Log in

Fri, Jun. 3rd, 2011, 12:31 pm
RAnsrID: Galois Field / Reed-Solomon code rewrite pending

While starting to implement error resilience (not erasure resilience, which is already working nicely) in my RAID-lookalike multi-disk block device RAnsrID, I finally had to notice that I didn't understand one aspect of Galois Field mathematics - the Reed-Solomon representation type has heavy influence on how to deal with errors (read: corrupted data).

So far, I only understood the canonical matrix style representation (basically a linear combination over the data disks for each redundancy disk). Turns out that with polynomial representation you can create way better (read: faster) algorithms for error correction - according to my analysis, erasure and error recovery is O(N²), compared to O(N³) for erasure correction in the linear combination case, and unknown (presumably O(N³⋅M²)) in the error case (N: total number of disks, M: number of redundancy disks).

Thus I will rewrite the redundancy routines based on Phil Karn's Reed-Solomon implementation - the best implementation with an open license I could find. Most implementations (like in par2) don't bother with error correction, and use block-level checksums to detect errors. That done, erasure recovery can be used. Needless to say, this is no option for my block device (where and why should I store the checksums).

No need to say that this delays delivery timeframe of RAnsrID further; especially as I'd like to incorporate the change in an at least data preserving backwards compatible way.

Also Hackweek 6 wasn't as productive as I hoped; I only managed to get the test suite up and running. Oh well.

No HTML allowed in subject


Notice! This user has turned on the option that logs your IP address when posting. 

(will be screened)