Erasure Coding
Erasure Coding for Data Protection
Each vendor has implemented its own mechanism and algorithm for erasure coding, but the main principle remains the same across vendors: protect against drive failures using as few additional drives as possible.
Data protection is one of the key aspects of designing any storage product. Fault tolerance is a key part of data protection: it is the ability to recover data after drive failures. Achieving fault tolerance requires data redundancy, and redundancy involves a trade-off between the level of fault tolerance and the cost of additional drives. Erasure coding helps reduce this cost overhead.
- Traditional Way (RAID-1)
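The traditional way of providing redundancy is 1:1 mirroring of drives. Here space efficiency is 1/2, since every data drive needs its own mirror; more generally, keeping c copies of the data gives a space efficiency of 1/c.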

To protect 4 drives with mirroring we need 4 additional drives, so the ratio n:k is 4:4, where
n = number of drives to be protected
k = number of additional drives required to protect them.
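The n:k ratio translates directly into space efficiency. The sketch below is only illustrative: the function name `space_efficiency` is an assumption, and it compares the 4:4 mirroring case with the 4:2 layout of the erasure coding example in this post (4 data symbols plus 2 encoded symbols).

```python
from fractions import Fraction

def space_efficiency(n: int, k: int) -> Fraction:
    """Fraction of raw capacity that holds user data for an n:k layout.

    n = drives to be protected, k = additional drives (hypothetical helper).
    """
    return Fraction(n, n + k)

mirroring = space_efficiency(4, 4)      # RAID-1 style: 4 data + 4 mirror drives
erasure_coded = space_efficiency(4, 2)  # 4 data + 2 additional drives

print(mirroring, erasure_coded)  # prints "1/2 2/3"
```

With the same 4 drives of user data, the 4:2 layout uses 6 drives in total instead of 8.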
- Erasure Coding Implementation:

Before fault
In this approach we are given 4 data symbols, namely A, B, C and D. From these data symbols we compute two additional encoded (parity) symbols, one per equation:
- A+B+C+D
- A+2B+4C+8D
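As a rough illustration, the two equations can be computed as follows. This is a minimal sketch that treats each symbol as a plain integer; the names `encode`, `p` and `q` are assumptions made for the example, and real erasure codes such as Reed-Solomon perform the equivalent arithmetic in a finite field so that results stay within a fixed symbol size.

```python
def encode(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Compute the two encoded symbols for data symbols A, B, C, D."""
    p = a + b + c + d              # first equation:  A + B + C + D
    q = a + 2 * b + 4 * c + 8 * d  # second equation: A + 2B + 4C + 8D
    return p, q

# Example data symbols (arbitrary values chosen for illustration).
p, q = encode(3, 5, 7, 11)
print(p, q)  # prints "26 129"
```

The four data symbols and the two encoded symbols are then stored on six separate drives.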
After fault
Two data symbols fail, say B and D.
Recovery Mechanism
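After the two symbol failures (B and D) we still have four known values: the surviving data symbols A and C and the two encoded symbols. That leaves two unknowns (B and D) and two independent equations, so we can solve for both lost symbols.
To conclude, erasure coding achieves data protection efficiently, with a minimum of additional drives: in this example 2 additional symbols protect 4 data symbols against two failures (ratio 4:2), compared with 4 additional drives for mirroring (ratio 4:4).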
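The recovery step can be worked through by hand: subtracting the known symbols from the first equation gives B + D, and from the second gives 2B + 8D; eliminating B leaves 6D. The sketch below follows exactly that elimination. It assumes plain integer symbols, and the names `recover_b_d`, `p` and `q` are hypothetical; production implementations solve the same kind of system in finite-field arithmetic.

```python
def recover_b_d(a: int, c: int, p: int, q: int) -> tuple[int, int]:
    """Recover lost symbols B and D from survivors A, C and encoded symbols.

    p = A + B + C + D and q = A + 2B + 4C + 8D (integer arithmetic assumed).
    """
    s1 = p - a - c          # equals B + D
    s2 = q - a - 4 * c      # equals 2B + 8D
    d = (s2 - 2 * s1) // 6  # (2B + 8D) - 2(B + D) = 6D
    b = s1 - d
    return b, d

# Original data (arbitrary example values) and its encoded symbols.
a, b, c, d = 3, 5, 7, 11
p = a + b + c + d
q = a + 2 * b + 4 * c + 8 * d

# Pretend B and D were lost and rebuild them from what survived.
print(recover_b_d(a, c, p, q))  # prints "(5, 11)"
```

The same elimination works for any other pair of lost data symbols, because the coefficients 1, 2, 4, 8 in the second equation are all different.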