Coding Methods Revealed: Exploring Erasure Coding

A new report from the University of California at Berkeley titled "Erasure Coding for Big-data Systems: Theory and Practice" sheds light on the evolution and limitations of RAID (Redundant Array of Inexpensive Disks) in the context of modern storage technology. The article, penned by SecurityInfoWatch contributor Ray Bernard, is the 35th in the "Real Words or Buzzwords?" series.

The spotlight is on erasure coding, a technology that is increasingly being adopted in large-scale video storage systems. This method divides the original data into multiple chunks and computes additional redundancy (parity) chunks from them, so that lost chunks can be re-created from the ones that survive. Reed-Solomon (RS) codes are a popular choice for erasure coding in large-scale distributed storage systems.
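The chunk-and-parity idea can be illustrated with a minimal sketch. It uses a single XOR parity chunk, which is the simplest possible erasure code and is not Reed-Solomon; RS codes generalize the same idea to several parity chunks. The function names, the sample data, and the choice of four chunks are assumptions made for illustration only.

```python
from functools import reduce


def split_into_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size chunks, zero-padding the last one."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_chunks(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Hypothetical sample data, split into 4 data chunks plus 1 parity chunk.
data = b"camera-07 frame data"
chunks = split_into_chunks(data, k=4)
parity = xor_chunks(chunks)

# Simulate losing one chunk, then rebuild it from the survivors and the parity.
lost_index = 2
survivors = [c for i, c in enumerate(chunks) if i != lost_index]
rebuilt = xor_chunks(survivors + [parity])
```

With one parity chunk, only a single lost chunk can be recovered. Tolerating several simultaneous losses requires a code with more parity chunks, such as Reed-Solomon.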

In the past, RAID was crucial for data safety due to frequent hard drive failures. However, for large-capacity disk drives, the main problem with RAID is rebuild time, which can take weeks for a single disk loss under RAID 5 or RAID 6. Erasure coding, on the other hand, enables faster and more resource-efficient data reconstruction, especially in large-scale video storage systems.

Efficiency, Data Durability, and Rebuild Time

Erasure coding offers several advantages over traditional RAID and replication. It provides fault tolerance with significantly less storage overhead: often around 17-30% extra capacity, compared with 100% or more for replication, where every block is stored at least twice. This efficiency is critical when storing huge volumes of video data.
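The overhead figures follow from simple arithmetic: a layout with k data chunks and m parity chunks needs m/k extra capacity. The sketch below works this out for a few layouts; the specific layouts (6+1, 10+2, 10+3) are illustrative assumptions, not configurations named in the report.

```python
def storage_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Extra capacity needed, as a fraction of the original data size."""
    return parity_chunks / data_chunks


def replication_overhead(copies: int) -> float:
    """Extra capacity when keeping `copies` full copies of the data."""
    return float(copies - 1)


# Hypothetical layouts: (data chunks, parity chunks).
layouts = {"6+1": (6, 1), "10+2": (10, 2), "10+3": (10, 3)}

for name, (k, m) in layouts.items():
    print(f"{name}: {storage_overhead(k, m):.0%} overhead, tolerates {m} lost chunks")

for copies in (2, 3):
    print(f"{copies} copies: {replication_overhead(copies):.0%} overhead")
```

Under these assumptions, 6+1 comes to roughly 17% overhead and 10+3 to 30%, while two or three full copies cost 100% or 200%.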

Moreover, erasure coding can tolerate multiple simultaneous disk failures, offering strong data durability in distributed storage systems. By dispersing data and parity chunks across different storage nodes, erasure coding enhances reliability.

In terms of rebuild time, erasure coding rebuilds only the missing chunks from the surviving data and parity chunks. Because those chunks are spread across many nodes, the rebuild work runs in parallel, which reduces rebuild times and system impact.
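A back-of-the-envelope estimate shows why parallelism matters. The sketch below is a deliberately simplified model: it assumes rebuild time is just capacity divided by aggregate throughput, and it ignores the extra reads needed to decode each chunk as well as competing production traffic. The drive size, throughput, and node count are hypothetical values chosen for illustration.

```python
def rebuild_hours(capacity_tb: float, throughput_mb_s: float, parallel_nodes: int = 1) -> float:
    """Rough rebuild time in hours: capacity / (per-node throughput * nodes)."""
    total_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    seconds = total_mb / (throughput_mb_s * parallel_nodes)
    return seconds / 3600


# Hypothetical figures: a 16 TB drive, 50 MB/s of sustained rebuild throughput.
single_target = rebuild_hours(16, 50)                      # one drive absorbs all writes
distributed = rebuild_hours(16, 50, parallel_nodes=20)     # work spread over 20 nodes

print(f"single rebuild target: {single_target:.1f} hours")
print(f"spread over 20 nodes:  {distributed:.1f} hours")
```

In practice rebuild throughput is often throttled well below these figures to protect live recording, which is how rebuilds of large drives stretch from days toward weeks.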

Comparison to RAID

Traditional RAID systems rely on simple parity schemes and often need to read most or all drives in an array to rebuild a failed drive, leading to long rebuild times and performance degradation. In contrast, erasure coding scales more flexibly and efficiently.

RAID's storage efficiency drops as more parity is needed to tolerate more faults. Erasure coding, moreover, is implemented at the software or distributed-system layer rather than only in hardware RAID controllers, which allows better integration with large-scale, cloud, or distributed video storage architectures.

Industry Usage

Large cloud providers and video platforms use erasure coding for durability and efficiency. Examples include Amazon S3 for cloud storage, streaming platforms that need smooth video playback despite disk failures, and distributed file systems like Hadoop.

New storage architectures, such as the VMware vSAN Express Storage Architecture (ESA), exploit high-performance NVMe drives to run erasure coding efficiently at scale with good cost-effectiveness. These approaches eliminate the need for caching layers and maintain near-device-level performance while reducing total cost of ownership and complexity compared to traditional RAID.

Many entry-level NVRs (network video recorders) use JBOD (Just a Bunch of Disks) storage without RAID to avoid RAID's rebuild issues. For VMS (video management system) deployments with JBOD or RAID video storage, products like Viakoo Preemptive can be used to monitor the entire video system for signs of impending failure. Cloud-based VMS can affordably provide much more compute, storage, and network capacity than on-site systems.

Performance Under Modern Storage Hardware

Erasure coding provides high performance in storage systems because data is written in parallel to many disks at once. High-performance video management systems built on cloud technology, including erasure coding, can keep video recording fully operational, with reliability better than four or five nines (99.99% to 99.999%).
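To put "four or five nines" in concrete terms, the sketch below converts a number of nines into an allowed outage budget per year. It assumes the figure is read as availability (the fraction of time the system is recording) and uses a 365-day year; both are assumptions, since the article does not define the term.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def availability(nines: int) -> float:
    """Availability fraction for a given number of nines, e.g. 4 -> 0.9999."""
    return 1 - 10 ** -nines


def downtime_minutes_per_year(nines: int) -> float:
    """Maximum yearly downtime, in minutes, at the given number of nines."""
    return MINUTES_PER_YEAR * 10 ** -nines


for n in (4, 5):
    print(f"{n} nines ({availability(n):.3%}): "
          f"about {downtime_minutes_per_year(n):.1f} minutes of downtime per year")
```

Under these assumptions, four nines allows a little under an hour of lost recording per year, and five nines allows only a few minutes.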

In summary, erasure coding outperforms traditional RAID in large-scale video storage systems by providing greater storage efficiency, stronger data durability, and faster rebuild times, especially as disk capacities grow. It has become the preferred method in distributed, cloud, and modern storage architectures over classical RAID schemes.


Cloud computing has driven the adoption of erasure coding for large-scale video storage. Its key advantage is efficiency: it provides fault tolerance with significantly less storage overhead than mirroring or replication, which matters when storing huge volumes of video data in the cloud. It also offers faster, more resource-efficient data reconstruction and strong data durability, which is why modern distributed and cloud storage architectures prefer it over traditional RAID.
