Learn You Galois Fields for Great Good (11)

Navigation | first | previous | next (soon)

Reed-Solomon as Linear Algebra

This is part 11 of a series on Abstract Algebra. In this part, we'll continue exploring Reed-Solomon Erasure Codes.

Last time we introduced Reed-Solomon as polynomial evaluation (encoding) and polynomial interpolation (decoding). This time, we'll recast those ideas in the framework of linear algebra. This will give us new capabilities and variants.

Background required: This article assumes the reader has a good understanding of linear algebra, including: matrix multiplication, inverses, solving linear systems with LU factorization, linear independence, linear combinations, matrix rank, non-singularity conditions, etc.

If you are missing some background, check out the resources listed in part 9.

A Matrix for Polynomial Evaluation

Let's continue with the example from the previous article using the data vector [3, 2, 5] representing the polynomial p(x) = 3 + 2x + 5x^2. Suppose we wish to evaluate this polynomial at x = 2.

Using linear algebra, we can represent this evaluation as an inner-product (also called a dot-product):

$$ \begin{align} p(2) &= 3 + 2(2) + 5(2)^2 \nonumber \\ &= \left[ \begin{array}{ccc} 1 & 2 & 2^2 \\ \end{array} \right] \left[ \begin{array}{ccc} 3 \\ 2 \\ 5 \\ \end{array} \right] \nonumber \\ &= 27 \nonumber \end{align} $$

Similarly, we can evaluate many points $\textbf{x} = (1, 2, 3, 4, 5)$ with a single matrix-vector product:

$$ \begin{align} p(\textbf{x}) &= \left[ \begin{array}{ccc} 1 & 1 & 1^2 \\ 1 & 2 & 2^2 \\ 1 & 3 & 3^2 \\ 1 & 4 & 4^2 \\ 1 & 5 & 5^2 \\ \end{array} \right] \left[ \begin{array}{ccc} 3 \\ 2 \\ 5 \\ \end{array} \right] \nonumber = \left[ \begin{array}{ccc} 10 \\ 27 \\ 54 \\ 91 \\ 138 \\ \end{array} \right] \nonumber \\ &= \textbf{A} \left[ \begin{array}{ccc} 3 \\ 2 \\ 5 \\ \end{array} \right] \nonumber \end{align} $$

Vandermonde Matrices

The $\textbf{A}$ matrix constructed above is known as a Vandermonde matrix.

Given some distinct constants $x_0, x_1, \dots, x_{n-1}$, the Vandermonde matrix has the form:

$$ \left[ \begin{array}{ccc} 1 & x_0 & x_0^2 & ... & x_0^{n-1} \\ 1 & x_1 & x_1^2 & ... & x_1^{n-1} \\ 1 & x_2 & x_2^2 & ... & x_2^{n-1} \\ 1 & x_3 & x_3^2 & ... & x_3^{n-1} \\ 1 & x_4 & x_4^2 & ... & x_4^{n-1} \\ \vdots & & & & \vdots \\ 1 & x_{n-1} & x_{n-1}^2 & ... & x_{n-1}^{n-1} \\ \end{array} \right] $$

A square $n \times n$ Vandermonde matrix is always invertible (as long as all $x_i$ are distinct).
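This follows from the well-known closed form of the Vandermonde determinant, which is nonzero exactly when the $x_i$ are distinct:

$$ \det \left[ \begin{array}{ccc} 1 & x_0 & \dots & x_0^{n-1} \\ \vdots & & & \vdots \\ 1 & x_{n-1} & \dots & x_{n-1}^{n-1} \\ \end{array} \right] = \prod_{0 \le i < j \le n-1} (x_j - x_i) $$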

This gives intuition into what's happening in Reed-Solomon. Suppose we construct a $5 \times 3$ Vandermonde matrix (as above). Then, we can erase any 2 rows and still obtain a square $3 \times 3$ Vandermonde matrix, which is invertible!

This property suggests a very straightforward approach to implementing Reed-Solomon using ordinary linear algebra.

Encoding/Decoding in Matrix Form

A few definitions:

  - $\textbf{d}$: a $k \times 1$ vector of data elements
  - $\textbf{A}$: an $n \times k$ Vandermonde matrix, with $n \ge k$
  - $\textbf{e}$: an $n \times 1$ vector of encoded elements

Encoding is:

$$ \textbf{e} = \textbf{A} \textbf{d} $$

Decoding is:

  1. Erase rows from $\textbf{A}$ and $\textbf{e}$ until we have a square Vandermonde $\textbf{A}_r$ and the corresponding $\textbf{e}_r$
  2. Solve the linear system for $\textbf{d}$:

$$ \textbf{A}_r \textbf{d} = \textbf{e}_r $$

It's exactly that simple!

A Worked Example

Let's work through an example to drive the idea home. We'll use the $5 \times 3$ Vandermonde matrix we constructed above:

$$ \textbf{A} = \left[ \begin{array}{ccc} 1 & 1 & 1^2 \\ 1 & 2 & 2^2 \\ 1 & 3 & 3^2 \\ 1 & 4 & 4^2 \\ 1 & 5 & 5^2 \\ \end{array} \right] = \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right] $$

But, for fun, let's use a new data vector:

$$ \textbf{d} = \left[ \begin{array}{ccc} 6 \\ 0 \\ 2 \end{array} \right] $$

Let's encode:

$$ \textbf{e} = \textbf{A} \textbf{d} = \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right] \left[ \begin{array}{ccc} 6 \\ 0 \\ 2 \end{array} \right] = \left[ \begin{array}{ccc} 8 \\ 14 \\ 24 \\ 38 \\ 56 \\ \end{array} \right] $$

Easy peasy.

Okay, let's decode it. We'll erase the first two rows to get a square $3 \times 3$ matrix:

$$ \textbf{A}_r = \left[ \begin{array}{ccc} 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right]\\ \textbf{e}_r = \left[ \begin{array}{ccc} 24 \\ 38 \\ 56 \\ \end{array} \right] $$

Let's solve the system $\textbf{A}_r \textbf{d} = \textbf{e}_r$ using an $\textbf{A}_r = \textbf{L}\textbf{U}$ factorization:

$$ \left[ \begin{array}{ccc} 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right] = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \\ \end{array} \right] \left[ \begin{array}{ccc} 1 & 3 & 9 \\ 0 & 1 & 7 \\ 0 & 0 & 2 \\ \end{array} \right]\\ $$

Solving $\textbf{L}\textbf{y} = \textbf{e}_r$

$$ \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \\ \end{array} \right] \textbf{y} = \left[ \begin{array}{ccc} 24 \\ 38 \\ 56 \\ \end{array} \right] $$ $$ \textbf{y} = \left[ \begin{array}{ccc} 24 \\ 14 \\ 4 \\ \end{array} \right] \\ $$

Solving $\textbf{U}\textbf{d} = \textbf{y}$

$$ \left[ \begin{array}{ccc} 1 & 3 & 9 \\ 0 & 1 & 7 \\ 0 & 0 & 2 \\ \end{array} \right] \textbf{d} = \left[ \begin{array}{ccc} 24 \\ 14 \\ 4 \\ \end{array} \right] \\ $$ $$ \textbf{d} = \left[ \begin{array}{ccc} 6 \\ 0 \\ 2 \\ \end{array} \right] \\ $$

As desired $\square$

Exercise: Repeat this example, but erase the last two rows. You should still recover the $\textbf{d}$ vector.

Moving beyond Vandermonde

Above we just used ordinary linear algebra to do the same operations as polynomial evaluation and polynomial interpolation without a single polynomial in sight! The reason we can do this all comes back to the invertibility of Vandermonde matrices.

But this leads to new ideas. Why does it have to be Vandermonde? Are there other matrices with the same invertibility properties? Can we use our linear algebra tools to construct one?

What kind of matrix would be useful?

Systematic encoding matrices

One very useful $5 \times 3$ encoding matrix looks like:

$$ \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \\ d & e & f \\ \end{array} \right] $$

This is called a systematic encoding matrix because the first $k$ encoded elements are identical to the unencoded data:

$$ \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \\ d & e & f \\ \end{array} \right] \left[ \begin{array}{ccc} 6 \\ 7 \\ 8 \\ \end{array} \right] = \left[ \begin{array}{ccc} 6 \\ 7 \\ 8 \\ p \\ q \\ \end{array} \right] $$

The additional elements $p$ and $q$, produced by the last two rows, are commonly called parity (for historical reasons).

On encode, we only have to compute the parity elements ($p$ and $q$). And our decode is a trivial identity function unless one of the original data elements is erased. These features make systematic codes very popular in data-storage systems, where storage devices fail infrequently.

Constructing systematic matrices

There are many ways to construct such a systematic matrix. We'll demonstrate a method that starts with a Vandermonde matrix and transforms it into a systematic matrix.

Recall our $5 \times 3$ vandermonde matrix:

$$ \textbf{A} = \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right] $$

Let's invert the upper $3 \times 3$ square:

$$ \textbf{A}_r^{-1} = \left( \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ \end{array} \right] \right) ^ {-1} = \left[ \begin{array}{ccc} 3 & -3 & 1 \\ -2.5 & 4 & -1.5 \\ 0.5 & -1 & 0.5 \\ \end{array} \right] $$

Now we can multiply with $\textbf{A}$:

$$ \begin{align} \textbf{A} \textbf{A}_r^{-1} &= \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ \end{array} \right] \left[ \begin{array}{ccc} 3 & -3 & 1 \\ -2.5 & 4 & -1.5 \\ 0.5 & -1 & 0.5 \\ \end{array} \right] \nonumber \\ &= \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \\ 3 & -8 & 6 \\ \end{array} \right] \nonumber \end{align} $$

And now we've obtained the desired "parity" encoding rows. Note that this transformation preserves the key invertibility property: any 3 rows of $\textbf{A} \textbf{A}_r^{-1}$ are the corresponding 3 rows of $\textbf{A}$ (an invertible Vandermonde square) multiplied by the invertible $\textbf{A}_r^{-1}$, and a product of invertible matrices is invertible. The parity rows are:

$$ \left[ \begin{array}{ccc} 1 & -3 & 3 \\ 3 & -8 & 6 \\ \end{array} \right] $$
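As a quick check, let's encode our earlier data vector $\textbf{d} = [6, 0, 2]^T$ with the full systematic matrix. The data passes through unchanged and two parity elements are appended:

$$ \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \\ 3 & -8 & 6 \\ \end{array} \right] \left[ \begin{array}{ccc} 6 \\ 0 \\ 2 \\ \end{array} \right] = \left[ \begin{array}{ccc} 6 \\ 0 \\ 2 \\ 12 \\ 30 \\ \end{array} \right] $$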

Implementing Reed-Solomon

Let's implement all of this in code! Like the previous article, we'll implement over GF(256). And we'll use the linear algebra routines developed in part 9.

Source Code: src/reed_solomon_linalg.rs

#![allow(non_snake_case)]
#![allow(dead_code)]
use crate::gf_256::GF;
use crate::linalg::Matrix;

As before, we need some parameters:

const K: usize = 3;  // Number of data elements (before encoding)
const N: usize = 5;  // Number of encoded elements

We also need to specify the evaluation points, and we need N of them. For this variant of Reed-Solomon, it really doesn't matter which points we pick. They just need to be distinct elements of GF(256).

const X: [GF; 5] = [
    GF::new(42), GF::new(222), GF::new(2), GF::new(8), GF::new(99)
];

Encoding and Decoding with a Matrix

For encoding, we'll use a generic matrix A. This matrix should have the required invertibility properties.

We just need to perform a simple multiply (e = A*d):

pub fn reed_solomon_linalg_encode(A: &Matrix, data: &[GF]) -> Vec<GF> {
    // Sanity check
    assert_eq!(data.len(), K);

    // Construct the data vector (d) as a K x 1 matrix
    let d = Matrix::new(K, 1, data.to_vec());

    // Encode: e = A*d
    let e = A.matmul(&d);

    // Return the encoded elements
    e.data().to_vec()
}

Decoding will use the same A matrix, removing any rows for erased elements.

We just need to solve a linear system (solve A_r * d = e_r for d):

pub fn reed_solomon_linalg_decode(A: &Matrix, encoded: &[Option<GF>]) -> Option<Vec<GF>> {
    // Sanity check
    assert_eq!(encoded.len(), N);

    // First, we need to gather up evaluations that haven't been erased.
    let mut evals = vec![];
    let mut idx = vec![];
    for i in 0..N {
        if let Some(value) = encoded[i] {
            evals.push(value);
            idx.push(i);
        }
    }

    // Make sure we have enough evaluations to decode
    if evals.len() < K {
        return None; // Too many erasures, can't decode!
    }

    // Select the first K surviving rows of A to construct the
    // square matrix (A_r) and the corresponding vector (e_r)
    let A_r = A.select_rows(&idx[..K]);
    let e_r = Matrix::new(K, 1, evals[..K].to_vec());

    // Decode by linear solve: A_r * d = e_r
    let d = A_r.solve(&e_r).unwrap();

    // Return the decoded data
    Some(d.data().to_vec())
}

Vandermonde Matrices

We'll build a Vandermonde matrix out of the N evaluation points (xs):

pub fn vandermonde_encoding_matrix(xs: &[GF]) -> Matrix {
    // Sanity check
    assert_eq!(xs.len(), N);

    // The resulting matrix will be an N x K Vandermonde
    let mut mat = Matrix::zeros(N, K);

    // Build up one row at a time
    for i in 0..N {
        // Compute each value in the row: 1, x, x^2, x^3, etc
        let mut value = GF::new(1);
        for j in 0..K {
            mat[(i,j)] = value;
            value = value * xs[i];
        }
    }

    mat
}

Testing Time

Just like the polynomial version, we should be able to decode up to 2-element erasures. But, 3-element erasures will fail.

Let's test each possibility.

#[cfg(test)]
fn encode_decode_all(A: &Matrix, data: &[GF], expected_enc: &[GF]) {

    // encode
    let enc = reed_solomon_linalg_encode(A, data);
    assert_eq!(enc, expected_enc);
    let recv_all: Vec<_> = enc.iter().map(|x| Some(*x)).collect();

    // decode with no erasures: success!
    assert_eq!(reed_solomon_linalg_decode(A, &recv_all), Some(data.to_vec()));

    // decode with all one element erasures: success!
    for i in 0..N {
        let mut recv = recv_all.clone();
        recv[i] = None;
        assert_eq!(reed_solomon_linalg_decode(A, &recv), Some(data.to_vec()));
    }

    // decode with all two element erasures: success!
    for i in 0..N {
        for j in (i+1)..N {
            let mut recv = recv_all.clone();
            recv[i] = None;
            recv[j] = None;
            assert_eq!(reed_solomon_linalg_decode(A, &recv), Some(data.to_vec()));
        }
    }

    // decode with all three element erasures: failure!
    for i in 0..N {
        for j in (i+1)..N {
            for k in (j+1)..N {
                let mut recv = recv_all.clone();
                recv[i] = None;
                recv[j] = None;
                recv[k] = None;
                assert_eq!(reed_solomon_linalg_decode(A, &recv), None);
            }
        }
    }
}

Now we can test a bunch of different data vectors. Note: these are the exact same vectors as in reed_solomon_poly.rs, giving an empirical demonstration that the two implementations are equivalent.

#[cfg(test)]
#[test]
fn test_vandermonde_encode_decode() {
    // construct our vandermonde encoding matrix
    let A = vandermonde_encoding_matrix(&X);

    // test: trivial
    encode_decode_all(&A,
        &[GF::new(0), GF::new(0), GF::new(0)],
        &[GF::new(0), GF::new(0), GF::new(0), GF::new(0), GF::new(0)],
    );

    // test: ones
    encode_decode_all(&A,
        &[GF::new(1), GF::new(1), GF::new(1)],
        &[GF::new(3), GF::new(161), GF::new(7), GF::new(73), GF::new(160)],
    );

    // test: pattern
    encode_decode_all(&A,
        &[GF::new(100), GF::new(150), GF::new(200)],
        &[GF::new(160), GF::new(135), GF::new(94), GF::new(104), GF::new(194)],
    );

    // test: random
    encode_decode_all(&A,
        &[GF::new(216), GF::new(196), GF::new(171)],
        &[GF::new(81), GF::new(157), GF::new(209), GF::new(193), GF::new(105)],
    );
}

Systematic Matrices

We will now construct systematic matrices by transforming an existing A matrix using the A * inv(A_r) approach:

pub fn systematic_encoding_matrix(A: &Matrix) -> Matrix {
    // Compute the inverse of the upper K x K square
    let inv = A.slice_rows(0..K).inverse().unwrap();

    // Multiply to obtain the systematic matrix: A * inv(A_r)
    A.matmul(&inv)
}

Testing again

We can completely reuse the encoding and decoding routines for the systematic matrices. This is quite inefficient, as it ignores the identity structure: encoding recomputes the data rows and decoding performs a full linear solve even when no data elements are erased. In practice, you'd instead develop hardcoded routines for the specific parity rows. But for our pedagogical purposes, this approach is okay.
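For a flavor of what a specialized routine might look like, here's a minimal sketch of a parity-only encoder. This is a hypothetical helper (not part of the module above) that reuses the same Matrix and GF types; it passes the data through unchanged and multiplies only the parity rows:

pub fn systematic_encode_sketch(A: &Matrix, data: &[GF]) -> Vec<GF> {
    // Sanity checks: A must start with a K x K identity
    assert_eq!(data.len(), K);
    assert_eq!(A.slice_rows(0..K), Matrix::identity(K));

    // The first K encoded elements are just the data itself
    let mut out = data.to_vec();

    // Compute only the parity elements: p = A[K..N] * d
    let d = Matrix::new(K, 1, data.to_vec());
    let p = A.slice_rows(K..N).matmul(&d);
    out.extend_from_slice(p.data());

    out
}

A production implementation would go further and hardcode the parity coefficients, but even this version avoids the wasted multiplies against the identity rows.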

Let's demonstrate these with a test, just like before.

#[cfg(test)]
#[test]
fn test_systematic_encode_decode() {
    // construct our systematic matrix by transforming a vandermonde matrix
    let A = systematic_encoding_matrix(&vandermonde_encoding_matrix(&X));

    // The A matrix should start with a K x K identity
    assert_eq!(A.slice_rows(0..K), Matrix::identity(K));

    // The remaining N-K rows should be parity
    assert_eq!(A.slice_rows(K..N), Matrix::new(N-K, K, vec![
        GF::new(146), GF::new(30),  GF::new(141),
        GF::new(155), GF::new(137), GF::new(19),
    ]));

    // test: trivial
    encode_decode_all(&A,
        &[GF::new(0), GF::new(0), GF::new(0)],
        &[GF::new(0), GF::new(0), GF::new(0), GF::new(0), GF::new(0)],
    );

    // test: ones
    encode_decode_all(&A,
        &[GF::new(1), GF::new(1), GF::new(1)],
        &[GF::new(1), GF::new(1), GF::new(1), GF::new(1), GF::new(1)],
    );

    // test: pattern
    encode_decode_all(&A,
        &[GF::new(100), GF::new(150), GF::new(200)],
        &[GF::new(100), GF::new(150), GF::new(200), GF::new(64), GF::new(57)],
    );

    // test: random
    encode_decode_all(&A,
        &[GF::new(216), GF::new(196), GF::new(171)],
        &[GF::new(216), GF::new(196), GF::new(171), GF::new(31), GF::new(66)],
    );
}

Linear Independence and Projection to K-dimensions

As this article shows, viewing Reed-Solomon codes as linear algebra can be very powerful. This view allows us to bring the tools of linear algebra to bear on erasure coding.

One way to view a Vandermonde matrix is as K linearly independent vectors in an N-dimensional space. These vectors have the fascinating property that we can project them down to K dimensions (by removing rows) and they remain linearly independent! In this way, we can view Reed-Solomon encoding as a simple linear combination of basis vectors with this special property!

This observation leads us to non-Vandermonde encoding matrices that have this same key property. We saw previously that we could construct Systematic Matrices. But, we could also use Cauchy matrices and these can lead to an interesting class of codes.

We can also develop bespoke encoding matrices. If the size is small enough, it's tractable for a computer to verify that all square $K \times K$ sub-matrices are invertible.
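For example, here's a minimal sketch of such a brute-force check for our N = 5, K = 3 parameters (a hypothetical test helper; it assumes inverse() returns an Option, as its use with unwrap() above suggests):

#[cfg(test)]
fn all_square_submatrices_invertible(A: &Matrix) -> bool {
    // Enumerate every K-subset of the N rows (fine for small N)
    for i in 0..N {
        for j in (i+1)..N {
            for k in (j+1)..N {
                // If any K x K row-selection fails to invert, A is unusable
                if A.select_rows(&[i, j, k]).inverse().is_none() {
                    return false;
                }
            }
        }
    }
    true
}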

Computational Complexity

We now have two equivalent approaches to Reed-Solomon encoding and decoding. Let's compare their computational complexity.

First, some definitions:

  - $n$: the number of data elements (K above)
  - $m$: the number of encoded elements (N above)
  - $p$: the number of parity elements, $m - n$ (N - K above)

| Operation | Runtime Complexity | Rationale |
| --- | --- | --- |
| Encoding (Polynomial) | $O(mn)$ | Each evaluation is $O(n)$ and we perform $m$ evaluations |
| Encoding (Linear Algebra) | $O(mn)$ | A matrix-vector multiply is $O(mn)$, i.e. each inner-product is $O(n)$ and we perform $m$ of them |
| Encoding (Systematic) | $O(pn)$ | Only the parity rows need to be computed |
| Decoding (Polynomial) | $O(n^2)$ | The inner loop of interpolation is $O(n)$ and we interpolate $n$ points |
| Decoding (Linear Algebra) | $O(n^3)$ | An LU factorization is $O(n^3)$ and an LU solve is $O(n^2)$ |
| Decoding (Systematic) | $O(p^3 + np)$ | Solving a $p \times p$ system after eliminating known data |

The above table shows that, while encoding is essentially equivalent, decoding is not. This is a well-known effect of casting problems in linear-algebra form: linear algebra is a very general toolset, and specific problems often have special structure that yields better algorithms.

Despite this disadvantage, reasoning from a linear algebra perspective can still be very useful, as discussed above.

Also, consider the case of decoding a systematic code using parity rows. In many cases, we may only need a few parity rows to recover the full data vector. In other words, $p$ is small and $O(p^3 + np)$ approaches $O(n)$, giving an effectively linear decoding time!
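In the extreme case where no data elements are erased, systematic decoding is a straight copy. Here's a minimal sketch of that fast path (a hypothetical helper reusing K and GF from the implementation above); it returns None whenever a data element is missing and a full solve is required:

pub fn systematic_decode_fast_path(encoded: &[Option<GF>]) -> Option<Vec<GF>> {
    // If all of the first K (data) elements survived, decoding is a copy
    encoded[..K].iter().copied().collect()
}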

But, if $n$ is large, can we do better than $O(mn)$ encoding and $O(n^2)$ decoding? Yes, we actually can. But we'll have to wait for next time!

Conclusion

Well, it's now time to say goodbye.

Here, we've explored Reed-Solomon from the perspective of Linear Algebra. We barely even touched a polynomial in this article and came away with an implementation that passed the exact same tests as the polynomial version! The linear algebra variant is shockingly simple, but pays for the luxury in decoding performance.

However, the linear algebra view is very useful for reasoning and allows us to introduce a wider variety of possible encoding matrices.

Linear Algebra will also be very useful in the next article where we'll discuss a third view of Reed-Solomon that will help us achieve very fast encoding and decoding.

What's your frequency, Kenneth? (soon)