Learn you Galois Fields for Great Good (09)

Linear Algebra over Finite Fields

This is part 9 of a series on Abstract Algebra. In this part, we'll start doing Linear Algebra over Finite Fields.

In the introduction to this series I stated that the numbers don't matter, but normal math rules still work. In this section, we'll really drive that point home by using our finite fields to perform ordinary linear algebra.

Now, this series is not focused on linear algebra, so we're not going to deeply explain core linear algebra concepts or standard algorithms. If you find this section hard to follow and wish to brush up on your linear algebra, here are couple good resources:

Video Lectures: Gilbert Strang's MIT 18.06 (youtube.com)
Book: Chapter 2 of Scientific Computing by Michael Heath (siam.org)

From this point on, we'll assume you have the required linear algebra prerequisites.

A Vector Space over an Arbitrary Field

For engineers, linear algebra is often introduced in the context of real numbers. In practice, one will typically end up using linear algebra with floating-point arithmetic on a computer. But, generally this is the extent of their exposure.

However, there's nothing that constrains linear algebra to just real numbers or its weird non-associative floating-point forms.

The concept of linear algebra in mathematics comes from the definition of a vector space. An element in a vector space can take many structural forms, but for our purposes we can stick with the familiar ordered array of elements form. But instead of the elements being real numbers, they will be elements of some finite field.

Over an arbitrary field, we have the concepts of:

Linear combinations
Linear independence
Linear subspaces
Spans
Bases

Unfortunately, we do not have the concepts of:

Normality / Unit Vectors (requires a norm, i.e. Banach Space)
Orthogonality (requires an inner product, i.e. Hilbert Space).

But, these limitations will not prove to be a big issue for practical applications.

Vectors of `GF(256)` elements

For this implementation we'll use the GF(256) field. This will just be a variant of our GF(2^k) fields using the irreducible polynomial Q = x^8 + x^4 + x^3 + x + 1 (decimal: 283). This is the same field used by AES Encryption.

GF(256) is a very useful field because each element maps exactly to a single byte (u8). Thus a vector of elements is simply an array of bytes. Therefore we can perform linear algebra over ordinary data byte buffers! How cool!

For this, we'll use an implementation based on our GF(2^k) fields with parameters hardcoded for GF(256). There is nothing new worth noting in this implementation. You can read the source code here: src/gf_256.rs

Implementing Linear Algebra

Okay, let's get started implementing a linear algebra library using matrices over GF(256).

Source Code: `src/linalg.rs`

#![allow(non_snake_case)]
use crate::gf_256::GF;
use std::ops::{Index, IndexMut, Range};

Matrix structure

First, we will define a matrix type. Each matrix may have a different number of rows and columns. We'll store the matrix elements in row-major ordering. We will use the shape parameters to calculate offsets into the flattened array.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Matrix {
    rows: usize,
    cols: usize,
    dat: Vec<GF>, // elements stored in row-major order
}

We'll have two primitive ways to construct a matrix: (1) from explicit data elements or (2) initialized to all zeros

impl Matrix {
    pub fn new(rows: usize, cols: usize, dat: Vec<GF>) -> Self {
        assert_eq!(rows * cols, dat.len());
        Self { rows, cols, dat }
    }

    pub fn zeros(rows: usize, cols: usize) -> Self {
        let dat = vec![GF::new(0); rows * cols];
        Self { rows, cols, dat }
    }
}

We'll also need a few simple accessors:

impl Matrix {
    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    pub fn data(&self) -> &[GF] {
        &self.dat
    }
}

Element Indexing

Let's now implement matrix element indexing.

An m x n matrix is indexed by a tuple (i,j) where 0 <= i < m and 0 <= j < n.

Syntactically:

mat[(0,2)] accesses the element in row 0 and column 2
mat[(5,0)] accesses the element in row 5 and column 0

impl Index<(usize, usize)> for Matrix {
    type Output = GF;
    fn index(&self, idx: (usize, usize)) -> &GF {
        assert!(idx.0 < self.rows);
        assert!(idx.1 < self.cols);
        &self.dat[idx.0 * self.cols + idx.1]
    }
}

impl IndexMut<(usize, usize)> for Matrix {
    fn index_mut(&mut self, idx: (usize, usize)) -> &mut GF {
        assert!(idx.0 < self.rows);
        assert!(idx.1 < self.cols);
        &mut self.dat[idx.0 * self.cols + idx.1]
    }
}

Identity Matrices

Now that we have indexing, we can write a routine to construct identity matrices. We'll start with all zeros, and then write 1 to the diagonal entries:

impl Matrix {
    pub fn identity(n: usize) -> Self {
        let mut m = Matrix::zeros(n, n);
        for i in 0..n {
            m[(i,i)] = GF::new(1);
        }
        m
    }
}

Slicing

We'll occasionally need to slice rows and columns of a matrix to form a new sub-matrix.

Syntactically:

mat.slice_rows(0..3) constructs a new matrix with rows 0 through 3 of the original matrix
mat.slice_cols(2..4) constructs a new matrix with columns 2 through 4 of the original matrix

impl Matrix {
    pub fn slice_rows(&self, r: Range<usize>) -> Matrix {
        let start = r.start;
        let end = r.end;
        assert!(end <= self.rows);

        let new_rows = end - start;
        let mut out = Matrix::zeros(new_rows, self.cols);
        for i in 0..new_rows {
            for j in 0..self.cols {
                out[(i, j)] = self[(start+i, j)];
            }
        }

        out
    }

    pub fn slice_cols(&self, r: Range<usize>) -> Matrix {
        let start = r.start;
        let end = r.end;
        assert!(end <= self.cols);

        let new_cols = end - start;
        let mut out = Matrix::zeros(self.rows, new_cols);
        for i in 0..self.rows {
            for j in 0..new_cols {
                out[(i, j)] = self[(i, start+j)];
            }
        }

        out
    }
}

Selection

Similar to slicing, we'll also want to select subsets of rows and columns by index

Syntactically:

mat.select_rows(&[0,1,2,4]) constructs a new matrix with rows 0, 1, 2, and 4 of the original matrix
mat.select_cols(&[3,2]) constructs a new matrix with columns 3 and 2 of the original matrix

impl Matrix {
    pub fn select_rows(&self, sel: &[usize]) -> Matrix {
        let new_rows = sel.len();
        let mut out = Matrix::zeros(new_rows, self.cols);
        for i in 0..new_rows {
            for j in 0..self.cols {
                out[(i, j)] = self[(sel[i], j)];
            }
        }
        out
    }

    pub fn select_cols(&self, sel: &[usize]) -> Matrix {
        let new_cols = sel.len();
        let mut out = Matrix::zeros(self.rows, new_cols);
        for i in 0..self.rows {
            for j in 0..new_cols {
                out[(i, j)] = self[(i, sel[j])];
            }
        }
        out
    }
}

Matrix multiplication

A matrix multiply can be decomposed into a sequence of multiplications and additions, and any field has these two operations. This means that we can perform matrix multiplication using field arithmetic!

The implementation isn't anything special either. It's just the textbook algorithm.

impl Matrix {
    pub fn matmul(&self, other: &Matrix) -> Matrix {
        // Multiply two matrices: (m x n) * (n x p) => (m x p)
        let m = self.rows;
        let n = self.cols;
        let p = other.cols;
        assert_eq!(self.cols, other.rows);

        let mut out = Matrix::zeros(m, p);
        for i in 0..m {
            for j in 0..p {
                let mut elem = GF::new(0);
                for k in 0..n {
                    elem = elem + self[(i,k)] * other[(k,j)];
                }
                out[(i,j)] = elem;
            }
        }

        out
    }
}

LU Factorization

Now, we wish to solve linear systems of equations in the form of Ax = b. We can also perform this operation over an arbitrary field!

The classic approach is to factor the A matrix into upper-triangular (U) and lower-triangular (L) parts. That is, we want to find A = LU. This can be done with Gaussian Elimination.

First, we'll define a struct to store the results of a factorization. In optimized implementations, the A matrix is typically mutated in-place to store the resulting L and U factors. But we'll expose them explicitly here for ease of understanding.

We'll also need a permutation vector for row exchanges. This vector maps the new row (used for pivoting) to the original row in the A matrix.

#[derive(Debug)]
pub struct LU {
    lower: Matrix,       // L: lower triangular matrix
    upper: Matrix,       // U: upper triangular matrix
    permute: Vec<usize>, // P: row permutation map (new_row => old_row)
}

Let's now implement Gaussian elimination. We will need to use partial pivoting because non-singular matrices can have zeros on the diagonal. But, unlike numerical linear algebra, we don't have to worry about numerical stability. Finite field arithmetic is exact. This is one of the many cool things about linear algebra over finite fields.

Note, the algorithm below is completely "textbook". I'm not going to explain it in detail. If you have trouble following, consider reviewing LU factorization over Real Numbers. The algorithm doesn't fundamentally change when using it over a finite field.

impl Matrix {
    pub fn factor_lu(&self) -> Option<LU> {
        // Sanity check the shape (a square matrix required in this implementation)
        assert_eq!(self.rows, self.cols);

        let mut A = self.clone();
        let rows = A.rows;
        let cols = A.cols;
        let mut L = Matrix::zeros(rows, cols);
        let mut U = Matrix::zeros(rows, cols);
        let mut P: Vec<_> = (0..rows).collect();

        // Loop over columns
        for k in 0..cols {
            // Search for a non-zero pivot. Unlike floating-point operations, we don't
            // have to worry about numerical issues. We only need to pivot because
            // non-singular matrices can still have zeros in the pivots. If the matrix
            // is non-singular, then the pivot column must have at least one non-zero
            // entry. This, partial pivoting is always sufficient!
            let mut found = false;
            for i in k..rows {
                if A[(P[i], k)] != GF::new(0) {
                    // Perform a swap of the permutation map
                    let save = P[k];
                    P[k] = P[i];
                    P[i] = save;

                    // All good!
                    found = true;
                    break;
                }
            }
            if !found {
                return None; // Matrix is singular!
            }

            // Retrieve the pivot element
            let pivot = A[(P[k], k)];

            // Copy the pivot row to the U matrix
            for j in k..cols {
                U[(k,j)] = A[(P[k],j)];
            }


            // Compute the multipliers and store in column of L
            L[(P[k],k)] = GF::new(1);
            for i in (k+1)..rows {
                L[(P[i],k)] = A[(P[i], k)] / pivot;
            }

            // Apply the transform (row subtraction to the sub-matrix)
            for i in (k+1)..rows {
                let m = L[(P[i],k)];
                for j in (k+1)..cols {
                    A[(P[i],j)] = A[(P[i],j)] - m * A[(P[k],j)];
                }
            }
        }

        Some(LU {
            lower: L,
            upper: U,
            permute: P,
        })
    }
}

Solving `Ax = b` using an LU Factorization

Once we have an LU Factorization, we can solve Ax = b as LUx = b. This is a much easier problem.

There are two steps:

Solve Ly = b for y (using forward-substitution)
Solve Ux = y for x (using back-substitution)

But, we'll need to be careful to use our permutation map to reorder the b vector in the forward-substitution step.

Let's start with forward-substitution. This is also completely "textbook".

fn solve_lower_triangular(L: &Matrix, b: &Matrix, permute: &[usize]) -> Matrix {
    // Sanity check the shape
    assert!(L.rows == L.cols);
    assert!(b.rows == L.rows);
    assert!(b.cols == 1);

    let n = L.rows;
    let mut b = b.clone();
    let mut y = Matrix::zeros(n, 1);
    for j in 0..n { // columns in L
        let elt = b[(permute[j],0)] / L[(permute[j],j)];
        y[(j,0)] = elt;

        for i in (j+1)..n { // walk down the column, subtracting off
            b[(permute[i],0)] = b[(permute[i], 0)] - L[(permute[i],j)] * elt;
        }
    }
    y
}

Next, we'll implement back-substitution. This is nearly the same as forward-substitution, but going in the opposite direction.

fn solve_upper_triangular(U: &Matrix, y: &Matrix) -> Matrix {
    // Sanity check the shape
    assert!(U.rows == U.cols);
    assert!(y.rows == U.rows);
    assert!(y.cols == 1);

    let n = U.rows;
    let mut y = y.clone();
    let mut x = Matrix::zeros(n, 1);
    for j in (0..n).rev() { // columns in L
        let elt = y[(j,0)] / U[(j,j)];
        x[(j,0)] = elt;

        for i in (0..j).rev() { // walk up the columns, subtracting off
            y[(i,0)] = y[(i, 0)] - U[(i,j)] * elt;
        }
    }
    x
}

With these two routines, we can now solve the full LUx = b problem by simply combining them.

impl LU {
    // Solve LUx = b
    pub fn solve(&self, b: &Matrix) -> Matrix {
        // Sanity check the shape
        assert!(self.lower.rows == b.rows);
        assert!(self.upper.rows == b.rows);
        assert!(self.upper.rows == self.upper.cols);
        assert!(self.lower.rows == self.lower.cols);
        assert!(b.cols == 1);

        // Ly = b: Forward solve with lower triangular matrix (using the permutation)
        let y = solve_lower_triangular(&self.lower, b, &self.permute);

        // Ux = y: Backward solve with upper triangular matrix
        let x = solve_upper_triangular(&self.upper, &y);

        // Done! The result is 'x'
        x
    }
}

And we can add a solver routine for Ax = b directly

impl Matrix {
    // Solve Ax = b with this matrix
    pub fn solve(&self, b: &Matrix) -> Option<Matrix> {
        Some(self.factor_lu()?.solve(b))
    }
}

Matrix Inverses

Solving linear systems is a core operation in linear algebra. We now have great power!

Let's use it to construct matrix inverses. This is as simple as solving the linear system Ax = b where b is each column of the identity matrix (i.e. each basis vector).

impl Matrix {
    // Compute the inversion of the matrix. Returns None if the matrix is singular.
    pub fn inverse(&self) -> Option<Matrix> {
        // Factor A into LU
        let lu = self.factor_lu()?;

        // Allocate the output inverse matrix
        let n = self.rows;
        let mut inv = Matrix::zeros(n, n);

        // Loop over each column of the output inverse matrix
        for j in 0..n {
            // Construct the ith basis vector
            let mut b = Matrix::zeros(n, 1);
            b[(j,0)] = GF::new(1);

            // Solve that basis vector
            let x = lu.solve(&b);

            // Copy the result into the ith column of the inverse matrix
            for i in 0..n {
                inv[(i, j)] = x[(i, 0)];
            }
        }

        Some(inv)
    }
}

Testing time

We omit the test listing in this article for cleanliness. We encourage you to read them in the source code.

A Leap of Faith / A Lack of Intuition

All the linear algebra we implemented here just works. But the numbers are weird and it throws off the intuition. It's just something you get used to and learn to trust the rules.

Consider matrix A (in GF(256)):

$$ A = \left[ \begin{array}{ccc} 50 & 123 \\ 201 & 66 \\ \end{array} \right] $$

If we compute its LU Factorization, we'll find:

$$ A = LU = \left[ \begin{array}{ccc} 1 & 0 \\ 150 & 1 \\ \end{array} \right] \left[ \begin{array}{ccc} 50 & 123 \\ 0 & 123 \\ \end{array} \right] $$

Solving for the first basis vector (using LUx = b), we find

$$ \left[ \begin{array}{ccc} 1 & 0 \\ 150 & 1 \\ \end{array} \right] \left[ \begin{array}{ccc} 50 & 123 \\ 0 & 123 \\ \end{array} \right] x = \left[ \begin{array}{ccc} 1 \\ 0 \\ \end{array} \right] $$

$$ x = \left[ \begin{array}{ccc} 105 \\ 89 \\ \end{array} \right] $$

Solving for the second basis vector (using LUx = b), we find

$$ \left[ \begin{array}{ccc} 1 & 0 \\ 150 & 1 \\ \end{array} \right] \left[ \begin{array}{ccc} 50 & 123 \\ 0 & 123 \\ \end{array} \right] x = \left[ \begin{array}{ccc} 0 \\ 1 \\ \end{array} \right] $$

$$ x = \left[ \begin{array}{ccc} 146 \\ 6 \\ \end{array} \right] $$

So, we find inv(A) to be:

$$ A^{-1} = \left[ \begin{array}{ccc} 105 & 146 \\ 89 & 6 \\ \end{array} \right] $$

None of this seems intuitive. None of the numbers "make sense". But, does it behave as an inverse? If we multiply A * inv(A) do we get the identity?

$$ \begin{align} A * A^{-1} &= \left[ \begin{array}{ccc} 50 & 123 \\ 201 & 66 \\ \end{array} \right] \left[ \begin{array}{ccc} 105 & 146 \\ 89 & 6 \\ \end{array} \right] \nonumber \\ &= \left[ \begin{array}{ccc} 50*105+123*89 & 50*146+123*6 \\ 201*105+66*89 & 201*146+66*6 \\ \end{array} \right] \nonumber \\ &= \left[ \begin{array}{ccc} 1 & 0 \\ 0 & 1 \\ \end{array} \right] \nonumber \end{align} $$

Such is the magic of finite fields! 😊

Conclusion

Well it's that time again! Another episode on Abstract Algebra is complete.

We took some pretty big steps here:

Introduced the idea that linear algebra can be done over any Field, not just the Real Numbers
Introduced GF(256) to do arithmetic over ordinary data bytes (u8)
Implemented matrix multiplication over data bytes (u8)
Implemented a linear system solver over data bytes (u8)

Hopefully you're starting to see the kind of power this gives us to process data. When I was first learning about finite fields, this was my moment of "wow, this is awesome!"

Fascinatingly, linear algebra works better over bytes (u8) than it does over floating-point (f32)! Over floating-point, there is always a big focus on numerical stability since the floating-point numbers are just a crude approximation to the real number field. But, finite fields are not an approximation. They are precise, the way maths intended!

We will take big advantage of this precise nature in the next article.

Forward to a world without error where memories can be erased!

xorvoid

Learn you Galois Fields for Great Good (09)

Linear Algebra over Finite Fields

A Vector Space over an Arbitrary Field

Vectors of GF(256) elements

Implementing Linear Algebra

Source Code: src/linalg.rs

Matrix structure

Element Indexing

Identity Matrices

Slicing

Selection

Matrix multiplication

LU Factorization

Solving Ax = b using an LU Factorization

Matrix Inverses

Testing time

A Leap of Faith / A Lack of Intuition

Conclusion

Vectors of `GF(256)` elements

Source Code: `src/linalg.rs`

Solving `Ax = b` using an LU Factorization