Adapting Matrix Algorithms for multivectors

Eric Wieser, Hugo Hadfield
Cambridge University Engineering Department

slides: eric-wieser.github.io/agacse-2021 2021-09-10

Hugo Hadfield was going to prepare and present this presentation…

…but had other ideas!

Introduction

Implementations of many matrix algorithms are widely available.

Languages with LAPACK bindings

Python (numpy, scipy)
Matlab
Julia
R
…

Algorithms of particular interest

Square root
Exponential
Logarithm
Eigen-decomposition
…

Can we exploit the existing libraries for multivector computation?

Can we reimplement the algorithms to form a new GAPACK?

Matrix representation of multivectors

Consider a multivector $A$ in $\mathcal{G}(\mathbb{R}^2)$: $$A = a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}$$

There's an obvious linear equivalence to coefficients of a basis, $\mathcal{G}(\mathbb{R}^n) \simeq \mathbb{R}^{2^{(p+q+r)}}$, we'll call $\operatorname{v}$:

$$\operatorname{v}(A) = \left[\begin{smallmatrix}a \\ a^1 \\ a^2 \\ a^{1,2}\end{smallmatrix}\right]$$

$$\begin{aligned} \color{red}{\operatorname{v}(0)} &\color{red}{{}= 0}\\ \color{red}{\operatorname{v}(A + B)} &\color{red}{{}= \operatorname{v}(A) + \operatorname{v}(B)} \\ \color{red}{\operatorname{v}(\alpha A)} &\color{red}{{}= \alpha\operatorname{v}(A)} \end{aligned}$$

Of greater interest is the embedding into a matrix algebra, $\mathcal{G}(\mathbb{R}^{p,q,r}) \to \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}}$, we'll call $\operatorname{M}$:

$$\operatorname{M}(A) = \left[\begin{smallmatrix} a & a^1 & a^2 & -a^{1,2} \\ a^1 & a & a^{1,2} & -a^2 \\ a^2 & -a^{1,2} & a & a^1 \\ a^{1,2} & -a^2 & a^1 & a \\ \end{smallmatrix}\right]$$

$$\begin{aligned} \color{red}{\operatorname{M}(0)} &\color{red}{{} = 0} \\ \color{red}{\operatorname{M}(A + B)} &\color{red}{{} = \operatorname{M}(A) + \operatorname{M}(B)} \\ \color{red}{\operatorname{M}(\alpha A)} &\color{red}{{} = \alpha\operatorname{M}(A)} \end{aligned}$$

$$\begin{aligned} \color{blue}{\operatorname{M}(1)} &\color{blue}{{} = I = 1} \\ \color{blue}{\operatorname{M}(AB)} &\color{blue}{{} = \operatorname{M}(A)\operatorname{M}(B)} \\ \operatorname{M}(\tilde A) &= \operatorname{M}(A)^T \\ \operatorname{v}(AB) &= \operatorname{M}(A)\operatorname{v}(B) \\ \end{aligned}$$

Notably, this is not only linear, but also a morphism of algebras.

Matrix representation of multivectorsReversing the mapping

$$\operatorname{M}(a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}) = \left[\begin{smallmatrix} a & a^1 & a^2 & -a^{1,2} \\ a^1 & a & a^{1,2} & -a^2 \\ a^2 & -a^{1,2} & a & a^1 \\ a^{1,2} & -a^2 & a^1 & a \\ \end{smallmatrix}\right]$$

Clearly the mapping $\operatorname{M}$ is not surjective and so cannot have a two-sided inverse, but we can define:

$$\operatorname{M}^{-1}(X) = \operatorname{v}^{-1}(X_{*,1})$$

where $\operatorname{v}^{-1}$ reassembles a multivector from a vector of coefficients and is the two-sided inverse to $\operatorname{v}$:

$$\operatorname{v}^{-1}\left(\left[\begin{smallmatrix}a \\ a^1 \\ a^2 \\ a^{1,2}\end{smallmatrix}\right]\right) = a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}$$

and $X_{*,1}$ is the first column of the matrix $X$. By inspection, this forms a left-inverse to $\operatorname{M}$ as it satisfies:

$$\operatorname{M}^{-1}(\operatorname{M}(A)) = A$$

We are not interested in them here, but there are more sophisticated choices of $\operatorname{M}^{-1}$ which satisfy various additional properties such as $\operatorname{M}^{-1}(X^T) = \operatorname{M}^{-1}(X)^{\sim}$.

Wrapping Matrix Algorithms

When reusing a matrix algorithm $f$, we're looking for a multivector algorithm $\operatorname{wrap}[f]$ that makes the following diagram commute:

$$\begin{CD} \mathcal{G}(\mathbb{R}^{p,q,r}) @>>\operatorname{wrap}[f]> \mathcal{G}(\mathbb{R}^{p,q,r}) \\ @V\operatorname{M}VV @V\operatorname{M}VV \\ \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}} @>>f> \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}} \\ \end{CD}$$

Simple approach; define:

$$\operatorname{wrap}[f](A) := \operatorname{M}^{-1}(f(\operatorname{M}(A)))$$

Whether this produces meaningful results depends on the properties of $f$.

Wrapping Matrix AlgorithmsExample: exponentiation

The exponential function can be defined on any ring via the Taylor expansion,

$$\operatorname{exp}(A) = \sum_0^\infty \frac{A^i}{i!}$$

If we pass this through $\operatorname{M}$, we find

$$\operatorname{M}(\operatorname{exp} A) = \sum_0^\infty \operatorname{M}\left(\frac{A^i}{i!}\right) = \sum_0^\infty \frac{\operatorname{M}(A)^i}{i!} = \operatorname{exp} \operatorname{M}(A)$$

This means that our “wrapped” algorithm is correct:

$$\begin{aligned} \operatorname{wrap}[\operatorname{exp}](A) &= \operatorname{M}^{-1}(\operatorname{exp}(\operatorname{M}(A))) \\ &= \operatorname{M}^{-1}(\operatorname{M}(\operatorname{exp} A)) \\ &= \operatorname{exp} A \end{aligned}$$

Any real algorithm $E$ for computing the exponent of a matrix approximates the taylor expansion; so $\operatorname{wrap}[E]$ will too. We can use existing implementations of matrix exponentiation such as the scipy.linalg.expm function in the scipy package for the Python programming language.

Wrapping Matrix AlgorithmsExample: square root

The square root is more problematic, as it is not unique. While it's easy to show

$$\operatorname{M}(\sqrt{A})\operatorname{M}(\sqrt{A}) =\operatorname{M}(\sqrt{A}\sqrt{A}) = \operatorname{M}(A) \implies \operatorname{M}(\sqrt{A}) = \sqrt{\operatorname{M}(A)},$$

the $\sqrt{\phantom{1}}$ on the right is just one of many possible matrix square roots, and may or may not be the one computed by a particular algorithm.

Consider:

$$\begin{aligned} S &= \left[\begin{smallmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{smallmatrix}\right]& \implies S^2 &= I = \operatorname{M}(1) & \implies S &= \sqrt{\operatorname{M}(1)}\end{aligned}$$

Here $S$ is no longer in the range of $\operatorname{M}$, so we cannot in general assume that applying $\operatorname{M}^{-1}$ is valid (although in this case it still is).

Paths forward

Prove or disprove the validity of $\operatorname{M}^{-1}$ for all possible $\sqrt{\phantom{1}}$s.
Choose a specific $\sqrt{\phantom{1}}$ algorithm that ensures validity.

Reimplementing Matrix Algorithms

Idea: instead of reusing an implementation of an algorithm, reimplement it for multivectors.

Disadvantages

Doesn't build upon tried, tested, and optimized libraries like LAPACK.
Requires writing new code!
Some matrix operations have no directly analogous GA operation (eg LU decomposition).

Advantages

No redundant coefficients.
Better runtime:

Algebra	$A : \mathcal{G}(\mathbb{R}^n)$	$\operatorname{M}(A) : \mathbb{R}^{N\times N}$
Addition	$O(N)$	$O(N^2)$
Multiplication	$O(N^2)$	$O(N^3)$

($n = p+q+r$, $N = 2^n$)

Less memory consumption.
Potential for multivector-specific optimizations.
No need to worry about whether $\operatorname{M}^{-1}$ is valid.

Reimplementing Matrix AlgorithmsExample: square root

One of many algorithms from [1] as an example, which computes $\lim_{k\to\infty}X_k = \sqrt{B}$:

Product form Denman and Beavers (DB) iteration (scaled)
$$\begin{aligned} \mu_k &= \left|\operatorname{det}(M_k)\right|^{-1/(2n)} \\ M_{k+1} &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k + \mu_k^{-2}M_k^{-1}}{2}\big), & M_0 &= B, \\ X_{k+1} &= \tfrac{1}{2}\mu_k X_k\left(I + \mu_k^{-2}M_k^{-1}\right), & X_0 &= B \end{aligned}$$

Let $M_k = \operatorname{M}(M_k')$, $X_k = \operatorname{M}(X_k')$, and $B = \operatorname{M}(A)$, where $M_k', X_k', A \in \mathcal{G}(\mathbb{R}^n)$:

$$\begin{aligned} \mu_k &= \left|\operatorname{det}(\operatorname{M}(M_k'))\right|^{-1/(2n)} \\ \operatorname{M}(M_{k+1}') &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 \operatorname{M}(M_k') + \mu_k^{-2}\operatorname{M}(M_k')^{-1}}{2}\big), & \operatorname{M}(M_0') &= \operatorname{M}(A), \\ \operatorname{M}(X_{k+1}') &= \tfrac{1}{2}\mu_k \operatorname{M}(X_k')\left(I + \mu_k^{-2}\operatorname{M}(M_k')^{-1}\right), & \operatorname{M}(X_0') &= \operatorname{M}(A) \end{aligned}$$

$$\begin{aligned} \mu_k &= \left|\operatorname{det{\scriptsize\star}}(M_k')\right|^{-1/(2n)} \\ \operatorname{M}(M_{k+1}') &= \operatorname{M}\big(\tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k' + \mu_k^{-2}M_k'^{-1}}{2}\big)\big), & \operatorname{M}(M_0') &= \operatorname{M}(A), \\ \operatorname{M}(X_{k+1}') &= \operatorname{M}(\tfrac{1}{2}\mu_k X_k'\left(I + \mu_k^{-2}M_k'^{-1}\right)), & \operatorname{M}(X_0') &= \operatorname{M}(A) \end{aligned}$$

$$\begin{aligned} \mu_k &= \left|\operatorname{det{\scriptsize\star}}(M_k')\right|^{-1/(2n)} \\ M_{k+1}' &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k' + \mu_k^{-2}M_k'^{-1}}{2}\big), & M_0' &= A, \\ X_{k+1}' &= \tfrac{1}{2}\mu_k X_k'\left(I + \mu_k^{-2}M_k'^{-1}\right)), & X_0' &= A \end{aligned}$$

The result: an almost identical DB algorithm on multivectors, without any conversion to matrices!

Nicholas J. HighamFunctions of Matrices: Theory and Computation D. S. ShirokovOn determinant, other characteristic polynomial coefficients, and inverses in Clifford algebras of arbitrary dimension

Future work

Translate more algorithms from Functions of Matrices: Theory and Computation to multivectors.
Construct a collection of pathological multivectors analogous to those generated by Matlab's gallery function.
- These are used extensively to compare the performance of the matrix algorithms.
- Cases which bias against one algorithm may not apply to multivectors, resulting in different relative accuracies.
Analyze the time-complexity and accuracy characteristics of the translated algorithms.
Publish Python implementations of these algorithms compatible with the clifford package.