Adapting Matrix Algorithms for multivectors

Eric Wieser, Hugo Hadfield
Cambridge University Engineering Department

slides: eric-wieser.github.io/agacse-2021 2021-09-10

Hugo Hadfield was going to prepare and present this presentation…

…but had other ideas!

Introduction

Implementations of many matrix algorithms are widely available.

Languages with LAPACK bindings

  • Python (numpy, scipy)
  • Matlab
  • Julia
  • R

Algorithms of particular interest

  • Square root
  • Exponential
  • Logarithm
  • Eigen-decomposition

Can we exploit the existing libraries for multivector computation?

Can we reimplement the algorithms to form a new GAPACK?

Matrix representation of multivectors

Consider a multivector $A$ in $\mathcal{G}(\mathbb{R}^2)$: $$A = a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}$$

There's an obvious linear equivalence to coefficients of a basis, $\mathcal{G}(\mathbb{R}^n) \simeq \mathbb{R}^{2^{(p+q+r)}}$, we'll call $\operatorname{v}$:

$$\operatorname{v}(A) = \left[\begin{smallmatrix}a \\ a^1 \\ a^2 \\ a^{1,2}\end{smallmatrix}\right]$$
$$\begin{aligned} \color{red}{\operatorname{v}(0)} &\color{red}{{}= 0}\\ \color{red}{\operatorname{v}(A + B)} &\color{red}{{}= \operatorname{v}(A) + \operatorname{v}(B)} \\ \color{red}{\operatorname{v}(\alpha A)} &\color{red}{{}= \alpha\operatorname{v}(A)} \end{aligned}$$

Of greater interest is the embedding into a matrix algebra, $\mathcal{G}(\mathbb{R}^{p,q,r}) \to \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}}$, we'll call $\operatorname{M}$:

$$\operatorname{M}(A) = \left[\begin{smallmatrix} a & a^1 & a^2 & -a^{1,2} \\ a^1 & a & a^{1,2} & -a^2 \\ a^2 & -a^{1,2} & a & a^1 \\ a^{1,2} & -a^2 & a^1 & a \\ \end{smallmatrix}\right]$$
$$\begin{aligned} \color{red}{\operatorname{M}(0)} &\color{red}{{} = 0} \\ \color{red}{\operatorname{M}(A + B)} &\color{red}{{} = \operatorname{M}(A) + \operatorname{M}(B)} \\ \color{red}{\operatorname{M}(\alpha A)} &\color{red}{{} = \alpha\operatorname{M}(A)} \end{aligned}$$
$$\begin{aligned} \color{blue}{\operatorname{M}(1)} &\color{blue}{{} = I = 1} \\ \color{blue}{\operatorname{M}(AB)} &\color{blue}{{} = \operatorname{M}(A)\operatorname{M}(B)} \\ \operatorname{M}(\tilde A) &= \operatorname{M}(A)^T \\ \operatorname{v}(AB) &= \operatorname{M}(A)\operatorname{v}(B) \\ \end{aligned}$$

Notably, this is not only linear, but also a morphism of algebras.

Matrix representation of multivectorsReversing the mapping

$$\operatorname{M}(a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}) = \left[\begin{smallmatrix} a & a^1 & a^2 & -a^{1,2} \\ a^1 & a & a^{1,2} & -a^2 \\ a^2 & -a^{1,2} & a & a^1 \\ a^{1,2} & -a^2 & a^1 & a \\ \end{smallmatrix}\right]$$

Clearly the mapping $\operatorname{M}$ is not surjective and so cannot have a two-sided inverse, but we can define:

$$\operatorname{M}^{-1}(X) = \operatorname{v}^{-1}(X_{*,1})$$

where $\operatorname{v}^{-1}$ reassembles a multivector from a vector of coefficients and is the two-sided inverse to $\operatorname{v}$:

$$\operatorname{v}^{-1}\left(\left[\begin{smallmatrix}a \\ a^1 \\ a^2 \\ a^{1,2}\end{smallmatrix}\right]\right) = a + a^1e_1 + a^2e_2 + a^{1,2}e_{1,2}$$

and $X_{*,1}$ is the first column of the matrix $X$. By inspection, this forms a left-inverse to $\operatorname{M}$ as it satisfies:

$$\operatorname{M}^{-1}(\operatorname{M}(A)) = A$$

We are not interested in them here, but there are more sophisticated choices of $\operatorname{M}^{-1}$ which satisfy various additional properties such as $\operatorname{M}^{-1}(X^T) = \operatorname{M}^{-1}(X)^{\sim}$.

Wrapping Matrix Algorithms

When reusing a matrix algorithm $f$, we're looking for a multivector algorithm $\operatorname{wrap}[f]$ that makes the following diagram commute:

$$\begin{CD} \mathcal{G}(\mathbb{R}^{p,q,r}) @>>\operatorname{wrap}[f]> \mathcal{G}(\mathbb{R}^{p,q,r}) \\ @V\operatorname{M}VV @V\operatorname{M}VV \\ \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}} @>>f> \mathbb{R}^{2^{(p+q+r)}\times2^{(p+q+r)}} \\ \end{CD}$$

Simple approach; define:

$$\operatorname{wrap}[f](A) := \operatorname{M}^{-1}(f(\operatorname{M}(A)))$$

Whether this produces meaningful results depends on the properties of $f$.

Wrapping Matrix AlgorithmsExample: exponentiation

The exponential function can be defined on any ring via the Taylor expansion,

$$\operatorname{exp}(A) = \sum_0^\infty \frac{A^i}{i!}$$

If we pass this through $\operatorname{M}$, we find

$$\operatorname{M}(\operatorname{exp} A) = \sum_0^\infty \operatorname{M}\left(\frac{A^i}{i!}\right) = \sum_0^\infty \frac{\operatorname{M}(A)^i}{i!} = \operatorname{exp} \operatorname{M}(A)$$

This means that our “wrapped” algorithm is correct:

$$\begin{aligned} \operatorname{wrap}[\operatorname{exp}](A) &= \operatorname{M}^{-1}(\operatorname{exp}(\operatorname{M}(A))) \\ &= \operatorname{M}^{-1}(\operatorname{M}(\operatorname{exp} A)) \\ &= \operatorname{exp} A \end{aligned}$$

Any real algorithm $E$ for computing the exponent of a matrix approximates the taylor expansion; so $\operatorname{wrap}[E]$ will too. We can use existing implementations of matrix exponentiation such as the scipy.linalg.expm function in the scipy package for the Python programming language.

Wrapping Matrix AlgorithmsExample: square root

The square root is more problematic, as it is not unique. While it's easy to show

$$\operatorname{M}(\sqrt{A})\operatorname{M}(\sqrt{A}) =\operatorname{M}(\sqrt{A}\sqrt{A}) = \operatorname{M}(A) \implies \operatorname{M}(\sqrt{A}) = \sqrt{\operatorname{M}(A)},$$

the $\sqrt{\phantom{1}}$ on the right is just one of many possible matrix square roots, and may or may not be the one computed by a particular algorithm.

Consider:

$$\begin{aligned} S &= \left[\begin{smallmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{smallmatrix}\right]& \implies S^2 &= I = \operatorname{M}(1) & \implies S &= \sqrt{\operatorname{M}(1)}\end{aligned}$$

Here $S$ is no longer in the range of $\operatorname{M}$, so we cannot in general assume that applying $\operatorname{M}^{-1}$ is valid (although in this case it still is).

Paths forward

  • Prove or disprove the validity of $\operatorname{M}^{-1}$ for all possible $\sqrt{\phantom{1}}$s.
  • Choose a specific $\sqrt{\phantom{1}}$ algorithm that ensures validity.

Reimplementing Matrix Algorithms

Idea: instead of reusing an implementation of an algorithm, reimplement it for multivectors.

 

Disadvantages

  • Doesn't build upon tried, tested, and optimized libraries like LAPACK.
  • Requires writing new code!
  • Some matrix operations have no directly analogous GA operation (eg LU decomposition).

Advantages

  • No redundant coefficients.
  • Better runtime:
  • Algebra$A : \mathcal{G}(\mathbb{R}^n)$$\operatorname{M}(A) : \mathbb{R}^{N\times N}$
    Addition$O(N)$$O(N^2)$
    Multiplication$O(N^2)$$O(N^3)$

    ($n = p+q+r$, $N = 2^n$)

  • Less memory consumption.
  • Potential for multivector-specific optimizations.
  • No need to worry about whether $\operatorname{M}^{-1}$ is valid.

Reimplementing Matrix AlgorithmsExample: square root

One of many algorithms from [1] as an example, which computes $\lim_{k\to\infty}X_k = \sqrt{B}$:

Product form Denman and Beavers (DB) iteration (scaled)

$$\begin{aligned} \mu_k &= \left|\operatorname{det}(M_k)\right|^{-1/(2n)} \\ M_{k+1} &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k + \mu_k^{-2}M_k^{-1}}{2}\big), & M_0 &= B, \\ X_{k+1} &= \tfrac{1}{2}\mu_k X_k\left(I + \mu_k^{-2}M_k^{-1}\right), & X_0 &= B \end{aligned}$$

Let $M_k = \operatorname{M}(M_k')$, $X_k = \operatorname{M}(X_k')$, and $B = \operatorname{M}(A)$, where $M_k', X_k', A \in \mathcal{G}(\mathbb{R}^n)$:

$$\begin{aligned} \mu_k &= \left|\operatorname{det}(\operatorname{M}(M_k'))\right|^{-1/(2n)} \\ \operatorname{M}(M_{k+1}') &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 \operatorname{M}(M_k') + \mu_k^{-2}\operatorname{M}(M_k')^{-1}}{2}\big), & \operatorname{M}(M_0') &= \operatorname{M}(A), \\ \operatorname{M}(X_{k+1}') &= \tfrac{1}{2}\mu_k \operatorname{M}(X_k')\left(I + \mu_k^{-2}\operatorname{M}(M_k')^{-1}\right), & \operatorname{M}(X_0') &= \operatorname{M}(A) \end{aligned}$$
$$\begin{aligned} \mu_k &= \left|\operatorname{det{\scriptsize\star}}(M_k')\right|^{-1/(2n)} \\ \operatorname{M}(M_{k+1}') &= \operatorname{M}\big(\tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k' + \mu_k^{-2}M_k'^{-1}}{2}\big)\big), & \operatorname{M}(M_0') &= \operatorname{M}(A), \\ \operatorname{M}(X_{k+1}') &= \operatorname{M}(\tfrac{1}{2}\mu_k X_k'\left(I + \mu_k^{-2}M_k'^{-1}\right)), & \operatorname{M}(X_0') &= \operatorname{M}(A) \end{aligned}$$
$$\begin{aligned} \mu_k &= \left|\operatorname{det{\scriptsize\star}}(M_k')\right|^{-1/(2n)} \\ M_{k+1}' &= \tfrac{1}{2}\big(I + \tfrac{\mu_k^2 M_k' + \mu_k^{-2}M_k'^{-1}}{2}\big), & M_0' &= A, \\ X_{k+1}' &= \tfrac{1}{2}\mu_k X_k'\left(I + \mu_k^{-2}M_k'^{-1}\right)), & X_0' &= A \end{aligned}$$

The result: an almost identical DB algorithm on multivectors, without any conversion to matrices!

Nicholas J. HighamFunctions of Matrices: Theory and Computation D. S. ShirokovOn determinant, other characteristic polynomial coefficients, and inverses in Clifford algebras of arbitrary dimension

Future work

  • Translate more algorithms from Functions of Matrices: Theory and Computation to multivectors.
  • Construct a collection of pathological multivectors analogous to those generated by Matlab's gallery function.
    • These are used extensively to compare the performance of the matrix algorithms.
    • Cases which bias against one algorithm may not apply to multivectors, resulting in different relative accuracies.
  • Analyze the time-complexity and accuracy characteristics of the translated algorithms.
  • Publish Python implementations of these algorithms compatible with the clifford package.