The polynomial set associated with a fixed number of matrix-matrix multiplications
Abstract
We consider the problem of computing matrix polynomials , where is a large dense matrix, with as few matrix-matrix multiplications as possible. More precisely, let represent the set of polynomials computable with matrix-matrix multiplications, but with an arbitrary number of matrix additions and scaling operations. We characterize this set through a tabular parameterization. By deriving equivalence transformations of the tabular representation, we establish new methods that can be used to construct elements of and determine general properties of the set. The transformations allow us to eliminate variables and prove that the dimension is bounded by , which is subsequently proven to be sharp, i.e., . Consequently, we have identified a parameterization that, to the best of our knowledge, is the first minimal parameterization. We also conduct a study using computational tools from algebraic geometry to determine the largest degree such that all polynomials of that degree belong to , or its closure. In many cases, the computational setup is constructive in the sense that it can also be used to determine a specific evaluation scheme for a given polynomial.
1 Introduction
The application that motivates the research question in this paper, is the computation of matrix functions in the sense of Higham [13], which is a classical problem in numerical linear algebra. We define a matrix function as an extension of a scalar function from to matrices, i.e., . Important matrix functions like , , and are crucial in various contexts such as linear ODEs [18], control theory [5], network analysis [7], and quantum chemistry [22]; see [13, Chapter 2] for more applications. Further applications relevant for our setting appear in problems where the action of a matrix function on a vector , i.e., is needed, see [9].
In this paper we study methods for computing matrix functions when the input matrix is very large and dense. In particular, we consider a family of methods that only utilize the following two operation types:
-
•
Linear combination of two matrices
-
•
Multiplication of two matrices .
A direct consequence of considering only these basic operation types is that this family of methods computes matrix polynomials. Another direct consequence is that the first operation type can be viewed as free in terms of computational cost, since the computational complexity is and respectively for the two considered operation types.
We want to study the polynomials that can be computed with a given cost. Since the cost is essentially given by the number of matrix-matrix multiplications, methods with a given fixed cost corresponding to multiplications form evaluations of polynomials in the set
| (1) |
where is the set of all polynomials of degree in the usual sense. Here, is the closure of , that is, the set of polynomials of degree .
Although our application stems from matrix polynomials, the set is, in fact, a univariate polynomial set, not a matrix polynomial set; it is a univariate semi-algebraic set, as we concretize in Section 2. The set can be defined more abstractly as follows. Let be the polynomial ring over a field ; typically the polynomial coefficients are scalars with or , such that is the vector space of polynomials in of degree at most . We make the following assumptions about computations involving elements of :
-
•
Linear combination of two polynomials is considered computationally free, where are polynomials, and are scalars.
-
•
Multiplication of two polynomials , where , incurs unit computational cost, independent of the degrees of and .
Now, if we fix a computational budget to , i.e., fix the number of non-scalar multiplications, the total space of computable polynomials is restricted. Since each multiplication can at most double the degree (i.e., ), the maximal degree reachable using multiplications is bounded by . The definition of in (1) corresponds to all polynomials in that can be computed using at most non-scalar multiplications, and an arbitrary number of free linear combination operations. Although the matrix polynomial evaluation application is our main source of interest, there are other applications, for example, when the matrix is replaced by a (scalar) polynomial; see the discussion and references in Section 6.
We also wish to stress the difference of this setting in comparison to the approximation theory assumptions. A common application of matrix polynomials involves the approximation of a non-polynomial function . In such applications, one often wants to compute a polynomial approximation of using a fixed cost. Hence, we want to find the best approximation in . This is in complete contrast to the classical problem in (scalar) approximation theory, where one seeks the best approximation of a given fixed degree. Since in general, the fixed degree and fixed cost approximation problems are different in character. In order to construct methods for the fixed cost approximation problem we need to understand the set . The objective of this paper is to characterize this set.
Techniques to keep the number of matrix-matrix multiplications low have been studied for decades in the context of matrix functions and matrix polynomials. In Paterson and Stockmeyer’s seminal paper [19] an algorithm is presented to compute for in ) matrix-matrix multiplications, which was made more precise in [8]. Since they also show that dim = , the Paterson and Stockmeyer algorithm is optimal in the asymptotic order sense.
The fact that the Paterson and Stockmeyer algorithm is often improvable has served as the motivation for a number of recent works. For example, Sastre [24] showed how to improve the Paterson–Stockmeyer method in general; the result shows how to compute most polynomials of degree using only three multiplications, and most polynomials of degree using only four multiplications (in contrast to the Paterson–Stockmeyer algorithm that handles degrees and , respectively). The work has been expanded in various ways (e.g., [25]). This research demonstrates how an approximation of a general function, such as the Taylor expansion of the exponential, can be computed using multiplications. Specifically, the approximation known as 15+ corresponds to a polynomial of degree 16 that matches all Taylor coefficients except for the last, i.e., the coefficient for . Related concepts for rational functions were examined in [23]. Further refinements of the algorithms, especially regarding the scaling-and-squaring method [18, Section 7], have been investigated for both the matrix exponential [27] and the matrix cosine [29] as well as in [28]. Ideas for efficient polynomial evaluation combined with rational approximations have been used in [2, 3], with particular focus on preservation of Lie algebra properties for the matrix exponential. In general, these types of efficient polynomial methods are (at least in theory) better than the standard implementation of the matrix exponential using a Padé approximant combined with the scaling-and-squaring algorithm [1], although further research is needed to definitively support this claim.
In the terminology of our framework, considerable parts of the results in the literature discussed above, are examples involving polynomials that are elements of . Despite this research attention, the understanding of this set is far from complete. The basis of our analysis is an evaluation scheme (further explained in Section 2) that gives a parameterization of with parameters given by a triplet , where and are matrices and a vector. One triplet corresponds to one polynomial . There are several ways to obtain the same polynomial; that is, a different triplet may lead to the same polynomial .
The first set of new results are equivalence transformations. In Section 3, we present procedures to transform a triplet into another triplet that yields the same polynomial . Several conclusions can be drawn from the transformations. For example, we prove that the first column of the matrices and can be selected as zero without loss of generality. More complex transformations reveal that the element of can also be set to zero. Moreover, we establish a transformation related to the third multiplication, which includes an algebraic condition on the elements of and . Further analysis shows that the element of can be coupled in a simple way to the element of without loss of generality.
The starting point of Section 4 are conclusions of the equivalence transformations. Using the transformations we can reduce the number of free parameters that parameterize and conclude that
This bound is sharper than those presented in, for example, [19]. Moreover, we prove that this is optimal and, to our knowledge, our parameterization therefore forms the first minimal parameterization of unreduced evaluation schemes in .
Further results are presented (in Section 5) regarding the problem of finding included polynomial subsets, specifically determining
| (2) |
Using the transformations and tools from computational algebraic geometry, we solve this problem for and provide conjectures supported by strongly indicative computational results in high-precision arithmetic for , , and . Our solutions to (2) are , , , , and for , , (complex arithmetic), (real arithmetic), and (complex arithmetic), respectively. The numerical experiments are reproducible and provided in a publicly available repository, including generated code for Matlab and Julia.
While the simulations focus on the determination of the dimension of and the solution to (2), other aspects of the findings in this paper may be significant beyond these specific research questions. The transformations themselves are constructive and may be used to improve various properties of the evaluation scheme, such as numerical stability. The study of the polynomial subset in Section 5 is also constructive. We provide evaluation schemes for the truncated Taylor expansion of the matrix exponential at specified degrees. For instance, we achieve the degree- expansion using multiplications, whereas prior works require multiplications [27]. Moreover, our simulations are applicable to several different functions, and the software we provide can be used to compute the evaluation scheme coefficients for polynomials other than those reported in this paper.
2 Evaluation scheme
2.1 Definition of the evaluation scheme
Fundamental to the results of this paper is an evaluation scheme for constructing elements of . Such evaluation schemes have been presented in various forms in [19, Section 2], [2, section 3], and [14], where in [14] they are referred to as degree-optimal polynomials. The evaluation scheme involves parameters stored in the matrices and , and the vector . Matrices and contain coefficients for linear combinations, where each row corresponds to the linear combinations associated with the th multiplication. As we perform multiplications, these matrices have rows. The vector contains coefficients for linear combinations that are used after the multiplications have been completed.
In particular, consider the triplet and associate the following sequence of operations involving exactly matrix-matrix multiplications. Define and . Then, we iterate the process to generate using the elements of matrices and as follows:
| (3) | ||||
Furthermore, we compute the output of the scheme by forming the linear combination corresponding to the -vector:
| (4) |
Thus, a given triple , with and being matrices, defines a polynomial .
Note that all evaluation schemes can be expressed in this manner, due to the fact that for each multiplication step we use all preceding information available. For instance, in the second multiplication step we determine by computing the product of two linear combinations of , and . Hence, a triplet parameterizes through the expressions in (3) and (4).
Examples of instances of
To see how the tables represent standard evaluation schemes, such as monomial evaluation, Horner evaluation, and Paterson–Stockmeyer evaluation, see [14, Section 3]. In the following, we show how to evaluate
in three multiplications. Consider the coefficient triplet:
| (5d) | ||||
| (5f) | ||||
2.2 The semi-algebraic set perspective of
For our study, we need concepts from algebraic geometry. Let be the field of the parameters in the evaluation. Let be the product space associated with the parameters. That is, , where is the number of parameters in the evaluation.
If we denote the evaluation scheme (3)–(4) by the map , we can describe as its image
| (6) |
By construction, is algebraic, since it depends polynomially on the parameters. It follows from the Tarski–Seidenberg theorem [17, Theorem 8.6.6] that is a semi-algebraic set, since it is the image of an algebraic map over a vector space.
Example (5d)–(5f) illustrates that the set is not necessarily an algebraic variety, i.e., it is only semialgebraic. More precisely, let denote topological closure in a Zariski sense. Then, in relation to the example (5d)–(5f) we see that is not a valid choice, and indeed . However, since is a limit point, we have . Therefore is not a variety since it does not include all of its limit points.
If denotes a basis of the ambient space , e.g., as in our simulations a monomial basis, and let be the Jacobian of the map with an output expressed in this basis. Following the standard definitions in algebraic geometry [30], the dimension of a semi-algebraic set is given by the rank of at any non-singular point. Most points in the image of a smooth map are non-singular; therefore for almost all we have
where , and is the number of parameters in the method.
The dimension of is always bounded by the number of free parameters, and in this setting it can be concluded from the size of that
| (7) |
In this paper we improve this bound by reducing the number of free parameters and in Section 4 we subsequently conclude that is the exact dimension, for .
We note that the properties of the set differ from an algebraic perspective when considering versus . Most results in this paper are applicable to both cases; however, when explicitly discussing the Zariski closure, we will assume unless stated otherwise. In the simulations presented in Section 5, we concentrate on and provide additional observations for .
3 Equivalence transformations
3.1 Substitution transformations
The following theorems describe transformations of the evaluation scheme , as defined in Section 2, such that the output polynomial is unchanged. The modified evaluation scheme will be denoted and the corresponding -matrices will be denoted . The first transformation can be viewed as a change of variables, where we rescale one of the -coefficients. For example if we set
the tables (3) are changed by
In order to keep unchanged, we can cancel the transformation by scaling the coefficient in the corresponding term, e.g.,
This implies that without changing the polynomial , we may rescale one row in and one column in and . This is formalized in the following theorem, where we omit indices on elements of and that are unchanged in and . By symmetry, we note that the same result holds for and switched.
Theorem 1.
Let be the polynomial associated with and let be the polynomial associated with , where
| (8) | ||||||
and
| (9a) | |||||
| (9b) | |||||
| (9c) | |||||
Then,
| (10) |
for any and .
Proof.
By definition, we have and . The proof is based on establishing the following three statements:
| (11a) | |||||
| (11b) | |||||
| (11c) | |||||
Together with the definition of in (8), this implies that the conclusion of the theorem, stated in (10), holds.
To prove (11a), we observe that since the first rows of and are unchanged, the variables are also unchanged.
In a similar fashion, we carry out a change of variables where we add a given value to one of the elements in the first column. For example, adding to the coefficient corresponding to in the first multiplication leads to
which can be expanded as
| (12) |
Let and be the factors that form , i.e., . In order to keep unchanged, we keep both factors in unchanged by compensating for the transformation (12):
| (13) | |||||
| (14) |
Hence, a modification in the coefficient corresponding to can be compensated for by modifying the coefficients in all rows below the modification. This can be applied to an arbitrary row and arbitrary .
Theorem 2.
Let be the polynomial associated with and let be the polynomial associated with where
| (15) | ||||||
and
| (16a) | |||||
| (16b) | |||||
| (16c) | |||||
Then,
| (17) |
for any and .
Proof.
Similarly to the proof of Theorem 1, we establish the following three statements:
| (18a) | |||||
| (18b) | |||||
| (18c) | |||||
Together with the definition of in (16c), this implies that the conclusion of the theorem in equation (17) holds. Relation (18a) follows analogously to the proof in Theorem 1.
To prove (18c), we prove that and , i.e., that the factors for ,…, are unchanged. For we have
| (19) | ||||||
where we have inserted (16a) and (18b) to obtain the second equality. The relation can be shown analogously using (16b) and (18b). By induction, the corresponding relation for any factor. Consequently, we have
| (20) |
which proves (18c).
3.2 Normalized forms and unreduced schemes
The theorems in the previous section can be applied to impose a certain structure on the triplet without (or with very little) loss of generality. For any triplet we can invoke Theorem 2 repeatedly. By setting to the negation of the element in the first column, we obtain matrices and with a first column containing only zeros. The first column corresponds to the addition of a scaled identity, which is independent (constant) with respect to the input . Note that this can be done for any triplet, and no generality (in the sense of parameterized polynomials) is lost by assuming the first column is zero.
Definition 3.
If , we call the evaluation scheme a constant-free evaluation scheme.
Corollary 4.
Any evaluation scheme is equivalent to a constant-free evaluation scheme.
Similarly, if we assume that the Hessenberg matrices and are unreduced [11, p. 381], i.e., the elements and are all nonzero, we can impose further structure by repeatedly applying Theorem 1 with scaling determined by the corresponding last nonzero element of the row. This process imposes a normalization on each row.
Definition 5.
A constant-free evaluation scheme satisfying is called a normalized evaluation scheme, and we call the triplet normalized.
Corollary 6.
Any evaluation scheme corresponding to where and are unreduced Hessenberg matrices is equivalent to a normalized evaluation scheme.
3.3 Further transformations
For normalized constant-free evaluation schemes, we have and , meaning that the first multiplication always corresponds to squaring the input matrix, i.e.,
| (21) |
This assumption can be made without loss of generality and will be used henceforth. This fact was already observed by Paterson and Stockmeyer [19, p. 61].
We now derive further transformations under a technical assumption. For the first two rows we assume that we have the structure of an unreduced constant-free triplet.111This assumption is made mostly to simplify the derivation, and similar results hold, e.g., when . This in turn is equivalent to assuming .
Consider the following transformation based on modifying the second row. We add and to the second element in the second row in and , respectively, i.e.,
| (22a) | |||||
| (22b) | |||||
If we enforce , one term in the modification disappears, and we can compensate for the other term in a way such that , i.e., they are unmodified. Consequently, the output polynomial is also unmodified.
Theorem 7.
Let be the polynomial associated with a constant-free triplet , satisfying , and let be the polynomial associated with where
| (23a) | |||||
| (23b) | |||||
and
| (24) |
The modifications to the coefficients are given by
| (25a) | |||||
| (25b) | |||||
| (25c) | |||||
where
Then,
| (26) |
Proof.
From the first row we have . The proof is based on establishing the following two statements:
| (27a) | |||||
| (27b) | |||||
This, together with the definition of in (25c) implies that the theorem conclusion (26) holds.
To prove (27a), we use relation (22). More precisely, we substitute into (22b) and obtain
| (28) |
To prove (27b), we show that the factors for are unchanged, i.e., and . For the first factor equality, we have
| (29a) | |||||
| (29b) | |||||
where we have used (25a) and (27a) with , in the second equality. The relation can be shown analogously using (25b) and (27a). By induction, we can prove the corresponding factor relation for . Consequently, we have for which proves (27b). The theorem conclusion (26) follows by the same construction as (29). ∎
For a given evaluation scheme satisfying , we can apply the previous theorem with , resulting in . This shows that we can assume for evaluation schemes that are unreduced in the first two rows, without loss of generality. For the second multiplication, under these assumptions, we get
| (30) |
Next, we state and prove a theorem for the third row of the coefficient matrices. For this we assume that the first three rows are unreduced, i.e., . The theorem is based on perturbing each of the free elements in the third row of and , while simultaneously preserving the final output polynomial. In particular, we use perturbations with the following structure:
It turns out that when the perturbations have this particular structure, is modified by the addition of a linear combination of , and . This is advantageous because we can compensate for and , since we have access to these matrices directly. Moreover, we can ensure that the -coefficient modification is zero by placing an additional condition on and , encoded in the equality , where is a function explicitly given in the theorem.
Theorem 8.
Let be the polynomial associated with the constant-free triplet satisfying , and let be the polynomial associated with where
| (32a) | |||||
| (32b) | |||||
and
| (33) |
The modifications to the coefficients are given by
| (34a) | |||||
| (34b) | |||||
| (34c) | |||||
| (34d) | |||||
| (34e) | |||||
| (34f) | |||||
where
| (35a) | |||||
| (35b) | |||||
Then,
| (36) |
for any and satisfying
| (37) |
Proof.
It follows from the first two rows that , . The proof is based on establishing the following two statements:
| (38a) | |||||
| (38b) | |||||
This, together with the definition of in (34e) and (34f) implies that the theorem conclusion (36) holds.
To prove (38a) we express in terms of the multiplication factors and substitute the definition of the modified coefficients. We obtain
| (39) |
This expression can be simplified by using :
| (40) |
The next step is to show that the difference between and is a linear combination of , , and , where the -coefficient is given by , which is zero by assumption. We factorize the expression in (40) to obtain
| (41) |
Before proceeding we define
| (42) |
This allows us to describe the difference between the multiplication factors more compactly:
| (43) |
We substitute this expression into (41) and simplify the first factor of the difference such that
| (44) |
Next, we expand the final expression
| (45) |
where we have simplified in the last equality using (35a) and (35b). By using , and , we can rewrite this as
| (46) |
Finally, we use (30) in order to express in terms of and
| (47) |
where we have used that in the last equality. This proves statement (38a).
Free variables in Theorem 8
Note that Theorem 8 includes a scalar-valued condition, , involving two scalar variables. Let and be defined as in the proof of the theorem. If we let be given, we get a quadratic equation in :
| (50) |
When , the solution to the equation is explicitly available from the solution of the quadratic equation. This has a disadvantage of introducing a square root operation. In a real setting, this can yield complex coefficients.
This is related to the result in paper [24] as follows. The results suggest several approaches to evaluate polynomials with a low number multiplications. For the case , the formulas [24, Eq. (31)] involve a square root, and indeed that approach can be derived from the above transformation with and solving for .
Suppose is given. Then, the solution for can be expressed as
| (51) |
with the condition . To reframe this condition in terms of entries in matrices and , we use a change of variables that essentially generalizes [21] (where it is given for ):
| (52) |
With this choice, we obtain updated table entries:
| (53) |
This leads to the relation
| (54) |
Consequently, any evaluation scheme that is unreduced in the first three rows can be transformed into an equivalent scheme satisfying
| (55) |
since we can choose any nonzero . Thus, when considering evaluation schemes that are unreduced in the first three rows, we can assume without loss of generality that . This assumption allows us to reduce the number of variables without introducing additional constraints, as explained in the following section.
4 Minimality of parameterization
4.1 Minimality of unreduced evaluation schemes
As a consequence of the equivalence theorems in the previous section, we can conclude that any unreduced evaluation scheme can be assumed to have the form
| (56) |
with an arbitrary -vector. The dimension of is determined from the rank of the Jacobian for a generic element in . A generic element in is unreduced; hence, the dimension is bounded by the number of free variables in a parameterization of the unreduced schemes. The parameterization (56) contains parameters and, therefore, similar to (7), we have the following.
Corollary 9.
For , we have
It turns out that this bound is sharp; i.e., we can establish a lower bound which is also . To this end it is sufficient to find a triplet such that the Jacobian of the output with respect to the free variables has rank . Many trivial choices of unreduced , e.g., a triplet corresponding to only a sequence of squaring operations, lead to a points in whose rank is less than . The triplet with a full Jacobian rank and leading to the simplest formulas for the Jacobian that we could find is the following:
| (57) |
and .
In order to bound the rank, it is sufficient to find linearly independent columns in the Jacobian. A sufficient condition for linear independence for polynomials is that they have distinct degrees. Based on that reasoning, we now establish linear combinations of partial derivatives of the output of the evaluation, i.e., , with respect to the elements of and . We obtain distinct degrees.
Although the choice (57) leads to the simplest derivation we could establish analytically, it is admittedly not very simple. The proof of the general case can be found in the appendix. For illustration, we sketch the derivation for , specifying the partial derivatives, which can be computed with symbolic computation tool (e.g., those described in the next section). In the full proof in the appendix we provide the details without such tools. We base the proof, as well as this sketch on a separation into cases, each case leading to a polynomial degrees that are distinct, and adding up to different degrees.
Case 1 corresponds to forming derivatives with respect to elements of : . Since in (57) is unreduced, and have maximal degree and we have that
| (58) |
Case 2 corresponds to forming derivatives with respect to . For example, we have
| (59) | |||||
| (60) |
As shown in the proof of the theorem, the general formula for the degrees in Case 2 is
| (61) |
Case 3 stems from the observation that we obtain the same degree if we differentiate with respect to and . Therefore we form the difference in order to obtain a different degree. For example,
| (62) |
and the difference with (60) is
| (63) |
The degree is distinct from the degrees in Cases 1 and 2. In the appendix we prove that the general formula for the degrees in Case 3 is
| (64) |
and that they are distinct from previous cases.
Case 4 is based on the observation that if we use the idea from Case 3 for , we get a degree which already included in Case 1. In order to establish a degree not present in the previous cases, we must form a linear combination of partial derivatives with respect to the elements , , , and . Consider the following identities:
| (65) | |||||
| (66) | |||||
| (67) |
By forming the sum of equations (66) and (67), we obtain
Noting from (63) that this degree coincides with Case 3 for and , and we can reduce the degree by forming the sum
The resulting degree 3 is distinct from those in Cases 1-3.
From the reasoning above, in this example, we have 6 distinct degrees from Case 1, 6 distinct degrees from Case 2, 3 distinct degrees from Case 3 and 1 degree from Case 4, which yield a total of distinct degrees.
In the general case, i.e., when , we also provide a Case 5, leading to an additional to distinct degrees. The idea for this case is very similar to that of Case 4, and corresponds to forming a linear combination of partial derivatives with respect to the elements , , and for .
Theorem 10.
For , we have
| (68) |
Proof.
See Section A. ∎
4.2 Reduced evaluation schemes
In practice, unreduced evaluation schemes are rarely useful for evaluating a given polynomial of high degree , because of the limited number of degrees of freedom in for large , in comparison to . If the Hessenberg matrices and are reduced, we obtain output polynomials of lower degree. For example the pair
| (69) |
corresponds to multiplications but results in a polynomial of degree . By a single reduction, we mean setting the last nonzero element in a row to zero in either or . With reductions we mean the repeated application of a single reduction. Note that we can still normalize each row since the transformation theorems are also applicable to reduced systems. Hence, we lose one degree of freedom with every reduction, and due to Corollary 9, we expect the corresponding dimension to be
| (70) |
In the following section we proceed by studying reduced evaluation schemes. More precisely, we study reduced evaluation schemes that lead to specific polynomial degrees and describe ways to compute for that reduction structure for a polynomial given in a monomial basis.
5 Polynomial subsets
This section is devoted to the study of the question: What is the largest such that all -degree polynomials can be computed with multiplications? Formally, we use two versions of this problem
| (71) |
We consider the left-hand side when possible, and otherwise study the right-hand side in order to avoid limit cases similar to (5).
The previous section stressed the use of reduced matrices. For a given reduction, we can compute the corresponding polynomial degree. Hence, we can investigate candidate solutions to (71) by considering all possible reductions. This approach is depicted in Figure 1, which illustrates all combinations of reductions for up to multiplications and reductions. For instance, (69) corresponds to the scenario (horizontal axis) with multiplications and reductions and achieves a polynomial degree of , as shown on the vertical axis of the figure.
To identify optimal solutions to (71), higher polynomial degrees are advantageous. However, excessively high degrees may result in insufficient degrees of freedom. More precisely, (70) gives a bound on the degree of the admissible candidate solutions to (71) for a given number of multiplications and reductions. This is visualized with a blue line in the figure. Hence, for the purpose of studying (71) we can disregard points above this line.
The problem (71) becomes increasingly complex as increases. For , the problem is essentially already solved in [24] since an explicit procedure is provided to compute almost all polynomials of degree with multiplications, i.e.,
For , similar constructions are also provided in [24], yielding a method for degree . An alternate approach requiring fewer assumptions on the monomial coefficients is given in Section 5.1. From Figure 1, we see that this is the highest admissible degree and therefore conclude that it is optimal.
For we were not able to solve the problem analytically and instead resorted to computational tools. More precisely, we frame the problem with a given reduction and structure as a system of polynomial equations. For , we use the package HomotopyContinuation.jl [4] to find, as far as we can tell, all solutions. For and , the system was too complicated, and we were not able to find solutions with this package. However, by using the software [14], we were able to construct locally convergent iterative methods and successfully found solutions in all specified test cases. These simulations combined with reasoning based on admissible degrees in Figure 1, led us to state conjectures concerning the solution to (71). For reproducibility, all simulations (including starting values) are provided in the publicly available GitHub repository: https://github.com/GustafLorentzon/polynomial-set-paper.
5.1 Four multiplications
When we study , we identify from Figure 1 that the highest degree of the admissible polynomials is , since without reductions we only have degrees of freedom, which cannot parameterize . The solution to (71) is indeed 12. This can already be concluded from the method in [24, p. 237] which is a method to evaluate polynomials of degree 12, given in their monomial basis, using only multiplications. In our terminology this corresponds to the reduction and, additionally, . The method in [24, p. 237] involves square roots of expression containing the monomial coefficients, roots of a polynomial of degree four, as well as divisions of certain quantities leading to exceptions corresponding to some limit cases. Therefore, from [24] we conclude that
| (72) |
in a complex sense.
We now present a slightly more general method for evaluating any polynomial in with four multiplications. The new method does not involve square roots or divisions except for the leading monomial coefficient. Consider the evaluation schemes with the following structure
| (73f) | ||||
| (73h) | ||||
Suppose represent a given polynomial . If we expand the parameterization, we obtain a multivariate polynomial system of equations—one for each monomial coefficient in the output polynomial. In the terminology of Section 2.2, the system corresponds to considering the th,,th derivatives of the equation with respect to , evaluated at . In this case, we have equations in variables. To solve solve this system, we first introduce the auxiliary variables and . This system is explicitly solvable by considering the equations in the output polynomial ordered in descending degree, so that the equation corresponding to is treated first. In this sense, the system is triangular, which was also crucial for the construction in [24, Section 3]. The solution to the system is given by the following sequence of equations:
| (74a) | ||||
| (74b) | ||||
| (74c) | ||||
| (74d) | ||||
| (74e) | ||||
| (74f) | ||||
| (74g) | ||||
| (74h) | ||||
| (74i) | ||||
| (74j) | ||||
| (74k) | ||||
With the conditions , , and , we have explicitly computed all variables in (73).
Recall that for ; therefore, we have made no assumptions other than the degree of the polynomial. Moreover, the formulas preserve the algebraic structure of the variables, e.g., if , then is a real triplet, so the evaluation coefficients are real. We conclude that
| (75) |
holds in both a real and a complex sense.
5.2 Five multiplications
For multiplications we see in Figure 1 that the highest degree of the admissible polynomials is . To our knowledge, the state of the art is as given in [25, Equation (17)-(19)] with which is based on [24] combined with the Paterson–Stockmeyer evaluation. In our terminology, that approach corresponds to the reduction and additionally imposing . The reduction in Figure 1 leading to a polynomial of degree corresponds to .
We were not able to explicitly derive a solution to the multivariate polynomial system for the structure with analytically; instead, we needed to resort to computational tools. The Julia package HomotopyContinuation.jl [4] includes methods to solve polynomial systems of equations based on numerical continuation and with advanced initialization of starting points for the homotopy method. We implemented the system equations derived from considering each monomial coefficient for the data structures of this package. Since we have more variables than equations for this structure, we empirically fixed some variables, choosing which variable to fix by trying to reduce the total degree of the system as much as possible. We additionally solved those equations that could be solved explicitly, e.g., the first and last equations.
With this setup, we were able to compute for a large number of given polynomials , including the truncated Taylor expansion of the exponential as well as the function . The simulations were done in Julia, and the code is given in the GitHub repository. Moreover, code to actually evaluate polynomials is provided in both Julia and Matlab. For conciseness, we report only the numbers for the matrix exponential in the following.
In an attempt to prevent large condition numbers, we sought solutions that did not have excessively large or small values. In this case, we mitigated large numbers by scaling the input. For the exponential, we approximate with fixed, since this makes in the order of magnitude one. Although the input scaling can be reversed by transforming entries in the table, it was not deemed numerically useful and therefore it was not included in the following presentation of results.
HomotopyContinuation.jl found several solutions and the solution with the smallest values was the following
| (76f) | ||||
| (76h) | ||||
where the missing values are given in Table 1. The Jacobian of the polynomial system evaluated at this solution has a condition number of .
| 2.3374451754385963 | |||
| 2.8309861554443847 | -6.657689892163032 | ||
| 8.7616118485412 | 50.445902670306005 | ||
| 5.123957592622475 | 19.754729172913187 | ||
| 1.4484649122807018 | 2.8090057997411706 | ||
| 6.389966463262669 | |||
| 6.697361614351532 | |||
| 2.1451472591988834 | |||
| -2.458444697550387 | |||
| -3.6724346694235876 | |||
| 16.044090747953085 | |||
| 7.557067023178642 |
Since we were able to find solutions in the case studies we conjecture that this corresponds to a realization of a method for the maximum polynomial degree.
Conjecture 11.
The upcoming work [26] suggests that there is indeed a constructive way to form such evaluation schemes for five multiplications.
5.3 Six multiplications
To our knowledge, the state of the art for multiplications is , again given in [25] with . In our terminology, that corresponds to the reduction . Similar to the situation for , this is not the highest admissible degree. From Figure 1 we see that the highest admissible degree is . With , we can use the reduction (69).
Unfortunately, the application of the package HomotopyContinuation.jl was not successful for this case. We have equations, and unknowns. The creation of initial vectors for the homotopy continuation seems too computationally demanding, likely related to the high total degree of the polynomial system. Instead we used the package GraphMatFun.jl [14] to create a locally convergent iterative solution method. The system is highly ill-conditioned and a standard Newton approach was not successful. Instead, we employ an iteratively regularized Newton’s method. Following the approach in [6], we compute a Tikhonov–Newton step by applying Newton’s method to a Tikhonov-regularized system. The Tikhonov–Newton step can be computed using the singular value decomposition of the Jacobian matrix, which is directly available from the graph representation in GraphMatFun.jl. The best results were obtained by using Armijo step-length damping and selecting between a Newton and a Tikhonov–Newton step in a greedy fashion. For the sake of reproducibility, the starting vectors are given explicitly in the software available in the GitHub repository. In order to determine conclusively that a solution is found, the solution was post-processed with high-precision arithmetic (BigFloat in Julia) so that the first 50 decimals of the coefficients appear accurate. This was done for this particular simulation as well as for all subsequent simulations.
The above approach with the structure (69) was successful in finding a solution vector for the problem corresponding to the Taylor expansion of the matrix exponential, using complex arithmetic. Unfortunately, we were not able to find a real coefficient vector.
From an application viewpoint, real coefficients are advantageous, e.g., since the evaluation of can be done in real arithmetic if is real. In order to find a real evaluation scheme, we investigated instead the structure
This leads to a polynomial of degree . Using the locally convergent iterative method described above, we successfully computed a real solution vector.
For the actual numbers, we refer to the software. The maximum absolute value in the coefficient vector for the complex case is , and the Jacobian evaluated at the solution has a condition number of . The maximum absolute value in the coefficient vector for the real case is , and the condition number of the Jacobian at the solution is .
The highest degree polynomial set can indeed be different for the real and complex cases, and the simulations suggest the following.
Conjecture 12 (Six multiplications).
For complex coefficients,
For real coefficients,
5.4 Seven multiplications
For seven multiplications, we believe that the state of the art is [25, Table 3] with , which in our context corresponds to the reduction and . In Figure 1 we see that the highest admissible degree appears to be with . Using the same technique as for , we were able to find a solution to the system for the Taylor expansion of the exponential. We used scaling , complex arithmetic, and the following structure:
| (77) |
The fixed elements were selected empirically in an attempt to avoid very large condition numbers. The maximum absolute value in the coefficient vector is , and the condition number of the Jacobian at the solution is . From this we conclude the following in a complex sense.
Conjecture 13 (Seven multiplications).
Remark 14 (Computational example).
Although the main contributions of this manuscript are theoretical in character, we also wish to point out the practical value of the simulations. For example, the output of the simulations concerning multiplications can be directly used to compute the matrix exponential, in a very competitive way. In fact, in theory (in the sense of number of floating point operations) the method is the fastest for large norm matrices, as far as we know. Although a complete computational study is beyond the scope of this paper, we provide (for a specific but random matrix) a comparison with the scaling-and-squaring algorithm [18], which is the most commonly method used in mathematical software for the matrix exponential. In the following we see that our approach is faster or more accurate, or both depending on viewpoint, than two valid parameter choices for the scaling-and-squaring method. What is marked as our method is the evaluation (77) with coefficients precomputed with the Tikhonov-Newton method.
julia> Random.seed!(0); setprecision(128); n=200; alpha=16;
julia> A=randn(Complex{BigFloat},n,n); A=20*A/norm(A);
julia> E1 = @btime exp16_deg42_bigfloat(A/alpha); # Our method
58.648 s (719681983 allocations: 29.50 GiB)
julia> E2 = @btime exp_sas_6mult_1div(A); # standard version 1
65.340 s (762205498 allocations: 31.24 GiB)
julia> E3 = @btime exp_sas_7mult_1div(A); # standard version 2
80.014 s (860125595 allocations: 35.25 GiB)
julia> # Compute a reference solution with high precision
julia> expAref=mapreduce(i-> (A^i)/factorial(big(i)), +, 0:100);
julia> norm(expAref-E1); # Error for our method
9.25199599560488132729984839075891589943e-33
julia> norm(expAref-E2); # Error for standard implementation version 1
1.935440065066927257455720406200123063236e-30
julia> norm(expAref-E3); # Error for standard implementation version 2
1.14214602689126618911305625822908609328e-36
6 Conclusions
This work focuses on a characterization of the set , with particular attention given to minimality and to computing the maximum degree polynomial subset of in the sense of (71). From our perspective, the minimality question is well understood in this paper. The determination of the maximum degree polynomial subset can be further investigated. We have only described the case using computational reasoning. Both a theoretical description of the general case, e.g., using further tools from algebraic geometry [33], and a more general computational approach could be of interest and useful in practice.
Based on our simulations, one major component is missing before this can be directly used in matrix function evaluation software: understanding the effect of rounding errors. Evaluating high-degree polynomials is, in general, prone to rounding errors—unless special representations such as a Chebyshev basis are used. In this case, the issue appears even more intricate. For example, by using high-precision arithmetic, the coefficients were computed such that we could guarantee correctness in full double precision. However, the fact that the system has a rather large condition number (at least for ) is an indication that this evaluation is sensitive with respect to these coefficients. Heuristics similar to [15] might be applicable in a general setting. Further work is needed to determine which evaluation schemes, in the continuum of , lead to better numerical stability; a necessary condition for numerical stability is that the condition number is not too large.
The Paterson–Stockmeyer method has proven beneficial not only for evaluating matrix polynomials; similar computational challenges arise in other contexts, such as when the input is a polynomial. In these scenarios, multiplying two quantities is significantly more computationally demanding than forming linear combinations. The construction in the Paterson–Stockmeyer approach resembles the baby-step giant-step (BSGS) technique introduced in [31], which has found various applications and has been combined with the Paterson–Stockmeyer approach in public key and privacy-preserving cryptography [12]. Moreover, both the Paterson–Stockmeyer method and BSGS serve as valuable tools in high-precision arithmetic [16, 32]. Open research questions include how the approach presented in this paper, or methods for in general, can be applied in these contexts.
The fixed-cost computation approach presented in this paper can be complemented by insights from research on composite polynomials or deep polynomials. See [20] for composite polynomials. Rational approximations corresponding to this concept, such as those in [10], illustrate how successive compositions, e.g., , can achieve rapid convergence in terms of both the number of compositions and the parameters involved. Similar findings are noted in [34], motivated by the link between this approach and universal approximation in deep learning. The composite polynomial method can fit within the framework of this paper by zeroing certain elements in matrices and . Nevertheless, there is a significant distinction in research objectives: our objective is to minimize the number of matrix-matrix multiplications, while [34] focuses on reducing the number of parameters. Although some results, such as the approximation of the th root [10], may be directly applicable, further research is needed to fully explore the differences resulting from these objectives.
Acknowledgements
The authors wish to express gratitude for the insightful discussions and feedback from Prof. Kathlén Kohn (KTH Royal Insititute of Technology), particularly regarding Section 2.2. This research was partially conducted during the first author’s sabbatical at EPFL / University of Geneva. The support of the hosts, Prof. Daniel Kressner and Prof. Bart Vandereycken, is greatly appreciated. The authors also greatly acknowledge the valuable comments from Prof. Massimiliano Fasi (University of Leeds) and Prof. Jorge Sastre (Polytechnic University of Valencia).
References
- [1] A. H. Al-Mohy and N. J. Higham. A new scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl., 31(3):970–989, 2010.
- [2] P. Bader, S. Blanes, and F. Casas. Computing the matrix exponential with an optimized Taylor polynomial approximation. Mathematics, 7(12):1174, 2019.
- [3] S. Blanes, N. Kopylov, and M. Seydaoğlu. Efficient scaling and squaring method for the matrix exponential. SIAM J. Matrix Anal. Appl., 46(1):74–93, 2025.
- [4] P. Breiding and S. Timme. HomotopyContinuation.jl: A Package for Homotopy Continuation in Julia. In Int. Congr. Math. Softw, pages 458–465. Springer, 2018.
- [5] R. Byers. Solving the algebraic Riccati equation with the matrix sign function. Linear Algebra Appl., 85:267–279, 1987.
- [6] J. Eriksson and P.-Å. Wedin. Regularization methods for almost rank-deficient nonlinear problems. In B. Jacobsen, K. Mosegaard, and P. Sibani, editors, Inverse Methods, pages 295–302. Springer-Verlag, Berlin, 1996.
- [7] E. Estrada and D. J. Higham. Network properties revealed through matrix functions. SIAM Rev., 52(4):696–714, 2010.
- [8] M. Fasi. Optimality of the Paterson–Stockmeyer method for evaluating matrix polynomials and rational matrix functions. Linear Algebra Appl., 574:182–200, 2019.
- [9] M. Fasi, S. Gaudreault, K. Lund, and M. Schweitzer. Challenges in computing matrix functions, 2024. Arxiv: 2401.16132.
- [10] E. S. Gawlik and Y. Nakatsukasa. Approximating the th root by composite rational functions. J. Approx. Theory, 266:105577, 2021.
- [11] G. Golub and C. Van Loan. Matrix Computations. The Johns Hopkins University Press, 2013. 4th edition.
- [12] K. Han and D. Ki. Better bootstrapping for approximate homomorphic encryption. In Topics in Cryptology – CT-RSA 2020, pages 364–390, 2020.
- [13] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.
- [14] E. Jarlebring, M. Fasi, and E. Ringh. Computational graphs for matrix functions. ACM Trans. Math. Softw., 48(4):39, 2023.
- [15] E. Jarlebring, J. Sastre, and J. Ibáñez. Polynomial approximations for the matrix logarithm with computation graphs. Linear Algebra Appl., 2024. In press.
- [16] F. Johansson. Evaluating parametric holonomic sequences using rectangular splitting. In Proc. 39th Int. Symp. Symbolic Algebraic Comput. (ISSAC ’14), pages 256–263, 2014.
- [17] B. Mishra. Algorithmic Algebra. Applied Mathematical Sciences. Springer-Verlag, 1993.
- [18] C. Moler and C. Van Loan. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev., 45(1):3–49, 2003.
- [19] M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput., 2(1):60–66, 1973.
- [20] J. F. Ritt. Prime and composite polynomials. Trans. Amer. Math. Soc., 23(1):51–66, 1922.
- [21] E. H. Rubensson, G. Lorentzon, and E. Jarlebring. Recursive expansion of the matrix step function using eight-degree polynomials. In progress, 2025.
- [22] E. H. Rubensson, E. Rudberg, and P. Sałek. Density matrix purification with rigorous error control. J. Chem. Phys., 128:074106, 2008.
- [23] J. Sastre. Efficient mixed rational and polynomial approximation of matrix functions. Appl. Math. Computation, 218(24):11938–11946, August 2012.
- [24] J. Sastre. Efficient evaluation of matrix polynomials. Linear Algebra Appl., 539:229–250, 2018.
- [25] J. Sastre and J. Ibáñez. Efficient evaluation of matrix polynomials beyond the Paterson–Stockmeyer method. Mathematics, 9(14):1600, July 2021.
- [26] J. Sastre, J. Ibáñez, J. M. Alonso, and E. Defez. Beyond paterson–stockmeyer: Advancing matrix polynomial computation. Presented at the 5th Int. Conf. on Applied Mathematics, Computational Science and Systems Engineering, Institut Henri Poincaré, Paris, France, Apr. 14–16, 2025.
- [27] J. Sastre, J. Ibáñez, and E. Defez. Boosting the computation of the matrix exponential. Appl. Math. Computation, 340:206–220, January 2019.
- [28] J. Sastre, J. Ibáñez, E. Defez, and P. Ruiz. New scaling-squaring Taylor algorithms for computing the matrix exponential. SIAM J. Sci. Comput., 37:A439–A455, 2015.
- [29] J. Sastre, J. Ibáñez P. Ruiz, and E. Defez. Efficient computation of the matrix cosine. Appl. Math. Computation, 219(14):7575–7585, March 2013.
- [30] I. R. Shafarevich and M. Reid. Basic algebraic geometry, volume 1. Springer, 1994.
- [31] D. Shanks. Class number, a theory of factorization and genera. In Proc. Sympos. Pure Math., volume 20, pages 415–440, Providence, RI, 1971. Amer. Math. Soc.
- [32] D. M. Smith. Efficient multiple-precision evaluation of elementary functions. Appl. Math. Computation, 52(185):131–134, January 1989.
- [33] B. Sturmfels and M. Michalek. Invitation to Nonlinear Algebra. Amer. Math. Soc., Providence, RI, english edition, July 2021.
- [34] K. Yeon. Deep univariate polynomial and conformal approximation, 2025. arXiv:2503.00698.
Appendix A Auxiliary material for the proof of Theorem 10
The proof is based on two technical lemmas. We note that several of these results hold for arbitrary choices of , but some parts of the statements and some of the proofs are clearer when we assume the specific structure of the triplet given in (57), which is sufficient for the proof of Theorem 10. Note that for our example, , , and .
The first lemma states a formula for the degree when we apply a differentiation operator consisting of a linear combination of derivatives of the elements in but not of the last rows of and .
Lemma 15.
Let be of the specific structure given in (57). Let be an operator consisting of a linear combination of the elements of in rows . Assume that
and that . Then, and
| (78) |
Proof.
We prove the statement by induction, starting with row . By applying the product rule and using , we obtain:
| (79) |
where . Note that . Therefore, on the right-hand side of equation (79), the degree of the first term is larger than that of the second term, given our assumption that . Thus, we have:
| (80) |
This establishes the base step of our induction.
To proceed with the induction step, assume that the inequality holds for steps, i.e.,
Analogous to the derivation in equation (79), we have:
Since and have the same degree, and by our induction assumption, the first term has a higher degree, we conclude that:
| (81) |
This completes the proof of the increasing degree progression. The relation (78) follows by applying equation (81) for and using (80) once. ∎
The previous lemma (Lemma 15) gives us the degree of the partial derivative of the output, given the degree of the partial derivative of . It remains to determine the derivatives of . We select the differentiation operator in several ways for the purpose of later combining them to form distinct degrees of the Jacobian.
Lemma 16.
Let be of the specific structure given in (57). Then, for and
| (82) |
and for and ,
| (83) |
Moreover, for ,
| (84) |
where and also,
| (85) |
Proof.
For notational convenience, we define the differential operator:
| (88) |
for which we derive the auxiliary relation:
| (89a) | ||||
| (89b) | ||||
| (89c) | ||||
| (89d) | ||||
Here, we have used (83) with , and , in the first equality, and the fact that for the specific structure given in (57), we have .
To prove (84), we use that since these elements are independent of row . This, together with the chain rule, yields:
| (91a) | ||||
| (91b) | ||||
| (91c) | ||||
The final equality is a reformulation which simplifies the following derivation,
| (92a) | ||||
| (92b) | ||||
where we have used (82) in the first equality. The degree of the first term in the right hand side is given by
| (93) |
and the degree of the second term is given by
| (94) |
The theorem conclusion (84) follows from Equation (92) and Equation (94).
∎
Proof of Theorem 10
We want to prove the equality in (68); but the upper bound is already given in Corollary 9. To prove that is also a lower bound it is sufficient to find one point, i.e., one triplet, with a Jacobian of rank . This is done by finding particular linear combinations of partial derivatives of the entries in with distinct degrees. We assume the structure given in (57). We construct linear combinations of partial derivatives in 5 different ways. We refer to these linear combinations as Cases 1, 2, 3, 4 and 5.
Case 1: The fact that the evaluation scheme is unreduced means , and for . This, together with , directly implies (58).
For Case 2, we consider the operator
for and . Since does not appear in the first rows, we have . Consequently, the conditions of Lemma 15 are satisfied. Moreover, (82) implies that
| (95) |
Combining this with (78) yields
| (96a) | ||||
| (96b) | ||||
We conclude (61).
For Case 3, we consider the operator
for and . The conditions of Lemma 15 are satisfied by the same reasoning as in Case 2. From (83) we obtain
| (97) |
For Case 4, we consider the operator
From equation (84) we immediately conclude that
| (98) |
For Case 5, we consider the operator
for . We show that this operator satisfies the assumptions of Lemma 15. Firstly, we observe that by the same reasoning as in Case 2. Therefore, it is enough to show that . From (84) and (94) we have that . Next, we observe that
| (99a) | ||||
| (99b) | ||||
| (99c) | ||||
Thus the assumptions of Lemma 15 are satisfied. It follows that
| (100a) | ||||
| (100b) | ||||
To summarize, we have obtained the following degrees through linear combinations of partial derivatives.
-
•
Case 1: .
-
•
Case 2: .
-
•
Case 3: .
-
•
Case 4: .
-
•
Case 5:
These form distinct degrees, in the following way. Firstly, we note that when the degrees are expressed as binary numbers, Case 1 and Case 4 always involve one and two nonzeros respectively, making these degrees distinct. Similarly, when we express the degrees of Case 2, 3 and 5 as binary numbers, they always involve three or more nonzeros, making them distinct from Case 1 and Case 4. For Case 2, Case 3 or Case 5 to coincide, the degrees as binary numbers must have the same number on nonzeros. This corresponds to choosing the same in each of the formulas, which leads to different degrees. Therefore we conclude that all the above degrees are distinct from each other.
Counting the number of unique degrees, we get: Case 1: ; Case 2: ; Case 3: ; Case 4: ; Case 5: . In total, we have distinct degrees for .