Multiplicative Hyperdeterminants

May 26, 2009

When I first became interested in hyperdeterminants about ten years ago, there were very few people researching them. Recently they have emerged as interesting objects in physics, especially in quantum entanglement and superstring theory, and this has led to a resurgence of interest in hyperdeterminants.

Of course the work of Gelfand, Kapranov, and Zelevinsky, who wrote the hyperdeterminant book, led the way. Another long-term practitioner of the art is David Glynn, who has done important work on multiplicative hyperdeterminants and their relationship with coding theory. He is probably the leading expert on much of the material that I have started to get interested in. David recently left some comments on my blog article about Cayley’s Hyperdeterminants, so I think this is a good moment to continue this blog with some posts in which I will try to understand and follow up on his comments.

David drew my attention to a different type of hyperdeterminant which Cayley first found in 1843, two years before he defined the hyperdeterminant as a discriminant, which is the version I use here. This determinant for a hypermatrix A of size mⁿ is written det₀(A) to distinguish it from the discriminant version. det₀(A) is a polynomial of degree m in the components of A and it is an SL(m)ⁿ invariant when n is even. In that case it can be expressed by contracting the nm indices of m copies of the hypermatrix (n indices each) against the nm indices of n copies of the Levi-Civita symbol (m indices each). When n is odd this construction degenerates to zero because of the conflict between the symmetry of the product of hypermatrices and the antisymmetry of the product of Levi-Civita symbols, but when n is even it gives an invariant that is det₀(A) up to a normalisation. When n = 2 this gives the familiar determinant of a square matrix, but for n > 2 it is different from the hyperdeterminant of hypermatrices defined as a discriminant.

That much was already familiar to me, but David also remarked that this hyperdeterminant has a multiplicative property that generalises the well known property of matrices, i.e. det(AB) = det(A)det(B). This was something I had not really appreciated before. The product of two hypermatrices is taken to be the contraction of an index from one hypermatrix with an index from the other, e.g. the product of an mⁿ hypermatrix with an mᵏ hypermatrix is an m^(n+k−2) hypermatrix. The discriminant hyperdeterminant has a similar multiplicative property for boundary format hypermatrices but not for more general formats. One other case where a multiplicative property works is the product of a hypermatrix with a matrix, as I demonstrated when looking at the hyperdeterminant as an invariant.
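To make this concrete, here is a small Python sketch (my own illustration, not David's construction) of the simplest mixed case: contracting the last index of a 2x2x2x2 hypermatrix T with a 2x2 matrix B gives another 2x2x2x2 hypermatrix, and the degree-2 invariant ε_im ε_jn ε_kp ε_lq T_ijkl T_mnpq, which is det₀ up to normalisation, simply picks up a factor of det(B).

```python
from itertools import product
import random

# 2-dimensional Levi-Civita symbol as a dictionary
eps = {(0, 0): 0, (0, 1): 1, (1, 0): -1, (1, 1): 0}

def quad_inv(T):
    """Degree-2 invariant of a 2x2x2x2 hypermatrix:
    eps_im eps_jn eps_kp eps_lq T_ijkl T_mnpq (det0 up to normalisation)."""
    return sum(eps[i, m] * eps[j, n] * eps[k, p] * eps[l, q]
               * T[i][j][k][l] * T[m][n][p][q]
               for i, j, k, l, m, n, p, q in product(range(2), repeat=8))

def det2(B):
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]

def contract_last(T, B):
    """(TB)_ijkl = sum_r T_ijkr B_rl."""
    return [[[[sum(T[i][j][k][r] * B[r][l] for r in range(2))
               for l in range(2)] for k in range(2)]
             for j in range(2)] for i in range(2)]

random.seed(1)
T = [[[[random.randint(-3, 3) for _ in range(2)] for _ in range(2)]
      for _ in range(2)] for _ in range(2)]
B = [[1, 2], [3, 5]]
print(quad_inv(contract_last(T, B)) == det2(B) * quad_inv(T))  # -> True
```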

Looking at how all these cases work, I now realise that they are all special cases of a general result. Suppose A and B are two hypermatrices that may be of different formats, but they have a matching pair of indices so that a product AB can be formed by contracting over those indices. Suppose also that there are invariants I₁(A) and I₂(B) for the two formats that have the same polynomial degree d. Then there is always a third invariant I₃ of degree d on the format of the product hypermatrix which satisfies a multiplicative property I₃(AB) = I₁(A)I₂(B). If we have two invariants I₁ and I₂ of different degrees d₁ and d₂, we can of course form invariants of degree d = lcm(d₁,d₂) (the least common multiple of d₁ and d₂) by taking powers of I₁ and I₂, and then the result applies to those.

Before I show why this result is true, let’s look at how it works for the special cases. If A and B are of size mⁿ and mᵏ then det₀(A) and det₀(B) are both of degree m. So there must be an invariant of degree m such that I(AB) = det₀(A)det₀(B), but the only invariant of the required degree is det₀(AB) up to a factor, so det₀(AB) = f det₀(A)det₀(B) for a suitable factor f independent of A and B. The factor f can be shown to be one, for a suitable normalisation in the definition of det₀, by checking one test case. A similar argument works for the case of hyperdeterminants of boundary format hypermatrices when the contraction indices are chosen so that the product of two boundary format hypermatrices is another boundary format hypermatrix, and exponents are used to construct invariants of the same degree. This follows provided the hyperdeterminants are the unique invariants of the specific degree for boundary format hypermatrices, which I have not checked. Other products of hypermatrices raised to suitable exponents will give some invariant on the product. It will not in general be a hyperdeterminant, but its form can be worked out in specific cases.

So why is this general multiplicative property of invariants true? It follows from the fact that all hypermatrix invariants can be constructed using contractions over sums and products of the Levi-Civita symbols ε_a…c, together with a general identity which reduces the product of two such symbols to an antisymmetrised sum over products of the Kronecker delta symbol.

      ε_a…c ε^d…f = δ_a^[d ⋯ δ_c^f]

      (where the square brackets indicate a normalised antisymmetrization over all the second indices of the deltas)

Express the product of invariants I₁(A) and I₂(B) using Levi-Civita symbols, then substitute this identity for every pair of symbols carrying the two indices subject to contraction. The deltas then form combinations of products of the hypermatrices and the result is obtained.
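The identity itself is easy to verify by brute force in low dimensions. This sketch checks the three-dimensional case, writing the antisymmetrised sum of deltas as a determinant of Kronecker deltas (which absorbs the normalisation):

```python
from itertools import product

def eps3(i, j, k):
    """Three-dimensional Levi-Civita symbol."""
    if len({i, j, k}) < 3:
        return 0
    return 1 if (i, j, k) in {(0, 1, 2), (1, 2, 0), (2, 0, 1)} else -1

def delta(i, j):
    return 1 if i == j else 0

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# eps_abc eps^def = det of the 3x3 matrix of deltas, for all index values
ok = all(
    eps3(a, b, c) * eps3(d, e, f) ==
    det3([[delta(a, d), delta(a, e), delta(a, f)],
          [delta(b, d), delta(b, e), delta(b, f)],
          [delta(c, d), delta(c, e), delta(c, f)]])
    for a, b, c, d, e, f in product(range(3), repeat=6))
print(ok)  # -> True
```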


The Degree of Hyperdeterminants

October 18, 2008

Working out hyperdeterminants is an involved process. Apart from the smallest ones they have huge numbers of terms. There is a bit more hope in the task of working out the degree of the hyperdeterminant of a given format. We have already seen that Cayley’s hyperdeterminant for a 2x2x2 hypermatrix is of degree 4 and the hyperdeterminant of size 2x2x2x2 is of degree 24. Of course we also know that determinants of size N x N have degree N.

For the general case there is in fact a generating formula for the degree N(k₁,…,kₙ) of the hyperdeterminant of size (k₁+1) x … x (kₙ+1), which is

Σ N(k₁,…,kₙ) z₁^k₁ ⋯ zₙ^kₙ = 1/(1 − Σᵢ (i−1) eᵢ(z₁,…,zₙ))²

where eᵢ(z₁,…,zₙ) is the i-th elementary symmetric polynomial in n variables. For a derivation you should look at the hyperdeterminant book.

For the case of a hyperdeterminant of size 2ⁿ you can use this simpler generating function

Σ Nₖ xᵏ/k! = e^(−2x)/(1−x)²

This gives us the sequence Nₖ = 2, 4, 24, 128, 880, 6816, 60032, 589312, 6384384, 75630080, 972387328, etc., starting from the 2x2 format. This is also known as sequence A087981.
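The coefficients can be extracted from the generating function with exact rational arithmetic. This little Python sketch recovers the sequence above:

```python
from fractions import Fraction
from math import factorial

def degree_2n(kmax):
    """Degrees N_k of the 2^k hyperdeterminant from the exponential generating
    function: sum_k N_k x^k / k! = exp(-2x) / (1 - x)^2."""
    degs = []
    for k in range(kmax + 1):
        # coefficient of x^k in exp(-2x)/(1-x)^2, multiplying the series
        # exp(-2x) = sum_j (-2)^j x^j / j!  and  1/(1-x)^2 = sum_m (m+1) x^m
        coeff = sum(Fraction((-2) ** j, factorial(j)) * (k - j + 1)
                    for j in range(k + 1))
        degs.append(int(coeff * factorial(k)))
    return degs

print(degree_2n(8))  # -> [1, 0, 2, 4, 24, 128, 880, 6816, 60032]
```

The entries for k = 2, 3, 4 reproduce the degrees 2, 4 and 24 quoted above for matrices, Cayley's hyperdeterminant and Schläfli's hyperdeterminant.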


Boundary Format Hyperdeterminants

October 12, 2008

Although I have given a general definition of a hyperdeterminant, I have not yet said for which formats it exists. An existence proof is beyond the scope of this blog, at least unless I have the energy to explain the basics of algebraic geometry later (ha ha), but I can at least state the answer.

Recall that a hypermatrix can have a size of N₁ x … x Nₙ, and the definition of a hyperdeterminant makes sense for any such hypermatrix, but the hyperdeterminant may not exist in each case. Sometimes we also say that the format is (N₁−1, …, Nₙ−1) to emphasise that the spaces we are looking at are projective and have one less degree of freedom. Without loss of generality assume that Nₙ is (one of) the largest dimensions.

Think about the condition for the hypermatrix to be singular, which includes the requirement that the derivative of the form with respect to any of its vector arguments xᵢ is zero; we’ll just consider the largest, xₙ. This is really a set of Nₙ equations in all the remaining arguments, i.e. there are effectively K = N₁ + … + Nₙ₋₁ − (n−1) unknown degrees of freedom. In general if Nₙ > K this will impose Nₙ − K conditions on the hypermatrix. Setting the hyperdeterminant to zero only imposes one condition, so if Nₙ − K > 1 the hyperdeterminant should not exist.

This is in fact the complete condition for the existence of a hyperdeterminant of a given format, i.e. the hyperdeterminant exists iff Nₙ ≤ N₁ + … + Nₙ₋₁ − n + 2. When we are dealing with matrices this reduces to the condition that the matrix must be square, but in general hyperdeterminants exist for hypermatrices with dimensions of different sizes.

In the special case where Nₙ = N₁ + … + Nₙ₋₁ − n + 2 (which includes square matrices) the hypermatrix is said to be of boundary format.
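The condition is easy to code up. Here is a short Python sketch (the function names are my own):

```python
def hyperdeterminant_exists(dims):
    """The hyperdeterminant of format N1 x ... x Nn exists iff the largest
    dimension is at most the sum of the others minus n plus 2."""
    n, n_max = len(dims), max(dims)
    return n_max <= sum(dims) - n_max - n + 2

def is_boundary_format(dims):
    """Boundary format: the existence inequality holds with equality."""
    n, n_max = len(dims), max(dims)
    return n_max == sum(dims) - n_max - n + 2

print(hyperdeterminant_exists([3, 3]), is_boundary_format([3, 3]))        # True True (square matrix)
print(hyperdeterminant_exists([2, 2, 2]), is_boundary_format([2, 2, 2]))  # True False
print(hyperdeterminant_exists([2, 2, 3]), is_boundary_format([2, 2, 3]))  # True True
print(hyperdeterminant_exists([2, 2, 4]))                                 # False
```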


Hyperdeterminants and Elliptic Curves II

October 11, 2008

Earlier I showed you a surprising way in which Cayley’s 2x2x2 hyperdeterminant shows up in the theory of elliptic curves. Now we are ready to see how Schläfli’s 2x2x2x2 hyperdeterminant is related to elliptic curves in a more basic way.

Suppose then that you have a 2x2x2x2 hypermatrix with components tᵢⱼₖₗ taking values from the integers. I’ll pose the following problem for you: find me three non-zero pairs of integers xⱼ, yₖ, zₗ such that

Σ tᵢⱼₖₗ xⱼ yₖ zₗ = 0

At first sight you might think this is going to be pretty easy, since the problem is linear in each undetermined variable, but don’t forget that there are two equations corresponding to i = 1, 2. To solve it just peel off the vectors one at a time. Firstly, the matrix with components Mᵢⱼ = Σ tᵢⱼₖₗ yₖ zₗ must permit a solution to the equation

Σ Mᵢⱼ xⱼ = 0

So we need det(M) = 0. Now think of M as being derived from a 2x2x2 tensor

Aᵢⱼₖ = Σ tᵢⱼₖₗ zₗ     =>   Mᵢⱼ = Σ Aᵢⱼₖ yₖ

det(M) is now a quadratic form in the variables yₖ. The condition for it to have integer roots is that the discriminant must be a square, but the discriminant is the same one that we identified earlier as Cayley’s hyperdeterminant. So we need

t² = det(A)

But A is linear in zₗ and the hyperdeterminant is a quartic, so we are left with a diophantine equation to solve of the form t² = av⁴ + bv³u + cv²u² + dvu³ + eu⁴. Although elliptic curves are more commonly seen in a form where the right hand side is a cubic, this form is also an elliptic curve and can be reduced to the cubic form. So solving the original problem reduces to solving an elliptic curve based on Cayley’s Hyperdeterminant, and standard methods apply.

But we can go one step further. The quartic equation is the same one that arose when we were constructing Schläfli’s Hyperdeterminant for our 2x2x2x2 hypermatrix t. From the theory of elliptic curves we know that the nature of the elliptic curve is determined by its J-invariant given by

J = g₂³/Δ = S³/det(t)

where S is the octic invariant of the hypermatrix. It is remarkable enough that Schläfli’s hyperdeterminant is associated with the J-invariant of an elliptic curve, but even more surprising is the possible significance of the degree of the hyperdeterminant, which we showed was 24, the dimension of the Leech lattice.


Schläfli’s Hyperdeterminant

October 11, 2008

We analysed Cayley’s 2x2x2 hyperdeterminant earlier and now it’s time to look at Schläfli’s 2x2x2x2 hyperdeterminant. I’ll sketch how this can be done using suitable elimination arguments to reduce the number of non-zero elements in the hypermatrix. I am not going to go through all the cases because it would not be instructive. You can complete it for yourself.

For a 2x2x2x2 hypermatrix with components tᵢⱼₖₗ to be singular we require the existence of four vectors w₁, x₁, y₁, z₁ such that simultaneously

Σ tᵢⱼₖₗ x₁ⱼ y₁ₖ z₁ₗ = 0

Σ tᵢⱼₖₗ w₁ᵢ y₁ₖ z₁ₗ = 0

Σ tᵢⱼₖₗ w₁ᵢ x₁ⱼ z₁ₗ = 0

Σ tᵢⱼₖₗ w₁ᵢ x₁ⱼ y₁ₖ = 0

If we concentrate on the 2x2x2 hypermatrix given by sᵢⱼₖ = Σ tᵢⱼₖₗ z₁ₗ, we then need from the first three conditions that

det(s(z₁)) = 0

and from the last condition

Σ sᵢⱼₖ w₁ᵢ x₁ⱼ y₁ₖ = 0

The determinant of s(z) is Cayley’s hyperdeterminant of degree 4, so it forms a homogeneous quartic in the two components of z. It must have a root at z = z₁. Let’s assume that it has at least one other distinct root z₂. (We should consider separately the case where it has four equal roots, but I’ll skip that.)

Because det(s(z₂)) = 0 there must also be vectors w₂, x₂, y₂ such that

Σ tᵢⱼₖₗ x₂ⱼ y₂ₖ z₂ₗ = 0

Σ tᵢⱼₖₗ w₂ᵢ y₂ₖ z₂ₗ = 0

Σ tᵢⱼₖₗ w₂ᵢ x₂ⱼ z₂ₗ = 0

Transform the hypermatrix to a basis using these vectors suitably normalised, so that x₁ = (1,0), x₂ = (0,1), y₁ = (1,0), y₂ = (0,1) etc. In this basis the equations now simply tell us that certain components are zero:

t₁₁₁₁ = t₂₁₁₁ = t₁₂₁₁ = t₁₁₂₁ = t₁₁₁₂ = t₁₂₂₂ = t₂₂₂₂ = t₂₁₂₂ = t₂₂₁₂ = 0

With this simplification the quartic det(s(z)) can be worked out in terms of the remaining components. We find that some of the terms are zero, and with z = (u,v) it takes the form

det(s(z)) = cu²v² + duv³

This tells us that v = 0 is in fact a double root of the quartic. By consideration of all the other possible cases it is possible to show that the 2x2x2x2 hypermatrix is singular iff the quartic has a double root. This is equivalent to the requirement that the discriminant of the quartic is zero. So finally the hyperdeterminant must be given by this discriminant.
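We can test the claimed form of the quartic numerically. The sketch below imposes the nine vanishing components on an otherwise random integer hypermatrix and checks that det(s(z)) agrees with cu²v² + duv³ at enough points to pin down a binary quartic; the expansion used for Cayley's hyperdeterminant is the standard one.

```python
import random

def cayley_hyperdet(t):
    """Cayley's hyperdeterminant of a 2x2x2 hypermatrix (0-based indices)."""
    a000, a001, a010, a011 = t[0][0][0], t[0][0][1], t[0][1][0], t[0][1][1]
    a100, a101, a110, a111 = t[1][0][0], t[1][0][1], t[1][1][0], t[1][1][1]
    return (a000**2*a111**2 + a001**2*a110**2 + a010**2*a101**2 + a100**2*a011**2
            - 2*(a000*a001*a110*a111 + a000*a010*a101*a111 + a000*a100*a011*a111
                 + a001*a010*a101*a110 + a001*a100*a011*a110 + a010*a100*a011*a101)
            + 4*(a000*a011*a101*a110 + a001*a010*a100*a111))

random.seed(5)
t = [[[[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
      for _ in range(2)] for _ in range(2)]
# the nine components forced to vanish by the choice of basis (0-based here)
for i, j, k, l in [(0,0,0,0), (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1),
                   (0,1,1,1), (1,1,1,1), (1,0,1,1), (1,1,0,1)]:
    t[i][j][k][l] = 0

def q(u, v):
    """The quartic det(s(z)) with s_ijk = t_ijk1 u + t_ijk2 v."""
    s = [[[t[i][j][k][0] * u + t[i][j][k][1] * v for k in range(2)]
          for j in range(2)] for i in range(2)]
    return cayley_hyperdet(s)

# solve for c and d from two sample points, then confirm the form elsewhere
c = (q(1, 1) + q(1, -1)) // 2
d = (q(1, 1) - q(1, -1)) // 2
checks = all(q(u, v) == c * u**2 * v**2 + d * u * v**3
             for u, v in [(1, 0), (0, 1), (2, 3), (3, 1), (1, 2), (5, 7)])
print(checks)  # -> True
```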

The discriminant of a quartic is of degree 6 in its coefficients, and the coefficients are of degree four in the components of the hypermatrix. Therefore the degree of the 2x2x2x2 hyperdeterminant is 24.

It may not be obvious at first, but if you think about it you will realise that Schläfli’s hyperdeterminant has quite a large number of terms. For most purposes we do not need to compute the whole thing, but it can be done, and the total number of terms is 2894276 (see arXiv:math/0602149).

Another observation is that the hyperdeterminant is not a fundamental generator in the ring of invariants on the 2x2x2x2 hypermatrix. We know that because we saw that the discriminant of the quartic can be written in terms of two simpler SL(2) invariants, Δ₄ = g₂³ − 27g₃². These invariants also provide full invariants S and T of the hypermatrix t such that det(t) = S³ − 27T², but that is not the end of the story. We have already seen that there is a much simpler invariant of degree 2 with just eight terms. In fact it is not difficult to construct other invariants of degree 4 and 6. Then with some work all invariants, including the hyperdeterminant, can be expressed in terms of those (see arXiv:quant-ph/0212069).

This demonstrates that the analysis of hyperdeterminants becomes very complex even for quite small formats. Nevertheless we are going to become interested in some significantly larger examples quite soon.


Hypermatrix Diagonalisation

October 10, 2008

If you’ve done much matrix algebra you will know that many linear problems can be quickly solved using the method of diagonalisation. The symmetry of the problem is used to transform the matrix until only the diagonal elements are non-zero. Matrix multiplication then just involves multiplying the diagonal elements (known as the eigenvalues) and the determinant is just their product.

It would be nice if we could do a similar trick with multi-dimensional hypermatrices. A hypermatrix does not have a specific set of components that form a diagonal, but we could choose one of the diagonals and try to use linear transforms to eliminate the other elements. If that fails we could look for other ways to reduce most of the components to zero so that computations simplify. In general the diagonalisation trick does not work that well for hypermatrices, and it is easy to see why. A hypermatrix of rank n with Nⁿ components can be transformed using n linear transforms, each of which has N² − 1 degrees of freedom. Just by counting we can see that fewer than nN² components can be eliminated by a general method. If n > 2 and N or n is large, then only a minority of the components can be removed.

Nevertheless, for specific small formats the elimination method is still useful, and often we are interested in properties of small hypermatrices. An example already came up in the construction of Cayley’s hyperdeterminant. For the 2x2x2 hypermatrix A we demonstrated that if the hyperdeterminant is not zero then the hypermatrix (over the complex numbers) can be written in this form

A = f₁ x₁*y₁*z₁* + f₂ x₂*y₂*z₂*

and we can normalise the vectors so that x₁∧x₂ = y₁∧y₂ = z₁∧z₂ = 1.

Although we did not mention it at the time, we can use special linear transforms to rotate to a system where these vectors form a basis for each of the three vector spaces. The hypermatrix A then only has two non-zero elements (a₁₁₁ = f₁ and a₂₂₂ = f₂) and the other six are zero. For this specific case the hypermatrix can be diagonalised, and the hyperdeterminant in this form is just det(A) = f₁²f₂².
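A quick check of this with the standard expansion of Cayley's hyperdeterminant:

```python
def cayley_hyperdet(t):
    """Cayley's hyperdeterminant of a 2x2x2 hypermatrix (0-based indices)."""
    a000, a001, a010, a011 = t[0][0][0], t[0][0][1], t[0][1][0], t[0][1][1]
    a100, a101, a110, a111 = t[1][0][0], t[1][0][1], t[1][1][0], t[1][1][1]
    return (a000**2*a111**2 + a001**2*a110**2 + a010**2*a101**2 + a100**2*a011**2
            - 2*(a000*a001*a110*a111 + a000*a010*a101*a111 + a000*a100*a011*a111
                 + a001*a010*a101*a110 + a001*a100*a011*a110 + a010*a100*a011*a101)
            + 4*(a000*a011*a101*a110 + a001*a010*a100*a111))

f1, f2 = 3, 5
A = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
A[0][0][0] = f1   # a111 in the 1-based notation of the post
A[1][1][1] = f2   # a222
print(cayley_hyperdet(A), f1**2 * f2**2)  # -> 225 225
```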

What of the case where the hyperdeterminant is zero? Then it is not always possible to diagonalise the 2x2x2 hypermatrix in the same way. However, we can eliminate a different set of elements. To see this, first recall that when the hyperdeterminant det(A) is zero there are three vectors x₁, y₁, z₁ such that contracting any two of them with A gives a zero vector. Choose a second set of vectors that are linearly independent of these three and normalise so that x₁∧x₂ = y₁∧y₂ = z₁∧z₂ = 1. Then we can use these as a basis in which x₁ = (1,0), x₂ = (0,1) etc. To get the zero vectors from the contractions it must then follow that

a₁₁₁ = a₁₁₂ = a₁₂₁ = a₂₁₁ = 0

So half the components have been eliminated. A similar argument works for any hypermatrix whose hyperdeterminant is zero, to eliminate the first element and all others in the same row in each of the directions.

Another way to get rid of components is to use an elimination process similar to the familiar one for matrices. Consider a transformation by a matrix whose diagonal elements are all one and whose off-diagonal elements are all zero except for one, which has a value u. Such a matrix always has determinant one, so the hyperdeterminant (and other invariants) are unchanged under such a transformation. You can see that this transformation on a hypermatrix of rank n is actually equivalent to taking an (n−1)-dimensional slice from the hypermatrix, multiplying it by u and adding it to a different slice with the same orientation. This generalises the rule that the determinant of a matrix is unchanged when you add a multiple of a row (or column) to another row (or column), and it is this rule that can be used to eliminate elements.
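Here is a numerical check that such an elementary slice operation leaves Cayley's hyperdeterminant unchanged (the matrix E below adds u times one slice to another):

```python
import random

def cayley_hyperdet(t):
    """Cayley's hyperdeterminant of a 2x2x2 hypermatrix (0-based indices)."""
    a000, a001, a010, a011 = t[0][0][0], t[0][0][1], t[0][1][0], t[0][1][1]
    a100, a101, a110, a111 = t[1][0][0], t[1][0][1], t[1][1][0], t[1][1][1]
    return (a000**2*a111**2 + a001**2*a110**2 + a010**2*a101**2 + a100**2*a011**2
            - 2*(a000*a001*a110*a111 + a000*a010*a101*a111 + a000*a100*a011*a111
                 + a001*a010*a101*a110 + a001*a100*a011*a110 + a010*a100*a011*a101)
            + 4*(a000*a011*a101*a110 + a001*a010*a100*a111))

def apply_first_index(T, M):
    """Transform the first index: T'_ijk = sum_r M_ir T_rjk."""
    return [[[sum(M[i][r] * T[r][j][k] for r in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

random.seed(2)
T = [[[random.randint(-4, 4) for _ in range(2)] for _ in range(2)]
     for _ in range(2)]
u = 7
E = [[1, u], [0, 1]]  # adds u times the second slice to the first; det(E) = 1
print(cayley_hyperdet(apply_first_index(T, E)) == cayley_hyperdet(T))  # -> True
```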

We will be using some of these methods later.

 


Quartic Discriminants

October 9, 2008

In the last post we discussed how the discriminant for a polynomial of degree n can be seen as an SL(2) invariant on a fully symmetric tensor of rank n. We have also seen how invariants of hypermatrix tensors can be constructed using alternating forms. Of course a similar method can be applied to construct polynomial invariants. The only difference is that we can contract the alternating tensors over the form indices in any order, since they are all transformed simultaneously.

I’ll walk through this for the lowest order polynomials. First a warning, I am changing notation for the coefficients to avoid fractions and to be consistent with the most common usage.

The quadratic is now written P(x) = ax² + 2bx + c. This corresponds to a matrix

M = ( a  b )
    ( b  c )

The discriminant is simply the negative determinant of the matrix

Δ₂ = b² − ac

Next we looked at the cubic, which I will now write as P(x) = ax³ + 3bx² + 3cx + d. This corresponds to a 2x2x2 tensor with components

a₁₁₁ = a

a₂₁₁ = a₁₂₁ = a₁₁₂ = b

a₂₂₁ = a₂₁₂ = a₁₂₂ = c

a₂₂₂ = d

The only independent invariant is the discriminant which comes from Cayley’s hyperdeterminant

Δ₃ = 3b²c² + 6abcd − 4b³d − 4ac³ − a²d²

Next up is the quartic P(x) = ax⁴ + 4bx³ + 6cx² + 4dx + e. This corresponds to a 2x2x2x2 tensor whose components are

a₁₁₁₁ = a

a₂₁₁₁ = a₁₂₁₁ = a₁₁₂₁ = a₁₁₁₂ = b

a₂₂₁₁ = a₂₁₂₁ = a₁₂₂₁ = a₂₁₁₂ = a₁₂₁₂ = a₁₁₂₂ = c

a₂₂₂₁ = a₂₂₁₂ = a₂₁₂₂ = a₁₂₂₂ = d

a₂₂₂₂ = e

This has a quadratic invariant which can be written in terms of alternating tensors like this

X = ε_im ε_jn ε_kp ε_lq a_ijkl a_mnpq

This corresponds to a polynomial invariant

g₂ = ae − 4bd + 3c²

There is another cubic invariant which is most easily written as a determinant

g₃ = | a b c |
     | b c d |
     | c d e |

The discriminant depends on these two invariants and is given by

Δ₄ = g₂³ − 27g₃²
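We can verify with exact rational arithmetic that Δ₄ vanishes exactly when the quartic has a repeated root. The helper from_plain below (my own naming) converts an ordinary quartic into the scaled coefficients used above:

```python
from fractions import Fraction as F

def invariants(a, b, c, d, e):
    """g2 and g3 for the quartic a x^4 + 4b x^3 + 6c x^2 + 4d x + e."""
    g2 = a * e - 4 * b * d + 3 * c * c
    # g3 is the 3x3 determinant | a b c ; b c d ; c d e |
    g3 = a * (c * e - d * d) - b * (b * e - c * d) + c * (b * d - c * c)
    return g2, g3

def from_plain(p4, p3, p2, p1, p0):
    """Scaled coefficients (a, b, c, d, e) of an ordinary quartic."""
    return F(p4), F(p3, 4), F(p2, 6), F(p1, 4), F(p0)

# (x-1)^2 (x-2)(x-3) = x^4 - 7x^3 + 17x^2 - 17x + 6 has a double root
g2, g3 = invariants(*from_plain(1, -7, 17, -17, 6))
disc_double = g2**3 - 27 * g3**2
# (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24 has distinct roots
g2, g3 = invariants(*from_plain(1, -10, 35, -50, 24))
disc_distinct = g2**3 - 27 * g3**2
print(disc_double, disc_distinct != 0)  # -> 0 True
```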


Discriminants as Invariants

October 9, 2008

In my last post I showed how polynomial discriminants are related to hyperdeterminants. Previously I also showed that hyperdeterminants are invariants. Now I am going to put these together and look at discriminants as invariants. This is a preliminary step that will allow us to construct the discriminant for the quartic, which we will need to understand the all important Schläfli hyperdeterminant for a 2x2x2x2 hypermatrix.

Historically the subject developed the other way round. Discriminants of polynomials were studied first, and by the time Cayley graduated, Boole had a general theory and knew the degree of any multivariable discriminant. Cayley took up the challenge of looking at more general invariants and thus discovered hyperdeterminants. He spent much of the rest of his life buried in massive algebraic manipulations, working out polynomial invariants to higher and higher orders.

We have seen that a polynomial P(x) of degree n can be represented using a symmetric tensor

P(x/y) = y⁻ⁿ F(x)   where   F(x) = Σ tᵢ…ₖ x(i)…x(k)   and   x = (x, y)

When we apply a linear transform S to the vector x we generate a transformation on the underlying tensor where each index is transformed using the same matrix, thus keeping it symmetric. The hyperdeterminant of the underlying tensor is invariant under special linear transforms applied more generally to any index individually, and the symmetric transformation with a matrix of determinant one is a special case of this. Since the hyperdeterminant is a power of the discriminant, we can draw the conclusion that the discriminant is also actually an invariant of SL(2). Since we have come to this conclusion in a very roundabout way (and there were holes in the argument), it is worth trying to verify it directly from the definition of the discriminant.

Δ = aₙ^(2n−2) Π_{i<j} (rᵢ − rⱼ)²

where rᵢ for i = 1, …, n are the roots of P(x) = 0 and aₙ is the leading coefficient. We apply a special linear transform (x,y) -> (ax+by, cx+dy) where ad − bc = 1. This induces a transformation of the form

F -> G   such that G(ax+by, cx+dy) = F(x,y)

This in turn implies a transform of the polynomial coefficients and its roots. We need to check that the expression for the discriminant is invariant under this transformation.

The argument of the polynomial transforms rationally

x/y -> (ax+by)/(cx+dy) = (a(x/y)+b)/(c(x/y)+d)

So the set of roots is transformed by

  rᵢ -> (arᵢ + b)/(crᵢ + d)

Using this we can verify that

  rᵢ − rⱼ -> (rᵢ − rⱼ)/[(crᵢ + d)(crⱼ + d)]

We also need to know how the leading coefficient transforms

aₙ = F(1,0) -> G(1,0) = F(d, −c) = (−c)ⁿ P(−d/c)

Using the factorisation P(x) = aₙ Π(x − rᵢ) we arrive at

aₙ -> aₙ(−c)ⁿ Π(−d/c − rᵢ) = aₙ Π(crᵢ + d)

It is now just a matter of substituting these transformations back into the definition of the discriminant to check that the exponent of the leading coefficient is correctly chosen so that the powers of (crᵢ + d) cancel, leaving the discriminant invariant.

Furthermore, there is an extra surprise that we can get from this. Using similar expressions in terms of the polynomial roots, other invariants can sometimes be constructed. For example, in the case of the quartic we can define a new invariant as

V = a₄²( (r₁−r₂)²(r₃−r₄)² + (r₁−r₃)²(r₂−r₄)² + (r₁−r₄)²(r₂−r₃)² )

This will give us an invariant of degree two in the coefficients. We will see more about this later.
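A quick numerical experiment suggests that, with the normalisations used in the previous post, V is just 24 times the degree-2 invariant g₂; since the degree-2 invariant is unique up to scale, checking two examples is fairly convincing, though I have not verified the constant algebraically:

```python
from fractions import Fraction as F

def g2_of(p4, p3, p2, p1, p0):
    """g2 = ae - 4bd + 3c^2 with the quartic rescaled as a x^4 + 4b x^3 + 6c x^2 + 4d x + e."""
    a, b, c, d, e = F(p4), F(p3, 4), F(p2, 6), F(p1, 4), F(p0)
    return a * e - 4 * b * d + 3 * c * c

def V(lead, r):
    """The root-based degree-2 invariant defined above."""
    r1, r2, r3, r4 = r
    return lead**2 * ((r1-r2)**2 * (r3-r4)**2
                    + (r1-r3)**2 * (r2-r4)**2
                    + (r1-r4)**2 * (r2-r3)**2)

# roots 1,2,3,4: x^4 - 10x^3 + 35x^2 - 50x + 24
ratio1 = V(F(1), [F(1), F(2), F(3), F(4)]) / g2_of(1, -10, 35, -50, 24)
# roots 1,2,3,5: x^4 - 11x^3 + 41x^2 - 61x + 30
ratio2 = V(F(1), [F(1), F(2), F(3), F(5)]) / g2_of(1, -11, 41, -61, 30)
print(ratio1, ratio2)  # -> 24 24
```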


Hyperdeterminants and Discriminants

October 8, 2008

One way to understand an unfamiliar concept is to relate it to a more familiar one. According to the Google unpopularity test, hyperdeterminants are 400 times more unpopular than discriminants, but there is a close relation between the two, so lets see how that works.

The discriminant of the quadratic polynomial ax² + bx + c is the familiar quantity

Δ = b² − 4ac

When Δ > 0 the quadratic equation ax² + bx + c = 0 has two real roots. When Δ < 0 it has no real roots, and when Δ = 0 it has one double root.

The discriminant is related to the determinant of a 2 x 2 matrix M

M = ( a    b/2 )
    ( b/2  c   )

Then Δ = -4det(M).

Discriminants can be defined for higher order polynomials as well. The general definition of the discriminant for a polynomial

P(x) = aₙxⁿ + … + a₁x + a₀

is given by

Δ = aₙ^(2n−2) Π_{i<j} (rᵢ − rⱼ)²

where rᵢ for i = 1, …, n are the roots of P(x) = 0, including multiplicities. It is clear from this expression that the discriminant is zero iff the polynomial has multiple roots. For a specific n it is possible to expand the product into a symmetric polynomial in the roots. By a general theorem, any symmetric polynomial can be expressed in terms of the elementary symmetric polynomials, which are given by ratios of the coefficients of P(x). In this way the familiar expressions for discriminants as polynomials of degree 2n−2 in the coefficients can be constructed. E.g. for cubics we get

P(x) = ax³ + bx² + cx + d

Δ = b²c² − 4ac³ − 4b³d − 27a²d² + 18abcd

Just as the discriminant of a quadratic is the determinant of a symmetric matrix, the discriminant of a cubic is given by Cayley’s hyperdeterminant of a symmetric hypermatrix A with components as follows

a₁₁₁ = a

a₂₁₁ = a₁₂₁ = a₁₁₂ = b/3

a₂₂₁ = a₂₁₂ = a₁₂₂ = c/3

a₂₂₂ = d

By applying the formula for Cayley’s Hyperdeterminant it is not difficult to verify that for cubics

Δ = -27 det(A)
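This is easy to confirm with exact arithmetic for a sample cubic, using the standard expansion of Cayley's hyperdeterminant:

```python
from fractions import Fraction as F

def cayley_hyperdet(t):
    """Cayley's hyperdeterminant of a 2x2x2 hypermatrix (0-based indices)."""
    a000, a001, a010, a011 = t[0][0][0], t[0][0][1], t[0][1][0], t[0][1][1]
    a100, a101, a110, a111 = t[1][0][0], t[1][0][1], t[1][1][0], t[1][1][1]
    return (a000**2*a111**2 + a001**2*a110**2 + a010**2*a101**2 + a100**2*a011**2
            - 2*(a000*a001*a110*a111 + a000*a010*a101*a111 + a000*a100*a011*a111
                 + a001*a010*a101*a110 + a001*a100*a011*a110 + a010*a100*a011*a101)
            + 4*(a000*a011*a101*a110 + a001*a010*a100*a111))

def cubic_disc(a, b, c, d):
    """Discriminant of ax^3 + bx^2 + cx + d."""
    return b*b*c*c - 4*a*c**3 - 4*b**3*d - 27*a*a*d*d + 18*a*b*c*d

a, b, c, d = F(1), F(-2), F(3), F(5)
A = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
A[0][0][0] = a
A[1][0][0] = A[0][1][0] = A[0][0][1] = b / 3
A[1][1][0] = A[1][0][1] = A[0][1][1] = c / 3
A[1][1][1] = d
print(cubic_disc(a, b, c, d) == -27 * cayley_hyperdet(A))  # -> True
```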

How can we understand this relationship, and what is the general result? Suppose then we have the polynomial of nth degree P(x) = aₙxⁿ + … + a₁x + a₀ and we construct a symmetric hypermatrix T of rank n where a component with r indices equal to 1 and n−r indices equal to 2 is given by

t₁…₁₂…₂ = aᵣ/C(n,r)

where C(n,r) is the binomial coefficient.

Using the vector x = (x, y), you can check that

P(x/y) = y⁻ⁿ F(x)   where   F(x) = Σ tᵢ…ₖ x(i)…x(k)

The form F(x) has a singular point where ∂F/∂x = 0, which is equivalent to P′(x/y) = 0 and P(x/y) = 0, i.e. the polynomial has a double root. So F(x) is singular iff Δ = 0.
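A small sketch confirming the correspondence between the polynomial and the symmetric hypermatrix (here I index coefficients so that coeffs[k] is the coefficient of xᵏ, and use index value 0 for the x slot):

```python
from fractions import Fraction as F
from itertools import product
from math import comb

def symmetric_tensor(coeffs):
    """Rank-n symmetric hypermatrix of P(x) = sum_k coeffs[k] x^k: an entry
    with k of its n indices in the 'x' slot is coeffs[k] / C(n, k)."""
    n = len(coeffs) - 1
    return {idx: F(coeffs[idx.count(0)], comb(n, idx.count(0)))
            for idx in product(range(2), repeat=n)}

def multilinear_form(T, x, y):
    """F(x) with the same vector x = (x, y) placed in every slot."""
    total = F(0)
    for idx, t in T.items():
        term = t
        for i in idx:
            term *= (x, y)[i]
        total += term
    return total

coeffs = [3, -1, 4, 2]  # P(x) = 2x^3 + 4x^2 - x + 3
T = symmetric_tensor(coeffs)
x, y = F(5), F(2)
P = sum(F(c) * (x / y) ** k for k, c in enumerate(coeffs))
print(multilinear_form(T, x, y) == y**3 * P)  # -> True
```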

Now recall the definition of a hyperdeterminant, which is a discriminant for the more general multilinear form of a non-symmetric hypermatrix T given by

F(x₁,…,xₙ) = Σ tₐ…ₕ x₁(a)…xₙ(h)

Of course this hyperdeterminant is equally well defined for the case in point where T is symmetric. Suppose then that Δ = 0, which implies that the form F(x) has a singular point x = ξ. By comparing derivatives we find that the multilinear form F(x₁,…,xₙ) is also singular at the point where xᵢ = ξ for each i. It then follows from the definition that the hyperdeterminant must be zero. So we have that Δ = 0 => det(T) = 0, from which we conclude that det(T) = ΔQ for some polynomial Q in the coefficients of P. Of course if the degree of Δ matches the degree of det(T), as it does in the case of quadratics and cubics, then Q must be a constant.

Actually we can go a little further. If det(T) = 0 then it has a singular point at say xᵢ = ξᵢ, but since T is symmetric any permutation of the vectors gives another singular point xᵢ = ξ_σ(i). In general the hypermatrix does not have multiple singular points, and what we actually have here is just one with xᵢ = ξ. This means that the form F(x) is singular and Δ = 0. So the implication goes both ways: det(T) = 0 <=> Δ = 0. Does this mean that det(T) and Δ must be equal up to a constant? Not quite. What actually happens is that

det(T) = KΔᵐ

for some constant K and integer exponent m. Actually we have not quite proven this, because we have assumed that Δ itself is not a power of some polynomial, in which case m could be rational; however, I think the result is correct.

So there is a link between polynomial discriminants in one variable and hyperdeterminants of 2ⁿ hypermatrices. Of course this result can also be generalised to discriminants for multivariable polynomials, which are similarly related to larger symmetric hypermatrices, but that’s not really the point. The important thing to see is that hyperdeterminants are just generalisations of the familiar discriminants. Discriminants have been much more heavily studied and generalised in other ways. They can be applied in algebraic geometry and used to understand concepts such as ramification. Putting hyperdeterminants in the same context should help us see more clearly where they belong.

I’ll finish by mentioning one last interesting consequence of this result. Since the hyperdeterminant is a constant times a power of the discriminant, it follows that the degree of the hyperdeterminant is a multiple of the degree of the discriminant. In other words the degree of the hyperdeterminant of format 2ⁿ is a multiple of 2n − 2. This may not sound like much, but it is not obvious at this stage what the degrees of the hyperdeterminants are, so this is worth knowing.


Hyperdeterminants and The Levi-Civita Symbol

October 5, 2008

Earlier we established that hyperdeterminants are invariants. The next obvious step is to look at how to construct the invariants and see what they can tell us about hyperdeterminants.

When Cayley discovered hyperdeterminants there was not very much known about the theory of invariants. Today the subject is taught routinely to undergraduates of mathematics and physics in some form. I am going to base this post on the knowledge of invariants as traditionally taught to physicists as tensor analysis without the more abstract approaches that mathematicians prefer.

From this standpoint our hypermatrix is a tensor T of rank n which we may write using indices over its components T_ij…k. Even if you prefer not to think of tensors as a system of components, the notation can be regarded as an abstract way of labelling the vector spaces. Indices corresponding to a dual space are written as superscripts, and the Einstein summation convention for contraction over indices is a way of indicating which spaces act on other spaces. I will assume that the notation is familiar. I still find this clearer and less constrained than the alternative notations devised by mathematicians, including Sweedler notation.

From the theory of tensor analysis we know that all invariants can be constructed using tensor products and contractions over invariant tensors. In the present case, where the relevant group is SL(N), the invariant tensors are the alternating tensors ε_i…k with N indices which are antisymmetric in any pair of indices. The components of this tensor are 0 or ±1 and are uniquely determined. This notation for the alternating tensor is commonly called the Levi-Civita symbol.

For the hypermatrix T_ij…k, the subscript indices label different vector spaces which might be of different dimensions. To construct the invariants we need an alternating tensor for each of these spaces, and we must be careful to contract over indices in the same space. In this way we can construct all invariants of the tensor, and since the hyperdeterminant is an invariant we know that it must be possible to represent it in this way. As an example, the determinant of a 3×3 matrix M would be written

det(M) = (1/6) ε_ijk ε_lmn M_il M_jm M_kn
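Spelled out in code for a sample matrix:

```python
from itertools import product

def eps3(i, j, k):
    """Three-dimensional Levi-Civita symbol."""
    if len({i, j, k}) < 3:
        return 0
    return 1 if (i, j, k) in {(0, 1, 2), (1, 2, 0), (2, 0, 1)} else -1

def det_via_epsilons(M):
    """det(M) = (1/6) eps_ijk eps_lmn M_il M_jm M_kn for a 3x3 matrix."""
    s = sum(eps3(i, j, k) * eps3(l, m, n) * M[i][l] * M[j][m] * M[k][n]
            for i, j, k, l, m, n in product(range(3), repeat=6))
    return s // 6

M = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]
print(det_via_epsilons(M))  # -> 7, the ordinary determinant
```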

All hyperdeterminants are invariants of this form, but the converse is not true. A simple counterexample is the square of the determinant, which is also obviously an invariant. More generally, the sums and products of invariants are also invariants, and in fact the collection of all polynomial invariants forms a ring which is graded over the natural numbers by the degree of the polynomial.

Hilbert proved a fundamental theorem of invariants, called the Hilbert Basis Theorem, that tells us that this ring is always finitely generated. In other words, for any hypermatrix format a finite number of invariants can be used to generate all the invariants. In the case of 2x2x2 hypermatrices, Cayley’s hyperdeterminant is sufficient to generate the entire ring of SL(2)³ invariants, but in general the hyperdeterminant is not enough. For example, if T_ijkl is a 2x2x2x2 hypermatrix, then ε_im ε_jn ε_kp ε_lq T_ijkl T_mnpq is an invariant of degree two, but it is not the hyperdeterminant, which we will see later is actually a much more complex object of degree 24.

In general the problem of finding a complete set of generators of the ring for a given format of hypermatrix (and any relations between them) is an arduous task except in the simplest cases. Furthermore, it is not even easy to form the correct invariant expression for a hyperdeterminant in a systematic way. However, I’ll finish this exercise by giving a treatment for the case of Cayley’s hyperdeterminant.

For a 2x2x2 hypermatrix the only independent expression that might give an invariant of degree two is

X = ε_im ε_jn ε_kp T_ijk T_mnp

But we can commute the components, relabel indices and alternate the tensors …

X = ε_im ε_jn ε_kp T_ijk T_mnp = ε_im ε_jn ε_kp T_mnp T_ijk = ε_mi ε_nj ε_pk T_ijk T_mnp = −ε_im ε_jn ε_kp T_ijk T_mnp = −X

So this attempt to form an invariant just gave us an expression that must be identically zero. By the way, the similar invariant we cited for the 2x2x2x2 format above is not identically zero. As a general rule we must always be careful in this game to check that any expression we give is not zero, and it is not always obvious. It can also happen that two apparently different expressions give the same result, or even that a complicated expression unexpectedly factorises into a product of two simpler ones. Beware these pitfalls.
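A brute-force check that this contraction really does vanish (here for one random integer hypermatrix, though the argument above shows it vanishes for all):

```python
from itertools import product
import random

# 2-dimensional Levi-Civita symbol as a dictionary
eps = {(0, 0): 0, (0, 1): 1, (1, 0): -1, (1, 1): 0}

random.seed(3)
T = [[[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
     for _ in range(2)]

X = sum(eps[i, m] * eps[j, n] * eps[k, p] * T[i][j][k] * T[m][n][p]
        for i, j, k, m, n, p in product(range(2), repeat=6))
print(X)  # -> 0, as the antisymmetry argument predicts
```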

There is no way of forming invariant expressions of odd degree for the 2x2x2 hypermatrix. We can see this because the alternating tensor has two indices. Recall also how we showed that the dimension of the vector spaces must divide the degree when we demonstrated that the hyperdeterminant is an invariant; this result can be extended to any invariant. So the lowest degree for invariants in this case is quartic, and the possible expressions can be quickly reduced to these four

W = ε_ae ε_im ε_bj ε_fn ε_cp ε_gk T_abc T_efg T_ijk T_mnp

X = ε_ae ε_im ε_bf ε_jn ε_cp ε_gk T_abc T_efg T_ijk T_mnp

Y = ε_ae ε_im ε_bj ε_fn ε_cg ε_kp T_abc T_efg T_ijk T_mnp

Z = ε_ae ε_im ε_bj ε_fn ε_ck ε_gp T_abc T_efg T_ijk T_mnp

You can see that the first expression is more symmetrical than the other three, so does this mean that W is the hyperdeterminant, which we know to be symmetrical? There are relationships between expressions involving alternating tensors, e.g.

ε_ae ε_im + ε_ai ε_me + ε_am ε_ei = 0

That this expression is identically zero can be verified by noticing that it is antisymmetric under exchange of any two indices, which is not possible for a non-zero rank four tensor in two dimensions.

Using this identity we can show that

W – X + Y = W – Y + Z = W – Z + X = 0

which implies that

W = 0 and X = Y = Z

So our guess based on symmetry was wrong and in fact the hyperdeterminant is given by the other expressions.
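All of this can be confirmed by brute force. In the sketch below each invariant is specified by listing which of the twelve index slots (a b c e f g i j k m n p, numbered 0 to 11) each epsilon joins, so the pairings correspond directly to the four expressions above:

```python
from itertools import product
import random

# 2-dimensional Levi-Civita symbol as a dictionary
eps = {(0, 0): 0, (0, 1): 1, (1, 0): -1, (1, 1): 0}

def quartic_invariant(T, pairing):
    """Contract four copies of a 2x2x2 hypermatrix T with six epsilons.
    `pairing` lists the pair of index slots (0..11) joined by each epsilon."""
    total = 0
    for idx in product(range(2), repeat=12):
        term = 1
        for u, v in pairing:
            term *= eps[idx[u], idx[v]]
            if term == 0:
                break
        if term == 0:
            continue
        a, b, c, e, f, g, i, j, k, m, n, p = idx
        total += term * T[a][b][c] * T[e][f][g] * T[i][j][k] * T[m][n][p]
    return total

W = [(0, 3), (6, 9), (1, 7), (4, 10), (2, 11), (5, 8)]  # eps_ae eps_im eps_bj eps_fn eps_cp eps_gk
X = [(0, 3), (6, 9), (1, 4), (7, 10), (2, 11), (5, 8)]  # eps_ae eps_im eps_bf eps_jn eps_cp eps_gk
Y = [(0, 3), (6, 9), (1, 7), (4, 10), (2, 5), (8, 11)]  # eps_ae eps_im eps_bj eps_fn eps_cg eps_kp
Z = [(0, 3), (6, 9), (1, 7), (4, 10), (2, 8), (5, 11)]  # eps_ae eps_im eps_bj eps_fn eps_ck eps_gp

random.seed(4)
T = [[[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
     for _ in range(2)]
print(quartic_invariant(T, W))  # -> 0
print(quartic_invariant(T, X) == quartic_invariant(T, Y) == quartic_invariant(T, Z))  # -> True
```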