Archive for September, 2008

Hyperdeterminants and Elliptic Curves I

September 29, 2008

One of the connections that makes hyperdeterminants very interesting is their link with elliptic curves. Elliptic curves are connected to may other things of interest including the proof of Fermat’s lat theorem and monstrous moonshine and their unexplored link with elliptic curves may throw new light on these areas.

I wanted to introduce this link as early as possible in the blog so I’ll do it now before I have mentioned a number of other basic features of hyperdeterminants. In fact there are two independent ways to make the link, one with Cayleys 2x2x2 hyperdeterminant and another with the 2x2x2x2 hyperdeteriminant. I’ll start with the former.

An elliptic curve is a name for an equation that is usually put into this form

y2 = px3 +  qx2 + rx + s

A general problem associated with elliptic curves is to find solutions in rationals when p,q,r and s are given rationals. There is a group theoretic element to the solutions in that they form an abelian group under a certain operation. I am going to assume that you know enough about these things already.

For this excersize I am going to look at a specific class of elliptic curves which have a solution at x = 0 and for which the cubic side of the equation fully factorises. This means we can write it in the form

y2 +  4(bx –  n)(cx – k)(ex – m) = 0,  where nkm = t2

My aim is to try and rewrite this in a form that is as symmetric as possible, so i’ll satisfy the last condition by setting n = fd,   k = dg,  m = fg, so that t = fdg and the equation can be written concisely as

y2 +  4(bx –  fd)(cx – dg)(ex – fg) = 0

This gives us a symmetry corresponding to S3 the symmetric group on three elements. The symmetry is realised by an action on the “known” coefficients by applying the same permutation to (b,c,e) and to (g,f,d). Of course, our elliptic curve originally had four coefficients so by increasing the number to six we must have introduced a two parameter redundancy. This symmetry is generated by transformations with parameters r and s as follows

b -> rb, f -> rf, c -> c/r, g -> g/r

b -> sb, d -> rd, e -> e/s, g -> g/s

These are transformations on the known coefficients and there does not appear to be any symmetry between the two unknown parameters x and y. However, it is also possible to make elliptic curves symmetrical in two unknowns by writing them in this form

u2v2 + 2αuv + 2βu + 2γv + δ = 0

In this form the equation is symmetrical under exchange of the two unknowns u and v if the fixed coefficients β and γ are also swapped. The standard form for an elliptic curve can be recovered by completing the square with respect to the equation as a quadratic in u. This leads us to rewrite it as

(uv2 + αv + β)2 + 2γv3 + (δ -α2)v2 – 2αβv – β2 = 0

This form can be identified with our original elliptic curve if we make the equations

x = v

y = uv2 + αv + β

4bce = 2γ

4(bcfg + bedg + cefd) = α2

4fdg(bg + cf + ed) = -2αβ

2fdg = β

These can then be inverted to give as the coefficients α, β, γ and δ in terms of b,c,d,e and f.  Substituting back into the equation for u and v we get

u2v2 – 2(bg + cf + ed)uv + 4fdgu + 4bcev + (bg + cf + ed)2 – 4(bcfg + bedg + cefd) = 0

expanding and rearranging gives

u2v2 + b2g2 + c2f2 + e2d2 – 2bguv – 2cfuv – 2eduv – 2bcfg – 2bedg – 2cefd – 4fdgu + 4bcev = 0

It is clear from this expression that we now have more symmetry than we were originally looking for. We set up an S3 symmetry acting on the coefficients and a Z2 symmetry in the varaibles, but we now also have symmetries in the expression that interchange the coefficients and the variables. And of course you will recognise the expression as Cayley’s hyperdeterminant in the cube formed by the eight variables. The symmetry group of the expression is therefore at least the symmetry of the cube which is isomorphic to S4. In fact, as we shall see next, the hyperdetermiant has a much larger symmetry even than that because the continuous symmetries have also expnaded. We shall look at what the fullsymmetry is shortly.



Cayley’s Hyperdeterminant

September 29, 2008

Hyperdeterminants were discovered by the English mathematician Arthur Cayley in 1845. He worked out the formula for the most simple hyperdeterminant that is not a determinant, i.e. the 2x2x2 hyperdeterminant. Although he also looked at larger examples we now refer to this specific case as Cayley’s hyperdeterminant. If you fancy reading Cayley’s original work “On the Theory of Linear Transformations” you will find it in  “The collected mathematical papers of Arthur Cayley Vol 1” which can be downloaded from .

The 2x2x2 array A has 8 components which we refer to either using index notation of individual letters 

a111 = a
a112 = b
a121 = c
a122 = d
a211 = e
a212 = f
a221 = g
a222 = h

Then Cayley’s hyperdetermiant written exactly as he did himself is

det(A) = a2h2 + b2g2 + c2f2 + d2e2 – 2ahbg – 2ahcf – 2ahde – 2bgcf – 2bgde – 2cfde + 4adfg + 4bech

It is useful to see how this is derived from the definition of the hyperdeterminant. Recall that we start with a form

F = Σ aijkx(i)y(j)z(k)

This has a stationary point when simulataneously

Σ aijkx(i)y(j) = 0
Σ aijkx(i)z(k) = 0
Σ aijky(j)z(k) = 0

Consider a matrix M(z) with components mij = Σ aijkz(k). Then in matrix notation the last two conditions read xTM(z) = 0 and M(z)y = 0, and this can be realised iff det(M(z)) = 0. Since det(M(z)) is quadratic in the components z(k) it can also be written as a symmetric quadratic form in the two components u = z(1), v = z(2) 

det(M(z)) = (ag – ce) u2 + (ah + bg – cf – de) uv + (bh – df) v2

The nature of the solutions depends on the discriminant of this equation

disc. = (ah + bg – cd -de)2 – 4(ag – ce)(bh – df)

Which can be expanded to confirm that it is Cayley’s hyperdeterminant given above. If this is non-zero then there are two solutions of det(M(z)) = 0 up to scale factors over the complex numbers, but if it is zero there is just one. Dealing with the non-zero case first we label the two soltuions as z1 and z2 . The corrseponding eigenvectors of M(z) will be x1, x2, y1, y2, labelled in reverse so that

x2TM(z1) = x1TM(z2) = 0

M(z1)y2 = M(z2)y1 = 0

It is also possible to work through similar arguments by starting with the application of x or y instead of z. The discriminants in these cases are again the same hyperdeterminant which we know to be non-zero and the same vectors are the solutions. It follows that x1 is distinct from x2 and y1 is distinct from y2

From this we can see that M(z1) = f x*2y*2 where the star operator is a dualization defined by (u,v)* = (v,-u) (so that z.z* = 0), and f is a non-zero scale factor. Likewise we find that the original array can be written as the sum of two products

A = f1 x*1y*1z*1 + f2 x*2y*2z*2

Where f1and f2 are non-zero. Once we know that the hypermatrix can be written in this form it is easy to see that we can satisfy any two of the three equations for a singular point at a time, but not all three at once. So if the hyperdeterimant is non-zero then the hypermatrix is non-singular. I’ll leave it as an excersize to prove the converse.

The Determinant is a Hyperdeterminant of a Matrix

September 27, 2008

In the previous post we defined the hyperdeterminant. To illustrate how the definition works and to justify using the same notation as we do for determinants, we need to work out the hyperdeterminant for an N x M matrix A (which may not be square) and show that it is the equivalent to the definition of the determinant for this case.

We start with the bilinear form of the matrix A over two vector spaces of dimension N and M which can be written is matrix notation

F(x,y) = xTAy

F has a stationary point at (x,y) when

∂F/∂x = Ay = 0      and       ∂F/∂y = xTA = 0

The point is a singularity if x0 and y0. From the well known theory of matrices we know that this is equivalent to saying that A has both a left and right eigenvector with zero eigenvalue. When the matrix A is square (M = N), this happens if and only if A is singular, i.e. det(A) = 0. This confirms that the definition of a singular hypermatrix when applied to square matrices is equivalent to the usual definition of a singular matrix. It also tells us that the hyperdeterminant for a square matrix only exists when the matrix is square and that it must be the determinant times some factor. To confirm that the factor is one we must see that the coefficient of the first term in the determinant is one when the terms are ordered as required in the definition of the hyperdeterminant. This follows from the formula for the determinant which is

det(A) = Σ sign(σ)A1 σ(1) … AN σ(N)

Where the sum is over all permutations σ which permute the values of the indices and sign(σ) = ±1 is the signature of the permutation. The first term under the given ordering of terms is the diagonal term (+1) A11…ANN which has coeficient one as required.

In the case of a non-square matrix with N < M we can still get singular points, but the singular matrices are determined by more than one polynomial equation (M-N+1 of them in fact) these polynomials are determinants of N x N submatrices of the N x M matrix. Because the set of singular matrices in this case is determined by more than one polynomial equation there is no hyperdeterminant.

Hyperdeterminant Definition

September 27, 2008

There is more than one way to define hyperdeterminants but I think the most natural is to define it as a discriminant. Hyperdeterminants are defined for tensors T, that are an element of the tensor product of a number of finite dimensional vector spaces:

T ∈ V1 ⊗ … ⊗ Vn

The vector spaces don’t have to have the same dimension. We’ll use the symbol Ni for the dimension of vector space Vi. In case you don’t like thinking in terms of tensors you can just regard T as a multidimensional array of numbers of size N1 x … x Nn. Sometimes it is called a hypermatrix to emphasize that it is a generalization of a matrix which is an array of N1 x N2 numbers. The components of the array can be written with an index notation Tabc…hIf on the other hand you like to maximise your mathematical sophistication you might prefer to think of the vector spaces as projective spaces and use a Segre embedding in place of the tensor product. You can do that because everything we do is homogeneous and works up to a non-zero factor multiplying the vectors and hypermatrices. Since this is a relatively unsophisticated blog I will mostly talk in terms of hypermatrices but I will sometimes use the “format” of a hypermatrix which is defined as the n-tuple of numbers (N1-1,…,Nn-1). This notation reflects the importance of the dimension of the projective spaces which are one less than the dimension of the corresponding vector spaces.

I should also clarify what I mean by “numbers” as components of the hypermatrix. They can actually be elements of any field but to keep things simple Let’s just think of them as complex numbers for the purposes of this definition. Once we have constructed hyperdeterminants as polynomials we can happily stick elements of other fields or even rings into the components of the hypermatrix and freely use the hyperdeterminant for those too.

Now here’s the definition. Using our tensor/hypermatrix T we can define a function F from an n-tuple of vectors (x1,…,xn) to the real numbers by contracting the vectors over the tensor. These vectors are elements of the dual vector spaces Vi*. In less sophisticated terms, F is just a multi-linear polynomial that can be written as a sum over component indices

F(x1,…,xn) = Σ ta…h x1(a)…xn(h)

(We use superscripts rather than subscripts for the indices of the vectors because they are elements of the dual space. They are in brackets to avoid confusion with power exponents which we will also be using later).

A stationary point of this function is one where all the partial derivatives with respect to all the components of the vectors is zero. We call a stationary point a singularity if all the vector arguments are non-zero. For a given hypermatrix there is not always a singularity present. When there is at least one singularity we say the hypermatrix is singular. In the vector space of all possible hypermatrices with a given format, the ones which do have a singularity may fall on a surface described by a single equation given by the zero of a polynomial in the components of the tensor, P(T) = 0. In other words the polynomial is a discriminant for the singular hypermatrix. When this happens the polynomial is the hyperdeterminant of the hypermatrix times a factor. This condition defines the hyperdeterminant up to a factor.

To complete the definition the factor has to be fixed. This can be done either by specifying the value of the hyperdeterminant for one specific hypermatrix of each format or by fixing one of the coefficients of the polynomial. Both of these options are a little messy but I’ll choose the latter. The components of the tensor can be ordered into a sequence using the numerical order of the indices like this (T1 1…1, T2 1…1, … , T1 2…1, T2 2…1…). The monomial terms in the hyperdeterminant are a coefficient times powers of these components, so these monomials can also be ordered to put the ones first that have the highest powers in the components of the hypermatrix that come first in the sequence above. Once that has been done we fix the hyperdeterminant by specifying that the coefficient of the first non-zero term in this ordering is one.

That completes the definition and it just remains to say that the hyperdeterminant is denoted using the same notation as for the determinant det(T) = |T|. To make sure we are being consistent we must demonstrate that the definition of the hyperdeterminant is the same as the determinant for the special case of matrices. That will be the next post.


September 25, 2008

This may turn out to be one of the least read blogs in all the blogosphere but never mind.  It is going to be all about mathematical structures called hyperdeterminants and other things related to them. I think it is fair to say that even most mathematicians fall asleep when the subject of hyperdeterminants is mentioned but actually they are really interesting. They just tend to hide their best assets in dark corners where few people find them.

Hyperdeterminants are a generalisation of determinants to multi-dimensional arrays.  I’ll be explaining more about how they are defined and what their properties are in later posts. If you don’t already know what determinants are then I’m afraid you may not ready for this blog. But don’t despair. You can turn to wikipedia to read up on determinants or anything else unfamilair that comes up.

Any mathematicians who has been around awhile knows of many surprising connections that have been found between parts of mathematics that at first seemed separate. Examples include modular forms, elliptic curves, exceptional groups, lattices, coding theory, special functions, cohomology, octonions and even quantum physics and superstrings. Many of these beautiful subjects have deep connections that suggest the existence of some unknown master structure that encompasses them all in some elegant way. Hyperdeterminants are not the answer but they are a missing link that might help us find it.  They are in fact connected with all the above things in remarkable and unexpected ways some of which are only known to a very small group of researchers, so if you want to know more you had best follow this blog.