Archive for June, 2009

Diophantine Quadruples and Symmetric Matrices

June 29, 2009

It’s funny how sometimes a simple but useful mathematical trick can go unnoticed for a long time, yet be obvious when it is pointed out. Here is a good example.

A few days ago I posted here about Diophantine quadruples, which are sets of four different non-zero numbers (rational or integer) such that the product of any two is one less than a square. The problem of finding them has been around since Diophantus and has been looked at by some of the greatest classical number theorists including Fermat and Euler, as well as 20th-century specialists in number theory such as Baker and Davenport. In the last few years the literature on the subject has been steadily growing. I myself have been thinking about the problem on and off since I heard about it as a teenager 30 years ago. Well, we all seem to have missed an easy trick for finding them.

Actually this method applies to a generalization of the problem, which is to find sets of four distinct positive integers such that the product of any two falls short of a square by a fixed number n, i.e. the product of any two plus n is always a square. These quadruples are said to have the property D(n). For n = 1 there is an infinite set of them called regular quadruples that can be constructed recursively. There are probably no irregular examples but that has not been settled yet. When n is not a square the problem is more difficult, with only a few examples known for each n. For the state of the art you should consult the references listed by Andrej Dujella.
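To make the definition concrete, here is a minimal Python sketch that tests a set for the property D(n). The function names are my own:

from math import isqrt

def is_square(k):
    # non-negative integer test for perfect squares
    return k >= 0 and isqrt(k) ** 2 == k

def has_property_D(nums, n):
    # the product of every pair, plus n, must be a perfect square
    return all(is_square(p * q + n)
               for i, p in enumerate(nums) for q in nums[i+1:])

print(has_property_D([1, 3, 8, 120], 1))   # True: Fermat's D(1) quadruple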

So let’s suppose we have a quadruple (a,b,c,d) with property D(n):

ab = x^2 - n
ac = y^2 - n
bc = z^2 - n
ad = r^2 - n
bd = s^2 - n
cd = t^2 - n

Now form a symmetric matrix with the quadruple down the diagonal and the square roots as the corresponding off-diagonal elements:

          ( a  x  y  r )
   D =    ( x  b  z  s )
          ( y  z  c  t )
          ( r  s  t  d )

Next form the Adjoint matrix. For non-singular matrices this is the inverse times the determinant.

Adj(D) = D^-1 det(D)

The elements of this matrix can also be constructed from the 3×3 minor determinants of D and this works even when D is singular. It is therefore a matrix of integers just like D, and it is symmetric.

            ( k  u  v  w )
Adj(D) =    ( u  l  e  f )
            ( v  e  m  g )
            ( w  f  g  p )

The 2×2 minors of the adjoint of a 4×4 matrix are given by the complementary 2×2 minors of the original matrix times its determinant, e.g.

kl - u^2 = (cd - t^2) det(D)

This works for all the minors, but we only need to look at the principal minors. The result should now be obvious. Since (cd - t^2) = -n, it follows that kl - u^2 = -n det(D). Since the same is true for each principal 2×2 minor, the quadruple (k,l,m,p) is also a Diophantine quadruple with the property D(n det(D)), although this could fail in specific cases if two of the numbers are the same or are not positive.

In most cases this means that if you give me a Diophantine quadruple, I can give you another one just by inverting its matrix. In fact I can give you (potentially) eight of them! This is because I can reverse the signs of the square roots and there are eight distinct cases that will usually give eight different quadruples. Obviously this process can be repeated by flipping signs in the adjoint and inverting again.
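Here is a rough Python sketch of the whole trick (all helper names are mine). It builds D from a D(n) quadruple for each of the eight inequivalent sign choices, forms the adjoint from signed 3×3 minors, and reads off the new quadruple and its modulus:

from itertools import product
from math import isqrt

def minor(M, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(M) if k != i]

def det(M):
    # determinant by cofactor expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def adjoint(M):
    # entries are signed 3x3 minors, so this works even when det(M) = 0
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(len(M))]
            for i in range(len(M))]

a, b, c, d, n = 1, 3, 8, 120, 1
x, y, r, z, s, t = (isqrt(p * q + n) for p, q in
                    [(a, b), (a, c), (a, d), (b, c), (b, d), (c, d)])

for e1, e2, e3 in product([1, -1], repeat=3):   # 8 inequivalent sign choices
    D = [[a, x, y, r],
         [x, b, e1 * z, e2 * s],
         [y, e1 * z, c, e3 * t],
         [r, e2 * s, e3 * t, d]]
    quad = [adjoint(D)[i][i] for i in range(4)]
    print(quad, "has property D(%d)" % (n * det(D)))

For Fermat’s quadruple the all-plus sign choice gives det(D) = 1 and the degenerate set (-1, -3, -8, 0), which is exactly the kind of failure mentioned above.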

By the way, I only discovered this trick because I first realised some years ago that regular quadruples can be arranged on a 2x2x2 hypermatrix with zero hyperdeterminant. Then recently some papers appeared showing that the principal minors of a symmetric matrix also give a singular hypermatrix. Putting the two together made me look at the matrix generated from a quadruple and its inverse.

It may not be a revolutionary result in mathematics but it does show how easy it is to miss simple tricks even for problems that have been around for a long time, and it is just possible that this trick could provide a new attack on some of the still unsolved problems concerning Diophantine quadruples.

Cayley’s Hyperdeterminant and Principal Minors

June 25, 2009

Just when you think you must know everything about a mathematical structure you suddenly start to find out all sorts of new things about it. This has been happening to me recently with Cayley’s hyperdeterminant and I plan to blog about some of the things here.

First off is a surprising relationship between the principal minors of a 3×3 symmetric matrix and the hyperdeterminant that has been reported recently in arXiv:math/0604374 by Olga Holtz and Bernd Sturmfels (and possibly others).

The principal minors of a matrix are the determinants you are left with when you delete rows and columns of the original matrix passing through diagonal elements. The principal minors for a 3×3 matrix

          ( a   x   y )
    M =   ( u   b   z )
          ( v   w   c )

are

a_111 = 1
a_211 = a
a_121 = b
a_112 = c
a_221 = ab - xu
a_212 = ac - yv
a_122 = bc - wz
a_222 = det(M)

These are eight numbers which naturally fall on the corners of a cube to form a hypermatrix A, so what is its hyperdeterminant? This can be worked out by hand to yield the answer,

det(A) = (yuw - xzv)^2

In particular, if the matrix is symmetric the hyperdeterminant is zero. [The same is true for any symmetric matrix which is transformed by pre- or post-multiplication by a diagonal matrix.]
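This is easy to check by machine. The following Python sketch (the expansion of Cayley's 2x2x2 hyperdeterminant is the standard one; the variable names are mine) builds the principal minor hypermatrix of a random 3×3 matrix and compares the two sides:

import random

def hyperdet(h):
    # Cayley's hyperdeterminant of a 2x2x2 hypermatrix h[i][j][k]
    (a000, a001), (a010, a011) = h[0]
    (a100, a101), (a110, a111) = h[1]
    return (a000**2*a111**2 + a001**2*a110**2 + a010**2*a101**2 + a100**2*a011**2
            - 2*(a000*a001*a110*a111 + a000*a010*a101*a111 + a000*a100*a011*a111
                 + a001*a010*a101*a110 + a001*a100*a011*a110 + a010*a100*a011*a101)
            + 4*(a000*a011*a101*a110 + a001*a010*a100*a111))

a, x, y, u, b, z, v, w, c = (random.randint(-9, 9) for _ in range(9))
detM = a*(b*c - z*w) - x*(u*c - z*v) + y*(u*w - b*v)

# the principal minors placed on the corners of a cube (index 1 -> 0, 2 -> 1)
A = [[[1, c], [b, b*c - w*z]],
     [[a, a*c - y*v], [a*b - x*u, detM]]]

print(hyperdet(A) == (y*u*w - x*z*v) ** 2)   # always True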

Why is this so interesting? Well, the symmetry of a 3×3 matrix is preserved under SO(3) transformations applied simultaneously to rows and columns. The determinant a_222 is invariant (and of course a_111 is trivially invariant). The 2×2 principal minors appear, up to a factor of det(M), in the diagonal elements of the inverse of M. So if the eight components of the hypermatrix are augmented with the six off-diagonal elements of the matrix and its inverse, then we seem to have a system of 14 dimensions with some extra symmetry.

With a small adjustment we can extend this to a general 2x2x2 hypermatrix with components a_ijk and with a zero hyperdeterminant. Working in reverse we now define

a = a_211
b = a_121
c = a_112
d = a_222

k = a_122
l = a_212
m = a_221
n = a_111

then

x^2 = ab - mn
y^2 = ac - ln
z^2 = bc - kn

This produces a symmetric matrix

          ( a   x   y )
    M =   ( x   b   z )
          ( y   z   c )

We can also add

r^2 = lm - da
s^2 = mk - db
t^2 = kl - dc

And this gives a second symmetric matrix

          ( k   t   s )
    N =   ( t   l   r )
          ( s   r   m )

Multiplying these matrices together (and resolving sign ambiguities) we get a tidy relation

MN = nd I
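The relation is easy to test: as noted above, the principal minors of a symmetric 3×3 matrix form a singular hypermatrix, so they make a handy test case. Here is a Python sketch (helper names mine, and assuming the sign conventions for r, s, t given above) that recovers M and N up to sign from the eight components alone and then searches the sign choices until MN = nd I:

from itertools import product
from math import isqrt
import random

def det3(M):
    return (M[0][0]*(M[1][1]*M[2][2] - M[1][2]*M[2][1])
            - M[0][1]*(M[1][0]*M[2][2] - M[1][2]*M[2][0])
            + M[0][2]*(M[1][0]*M[2][1] - M[1][1]*M[2][0]))

def matmul3(A, B):
    return [[sum(A[i][q]*B[q][j] for q in range(3)) for j in range(3)]
            for i in range(3)]

# a random symmetric matrix gives a singular hypermatrix of principal minors
a, b, c, x0, y0, z0 = (random.randint(-5, 5) for _ in range(6))
n = 1
m, l, k = a*b - x0*x0, a*c - y0*y0, b*c - z0*z0
d = det3([[a, x0, y0], [x0, b, z0], [y0, z0, c]])

# recover the off-diagonal entries up to sign from the components alone
x, y, z = isqrt(a*b - m*n), isqrt(a*c - l*n), isqrt(b*c - k*n)
r, s, t = isqrt(l*m - d*a), isqrt(m*k - d*b), isqrt(k*l - d*c)

target = [[n*d if i == j else 0 for j in range(3)] for i in range(3)]
for e in product([1, -1], repeat=6):   # resolve the sign ambiguities by search
    M = [[a, e[0]*x, e[1]*y], [e[0]*x, b, e[2]*z], [e[1]*y, e[2]*z, c]]
    N = [[k, e[3]*t, e[4]*s], [e[3]*t, l, e[5]*r], [e[4]*s, e[5]*r, m]]
    if matmul3(M, N) == target:
        print("MN = nd I with nd =", n*d)
        break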

The extra six components also have a simple interpretation. Since the hypermatrix has zero hyperdeterminant we know that it has a system of three vectors which act like the zero eigenvectors of the system. The six components are the components of these vectors.

So what we have found is that the linear system consisting of the 8 components of a singular hypermatrix and the 6 components of its zero eigenvectors forms a system which has the SL(2) x SL(2) x SL(2) symmetry of the hypermatrix and also some extra SO(3) symmetries. So what is the system and what is its full symmetry?

If you know a bit about exceptional algebras you can probably at least guess the answer to this question. The two symmetric 3×3 matrices can be regarded as elements of the Jordan algebra J3(R). Then the matrices M and N and the two numbers n and d can be used to build a six dimensional matrix

    F =   ( nI   M  )
          ( N    dI )

This can be regarded as an element of a Freudenthal Triple System (FTS) which has dimension 14. Its automorphism group is a symplectic group (Sp(6) I think) and the relation MN = nd I is equivalent to det(F) = 0.

As far as I can tell this relationship is only valid between singular hypermatrices and singular elements of the FTS. It does not extend to non-singular elements in any way (please correct me if you think otherwise). I think it is of particular interest concerning the problem of Diophantine quadruples since I can now think of these as special cases of an FTS as well as of a hyperdeterminant.

Diophantine Quadruples

June 25, 2009

My interest in hyperdeterminants arose from investigating a very old problem in number theory that originated with Diophantus himself. In his books on algebra, probably written around 250 AD, Diophantus looked at many equations which he tried to solve in rational numbers. One such problem was to find sets of rational numbers such that the product of any two is one less than a square. I have no idea what motivated him to consider this particular puzzle. It has no obvious interpretation in geometry or any other practical application. He seems to have just invented it as an abstract question. However, I do know that it was well chosen because this problem has hidden symmetries of enormous interest even today and its mysteries are still not fully revealed.

Diophantus provided examples of triplets and quadruples of rational numbers that solved his problem. A neater example was provided by Fermat, who read Diophantus but became more interested in integer solutions to equations than rational ones. He observed that 1,3,8,120 has the required property, with the products being one less than the squares of 2,3,5,11,19 and 31. We still don’t know if there are 5 non-zero integers with this property although Dujella has shown that there can be at most a finite number of them. Euler found that a fifth rational number (777480/8288641) can be added to Fermat’s quadruple giving a set of five. A few years ago I carried out a computer search and found some rational Diophantine sextuples of which the smallest is this one

11/192    35/192   155/27    512/27   1235/48    180873/16
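Both Euler's rational quintuple and this sextuple are easy to verify with exact rational arithmetic. A short Python check (helper names mine):

from fractions import Fraction
from math import isqrt

def is_rational_square(q):
    # a reduced fraction is a square iff numerator and denominator both are
    return (q.numerator >= 0 and isqrt(q.numerator)**2 == q.numerator
            and isqrt(q.denominator)**2 == q.denominator)

def is_diophantine_set(nums):
    nums = [Fraction(v) for v in nums]
    return all(is_rational_square(p*q + 1)
               for i, p in enumerate(nums) for q in nums[i+1:])

print(is_diophantine_set([1, 3, 8, 120, Fraction(777480, 8288641)]))   # True
print(is_diophantine_set([Fraction(11, 192), Fraction(35, 192),
                          Fraction(155, 27), Fraction(512, 27),
                          Fraction(1235, 48), Fraction(180873, 16)]))  # True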

There are still plenty of related unsolved problems and a growing literature on the subject for any number theorist with time on their hands. They might start at the list of references maintained by Dujella at  http://web.math.hr/~duje/ref.html

Of more interest to me these days are the hidden symmetries in this problem. These first become evident when you attack the problem directly. To find two numbers with the property is easy. We just want a and b with

ab = x^2 - 1

In rationals we can just take any non-zero numbers a and x and solve b = (x^2 - 1)/a. A third number would take the form c = (y^2 - 1)/a, with the requirement that bc = z^2 - 1

(x^2 - 1)(y^2 - 1)/a^2 = z^2 - 1

Because the largest part of each side is nearly a square it is natural to look for a perturbation solution with

z = (xy + t)/a

Eliminating z in favour of t and simplifying reduces to

x^2 + y^2 + t^2 + 2xyt = a^2 + 1

The first thing to notice about this equation is that it is quadratic in any of the variables x,y,t. This means that if we have one solution in integers or rationals, then we can find another by looking for the other root of the quadratic in any of the three variables. This can be repeated to give an infinite number of Diophantine triples a,b,c. The second observation is that the equation is symmetric in the variables x,y,t. This is an accidental hidden symmetry that was not apparent originally and it has a useful consequence. It means that given any triplet solution a,b,c we can construct a fourth number d = (t^2 - 1)/a and by the symmetry of the equation this must extend the triplet to a quadruplet with the property of Diophantus.
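In code the extension step is very short. The two roots of the quadratic in t are t = az - xy and -(az + xy), and both feed into d = (t^2 - 1)/a, which depends only on t^2. A Python sketch (my own naming):

from math import isqrt

def extend_triple(a, b, c):
    # assumes ab+1, ac+1, bc+1 are all perfect squares
    x, y, z = isqrt(a*b + 1), isqrt(a*c + 1), isqrt(b*c + 1)
    candidates = []
    for t in (a*z - x*y, a*z + x*y):   # d depends only on t^2
        assert (t*t - 1) % a == 0
        candidates.append((t*t - 1) // a)
    return candidates

print(extend_triple(1, 3, 8))    # [0, 120]: the non-trivial root gives Fermat's 120
print(extend_triple(3, 5, 16))   # [0, 1008]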

Although the equation shows some symmetry it does not reflect the expected symmetry of the problem between the numbers a,b,c,d. This can be remedied by eliminating x,y and t in favour of b,c and d,

(ab+1) + (ac+1) + (ad+1) + 2 sqrt((ab+1)(ac+1)(ad+1)) = a^2 + 1

=> 4(ab+1)(ac+1)(ad+1) = (a^2 - ab - ac - ad - 2)^2

By expanding this, cancelling terms and dividing out a factor of a^2 we get an equation symmetric in a,b,c,d

P(a,b,c,d) = a^2 + b^2 + c^2 + d^2 - 2ab - 2ac - 2ad - 2bc - 2bd - 2cd - 4abcd - 4 = 0

By construction this equation should be capable of being used to extend a Diophantine triple (a,b,c) to a Diophantine quadruple (a,b,c,d). This can be seen explicitly if we use it to solve for d when a,b,c are given. Since the equation is quadratic in d this amounts to completing the square to rewrite it as

P(a,b,c,d) = (d - a - b - c - 2abc)^2 - 4(ab+1)(ac+1)(bc+1) = 0

So if (a,b,c) is a Diophantine triple, meaning that (ab+1), (ac+1) and (bc+1) are integer squares, then the equation can indeed be solved to find an integer d. That (a,b,c,d) is then a quadruple follows from the observation that (ad+1) must be a square because

P(a,b,c,d) = (a - b - c + d)^2 - 4(ad+1)(bc+1) = 0

and similar identities for the other squares.
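These identities are tedious to expand by hand but trivial to confirm symbolically. A quick check with sympy (assumed available):

from sympy import symbols, expand

a, b, c, d = symbols('a b c d')
P = (a**2 + b**2 + c**2 + d**2 - 2*a*b - 2*a*c - 2*a*d
     - 2*b*c - 2*b*d - 2*c*d - 4*a*b*c*d - 4)

# the polynomial as a square minus a product, in three different ways
assert expand(P - ((d - a - b - c - 2*a*b*c)**2 - 4*(a*b+1)*(a*c+1)*(b*c+1))) == 0
assert expand(P - ((a - b - c + d)**2 - 4*(a*d+1)*(b*c+1))) == 0
assert expand(P - ((a + b - c - d)**2 - 4*(a*b+1)*(c*d+1))) == 0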

At this point it appears that the ability to extend a triple to a quadruple is the result of the fact that this polynomial can be written in a number of different ways as the difference of a square and a product of the numbers that need to be squares. This result seems to defy explanation. As if this were not remarkable enough, there is a similar polynomial in five variables that allows any Diophantine quadruple to be extended to a rational Diophantine quintuple. I will return to that in a later post.

So where does this polynomial come from and to what mathematical principles does it owe its properties?

A partial answer to this question is that the expression is a special case of Cayley’s hyperdeterminant for a 2x2x2 hypermatrix with components

a_111 = -a
a_112 = 1
a_121 = 1
a_122 = b
a_211 = 1
a_212 = c
a_221 = d
a_222 = -1

The additional symmetries of the hyperdeterminant reduce the number of mysterious identities that it fulfills, and most of those that remain arise from the derivation of the hyperdeterminant. Yet it is not obvious that this fully explains the relationships and some mystery remains.
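Indeed, with these components Cayley's hyperdeterminant reproduces the polynomial P(a,b,c,d) exactly, as a quick sympy check confirms (I use the standard "square minus four times a product of slice determinants" form of the hyperdeterminant):

from sympy import symbols, expand

a, b, c, d = symbols('a b c d')

def hyperdet(h):
    # Det of a 2x2x2 hypermatrix h[i][j][k], written via its two slices
    (a000, a001), (a010, a011) = h[0]
    (a100, a101), (a110, a111) = h[1]
    return ((a000*a111 + a011*a100 - a001*a110 - a010*a101)**2
            - 4*(a000*a011 - a001*a010)*(a100*a111 - a101*a110))

# the components listed above, with index 1 -> 0 and index 2 -> 1
A = [[[-a, 1], [1, b]], [[1, c], [d, -1]]]

P = (a**2 + b**2 + c**2 + d**2 - 2*a*b - 2*a*c - 2*a*d
     - 2*b*c - 2*b*d - 2*c*d - 4*a*b*c*d - 4)

assert expand(hyperdet(A) - P) == 0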

Hypermatrix Inverses

June 6, 2009

The last post about multiplicative hyperdeterminants used a form of multiplication of hypermatrices where the last direction of one hypermatrix is contracted over the first direction of another. We can write the product using ordinary product notation so A multiplied by B is just AB. This generalises multiplication of matrices and multiplication of matrices with vectors. For two vectors it gives the scalar product.

A more general product can be defined if we contract over more than one of the hypermatrix directions. I will write A(p)B to mean the product of hypermatrices A and B where the last p directions of A are contracted with the first p directions of B in reverse order. When a hypermatrix A of rank r is multiplied by a hypermatrix B of rank s the product A(p)B is a hypermatrix of rank r+s-2p. It is only defined if the dimensions of the last p directions of A match the dimensions of the first p directions of B. If A and B are multi-linear operators on vector spaces then the last vector spaces that A operates on must be the dual spaces of the first vector spaces that B operates on. When a vector space is transformed by a non-singular linear operator, its dual is transformed by the inverse of the operator.
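For concreteness, here is how the product A(p)B might be realised with numpy (my own sketch):

import numpy as np

def gproduct(A, B, p):
    # contract the last p directions of A against the first p of B in
    # reverse order: A's last axis with B's first, and so on inwards
    axesA = list(range(A.ndim - 1, A.ndim - 1 - p, -1))
    axesB = list(range(p))
    return np.tensordot(A, B, axes=(axesA, axesB))

A = np.arange(24).reshape(2, 3, 4)   # rank 3
B = np.arange(20).reshape(4, 5)      # rank 2
print(gproduct(A, B, 1).shape)       # (2, 3, 5): rank 3 + 2 - 2*1 = 3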

The problem we want to look at now is how to invert a hypermatrix so that we can use it to reverse a multiplication. In other words, if C = AB we want to solve for A in terms of C and B. We can do this if we have a hypermatrix D with the same format as B, which inverts B in a suitable fashion. Given any hypermatrix D with the same format as B (of rank r) we can form r matrices by contracting B with D over all but one pair of matching directions. If all these matrices are a non-zero multiple of the identity matrix we call D an inverse of B. In particular we need B(r-1)D = xI and D(r-1)B = yI, but these are just 2 out of the r ways of doing the contraction. Notice that we say “an” inverse rather than “the” inverse because such an inverse may not exist, and if it does it may not be unique (even up to a factor).

Such an inverse can be used to invert the multiplication C = AB to give the solution A = (1/x) C(r-1)D. This does not require all the properties of the inverse, it just requires B(r-1)D = xI. But our more specific definition of inverse is interesting because there is a means to construct such inverses from hyperdeterminants and other invariants.

The term inverse is justified because if D is an inverse of B then B is an inverse of D. This follows immediately from the symmetry in the definition. Furthermore the usual inverse of a non-singular matrix is also an inverse in the hypermatrix sense, but so is any scalar multiple of the inverse. Why not fix the normalisation? Take the trace of B(r-1)D = xI and D(r-1)B = yI and you get B(r)D = x d_1 = y d_r (where d_i is the dimension of the hypermatrix in the i-th direction) and similarly for the other directions of the hypermatrix. If the hypermatrix is hypercubical, so that all the directions have the same dimension, then we could fix x = y = 1, but that won’t work for more general hypermatrix formats. Instead we would have to fix x = 1/d_1, y = 1/d_r etc., but that would be inconsistent with the inverse of a matrix. The best compromise is to not fix the normalisation at all.

Another plus for this definition is that the set of all inverses (with zero included) forms a vector space. The dimension of this vector space is an important attribute of the hypermatrix format and is not easy to calculate even for simple formats. What makes this interesting is that we can form inverses using invariants. This generalises the expression for the inverse in terms of its determinant, which is,

             A^-1 = (1/det A) (d/dA) det A

If A is a hypermatrix of rank r and K(A) is an invariant of degree m, then the inverse of A with respect to K is defined as

             Inv_K(A) = (1/K(A)) (d/dA) K(A)

If we contract A with this inverse over all directions we find that

             A(r)Inv_K(A) = m

This result is just Euler’s theorem for homogeneous functions, so it holds for any homogeneous multivariate polynomial of degree m, but we can use the invariance property of K(A) to derive the stronger result that this inverse is an inverse in the sense defined previously. To see this, consider a linear transform of A in which its last direction is transformed by a linear operator, δA = AE where E is small.

      δK(A) = Tr( (d/dA)K(A) (r-1) δA ) = K(A) Tr( Inv_K(A) (r-1) AE ) + O(E^2)

But by the transformation rule for invariants we get

     K(A) -> K(A) det(I+E)^(m/d)   (where d is the dimension of the last direction)

     => δK(A) = (m/d) K(A) Tr(E) + O(E^2)

By bringing these together we conclude that

           Inv_K(A) (r-1) A = (m/d) I

Similar results hold for the other directions, which confirms that Inv_K(A) is an inverse of A.
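As a sanity check, here is a sympy sketch verifying this for Cayley's 2x2x2 hyperdeterminant, which has rank r = 3, degree m = 4 and dimensions d = 2, so that (m/d)I = 2I. It contracts the gradient against A over two matching directions:

from sympy import symbols, diff, expand

s = symbols('a000 a001 a010 a011 a100 a101 a110 a111')
A = [[[s[0], s[1]], [s[2], s[3]]], [[s[4], s[5]], [s[6], s[7]]]]

def hyperdet(h):
    (a000, a001), (a010, a011) = h[0]
    (a100, a101), (a110, a111) = h[1]
    return ((a000*a111 + a011*a100 - a001*a110 - a010*a101)**2
            - 4*(a000*a011 - a001*a010)*(a100*a111 - a101*a110))

K = hyperdet(A)
G = [[[diff(K, A[i][j][k]) for k in (0, 1)] for j in (0, 1)] for i in (0, 1)]

# K(A) * Inv_K(A) = G, so contracting G with A over two matching directions
# should give (m/d) K(A) I = 2 K(A) I
for i in (0, 1):
    for l in (0, 1):
        contraction = expand(sum(G[i][j][k] * A[l][j][k]
                                 for j in (0, 1) for k in (0, 1)))
        assert contraction == expand(2 * K if i == l else 0)
print("the contraction of Inv_K(A) with A gives 2 I, as expected")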

That is all I have to say about inverses for now, but I’ll leave you with a few questions to consider. Firstly, I have not demonstrated that Inv_K(Inv_K(A)) = A. How can this be proved in general? Secondly, the ring of invariants for a given format is finitely generated. Given a set of generators, we get a set of inverses. Are these linearly independent and do they generate the full vector space of inverses? I don’t really know the answers to these questions so if you do know something, please leave a comment about it.