Update on Popularity of “Hyperdeterminant”

May 16, 2010

I am sure you all remember how I looked into the unpopularity of the word “hyperdeterminant” according to the hit counts returned by a Google search. It came second from bottom in a random list of mathematical terms.

That was nearly two years ago, so I thought it would be interesting to recount and see how things have changed. Here are the results showing the counts then and now, along with the percentage rise:

cubicuboctahedron 745 3790 408.72%
hyperdeterminant 4480 20700 362.05%
antiprism 30100 116000 285.38%
disphenoid 4620 16900 265.80%
circumsphere 11100 31700 185.59%
profinite 36600 101000 175.96%
octonion 33000 74800 126.67%
zonohedra 4770 10800 126.42%
multilinear 334000 743000 122.46%
grassmanian 105000 176000 67.62%
endomorphism 204000 288000 41.18%
functor 584000 822000 40.75%
pseudodifferential 202000 284000 40.59%
hyperbola 411000 571000 38.93%
quartic 391000 508000 29.92%
quadratic 6120000 7380000 20.59%
tensor 6100000 7220000 18.36%
hypergeometric 1360000 1600000 17.65%
tetrahedron 2190000 2560000 16.89%
polynomial 6990000 7820000 11.87%
automorphism 600000 668000 11.33%
contravariant 162000 178000 9.88%
dodecahedron 278000 302000 8.63%
quaternion 469000 492000 4.90%
elliptic 4170000 4280000 2.64%
pfaffian 56200 54600 -2.85%
cobordism 79000 74700 -5.44%
determinant 7670000 7240000 -5.61%
discriminant 1940000 1800000 -7.22%
diophantine 369000 323000 -12.47%
polyhedron 963000 830000 -13.81%
algebra 28300000 22500000 -20.49%
cohomology 951000 753000 -20.82%
covariant 1020000 793000 -22.25%
polytope 294000 228000 -22.45%
toric 1780000 1270000 -28.65%

As you can readily see “hyperdeterminant” is the second fastest riser! It is now merely fourth from the bottom when ordered by word count.
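For anyone who wants to reproduce the last column, the rise is just the ratio of the new count to the old one, minus one (a quick Python check using the figures tabulated above):

```python
# Percentage rise = (new_count / old_count - 1) * 100,
# checked here against the two fastest risers in the table.
def rise(old, new):
    return round((new / old - 1) * 100, 2)

assert rise(745, 3790) == 408.72    # cubicuboctahedron
assert rise(4480, 20700) == 362.05  # hyperdeterminant
```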

Transcript for the String Wars Video

October 26, 2009


Some people wanted a transcript of the video.

Scene 1: On the street

QG: Hello there anonymous string theorist. What are you working on these days?

ST: Hello anonymous quantum gravitist. I am working on string theory, but you knew that.

QG: Really? I thought it was M-theory now.

ST: It’s true that it is now a theory about M’s, but we still call it string theory because there are so many string theory departments and it is too expensive to change their names.

QG: Ho Ho Ho. The truth is that most of us have realised that string theory is a failed theory and are moving on to new areas based on reality, such as cosmology and nuclear physics.

ST: No. The truth is that string theory is so successful that it is being applied even in cosmology and nuclear physics. String theory predicts a vast landscape of multiverses that explains why the cosmological constant is small, thanks to the anthropic principle as predicted by Weinberg. In nuclear physics the A-D-S/C-F-T duality means that we can now use string theory to understand the strong nuclear force. This shows that string theory is the only game in town and you are a crackpot.

QG: The application to nuclear physics is just an approximation scheme unrelated to the extravagant idea that string theory is a unified theory of everything in higher-dimensional space-times. The problem is that the landscape makes no predictions and we have no reason to believe in it. It’s not a real testable scientific theory; rather it is an untestable end-point of a failed idea, and you are so wrong that you are not even wrong.

ST: Back on the planet where I live, the Large Hadron Collider will find supersymmetry next year with a probability of 90 percent. Once supersymmetry has been verified, the correctness of supergravity is theoretically overwhelming and superstring theory follows as the unique logical completion of physics.

QG: There is nothing unique about string theory in physics. Loop Quantum Gravity is just one example of many alternatives that make testable predictions without the need for unsupported inventions such as higher dimensions and supersymmetry. Do I have to remind you that supersymmetry was predicted to be discovered at Fermilab by 2000? We are still waiting. String theory is so weak that it can be made to fit any lack of evidence, and aging string theorists will continue to support it no matter how compelling its failure becomes.

ST: Even slightly retarded children know that Loop Quantum Gravity predicted a violation of Lorentz Invariance that was ruled out by the Fermi gamma ray telescope. I am afraid that your IQ has descended below the level of a retarded monkey if you believe anything can match the successes of string theory. Crackpot.

QG: Loop Quantum Gravity has progressed and no longer predicts a violation of Lorentz Invariance. Well, I’d love to continue with this high level of discussion, but kindergarten breaktime is up, so you will have to go.

ST: Yes I am sure there are more useful things you could be doing like lying down on the freeway. Crackpot.

QG: Actually, I am on my way to do a television interview on “science question time”.

ST: That is interesting. So am I.

Scene 2: In the TV studio

I: Hello and welcome to science question time. Today I have with me anonymous string theorist and anonymous quantum gravitist. Guys, you look very similar. Am I mistaken or are you twins?

QG: Absolutely not

ST: We are completely unrelated

I: Err, okay. String Theorist, let me ask you a question first. Next month the Large hey-dron colly-door will start searching for new physics. What do you expect to see?

QG: Ho Ho Ho.

I: Why are you laughing?

ST: It’s because you called it a hey-dron colly-door. I think you mean hadron collider. Never mind. We don’t expect journalists like you to know much about science, so I will try to answer your question in very simple terms. Basically, we expect to see particles of supersymmetry, and after that, evidence for large extra dimensions. This will confirm the verity of string theory.

QG: If I may interrupt. The only thing that we can predict with any possible certainty is the discovery of the Higgs bose-on. The rest is pure speculation and hype from string theorists who forget to mention its complete lack of testability.

I: So then, tell me quantum gravitist, what do you work on?

QG: I work on Loop Quantum Gravity, an alternative to string theory in which space-time is built from loops.

I: Well, I understood that string theory also says that everything is made of loops of string. Are these theories related?

ST: Absolutely not! In loop quantum gravity the loops are just holonomies of space-time that form knots. In string theory we have loops that vibrate to form states of particles including gravitons. They can’t form knots because they easily pass through each other. Furthermore, we normally work in 10 or 11 dimensions where knots cannot form. So clearly the loops and strings are not the same thing.

I: But when the strings cross over, does anything happen?

ST: Sometimes they interact and split to re-join differently. Other times they just pass through. Diagrammatically it looks like this…

I: And in loop quantum gravity. Can the loops pass through each other?

QG: No. I agree that they are utterly different from strings. The loops form knotted states that can evolve according to mathematical equations called skein relations. Diagrammatically it looks like this.

I: Is that where quantum groups come in?

QG: Yes. These are connected with quantum groups. Good Guess!

ST: Do you use quantum groups? So do we! What an amazing coincidence!

I: I thought you could not have knots in multi-dimensional space-time.

ST: But the quantum groups are used in conformal field theory on the worldsheet which has fewer dimensions.

I: So if you use the same mathematics doesn’t that suggest there is a close connection?

QG: Not at all. It’s just an example of the unity of mathematics. The same equations often appear in different places, but there is no real connection.

I: I see. So string theorist. What would you say was the biggest weakness of string theory?

ST: String theory is very successful, but despite years of development we still don’t have a non-perturbative formulation of string theory based on sound principles which respect the background independence of space-time that we expect from Einstein’s theory of relativity. In short, we just don’t understand what string theory is yet. Without that we are limited when we want to understand what happens in extreme circumstances such as the big bang singularity.

I: And quantum gravitist. What is the main strength of Loop Quantum Gravity?

QG: Loop quantum gravity is a direct quantisation of gravity that builds on the principles of relativity as laid down by Einstein. It is fully background independent.

I: Right. What about its weaknesses?

QG: Well, we do find it hard to incorporate forms of matter and interactions other than pure gravity. It is also hard to recover the classical limit and understand how familiar space-time emerges from the theory.

I: That is quite interesting. So string theorist. What are the strengths of string theory?

ST: String theory incorporates all the forces and particles of physics in a unique way. Furthermore, gravity arises in string theory in a natural and inescapable way as a result of special vibration modes of the strings that act correctly like gravitons. This means that we include matter and space-time in a way that has a well-behaved continuum limit. I hope I am not getting too technical for you.

I: Well. If I understood you correctly it means that the weaknesses of string theory are the strengths of loop quantum gravity, and the weaknesses of loop quantum gravity are the strengths of string theory. Couldn’t they be combined to form a better theory?

ST: No! I mean we shouldn’t even try because loop quantum gravity is clearly a failure. It can’t even reproduce the physics of ordinary space-time. Anyone who thinks it could work is a crackpot.

QG: Really, the problem is that string theory is not background independent. This is a fundamental failure. It is not even wrong, so it can’t be made to work with loop quantum gravity.

I: Are there areas where string theory is better understood?

ST: Yes. In three dimensions we have a much better understanding because string theory can be formulated as Chern-Simons gauge theory.

QG: Loop quantum gravity is also much easier to understand in three dimensions. The states are described by spin networks that combine to form a spin foam model of three-dimensional gravity known as Chern-Simons theory.

I: The same Chern-Simons theory?

ST: Yes, sort of.

I: More accidental unity, I suppose. But if I may persist with this idea for a moment, perhaps you could look back to the origins of each theory to see if there is a common ground?

ST: Curiously, string theory actually started as an attempt to understand the strong nuclear force that binds quarks inside nucleons. Particle physicists thought this confinement could be explained by the presence of strings of energy binding the quarks. This theory did not work but we noticed that the gravitational field was included accidentally, so the idea was re-born as a unified theory of quantum gravity and other forces at higher energies. 

QG: Whereas loop quantum gravity developed out of a formulation of the strong nuclear force called the loop representation. It did not help to explain the confinement of quarks as had been hoped, but we noticed that it worked naturally as a description of states in quantum gravity.

I: So you are saying that both ideas came from the same nuclear physics and developed in similar ways?

ST: Superficially, yes.

QG: But that is the only similarity. Loop quantum gravity is an alternative to string theory.

I: With completely separated funding?

ST: Of course!

I: I wonder. Do you physicists ever get together to compare your ideas?

QG: Yes. We are doing it now. Silly!

I: I mean, do you have conferences on quantum gravity where you compare your different theories and look for ways to benefit from each others work?

QG: Mostly we have conferences about Loop Quantum Gravity, and they have conferences about String Theory.

ST: I remember one time we invited anonymous quantum gravitist to give a talk at our String conference. It was a complete waste of time.

I: Oh? Why?

ST: His talk was about things we had known ten years ago. There has been no progress.

QG: Actually my talk was just an overview of the subject for the young string theorists. I did not have time to talk about our latest work.

I: But I thought these conferences were to discuss the latest research.

ST: Yes but there is no recent work worth talking about in Loop Quantum Gravity.

QG: On the contrary. We have recently discovered new spin foam models such as EPRL which solve many of the old problems. These models offer the hope of building spacetime from abstract algebra using higher category theory.

I: And what are the latest developments in string theory?

ST: We are pursuing a more algebraic approach to understanding the different p-branes in string theory. Some new understanding comes from the application of what we call higher category theory.

I: Would it be worth comparing these ideas to see if there is anything you could learn from each other?

QG: No. I am working on alternatives to string theory. It would not make sense to work with them.

ST: In fact, our approaches are completely different. There is nothing in common. I do not need to work with crackpots.

I: Gentlemen. I’m afraid that is all we have time for today. Thank you for this fascinating discussion.

QG: Goodnight.

ST: Goodnight crackpot.

A double take on the string wars

October 23, 2009



Humor for physicists.


September 9, 2009

It’s 09:09:09 on 09/09/09 UTC. Have fun!

Mount Wilson Observatory Threatened by California Wild Fires

August 31, 2009

The Mount Wilson Observatory in California is threatened by wildfires.

To see some pictures captured today from the webcam, try http://vixra.org/mtwilson/

A Symmetric Hypermatrix and Brahmagupta’s Formula

July 2, 2009

Before I go back to some more number theory blogging, here is another post about one of the curious properties of Cayley’s Hyperdeterminant.

Sometimes it is useful to look at hypermatrices that have extra symmetries. For example, we might be interested in a 2x2x2 hypermatrix A whose components a_{ijk} form a symmetric tensor, i.e.

a_{ijk} = a_{ikj} = a_{jik}

This means that the hypermatrix has just four independent components

a_{111} = a
a_{112} = a_{121} = a_{211} = b/3
a_{221} = a_{212} = a_{122} = c/3
a_{222} = d

This hypermatrix often comes up because it represents a symmetric trilinear form, or more simply a binary cubic polynomial

A(x,y) = a x^3 + b x^2 y + c x y^2 + d y^3

The invariance group of the hypermatrix that preserves the symmetry structure is reduced from SL(2) x SL(2) x SL(2) to just SL(2). The hyperdeterminant then reduces to the discriminant of the polynomial:

-27 det(A) = b^2 c^2 - 4 a c^3 - 4 b^3 d - 27 a^2 d^2 + 18 a b c d
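As a sanity check, this relation can be verified numerically. The sketch below uses exact rational arithmetic; the `hyperdet` function is Cayley's 2x2x2 formula in one common sign convention, and it confirms the identity on a grid of integer cubics:

```python
from fractions import Fraction
from itertools import product

# Cayley's hyperdeterminant of a 2x2x2 hypermatrix t[i][j][k] (i,j,k in {0,1}),
# written in one common sign convention.
def hyperdet(t):
    a = lambda i, j, k: t[i][j][k]
    return (a(0,0,0)**2 * a(1,1,1)**2 + a(0,0,1)**2 * a(1,1,0)**2
            + a(0,1,0)**2 * a(1,0,1)**2 + a(1,0,0)**2 * a(0,1,1)**2
            - 2 * (a(0,0,0)*a(0,0,1)*a(1,1,0)*a(1,1,1)
                   + a(0,0,0)*a(0,1,0)*a(1,0,1)*a(1,1,1)
                   + a(0,0,0)*a(1,0,0)*a(0,1,1)*a(1,1,1)
                   + a(0,0,1)*a(0,1,0)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(1,0,0)*a(0,1,1)*a(1,1,0)
                   + a(0,1,0)*a(1,0,0)*a(0,1,1)*a(1,0,1))
            + 4 * (a(0,0,0)*a(0,1,1)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(0,1,0)*a(1,0,0)*a(1,1,1)))

# Symmetric hypermatrix of the binary cubic a x^3 + b x^2 y + c x y^2 + d y^3:
# an entry depends only on how many of its indices equal 1.
def symmetric_hypermatrix(a, b, c, d):
    vals = [Fraction(a), Fraction(b, 3), Fraction(c, 3), Fraction(d)]
    return [[[vals[i + j + k] for k in (0, 1)] for j in (0, 1)] for i in (0, 1)]

def discriminant(a, b, c, d):
    return b*b*c*c - 4*a*c**3 - 4*b**3*d - 27*a*a*d*d + 18*a*b*c*d

# spot-check  -27 det(A) = discriminant  on a grid of integer cubics
for a, b, c, d in product(range(-2, 3), repeat=4):
    assert -27 * hyperdet(symmetric_hypermatrix(a, b, c, d)) == discriminant(a, b, c, d)
```

For example the cubic x^3 - x y^2 (a = 1, b = 0, c = -1, d = 0) has discriminant 4 and hyperdeterminant -4/27.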

For higher-dimensional hypermatrices an analogous result holds, but the hyperdeterminant reduces to a power of the discriminant, e.g. for a 2x2x2x2 hypermatrix the hyperdeterminant is the fourth power of the discriminant times some factor.

There are other ways that hypermatrices can be constrained while still retaining some symmetry. Suppose we demand that the hypermatrix is invariant under some fixed set of linear transformations. E.g. for an m^n hypermatrix we select n different m x m matrices and apply them as a transformation on the hypermatrix, then we require that this gives back the same hypermatrix. For which sets of matrices is this possible in a non-trivial way? The question can be made more general if we allow indices to be transposed as well, so that the symmetric hypermatrices already considered become a special case. I won’t attempt to give a complete answer, but I’ll look at some special cases without the transpositions.

A useful special case for matrices is when the transformation matrices are permutation matrices. An m x m matrix M can be transformed to give PMQ where P and Q are matrices which just permute the m dimensions. This generates a permutation of the m^2 elements in the matrix, so to impose that the matrix is invariant we require that the elements of the matrix in any cycle are the same. Circulant matrices are obviously special cases of this.

When some of the cycles are of even length we can put a sign factor in the transformation matrix. For example, transform a 2x2 matrix in both directions using J =

(0  -1)
(1   0)

An invariant matrix will have the form

(a  -b)
(b   a)

which is the familiar representation of the complex numbers.

The residual symmetry is reduced from SL(2) x SL(2) to transformation matrices that commute with J, i.e. SO(2) x SO(2).

We could try to generalise this to the 2^n hypermatrix, but it only works when n is even. For odd numbers of dimensions the constraint makes the hypermatrix all zero.

However, if we use instead the matrix K

(0   1)
(1   0)

Then the result is non-trivial for 3 (or any number of) dimensions and the residual symmetry is SO(1,1) x SO(1,1) x SO(1,1).

This means that opposing elements of the hypermatrix A must be equal

a_{111} = a_{222} = a/2
a_{112} = a_{221} = b/2
a_{121} = a_{212} = c/2
a_{122} = a_{211} = -d/2

For this case something nice happens for the hyperdeterminant. It factorises!

det(A) = (a + b + c – d)(a + b – c + d)(a – b + c + d)(-a + b + c + d)/16

Classical geometry students will instantly recognise this from Brahmagupta’s formula for the area of a quadrilateral inscribed in a circle with sides of length a, b, c and d, which generalises Heron’s formula for the area of a triangle. We can write it as:

Area = sqrt(det(A))

This is quite pleasing because geometric interpretations of hyperdeterminants are quite hard to come by.
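The factorisation is easy to test by machine. The sketch below builds the constrained hypermatrix for a 3x4 rectangle (a cyclic quadrilateral with sides 3, 4, 3, 4 and area 12) and checks it against Brahmagupta's product; since the overall sign of the 2x2x2 hyperdeterminant depends on the convention used for Cayley's formula, the check hedges by comparing absolute values:

```python
from fractions import Fraction

# Cayley's 2x2x2 hyperdeterminant (one common sign convention)
def hyperdet(t):
    a = lambda i, j, k: t[i][j][k]
    return (a(0,0,0)**2 * a(1,1,1)**2 + a(0,0,1)**2 * a(1,1,0)**2
            + a(0,1,0)**2 * a(1,0,1)**2 + a(1,0,0)**2 * a(0,1,1)**2
            - 2 * (a(0,0,0)*a(0,0,1)*a(1,1,0)*a(1,1,1)
                   + a(0,0,0)*a(0,1,0)*a(1,0,1)*a(1,1,1)
                   + a(0,0,0)*a(1,0,0)*a(0,1,1)*a(1,1,1)
                   + a(0,0,1)*a(0,1,0)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(1,0,0)*a(0,1,1)*a(1,1,0)
                   + a(0,1,0)*a(1,0,0)*a(0,1,1)*a(1,0,1))
            + 4 * (a(0,0,0)*a(0,1,1)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(0,1,0)*a(1,0,0)*a(1,1,1)))

def brahmagupta_hypermatrix(a, b, c, d):
    # blog indices 1,2 -> python indices 0,1; e.g. a_{122} = t[0][1][1]
    t = [[[Fraction(0)] * 2 for _ in range(2)] for _ in range(2)]
    t[0][0][0] = t[1][1][1] = Fraction(a, 2)    # a_{111} = a_{222} = a/2
    t[0][0][1] = t[1][1][0] = Fraction(b, 2)    # a_{112} = a_{221} = b/2
    t[0][1][0] = t[1][0][1] = Fraction(c, 2)    # a_{121} = a_{212} = c/2
    t[0][1][1] = t[1][0][0] = Fraction(-d, 2)   # a_{122} = a_{211} = -d/2
    return t

def brahmagupta_product(a, b, c, d):
    return Fraction((a+b+c-d) * (a+b-c+d) * (a-b+c+d) * (-a+b+c+d), 16)

# a 3x4 rectangle has sides 3,4,3,4 and area 12, so we expect area^2 = 144
assert abs(hyperdet(brahmagupta_hypermatrix(3, 4, 3, 4))) == 144
assert brahmagupta_product(3, 4, 3, 4) == 144

# the factorisation holds (up to overall sign) for general side lengths
for sides in [(1, 1, 1, 1), (2, 3, 4, 5), (5, 5, 6, 8)]:
    assert abs(hyperdet(brahmagupta_hypermatrix(*sides))) == abs(brahmagupta_product(*sides))
```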

Notice that I had to introduce a minus sign on one pair of opposing numbers to get this to work. This was also the case when comparing the formula for regular Diophantine Quadruples with the hyperdeterminant, so in some sense it seems to be a natural thing to do.


Diophantine Quadruples and Symmetric Matrices

June 29, 2009

It’s funny how sometimes a simple but useful mathematical trick can go unnoticed for a long time, yet be obvious when it is pointed out. Here is a good example.

A few days ago I posted here about Diophantine Quadruples, which are sets of four different non-zero numbers (rational or integer) such that the product of any two is one less than a square. The problem of finding them has been around since Diophantus and has been looked at by some of the greatest classical number theorists including Fermat and Euler, as well as 20th-century specialists in number theory such as Baker and Davenport. In the last few years the literature on the subject has been steadily growing. I myself have been thinking about the problem on and off since I heard about it as a teenager 30 years ago. Well, we all seem to have missed an easy trick for finding them.

Actually this method applies to a generalization of the problem, which is to find sets of four distinct positive integers such that the product of any two is less than a square by a fixed number n. These quadruples are said to have the property D(n). For n = 1 there is an infinite set of them called regular quadruples that can be constructed recursively. There are probably no irregular examples but that has not been settled yet. When n is not a square the problem is more difficult, with only a few examples known for each n. For the state of the art you should consult the references listed by Andrej Dujella.

So let’s suppose we have a quadruple (a,b,c,d) with property D(n)

ab = x^2 - n
ac = y^2 - n
bc = z^2 - n
ad = r^2 - n
bd = s^2 - n
cd = t^2 - n

Now form a symmetric matrix with the quadruple down the diagonal and the square roots as the corresponding off-diagonal elements:

          ( a  x  y  r )
   D =    ( x  b  z  s )
          ( y  z  c  t )
          ( r  s  t  d )

Next form the Adjoint matrix. For non-singular matrices this is the inverse times the determinant.

Adj(D) = D^{-1} det(D)

The elements of this matrix can also be constructed from the 3×3 minor determinants of D and this works even when D is singular. It is therefore a matrix of integers just like D, and it is symmetric.

            ( k  u  v  w )
Adj(D) =    ( u  l  e  f )
            ( v  e  m  g )
            ( w  f  g  n )

The 2×2 minors of the adjoint of a 4×4 matrix are given by the opposing 2×2 minors from the original matrix times its determinant, e.g.

kl - u^2 = (cd - t^2) det(D)

This works for all the minors, but we only need to look at the principal minors. The result should now be obvious. Since (cd - t^2) = -n, it follows that kl - u^2 = -n det(D). With the same being true for each minor, it follows that the quadruple (k,l,m,n) is also a Diophantine quadruple with the property D(n det(D)), although this could fail in specific cases if two of the numbers are the same or are not positive.

In most cases this means that if you give me a Diophantine quadruple, I can give you another one just by inverting its matrix. In fact I can give you (potentially) eight of them! This is because I can reverse the signs of the square roots and there are eight distinct cases that will usually give eight different quadruples. Obviously this process can be repeated by flipping signs in the adjoint and inverting again.
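Here is a minimal Python sketch of the trick, applied to Fermat's quadruple (1, 3, 8, 120), which has property D(1). The `det` and `adjugate` helpers use exact integer arithmetic, and the assertions check that the diagonal of the adjoint forms a quadruple with property D(det(D)):

```python
from itertools import combinations

def det(M):
    # exact integer determinant by Laplace expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def adjugate(M):
    n = len(M)
    minor = lambda i, j: [row[:j] + row[j+1:]
                          for k, row in enumerate(M) if k != i]
    # adjugate = transpose of the cofactor matrix
    return [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)]
            for i in range(n)]

# Fermat's quadruple (1, 3, 8, 120): every pairwise product is one less
# than a square, e.g. 3*8 = 24 = 5^2 - 1.  Diagonal = the quadruple,
# off-diagonal = the corresponding square roots.
D = [[ 1,  2,  3,  11],
     [ 2,  3,  5,  19],
     [ 3,  5,  8,  31],
     [11, 19, 31, 120]]

A = adjugate(D)
dd = det(D)

# The diagonal of Adj(D) has property D(det(D)): the product of any two
# diagonal entries differs from the square of the matching off-diagonal
# entry by det(D).
for i, j in combinations(range(4), 2):
    assert A[i][i] * A[j][j] + dd == A[i][j] ** 2
```

The signs of the off-diagonal square roots can then be flipped and the matrix inverted again to generate further quadruples, as described above.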

By the way, I only discovered this trick because I first realised some years ago that regular quadruples can be arranged on a 2x2x2 hypermatrix with zero hyperdeterminant. Then recently some papers appeared showing that the minors of a symmetric matrix also give a singular hypermatrix. Putting the two together made me look at the matrix generated from a quadruple and its inverse.

It may not be a revolutionary result in mathematics, but it does show how easy it is to miss simple tricks even for problems that have been around for a long time, and it is just possible that this trick could provide a new attack on some of the still unsolved problems concerning Diophantine quadruples.

Cayley’s Hyperdeterminant and Principal Minors

June 25, 2009

Just when you think you must know everything about a mathematical structure, you suddenly start to find out all sorts of new things about it. This has been happening to me recently with Cayley’s Hyperdeterminant and I plan to blog about some of the things here.

First off is a surprise relationship between the principal minors of a 3×3 symmetric matrix and the hyperdeterminant that has been reported recently in arXiv:math/0604374 by Olga Holtz and Bernd Sturmfels (and possibly others).

The principal minors of a matrix are the determinants you are left with when you eliminate rows and columns of the original matrix passing through diagonal elements. The principal minors for a 3×3 matrix

          ( a   x   y )
    M =   ( u   b   z )
          ( v   w   c )


are given by:

a_{111} = 1
a_{211} = a
a_{121} = b
a_{112} = c
a_{221} = ab - xu
a_{212} = ac - yv
a_{122} = bc - wz
a_{222} = det(M)

These are eight numbers which naturally fall on the corners of a cube to form a hypermatrix A, so what is its hyperdeterminant? This can be worked out by hand to yield the answer:

det(A) = (yuw - xzv)^2

In particular, if the matrix is symmetric the hyperdeterminant is zero. [The same is true for any symmetric matrix which is transformed by pre- or post-multiplication by a diagonal matrix.]
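The identity is easy to check by machine. The sketch below (with `hyperdet` being Cayley's 2x2x2 formula in one common sign convention) builds the hypermatrix of principal minors for a batch of random integer matrices and verifies the formula, including the vanishing in the symmetric case:

```python
from random import seed, randrange

# Cayley's 2x2x2 hyperdeterminant (one common sign convention)
def hyperdet(t):
    a = lambda i, j, k: t[i][j][k]
    return (a(0,0,0)**2 * a(1,1,1)**2 + a(0,0,1)**2 * a(1,1,0)**2
            + a(0,1,0)**2 * a(1,0,1)**2 + a(1,0,0)**2 * a(0,1,1)**2
            - 2 * (a(0,0,0)*a(0,0,1)*a(1,1,0)*a(1,1,1)
                   + a(0,0,0)*a(0,1,0)*a(1,0,1)*a(1,1,1)
                   + a(0,0,0)*a(1,0,0)*a(0,1,1)*a(1,1,1)
                   + a(0,0,1)*a(0,1,0)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(1,0,0)*a(0,1,1)*a(1,1,0)
                   + a(0,1,0)*a(1,0,0)*a(0,1,1)*a(1,0,1))
            + 4 * (a(0,0,0)*a(0,1,1)*a(1,0,1)*a(1,1,0)
                   + a(0,0,1)*a(0,1,0)*a(1,0,0)*a(1,1,1)))

def det3(M):
    return (M[0][0] * (M[1][1]*M[2][2] - M[1][2]*M[2][1])
            - M[0][1] * (M[1][0]*M[2][2] - M[1][2]*M[2][0])
            + M[0][2] * (M[1][0]*M[2][1] - M[1][1]*M[2][0]))

def principal_minor_hypermatrix(M):
    (a, x, y), (u, b, z), (v, w, c) = M
    t = [[[0] * 2 for _ in range(2)] for _ in range(2)]
    t[0][0][0] = 1                 # a_{111}
    t[1][0][0] = a                 # a_{211}
    t[0][1][0] = b                 # a_{121}
    t[0][0][1] = c                 # a_{112}
    t[1][1][0] = a*b - x*u         # a_{221}
    t[1][0][1] = a*c - y*v         # a_{212}
    t[0][1][1] = b*c - w*z         # a_{122}
    t[1][1][1] = det3(M)           # a_{222}
    return t

seed(1)
for _ in range(20):
    M = [[randrange(-4, 5) for _ in range(3)] for _ in range(3)]
    (a, x, y), (u, b, z), (v, w, c) = M
    assert hyperdet(principal_minor_hypermatrix(M)) == (y*u*w - x*z*v) ** 2
```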

Why is this so interesting? Well, the symmetry of a 3×3 matrix is preserved under SO(3) transformations applied simultaneously to rows and columns. The determinant a_{222} is invariant (of course a_{111} is trivially invariant). The principal minors appear in the diagonal elements of the inverse of M. So if the eight components of the hypermatrix are augmented with the six off-diagonal elements of the matrix and its inverse, then we seem to have a system of 14 dimensions with some extra symmetry.

With a small adjustment we can extend this to a general 2x2x2 hypermatrix with components a_{ijk} and with a zero hyperdeterminant. Working in reverse, we now define

a = a_{211}
b = a_{121}
c = a_{112}
d = a_{222}

k = a_{122}
l = a_{212}
m = a_{221}
n = a_{111}

x^2 = ab - mn
y^2 = ac - ln
z^2 = bc - kn

This produces a symmetric matrix

          ( a   x   y )
    M =   ( x   b   z )
          ( y   z   c )

We can also add

r^2 = da - lm
s^2 = db - mk
t^2 = dc - kl

And this gives a second symmetric matrix

          ( k   t   s )
    N =   ( t   l   r )
          ( s   r   m )

Multiplying these matrices together (and resolving sign ambiguities) we get a tidy relation

MN = nd I

The extra six components also have a simple interpretation. Since the hypermatrix has zero hyperdeterminant, we know that it has a system of three vectors which act like the zero eigenvectors of the system. The six components are the components of these vectors.

So what we have found is that the linear system consisting of the 8 components of a singular hypermatrix and the 6 components of its zero eigenvectors forms a system which has the SL(2) x SL(2) x SL(2) symmetry of the hypermatrix and also some extra SO(3) symmetries. So what is the system and what is its full symmetry?

If you know a bit about exceptional algebras you can probably at least guess the answer to this question. The two symmetric 3×3 matrices can be regarded as elements of the Jordan algebra J_3(R). Then the matrices M and N and the two numbers n and d can be used to build a six-dimensional matrix

    F =   ( nI   M  )
          ( N    dI )

This can be regarded as an element of a Freudenthal Triple System (FTS) which has dimension 14. Its automorphism group is a symplectic group (Sp(6) I think) and the relation MN = nd I is equivalent to det(F) = 0.

As far as I can tell this relationship is only valid between singular hypermatrices and singular elements of the FTS. It does not extend to non-singular elements in any way (please correct me if you think otherwise). I think it is of particular interest concerning the problem of Diophantine quadruples  since I can now think of these as special cases of an FTS as well as a hyperdeterminant.

Diophantine Quadruples

June 25, 2009

My interest in hyperdeterminants arose from investigating a very old problem in number theory that originated with Diophantus himself. In his books on algebra, probably written around 250 AD, Diophantus looked at many equations which he tried to solve in rational numbers. One such problem was to find sets of rational numbers such that the product of any two is one less than a square. I have no idea what motivated him to consider this particular puzzle. It has no obvious interpretation in geometry or any other practical application. He seems to have just invented it as an abstract question. However, I do know that it was well chosen, because this problem has hidden symmetries of enormous interest even today and its mysteries are still not fully revealed.

Diophantus provided examples of triples and quadruples of rational numbers that solved his problem. A neater example was provided by Fermat, who read Diophantus but became more interested in integer solutions to equations than rational ones. He observed that 1, 3, 8, 120 has the required property, with the products being one less than the squares of 2, 3, 5, 11, 19 and 31. We still don’t know if there are 5 non-zero integers with this property, although Dujella has shown that there can be at most a finite number of them. Euler found that a fifth rational number (777480/8288641) can be added to Fermat’s quadruple giving a set of five. A few years ago I carried out a computer search and found some rational Diophantine sextuples, of which the smallest is this one:

11/192    35/192   155/27    512/27   1235/48    180873/16

There are still plenty of related unsolved problems and a growing literature on the subject for any number theorist with time on their hands. They might start at the list of references maintained by Dujella at  http://web.math.hr/~duje/ref.html

Of more interest to me these days are the hidden symmetries in this problem. These first become evident when you attack the problem directly. To find two numbers with the property is easy. We just want a and b with

ab = x^2 - 1

In rationals we can just take any non-zero numbers a and x and solve b = (x^2 - 1)/a for a complete solution over the rationals. A third number would take the form c = (y^2 - 1)/a, with the requirement that bc = z^2 - 1

(x^2 - 1)(y^2 - 1)/a^2 = z^2 - 1

Because the largest part of each side is nearly a square it is natural to look for a perturbation solution with

z = (xy + t)/a

Eliminating z in favour of t and simplifying reduces to

x^2 + y^2 + t^2 + 2xyt = a^2 + 1

The first thing to notice about this equation is that it is quadratic in any of the variables x, y, t. This means that if we have one solution in integers or rationals, then we can find another by looking for the other root of the quadratic in any of the three variables. This can be repeated to give an infinite number of Diophantine triples a, b, c. The second observation is that the equation is symmetric in the variables x, y, t. This is an accidental hidden symmetry that was not apparent originally, and it has a useful consequence. It means that given any triple solution a, b, c we can construct a fourth number d = (t^2 - 1)/a, and by the symmetry of the equation this must extend the triple to a quadruple with the property of Diophantus.

Although the equation shows some symmetry, it does not reflect the expected symmetry of the problem between the numbers a, b, c, d. This can be remedied by eliminating x, y and t in favour of b, c and d,

(ab+1) + (ac+1) + (ad+1) + 2 sqrt((ab+1)(ac+1)(ad+1)) = a^2 + 1

=> 4(ab+1)(ac+1)(ad+1) = (a^2 - ab - ac - ad - 2)^2

By expanding this, cancelling terms and dividing out a factor of a^2 we get an equation symmetric in a, b, c, d

P(a,b,c,d) = a^2 + b^2 + c^2 + d^2 - 2ab - 2ac - 2ad - 2bc - 2bd - 2cd - 4abcd - 4 = 0

By construction this equation should be capable of being used to extend a Diophantine triple (a,b,c) to a Diophantine quadruple (a,b,c,d). This can be seen explicitly if we use it to solve for d when a, b, c are given. Since the equation is quadratic in d, this amounts to completing the square to rewrite it as

P(a,b,c,d) = (d - a - b - c - 2abc)^2 - 4(ab+1)(ac+1)(bc+1) = 0

So if (a,b,c) is a Diophantine triple, meaning that (ab+1), (ac+1) and (bc+1) are integer squares, then the equation can indeed be solved to find an integer d. That (a,b,c,d) is then a quadruple follows from the observation that (ad+1) must be a square because

P(a,b,c,d) = (a - b - c + d)^2 - 4(ad+1)(bc+1) = 0

and similar identities for the other squares.
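The extension step is easy to carry out in practice: solving the completed square for d gives d = a + b + c + 2abc +/- 2xyz, where x, y, z are the square roots of ab+1, ac+1, bc+1. The sketch below applies the plus root to the triple (1, 3, 8):

```python
from math import isqrt

def is_square(m):
    return m >= 0 and isqrt(m) ** 2 == m

# Extend a Diophantine triple (a, b, c) to a quadruple by taking the
# larger root of the quadratic P(a, b, c, d) = 0 in d.
def extend(a, b, c):
    x, y, z = isqrt(a*b + 1), isqrt(a*c + 1), isqrt(b*c + 1)
    return a + b + c + 2*a*b*c + 2*x*y*z

d = extend(1, 3, 8)
assert d == 120  # recovers Fermat's quadruple 1, 3, 8, 120

quad = (1, 3, 8, d)
for i in range(4):
    for j in range(i + 1, 4):
        assert is_square(quad[i] * quad[j] + 1)
```

Taking the minus root instead gives d = 0, which is why the non-zero condition on the four numbers matters.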

At this point it appears that the ability to extend a triple to a quadruple is the result of the fact that this polynomial can be written in a number of different ways as the difference of a square and a product of the numbers that need to be squares. This result seems to defy explanation. If this is not remarkable enough it is more surprising that there is a similar polynomial in five variables that allows any Diophantine quadruple to be extended to a rational Diophantine quintuple. I will return to that in a later post.

So where does this polynomial come from and to what mathematical principles does it owe its properties?

A partial answer to this question is that the expression is a special case of Cayley’s hyperdeterminant for a 2x2x2 hypermatrix with components

a_{111} = -a
a_{112} = 1
a_{121} = 1
a_{122} = b
a_{211} = 1
a_{212} = c
a_{221} = d
a_{222} = -1

The additional symmetries of the hyperdeterminant reduce the number of mysterious identities that it fulfils, and most of those that remain arise from the derivation of the hyperdeterminant. Yet it is not obvious that this fully explains the relationships, and some mystery remains.

Hypermatrix Inverses

June 6, 2009

The last post about multiplicative hyperdeterminants used a form of multiplication of hypermatrices where the last direction of one hypermatrix is contracted over the first direction of another. We can write the product using ordinary product notation, so A multiplied by B is just AB. This generalises multiplication of matrices and multiplication of matrices with vectors. For two vectors it gives the scalar product.

A more general product can be defined if we contract over more than one of the hypermatrix directions. I will write A(p)B to mean the product of hypermatrices A and B in which the last p directions of A are contracted with the first p directions of B in reverse order. When a hypermatrix A of rank r is multiplied by a hypermatrix B of rank s, the product A(p)B is a hypermatrix of rank r+s-2p. It is only defined if the dimensions of the last p directions of A match the dimensions of the first p directions of B. If A and B are multi-linear operators on vector spaces, then the last vector spaces that A operates on must be the dual spaces of the first vector spaces that B operates on. When a vector space is transformed by a non-singular linear operator, its dual is transformed by the inverse of the operator.
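To make the contraction convention concrete, here is a minimal pure-Python sketch of this product (hypermatrices as nested lists; the function names are my own invention, not standard API):

```python
import itertools

def shape(T):
    """Infer the dimensions of a nested-list hypermatrix."""
    s = []
    while isinstance(T, list):
        s.append(len(T))
        T = T[0]
    return tuple(s)

def get(T, idx):
    for i in idx:
        T = T[i]
    return T

def hyper_mul(A, B, p=1):
    """A(p)B: contract the last p directions of A with the first p
    directions of B, paired in reverse order (the last direction of A
    with the first direction of B, and so on)."""
    sa, sb = shape(A), shape(B)
    r = len(sa)
    assert sa[r - p:] == tuple(reversed(sb[:p])), "dimension mismatch"
    out = sa[:r - p] + sb[p:]          # resulting rank is r + s - 2p
    def entry(idx):
        i, j = idx[:r - p], idx[r - p:]
        return sum(get(A, i + tuple(reversed(k))) * get(B, k + j)
                   for k in itertools.product(*[range(d) for d in sb[:p]]))
    def build(prefix, dims):
        if not dims:
            return entry(prefix)
        return [build(prefix + (i,), dims[1:]) for i in range(dims[0])]
    return build((), out)

# p = 1 reproduces matrix multiplication, matrix-vector multiplication,
# and the scalar product of two vectors:
assert hyper_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
assert hyper_mul([[1, 2], [3, 4]], [5, 6]) == [17, 39]
assert hyper_mul([1, 2], [3, 4]) == 11
```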

The problem we want to look at now is how to invert a hypermatrix so that we can use it to reverse a multiplication. In other words, if C = AB we want to solve for A in terms of C and B. We can do this if we have a hypermatrix D with the same format as B, which inverts B in a suitable fashion. Given any hypermatrix D with the same format as B (of rank r) we can form r matrices by contracting over all but one pair of matching directions. If all these matrices are non-zero multiples of the identity matrix we call D an inverse of B. In particular we need B(r-1)D = xI and D(r-1)B = yI, but these are just two of the r ways of doing the contraction. Notice that we say “an” inverse rather than “the” inverse because such an inverse may not exist, and if it does it may not be unique (even up to a factor).
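For ordinary matrices (r = 2) the definition recovers something familiar: the adjugate of B is such an inverse, with x = y = det B. A tiny sanity check, assuming nothing beyond the definition above:

```python
# Matrix case (r = 2): the adjugate D of B satisfies
# B(1)D = D(1)B = det(B) * I, so D is "an inverse" with x = y = det(B).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = [[2, 1], [5, 3]]
D = [[3, -1], [-5, 2]]   # adjugate of B; here det(B) = 1
I = [[1, 0], [0, 1]]
assert matmul(B, D) == I and matmul(D, B) == I
```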

Such an inverse can be used to invert the multiplication C = AB to give the solution A = (1/x) C(r-1)D. This does not require all the properties of the inverse; it just requires B(r-1)D = xI. But the more specific definition of inverse is interesting because there is a means to construct such inverses from hyperdeterminants and other invariants.

The term inverse is justified because if D is an inverse of B then B is an inverse of D. This follows immediately from the symmetry in the definition. Furthermore, the usual inverse of a non-singular matrix is also an inverse in the hypermatrix sense, but so is any scalar multiple of it. Why not fix the normalisation? Take the trace of B(r-1)D = xI and D(r-1)B = yI and you get B(r)D = x d_1 = y d_r (where d_i is the dimension of the hypermatrix in the i-th direction), and similarly for the other directions of the hypermatrix. If the hypermatrix is hypercubical, so that all the directions have the same dimension, then we could fix x = y = 1, but that won't work for more general hypermatrix formats. Instead we would have to fix x = 1/d_1, y = 1/d_r, etc., but that would be inconsistent with the usual inverse of a matrix. The best compromise is not to fix the normalisation at all.

Another plus for this definition is that the set of all inverses (with zero included) forms a vector space. The dimension of this vector space is an important attribute of the hypermatrix format and is not easy to calculate even for simple formats. What makes this interesting is that we can form inverses using invariants. This generalises the expression for the inverse of a matrix in terms of its determinant, which is

             A⁻¹ = (1/detA) (d/dA) detA

If A is a hypermatrix of rank r and K(A) is an invariant of degree m, then the inverse of A with respect to K is defined as

             InvK(A) = (1/K(A)) (d/dA) K(A)

If we contract A with this inverse over all directions we find that

             A(r)InvK(A) = m
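This is just Euler's theorem for homogeneous functions, and it can be checked numerically. The sketch below takes the matrix case K = det on 2x2 matrices (so m = 2, and the full contraction is an entrywise sum), with the gradient computed by central finite differences; it is an illustration of the identity, not the general machinery:

```python
# Check A(r)InvK(A) = m for K = det on 2x2 matrices, where m = 2.
def det2(A):
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def grad(K, A, h=1e-6):
    """Entrywise central finite-difference gradient (d/dA)K(A)."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            Ap = [row[:] for row in A]; Ap[i][j] += h
            Am = [row[:] for row in A]; Am[i][j] -= h
            g[i][j] = (K(Ap) - K(Am)) / (2*h)
    return g

A = [[2.0, 1.0], [5.0, 3.0]]
inv_K = [[g / det2(A) for g in row] for row in grad(det2, A)]  # InvK(A)
full_contraction = sum(A[i][j] * inv_K[i][j]
                       for i in range(2) for j in range(2))
assert abs(full_contraction - 2) < 1e-6   # equals m = deg(det) = 2
```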

This result is true for any homogeneous multivariate polynomial of degree m, but we can use the invariance property of K(A) to derive the stronger result that this inverse is an inverse in the sense defined previously. To see this, consider a variation of A in which its last direction is acted on by I + E for a small operator E, so that δA = AE.

      δK(A) = Tr(  (d/dA)K(A) (r-1) δA  )  = K(A) Tr( InvK(A) (r-1) AE ) + O(E²)

But by the transformation rule for invariants (where d is the dimension of the transformed direction) we get

     K(A) -> K(A) det(I+E)^(m/d)

     => δK(A) = (m/d) K(A) Tr(E) + O(E²)

By bringing these together we conclude that

           InvK(A) (r-1) A   = (m/d)I

Similar results hold for the other directions, which confirms that InvK(A) is an inverse of A.
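As a final sanity check, here is the matrix case K = det with A symmetric (symmetry sidesteps the transpose and index-ordering subtleties glossed over above); InvK(A) is then the ordinary matrix inverse, and the contraction gives (m/d)I = I since m = d = 2:

```python
# Check InvK(A)(r-1)A = (m/d)I for K = det on a symmetric 2x2 matrix.
A = [[2.0, 1.0], [1.0, 3.0]]
detA = A[0][0]*A[1][1] - A[0][1]*A[1][0]            # = 5
gradK = [[A[1][1], -A[1][0]], [-A[0][1], A[0][0]]]  # d(det)/dA entrywise
inv = [[g / detA for g in row] for row in gradK]    # InvK(A)
prod = [[sum(inv[i][k]*A[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
assert all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))     # (m/d)I = I here
```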

That is all I have to say about inverses for now, but I’ll leave you with a few questions to consider. Firstly, I have not demonstrated that InvK(InvK(A)) = A. How can this be proved in general? Secondly, the ring of invariants for a given format is finitely generated. Given a set of generators, we get a set of inverses. Are these linearly independent, and do they generate the full vector space of inverses? I don’t really know the answers to these questions, so if you do know something, please leave a comment about it.