-------------------------------------
How are matrices and tensors related?
-------------------------------------
Mathematicians and physicists differ in the notation used for
vectors, tensors, matrices, and multilinear forms. Here is
a dictionary.
T^q = tensor product of q copies of the vector space T;
in particular, T^0=S is the algebra of scalar fields and T^1=T.
T^p_q = space of all linear mappings from T^q to T^p;
elements are (p,q)-tensors with p upper and q lower indices.
T^q_0 = T^q;
T^0_p =: T_p = (T^p)^* is the so-called dual space of T^p;
in particular, T_1 = T^* is the dual space of T;
its elements are the linear forms = covectors.
One can associate with every A in T^p_q canonically a multilinear
mapping B: T^q tensor T_p --> S with
B(s,t) = t(As) for s in T^q, t in T_p,
and conversely; indeed, since the image As of s under A is in T^p,
its image t(As) is a well-defined scalar. Using the B's in place of
the A's gives an alternative way of defining tensors, although one
less convenient for visualization.
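The correspondence between A and B can be checked numerically; here is
a small sketch in Python/NumPy (not part of the original text), taking
a (1,1)-tensor A, i.e. a matrix, on a 3-dimensional space:

```python
import numpy as np

# A (1,1)-tensor on a 3-dimensional space T is just a 3x3 matrix A,
# mapping vectors s in T to vectors As in T.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

s = rng.standard_normal(3)   # a vector in T^1 = T
t = rng.standard_normal(3)   # a covector in T_1 = T^*

# The associated multilinear form B: T^1 tensor T_1 -> S is
# B(s, t) = t(As), a well-defined scalar:
B = t @ (A @ s)

# The same scalar, written as a full contraction of A with s and t:
B_alt = np.einsum('k,ki,i->', t, A, s)
assert np.isclose(B, B_alt)
```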
Given a basis on T and a dual cobasis on T^*, one can use coordinates.
Then physicists write
- elements of T as vectors = column vectors with an upper index,
- elements of T^* as linear forms = 1-forms = covectors = row vectors
with a lower index,
- elements of T^q as multivectors with q upper indices,
- elements of T_p as multicovectors with p lower indices,
- elements of T_p^q as mixed multi/co/vectors with p lower and q upper
  indices.
(There is also a dual version of this, where vectors are considered
as rows and covectors as columns. The remainder then changes
accordingly.)
In particular:
(0,0)-tensor = scalar,
(1,0)-tensor = vector (vector in T=T^1) = column vector,
(0,1)-tensor = covector (vector in the dual space T^*=T_1)
= row vector,
(1,1)-tensor = matrix (linear mapping from T to T).
Clearly, the columns of the matrix A_i^k are column vectors = vectors,
the rows are row vectors = covectors, and the indexing is consistent.
The requirement that basis and cobasis are dual is equivalent to the
statement that for every vector u and covector w (i.e., linear mapping
from vectors to scalars),
w(u) = w_i u^i;
here the Einstein summation convention is used: in formulas involving
pairs of equally labelled indices, one of them a lower index and the
other an upper index, a sum over these indices is implied.
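The pairing w(u) = w_i u^i can be spelled out in NumPy (a sketch of my
own, not part of the original text), where einsum mirrors the Einstein
convention and the matrix product gives the same scalar:

```python
import numpy as np

# Covector w (row vector) applied to vector u (column vector):
# w(u) = w_i u^i, with an implicit sum over the repeated index i.
w = np.array([1.0, 2.0, 3.0])   # components w_i of a covector
u = np.array([4.0, 5.0, 6.0])   # components u^i of a vector

s_einstein = np.einsum('i,i->', w, u)  # the Einstein-convention sum
s_matrix = w @ u                       # the same pairing as w^T u

assert np.isclose(s_einstein, s_matrix)
print(s_einstein)  # 1*4 + 2*5 + 3*6 = 32.0
```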
Mathematicians using linear algebra (where no tensors of order
>2 appear) write instead all indices as lower indices, no matter
whether they belong to row vectors, column vectors, or matrices.
They also write all sums explicitly, consider all vectors given
by a single letter as column vectors, and write covectors (1-forms)
explicitly using the transposition sign (^T, but statisticians often
use a prime ' instead, which is also the form used in Matlab).
This has many advantages and allows a simple notation
which increases understandability of otherwise long formulas.
Phys. notation: s = x^k y_k x vector, y covector
Math. notation: s = sum_k y_k x_k
or simply s=y^Tx.
Phys. notation: y^i = A^i_k x^k   x,y vectors, A matrix
Math. notation: y_i = sum_k A_ik x_k
or simply y=Ax.
Phys. notation: s = A_i^i A matrix
Math. notation: s = sum_i A_ii
or simply s = tr A (trace).
Phys. notation: y^i = A^i_j B^j_k x^k   x,y vectors, A,B matrices,
Math. notation: y_i = sum_jk A_ij B_jk x_k
or simply y=ABx.
Phys. notation: y^i = A^i_j B^j_k C^k_l D^l_m x^m
                x,y vectors, A,B,C,D matrices
Math. notation: y_i = sum_jklm A_ij B_jk C_kl D_lm x_m
or simply y=ABCDx.
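The dictionary above can be replayed in NumPy (my own sketch, not part
of the original text): einsum spells out the physicists' index
contractions, while @ and trace give the compact linear algebra forms.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
x = rng.standard_normal(n)
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))

# y^i = A^i_k x^k                      versus  y = A x
assert np.allclose(np.einsum('ik,k->i', A, x), A @ x)

# s = A_i^i                            versus  s = tr A
assert np.isclose(np.einsum('ii->', A), np.trace(A))

# y^i = A^i_j B^j_k x^k                versus  y = A B x
assert np.allclose(np.einsum('ij,jk,k->i', A, B, x), A @ B @ x)

# y^i = A^i_j B^j_k C^k_l D^l_m x^m    versus  y = A B C D x
assert np.allclose(np.einsum('ij,jk,kl,lm,m->i', A, B, C, D, x),
                   A @ B @ C @ D @ x)
```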
The linear algebra notation is compact and index-free,
in spite of the fact that coordinates are being used.
For higher order tensors, the advantages of the linear algebra
notation are less pronounced since one has to specify which
pairs of indices must be contracted. However, often, an index-free
notation is still possible:
Phys. notation: A_li = R_ijkl b^j c^k
Math. notation: A(u,v) = R(v,b,c,u)
Phys. notation: A_l^i = R^i_j^k_l b^j c_k
Math. notation: A(u,v^T) = R(v^T,b,c^T,u)
Phys. notation: A_i^j = R_i^k_k^j
Math. notation: A = tr_23 R,
where the subscripts indicate which indices must be contracted.
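Under my reading that tr_23 contracts the second and third indices of
R, this last example looks as follows in NumPy (a sketch, not from the
original text):

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.standard_normal((3, 3, 3, 3))  # a 4th-order tensor

# A_i^j = R_i^k_k^j: contract the 2nd and 3rd indices of R (tr_23)
A = np.einsum('ikkj->ij', R)

# equivalently, sum over the diagonal of the two middle slots by hand
A_check = sum(R[:, k, k, :] for k in range(3))
assert np.allclose(A, A_check)
```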
All this is completely independent of any metric.
If a metric = nondegenerate symmetric (0,2)-tensor g is given on T,
which associates with u,v in T the scalar g(u,v),
one can canonically identify vectors and covectors, at the
expense of some confusion if one is not careful.
This reads in physicists' notation as follows: The metric is
g_ik=g_ki (expressing the symmetry),
and for every vector u^k, the associated covector is
u_i = g_ik u^k.
Conversely, one can reconstruct the vector from the covector using
u^k = g^ik u_i,
where g^ik=g^ki is the inverse metric, a symmetric (2,0)-tensor which
for consistency must satisfy the equations
g_ij g^kj = delta_i^k (*)
with the Kronecker delta
delta_i^k = 1 if i=k, = 0 otherwise,
which is the identity matrix written as a (1,1)-tensor in index
notation. Nondegeneracy is precisely the solvability of (*) for the
dual metric.
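Raising and lowering indices with the metric can be checked
numerically; here is a sketch (my own, not part of the original text)
using the Minkowski metric diag(1,-1,-1,-1) as a concrete
nondegenerate symmetric example:

```python
import numpy as np

# A nondegenerate symmetric metric: here the Minkowski metric.
g = np.diag([1.0, -1.0, -1.0, -1.0])   # g_ik
g_inv = np.linalg.inv(g)               # g^ik, the dual (inverse) metric

# consistency condition (*): g_ij g^kj = delta_i^k
assert np.allclose(np.einsum('ij,kj->ik', g, g_inv), np.eye(4))

u_up = np.array([1.0, 2.0, 3.0, 4.0])        # vector components u^k
u_down = np.einsum('ik,k->i', g, u_up)       # covector u_i = g_ik u^k
u_back = np.einsum('ik,i->k', g_inv, u_down) # u^k = g^ik u_i
assert np.allclose(u_back, u_up)
```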
Mathematicians find it confusing to label different objects with the
same symbol, and prefer to always distinguish between a vector and its
canonically associated covector. Given a basis of T and the dual
cobasis of T^*, coordinates (row and column vectors) can be used to
define the elements of T and T^*; the metric g in T_2 is represented
in these coordinates by an invertible symmetric matrix = (1,1)-tensor G
such that
g(u,v) = u^TGv for u,v in T.
The canonical pairing induced by the metric therefore associates with
the vector u the covector
w^T = u^TG. (**)
Conversely, one can reconstruct from the covector w^T the canonically
associated vector
u = G^{-1}w.
The dual metric therefore maps u^T, v^T to u^TG^{-1}v, and is
represented by the inverse matrix G^{-1}.
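The same identification in the matrix form (**) and its inverse can be
sketched as follows (my own illustration; the particular G is an
arbitrary symmetric, safely invertible matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
# an invertible symmetric matrix G representing the metric: g(u,v) = u^T G v
M = rng.standard_normal((4, 4))
G = M + M.T + 8 * np.eye(4)   # symmetric; the shift makes it invertible here

u = rng.standard_normal(4)
w = G.T @ u                      # (**): w^T = u^T G, i.e. w = G^T u = G u
u_back = np.linalg.solve(G, w)   # reconstruction u = G^{-1} w
assert np.allclose(u_back, u)
```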
The relation between the physicists' form and the linear algebra form
of writing things can be inferred from (**) - we simply have
Phys. notation: g_ik
Math. notation: G = (g_ik)
Phys. notation: g^ik
Math. notation: G^{-1} = (g^ik)
Again, the linear algebra notation is compact and index free,
in spite of the fact that coordinates are being used.