An Introduction to Tensor Notation


Mathematics, and the ways we communicate and work with it, have evolved throughout human history. Today, most of us are taught to express concepts in the language of Linear Algebra, writing quantities as matrices and vectors. While this is useful for visualization and ease of computation in many cases, there are other ways to view and tackle problems that step outside this mainstream approach. One example, which this post introduces, is Tensor Notation.

Tensor Notation

The Basics

Tensor notation (also called index notation) represents a quantity by its components, using one index for each dimension (order) of the quantity: a vector is written $a_i$ and a matrix $A_{ij}$. This makes it easy to work with tensors of any order, so it generalizes better than typical Linear Algebra techniques, which are built around vectors and matrices.

The basic concepts of tensor notation are the following:

  • An index repeated within a term implies a summation over that index
    • $a_i b_i = \sum_{i=1}^{n} a_i b_i$
  • Shared indices are dummy variables, so they can be changed to anything
    • $a_i b_i = a_k b_k = a_p b_p$
  • Indexed factors in a product are scalar components, so their order does not matter
    • $a_i b_j c_k = b_j c_k a_i = b_j a_i c_k$
  • Vectors are represented in some unit vector basis $\left\lbrace \hat{e}_i \right\rbrace$
    • $\textbf{u} = u_i \hat{e}_i$
  • Dot product between unit vectors of an orthonormal basis yields the Kronecker delta
    • $\hat{e}_i \cdot \hat{e}_j = \delta_{ij}$
  • Derivative of a tensor with respect to itself yields Kronecker deltas
    • $\frac{\partial x_i}{\partial x_j} = \delta_{ij}$
    • $\frac{\partial A_{ij}}{\partial A_{mn}} = \delta_{im}\delta_{jn}$
  • Multiplying a tensor by a Kronecker delta that shares one of its indices replaces that index with the delta's other index
    • $A_{ij}\delta_{jp} = A_{ip}$
  • Cross products are written with the third-order permutation (Levi-Civita) symbol $\mathcal{E}_{ijk}$
    • $(\textbf{a} \times \textbf{b})_i = \mathcal{E}_{ijk} a_j b_k$
  • Transposing a second-order tensor swaps its two indices
    • $A_{ij}^{T} = A_{ji}$


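Here the Kronecker delta $\delta_{ij}$ and the permutation symbol $\mathcal{E}_{ijk}$ are defined as:
\delta_{ij} &= \begin{cases}
1 & i = j \\
0 & i \neq j
\end{cases} \\
\mathcal{E}_{ijk} &= \begin{cases}
1 & (i,j,k) \text{ is an even permutation of } (1,2,3) \\
-1 & (i,j,k) \text{ is an odd permutation of } (1,2,3) \\
0 & \text{otherwise}
\end{cases}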
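
To make these rules concrete, here is a minimal sketch in plain Python (no libraries) of the two symbols and two of the rules listed above. The function names `delta`, `levi_civita`, `cross`, and `substitute_index` are my own choices for illustration, and the indices run over 0, 1, 2 instead of 1, 2, 3 because Python lists are zero-based.

```python
def delta(i, j):
    """Kronecker delta: 1 when the indices match, 0 otherwise."""
    return 1 if i == j else 0


def levi_civita(i, j, k):
    """Permutation symbol for zero-based indices drawn from {0, 1, 2}."""
    # The product is +2 for even permutations, -2 for odd permutations,
    # and 0 whenever two indices coincide, so halving gives +1, -1, or 0.
    return (j - i) * (k - i) * (k - j) // 2


def cross(a, b):
    """(a x b)_i = E_ijk a_j b_k, summing over the repeated j and k."""
    n = range(3)
    return [sum(levi_civita(i, j, k) * a[j] * b[k] for j in n for k in n)
            for i in n]


def substitute_index(A):
    """A_ij delta_jp: the sum over j collapses, leaving A_ip."""
    rows, cols = range(len(A)), range(len(A[0]))
    return [[sum(A[i][j] * delta(j, p) for j in cols) for p in cols]
            for i in rows]


print(cross([1, 0, 0], [0, 1, 0]))         # e_1 x e_2 should give e_3
print(substitute_index([[1, 2], [3, 4]]))  # should reproduce the input
```

Every repeated index becomes one `sum(...)` loop, while the free indices become the loops that build the output list. That is really all the summation convention is.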


With the basics listed out, let's work through a set of examples to solidify them!

Example 1

Expand the expression $3 a_i b_i$ given $i \in \left\lbrace 1,2\right\rbrace$.
3 a_i b_i = 3 \left(a_1 b_1 + a_2 b_2 \right)

Example 2

Use tensor notation to represent the inner product between vectors $\textbf{u}$ and $\textbf{v}$.
\textbf{u} \cdot \textbf{v} &= u_i \hat{e}_i \cdot v_j \hat{e}_j \\
&= u_i v_j (\hat{e}_i \cdot \hat{e}_j) \\
&= u_i v_j \delta_{ij} \\
&= u_i v_i
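
As a quick numerical check of this derivation, the sketch below evaluates both the double sum $u_i v_j \delta_{ij}$ and the contracted form $u_i v_i$ in plain Python. The function names and the sample vectors are hypothetical, chosen only for illustration.

```python
def delta(i, j):
    """Kronecker delta: 1 when the indices match, 0 otherwise."""
    return 1 if i == j else 0


def dot_with_delta(u, v):
    """u_i v_j delta_ij, written as an explicit double sum over i and j."""
    n = range(len(u))
    return sum(u[i] * v[j] * delta(i, j) for i in n for j in n)


def dot_contracted(u, v):
    """u_i v_i, the result after the delta replaces j with i."""
    return sum(u[i] * v[i] for i in range(len(u)))


u = [1.0, 2.0, 3.0]
v = [4.0, -5.0, 6.0]
print(dot_with_delta(u, v), dot_contracted(u, v))  # the two values should agree
```

The delta zeroes out every term with $i \neq j$, which is why the double sum and the single sum give the same number.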

Example 3

Expand the expression $A_{ij} b_{j}$ given $j \in \left\lbrace 1,2\right\rbrace$.
A_{ij} b_{j} = A_{i1} b_{1} + A_{i2} b_{2}
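
The expression above is just a matrix-vector product: $j$ is summed away and the free index $i$ labels the entries of the result. A small sketch in plain Python, where the name `matvec` and the sample values are my own assumptions:

```python
def matvec(A, b):
    """c_i = A_ij b_j: j is the summed (dummy) index, i is the free index."""
    return [sum(A[i][j] * b[j] for j in range(len(b)))
            for i in range(len(A))]


A = [[1, 2],
     [3, 4]]
b = [5, 6]
print(matvec(A, b))  # one entry per value of the free index i
```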

Example 4

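Compute the derivative of the quantity $J = x^{T}Ax$ with respect to $x$, where $A$ is symmetric. The derivation applies the product rule, relabels the dummy index $i$ as $j$ in the second term, and then uses the symmetry $A_{jk} = A_{kj}$.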
J = x^{T}Ax &= x_{i}A_{ij}x_{j}\\
\frac{\partial J}{\partial x_{k}} &= \frac{\partial}{\partial x_{k}}\left(x_{i}A_{ij}x_{j}\right) \\
\frac{\partial J}{\partial x_{k}} &= \frac{\partial x_{i}}{\partial x_{k}}A_{ij}x_{j} + x_{i}A_{ij}\frac{\partial x_{j}}{\partial x_{k}} \\
\frac{\partial J}{\partial x_{k}} &= \delta_{ik}A_{ij}x_{j} + x_{i}A_{ij}\delta_{jk} \\
\frac{\partial J}{\partial x_{k}} &= A_{kj}x_{j} + x_{i}A_{ik} \\
\frac{\partial J}{\partial x_{k}} &= A_{kj}x_{j} + A_{jk}x_{j} \\
\frac{\partial J}{\partial x_{k}} &= A_{kj}x_{j} + A_{kj}x_{j} \\
\frac{\partial J}{\partial x_{k}} &= 2A_{kj}x_{j}
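
A result like this is easy to sanity-check numerically. The sketch below compares $2A_{kj}x_{j}$ against a central finite-difference approximation of the gradient, in plain Python. The function names, the step size `h`, and the sample matrix are assumptions made for illustration, and the matrix must be symmetric for the comparison to hold.

```python
def quadratic_form(A, x):
    """J = x_i A_ij x_j, summing over both repeated indices."""
    n = range(len(x))
    return sum(x[i] * A[i][j] * x[j] for i in n for j in n)


def gradient_index_form(A, x):
    """dJ/dx_k = 2 A_kj x_j, valid when A is symmetric."""
    n = range(len(x))
    return [2 * sum(A[k][j] * x[j] for j in n) for k in n]


def gradient_finite_difference(A, x, h=1e-6):
    """Central-difference approximation of dJ/dx_k, one component at a time."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((quadratic_form(A, x_plus) - quadratic_form(A, x_minus))
                    / (2 * h))
    return grad


A = [[2.0, 1.0],
     [1.0, 3.0]]  # symmetric sample matrix
x = [1.0, -2.0]
print(gradient_index_form(A, x))
print(gradient_finite_difference(A, x))  # should match up to rounding error
```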

Example 5

Assuming some matrix $C$ is invertible, find $\frac{\partial C^{-1}_{ij}}{\partial C_{kl}}$.
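First, since $C$ is invertible, we know that $C_{ik}C^{-1}_{kj} = \delta_{ij}$.
Differentiating both sides with respect to $C_{lm}$, then multiplying by $C^{-1}_{ri}$ to isolate the unknown derivative, gives the following (the free indices are relabeled in the last line to match the question):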
C_{ik}C^{-1}_{kj} &= \delta_{ij}\\
\frac{\partial}{\partial C_{lm}}\left(C_{ik}C^{-1}_{kj}\right) &= \frac{\partial \delta_{ij}}{\partial C_{lm}}\\
\frac{\partial C_{ik}}{\partial C_{lm}}C^{-1}_{kj} + \frac{\partial C^{-1}_{kj}}{\partial C_{lm}}C_{ik} &= 0 \\
\frac{\partial C^{-1}_{kj}}{\partial C_{lm}}C_{ik} &= -\delta_{il}\delta_{km}C^{-1}_{kj} \\
\frac{\partial C^{-1}_{kj}}{\partial C_{lm}}C_{ik}C^{-1}_{ri} &= -\delta_{il}C^{-1}_{mj}C^{-1}_{ri} \\
\frac{\partial C^{-1}_{kj}}{\partial C_{lm}}\delta_{rk} &= -C^{-1}_{mj}C^{-1}_{rl} \\
\frac{\partial C^{-1}_{rj}}{\partial C_{lm}} &= -C^{-1}_{mj}C^{-1}_{rl} \\
\frac{\partial C^{-1}_{ij}}{\partial C_{kl}} &= -C^{-1}_{lj}C^{-1}_{ik}
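
The final formula can be checked the same way. The sketch below does so for a $2 \times 2$ matrix in plain Python, comparing $-C^{-1}_{ik}C^{-1}_{lj}$ with a central finite difference of the inverse. The helper names, the step size `h`, and the sample matrix are hypothetical, and the closed-form inverse used here only works for the $2 \times 2$ case.

```python
def inverse_2x2(C):
    """Closed-form inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = C
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


def d_inverse_index_form(C, i, j, k, l):
    """d(C^-1)_ij / dC_kl = -(C^-1)_ik (C^-1)_lj."""
    C_inv = inverse_2x2(C)
    return -C_inv[i][k] * C_inv[l][j]


def d_inverse_finite_difference(C, i, j, k, l, h=1e-6):
    """Central difference: perturb only the entry C_kl and re-invert."""
    C_plus = [row[:] for row in C]
    C_minus = [row[:] for row in C]
    C_plus[k][l] += h
    C_minus[k][l] -= h
    return (inverse_2x2(C_plus)[i][j] - inverse_2x2(C_minus)[i][j]) / (2 * h)


C = [[4.0, 1.0],
     [2.0, 3.0]]  # sample invertible matrix
idx = range(2)
worst = max(abs(d_inverse_index_form(C, i, j, k, l)
                - d_inverse_finite_difference(C, i, j, k, l))
            for i in idx for j in idx for k in idx for l in idx)
print(worst)  # largest mismatch over all 16 index combinations
```

Looping over all four indices checks every component of the fourth-order result, not just one entry.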


With this post, we have covered some basic aspects of Tensor Notation and investigated how to use it for various derivations. I have found this skill very useful for derivations with respect to matrices, a common task when deriving control algorithms based on state-space models of the dynamics. Tensor Notation also appears throughout modern physics, and I am sure many other disciplines use it often as well.

In the future, we may investigate using Tensor Notation and applying it to some area of study, like controls or some aspect of physics.
