Dynamic Mode Decomposition#

Dynamic mode decomposition (DMD) is a method that identifies the best-fit linear dynamics underlying high-dimensional time-series data. It combines POD in space with Fourier analysis in time. The benefits of DMD are:

  • It is purely data-driven, requiring no prior knowledge of the system.

  • It is computationally efficient, requiring only basic linear algebra.

  • It can provide insight into the underlying dynamics of the system.

  • It can be used to predict future states of the system.

  • It can account for control inputs and external forcing.

  • It forms the backbone of more advanced techniques, such as data-driven Koopman analysis.

There are some drawbacks to DMD:

  • It is a linear method, so it cannot capture nonlinear dynamics, transient behavior, or bifurcations.

  • It has trouble with traveling waves, such as shock waves or solitons.

  • It requires a large amount of data to accurately capture the dynamics.

  • It can be sensitive to noise in the data.

  • It requires dense measurements of the state (many sensors per snapshot), which may not always be available.

References#

  • This notebook follows closely the exposition in Chapter 7.2 of “Data-Driven Science and Engineering” by Brunton and Kutz.

Theory of DMD#

Suppose we are observing the state of a dynamical system at discrete and equidistant time points \(t_1,\dots,t_m, t_{m+1}\). Assume

\[\Delta t = t_{k+1} - t_k \]

is the time step between observations. The state of the system at time \(t_k\) is given by a vector \(\mathbf{x}_k = \mathbf{x}(t_k)\) in \(\mathbb{R}^n\). The state \(\mathbf{x}_k\) could be discrete measurements of a continuous field. For example, \(\mathbf{x}_k\) could be the temperature distribution in a room, the velocity field of a fluid, or the concentration of a chemical species.

Arrange the first \(m\) observations into a matrix:

\[ \mathbf{X} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_{m} \end{bmatrix}. \]

Similarly, put the last \(m\) observations into a matrix:

\[ \mathbf{X}' = \begin{bmatrix} \mathbf{x}_2 & \mathbf{x}_3 & \cdots & \mathbf{x}_{m+1} \end{bmatrix}. \]
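To make this concrete, here is a minimal sketch in NumPy of how the two snapshot matrices might be assembled. The synthetic data set (two pairs of standing waves oscillating at 2 and 5 rad/s, sampled on 128 grid points with \(\Delta t = 0.05\)) and the variable names `F`, `X`, `Xp`, and `dt` are choices made purely for illustration; the later sketches in this section reuse them.

```python
import numpy as np

# Synthetic snapshots, assumed purely for illustration: two standing-wave
# pairs oscillating at 2 and 5 rad/s, sampled every dt = 0.05 time units.
dt = 0.05
s = np.linspace(0, 1, 128)[:, None]        # spatial grid (n = 128), as a column
t = np.arange(0, 4.0, dt)[None, :]         # observation times t_1, ..., t_{m+1}

F = (np.sin(np.pi * s) * np.cos(2 * t) + np.sin(2 * np.pi * s) * np.sin(2 * t)
     + 0.5 * (np.sin(3 * np.pi * s) * np.cos(5 * t)
              + np.sin(4 * np.pi * s) * np.sin(5 * t)))    # shape (n, m + 1)

X = F[:, :-1]     # columns x_1, ..., x_m
Xp = F[:, 1:]     # columns x_2, ..., x_{m+1}  (this is X')
print(X.shape, Xp.shape)                   # (128, 79) (128, 79)
```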

DMD seeks a linear operator \(\mathbf{A}\) such that

\[ \mathbf{X}' = \mathbf{A} \mathbf{X}. \]

In terms of the state vectors, this equation reads

\[ \mathbf{x}_{k+1} = \mathbf{A} \mathbf{x}_k. \]

The best such operator \(\mathbf{A}\) is found by minimizing the error

\[ \| \mathbf{X}' - \mathbf{A} \mathbf{X} \|_F^2, \]

where \(\| \cdot \|_F\) denotes the Frobenius norm. The solution is given by the pseudo-inverse of \(\mathbf{X}\):

\[ \mathbf{A} = \mathbf{X}' \mathbf{X}^\dagger. \]

We can compute \(\mathbf{X}^\dagger\) from the SVD of \(\mathbf{X}\); this is the step that establishes the connection to POD.
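Because \(n = 128\) is small in the toy example above, we can afford to form the least-squares \(\mathbf{A}\) directly and check the fit. This is only a sanity check under that assumption; the whole point of the steps below is to avoid forming \(\mathbf{A}\) when \(n\) is large. The sketch continues from the arrays `X` and `Xp` defined earlier.

```python
# Continuing from the previous sketch (X, Xp). Feasible only because n is small here.
A = Xp @ np.linalg.pinv(X)                               # A = X' X^+, an n x n matrix
print(np.linalg.norm(Xp - A @ X) / np.linalg.norm(Xp))   # tiny: the toy data follow exactly linear dynamics
```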

However, we would like to be able to predict the state of the system at future times efficiently. This could be done if we could diagonalize \(\mathbf{A}\):

\[ \mathbf{A}\boldsymbol{\Phi} = \boldsymbol{\Phi}\boldsymbol{\Lambda}, \]

where \(\boldsymbol{\Phi}\) is the matrix of eigenvectors and \(\boldsymbol{\Lambda}\) is the diagonal matrix of eigenvalues. If we had this decomposition, then we could write the state of the system at time \(t_k\) as

\[ \mathbf{x}_k = \boldsymbol{\Phi}\boldsymbol{\Lambda}^{k-1}\boldsymbol{\Phi}^{-1}\mathbf{x}_1. \]

To see this, observe that:

\[ \mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Lambda}\boldsymbol{\Phi}^{-1}. \]

So that:

\[ \mathbf{A}^k = \boldsymbol{\Phi}\boldsymbol{\Lambda}^k\boldsymbol{\Phi}^{-1}. \]

Finally,

\[ \mathbf{x}_k = \mathbf{A}\mathbf{x}_{k-1} = \dots = \mathbf{A}^{k-1}\mathbf{x}_1 = \boldsymbol{\Phi}\boldsymbol{\Lambda}^{k-1}\boldsymbol{\Phi}^{-1}\mathbf{x}_1. \]

The problem in DMD is that \(\mathbf{A}\) is \(n \times n\) (with \(n\) huge) and that it is not guaranteed to be diagonalizable. So, we need to derive some sort of approximation to the eigenvalues and eigenvectors of \(\mathbf{A}\). Let’s see how to do this.
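Before approximating anything, it may help to verify the spectral propagation formula numerically. The \(5 \times 5\) random matrix below is an arbitrary stand-in for \(\mathbf{A}\), chosen only because a random matrix is (almost surely) diagonalizable; it has nothing to do with the snapshot data.

```python
import numpy as np

# Check x_k = Phi Lambda^(k-1) Phi^(-1) x_1 against direct iteration of a small matrix.
rng = np.random.default_rng(0)
A_small = rng.standard_normal((5, 5))
x_start = rng.standard_normal(5)

evals, evecs = np.linalg.eig(A_small)              # A_small @ evecs = evecs @ diag(evals)
k = 7
x_k_direct = np.linalg.matrix_power(A_small, k - 1) @ x_start
x_k_spectral = (evecs @ np.diag(evals ** (k - 1)) @ np.linalg.inv(evecs) @ x_start).real
print(np.allclose(x_k_direct, x_k_spectral))       # True
```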

Step 1: Compute the SVD of \(\mathbf{X}\)#

The first step in DMD is to compute the SVD of the matrix \(\mathbf{X}\). This is needed to find the best \(\mathbf{A}\). We typically use the truncated SVD keeping \(r\) terms:

\[ \mathbf{X} \approx \tilde{\mathbf{U}} \tilde{\boldsymbol{\Sigma}} \tilde{\mathbf{V}}^T. \]

As a reminder, \(\tilde{\mathbf{U}}\) is \(n \times r\), \(\tilde{\boldsymbol{\Sigma}}\) is \(r \times r\), and \(\tilde{\mathbf{V}}\) is \(m \times r\). We have that \(\tilde{\mathbf{U}}^T\tilde{\mathbf{U}} = \mathbf{I}_r\) and \(\tilde{\mathbf{V}}^T\tilde{\mathbf{V}} = \mathbf{I}_r\).

Recall that the POD modes are given by the columns of \(\tilde{\mathbf{U}}\). So, think of each column \(\mathbf{u}_j\) as a basis vector for the state space. We can project each observation \(\mathbf{x}_k\) onto this basis:

\[ \mathbf{x}_k \approx \sum_{j=1}^r \langle \mathbf{x}_k, \mathbf{u}_j \rangle \mathbf{u}_j = \sum_{j=1}^r \mathbf{u}_j \mathbf{u}_j^T \mathbf{x}_k = \tilde{\mathbf{U}}\tilde{\mathbf{U}}^T\mathbf{x}_k. \]

We will use \(\tilde{\mathbf{x}}_k\) to denote the principal components of \(\mathbf{x}_k\):

\[ \tilde{\mathbf{x}}_k = \tilde{\mathbf{U}}^T\mathbf{x}_k. \]

In terms of matrices:

\[ \tilde{\mathbf{X}} = \tilde{\mathbf{U}}^T\mathbf{X} = \tilde{\boldsymbol{\Sigma}}\tilde{\mathbf{V}}^T. \]
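Here is a sketch of Step 1, continuing from the snapshot matrices built earlier. The truncation rank \(r = 4\) is an assumption that happens to be exact for the synthetic data; with real data one would pick \(r\) by inspecting the singular value spectrum.

```python
# Step 1 (continuing from X, Xp above): truncated SVD of X.
U, Sig, Vt = np.linalg.svd(X, full_matrices=False)
print(Sig[:6])                    # only four singular values are significantly nonzero here

r = 4                             # truncation rank (assumed; exact for this toy data set)
U_r = U[:, :r]                    # POD modes, n x r
Sig_r = Sig[:r]                   # leading singular values, kept as a 1-D array
V_r = Vt[:r, :].T                 # right singular vectors, m x r

X_tilde = U_r.T @ X               # principal components of every snapshot, r x m
```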

Step 2: Compute the projection of the optimal \(\mathbf{A}\) on the POD modes#

The optimal \(\mathbf{A}\) is given by

\[ \mathbf{A} = \mathbf{X}'\mathbf{X}^\dagger. \]

Let’s use the SVD of \(\mathbf{X}\) to find \(\mathbf{X}^\dagger\):

\[ \mathbf{X}^\dagger = \tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\tilde{\mathbf{U}}^T. \]

So, we get:

\[ \mathbf{A} = \mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\tilde{\mathbf{U}}^T. \]

This is the matrix whose eigenvectors we want to find. It is an \(n \times n\) matrix with \(n\) typically very large. We don’t even want to form this matrix; it wouldn’t fit in memory. What do we do?

Well, instead of looking at the dynamics in the original space, let’s look at the dynamics of the principal components. We want to find a matrix \(\tilde{\mathbf{A}}\) such that

\[ \tilde{\mathbf{x}}_{k+1} = \tilde{\mathbf{A}}\tilde{\mathbf{x}}_k. \]

What is the relation between \(\mathbf{A}\) and \(\tilde{\mathbf{A}}\)? Start with the left-hand side:

\[ \tilde{\mathbf{x}}_{k+1} = \tilde{\mathbf{U}}^T\mathbf{x}_{k+1} = \tilde{\mathbf{U}}^T\mathbf{A}\mathbf{x}_k \approx \tilde{\mathbf{U}}^T\mathbf{A}\tilde{\mathbf{U}}\tilde{\mathbf{x}}_k, \]

where the last step uses the POD approximation \(\mathbf{x}_k \approx \tilde{\mathbf{U}}\tilde{\mathbf{U}}^T\mathbf{x}_k = \tilde{\mathbf{U}}\tilde{\mathbf{x}}_k\).

By comparison, we see that:

\[ \tilde{\mathbf{A}} = \tilde{\mathbf{U}}^T\mathbf{A}\tilde{\mathbf{U}}. \]

Plugging in the expression for \(\mathbf{A}\), we get:

\[ \tilde{\mathbf{A}} = \tilde{\mathbf{U}}^T\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}. \]

This is an \(r \times r\) matrix. We can easily form it and find its eigenvalues and eigenvectors.
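A sketch of Step 2, continuing from the truncated SVD above. Dividing by `Sig_r` relies on NumPy broadcasting to apply \(\tilde{\boldsymbol{\Sigma}}^{-1}\) column by column, so no diagonal matrix is ever formed.

```python
# Step 2 (continuing from U_r, Sig_r, V_r, Xp above): the r x r matrix A_tilde.
A_tilde = (U_r.T @ Xp @ V_r) / Sig_r     # dividing column j by sigma_j right-multiplies by Sigma^(-1)
print(A_tilde.shape)                     # (4, 4)
```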

Step 3: Find the eigenvalues and eigenvectors of \(\tilde{\mathbf{A}}\)#

Now we just use a standard eigenvalue solver to find the eigenvalues and eigenvectors of \(\tilde{\mathbf{A}}\). We write:

\[ \tilde{\mathbf{A}}\mathbf{W} = \mathbf{W}\boldsymbol{\Lambda}, \]

where \(\mathbf{W}\) is the matrix of eigenvectors and \(\boldsymbol{\Lambda}\) is the diagonal matrix of eigenvalues. The eigenvalues will in general be complex. As we will show below, the entries of \(\boldsymbol{\Lambda}\) are also eigenvalues of \(\mathbf{A}\), and the eigenvectors \(\mathbf{W}\) are related to the eigenvectors of \(\mathbf{A}\).
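A sketch of Step 3, continuing from `A_tilde`. For the synthetic data the eigenvalues come out in complex-conjugate pairs on the unit circle, and their phases divided by \(\Delta t\) recover the two angular frequencies that were built into the snapshots.

```python
# Step 3 (continuing from A_tilde): eigenvalues and eigenvectors of the small matrix.
Lam, W = np.linalg.eig(A_tilde)          # A_tilde @ W = W @ diag(Lam)
print(np.abs(Lam))                       # ~1: purely oscillatory dynamics
print(np.angle(Lam) / dt)                # ~ +-2 and +-5 rad/s, as used to build the toy data
```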

Step 4: Find the first \(r\) eigenvalues and eigenvectors of \(\mathbf{A}\)#

We can now find the first \(r\) eigenvalues and eigenvectors of \(\mathbf{A}\). Let’s try the \(n\times r\) matrix:

\[ \boldsymbol{\Phi} = \mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\mathbf{W}, \]

and see if we are lucky. We have:

\[ \mathbf{A}\boldsymbol{\Phi} = \left(\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\tilde{\mathbf{U}}^T\right)\left(\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\mathbf{W}\right). \]

Now, recall that \(\tilde{\mathbf{A}} = \tilde{\mathbf{U}}^T\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\). So, we can write:

\[ \mathbf{A}\boldsymbol{\Phi} = \mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\tilde{\mathbf{A}}\mathbf{W}. \]

This is the same as:

\[ \mathbf{A}\boldsymbol{\Phi} = \left(\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\mathbf{W}\right)\boldsymbol{\Lambda} = \boldsymbol{\Phi}\boldsymbol{\Lambda}. \]

So, we have found \(r\) eigenvalues and eigenvectors of \(\mathbf{A}\) without ever forming the matrix!

The columns of \(\boldsymbol{\Phi}\) are the DMD modes.
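A sketch of Step 4, continuing from the previous cells. The check at the end evaluates \(\mathbf{A}\boldsymbol{\Phi}\) factor by factor, so the \(n \times n\) matrix \(\mathbf{A}\) is never formed.

```python
# Step 4 (continuing from Xp, V_r, Sig_r, U_r, W, Lam): the DMD modes.
Phi = (Xp @ V_r / Sig_r) @ W                             # n x r, one DMD mode per column

# Verify A Phi = Phi Lambda using A = X' V Sigma^(-1) U^T, factor by factor:
A_Phi = Xp @ (V_r @ ((U_r.T @ Phi) / Sig_r[:, None]))
print(np.allclose(A_Phi, Phi * Lam))                     # True
```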

Spectral expansion#

If \(\boldsymbol{\Phi}\) were \(n\times n\) and invertible, then we have seen that we could write the dynamics as:

\[ \mathbf{x}_k = \boldsymbol{\Phi}\boldsymbol{\Lambda}^{k-1}\boldsymbol{\Phi}^{-1}\mathbf{x}_1. \]

But we have only found the first \(r\) eigenvectors, and the matrix \(\boldsymbol{\Phi}\) is not even square. So, we can’t do this.

Instead, we write:

\[ \mathbf{x}_k = \boldsymbol{\Phi}\boldsymbol{\Lambda}^{k-1}\mathbf{b}, \]

where \(\mathbf{b}\) is a vector of mode amplitudes that we need to find. It plays the role of an optimal projection of the initial condition onto the DMD modes. We can find it by minimizing the error:

\[ \| \mathbf{x}_1 - \boldsymbol{\Phi}\mathbf{b} \|_2. \]

The minimizer is \(\mathbf{b} = \boldsymbol{\Phi}^\dagger\mathbf{x}_1\), where \(\boldsymbol{\Phi}^\dagger\) is the pseudoinverse of \(\boldsymbol{\Phi}\). However, computing it tends to be expensive when \(n\) is very large.

Another idea is to project \(\mathbf{x}_1\) onto the POD modes and match \(\mathbf{b}\) there. We relax \(\mathbf{x}_1 = \boldsymbol{\Phi}\mathbf{b}\) by replacing the left-hand side with its projection onto the first \(r\) POD modes:

\[ \tilde{\mathbf{U}}\tilde{\mathbf{x}}_1 = \boldsymbol{\Phi}\mathbf{b}. \]

In other words, we require that \(\mathbf{b}\) matches the first \(r\) principal components, not the full state. Now, replace \(\boldsymbol{\Phi}\) by its definition:

\[ \tilde{\mathbf{U}}\tilde{\mathbf{x}}_1 = \mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\mathbf{W}\mathbf{b}. \]

Multiply both sides by \(\tilde{\mathbf{U}}^T\):

\[ \tilde{\mathbf{x}}_1 = \left(\tilde{\mathbf{U}}^T\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\right)\mathbf{W}\mathbf{b}. \]

Observe that the term in the parentheses is again \(\tilde{\mathbf{A}} = \tilde{\mathbf{U}}^T\mathbf{X}'\tilde{\mathbf{V}}\tilde{\boldsymbol{\Sigma}}^{-1}\):

\[ \tilde{\mathbf{x}}_1 = \tilde{\mathbf{A}}\mathbf{W}\mathbf{b}. \]

Use the fact that the columns of \(\mathbf{W}\) are eigenvectors of \(\tilde{\mathbf{A}}\), i.e. \(\tilde{\mathbf{A}}\mathbf{W} = \mathbf{W}\boldsymbol{\Lambda}\):

\[ \tilde{\mathbf{x}}_1 = \mathbf{W}\boldsymbol{\Lambda}\mathbf{b}. \]

And this is a linear system that can be solved for \(\mathbf{b}\):

\[ \mathbf{b} = \left(\mathbf{W}\boldsymbol{\Lambda}\right)^{-1}\tilde{\mathbf{x}}_1. \]

Please note that \(\mathbf{W}\) is not unitary, so we really do need its inverse here (and the formula requires that \(\boldsymbol{\Lambda}\) has no zero eigenvalues)!
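Here is a sketch of both ways of computing the amplitudes, continuing from the previous cells: the least-squares fit \(\mathbf{b} = \boldsymbol{\Phi}^\dagger \mathbf{x}_1\) and the cheaper formula \(\mathbf{b} = (\mathbf{W}\boldsymbol{\Lambda})^{-1}\tilde{\mathbf{x}}_1\). They agree here because the toy initial condition lies exactly in the span of the DMD modes; with noisy data they generally differ.

```python
# DMD amplitudes (continuing from X, U_r, Phi, W, Lam).
x1 = X[:, 0]                                               # initial snapshot x_1

b_lstsq = np.linalg.lstsq(Phi, x1.astype(complex), rcond=None)[0]   # b = Phi^+ x_1
b_cheap = np.linalg.solve(W @ np.diag(Lam), U_r.T @ x1)             # b = (W Lam)^(-1) x_tilde_1

print(np.allclose(b_lstsq, b_cheap))                       # True for this toy data set
```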

DMD expansion in continuous dynamics#

We have succeeded in writing the dynamics as:

\[ \mathbf{x}_k = \boldsymbol{\Phi}\boldsymbol{\Lambda}^{k-1}\mathbf{b} = \sum_{j=1}^r \boldsymbol{\phi}_j \lambda_{j}^{k-1} b_j. \]

Here \(\boldsymbol{\phi}_j\) are the DMD modes, \(\lambda_j\) the DMD eigenvalues, and \(b_j\) the DMD amplitudes of the initial condition. Let’s make the connection to continuous time. Take \(t_1 = 0\), so that

\[ t_k = (k-1)\Delta t, \]

and also introduce the continuous-time eigenvalues \(\omega_j\) (their imaginary parts are angular frequencies, their real parts growth or decay rates) defined by:

\[ \lambda_j = e^{\omega_j\Delta t}. \]

Then, we can write:

\[ \lambda_j^{k-1} = e^{\omega_j (k-1) \Delta t} = e^{\omega_j t_k}. \]

In terms of the state:

\[ \mathbf{x}(t_k) = \sum_{j=1}^r \boldsymbol{\phi}_j e^{\omega_j t_k} b_j. \]

Or for arbitrary time \(t\):

\[ \mathbf{x}(t) = \sum_{j=1}^r \boldsymbol{\phi}_j e^{\omega_j t} b_j. \]
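Putting everything together, here is a sketch of the continuous-time expansion, continuing from the previous cells. Taking the complex logarithm of the eigenvalues assumes none of them is zero, and the helper name `x_dmd` is just a label chosen for this illustration.

```python
# Continuous-time DMD expansion (continuing from Phi, Lam, b_cheap, dt, F, t).
omega = np.log(Lam) / dt                        # continuous-time eigenvalues, here ~ +-2i and +-5i

def x_dmd(time):
    """Evaluate x(t) = sum_j phi_j exp(omega_j t) b_j."""
    return (Phi * np.exp(omega * time)) @ b_cheap

t_grid = t.ravel()
print(np.linalg.norm(x_dmd(t_grid[10]).real - F[:, 10]))   # tiny: reconstructs an observed snapshot
print(np.abs(x_dmd(5.0).imag).max())                        # prediction past the data; imaginary part is round-off
```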