Hide code cell source
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib_inline
matplotlib_inline.backend_inline.set_matplotlib_formats('svg')
import seaborn as sns
sns.set_context("paper")
sns.set_style("ticks");

The Model Calibration Problem#

The model calibration problem is the inverse of the uncertainty propagation problem, which is why such problems are also called inverse problems. It goes as follows: one observes a quantity that the model predicts and wants to go back and characterize how this observation changes the state of knowledge about the model’s parameters.
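To fix notation (this is one common formulation, not the only one): write \(\theta\) for the unknown parameters, \(f(\theta)\) for the model prediction, and \(d\) for a noisy observation, e.g., \(d = f(\theta) + \epsilon\) with \(\epsilon\) the measurement noise. Calibration then asks for the state of knowledge about \(\theta\) after seeing \(d\), i.e., the conditional density \(p(\theta \mid d)\).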

Example: Driving a trailer on a rough road#

Recall the trailer example. We have a trailer of mass \(m\) moving on a rough road with speed \(v\). The suspension spring constant is \(k\). We are interested in the vibration amplitude \(X\), which depends on the road’s oscillation amplitude \(y_0\) and on the excitation frequency \(\omega\), itself determined by the speed \(v\) and the wavelength \(L\) of the road’s surface. A noisy sensor measures the amplitude; call the measurement \(X_m\). Here is the causal graph:

Hide code cell source
from graphviz import Digraph
g = Digraph('Trailer')
# Model variables (none observed yet)
g.node('k')
g.node('m')
g.node('y0', label='<y<sub>0</sub>>')
g.node('omega', label='<&omega;>')
g.node('v')
g.node('L')
g.node('X')
# The excitation frequency is set by the speed and the road wavelength
g.edge('v', 'omega')
g.edge('L', 'omega')
# The amplitude depends on the road amplitude, the frequency, k, and m
g.edge('y0', 'X')
g.edge('omega', 'X')
g.edge('k', 'X')
g.edge('m', 'X')
# Filled node: the noisy amplitude measurement is observed
g.node('Xm', label='<X<sub>m</sub>>', style='filled')
g.edge('X', 'Xm')
#g.render('trailer_m_g', format='png')
g

We have filled the node \(X_m\) with color to indicate that we observe it. Here the calibration problem is to identify all unknown parameters given the sensor data. This particular problem is ill-posed: it is impossible to find the five variables \(k\), \(m\), \(y_0\), \(v\), and \(L\) from a single noisy amplitude measurement.
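To make the forward model concrete, here is a minimal sketch of one model that could sit behind this graph. It assumes an undamped, base-excited spring-mass suspension, so a sinusoidal road profile of amplitude \(y_0\) and wavelength \(L\) driven over at speed \(v\) excites the trailer at \(\omega = 2\pi v / L\); the function name and all numbers are made up for illustration:

```python
import numpy as np

def trailer_amplitude(k, m, y0, v, L):
    """Steady-state vibration amplitude X of the trailer.

    Assumes an undamped, base-excited spring-mass suspension: a sinusoidal
    road profile with amplitude y0 and wavelength L, driven over at speed v,
    excites the trailer at omega = 2 * pi * v / L rad/s.
    """
    omega = 2.0 * np.pi * v / L        # excitation frequency
    omega_n = np.sqrt(k / m)           # natural frequency of the suspension
    return y0 / np.abs(1.0 - (omega / omega_n) ** 2)

# A single noisy amplitude measurement (all numbers made up):
rng = np.random.default_rng(0)
X = trailer_amplitude(k=1000.0, m=200.0, y0=0.05, v=5.0, L=10.0)
X_m = X + rng.normal(0.0, 0.005)       # hypothetical sensor noise
```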

Being engineers, we do not give up easily. We call the manufacturer and ask for the spring constant \(k\). They tell us that it is \(k = 1000\) N/m with very small uncertainty. We also add a sensor to measure the trailer’s mass \(m\) and another to measure its velocity \(v\). Let \(m_m\) and \(v_m\) be the measurements from these sensors.

We need to update the causal graph as follows:

  • We shade the spring constant \(k\) to indicate that it is known.

  • We add the two new sensors as nodes \(m_m\) and \(v_m\). They are shaded to indicate that they are observed.

Here is the new causal graph:

Hide code cell source
g = Digraph('Trailer')
# The spring constant is now known (shaded)
g.node('k', style='filled')
g.node('m')
# New sensor: noisy measurement of the mass (observed, hence shaded)
g.node('mm', label='<m<sub>m</sub>>', style='filled')
g.node('y0', label='<y<sub>0</sub>>')
g.node('omega', label='<&omega;>')
g.node('v')
# New sensor: noisy measurement of the speed (observed, hence shaded)
g.node('vm', label='<v<sub>m</sub>>', style='filled')
g.node('L')
g.node('X')
g.edge('v', 'omega')
g.edge('v', 'vm')
g.edge('L', 'omega')
g.edge('y0', 'X')
g.edge('omega', 'X')
g.edge('k', 'X')
g.edge('m', 'X')
g.edge('m', 'mm')
g.node('Xm', label='<X<sub>m</sub>>', style='filled')
g.edge('X', 'Xm')
#g.render('trailer_m_g', format='png')
g

Now \(y_0\) and \(L\) are identifiable. The process we just followed formalizes something we do all the time in engineering practice.

Solving inverse problems#

We need several lectures to understand how to pose and solve such problems properly, but here is an outline of the answer (a small end-to-end code sketch follows the list):

  • First, we need to specify all the structural equations of the causal graph, introducing unknown parameters as required. In the diagram above, we had all the structural equations except the ones that connect the sensors to the corresponding variables; that is, we still need the measurement model, otherwise known as the likelihood.

  • Second, we need to quantify our prior knowledge about all the unknown parameters by assigning probability densities to them. The resulting model is typically called a probabilistic graphical model; other common names are Bayesian network and hierarchical Bayesian model.

  • Third, we use something called Bayes’ rule to condition our prior knowledge on the observations. This updated knowledge is our posterior knowledge. We call this step Bayesian inference.

  • Unfortunately, this posterior knowledge is rarely analytically available. So, the fourth step is to create a practical procedure that characterizes our posterior state of knowledge. The most common approach is to sample from the posterior distribution, e.g., via Markov chain Monte Carlo (MCMC). Another approach is to approximate the posterior distribution with a simpler distribution. This approach is called variational inference.
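Here is a minimal sketch of the four steps for the trailer, reusing the hypothetical `trailer_amplitude` model from the earlier cell. For brevity it plugs the sensor readings in for \(m\) and \(v\) rather than treating them as unknowns, picks arbitrary flat priors, and hand-rolls a random-walk Metropolis sampler; a real analysis would use an established MCMC library:

```python
import numpy as np

rng = np.random.default_rng(1)

# Known and measured quantities (made-up numbers; we plug the sensor
# readings in for m and v and ignore their small noise to keep this short)
k, m_m, v_m = 1000.0, 200.0, 5.0
X_m, sigma = 0.052, 0.005              # amplitude reading and its noise scale

def log_prior(y0, L):
    # Step 2: flat priors over plausible ranges (an arbitrary choice here)
    return 0.0 if (0.0 < y0 < 0.2 and 1.0 < L < 20.0) else -np.inf

def log_posterior(theta):
    y0, L = theta
    lp = log_prior(y0, L)
    if not np.isfinite(lp):
        return -np.inf
    # Step 1: Gaussian measurement model (the likelihood)
    X = trailer_amplitude(k, m_m, y0, v_m, L)
    return lp - 0.5 * ((X_m - X) / sigma) ** 2

# Steps 3-4: random-walk Metropolis draws samples from the posterior
theta = np.array([0.05, 10.0])
chain = []
for _ in range(20000):
    proposal = theta + rng.normal(0.0, [0.005, 0.5])
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    chain.append(theta)
chain = np.array(chain)[5000:]         # drop burn-in before summarizing
```

The retained samples characterize our posterior state of knowledge about \(y_0\) and \(L\); for instance, a scatter plot of `chain` shows which combinations of the two are consistent with the single amplitude measurement.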

The steps above are the core of the Bayesian approach to inverse problems, and they apply well beyond trailers or the calibration of physical models: they apply to any problem with a model that has unknown parameters and some observations we can use to find those parameters. This formulation encompasses all of machine learning. Understanding the Bayesian approach to inverse problems is key to understanding state-of-the-art machine learning; most algorithms follow the steps above in one way or another.

Questions#

  • Modify the causal graph above to account for an indirect measurement of the spring stiffness \(k\).

  • Modify the causal model to add sensors that estimate the wavelength \(L\) of the road oscillations, e.g., by taking and analyzing pictures of the road from a vehicle camera.