Uncertainty Propagation Through a Boundary Value Problem

MAKE_BOOK_FIGURES=Trueimport numpy as npimport scipy.stats as stimport matplotlib as mplimport matplotlib.pyplot as plt%matplotlib inlineimport matplotlib_inlinematplotlib_inline.backend_inline.set_matplotlib_formats('svg')import seaborn as snssns.set_context("paper")sns.set_style("ticks")def set_book_style():    plt.style.use('seaborn-v0_8-white')     sns.set_style("ticks")    sns.set_palette("deep")    mpl.rcParams.update({        # Font settings        'font.family': 'serif',  # For academic publishing        'font.size': 8,  # As requested, 10pt font        'axes.labelsize': 8,        'axes.titlesize': 8,        'xtick.labelsize': 7,  # Slightly smaller for better readability        'ytick.labelsize': 7,        'legend.fontsize': 7,                # Line and marker settings for consistency        'axes.linewidth': 0.5,        'grid.linewidth': 0.5,        'lines.linewidth': 1.0,        'lines.markersize': 4,                # Layout to prevent clipped labels        'figure.constrained_layout.use': True,                # Default DPI (will override when saving)        'figure.dpi': 600,        'savefig.dpi': 600,                # Despine - remove top and right spines        'axes.spines.top': False,        'axes.spines.right': False,                # Remove legend frame        'legend.frameon': False,                # Additional trim settings        'figure.autolayout': True,  # Alternative to constrained_layout        'savefig.bbox': 'tight',    # Trim when saving        'savefig.pad_inches': 0.1   # Small padding to ensure nothing gets cut off    })def set_notebook_style():    plt.style.use('seaborn-v0_8-white')    sns.set_style("ticks")    sns.set_palette("deep")    mpl.rcParams.update({        # Font settings - using default sizes        'font.family': 'serif',        'axes.labelsize': 10,        'axes.titlesize': 10,        'xtick.labelsize': 9,        'ytick.labelsize': 9,        'legend.fontsize': 9,                # Line and marker settings        'axes.linewidth': 0.5,        'grid.linewidth': 0.5,        'lines.linewidth': 1.0,        'lines.markersize': 4,                # Layout settings        'figure.constrained_layout.use': True,                # Remove only top and right spines        'axes.spines.top': False,        'axes.spines.right': False,                # Remove legend frame        'legend.frameon': False,                # Additional settings        'figure.autolayout': True,        'savefig.bbox': 'tight',        'savefig.pad_inches': 0.1    })def save_for_book(fig, filename, is_vector=True, **kwargs):    """    Save a figure with book-optimized settings.        Parameters:    -----------    fig : matplotlib figure        The figure to save    filename : str        Filename without extension    is_vector : bool        If True, saves as vector at 1000 dpi. If False, saves as raster at 600 dpi.    **kwargs : dict        Additional kwargs to pass to savefig    """        # Set appropriate DPI and format based on figure type    if is_vector:        dpi = 1000        ext = '.pdf'    else:        dpi = 600        ext = '.tif'        # Save the figure with book settings    fig.savefig(f"{filename}{ext}", dpi=dpi, **kwargs)def make_full_width_fig():    return plt.subplots(figsize=(4.7, 2.9), constrained_layout=True)def make_half_width_fig():    return plt.subplots(figsize=(2.35, 1.45), constrained_layout=True)if MAKE_BOOK_FIGURES:    set_book_style()else:    set_notebook_style()make_full_width_fig = make_full_width_fig if MAKE_BOOK_FIGURES else lambda: plt.subplots()make_half_width_fig = make_half_width_fig if MAKE_BOOK_FIGURES else lambda: plt.subplots()

Uncertainty Propagation Through a Boundary Value Problem#

Consider the steady-state heat equation on a heterogeneous rod with no heat sources:

\[ \frac{d}{dx}\left(c(x)\frac{d}{dx}T(x)\right) = 0, \]

for \(x\) between \(0\) and \(1\text{m}\) and boundary values:

\[ T(0) = 400^\circ\text{C}\;\text{and}\;T(x=1\text{m}) = 4^\circ\text{C}. \]

We are interested in cases in which we are uncertain about the conductivity, \(c(x)\). Let’s work on modeling this uncertainty in the first place.

Modeling our state of knowledge about the conductivity of the rod#

We need to come up with \(c(x)\). The first thing we need to do is write down everything we know about the rod:

The rod is one meter long. That is, we are ignoring the possibility of an uncertain length.
The rod is from laminations of \(2\) different materials: fiberglass (labeled material 0) and steel (labeled material 1).
We know that the concentration of fiberglass and steel are \(0.3\) and \(0.7\), respectively.
We know that the thermal conductivity of the fiberglass \(0.045\;\text{Wm}^{-1}\text{K}^{-1}\) and of the second \(38\;\text{Wm}^{-1}\text{K}^{-1}\).
The rod is made out of \(D\) segments, each consisting of only one of the two types of material. Even though we do not know the exact number of elements \(D\), we expect them to be around \(100\).

Let’s turn this information into a mathematical model for the conductivity of the rod. First, the number of segments \(D\) must be a discrete random variable with a given expectation, \(\mathbb{E}[D] = 100\). The best choice is a Poisson random variable with the correct rate parameter. It is:

\[ D \sim \operatorname{Poisson}(100). \]

Let’s plot its pmf to build some intuition about it:

import scipy.stats as st

D = st.poisson(100)

fig, ax = plt.subplots()
ax.bar(
    np.arange(50, 150),
    D.pmf(np.arange(50, 150))
)
ax.set_xlabel('$d$')
ax.set_ylabel('$p(D=d)$')
sns.despine(trim=True);

../_images/8a72e8b16b7c5a504d34d29483bb8b03cdb66bf122064bedf370dc11ded49bc4.svg

This looks reasonable. Let’s move to the segment coordinates. The coordinates of these segments are also random. Specifically, the leftmost coordinate is just \(X_0 = 0\); the right-most is \(X_D = 1\). We know this because we know the total length. What are the intermediate \(D-1\) coordinates? Since we are not told anything else, let’s assume that these intermediate coordinates are generated by uniformly sampling in \([0,1]\) and then sorting the numbers. Mathematically, the random vector \(X_{1:D-1}\) is given by:

\[ (X_1,\dots, X_{D-1}) = \operatorname{sort}(U_1,\dots,U_{D-1}), \]

where

\[ U_i \sim U([0,1]). \]

Let’s visualize this as well to develop our intuition.

# Create some random rod segments
fig, ax = plt.subplots()
for i in range(5):
    # Get the number of segments
    d = D.rvs()
    # Get the intermediate coordinates
    u = np.random.rand(d-1)
    # Create the coordinates of the segment in a sorted manner (including the end points)
    x = np.hstack([[0.0], u, [1.0]])
    # Plot the segment locations
    ax.plot(
        x,
        np.ones(d+1) * i,
        '.',
        label=f'Rod {i}'
    )
ax.set_xlabel('$x$')
plt.legend(loc='best', frameon=False)
sns.despine(trim=True);

../_images/4166060f6b52c1019dd671e3485d6a344ce53af53aca99afce17bfc539fe6ddf.svg

Okay, this looks reasonable as well. Let’s move on to the material types. Each segment will be assigned a random material. Say that the material we assign to segment \(d\) is \(M_d\) taking values in \(\{0, 1\}\). What distribution should we assign to the \(M_d\)’s? We expect the concentration of material \(0\) to be \(0.3\) and of material \(1\) to be \(0.7\). We can interpret this information as the statement that:

\[ \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D1_{\{0\}}(X_d)\right] = 0.3, \]

and

\[ \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D1_{\{1\}}(X_d)\right] = 0.7. \]

These equations say that the expected percentage of material \(0\) is \(0.3\) and similarly for material \(1\). What distribution for \(M_d\) is compatible with this information? Let’s assume that all the \(M_d\)’s are independent and identically distributed conditioned on \(D\). Then we can write:

\[ \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D1_{\{0\}}(M_d)\right] = \mathbb{E}\left[\mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D1_{\{0\}}(M_d)|D\right]\right], \]

where the internal expectation is conditional on \(D\), i.e., it is with respect to \(p(M_1,\dots,M_D|D)\). So, in this internal expectation, you can treat \(D\) as constant and simplify the equation as:

\[ \mathbb{E}\left[\mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D1_{\{0\}}(M_d)|D\right]\right] = \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D\mathbb{E}\left[1_{\{0\}}(M_d)|D\right]\right]. \]

Now because the \(X_d\)’s are identically distributed conditioned on \(D\), the internal expectation is simply the conditional probability of \(X_d\) taking the value \(0\), i.e., we can rewrite it as:

\[ \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^D\mathbb{E}\left[1_{\{0\}}(M_d)|D\right]\right] = \mathbb{E}\left[\frac{1}{D}\sum_{d=1}^Dp(M_d = 0|D)\right]. \]

So, we will satisfy the initial equation if we take \(p(M_d = 0|D) = 0.3\) independently of what \(D\) is. Let’s do exactly that. This means that we can take the \(M_d\) to follow Categorical distributions:

\[ M_d \sim \operatorname{Categorical}(0.3, 0.7), \]

for \(d=1,\dots,D\).

Okay, now we have created random variables that model the number of segments and the material we have on each piece. How do we combine this to make the thermal conductivity a random function? Well, this is now simple. When we want the thermal conductivity at location \(x\), all we have to do is to find the segment \(d\) containing \(x\) and look up the conductivity of material type \(M_d\). If the conductivity of the material \(0\) is \(c_0 = 0.3\) and that of material \(1\) is \(c_1 = 0.8\), then we can write this function as:

\[ c(x) = \sum_{d=1}^Dc_{M_d}1_{[X_{d-1},X_{d}]}(x). \]

Notice that the conductivity is a function of \(D\) and the \(X_d\)’s and the \(M_d\)’s. So, it is a random function.

Getting a bit organized using Python classes#

The model we put together for the rod conductivity needs to be simplified. Let me repeat it here for clarity. It is as follows. The number of segments is:

\[ D\sim \operatorname{Poisson}(100). \]

The \(D+1\) segment coordinates \(X_0,\dots,X_D\) are:

\[ X_0 = 0, \]

\[ X_D = 1, \]

and the intermediate segment coordinates are:

\[ (X_1,\dots, X_{D-1}) = \operatorname{sort}(U_1,\dots,U_{D-1}), \]

where the segment coordinates are uniformly sampled:

\[ U_i \sim U([0,1]), \]

for \(i=1,\dots,D-1\). Finally, the material types are

\[ M_d \sim \operatorname{Categorical}(0.3, 0.7), \]

for \(d=1,\dots,D\), and the random thermal conductivity function:

\[ c(x) = \sum_{d=1}^Dc_{M_d}1_{[X_{d-1},X_{d}]}(x). \]

This is a lot to keep track of. We want to group everything so that we minimize the possibility of bugs. We are going to achieve this using Python classes. If you want to learn what classes are, you may want to pause at this point and watch this tutorial video. Otherwise, skip the definition and go to the end, where we use them.

We will make two classes, a Rod and a RandomRod. Their descriptions are as follows:

Rod represents a specific rod, a rod sample if you like. We will be able to initialize it using the segment coordinates, the material ids that go in each segment, and the thermal conductivity that comes with each material. This class will have the following methods:

Rod.get_conductivity(x) will give us the thermal conductivity of the specimen at location x.
Rod.__repr__() will return a text representation of the specimen. This function will be used when you try to print a rod.
Rod.plot() will be a function that visualizes the rod.

RandomRod represents a … random rod. It collects all the random variables that affect the characteristics of the rod. Hence, it needs to know the rate of the Poisson distribution for \( D \), the concentration of each material, and, of course, the thermal conductivity associated with each material. This class will have the following methods:

RandomRod.rvs() will be a function that samples a random Rod.

Okay, let’s do it. First, the Rod. You can skip the class definition and go directly to the code block where we use it.

Show code cell source Hide code cell source

from matplotlib.patches import Rectangle

class Rod(object):
    """A class representing a rod made out of different materials.
    
    Arguments:
    segment_coords -- Coordinates of the segments (both left and right). Total N + 1
                      if segments are N.
    mat_id         -- The material ids on each segment. Total N. Values must start
                      at 0 and go sequential to the maximum number of materials
                      we have on the rod.
    mat_cond       -- The conductivity associated with each unique material id.
    """
    
    def __init__(
        self,
        segment_coords,
        mat_id,
        mat_cond
    ):
        # Save segments for later
        self.segment_coords = np.sort(segment_coords)
        # The total number of segments
        self.num_segments = self.segment_coords.shape[0] - 1
        # Save the material id on each segment for later
        self.mat_id = mat_id
        # The number of unique materials
        self.num_mat = len(np.unique(self.mat_id))
        # The conductivity on each segment
        self.segment_cond = np.array([mat_cond[m] for m in mat_id])
        
    def _get_conductivity(self, x):
        """Evaluate the conductivity at location x, assuming x is a scalar."""
        # Find the segment that contains x
        for i in range(self.num_segments):
            if self.segment_coords[i] <= x and x <= self.segment_coords[i + 1]:
                return self.segment_cond[i]
            
    def get_conductivity(self, x):
        """Evaluate the conductivity at location x. 
        
        This works when x is a scalar or a numpy array.
        """
        if isinstance(x, float):
            return self._get_conductivity(x)
        # This checks if x is a numpy array. The function will fail other wise.
        assert isinstance(x, np.ndarray)
        # And it will only work with 1D arrays
        assert x.ndim == 1
        # This evaluates the conductivity at all the elements of x and returns
        # a numpy array
        return np.array([self._get_conductivity(xx) for xx in x])
    
    def __repr__(self):
        """Get a string representation of the rod."""
        s = 'SEGID\tLeft\tRight\tMat.\tCond.\n'
        s += '-' * 37 + '\n'
        for i in range(self.num_segments):
            s += f' {i}\t{segment_coords[i]:.2f}'
            s += f'\t{segment_coords[i+1]:.2f}'
            s += f'\t{material_on_each_segment[i]:d}'
            s += f'\t{segment_cond[i]:.2f}\n'
        return s
    
    def plot(self, ax=None):
        """Plots the bar. Returns the axes object on which the rod is plot.
        
        Arguments:
        ax -- An axes object to plot on. If not given, a new one will be created.
        """
        if ax is None:
            fig, ax = plt.subplots()
        for i in range(self.num_segments):
            mat_segment = Rectangle(
                (self.segment_coords[i], -0.1),
                width=self.segment_coords[i+1] - self.segment_coords[i],
                height=0.1,
                color=sns.color_palette()[self.mat_id[i]]
            )
            ax.add_patch(mat_segment)
        ax.set_ylim(-0.1, 1.1)
        ax.set_xlabel("x")
        sns.despine(trim=True)
        return ax

Let’s play with this class for a little bit by generating a random material:

help(Rod)

Help on class Rod in module __main__:

class Rod(builtins.object)
 |  Rod(segment_coords, mat_id, mat_cond)
 |  
 |  A class representing a rod made out of different materials.
 |  
 |  Arguments:
 |  segment_coords -- Coordinates of the segments (both left and right). Total N + 1
 |                    if segments are N.
 |  mat_id         -- The material ids on each segment. Total N. Values must start
 |                    at 0 and go sequential to the maximum number of materials
 |                    we have on the rod.
 |  mat_cond       -- The conductivity associated with each unique material id.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, segment_coords, mat_id, mat_cond)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __repr__(self)
 |      Get a string representation of the rod.
 |  
 |  get_conductivity(self, x)
 |      Evaluate the conductivity at location x. 
 |      
 |      This works when x is a scalar or a numpy array.
 |  
 |  plot(self, ax=None)
 |      Plots the bar. Returns the axes object on which the rod is plot.
 |      
 |      Arguments:
 |      ax -- An axes object to plot on. If not given, a new one will be created.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

So this just shows the text that we put inside tripple quotes. Let’s make a rod.

# Let's take two materials with conductivities:
material_cond = [0.3, 0.8]
# The number of materials we have:
num_materials = len(material_cond)
# The number of segments we want to have
num_segments = 100
# The segment coordinates. First segment starts at zero. The last ends at one.
# The intermediate segments have random locations.
segment_coords = np.hstack(
    [
        [0.0],
        np.sort(np.random.rand(num_segments - 1)),
        [1.0]
    ]
)
# Now let's sample a different conductivity for each segment
material_on_each_segment = np.random.randint(num_materials, size=num_segments)
# Conductivity on each segment
segment_cond = np.array(
    [
        material_cond[m]
        for m in material_on_each_segment
    ]
)
# Now create the rod object:
rod = Rod(
    segment_coords,
    material_on_each_segment,
    material_cond)

You can evaluate the rod object at point to get the conductivity:

rod.get_conductivity(0.8)

0.8

Or you can print it to see how it looks like:

print(rod)

Show code cell output Hide code cell output

SEGID	Left	Right	Mat.	Cond.
-------------------------------------
0.00	0.00	1	0.80
0.00	0.00	1	0.80
0.00	0.00	0	0.30
0.00	0.01	0	0.30
0.01	0.01	1	0.80
0.01	0.02	0	0.30
0.02	0.03	1	0.80
0.03	0.05	0	0.30
0.05	0.06	0	0.30
0.06	0.07	0	0.30
0.07	0.08	1	0.80
0.08	0.08	1	0.80
0.08	0.10	1	0.80
0.10	0.10	1	0.80
0.10	0.10	0	0.30
0.10	0.10	0	0.30
0.10	0.13	0	0.30
0.13	0.13	1	0.80
0.13	0.13	0	0.30
0.13	0.15	0	0.30
0.15	0.16	0	0.30
0.16	0.16	1	0.80
0.16	0.17	0	0.30
0.17	0.22	0	0.30
0.22	0.22	1	0.80
0.22	0.23	0	0.30
0.23	0.24	0	0.30
0.24	0.24	1	0.80
0.24	0.24	1	0.80
0.24	0.26	0	0.30
0.26	0.27	1	0.80
0.27	0.27	1	0.80
0.27	0.27	0	0.30
0.27	0.30	1	0.80
0.30	0.31	0	0.30
0.31	0.32	0	0.30
0.32	0.32	0	0.30
0.32	0.33	1	0.80
0.33	0.33	0	0.30
0.33	0.36	0	0.30
0.36	0.37	0	0.30
0.37	0.37	0	0.30
0.37	0.38	1	0.80
0.38	0.39	1	0.80
0.39	0.40	1	0.80
0.40	0.41	0	0.30
0.41	0.43	0	0.30
0.43	0.44	1	0.80
0.44	0.44	0	0.30
0.44	0.46	1	0.80
0.46	0.46	0	0.30
0.46	0.48	0	0.30
0.48	0.48	1	0.80
0.48	0.49	0	0.30
0.49	0.49	0	0.30
0.49	0.50	0	0.30
0.50	0.53	0	0.30
0.53	0.54	1	0.80
0.54	0.54	0	0.30
0.54	0.55	0	0.30
0.55	0.55	0	0.30
0.55	0.56	1	0.80
0.56	0.56	0	0.30
0.56	0.57	1	0.80
0.57	0.57	0	0.30
0.57	0.58	0	0.30
0.58	0.59	1	0.80
0.59	0.62	1	0.80
0.62	0.62	1	0.80
0.62	0.64	0	0.30
0.64	0.64	0	0.30
0.64	0.65	0	0.30
0.65	0.66	1	0.80
0.66	0.71	1	0.80
0.71	0.73	1	0.80
0.73	0.74	1	0.80
0.74	0.75	0	0.30
0.75	0.76	0	0.30
0.76	0.76	0	0.30
0.76	0.76	0	0.30
0.76	0.77	0	0.30
0.77	0.79	0	0.30
0.79	0.79	0	0.30
0.79	0.79	0	0.30
0.79	0.81	1	0.80
0.81	0.83	1	0.80
0.83	0.83	0	0.30
0.83	0.84	0	0.30
0.84	0.86	1	0.80
0.86	0.88	0	0.30
0.88	0.88	1	0.80
0.88	0.88	1	0.80
0.88	0.91	0	0.30
0.91	0.91	1	0.80
0.91	0.91	0	0.30
0.91	0.92	0	0.30
0.92	0.93	1	0.80
0.93	0.96	1	0.80
0.96	0.97	1	0.80
0.97	1.00	0	0.30

Or you can plot it:

fig, ax = plt.subplots()
rod.plot(ax=ax);

../_images/17e60ebb739fff75a1e3d205d6ba6c96d284a4e915c3722d22e2ce4aa1209c04.svg

Now, let’s make the RandomRod class:

Let’s initialize it:

# Create the random rod object
R = RandomRod(
    100,          # The number of segments
    [0.3, 0.7],   # The concentration of each material
    [0.045, 38.0] # The thermal conductivity of each material
)  
# Let's sample a few of them and plot them
for n in range(5):
    fig, ax = plt.subplots()
    R.rvs().plot(ax=ax)
    sns.despine(trim=True);

Now that we can generate random rods let’s write a solver for the steady-state heat equation. The numerical solver will also be a class that takes a given rod and solves the problem.

Fipy is used to solve the boundary value problem using a finite volume scheme You may have to install it if you don’t have it using:

# RUN THIS BLOCK IF YOU HAVEN'T INSTALLED fipy yet
!pip3 install fipy

Show code cell output Hide code cell output

DEPRECATION: Loading egg at /opt/homebrew/lib/python3.11/site-packages/ipython_tikzmagic-0.1.1-py3.11.egg is deprecated. pip 23.3 will enforce this behaviour change. A possible replacement is to use pip for package installation..
Collecting fipy
  Obtaining dependency information for fipy from https://files.pythonhosted.org/packages/3c/90/35015c5e45573144c8c05fc4ce559316205b128eb32fa8126f16f6deb6ea/FiPy-3.4.4-py3-none-any.whl.metadata
  Downloading FiPy-3.4.4-py3-none-any.whl.metadata (1.8 kB)
Requirement already satisfied: numpy in /opt/homebrew/lib/python3.11/site-packages (from fipy) (1.24.3)
Requirement already satisfied: scipy in /opt/homebrew/lib/python3.11/site-packages (from fipy) (1.10.1)
Requirement already satisfied: matplotlib in /opt/homebrew/lib/python3.11/site-packages (from fipy) (3.7.1)
Collecting future (from fipy)
  Downloading future-0.18.3.tar.gz (840 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 840.9/840.9 kB 1.0 MB/s eta 0:00:00a 0:00:01
?25h  Preparing metadata (setup.py) ... ?25ldone
?25hRequirement already satisfied: contourpy>=1.0.1 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (1.0.7)
Requirement already satisfied: cycler>=0.10 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (4.39.4)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (1.4.4)
Requirement already satisfied: packaging>=20.0 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (23.1)
Requirement already satisfied: pillow>=6.2.0 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (9.5.0)
Requirement already satisfied: pyparsing>=2.3.1 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in /opt/homebrew/lib/python3.11/site-packages (from matplotlib->fipy) (2.8.2)
Requirement already satisfied: six>=1.5 in /opt/homebrew/lib/python3.11/site-packages (from python-dateutil>=2.7->matplotlib->fipy) (1.16.0)
Downloading FiPy-3.4.4-py3-none-any.whl (443 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 443.6/443.6 kB 1.4 MB/s eta 0:00:0000:0100:01
?25hBuilding wheels for collected packages: future
  Building wheel for future (setup.py) ... ?25ldone
?25h  Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492022 sha256=ab00cb2c2686e1ed87c348ca138447f55faa536b67f27727a970a5b2d5210282
  Stored in directory: /Users/ibilion/Library/Caches/pip/wheels/da/19/ca/9d8c44cd311a955509d7e13da3f0bea42400c469ef825b580b
Successfully built future
Installing collected packages: future, fipy
Successfully installed fipy-3.4.4 future-0.18.3

Here is our solver using Fipy:

Let’s draw some random rods and plot the temperature profile for each one of them

../_images/eed373e9aba9634e84db6f58bf13a34e73bc5da8e87fd6785f09a0f852c64caa.svg

../_images/f1bc2e30cd9e2ea8d2025fd2e410ee96061841d2966a6ef9169c416cca5d662e.svg

Now let’s draw multiple samples and plot them on the same figure:

fig, ax = plt.subplots()
for i in range(20):
    rod = R.rvs()
    x, y = solver(rod)
    ax.plot(x, y, 'r', lw=0.5)
ax.set_xlabel('$x$ (m)')
ax.set_ylabel(r'$T(x)\;(^\circ C)$')
sns.despine(trim=True);

../_images/f971571bd73c782e4009d366b629885482255599a6c50f27b2e8b51f4c1bbe16.svg

Our classes are working as expected. Now let’s use them to do Monte Carlo.

Uncertainty propagation through the boundary value problem#

We start by generating the data to reuse them later to analyze various quantities of interest. Writing the data to a file we can reuse later without rerunning the notebook is also a good idea. This is especially useful if we are using computationally expensive simulators.

# Number of samples to take
N = 1000

# A place to store the results of all simulations
data = np.ndarray((N, solver.nx + 1))

# Loop over samples
for n in range(N):
    # Draw a random rod
    rod = R.rvs()
    # Evaluate the solver at this rod
    y = solver(rod)[1]
    # Store the data for later use
    data[n, :] = y
    
# Okay, we have the data. Let's write them to a file in the simplest possible way
# (using a text format). IF YOU ARE ON GOOGLE COLAB THE FILE IS 
data_filename = f'steady_state_heat_N={N}.txt'
# This saves the data:
np.savetxt(data_filename, data)

Here is how you can reload the data (this doesn’t require running the code above if you have already done it).

# The file to read (MAKE SURE YOU CHANGE THIS IF YOU WNAT TO READ ANOTHER FILE)
# IF YOU ARE ON GOOGLE COLAB THE FILE 
input_filename = 'steady_state_heat_N=1000.txt'
# This is just a check that the file exists
import os
if not os.path.exists(input_filename):
    raise RuntimeError(
        'File "{input_filename}" does not exist.'
        + 'Make sure you run code above with the appropriate parameters'
    )
# If you reach this point, the file exists. Load it:
data = np.loadtxt(input_filename)
print('Loaded data shape: ', data.shape)
# Make sure N reflects the correct data set
N = data.shape[0]

Loaded data shape:  (1000, 501)

And now, we can use the data to estimate statistics using Monte Carlo. Let’s start with the expectation of \(T(x)\) at a single point, say \(x=0.5\). That is, we want to estimate \(\mathbb{E}[T(x=0.5)]\) where the expectation is over the random rods. Here is how we can do it:

# Extract only the relevant data
T_middle_data = data[:, int(solver.nx / 2)]

# Evaluate the sample average 
T_middle_running = np.cumsum(T_middle_data) / np.arange(1, N + 1)

# Evaluate the sample average for the squared of T_middle
T_middle2_running = np.cumsum(T_middle_data ** 2) / np.arange(1, N + 1)

# Evaluate the running average of the variance
sigma2_running = T_middle2_running - T_middle_running ** 2

# Alright, now we have quantified our uncertainty about I for every N
# from a single MC run. Let's plot a (about) 95% predictive interval
# Running lower bound for the predictive interval
T_middle_lower_running = (
    T_middle_running 
    - 2.0 * np.sqrt(sigma2_running / np.arange(1, N + 1))
)
# Running upper bound for the predictive interval
T_middle_upper_running = (
    T_middle_running
    + 2.0 * np.sqrt(sigma2_running / np.arange(1, N + 1))
)

# A common plot for all estimates
fig, ax = plt.subplots()
# Shaded area for the interval
ax.fill_between(
    np.arange(1, N + 1),
    T_middle_lower_running,
    T_middle_upper_running,
    alpha=0.25
)
# Here is the MC estimate:
ax.plot(np.arange(1, N+1), T_middle_running, 'b', lw=2)
# The true value
ax.plot(np.arange(1, N+1), [(4.0 + 400.0) / 2] * N, color='r')
# and the labels
ax.set_xlabel('$N$')
ax.set_ylabel(r'$\mathbb{E}[T(x=0.5)]$')
sns.despine(trim=True);

../_images/41979b79dbe2023d3dfb345ec74d0ada566b4ad8e16c06522e9dc5cce19f9205.svg

How many samples must you take for this to be sufficiently converged? Remember, that in real life you cannot see the red line. Suppose that we want to ensure an accurate of some small positive \(\epsilon\). We can say that we want an \(N\) such that the 95% prediction interval is less wide than \(\epsilon\). If we knew the variance \(\sigma^2\) of the random variable, then the interval width (according to CLT) would be approximately:

\[ \epsilon = 4 \frac{\sigma}{\sqrt{N}}. \]

So, we want an \(N\) that satisfies:

\[ N \geq \frac{16\sigma^2}{\epsilon^2}. \]

Of course, we do not know \(\sigma^2\). So, the next best thing we can do is to estimate it as well and then pick \(N\) such that:

\[ N \geq \frac{16\bar{\sigma}^2}{\epsilon^2}. \]

Here is an attempt:

bar_sigma = np.sqrt(sigma2_running[-1]) bar_sigma

epsilon = 1
bar_sigma2 = sigma2_running[-1]
N = int(16 * bar_sigma2 / epsilon**2 + 1)
print("N = ", N)

N =  30503

That’s a big number. And it will be bigger if you choose a smaller \(\epsilon\). There are ways to deal with this issue. The most common way to deal with this is to replace the computationally expensive numerical solver with an inexpensive surrogate model. You build this model using a finite number of input-output pairs from the numerical solver. Then you use this surrogate model to predict the output for any input. There many possibilities for the surrogate model: polynomial regression, neural networks, Gaussian processes, etc. You will be equipped with the knowledge to build these surrogate models by the time you finish this course.

Let us finish by finding the median and a 95% predictive interval for all spatial points:

mu = np.median(data, axis=0)
mu_025 = np.percentile(data, 2.5, axis=0)
mu_975 = np.percentile(data, 97.5, axis=0)

fig, ax = plt.subplots()
ax.fill_between(x, mu_025, mu_975, color=sns.color_palette()[0], alpha=0.25)
# Let's take a couple of samples and see if they fall inside:
for _ in range(5):
    rod = R.rvs()
    y = solver(rod)[1]
    ax.plot(x, y, 'r', lw=0.5)
ax.plot(x, mu)
ax.set_xlabel('$x$')
ax.set_ylabel('$T(x)$')
sns.despine(trim=True);

../_images/175685b2e125e70e05417ada42ec9d1e3ad2975bb9196c04b77d7356a596f739.svg

Questions#

Use the data you have collected in data to estimate the probability that \(T(x=0.5)\) is greater than \(300\) degrees C.