# Plotting setup: render figures as SVG and use a clean seaborn style.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib_inline
matplotlib_inline.backend_inline.set_matplotlib_formats('svg')
import seaborn as sns
sns.set_context("paper")
sns.set_style("ticks");

Information Entropy#

Information entropy measures the uncertainty of a probability distribution. For a discrete random variable \(X\) with probability mass function \(p(x)\), it is defined as:

\[ \mathbb{H}[p(X)] = -\sum_x p(x) \log p(x). \]

The sum is over all possible values of the random variable \(X\). If \(X\) is continuous, the sum becomes an integral:

\[ \mathbb{H}[p(X)] = -\int p(x) \log p(x)\,dx. \]
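
To make the definition concrete, here is a minimal sketch that evaluates the sum directly (the uniform pmf below is just an illustrative choice, not a distribution used elsewhere on this page):

import numpy as np

# Entropy of a discrete distribution, straight from the definition (in nats,
# since we use the natural logarithm throughout).
pmf = np.array([0.25, 0.25, 0.25, 0.25])  # a made-up uniform pmf over 4 values
H = -np.sum(pmf * np.log(pmf))
print(f'H = {H:.2f}')  # log(4) ~ 1.39, the maximum for four outcomes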

Example - Information Entropy of a Binary Distribution#

Let’s take a random variable \(X\) with two possible values, say \(0\) and \(1\). The probability mass function is described by two numbers:

\[ p_0 = p(X=0), \]

and

\[ p_1 = p(X=1) = 1 - p_0. \]

So, the information entropy of this distribution is simply a function of \(p_0\):

\[ \mathbb{H}[p(X)] = -\sum_x p(x) \log p(x) = -p_0 \log p_0 - p_1 \log p_1 = -p_0 \log p_0 - (1-p_0)\log (1-p_0). \]

Let’s plot it as we vary \(p_0\):

# A small offset keeps p away from 0 and 1, avoiding log(0).
eps = 1e-8
p = np.linspace(eps, 1. - eps, 100)
H = -p * np.log(p) - (1. - p) * np.log(1. - p)

fig, ax = plt.subplots()
ax.plot(p, H)
ax.set_xlabel('$p_0$')
ax.set_ylabel('Entropy')
sns.despine(trim=True);
[Figure: entropy of the binary distribution as a function of \(p_0\).]

Notice that the function is maximized at \(p_0 = 0.5\), which corresponds to maximum uncertainty. It is minimized (in fact, it is exactly zero) at \(p_0 = 0\) and \(p_0 = 1\), because both cases correspond to minimum uncertainty: you are certain about what is going to happen.
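
As a quick numerical sanity check, here is a small sketch (the binary_entropy helper is ours, not part of any library) evaluating the entropy at the maximum and near an endpoint:

def binary_entropy(p0):
    """Entropy (in nats) of a binary distribution with p(X=0) = p0."""
    p0 = np.clip(p0, 1e-12, 1. - 1e-12)  # guard against log(0)
    return -p0 * np.log(p0) - (1. - p0) * np.log(1. - p0)

print(f'{binary_entropy(0.5):.3f}')   # ~0.693 = log(2), maximum uncertainty
print(f'{binary_entropy(0.01):.3f}')  # ~0.056, almost certain outcome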

Questions#

  • You are given two Categorical distributions:

\[ X\sim \operatorname{Categorical}(0.1, 0.3, 0.5, 0.1), \]

and

\[ Y\sim \operatorname{Categorical}(0.2, 0.2, 0.4, 0.2). \]

Let’s visualize them:

import scipy.stats as st

# Define the two categorical distributions over the values 0, 1, 2, 3.
X = st.rv_discrete(
    values=(
        np.arange(4),
        [0.1, 0.3, 0.5, 0.1]
    )
)
Y = st.rv_discrete(
    values=(
        np.arange(4),
        [0.2, 0.2, 0.4, 0.2]
    )
)

# Plot the two probability mass functions on the same axes.
fig, ax = plt.subplots()
ax.bar(
    range(4),
    X.pmf(np.arange(4)),
    alpha=0.5,
    label='$X$'
)
ax.bar(
    range(4),
    Y.pmf(np.arange(4)),
    alpha=0.5,
    label='$Y$'
)
ax.legend(loc='best', frameon=False)
ax.set_xlabel('Values')
ax.set_ylabel('Probability')
sns.despine(trim=True);
[Figure: bar plot of the probability mass functions of \(X\) and \(Y\).]


  • Based on the picture above, which of the two random variables, \(X\) or \(Y\), has more uncertainty?

  • Use the code block below to calculate the entropy of each of the distributions and answer the question above (which variable is more uncertain) in a quantitative way. We can use the functionality of scipy.stats to compute the entropy.

ent_X = X.entropy()
print(f'H[X] = {ent_X:.2f}')
# Write code that computes and prints the entropy of Y

H[X] = 1.17
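
For reference, here is one way to cross-check scipy’s result by applying the definition directly with numpy (a sketch; the entropy of \(Y\) follows the same pattern with its pmf):

# Cross-check: entropy of X straight from the definition (in nats).
p_X = np.array([0.1, 0.3, 0.5, 0.1])
print(f'H[X] = {-np.sum(p_X * np.log(p_X)):.2f}')  # should match X.entropy()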