Distribution of maximum for 1-D prior example (RBF covariance)

Investigation into the distribution of the maximum for a 1-D toy example.

[2]:
from dolfin import *
import numpy as np
import numba
import ot
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# import required functions from oneDim
from statFEM_analysis.oneDim import mean_assembler, kernMat, cov_assembler, sample_gp
from scipy.stats import linregress
from scipy import integrate
from scipy.linalg import sqrtm
from tqdm import tqdm
import sympy; sympy.init_printing()
# code for displaying matrices nicely
def display_matrix(m):
    display(sympy.Matrix(m))

Let’s first import the wass function created in the sub-module maxDist.

[3]:
from statFEM_analysis.maxDist import wass
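
The implementation of wass isn't shown in this notebook. As a rough sketch of what it presumably does (the helper name wass_sketch and the exact binning scheme are assumptions), it bins both samples into normalised histograms on a shared grid and computes the 2-Wasserstein distance between them with the POT package:

import numpy as np
import ot

def wass_sketch(a, b, n_bins):
    # bin both samples into histograms on a shared grid
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    centres = 0.5*(edges[:-1] + edges[1:])
    # normalised counts act as discrete probability vectors
    p = np.histogram(a, bins=edges)[0].astype(float)
    q = np.histogram(b, bins=edges)[0].astype(float)
    p /= p.sum()
    q /= q.sum()
    # squared-Euclidean ground cost between bin centres
    M = ot.dist(centres.reshape(-1, 1), centres.reshape(-1, 1))
    # ot.emd2 returns the optimal transport cost, here W_2^2
    return np.sqrt(ot.emd2(p, q, M))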

Sampling from the true maximum distribution in 1D

We will be interested in the toy example introduced in oneDim; however, we will use different values for the parameters in the distribution of \(f\). To obtain a sample of the maximum from the true prior, we must first sample trajectories from the true prior.
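
To fix notation (recapping the set-up from oneDim), the toy problem is the one-dimensional Poisson equation with homogeneous Dirichlet boundary conditions and Gaussian process forcing,

\[-u''(x) = f(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0, \qquad f \sim \mathcal{GP}(\bar{f}, c_f),\]

so that with \(\bar{f} = 1\) the true prior mean is \(m_u(x) = \frac{1}{2}x(1-x)\), as coded below.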

[4]:
# set up true mean
@numba.jit
def m_u(x):
    return 0.5*x*(1-x)
[5]:
# set up kernel functions for forcing f
l_f = 0.01
σ_f = 0.1

@numba.jit
def c_f(x,y):
    return (σ_f**2)*np.exp(-(x-y)**2/(2*(l_f**2)))

# translation invariant form of c_f
@numba.jit
def k_f(x):
    return (σ_f**2)*np.exp(-(x**2)/(2*(l_f**2)))
[6]:
# set up true cov function for solution
# compute inner integral over t
def η(w,y):
    I_1 = integrate.quad(lambda t: t*c_f(w,t),0.0,y)[0]
    I_2 = integrate.quad(lambda t: (1-t)*c_f(w,t),y,1.0)[0]
    return (1-y)*I_1 + y*I_2

# use this function η to compute the outer integral over w
def c_u(x,y):
    I_1 = integrate.quad(lambda w: (1-w)*η(w,y),x,1.0)[0]
    I_2 = integrate.quad(lambda w: w*η(w,y),0.0,x)[0]
    return x*I_1 + (1-x)*I_2
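
For reference, η and c_u together implement the Green's function representation of the true covariance,

\[c_u(x,y) = \int_0^1\!\int_0^1 G(x,w)\, c_f(w,t)\, G(y,t)\, \mathrm{d}t\, \mathrm{d}w, \qquad G(x,w) = \begin{cases} x(1-w), & x \le w,\\ w(1-x), & x > w, \end{cases}\]

where \(G\) is the Green's function of the boundary value problem above: η computes the inner integral over \(t\) and c_u the outer integral over \(w\).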
[7]:
%%time
n_sim = 1000
grid = np.linspace(0,1,100)
np.random.seed(235)
u_sim = sample_gp(n_sim, m_u, c_u, grid, par = True, trans = False, tol = 1e-8)
CPU times: user 1.24 s, sys: 102 ms, total: 1.34 s
Wall time: 21.8 s
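
The function sample_gp is imported from statFEM_analysis.oneDim, so its implementation isn't shown here. A minimal sketch of the dense GP sampling it presumably performs (the helper name sample_gp_sketch is ours, and we ignore the par and trans options, which we assume control parallel assembly and use of the translation-invariant kernel form):

def sample_gp_sketch(n_sim, mean, cov, grid, tol=1e-8):
    # evaluate the mean vector and covariance matrix on the grid
    d = len(grid)
    mu = np.array([mean(x) for x in grid]).reshape(d, 1)
    K = np.array([[cov(x, y) for y in grid] for x in grid])
    # jittered Cholesky factor guards against loss of
    # positive definiteness from rounding error
    G = np.linalg.cholesky(K + tol*np.eye(d))
    # transform iid standard normals into correlated GP samples
    Z = np.random.standard_normal((d, n_sim))
    return G @ Z + mu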

We now plot the sampled trajectories from the true prior.

[8]:
plt.rcParams['figure.figsize'] = (10,6)
plt.plot(grid,u_sim)
plt.xlabel('$x$')
plt.ylabel('$u$')
plt.title('Trajectories from the true prior')
plt.grid()
plt.show()
_images/07_oneDim_prior_max_results_13_0.png

Note:

The maximum is not always located at \(x=0.5\). We will demonstrate this by plotting the \(\operatorname{argmax}\) for each simulated curve below.

[9]:
plt.plot(np.arange(1,n_sim + 1),grid[u_sim.argmax(axis=0)])
plt.hlines(0.5,1,n_sim,colors='red',linestyles='dashed')
plt.grid()
plt.xlabel('Sample number')
plt.ylabel(r'$\operatorname{argmax}$')
plt.title(r'Location of $\operatorname{argmax}$ for each simulated curve')
plt.show()
_images/07_oneDim_prior_max_results_15_0.png

Let’s now get the maximum value for each simulated curve and store this in a variable max_true for future use. We plot the histogram for these maximum values.

[10]:
max_true = u_sim.max(axis=0)
[11]:
sns.displot(max_true,kde=True,stat="density")
plt.xlabel(r'$\max$')
plt.title('Histogram of maximum for true prior')
plt.show()
_images/07_oneDim_prior_max_results_18_0.png

Sampling from the statFEM prior maximum distribution in 1D

We now create a function that utilises FEniCS to draw trajectories from the statFEM prior.

[12]:
# create statFEM sampler function
def statFEM_sampler(n_sim, grid, h, f_bar, k_f, par = False, trans = True, tol=1e-9):
    # get length of grid
    d = len(grid)

    # get size of FE mesh
    J = int(np.round(1/h))

    # get statFEM mean function
    μ_func = mean_assembler(h, f_bar)

    # evaluate this on the grid
    μ = np.array([μ_func(x) for x in grid]).reshape(d,1)

    # get statFEM cov mat on grid
    Σ = cov_assembler(J, k_f, grid, parallel=par, translation_inv=trans)

    # construct the Cholesky decomposition Σ = GG^T
    # we add a small diagonal perturbation to Σ to ensure it
    # is strictly positive definite
    G = np.linalg.cholesky(Σ + tol * np.eye(d))

    # draw iid standard normal random vectors
    Z = np.random.normal(size=(d,n_sim))

    # construct samples from GP(m,k)
    Y = G@Z + np.tile(μ,n_sim)

    # return the sampled trajectories
    return Y

Let’s test this function out by using it to obtain samples from the statFEM prior for a particular value of \(h\).

[13]:
h = 0.1
f_bar = Constant(1.0)
np.random.seed(2123)
u_sim_statFEM = statFEM_sampler(n_sim, grid, h, f_bar, k_f)

Let’s plot the above trajectories.

[14]:
plt.rcParams['figure.figsize'] = (10,6)
plt.plot(grid,u_sim_statFEM)
plt.grid()
plt.xlabel('$x$')
plt.ylabel('$u$')
plt.title('Trajectories from statFEM prior with $h = $ %0.1f' % h)
plt.show()
_images/07_oneDim_prior_max_results_25_0.png

Let’s now get the maximum value for each simulated curve and store this in a variable max_statFEM for future use. We then plot the histograms for these maximum values together with those from the true prior.

[15]:
max_statFEM = u_sim_statFEM.max(axis=0)
[16]:
max_data = pd.DataFrame(data={'True': max_true, 'statFEM': max_statFEM})
sns.displot(max_data,kde=True,stat="density")
plt.xlabel(r'$\max$')
plt.title('Histograms of maximums for true and statFEM priors')
plt.show()
_images/07_oneDim_prior_max_results_28_0.png

Our wass function requires the number of bins to be specified. Let’s investigate how varying this parameter affects the computed distance between the true maximums and our statFEM maximums.

[17]:
n_bins = np.arange(10,200,10)
wass_bin_dat = [wass(max_true,max_statFEM,n) for n in n_bins]
[18]:
plt.plot(n_bins,wass_bin_dat)
plt.grid()
plt.xlabel('number of bins')
plt.ylabel('2-Wasser distance')
plt.title('Plot showing how number of bins influences the computed distance')
plt.show()
_images/07_oneDim_prior_max_results_31_0.png

From the plot above we see that the distance flattens after around \(50\) bins. We choose to take n_bins=100 for the rest of this example.

We now set up a range of \(h\)-values to use for the statFEM prior. The reciprocals are rounded so that each \(h\) corresponds to an integer number of mesh cells, \(J = 1/h\).

[19]:
# set up range of h values to use
h_range_tmp = np.linspace(0.25,0.02,100)
h_range = 1/np.unique(np.round(1/h_range_tmp))
np.round(h_range,2)
[19]:
array([0.25, 0.2 , 0.17, 0.14, 0.12, 0.11, 0.1 , 0.09, 0.08, 0.08, 0.07,
       0.07, 0.06, 0.06, 0.06, 0.05, 0.05, 0.05, 0.05, 0.04, 0.04, 0.04,
       0.04, 0.03, 0.03, 0.03, 0.03, 0.02, 0.02, 0.02])

We now loop over these \(h\)-values, and for each value we simulate maximums from the statFEM prior and then compute the 2-Wasserstein distance between these maximums and those from the true prior.

[20]:
%%time
errors = []
n_bins = 100  # number of bins chosen above
np.random.seed(3252)
for h in tqdm(h_range):
    # sample trajectories from statFEM prior for current h value
    sim = statFEM_sampler(n_sim,grid,h,f_bar,k_f)
    # get max
    max_sim = sim.max(axis=0)
    # compute error
    error = wass(max_true,max_sim,n_bins)
    # append to errors
    errors.append(error)
100%|██████████| 30/30 [00:00<00:00, 38.58it/s]
CPU times: user 777 ms, sys: 6.76 ms, total: 784 ms
Wall time: 781 ms

Let’s now plot these errors against \(h\).

[21]:
errors = np.array(errors)
[22]:
plt.plot(h_range,errors)
plt.grid()
plt.xlabel('h')
plt.ylabel('error')
plt.title('Plot of errors vs h for max')
plt.show()
_images/07_oneDim_prior_max_results_39_0.png

We see that as \(h\) decreases, so too does the 2-Wasserstein distance. Let's investigate the rate by plotting this data in log-log space and then estimating the slope of the line of best fit.
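
If the distance decays polynomially in \(h\), say \(W(h) \approx C h^p\), then

\[\log W(h) \approx \log C + p \log h,\]

so the slope of the line of best fit in log-log space estimates the rate \(p\).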

[23]:
log_h_range = np.log(h_range)
log_errors = np.log(errors)

# perform linear regression to get line of best fit for the log scales:
lm = linregress(log_h_range,log_errors)
print("slope: %f    intercept: %f    r value: %f     p value: %f" % (lm.slope, lm.intercept, lm.rvalue, lm.pvalue))

# plot line of best fit with the data:
x = np.linspace(np.min(log_h_range),np.max(log_h_range),100)
plt.scatter(log_h_range,log_errors,label='data')
plt.plot(x,lm.intercept + lm.slope * x, 'r', label='fitted line')
# plt.plot(x,2+2*x,'--',c='black',label='reference line, $p=2$')
plt.plot(x,-1+1.35*x,'--',c='black',label='reference line, $p=1.35$')
plt.grid()
plt.legend()
plt.xlabel(r'$\log(h)$')
plt.ylabel(r'$\log(W)$')
plt.title(r'Plot of $\log(W)$ vs $\log(h)$ for maximum')
plt.show()
slope: 1.346020    intercept: -3.028927    r value: 0.971118     p value: 0.000000
_images/07_oneDim_prior_max_results_41_1.png

We can see from the above plot that the slope is around \(1.35\). This differs from the rate predicted by our theory, indicating that results for the \(\max\) lie outside the remit of our theory.