Gaussian Example

This simple example using a Gaussian model will guide you through the process of performing Bayesian parameter estimation with Differential Evolution MCMC. Suppose we observe $\mathbf{y} = \left[y_1,y_2, \dots, y_{50} \right]$, which are assumed to follow a Gaussian distribution. Thus, we can write the following sampling statement:

\[\mathbf{y} \sim \mathrm{normal}(\mu, \sigma)\]

where $\mu$ is the mean and $\sigma$ is the standard deviation. Our goal is to estimate $\mu$ and $\sigma$ from the observations $\mathbf{y}$. Let's assume the prior distribution of $\mu$ is given by

\[\mu \sim \mathrm{normal}(0, 1)\]

and the prior distribution of $\sigma$ is a Cauchy distribution truncated to the positive real line (a half-Cauchy):

\[\sigma \sim \mathrm{Cauchy}(0, 1)_{0}^{\infty}\]
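
Together, these pieces define the unnormalized log posterior that Differential Evolution MCMC explores: the sum of the prior log densities and the data log likelihood (the two functions defined below compute exactly these pieces):

\[\log p(\mu, \sigma \mid \mathbf{y}) = \log p(\mu) + \log p(\sigma) + \sum_{i=1}^{N} \log \mathrm{normal}(y_i \mid \mu, \sigma) + \mathrm{constant}\]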

Load Packages

Our first step is to load the required packages including StatsPlots.jl for plotting the MCMC chain.

using DifferentialEvolutionMCMC
using Random
using Distributions
using StatsPlots
Random.seed!(6541)
Random.TaskLocalRNG()

Generate Data

Let's assume we collect $N=50$ observations generated from a normal distribution with $\mu = 0$ and $\sigma=1$:

data = rand(Normal(0.0, 1.0), 50)
50-element Vector{Float64}:
 -0.09337763891909169
 -0.28508056495780393
 -0.9502027716092141
  0.6242913712660065
  0.21606943500548537
  0.9188346780817548
  1.3493876365646733
  0.7464469735709638
  0.33844924552464856
 -1.012760788053415
  ⋮
  0.29717618808379703
 -0.005642984321770095
  0.33544517082133707
  1.3251660046679248
 -1.6082902816560543
  0.17500811598343874
 -0.003916790050612109
  0.10722488875928632
  0.6422955265593758

Define Prior Log Likelihood Function

The next step is to define a function that accepts our parameters $\mu$ and $\sigma$ and evaluates their log density under the prior. The function to compute the prior log likelihood is as follows:

function prior_loglike(μ, σ)
    LL = 0.0
    LL += logpdf(Normal(0, 1), μ)
    LL += logpdf(truncated(Cauchy(0, 1), 0, Inf), σ)
    return LL
end
prior_loglike (generic function with 1 method)

Define Sample Prior Distribution

Each particle in Differential Evolution MCMC must be seeded with initial values. To do so, we define a function that returns the initial values. A common approach is to sample from the prior distribution, as it is intended to encode likely values of $\mu$ and $\sigma$. The function for sampling from the prior distribution of $\mu$ and $\sigma$ is given in the following code block:

function sample_prior()
    μ = rand(Normal(0, 1))
    σ = rand(truncated(Cauchy(0, 1), 0, Inf))
    return [μ,σ]
end
sample_prior (generic function with 1 method)

Define Log Likelihood Function

Next, we define a function to compute the log likelihood of the data. The first argument must be the data followed by a separate argument for each parameter, maintaining the same order specified in prior_loglike. In our case, we can write the log likelihood function with data followed by $\mu$ and $\sigma$:

function loglike(data, μ, σ)
    return sum(logpdf.(Normal(μ, σ), data))
end
loglike (generic function with 1 method)
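
Before handing these functions to the sampler, it can be useful to confirm that they run and return a finite value. The following quick check (not part of the original example) evaluates the unnormalized log posterior at a random draw from the prior:

μ₀, σ₀ = sample_prior()                        # a random starting point
prior_loglike(μ₀, σ₀) + loglike(data, μ₀, σ₀)  # unnormalized log posterior; should be finite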

Define Bounds and Parameter Names

We must define the lower and upper bounds of each parameter. The bounds are a Tuple of tuples, where the $i^{\mathrm{th}}$ inner tuple contains the lower and upper bounds of the $i^{\mathrm{th}}$ parameter. Note that the parameters must be in the same order as specified in prior_loglike and loglike. In the present case, we only have $\mu$ followed by $\sigma$:

bounds = ((-Inf,Inf),(0.0,Inf))
((-Inf, Inf), (0.0, Inf))

We can also pass a Tuple of names for each parameter, again in the same order as specified in prior_loglike and loglike:

names = (:μ,:σ)
(:μ, :σ)

Define Model Object

Now that we have defined the necessary components of our model, we will organize them into a DEModel object as follows:

model = DEModel(;
    sample_prior,
    prior_loglike,
    loglike,
    data,
    names)
DEModel{DifferentialEvolutionMCMC.var"#8#10"{typeof(Main.loglike), Vector{Float64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{}}, DifferentialEvolutionMCMC.var"#7#9"{typeof(Main.prior_loglike)}, Tuple{Symbol, Symbol}, typeof(Main.sample_prior)}(DifferentialEvolutionMCMC.var"#7#9"{typeof(Main.prior_loglike)}(Main.prior_loglike), DifferentialEvolutionMCMC.var"#8#10"{typeof(Main.loglike), Vector{Float64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{}}(Main.loglike, [-0.09337763891909169, -0.28508056495780393, -0.9502027716092141, 0.6242913712660065, 0.21606943500548537, 0.9188346780817548, 1.3493876365646733, 0.7464469735709638, 0.33844924552464856, -1.012760788053415  …  -1.100839976455977, 0.29717618808379703, -0.005642984321770095, 0.33544517082133707, 1.3251660046679248, -1.6082902816560543, 0.17500811598343874, -0.003916790050612109, 0.10722488875928632, 0.6422955265593758], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), ()), Main.sample_prior, (:μ, :σ))

Define Sampler

Next, we will create a sampler with the constructor DE. Here, we will pass the sample_prior function and the variable bounds, which contains the lower and upper bounds of each parameter. In addition, we will specify $1,000$ burn-in samples and Np=6 particles per group (the default is 4 particles per group). With 4 groups of 6 particles, the sampler maintains $4 \times 6 = 24$ chains in total, which is why the summary output below reports 24 chains.

de = DE(;sample_prior, bounds, burnin=1000, Np=6)
DE{Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}, typeof(random_gamma), typeof(DifferentialEvolutionMCMC.mh_update!), typeof(compute_posterior!), typeof(sample), DifferentialEvolutionMCMC.var"#3#5", Vector{Bool}, Array{Float64, 3}}(4, 6, 1000, true, 0.1, 0.1, 0.001, 0.05, 1.0, 0.0, ((-Inf, Inf), (0.0, Inf)), 0, 1, DifferentialEvolutionMCMC.random_gamma, DifferentialEvolutionMCMC.mh_update!, DifferentialEvolutionMCMC.compute_posterior!, StatsBase.sample, DifferentialEvolutionMCMC.var"#3#5"(), Bool[0], Array{Float64, 3}(undef, 0, 0, 0))

Estimate Parameters

The code block below runs the sampler for $2000$ iterations with each group of particles running on a separate thread. The progress bar is also set to display.

n_iter = 2000
chains = sample(model, de, MCMCThreads(), n_iter, progress=true)
Chains MCMC chain (1000×4×24 Array{Float64, 3}):

Iterations        = 1:1:1000
Number of chains  = 24
Samples per chain = 1000
parameters        = μ, σ
internals         = acceptance, lp

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64   ⋯

           μ    0.0013    0.1235    0.0026   2262.4133   2941.7876    1.0087   ⋯
           σ    0.8701    0.0898    0.0021   1978.7988   2577.9714    1.0129   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           μ   -0.2389   -0.0815    0.0018    0.0819    0.2460
           σ    0.7138    0.8062    0.8637    0.9257    1.0665
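
The summary statistics above are usually all that is needed, but the raw posterior draws can also be pulled out of the chains object for further analysis. As a simple illustration (this step is not part of the original example and assumes the standard MCMCChains indexing interface), we can pool the draws for $\mu$ across chains and compute the posterior probability that $\mu > 0$:

μ_draws = vec(chains[:μ])   # pool the draws for μ across all 24 chains
mean(μ_draws .> 0)          # posterior probability that μ > 0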

Convergence

We will evaluate convergence using two methods. First, we will verify that $\hat{r} \leq 1.05$ in the chain summary above. Indeed, it is below this threshold for both $\mu$ and $\sigma$. It's also important to examine the trace plot for any evidence of non-stationarity or of getting stuck at the same value for long periods of time. In the left panel of the plot below, the samples for each chain display the characteristic behavior indicative of efficient sampling and convergence: they look like white noise, or a "hairy caterpillar". The plot on the right shows the posterior distribution of $\mu$ and $\sigma$ for each of the 24 chains.

plot(chains, grid=false)