Gaussian Example
This simple example using a Gaussian model will guide you through the process of performing Bayesin parameter estimation with Differential Evolution MCMC. Suppose we observe $\mathbf{y} = \left[y_1,y_2, \dots, y_{50} \right]$ which are assumed to follow a Gaussian Distribution. Thus, we can write the following sampling statement:
\[\mathbf{y} \sim \mathrm{normal}(\mu, \sigma)\]
where $\theta$ is a parameter representing the probability of success. Our goal is to estimate the probability of success $\theta$ from the $k$. Let's assume the prior distribution of $theta$ is given by
\[\mu \sim \mathrm{normal}(0, 1)\]
In this simple case, the posterior distribution of $\theta$ has a simple closed-form solution:
\[\sigma \sim \mathrm{Cauchy}(0, 1)_{0}^{\infty}\]
Load Packages
Our first step is to load the required packages including StatsPlots.jl
for plotting the MCMC chain.
using DifferentialEvolutionMCMC
using Random
using Distributions
using StatsPlots
Random.seed!(6541)
Random.TaskLocalRNG()
Generate Data
Let's assume we make $N=50$ using $\mu = 0$ and $\sigma=1$:
data = rand(Normal(0.0, 1.0), 50)
50-element Vector{Float64}:
-0.09337763891909169
-0.28508056495780393
-0.9502027716092141
0.6242913712660065
0.21606943500548537
0.9188346780817548
1.3493876365646733
0.7464469735709638
0.33844924552464856
-1.012760788053415
⋮
0.29717618808379703
-0.005642984321770095
0.33544517082133707
1.3251660046679248
-1.6082902816560543
0.17500811598343874
-0.003916790050612109
0.10722488875928632
0.6422955265593758
Define Prior Log Likelihood Function
The next step is to define a function that passes our parameters $\mu$ and $\sigma$ and evaluates the log likelihood. The function to compute the prior log likelihood is as follows:
function prior_loglike(μ, σ)
LL = 0.0
LL += logpdf(Normal(0, 1), μ)
LL += logpdf(truncated(Cauchy(0, 1), 0, Inf), σ)
return LL
end
prior_loglike (generic function with 1 method)
Define Sample Prior Distribution
Each particle in DifferentialEvolution must be seeded with an initial value. To do so, we define a function that returns the initial value. A common approach is to use the prior distribution as it is intended to encode likely values of $\mu$ and $\sigma$. The function for sampling from the prior distribution of $\mu$ and $\sigma$ is given in the following code block:
function sample_prior()
μ = rand(Normal(0, 1))
σ = rand(truncated(Cauchy(0, 1), 0, Inf))
return [μ,σ]
end
sample_prior (generic function with 1 method)
Define Log Likelihood Function
Next, we define a function to compute the log likelihood of the data. The first argument must be the data followed by a separate argument for each parameter, maintaining the same order specified in prior_loglike
. In our case, we can write the log likelihood function with data
followed by $\mu$ and $\sigma$:
function loglike(data, μ, σ)
return sum(logpdf.(Normal(μ, σ), data))
end
loglike (generic function with 1 method)
Define Bounds and Parameter Names
We must define the lower and upper bounds of each parameter. The bounds are a Tuple
of tuples, where the $i^{\mathrm{th}}$ inner tuple contains the lower and upper bounds of the $i^{\mathrm{th}}$ parameter. Note that the parameters must be in the same order as specified in prior_loglike
and loglike
. In the present case, we only have $\mu$ followed by $\sigma$:
bounds = ((-Inf,Inf),(0.0,Inf))
((-Inf, Inf), (0.0, Inf))
We can also pass a Tuple
of names for each parameter, again in the same order as specified in prior_loglike
and loglike
:
names = (:μ,:σ)
(:μ, :σ)
Define Model Object
Now that we have defined the necessary components of our model, we will organize them into a DEModel
object as follows:
model = DEModel(;
sample_prior,
prior_loglike,
loglike,
data,
names)
DEModel{DifferentialEvolutionMCMC.var"#8#10"{typeof(Main.loglike), Vector{Float64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{}}, DifferentialEvolutionMCMC.var"#7#9"{typeof(Main.prior_loglike)}, Tuple{Symbol, Symbol}, typeof(Main.sample_prior)}(DifferentialEvolutionMCMC.var"#7#9"{typeof(Main.prior_loglike)}(Main.prior_loglike), DifferentialEvolutionMCMC.var"#8#10"{typeof(Main.loglike), Vector{Float64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{}}(Main.loglike, [-0.09337763891909169, -0.28508056495780393, -0.9502027716092141, 0.6242913712660065, 0.21606943500548537, 0.9188346780817548, 1.3493876365646733, 0.7464469735709638, 0.33844924552464856, -1.012760788053415 … -1.100839976455977, 0.29717618808379703, -0.005642984321770095, 0.33544517082133707, 1.3251660046679248, -1.6082902816560543, 0.17500811598343874, -0.003916790050612109, 0.10722488875928632, 0.6422955265593758], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), ()), Main.sample_prior, (:μ, :σ))
Define Sampler
Next, we will create a sampler with the constructor DE
. Here, we will pass the sample_prior
function and the variable bounds
, which constains the lower and upper bounds of each parameter. In addition, we will specify $1,000$ burnin samples, Np=6
particles per group (default 4 particles per group).
de = DE(;sample_prior, bounds, burnin=1000, Np=6)
DE{Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}, typeof(random_gamma), typeof(DifferentialEvolutionMCMC.mh_update!), typeof(compute_posterior!), typeof(sample), DifferentialEvolutionMCMC.var"#3#5", Vector{Bool}, Array{Float64, 3}}(4, 6, 1000, true, 0.1, 0.1, 0.001, 0.05, 1.0, 0.0, ((-Inf, Inf), (0.0, Inf)), 0, 1, DifferentialEvolutionMCMC.random_gamma, DifferentialEvolutionMCMC.mh_update!, DifferentialEvolutionMCMC.compute_posterior!, StatsBase.sample, DifferentialEvolutionMCMC.var"#3#5"(), Bool[0], Array{Float64, 3}(undef, 0, 0, 0))
Estimate Parameter
The code block below runs the sampler for $2000$ iterations with each group of particles running on a separate thread. The progress bar is also set to display.
n_iter = 2000
chains = sample(model, de, MCMCThreads(), n_iter, progress=true)
Chains MCMC chain (1000×4×24 Array{Float64, 3}):
Iterations = 1:1:1000
Number of chains = 24
Samples per chain = 1000
parameters = μ, σ
internals = acceptance, lp
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
μ 0.0013 0.1235 0.0026 2262.4133 2941.7876 1.0087 ⋯
σ 0.8701 0.0898 0.0021 1978.7988 2577.9714 1.0129 ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
μ -0.2389 -0.0815 0.0018 0.0819 0.2460
σ 0.7138 0.8062 0.8637 0.9257 1.0665
Convergence
We will evaluate convergence using two methods. First, we will verify that $\hat{r} \leq 1.05$ in the chain summary above. Indeed, it is. Its also important to examine the trace plot for any evidence of non-stationarity, or getting stuck at the same value for long periods of time. In the left panel of the plot below, the samples for each chain display the characteristic behavior indicative of efficient sampling and convergence: they look like white noise, or a "hairy caterpillar". The plot on the right shows the posterior distribution of $\mu$ and $\sigma$ for each of the 12 chains.
plot(chains, grid=false)