BNP with Turing.jl

We recently extended Turing.jl to support Bayesian non-parametric probabilistic modelling. This blog post briefly summarizes our approach.

Probabilistic modelling is a core component of a scientist's toolbox for incorporating uncertainty about model parameters and noise in the data. Statistical models with a Bayesian nonparametric (BNP) component are difficult to handle due to their infinite dimensionality, which prevents the straightforward use of standard inference methods. Probabilistic programming languages such as Turing.jl, however, enable the rapid development of new probabilistic models while simultaneously automating statistical inference. This is made possible by separating the model definition from the inference scheme and by using generic inference algorithms. To allow non-experts to use a variety of BNP models, we extended Turing.jl with support for BNP applications.

Our extension admits rapid development of new BNP priors for parametric and non-parametric probabilistic modelling. Non-parametric priors are implemented by separating the representation (e.g. a stick-breaking or Chinese Restaurant Process construction) from the underlying random measure, thus providing a flexible interface for BNP modelling.

Consider, for example, the well-known Dirichlet process (DP). One possible way to construct samples from a DP is the so-called stick-breaking (SB) construction, which draws stick lengths vₖ ~ Beta(1, α) and sets the mixture weights to wₖ = vₖ ∏_{j<k} (1 − vⱼ). One way to implement a DP using SB is to define a type

struct StickBreakingProcess <: ContinuousUnivariateDistribution
  rpm::AbstractRandomMeasure # random measure of interest
end

and to define the DP using

struct DirichletProcess <: AbstractRandomMeasure
  alpha::Float64 # concentration parameter
end

Now, given a random measure (e.g. the Dirichlet process) and a construction (e.g. the stick-breaking process), we can define functions that draw samples from the process and evaluate its log density.
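For example, a minimal sketch of these two functions for the stick-breaking construction of a DP might look as follows (assuming the types above; Turing's actual implementation differs in its details):

using Distributions, Random

# Under the stick-breaking construction, each stick length of a DP(α)
# follows a Beta(1, α) distribution.
function Base.rand(rng::Random.AbstractRNG, sbp::StickBreakingProcess)
  dp = sbp.rpm::DirichletProcess
  return rand(rng, Beta(1.0, dp.alpha))
end

function Distributions.logpdf(sbp::StickBreakingProcess, v::Real)
  dp = sbp.rpm::DirichletProcess
  return logpdf(Beta(1.0, dp.alpha), v)
end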

Doing so provides a clean and easy-to-extend interface for BNP modelling.
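As an illustration of this extensibility, the Chinese Restaurant Process (CRP) construction and the Pitman-Yor process used in the example below could be sketched along the same lines (again a simplified sketch, not Turing's actual RandomMeasures implementation):

struct PitmanYorProcess <: AbstractRandomMeasure
  d::Float64 # discount parameter, 0 ≤ d < 1
  θ::Float64 # strength parameter, θ > -d
  t::Int     # number of draws made so far
end

struct ChineseRestaurantProcess{T<:AbstractRandomMeasure} <: DiscreteUnivariateDistribution
  rpm::T         # underlying random measure
  m::Vector{Int} # current cluster sizes
end

# CRP predictive for a DP(α): an existing cluster j is chosen with
# probability ∝ m[j], a new cluster with probability ∝ α.
function Distributions.logpdf(crp::ChineseRestaurantProcess{DirichletProcess}, z::Int)
  α = crp.rpm.alpha
  norm = sum(crp.m) + α
  return z <= length(crp.m) ? log(crp.m[z] / norm) : log(α / norm)
end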

CrossCat Model (Example)

The following example illustrates the use of Turing to formulate the CrossCat model, which uses the Chinese Restaurant Process construction for a nested Dirichlet process formulation. Consider the following variation of the original CrossCat model:

[Figure: variation of the CrossCat model]

To implement this model (assuming, for simplicity, Gaussian likelihoods and Gaussian priors on the cluster locations), we can write the following model in Turing.

@model CrossCat(x, μ0, α, d, θ) = begin
  N,P = size(x)

  # assignments to views (dimensions)
  y = tzeros(Int, P)
  for p = 1:P
    # number of dimensions currently assigned to each existing view
    pk = Int[sum(y .== k) for k = 1:maximum(y)]
    y[p] ~ ChineseRestaurantProcess(PitmanYorProcess(d, θ, p-1), pk)
  end

  # cluster locations for each view
  μ = Vector{Vector{Vector{Real}}}(undef, maximum(y))

  # assignments of observations to clusters in each view
  z = TArray(TArray, maximum(y))

  for k = 1:maximum(y)
    μ[k], z[k] = Vector{Vector{Real}}(), tzeros(Int, N)

    # for each observation
    for n = 1:N
      # draw assignment to a cluster inside a view
      # sizes of the existing clusters in view k
      J = maximum(z[k])
      nj = Int[sum(z[k] .== j) for j = 1:J]
      z[k][n] ~ ChineseRestaurantProcess(DirichletProcess(α), nj)

      # make new cluster if necessary
      if z[k][n] > J
        push!(μ[k], Vector{Float64}(undef, P))
        μ[k][z[k][n]] ~ MvNormal(μ0, ones(P))
      end

      # likelihood of the observed dimensions belonging to this view
      for p in findall(y .== k)
        x[n, p] ~ Normal(μ[k][z[k][n]][p], 1.0) # unit-variance Gaussian
      end
    end
  end
end

The code above expresses our model of interest in a very concise way. After implementing this model in Turing, we can use sequential Monte Carlo (SMC) or a combination of Hamiltonian Monte Carlo (HMC) and particle Gibbs (PG) to perform inference.
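For instance, inference could be run along the following lines (a hypothetical usage sketch: the data, hyperparameter values, and exact sampler constructors are placeholders and depend on the Turing version):

using Turing

x  = randn(100, 8) # placeholder data: 100 observations, 8 dimensions
μ0 = zeros(8)      # prior mean of the cluster locations
model = CrossCat(x, μ0, 1.0, 0.5, 1.0) # α = 1.0, d = 0.5, θ = 1.0

# Sequential Monte Carlo over all variables ...
chain = sample(model, SMC(), 1000)

# ... or particle Gibbs for the discrete assignments combined with
# HMC for the continuous cluster locations.
chain = sample(model, Gibbs(PG(20, :y, :z), HMC(0.05, 10, :μ)), 1000)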

You can find more example implementations of BNP models in Turing at http://tiny.cc/BNP12.
