Stan

Andrew Gelman has announced the release of Stan version 1.0.0 and its R interface RStan.  Stan – named after Stanislaw Ulam, the inventor of the Monte Carlo method – is a new MCMC program that represents a major technological leap forward. It works flawlessly on my Linux desktop and is very, very fast. (Note that I have nothing to do with the Stan project: I am just posting this here because it is clearly of interest to JAGS users).

The starting point for Stan was that some large hierarchical models are intractable by Gibbs sampling.  Everyone has come across examples in which the Gibbs sampler will not converge in a reasonable time because the variables are highly correlated. Older BUGS users will remember re-parametrizing their models to reduce this correlation, e.g. by centring predictor variables like this:

x.mean <- mean(x[])
for (i in 1:N) {
   y[i] ~ dnorm(mu[i], tau)
   mu[i] <- alpha + beta * (x[i] - x.mean)
}

OpenBUGS introduced updaters that sample blocks of correlated parameters together.  The glm module in JAGS also uses block sampling for more efficient sampling of generalized linear models.  Stan takes this approach to its logical conclusion by using Hamiltonian Monte Carlo [1] to do block updating of all the parameters in the model together.  It can thereby tackle problems that remain beyond the scope of OpenBUGS and JAGS.

Stan works a bit differently from OpenBUGS and JAGS. It takes a description of the model and converts it into C++ code, then compiles the C++ program into a standalone executable file. This executable is run on a data set to produce the sampled output.  The model description used by Stan is a BUGS-like language, which BUGS users should find easy to read.  To help BUGS users further, the Stan User’s Guide contains an appendix that highlights the similarities and differences between Stan and BUGS.  The Stan distribution also contains the familiar BUGS examples converted to Stan format.

Despite the similarities, there are still some surprises for BUGS users. Since the model description is converted to C++ before being compiled, the programming style is imperative, not declarative. This means you can do some things that are impossible in BUGS, such as defining multiple likelihoods for the same variable

y ˜ normal(0,1);
y ˜ normal(2,3);

The contributions to the log-likelihood are added together.  You can also have local variables that are modified inside a for loop:

{
   total <- 0;
   for (i in 1:n) {
      theta[i] ˜ normal(total, sigma);
      total <- total + theta[i];
   }
}

Rethinking everything from the ground up has allowed the Stan developers to correct some of the design problems of BUGS, such as the counter-intuitive parametrization of distributions in terms of the precision rather than the standard deviation or variance. This was important in the early days of BUGS to allow efficient Gibbs sampling of conjugate distributions. However, nobody can express an informative prior in terms of the precision, which leads to extensive use of parameter transformations and rather verbose BUGS code.

That’s about all I have to say at this point. Stan 1.0.0 was only released two days ago so I have not had much time to explore further. To find out more, visit the Stan web site: http://mc-stan.org

[1] For an explanation of Hamiltonian Monte Carlo, see the introduction by Radford Neal in The Handbook of Markov Chain Monte Carlo. The sampler used by Stan is the No U-Turn Sampler, or NUTS.

About these ads