Shlemiel the painter strikes again

This post concerns a bug in JAGS 3.x.y. I wrote it quite a long time ago but was too embarrassed to post it. Now that JAGS 4.0.0 is out it is confession time.

velocideX-times
Software maven Joel Spolsky tells the story of Shlemiel the painter to illustrate a common problem in software.

Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. “That’s pretty good!” says his boss, “you’re a fast worker!” and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. “Well, that’s not nearly as good as yesterday, but you’re still a fast worker. 150 yards is respectable,” and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. “Only 30!” shouts his boss. “That’s unacceptable! On the first day you did ten times that much work! What’s going on?”

“I can’t help it,” says Shlemiel. “Every day I get farther and farther away from the paint can!”

Spolky’s point is that Shlemiel’s dumb behaviour is easy to reproduce when programming in a high-level language if you do not pay attention to the low-level details of the implementation.  His analogy is so famous that the Shlemiel the painter algorithm even has its own Wikipedia page.So when forum users Fabio and VelocideX reported that jags was slowing down in long runs, I knew that somewhere inside the code there had to be a little Shlemiel patiently walking back and forth to his paint pot.

The figure shows cumulative run time for an example posted by Velocidex. You can clearly see that the runtime is quadratic, not linear. The bug only struck if you used the rjags interface, and then only when using the progress bar. Batch runs in R and using the command line interface were unaffected. Fortunately it was easy to work around the bug by using a lower refresh frequency of the progress bar. The dotted line shows the (near) linear behaviour of rjags 0.3-12 on the same example. The cost of this change is to make updates less responsive to user interrupt.

So what is happening here? It is a fine example of premature optimization. I decided to pre-allocate the memory required to store monitored values. Unfortunately, with multiple short calls to the Model::update function, this mechanism leads to multiple reallocation of the existing stored values. The longer you run the model with monitoring, the slower it gets as the stored values are copied and recopied…

JAGS 4.0.0  using the built-in reallocation mechanism of the C++ vector class, and as a result appending new monitored values is an amortized constant time operation.

Advertisements

2 thoughts on “Shlemiel the painter strikes again

    • It does not occur at all with JAGS 4.0.0. I’m currently trying to get rjags version 4 accepted on CRAN and when that happens there will be nothing to worry about.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s