Those bastards!!

I ran into more problems trying to run OpenBUGS on a Bayesian spatial regression model today. The problem is laid out pretty simply in the picture above: an illegal memory write. I know this is not the fault of my model, since it runs smoothly for 50, 500 or 1000 iterations; the illegal memory write happens somewhere between 6000 and 6100 iterations.

This model is a simple version of a larger model I’m building. It has data on about 1800 small areas collected over four time points, with age and sex covariates, a Poisson distribution for the outcome, and an adjacency matrix for the 1800 areas. The data are laid out with the time points and age- and sex-values as separate records, so for one municipality I have 4*6*2=48 observations. I can’t use other packages for this because they assume one observation per municipality (i.e. a very simple relationship between data structure and adjacency matrix), so I’m using OpenBUGS or WinBUGS. The ultimate goal is to extend to 30 time points and 18 age categories, with a more complex time and age structure, but I want to get this simple model running first because I’m running into major computational issues.
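For concreteness, here is roughly the shape of the thing, dumped to a model file from R (which is how I set up all my BUGS runs anyway). This is a minimal sketch, not my actual model: the variable names, the priors and the expected-count offset E[i] are all hypothetical placeholders. The point is that every one of a municipality’s records indexes one shared spatial effect through area[i], which is the structure the other packages can’t accommodate:

```r
# Minimal sketch of the model structure (names and priors are placeholders)
cat("
model {
  for (i in 1:N) {                      # N records (1800 areas x 48 each)
    y[i] ~ dpois(mu[i])
    log(mu[i]) <- log(E[i]) + b0 + b.age[age[i]] + b.sex[sex[i]]
                  + b.time * time[i] + u[area[i]]
  }
  u[1:M] ~ car.normal(adj[], weights[], num[], tau.u)   # M = 1800 areas
  b0 ~ dflat()                          # flat intercept, as GeoBUGS advises
  b.age[1] <- 0                         # corner constraints on the covariates
  for (k in 2:6) { b.age[k] ~ dnorm(0, 0.001) }
  b.sex[1] <- 0
  b.sex[2] ~ dnorm(0, 0.001)
  b.time ~ dnorm(0, 0.001)
  tau.u ~ dgamma(0.5, 0.0005)
}", file = "model.txt")
```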

Today I identified that these computational issues can’t be solved. The reason for the illegal memory write is that OpenBUGS has a 1.6Gb RAM limit. People say this but it’s hard to confirm from any official source; however, in digging through critiques of BUGS I discovered that it is written in Component Pascal and compiled in BlackBox, and a bit more digging reveals that BlackBox has a 1.6Gb RAM limit.
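I can’t see inside OpenBUGS to prove it, but a back-of-envelope calculation is at least consistent with this theory. Assuming it keeps the full history of every monitored node in RAM (my assumption, not something I can confirm), the numbers land suspiciously close to where my model dies:

```r
# Back-of-envelope: one chain, 8 bytes per stored value, and every
# observation-level node monitored (all assumptions)
nodes <- 1800 * 24          # ~43,200 nodes in the current model
iters <- 6000
nodes * iters * 8 / 2^30    # ~1.9 Gb: right around where the trap appears
```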

This is frankly ridiculous. It’s understandable that there would be a RAM limit in a 32-bit system, but OpenBUGS has been around for about nine years now and no one has upgraded it to handle either multi-core computers or 64-bit architecture. Given that OpenBUGS is designed to handle complex, processor-demanding, simulation-heavy Bayesian models, the decision to write it in such a restrictive framework is ridiculous. I’m a strong critic of open source systems, but if it had been written for R it would at least be able to use a 64-bit architecture, even though R has been late to the multi-core party. I bought a PC with 128 Gb of RAM specifically for this big project, and I might as well have bought a laptop with the minimum RAM. For the model I ultimately aim to build I expect I will need about 100,000 iterations to get stability, which means that OpenBUGS will never get there. The only way to run this model without a workaround is either 1) to write the full adjacency matrix by hand and implement it directly [something I am considering] or 2) to recompile the source code (which I think might not be available) in a different Pascal compiler, and I have no idea how to even start with that.

I have considered two workarounds, though I don’t know whether either of them will work.

  1. Save and rerun: the reason OpenBUGS hits its RAM limit is that it saves all iterations in RAM, which strikes me as bad design. So one option could be to run the model to 5000 iterations, save the results, shut down OpenBUGS, reopen it, load the model file, and then use the update functions to run another 5000 iterations on top of what has already been run. I think this can be done (running additional updates on a past model) but I’m not sure. If it works I just need to run one set of runs a night for about three weeks, and I’ll get my model.
  2. Rerun with sequential initial values: if method 1 doesn’t work, another option that I am very sure will work is to run the model for 5000 iterations, extract all estimated values from the model, then start a new model of 5000 runs with those estimates as the initial values for the next model (see the sketch after this list). I’m pretty sure it will pick up where it left off, assuming I specify them all correctly, although there might be small jumps in the trace; ultimately it’ll get where it needs to go. But restarting the model is going to take a lot of time unless I can find a way to loop through and extract values automatically (I suspect I can’t), so probably a month to run the model.
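On method 2: if I’m reading the R2OpenBUGS documentation right, the object returned by bugs() includes a last.values component, the final state of each chain, intended for exactly this restart trick. A sketch of what I have in mind (my.data, my.inits and my.params are hypothetical placeholders):

```r
library(R2OpenBUGS)

# First batch of 5000 iterations
fit1 <- bugs(data = my.data, inits = my.inits,
             parameters.to.save = my.params, model.file = "model.txt",
             n.chains = 2, n.iter = 5000, n.burnin = 0, n.thin = 1)

# last.values holds each chain's final state; feed it back in as inits.
# Caveat: it only covers the nodes in parameters.to.save, so any unmonitored
# stochastic node gets re-initialised, hence the possible jumps in the trace.
fit2 <- bugs(data = my.data, inits = fit1$last.values,
             parameters.to.save = my.params, model.file = "model.txt",
             n.chains = 2, n.iter = 5000, n.burnin = 0, n.thin = 1)
```

For what it’s worth, bugs() also appears to have saveExec and restart arguments that look like they implement method 1 directly, saving and reloading the sampler state, though I haven’t tested them.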

Ridiculous! And even if I had a supercomputer I couldn’t speed it up …

Today Stata 14 was released, and it finally has built-in functionality for a wide range of Bayesian statistical tasks. I’m hoping they’ll introduce conditional autoregression and the BYM model in version 15. If you have a grant that can afford a single Stata license, it’s really worth getting. You just can’t rely on open source statistical packages; only use them when you must!

I’ve recently been building a fairly complex series of Bayesian spatial regression models in BUGS, and thought I’d share some tips based on hard-won experience with them. The various BUGS packages have the most cryptic and incoherent error messages of any stats software I have ever worked with, and although various Bayesian boosters claim that their modeling approach is intuitive, in my opinion it is the exact opposite, and it is extremely hard to configure data for use in the packages. Furthermore, online help is hard to find: google an error message and you will find multiple websites with people asking questions that have never been answered, which is rare in the modern world. I take this as a sign that most people don’t understand the error messages, and indeed the BUGS manual includes a list of errors with “possible interpretations” that reads more like the I Ching than a software guide. But Confucius say Enlightenment is not to be found in Black Box Pascal, so here is my experience of BUGS.

The models I’m running are complex: nested conditional autoregressive structures, more than 1000 areas with complex neighbour relationships at the higher level, and millions of observations. I originally ran them on a completely hideous Hewlett Packard laptop with 4 cores and 8Gb of RAM. I subsequently upgraded to a Dell workstation (a joy in comparison to HP’s clunky root-kitted horror) with 8 cores and 16Gb of RAM; I’m not sure that hardware is the main barrier to performance here, though …

The HP machine had a secret administrator account (arseholes!) so I couldn’t install WinBUGS[1], so I started off running OpenBUGS called through R’s R2OpenBUGS package from RStudio. I use R to set up the data and initial values, because I can’t think of any other way to load millions of observations into a text file without going stir crazy. But when I call OpenBUGS it just hangs … no error message or any other indication of what is going on, and I can’t tell whether it is happening at the data-loading, compiling or inits stage.
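One small mitigation I’ve found (a sketch, with my.data, my.inits and my.params again as hypothetical placeholders): calling bugs() with debug = TRUE leaves the OpenBUGS window open instead of running it invisibly, so you can at least watch where it stalls:

```r
library(R2OpenBUGS)

fit <- bugs(data = my.data, inits = my.inits,
            parameters.to.save = my.params, model.file = "model.txt",
            n.chains = 2, n.iter = 1000,
            debug = TRUE)   # keep the OpenBUGS window open so you can watch it
```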

Some digging around online and I found an old post by Andrew Gelman, observing that BUGS does not work well with “large datasets, multivariate structures, and regression coefficients.”

i.e. pretty much every statistical problem worth doing. Gelman also notes that “efficiently-programmed models can get really long, ugly, and bug-prone,” which seems like a contradiction in terms.

Anyway, noting that my data were large, with multivariate structures and regression coefficients, I thought maybe I should tone it down a bit, so I tried using a higher level of spatial hierarchy, which reduces the adjacency matrix by an order of magnitude. Still no dice. It was at this point that I upgraded to the bigger computer.

On the bigger computer the smaller model actually worked! But it didn’t work in the sense that anything meaningful came out of it … it worked in the sense that it reported a completely incomprehensible bug, something like a node having an invalid value. I tried multiple different values and nothing worked, but somewhere on the internet I found someone hinting that you should try running BUGS directly rather than calling it through R. So, having created the data in R, I killed OpenBUGS, opened the OpenBUGS interface directly, and input the model and then the data using the text files created by R[2].

When I did this I could step through the process: the model was syntactically correct, but then it failed to compile! Given that loading inits comes after compilation, an error telling me that I had the wrong initial values seems a bit misleading … in fact I had an “index out of range” error, and when I investigated I found I had made a mistake preparing one part of the data. So where the actual error was “the model can’t compile because you have provided the wrong data,” when called through R the reported problem was “you have the wrong initial values” (even though I hadn’t actually loaded initial values yet).

WTF?! But let’s step back and look at this process for a moment, because it is seven shades of wrong. When you run R2OpenBUGS in R, it first turns the data and inits into a form that OpenBUGS can read; then it dumps these into a directory; then it opens OpenBUGS and gets OpenBUGS to access those files in a stepwise process – at least, that’s what I see R doing. If I decide to run the model directly in the OpenBUGS graphical interface, I get R to make the data, use the task manager to kill OpenBUGS, open OpenBUGS directly, and get it to access the files R made, in the same stepwise process. I.e. I do exactly the same thing that R does, but I get completely different error messages.
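Which suggests a way to have it both ways. If I understand the bugs() arguments correctly, you can pin it to a known working directory and tell it not to clean up, so the exact files R fed to OpenBUGS are left sitting there for you to step through by hand (directory name hypothetical):

```r
fit <- bugs(data = my.data, inits = my.inits,
            parameters.to.save = my.params, model.file = "model.txt",
            n.chains = 2, n.iter = 5000,
            working.directory = "c:/bugswork",  # where the text files get dumped
            clearWD = FALSE)                    # don't delete them afterwards
# If this hangs or errors, kill OpenBUGS, then open the model, data and inits
# text files from c:/bugswork in the OpenBUGS GUI and step through
# check model / load data / compile / load inits manually.
```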

There are various places on the internet where you might stumble on this advice, but I want to stress it: you get different error messages in OpenBUGS run natively than you do in OpenBUGS called through R. Those error messages are so different that you will get a completely different idea of what is wrong with your program.

Anyway, I fixed the index, but then I ran into problems when I tried to load my initial values. Nothing seemed to work, and the errors were really cryptic: “invalid initial value” is not very useful. Further digging on the internet showed me that OpenBUGS and WinBUGS have different approaches to initial values, and WinBUGS is not as strict about the values it accepts. Hmmm … so I installed WinBUGS and reran the model … and it worked! OpenBUGS apparently enforces a condition on certain variables (the CAR random effects, I think) that they must sum to zero, while WinBUGS doesn’t check that condition. A free tip for beginners: setting your initial values so they sum to zero doesn’t help, but running the same model, unchanged, in WinBUGS works.

So either OpenBUGS is too strict, or WinBUGS lets through a whole bunch of dodgy stuff. I am inclined to believe the former, because initial values shouldn’t be a major obstacle to a good model, but as others[3] have observed, BUGS is programmed in a completely opaque system, so no one knows what it is doing.

So: multiple misleading errors, and a complex weirdness about calling external software through R, and I have a functioning model. Today I expanded that model back to the original order of magnitude of small areas, and it also worked, though there was an interesting weirdness here too. When I tried to compile the model it took about three hours, and produced a Trap. But the weird thing is that the Trap contained no warnings about BUGS at all; they were all warnings about Windows (something called Windows.AddInteger or similar), and after I killed the Trap my model updated fine. So I think the compile problems I previously experienced may have had something to do with memory problems in Windows (I had no problems with badly designed adjacency matrices in the larger model), but OpenBUGS just doesn’t tell you what’s going on, so you have no idea …

I should also add, for intrepid readers who have got this far, that this dude provides an excellent catalogue of OpenBUGS errors with plain-English explanations of what they actually mean. He’s like some mystical interpreter of the I Ching for Bayesian spatial regressives.

I also want to add that I think the CAR spatial correlation model is super dodgy. I found this article (pdf) by Melanie Wall in the Journal of Statistical Planning and Inference (what a read!) showing that the way we construct the spatial adjacency matrix is the primary determinant of the correlation structure, and that the correlation structure determined by this adjacency matrix is nothing like what we think we are getting. Today, on my whiteboard and with the help of R, I imagined a simple industrial process where each stage is correlated with the one before and after it, and I showed very easily, based on Wall’s work, that the adjacency matrix required to describe this process is completely different to the one you would naively set up under the framework described for CAR modeling. So I think most of the “spatial correlation” structures described using CAR models have no relationship to what the programmer thinks they’re entering into the model. But I have no proof of this, so I guess like everyone else I’ll just press on, using the adjacency matrix I think works …
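You can get the flavour of Wall’s result in a few lines of R. Take a five-stage chain process, build the obvious adjacency matrix, and look at the correlations implied by a proper CAR prior (covariance proportional to the inverse of D - rho*W; the numbers here are purely illustrative):

```r
# Implied correlations from a proper CAR prior on a 5-stage chain process,
# where each stage neighbours the stage before and after it
M <- 5
W <- matrix(0, M, M)
for (i in 1:(M - 1)) { W[i, i + 1] <- 1; W[i + 1, i] <- 1 }
D <- diag(rowSums(W))         # neighbour counts
rho <- 0.9                    # spatial dependence parameter
Sigma <- solve(D - rho * W)   # proper-CAR covariance, up to scale
round(cov2cor(Sigma), 2)
# Neighbouring stages at the ends of the chain get different correlations
# from neighbouring stages in the middle, and nothing here decays like the
# AR(1) structure you would naively expect of a chain.
```

Even in this toy case the implied correlation structure is dictated entirely by W, which is exactly Wall’s point.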

So there you have it. Next time you see an opinion formed on the basis of a spatial regression model built in BUGS, remember the problems I had getting to the output, and ask yourself – do you trust that model? Really?

fn1: Well, I could copy WinBUGS into the program files folder but I couldn’t patch it or install the immortality key, which, wtf? When I bought Time Series Modelling and Forecasting by Brockwell and Davis, ITSM came as a free disk with the book. When I buy the BUGS book I get to install software that comes with a big message telling me to get stuffed, and years later they finally provide a key that enables you to use it for free …

fn2: If you aren’t aware of how this works: when you call OpenBUGS from R, providing data from inside R, R first dumps the data into text files in the directory of your choosing, then OpenBUGS opens those files. So if you aren’t comfortable preparing data for BUGS yourself, use the list and structure commands in R, then kill OpenBUGS and go to OpenBUGS directly … the text files will remain in the directory you chose.
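For the uninitiated, the data file looks something like the output of this sketch (toy numbers, and this is my understanding of the format rather than gospel). The one genuine gotcha I know of: BUGS fills arrays in row-major order while R stores them column-major, so matrices need transposing on the way out:

```r
# Toy example of the S-PLUS-style data format the BUGS packages read
X <- matrix(c(1, 20,
              0, 25,
              1, 30), nrow = 3, byrow = TRUE)
my.data <- list(N = 3,
                y = c(4, 0, 7),
                # transpose before flattening: BUGS reads row-by-row,
                # R stores column-by-column
                X = structure(.Data = as.vector(t(X)), .Dim = dim(X)))
dput(my.data, file = "data.txt")
```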

fn3: Barry Rowlingson does a pretty good job of showing how interesting and useful spatial analysis can be: see e.g. his post on mapping the Zaatari refugee camp in Jordan.