I’ve been invited to present my work on sequential Monte Carlo methods for Raman spectroscopy at the Oxford Computational Statistics and Machine Learning Reading Group (OCSMLRG), 11am on Friday 11 March. I’ve made some good progress since my seminar at QUT last year, so I’m looking forward to presenting these methods to a new audience. The abstract of my talk is below.
Raman spectroscopy is a technique for detecting and identifying molecules such as DNA. It is sensitive at very low concentrations and can accurately quantify the amount of a given molecule in a sample. The presence of a large, nonuniform background presents a major challenge to analysis of these spectra. We introduce a sequential Monte Carlo (SMC) algorithm to separate the observed spectrum into a series of peaks plus a smoothly-varying baseline, corrupted by additive white noise. Our model-based approach accounts for differences in resolution and experimental conditions. By incorporating this representation into a Bayesian functional regression, we can quantify the relationship between molecular concentration and peak intensity, resulting in an improved estimate of the limit of detection. We also calculate the model evidence using SMC to investigate long-range dependence between peaks.
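As a rough illustration of the signal model in the abstract (a series of peaks plus a smoothly-varying baseline, corrupted by additive white noise), here is a minimal Python sketch that simulates such a spectrum. It is not the SMC algorithm from the talk. The Lorentzian peak shape, the polynomial baseline, and every numeric value are assumptions chosen purely for illustration, and all function names are hypothetical.

```python
import random

def lorentzian(x, location, amplitude, half_width):
    """Lorentzian peak whose height at its centre equals `amplitude` (assumed shape)."""
    return amplitude * half_width**2 / ((x - location)**2 + half_width**2)

def baseline(x, coeffs):
    """Smooth baseline, here a low-order polynomial in x (an assumption)."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def simulate_spectrum(wavenumbers, peaks, coeffs, noise_sd, seed=1):
    """Return peaks + baseline + Gaussian white noise at each wavenumber."""
    rng = random.Random(seed)
    spectrum = []
    for x in wavenumbers:
        signal = sum(lorentzian(x, *p) for p in peaks)
        spectrum.append(signal + baseline(x, coeffs) + rng.gauss(0.0, noise_sd))
    return spectrum

# Hypothetical values: two peaks given as (location, amplitude, half-width)
wavenumbers = [400 + 2 * i for i in range(501)]
peaks = [(650.0, 120.0, 8.0), (1080.0, 200.0, 12.0)]
coeffs = [300.0, -0.1]  # intercept and slope of the sloping background
y = simulate_spectrum(wavenumbers, peaks, coeffs, noise_sd=5.0)
```

Separating an observed spectrum back into these three components is the inverse problem that the talk addresses; the simulation only shows what is being assumed about the data.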
In my experience, the default libRblas gives horrible performance across all of the platforms that I regularly use (Windows, Mac OS & Linux). The R package that I’m currently developing uses RcppEigen, which is not dependent on an efficient BLAS or LAPACK library. However, many other R packages do have this dependency. Therefore, I would recommend following the instructions in the R Installation and Administration guide to switch over to a more efficient implementation. Avraham Adler, Tony Fischetti and Zachary Mayer have written similar blog posts on this topic. I use the Accelerate Umbrella Framework (vecLib) on OS X and Intel MKL with icc on Linux. The following instructions describe how (and why) to install GotoBLAS for R on Microsoft Windows.
The first paper from my postdoc at Warwick has been accepted and is now available online:
Gracie, Moores, Smith, Harding, Girolami, Graham & Faulds, “Preferential attachment of specific fluorescent dyes and dye labelled DNA sequences in a SERS multiplex,” to appear in Analytical Chemistry. DOI: 10.1021/acs.analchem.5b02776
No Bayesian methods in this one, as the model and algorithms are still under development (watch this space…). Instead, I use nonparametric bootstrap confidence intervals to compare Raman peak intensities.
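For readers unfamiliar with the technique, here is a minimal Python sketch of a nonparametric (percentile) bootstrap confidence interval for the difference in mean peak intensity between two groups. This is a generic illustration, not the analysis code from the paper: the function name, the choice of the percentile interval, and the intensity values are all hypothetical.

```python
import random
import statistics

def bootstrap_ci_diff(a, b, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    # Read the interval off the empirical quantiles of the bootstrap differences
    lower = diffs[round((alpha / 2) * n_boot)]
    upper = diffs[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up peak intensities for two hypothetical dyes
dye_a = [10.2, 11.5, 9.8, 10.9, 11.1, 10.4]
dye_b = [8.1, 7.9, 8.8, 8.4, 7.6, 8.3]
lower, upper = bootstrap_ci_diff(dye_a, dye_b)
```

If the resulting interval excludes zero, the data suggest a difference in mean intensity between the two groups at roughly the chosen level, without assuming any parametric distribution for the intensities.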
I also have two middle-author papers accepted from my PhD work. The first is a review of Bayesian modelling and computation for satellite imagery:
Falk, Alston, McGrory, Clifford, Heron, Leonte, Moores, Walsh, Pettitt & Mengersen (2015) “Recent Bayesian approaches for spatial analysis of 2-D images with application to environmental modelling,” Environmental and Ecological Statistics 22(3): 571-600. DOI: 10.1007/s10651-015-0311-1
The second paper introduces a method for aligning 3D CT scans using DICOM coordinates:
Hargrave, Mason, Guidi, Miller, Becker, Moores, Mengersen, Poulsen & Harden, “Automated replication of cone beam CT-guided treatments in the Pinnacle³ treatment planning system for adaptive radiotherapy,” to appear in the Journal of Medical Radiation Sciences. DOI: 10.1002/jmrs.141
There are also two draft papers available as online pre-prints:
Drovandi, Moores & Boys, “Accelerating Pseudo-Marginal MCMC using Gaussian Processes,” Technical Report, Queensland University of Technology, Brisbane, Australia.
Moores, Pettitt & Mengersen, “Scalable Bayesian Inference for the Inverse Temperature of a Hidden Potts Model,” arXiv:1503.08066 [stat.CO]
Last but not least, my thesis has been posted on the QUT ePrints server:
Moores (2015) “Bayesian computational methods for spatial analysis of images,” PhD thesis, Queensland University of Technology, Brisbane, Australia.
An extended abstract is available online:
Moores (2016) “Bayesian computational methods for spatial analysis of images,” to appear in the Bulletin of the Australian Mathematical Society. DOI: 10.1017/S0004972715001598
2015 was a very busy and successful year. Looking forward to publishing more great science with all of my co-authors in 2016!
I have mixed feelings about some of the new features in RStudio v0.99.489, but one thing I really like is the menu command “Insert Roxygen Skeleton.” It has always bugged me that the R package created by a call to utils::package.skeleton(..) doesn’t pass R CMD check without a bunch of manual changes. Editing .Rd files by hand isn’t my idea of fun, so I’m much happier with the new workflow:
I’ve taken a crack at compiling the R package gputools on Windows, but I hit a bit of a roadblock. I’ll describe the steps I’ve taken below; hopefully someone will be able to suggest a way forward.
Some background: gputools provides an interface between R and CUDA, the NVIDIA API for executing code on a graphics card (GPU). On Linux or Mac OS, installation is fairly straightforward because the CUDA compiler nvcc depends on gcc or clang, respectively. However, interfacing between R and CUDA on Windows is an unsolved problem, since the respective toolchains are fundamentally incompatible. The only supported toolchain for compiling R on Windows is MinGW with gcc and gfortran. On the other hand, the NVIDIA CUDA Toolkit requires certain specific versions of Microsoft Visual Studio.
An unexpected gotcha with the NVIDIA CUDA Toolkit on Windows is that it refuses to work with the current version of Microsoft Visual Studio:
nvcc fatal : nvcc cannot find a supported version of Microsoft Visual Studio. Only the versions 2010, 2012, and 2013 are supported
According to the installation guide (and the above compile error), CUDA Toolkit 7.5 only integrates with Visual Studio 2013, 2012 or 2010. I could have saved myself many hours of downloading and installing Visual Studio 2015 and Intel Parallel Studio XE 16.0, if I’d known that beforehand! Now to uninstall everything and start again from scratch…