Gina Gruenhage has just arxived a new paper describing an algorithm we call cMDS. Here’s what it’s for: if you do any kind of data analysis you often find yourself comparing datapoints using some kind of distance metric. All’s well if you have a unique reasonable distance metric you can use, but often what you have is a family of possible distance functions, and very little idea how to choose among them. What if the patterns in the data change according to how you measure distance?
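To see why the choice of metric matters, here's a toy illustration (not from the paper, and not cMDS itself, just base R's classical MDS on an arbitrary dataset): the low-dimensional picture you get out of the same data changes with the distance you feed in.

# Classical MDS (base R's cmdscale) on the same data under two metrics.
# This is only an illustration of how the embedding depends on the metric;
# cMDS, the algorithm in the paper, is not shown here.
X <- scale(as.matrix(iris[, 1:4]))
d_euc <- dist(X, method = "euclidean")
d_man <- dist(X, method = "manhattan")
par(mfrow = c(1, 2))
plot(cmdscale(d_euc), main = "Euclidean", xlab = "", ylab = "")
plot(cmdscale(d_man), main = "Manhattan", xlab = "", ylab = "")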
The slides for my ECVP tutorial on classification images are available here. Try this alternative version if the equations look funny.
(image from Mineault et al. 2009)
The slides are in HTML and contain some interactive elements. They’re the result of experimenting with R Markdown, D3 and pandoc. You write the slides in R Markdown, use knitr and pandoc to produce the slides, and add interaction using D3.
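For the curious, the pipeline looks roughly like this (file names are placeholders and dzslides is just one of several pandoc slide formats; the actual slides may have been built slightly differently):

library(knitr)
knit("slides.Rmd", output = "slides.md")   # run the R chunks, produce plain Markdown
# hand the Markdown over to pandoc to build an HTML slide deck
system("pandoc -s -t dzslides slides.md -o slides.html")
# the D3 interaction is then added by hand to the generated HTML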
I’m not completely happy with the results but it’s a pretty cool set of tools.
STAN is a new system for Bayesian inference, similar to BUGS and JAGS. I’ve played with it a bit and it’s quite promising: it really has the potential to make MCMC less of a pain (on simple models). I’ve written a short introduction to fitting psychometric functions using STAN and R, in case that’s useful to psychophysicists out there.
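To give a flavour of what that looks like (a minimal sketch, not the model from the tutorial; the logistic parameterisation, the priors and the simulated data are placeholder choices), a psychometric function is just a Bernoulli model, which can be written in Stan and fitted from R via rstan:

library(rstan)
# simulated data: 8 stimulus intensities, 20 trials each (illustration only)
x <- rep(seq(-2, 2, length.out = 8), each = 20)
y <- rbinom(length(x), 1, plogis(2 * x))
psy_model <- "
data {
  int<lower=1> N;
  vector[N] x;                  // stimulus intensity
  int<lower=0,upper=1> y[N];    // observer's binary responses
}
parameters {
  real alpha;                   // threshold
  real<lower=0> beta;           // slope
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  y ~ bernoulli_logit(beta * (x - alpha));
}
"
fit <- stan(model_code = psy_model, data = list(N = length(x), x = x, y = y))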
Imagine a world in which people are taught that there are two kinds of counting: there’s potato-counting, and there’s counting other stuff (beans, points, cards, etc.). Potatoes are special, so potato-counting gets its own courses, under the name “Kartoffelanalysis”. When you take a Kartoffelanalysis 101 course, nobody mentions that you could use the same techniques to count other objects. Potatoes are special and unique. More advanced students learn that there are special techniques for counting a mix of potatoes and other things, and these sophisticated techniques are called Mixed Kartoffelanalysis. Only a select few ever learn that counting potatoes works pretty much the same way as counting other stuff.
I’ve uploaded a draft tutorial on some aspects of prediction using point processes. I wrote it using R-Markdown, so there are bits of R code for readers to play with. It’s hosted on Rpubs, which turns out to be a great deal more convenient than WordPress for that sort of thing.
We’ve just revised and re-arxived our manuscript on point processes for the analysis of eye movement data (joint work with Hans Trukenbrod & Ralf Engbert of the University of Potsdam, and Felix Wichmann of the University of Tübingen).
The main idea is that often one is interested mostly in where people have looked and why. Fixation locations are just points in space, and so you can analyse that sort of data with point processes. The reason you’d want to do that is that point processes give interesting ways of characterising the statistical patterns of points in space. The thing we focus on is predicting what people look at based on image content, but that’s only one of the many things you can do in that framework.
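To give a flavour (this is not the analysis from the manuscript, just a generic illustration using the spatstat package and made-up fixation coordinates), the most basic thing you can do with a spatial point process is estimate an intensity function, i.e. a map of where fixations tend to land:

library(spatstat)
# made-up fixation coordinates on an 800 x 600 image (illustration only)
fix <- ppp(x = runif(200, 0, 800), y = runif(200, 0, 600),
           window = owin(c(0, 800), c(0, 600)))
lambda <- density(fix, sigma = 30)   # kernel estimate of the fixation intensity
plot(lambda, main = "Estimated fixation intensity")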
Of potential interest to eye movement researchers and people who like statistical models of stuff.
Regular expressions are a fantastic tool when you’re looking for patterns in time series. I wish I’d realised that sooner.
Here’s a timely example: traditionally, when you have two successive quarters of negative GDP growth, you’re in recession. We have a quarterly GDP time series for Australia, and we want to know how many recessions they’ve been having down under. How do we count these events?
Of course you can always write a loop, but it’s clumsy. You can use the "embed" function, which is much better in a lot of cases. Or you could use regexps. Here’s how.
First, our Australian GDP data, courtesy of the expsmooth package:
library(expsmooth)   # provides the ausgdp series
plot(ausgdp, main = "Australian GDP per capita", ylab = "dollars", xlab = "Year")
(caveat: the series is per capita GDP, not absolute GDP, so my recession count might be off).
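The gist of the regexp trick is to turn the series of quarterly changes into a string of "+" and "-" characters, then let the regexp engine find runs of two or more "-". Here's a sketch of the idea (my reconstruction, not necessarily the exact code, and bear in mind the per capita caveat above):

library(expsmooth)   # repeated here so the snippet stands on its own
growth <- diff(ausgdp)                                    # quarter-on-quarter change
signs <- paste(ifelse(growth < 0, "-", "+"), collapse = "")
matches <- gregexpr("-{2,}", signs)[[1]]                  # runs of >= 2 negative quarters
n_recessions <- if (matches[1] == -1) 0 else length(matches)
n_recessions

The embed-based alternative would instead build a matrix of consecutive quarter pairs and check where both entries are negative, but the string version scales more gracefully to fancier patterns (say, a recession followed by at least four quarters of growth).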