## statistics and applied probability

### national university of singapore

#### Month: June 2010

We’re holding a 3-day workshop at IMS in September on recent advances in Bayesian computation. We haven’t listed the speakers yet, but we have an impressive roll call of overseas participants. Registration is open and free, so sign up and tell your friends!

OK, the connexion to statistics is tenuous, but with the World Cup starting tonight, I’m getting excited. I read a great snippet in Today about the Argentine squad, managed by the erratic genius, Maradona. Ariel Garcé, or Chino as he’s known, is a relatively unknown defender at Colón, but was picked for the Argentine World Cup squad

because of coach Diego Maradona’s dream one night that Argentina had won the World Cup, and Garcé’s face was the only one he remembered the next morning.

You have to love Maradona (here’s his Hand of God again on YouTube and Wikipedia). Football is so full of superstitions (I recall Glasgow Rangers’ then-manager Dick Advocaat, on being asked how he prepared for an important game, replying with a straight face “I put on my lucky underpants”) that it would be interesting to see how many such beliefs actually hold up to a randomised experiment (the tenuous connexion).

My paper with Major Dr Vernon Lee and co was published last night in the New England Journal of Medicine. We showed that a suite of control measures, including proactively dispensing antivirals, stopped some outbreaks of H1N1 in army camps early on. It’s a bit of a score for the army, me, and Singapore, as NEJM is the top medical journal and has an impact factor of 50; for comparison, AnnStat has an IF of 2.3, JASA of 2.4 and JRSSB of 2.8, while Science and Nature get 28 and 31, respectively.

We encountered some difficulties in the analysis: for one, there was no control to the control (!), i.e. no outbreaks in which the disease was left to run its course, as the units were needed for stuff like the National Day Parade. So I put together a kind of Poisson infection model and whopped it with some MCMC. That did the trick. The other problem was that the army were just too damn efficient! They started control for most outbreaks so soon it was hard to show the control was effective (think small n). Perhaps the lesson is to let things get really bad before you try to show that your treatment is effective… Anyway, this has given me an idea for an honours project…
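For the curious, here is a toy sketch of the general idea, and emphatically not the model from the paper. It assumes daily case counts are Poisson with rate `lam` before control starts and `lam * theta` afterwards, puts vague normal priors on the log-parameters, and samples with a plain random-walk Metropolis algorithm. The function names, the priors and the counts are all made up for illustration.

```python
import math
import random

def log_post(log_lam, log_theta, before, after):
    """Log posterior (up to a constant) for the toy two-rate Poisson model.

    Assumed priors (hypothetical): log_lam and log_theta are N(0, 10^2).
    """
    lam, theta = math.exp(log_lam), math.exp(log_theta)
    ll = sum(y * log_lam - lam for y in before)
    ll += sum(y * (log_lam + log_theta) - lam * theta for y in after)
    prior = -(log_lam ** 2 + log_theta ** 2) / (2 * 10.0 ** 2)
    return ll + prior

def metropolis(before, after, n_iter=20000, step=0.2, seed=1):
    """Random-walk Metropolis on (log_lam, log_theta); first half is dropped as burn-in."""
    rng = random.Random(seed)
    cur = (0.0, 0.0)
    cur_lp = log_post(cur[0], cur[1], before, after)
    draws = []
    for _ in range(n_iter):
        prop = (cur[0] + rng.gauss(0, step), cur[1] + rng.gauss(0, step))
        prop_lp = log_post(prop[0], prop[1], before, after)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, prop_lp - cur_lp)):
            cur, cur_lp = prop, prop_lp
        draws.append(cur)
    return draws[n_iter // 2:]

# Made-up daily case counts before and after control measures start.
draws = metropolis(before=[4, 6, 5, 7], after=[2, 1, 1, 0, 1])
thetas = [math.exp(log_theta) for _, log_theta in draws]
print("posterior mean of theta:", sum(thetas) / len(thetas))
print("P(theta < 1):", sum(t < 1 for t in thetas) / len(thetas))
```

A value of `theta` well below 1 would suggest the rate dropped once control began, though with only a handful of days the posterior will be wide, which is the small-n problem in miniature.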

PS The paper’s being publicised in the Straits Times and Today on Friday 11 June and on the Channel NewsAsia website.

In Today today:

In 2006, [Germany defender Per] Mertesacker was paired with new Schalke signing Christoph Metzelder and they were statistically the best centre-back pairing of the last World Cup.

Ah, but was the difference between them and the second best statistically significant?
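One hedged way to attack that kind of question is a permutation test on the difference in mean scores between two pairings. The sketch below is purely illustrative: the quote does not say which statistic was used, so the function name `permutation_p_value` and the per-match ratings are made up.

```python
import random

def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the difference in means of a and b."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign the pooled scores to the two groups at random.
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        diff = sum(pa) / len(pa) - sum(pb) / len(pb)
        if abs(diff) >= abs(observed) - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-match ratings for the best and second-best pairings.
best = [7.5, 8.0, 6.5, 7.0, 8.5, 7.0, 7.5]
second = [7.0, 7.5, 6.0, 7.5, 8.0, 6.5]
print("p-value:", permutation_p_value(best, second))
```

With only a handful of matches per pairing, even a visible gap in averages can easily be compatible with chance.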

Comments aren’t appearing at the moment.  Apologies to those who have commented: I’ll liaise with the IT people and try to get it working again!

A tip somewhat related to Getting papers, google, NUS library:
Getting books from other NUS libraries delivered to the one nearest to you (Intra Library Loan).

Log in and select E-Forms → Loan Services → Request to borrow books held in other NUS Libraries

Useful if you are too lazy to make that trip to the Central or HSS Library.

An, ahem, interesting and novel graphical display of data can be found on the MOH webpage. It comes from a survey by Accenture of about 1000 people in each of 16 countries, including Singapore, on their perceptions of the “importance” of several health-related issues and the “performance” of their national government in tackling them. Accenture decided on a somewhat unorthodox way to present the results: via a “radar chart”. Not heard of them before, have you? Well, take a look at the Singapore chart:

Appropriately, it looks like a ball. But I catch no ball!  After much contemplation, I worked out the rules for understanding it. The hours round the side aren’t hours, they indicate questions, but to find what the questions actually are, you have to scroll back 25 pages.  And the questions themselves are kind of weirdly banal:

1. Focus on delivering real improvements in the overall health of the nation.
2. Target health services to help people with the highest level of need.
3. etc

So, the interviewees were asked “Do you think it’s important to focus on delivering real improvements in the overall health of the nation?” I wonder who answered no? Maybe people who like imaginary health benefits?  Or perhaps those who think the nation is already fully healthy?  Anyway, in addition, they had to say if they thought the government was performing on this issue.  Thus the graph: the radial distance from centre to the red curve indicates performance, the distance from centre to blue curve indicates desirability, and the gap indicates those people who think the government is not doing something that they think is important.

Now that you’ve understood the rules, does it make it any easier to catch ball? No, not really, cos you still have to flick back to the questions, you’re unable to compare countries, and those distances are just hard to work out.

I wrote to Kaiser Fung of junkcharts about it, and in an extremely pithy reply, he sent the dataset Accenture used. Taking that as a cue to do it myself, I did a makeover of the plot. Voilà!

It seemed that the most important thing is to be able to compare across countries, so I switched it to focus on one question at a time.  Countries are ranked by average “performance” over the 16 questions: Singapore comes top, Ireland bottom.  The black bar is the ostensible performance, the black and red bar combined is the “importance”, and the red bar is the gap between expectations and experience.
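The arithmetic behind the makeover is simple enough to sketch. The snippet below is a minimal illustration only: the function name `rank_countries`, the data layout and the numbers are invented, and it is not the code used to draw the plot. It computes the gap (importance minus performance) for each question and ranks countries by mean performance.

```python
def rank_countries(scores):
    """Rank countries by mean performance, highest first.

    `scores` maps country -> list of (importance, performance) pairs,
    one pair per survey question. Returns (country, mean_perf, gaps) tuples.
    """
    rows = []
    for country, pairs in scores.items():
        mean_perf = sum(perf for _, perf in pairs) / len(pairs)
        gaps = [imp - perf for imp, perf in pairs]  # the red bar
        rows.append((country, mean_perf, gaps))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Made-up percentages for two fictional countries and two questions.
toy = {
    "Country A": [(90, 40), (80, 50)],
    "Country B": [(85, 70), (75, 60)],
}
for country, mean_perf, gaps in rank_countries(toy):
    print(country, mean_perf, gaps)
```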

Now, to work out why richer countries mostly do badly, while middling-income countries do well, probably requires actually reading through the report…

The Beeb report that China has developed the world’s second fastest supercomputer. (Partial compensation for getting beaten by Singapore’s paddlers, I suppose.) What makes this worth writing about on a stats blog?  Well, the Beeb also provide data on the top 100 supercomputers, and let you play around with them (data, not computers) via a “treemap”.  Here’s an example:

Hmm… not sure what I think of these treemaps. The neat thing is, you can hover your mouse over it and get info about that particular supercomputer. But this hardly seems worthwhile for any of the 50-odd computers in the US, say, except maybe the biggest few, so this feature doesn’t seem to add value. Would a bar chart not be a bit easier to visualise?

Anyway, those who know my computing habits will realise I was delighted by clicking on the “by operating system” option: Linux Linux Linux Linux … Linux Windows.
