Redrawing Stephen Jay Gould's Graph

The late Harvard paleontologist and baseball fan Stephen Jay Gould wrote a famous study on the disappearance of the .400 batting average in baseball in his book…


6. Make It Pretty: Plotting 2-way Interactions with ggplot2

ggplot2, as I’ve already made clear, is one of my favourite packages for R. And since that original post about ggplot2 remains one of my most frequently visited, I thought I would proceed with starting a series of posts called “Make It Pretty”, all about sharing ways of visualizing data that I think are attractive/effective/comprehensive.

Spaghetti plots with ggplot2 and ggvis

This post was motivated by this article that discusses the graphics and statistical analysis for a two-treatment, two-period, two-sequence (2x2x2) crossover drug interaction study of a new drug versus the standard.

Rape at State Level in India Revisited

In my previous post I calculated the number of reported rapes per 100,000, or 1 lakh women, as the Indians say, for each state in India for the year 2011, and sorted them in a graph to see which ones have the highest rate of rapes.


Rape in India at the State Level

After revealing the counts of rapes in India (see my previous post), I have been wondering what the numbers look like relative to population, in order to compare which states have proportionally more reported rapes.


Basic plotting in R with ggplot2: A Beginner's Guide to Time Series

Graphical illustration is essential for good presentations, good seminar papers, and other forms of content-oriented communication and analysis. Humans are visual beings, and although there are those “just-show-me-numbers!” types out there, nearly everybody is happy – or at least not opposed – to see a good graph that makes it possible to intuitively comprehend what you are trying to say.


Analyzing Medical Expenditures with R - Part 1

Using a sample from the Medical Expenditures Panel Survey (MEPS), I examined the distributions of inpatient and outpatient medical expenditures with R.

I used the truehist() function in the MASS library to create the histogram below, which depicts outpatient expenditures.