Use These 4 Easy Steps When Creating Your First Data Story
03/27/2019 by Jaime D’Agord Data Communication
Data storytelling has become a popular way to present data. Still, many data professionals struggle with the concept or don’t understand the value of using storytelling methods with their data. There are four steps to crafting a data story. If you can master these four steps, then you can tell a powerful story that persuades management to act.
As an example of the storytelling process, I’ll use one of the first data stories ever told. It dates to 1854, when a cholera outbreak claimed the lives of over 600 residents of London’s Soho district. This data story was told by Dr. John Snow, who is often credited with ending the outbreak and with being the father of epidemiology.
Of course, a data story starts with data – that seems like a ridiculous thing to even say. Most data professionals find the challenge is the overabundance of data. Your goal is to determine the key insights and then filter and group that data to get there.
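As a minimal sketch of what "filter and group" can look like in practice, the Python below counts deaths per water source from a handful of survey records. The records, field names, and the `deaths_by` helper are all invented for illustration; they are not Dr. Snow’s data or anyone’s actual method.

```python
from collections import Counter

# Hypothetical survey records, invented for illustration only.
survey = [
    {"street": "Street A", "water_source": "Pump A", "died": True},
    {"street": "Street A", "water_source": "Pump A", "died": True},
    {"street": "Street B", "water_source": "Pump A", "died": True},
    {"street": "Street B", "water_source": "Pump B", "died": False},
    {"street": "Street C", "water_source": "Pump B", "died": True},
    {"street": "Street C", "water_source": "Pump C", "died": False},
]

def deaths_by(records, key):
    """Filter to the records where someone died, then count them by `key`."""
    return Counter(r[key] for r in records if r["died"])

counts = deaths_by(survey, "water_source")

# The largest group is the candidate "key insight" to investigate further.
top_source, top_count = counts.most_common(1)[0]
print(top_source, top_count)
```

Grouping the same records by `"street"` instead of `"water_source"` gives a different cut of the data, which is the point: you try several groupings until one of them surfaces the insight.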
In the mid-1800s there was a cholera outbreak in London. Many people died and a local physician, John Snow, was trying to prevent more deaths. He didn’t have a handy Oracle database to study the data, so he had to walk door-to-door and survey residents. Through this data discovery process, he was able to make some observations about certain locations where the deaths occurred. The most notable area was Soho, in London’s West End, specifically on Broad Street.
The following figure, from Dr. Snow’s On the Mode of Communication of Cholera essay, uses data to show which districts were the most impacted. This data shows the deaths prior to the specific event Dr. Snow investigated and reveals that the southern London district was impacted by cholera before.
In modern times, it’s known that you can catch diseases, like cholera, from unclean water. This fact was not common knowledge in the mid-1800s. The prevailing view was that disease spread through bad air, or “miasma,” but the water was a different story. Think about it – it was a common practice for people to dump raw sewage into the Thames. They must have realized the sewage was nasty, but they may not have thought through how it might carry disease through the water.
When Dr. Snow suggested his water-based illness theory, it seemed like insanity, like someone suggesting that a minimum wage job at McDonald’s was more profitable than dealing drugs. (It is … but that’s a different data story.) Dr. Snow’s theory was an unlikely narrative, which made it compelling! In the Wikipedia article, the author notes that the idea of germs spreading from fecal matter to someone’s mouth was more than people could tolerate.
However, it’s these interesting storylines that get attention. When the data reveals something unexpected, it creates a captivating storyline that engages the viewer.
When you write a data story, you must consider what questions your audience will have and how your data answers those questions. Dr. Snow’s audience was the local government officials responsible for the water pump on Broad Street. In a letter to the editor of the Medical Times and Gazette, he said:
“With regard to the deaths occurring in the locality belonging to the pump, there were 61 instances in which I was informed that the deceased persons used to drink the pump water from Broad Street, either constantly or occasionally…
The result of the inquiry, then, is that there has been no particular outbreak or prevalence of cholera in this part of London except among the persons who were in the habit of drinking the water of the above-mentioned pump well.
I had an interview with the Board of Guardians of St James’s parish, on the evening of the [7 September], and represented the above circumstances to them. In consequence of what I said, the handle of the pump was removed on the following day.”
From this passage, we understand that he spoke to his audience directly. Notice that he limited the statistics to the two most convincing ones. Since the Broad Street pump was shut down the following day, he must have answered their questions succinctly.
I think Dr. Snow was a good communicator. His 30-page essay called On the Mode of Communication of Cholera described how he had collected the data, how the disease had progressed through the Soho district, and why he thought it was tied to the water supply. It’s an interesting narrative because as you read you understand the mystery and feel the impact the disease has on the community.
It’s written in very simple terms and is easy to follow. He was able to lay out the case for how the water supply was the most likely cause of the cholera outbreak. In today’s world, he would have needed social media, blog posts, and probably some YouTube videos to get his point across. At that time he had limited means of communicating.
Dr. Snow’s famous data visualization was created years later in a separate essay. The following figure shows the Broad Street pump responsible for the outbreak. The stacked bars indicate the deaths attributed to cholera. When we view the geo-based data visualization, it’s obvious what happened.
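The grouping behind a map like this can be sketched in code: assign each death to its nearest pump and count the results. This is only a rough illustration. The coordinates and pump names below are invented, not taken from Dr. Snow’s map, and straight-line distance is a simplifying assumption (walking distance along streets would differ).

```python
import math
from collections import Counter

# Hypothetical grid coordinates, invented for illustration only.
pumps = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0), "Pump C": (0.0, 10.0)}
deaths = [(1.0, 1.0), (2.0, 0.5), (0.5, 3.0), (9.0, 1.0), (1.5, 2.0)]

def nearest_pump(point, pumps):
    """Return the name of the pump closest to `point` by straight-line distance."""
    return min(pumps, key=lambda name: math.dist(point, pumps[name]))

# Count deaths per nearest pump, the tabular equivalent of the stacked bars.
deaths_per_pump = Counter(nearest_pump(d, pumps) for d in deaths)
print(deaths_per_pump.most_common())
```

A table of counts like this supports the same conclusion as the map, but the map lets the audience see the cluster for themselves, which is why the visualization is the more persuasive form.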
I suspect that after collecting the data he must have visualized it in his head to make the correlation with the Broad Street water pump. But it took Dr. Snow’s masterful storytelling to persuade an audience to act. You can do the same.