Saturday, December 18, 2010

Visualization of the data and animation - part II

I had written a piece earlier about Hans Rosling's animation of country-level data using the Gapminder tool. Here are some more extremely cool examples of data animation.

At the start of this series, there is more animation from the Joy of Stats program that Rosling hosted on the BBC. The landing page shows crime data plotted over downtown San Francisco, and how this visual overlay on the city's topography provides valuable insights into where one might expect to find crime. This is a valuable tool for police departments (to try to prevent crime that is local to an area and has some element of predictability), residents (to research neighbourhoods before they buy property, for example) and tourists (who might want to double-check a part of the city before deciding on a really attractive hotel deal). The researchers in the clip, who created the tool that overlays the crime data on maps, talk about how tools such as this can be used to improve citizen power and government accountability. Another good example of crime data, this time reported by police departments across the US, can be found here. Finally, towards the end of the clip, the researchers mention what could be the Holy Grail of this kind of visualization: real-time data posted on social media and networking sites like Facebook and Twitter (geo-tagged, perhaps) could provide a real-time feed into these maps. This would certainly have been in the realm of science fiction only a few years back, but now it no longer seems so impossible.
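To give a rough feel for the kind of aggregation that could sit behind a crime map like this, here is a minimal sketch in Python. It is not the researchers' method: the function name `bin_incidents`, the grid cell size and the coordinates are all made up for illustration. The idea is simply to snap each geo-tagged incident to a small grid cell and count how many fall in each cell, so that the busiest cells stand out as hotspots.

```python
import math
from collections import Counter


def bin_incidents(points, cell_deg=0.005):
    """Count incidents per grid cell.

    points   : iterable of (latitude, longitude) pairs in decimal degrees
    cell_deg : side of a grid cell in degrees (an arbitrary choice here)
    """
    counts = Counter()
    for lat, lon in points:
        # Snap the point to the integer index of the cell that contains it.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates near downtown San Francisco, for illustration only.
incidents = [
    (37.7801, -122.4102),
    (37.7803, -122.4104),
    (37.7849, -122.4071),
    (37.7802, -122.4101),
]

counts = bin_incidents(incidents)
# The single busiest cell and its incident count.
hotspots = counts.most_common(1)
print(hotspots)
```

A real-time feed of the sort the researchers describe would, in principle, just keep adding points to counts like these as new geo-tagged reports arrive.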

The San Francisco crime mapping link has a few other really impressive videos as you scroll further down. I really like the one on Florence Nightingale, whose graphs during the Crimean War helped reveal important insights into how injuries and deaths were occurring in hospitals. It is interesting to know that the Lady with the Lamp was not just renowned for tending to the sick, but was also a keen student of statistics. Her graphs separated deaths that were accidental, those caused by war injuries and wounds, and those that were preventable (caused by the poor hygiene that was quite prevalent at the time). They created very powerful imagery of the high incidence of preventable deaths and the need to address this area with the right focus.

Why are visualization and animation of data helpful, and such critical tools in the arsenal of any serious data scientist? For a few reasons.

For one, it helps tell a story far better than equations or tables of data do. That is essential for conveying the message to people who are not necessarily experts with insight into the tables, but who are nevertheless important influencers and stakeholders and need to be educated on the subject. Think of how an advertisement (either a picture or a moving image) is more powerful in conveying the strength of a brand than boring old text.

The other reason, in my opinion, is that graphical depiction and visualization of the data allow the human brain (which is far more powerful than any computer at pattern recognition) to take over the part of data analysis that it is really good at and computers are generally not so good at: forming hypotheses on the fly about the data being displayed, and reaching conclusions based on visual patterns in the data. There is also our ability to hook into remote memory banks within our brains and form linkages. While Machine Learning and AI are admirable goals, there is still some way to go before computers can match the sheer ingenuity and flexibility of thought that the human brain possesses.

