The designers of every new visualization think they’ve got the best form to convey information, and they’ve thought that for more than 200 years. William Playfair was no different. In 1801 he represented the relative proportions of African, Asiatic, and European dominions of the Ottoman Empire with a clever diagram, a sectored circle that showed the whole and its parts simultaneously. He used it twice in his Statistical Breviary to describe geopolitical divisions, and in another book to show the relative areas of regions of the United States.
It wasn’t immediately popular, but the pie chart, as it was later called, gave visual metaphor to a concept that numbers had difficulty conveying. It became the default means of collating and displaying data pictorially. Unfortunately, these days few people know which types of data are appropriate for the chart, or even for visualizing at all. The pie chart is intended to display proportions of a whole within a single, small data set, but overzealous Excel users dump in large data sets or stack multiple pies. The resulting complexity defeats the purpose of using a picture: simplification. The charting features of Excel, which has had its share of critics (see Edward Tufte’s scathing essay on its flaws), were only the beginning. For example, on ManyEyes.com, an IBM project aiming to “democratize” visualization, the visual representations include pies, bar charts, bubble charts, Wordles (tag clouds with color and whimsical arrangement), word trees, tree maps, and network maps. On VisualComplexity.com, network maps abound. Many visualization types have cropped up just in the past two decades, riding the growth of the internet. But they nevertheless share many characteristics with the garden-variety pie chart, including some of its primary weaknesses and a slew of new ones. Recognizing them will move science closer to tools that work for users, rather than the other way around.
The flood of data from many scientific fields can seem too great to process in the usual way. “How can we now cope with a large amount of data and still do a thorough job of analysis so that we don’t miss the Nobel Prize?” says Bill Cleveland, a Bell Labs and Purdue statistician. The response is often to resort to visualizations intended to simplify reams of information. “Lots of people think, oh, visualization is going to save us, but there may not be a mapping that makes sense for everything,” says Colin Ware, director of the Data Visualization Research Lab at the Center for Coastal and Ocean Mapping at the University of New Hampshire. In the pursuit of new, potentially more powerful graphics, users visualize data sets regardless of whether a picture is appropriate.
Playfair rightly intuited that visual representations of data can enable people to make comparisons more easily. Many psychoperceptual studies have explored the human mind’s aptitude for gleaning information from pictures. Unfortunately, the pie chart relies on tasks that we humans systematically fail to perform accurately: the exercises that sit at the bottom of the hierarchy of perceptual tasks, formalized by Cleveland in a landmark 1984 paper. So although we’re good at comparing linear distances along a scale (judging which of two lines is longer, a task used in bar graphs) and we’re even better at judging the position of points along a scale, pie charts don’t bring those skills to bear. They do ask us to compare angles, but we tend to underestimate acute angles, overestimate obtuse angles, and take horizontally bisected angles as much larger than their vertical counterparts. The problems worsen when we’re asked to judge area and volume: Regular as clockwork, we overestimate the size of smaller objects and underestimate the size of larger ones, to a much greater degree with volume than with area.
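To make the contrast concrete, here is a minimal sketch of the alternative the perceptual hierarchy favors: the same parts of a whole drawn as lengths along one common scale rather than as angles. It is an illustration only, not a method from Cleveland's paper; the function name `bars_on_common_scale`, the `width` parameter, and the sample figures are all hypothetical.

```python
def bars_on_common_scale(parts, width=40):
    """Render parts of a whole as text bars sharing one scale.

    `parts` maps a label to a non-negative number. Each bar's length is
    proportional to that part's share of the total, so shares are compared
    by length along a common baseline instead of by angle.
    """
    total = sum(parts.values())
    label_width = max(len(label) for label in parts)
    lines = []
    # Sort largest share first so the ranking can be read top to bottom.
    for label, value in sorted(parts.items(), key=lambda kv: kv[1], reverse=True):
        share = value / total
        bar = "#" * round(share * width)
        lines.append(f"{label:<{label_width}} | {bar} {share:.0%}")
    return lines


if __name__ == "__main__":
    # Hypothetical shares, invented purely for illustration.
    sample = {"region_a": 50, "region_b": 30, "region_c": 20}
    for line in bars_on_common_scale(sample, width=20):
        print(line)
```

With only three parts, either form is readable; the difference shows up when two shares are close, because a difference in bar length is easier to judge than a difference in wedge angle.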
Newer visualizations can have these defects and more. Ian Spence, a psychologist at the University of Toronto, recalls a pair of tag clouds from the third US presidential debate, each showing the frequency of words used throughout. This conveyed which buzzwords each candidate used, but the impression of the relative quality of either side was entirely lost. Designers frequently violate the perceptual-task hierarchy. Popular charts like bubble charts and Wordles use the task of area comparison (a black mark in the perceptual psychologist’s book), and both use meaningless colors and arrangement, complicating the display without adding data. They seem to convey information, but the design stifles the transmission. They are fuzzy charts.
Some of the most confusing new visualizations are the popular network diagrams, which are intended to show connections between nodes and invite inferences about the forces that govern the connections. Numerous groups have produced maps of social networks, internet traffic, and other complicated phenomena, but the impression one gets is merely of connectivity, rather than of any of the patterns the visualization purports to convey. Few obey the principles of perception-informed design or Edward Tufte’s rules for graphical integrity, which state that graphics should make viewers think about the subject matter, not design.
One way to solve the problem of overly complicated diagrams is to introduce interactivity. For an expert using visualization to analyze data, interactive displays may be much more useful than inert maps. A flat picture of a network does not suffice. “A network diagram will be completely unusable at about 30 nodes,” data visualizer Ware says. “But if you have simple interaction techniques, you can work up to a few thousand.” For those who work with networks, zooming in on a node to observe its connections with others nearby can mean the difference between a useless tangle and a successful tool.
Iterated simplicity, however, may turn out to be an even bigger breakthrough. Cleveland finds that an effective way of detecting patterns in massive data sets is to make a simple chart for each subset and view the hundreds of charts in quick succession. The human perceptual system is phenomenally good at spotting patterns, he notes: “We take our vision processes for granted. But there’s nothing that humans have created that’s more complex and amazing than the visual perceptual system in the first 100 milliseconds.” Skimming through these visual databases, he’s found, can be much more effective than complicated visualization; Cleveland is working on a protocol to share with others soon. Yet even as he advocates the use of visualization databases, he emphasizes that numerical tools — statistical tests of variance and significance — are just as important in assessing trends. Current enthusiasm for putting numbers into pictures sometimes obscures the fact that science is, after all, a quantitative pursuit, and an image alone cannot replace numbers.
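The idea of one simple chart per subset can be sketched in a few lines. This is not Cleveland's protocol, which the article does not describe; it is a rough illustration that groups hypothetical `(subset, value)` records and draws a one-line text chart for each group. The names `sparkline` and `charts_by_subset` and the sample data are assumptions, and each chart is scaled to its own range, which is a simplification (a shared scale would make subsets directly comparable).

```python
from collections import defaultdict

# Eight block characters, from lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line text chart."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show; draw it at the lowest level.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - lo) / span * top)] for v in values)


def charts_by_subset(records):
    """Group (subset, value) pairs and draw one small chart per subset."""
    groups = defaultdict(list)
    for subset, value in records:
        groups[subset].append(value)
    # Sorted keys give a stable order for paging through the charts.
    return {subset: sparkline(groups[subset]) for subset in sorted(groups)}


if __name__ == "__main__":
    # Hypothetical measurements, invented purely for illustration.
    records = [
        ("site_a", 1), ("site_a", 3), ("site_a", 2), ("site_a", 5),
        ("site_b", 9), ("site_b", 4), ("site_b", 4), ("site_b", 1),
    ]
    for name, chart in charts_by_subset(records).items():
        print(f"{name:>8} {chart}")
```

Because every chart has the same simple form, a reader can skim many of them in sequence and let an odd shape stand out, then turn to the numbers to check whether the pattern is real.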
While modern designers keep inventing more and more creative visualizations, the true frontiers this year may be much more modest — learning the limitations of graphics, and using perception-informed design and interactive techniques to make the most of the forms that already exist. The pie chart, which has borne the scorn of perceptual psychologists for decades, may fail in some respects, but modern visualization has in many ways failed to learn from its mistakes. Above all, we should remember that throwing data into a chart is not always the route to greater understanding. Says Spence: “There is a place for tables in the world.”
Originally published February 18, 2009