Data Visualization

Data visualization is probably the visual mode that comes to mind first for most scientists, so it’s surprising how little emphasis is placed on the connection it makes between a strong understanding of the underlying data and the design elements of a suitable visualization. This is perhaps due to the persistent difficulty of producing elegant and appropriate visualizations, despite the explosion of available tools and the dramatic reduction in complexity over the past 10 years. In the following chapters we’ll look at both the data and the design elements, and at how they fit together.

Data visualizations also suffer from two seemingly contradictory yet interconnected problems that plague much communication in STEM fields.

First, STEMians are surprisingly gullible when presented with flashy, exciting, colorful, shiny, new, sexy media. We are easily taken in by interactive plots, which may not serve any purpose, and perhaps unsurprisingly we also love beautiful things, which can unfortunately distract from a critical assessment of the actual information presented.

Second, STEMians are biased to favor overly complex, dry, difficult visualizations that also fail to serve a purpose in communication. This mirrors the biases we see in scientific writing and presentations. I’ve lost count of how many students insist on making heatmaps although the data can be presented more easily and clearly with a different geometry, or of how eager students are to generate and see box plots and, recently, violin plots, without knowing what those plots actually show or whether they represent the data properly.

With these biases in mind, let us first discuss the purposes of data visualization, which mirror the purposes of the other modes of communication discussed in this book.