Pie Charts I

Showing parts-of-a-whole using Pie Charts and Stacked Bar Charts

We are often interested in understanding parts-of-a-whole, that is, the proportions of each group in a categorical variable. Molecular biologists will be familiar with a common parts-of-a-whole question concerning gene classifications: in a list of significantly over-expressed genes identified by a large screen, such as a microarray or mass spectrometry experiment, which gene families are more abundant than expected, and which are less abundant? As a simpler example, consider a single variable, hair color, in \(n = 592\) individuals.

Hair	N
Black	108
Brown	286
Red	71
Blond	127

A common way to view these data is a pie chart, which at this point is not too bad. There is only one variable, hair color, and there are only four groups to consider.
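As a minimal sketch of what either chart encodes, the proportions can be computed directly from the counts in the table above. The variable names are illustrative and not part of the original text.

```python
# Hair color counts from the table above.
hair_counts = {"Black": 108, "Brown": 286, "Red": 71, "Blond": 127}

# Total number of individuals in the sample.
total = sum(hair_counts.values())

# Proportion of each group, i.e. its part of the whole.
proportions = {hair: n / total for hair, n in hair_counts.items()}

for hair, p in proportions.items():
    print(f"{hair:<6}{p:6.1%}")
```

Both a pie chart and a stacked bar chart display exactly these four proportions; they differ only in how each proportion is drawn.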

Figure 1: a pie chart and a stacked bar chart of the hair color of 592 individuals.

In this case a stacked bar chart, although easier to read, is not that much more informative than the pie chart. This changes quickly once we start adding variables, like eye color and sex. If we continue along these lines we’ll end up with a collection of pie charts and bar charts which fail to serve the purpose of

Males
Hair	Eye	N
Black	Brown	32
Brown	Brown	53
Red	Brown	10
Blond	Brown	3
Black	Blue	11
Brown	Blue	50
Red	Blue	10
Blond	Blue	30
Black	Hazel	10
Brown	Hazel	25
Red	Hazel	7
Blond	Hazel	5
Black	Green	3
Brown	Green	15
Red	Green	7
Blond	Green	8

Females
Hair	Eye	N
Black	Brown	36
Brown	Brown	66
Red	Brown	16
Blond	Brown	4
Black	Blue	9
Brown	Blue	34
Red	Blue	7
Blond	Blue	64
Black	Hazel	5
Brown	Hazel	29
Red	Hazel	7
Blond	Hazel	5
Black	Green	2
Brown	Green	14
Red	Green	7
Blond	Green	8

Beginning with

So the real question we are interested in is not just the proportions within a categorical variable, but which groups are over- or under-represented. This would require that we have some a priori

The implicit question is whether any sub-groups are over- or under-represented.

As an example, we will consider the results for a survey of male and female hair and eye colour. We want to uncover and represent any biases in the data set which may reveal previously unappreciated associations between the three variables: sex, hair colour and eye colour.
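One way to make "over- or under-represented" concrete is to compare each observed count with the count expected under some reference model. The sketch below assumes independence of hair and eye colour as that reference, and pools the two sexes; both choices are assumptions made for illustration, not something the text prescribes.

```python
hair = ["Black", "Brown", "Red", "Blond"]
eyes = ["Brown", "Blue", "Hazel", "Green"]

# Counts from the two tables above: eye colour -> counts in the order of `hair`.
counts = {
    "Male": {
        "Brown": [32, 53, 10, 3],
        "Blue": [11, 50, 10, 30],
        "Hazel": [10, 25, 7, 5],
        "Green": [3, 15, 7, 8],
    },
    "Female": {
        "Brown": [36, 66, 16, 4],
        "Blue": [9, 34, 7, 64],
        "Hazel": [5, 29, 7, 5],
        "Green": [2, 14, 7, 8],
    },
}

# Pool the sexes to get an observed hair-by-eye table.
observed = {
    (h, e): sum(counts[sex][e][i] for sex in counts)
    for i, h in enumerate(hair)
    for e in eyes
}
n = sum(observed.values())

# Marginal totals for each hair colour and each eye colour.
hair_total = {h: sum(observed[h, e] for e in eyes) for h in hair}
eye_total = {e: sum(observed[h, e] for h in hair) for e in eyes}

# Expected count if hair and eye colour were independent (assumed reference).
expected = {(h, e): hair_total[h] * eye_total[e] / n for h in hair for e in eyes}

# Ratio above 1: over-represented; ratio below 1: under-represented.
ratio = {key: observed[key] / expected[key] for key in observed}

for (h, e), r in sorted(ratio.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{h:<6}{e:<6} observed {observed[h, e]:>3}  "
          f"expected {expected[h, e]:6.1f}  ratio {r:.2f}")
```

Sorting by the ratio puts the most over-represented combinations first and the most under-represented last, which is the kind of ranking that plain counts do not reveal at a glance.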

As we have seen, bar charts are usually the first choice for nominal comparisons (fig. @ref(fig:bar-counts)). However, the major deficiency here is that we are presenting the absolute values, when we are really interested in the parts-of-a-whole distribution. Bar charts don’t reveal interesting trends, such as over- or under-representation, without a large time investment from the reader.

As an alternative to the bar chart, pie charts and stacked bar charts are commonly used for representing parts-of-a-whole. Unfortunately, both have major drawbacks in visual perception and neither excels at communicating a clear story. Here we show how these data are typically presented; the next sub-section provides a solution using mosaic plots.

Use pie charts and stacked bar charts with caution

¹ Your readers will preferentially and intuitively compare different aspects of a pie chart. Regardless of their particular preference, imagine the difficulty in calculating the angle (\(\theta\)), the area (\(\pi r^{2}\theta/360\)), or the arc length (\(\pi r\theta/180\)), with \(\theta\) in degrees, which is what you are effectively asking your readers to do in their minds.
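To illustrate what the reader is being asked to estimate, the sketch below computes all three quantities for the hair color counts. It takes \(\theta\) in degrees, so the sector area is \(\pi r^{2}\theta/360\) and the arc length is \(\pi r\theta/180\) (equivalently \(\theta r^{2}/2\) and \(r\theta\) for \(\theta\) in radians). The unit radius is an arbitrary assumption.

```python
import math

# Hair color counts from the first table.
hair_counts = {"Black": 108, "Brown": 286, "Red": 71, "Blond": 127}
total = sum(hair_counts.values())

r = 1.0  # radius of the pie, arbitrary unit (assumption)

slices = {}
for hair, n in hair_counts.items():
    theta = 360 * n / total  # central angle in degrees
    slices[hair] = {
        "angle": theta,
        "area": math.pi * r**2 * theta / 360,  # sector area
        "arc": math.pi * r * theta / 180,      # arc length
    }

for hair, s in slices.items():
    print(f"{hair:<6} angle {s['angle']:6.1f}  "
          f"area {s['area']:.3f}  arc {s['arc']:.3f}")
```

For a fixed radius all three quantities are proportional to the count, so they carry the same information; the difficulty lies in judging any of them accurately by eye.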

Pie charts are particularly appealing for representing parts-of-a-whole data sets since they intuitively tell the reader that all parts add up to 100%. In addition, the sample size can be encoded in the radius of the circle (Figure fig-bar-and-pie). The major disadvantage of pie charts is that they encode values using slice area, arc length or angle at the center, all of which are fairly inaccurate methods of encoding quantitative information (see Figure fig-cont-encoding).

Perhaps the only instance where pie charts are suitable is for representing large quantitative differences in a small number of groups, which raises the question of whether a visualization is even necessary.

Figure 3: Multiple pie charts, stacked bar plots and a filled bar chart compared. Each set contains information on 32 unique combinations of hair, eye and sex. The multiple pie charts have become unwieldy. Both bar chart variants are acceptable, each containing different information.

Stacked bar charts, plotted on a relative scale, depict the relative proportions of each sub-group in a categorical variable (Figure fig-stacked-bar-plots). This provides a common scale of relative abundances. Similar to the radius of pie charts, we can encode the sample size in the width of each bar.

Stacked bar charts are an improvement over the pie charts since at least some of the sub-groups are plotted on a common scale. However, since we have four categories, only the two at the bottom and top of the bar chart benefit from this feature. In Figure fig-stacked-bar-plots we have plotted the three variables as three pair-wise plots. Although all our data is visualized, these plots fail to really tell a story. We still don’t know which sub-groups are over- or under-represented, and the relationship between hair and eye color is not displayed; a third plot would be required for that.
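The common-baseline point can be checked numerically. The sketch below uses the male counts only and stacks hair color within each eye color; that pairing, the stacking order, and all names are assumptions chosen for illustration. It computes where each segment starts and ends on the relative scale, with bar width proportional to sample size.

```python
from itertools import accumulate

hair = ["Black", "Brown", "Red", "Blond"]  # stacking order, bottom to top

# Male counts from the table above: eye color -> counts in the order of `hair`.
male = {
    "Brown": [32, 53, 10, 3],
    "Blue": [11, 50, 10, 30],
    "Hazel": [10, 25, 7, 5],
    "Green": [3, 15, 7, 8],
}

n_male = sum(sum(row) for row in male.values())

segments = {}
for eye, row in male.items():
    total = sum(row)
    upper = [c / total for c in accumulate(row)]  # top edge of each segment
    lower = [0.0] + upper[:-1]                    # bottom edge of each segment
    segments[eye] = {
        "width": total / n_male,  # bar width encodes sample size
        "bounds": dict(zip(hair, zip(lower, upper))),
    }

for eye, bar in segments.items():
    print(f"{eye} eyes (relative width {bar['width']:.2f})")
    for h, (lo, hi) in bar["bounds"].items():
        print(f"  {h:<6}{lo:.2f} to {hi:.2f}")
```

In every bar the bottom segment starts at 0 and the top segment ends at 1, so those two can be compared against a shared baseline. The middle segments start at a different height in each bar, so their lengths have to be compared without one.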
