Mosaic Plots

Comparing two or more Categorical Variables using Mosaic Plots

Mosaic plots are an excellent alternative to stacked bar plots. The major difference is that we are essentially drawing our data as a contingency table. Each sub-group is presented as a box (hence mosaic plot) with area proportional to its sample size. Notably, scales are absent. Relying on area rather than a common scale is one of the major disadvantages of pie charts, but as we have seen, even stacked bar charts don’t adequately solve this problem. In short, we are forced to encode proportions as areas, and judging the area and side lengths of boxes is certainly more intuitive than judging the slice area and arc length of circles.
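To make the link between a contingency table and box area concrete, here is a minimal sketch of how the tiles of a two-variable mosaic plot could be laid out in a unit square. The counts are hypothetical (they are not the data behind the figures in this chapter), and the function name `mosaic_tiles` is our own. The width of each column is the marginal proportion of the first variable, the height of each tile is the conditional proportion of the second variable, and so the area equals the share of the whole sample in that cell.

```python
# Hypothetical counts, made up for illustration only.
counts = {
    "dark hair":  {"brown eyes": 30, "blue eyes": 10},
    "light hair": {"brown eyes": 15, "blue eyes": 45},
}


def mosaic_tiles(counts):
    """Return (group, level, width, height) for each tile in a unit square."""
    total = sum(sum(row.values()) for row in counts.values())
    tiles = []
    for group, row in counts.items():
        group_total = sum(row.values())
        width = group_total / total       # marginal proportion of the group
        for level, n in row.items():
            height = n / group_total      # conditional proportion within the group
            tiles.append((group, level, width, height))
    return tiles


tiles = mosaic_tiles(counts)
for group, level, width, height in tiles:
    area = width * height                 # equals the cell count / grand total
    print(f"{group:10s} {level:10s} width={width:.2f} height={height:.2f} area={area:.2f}")
```

Because width times height reduces to the cell count divided by the grand total, the tile areas always sum to one, which is why no axis scale is needed to read the plot.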

Use mosaic plots to visualise proportional comparisons of multiple categorical variables

The following figure is a direct conversion of stacked bar charts into mosaic plots, showing all three pair-wise comparisons for sex, hair colour and eye colour.

The real strength of mosaic plots is their ability to represent three categorical variables simultaneously (Figure 1). This is the first plot that starts to tell the reader something interesting. The area of each box represents a combination of three variables, so all possible combinations can be viewed and compared in a single plot. A logical extension of the basic mosaic plot is to reveal our underlying statistical analysis, where we asked which sub-groups are over- or under-represented. Using a color scale, boxes (understood as sub-group intersections) are shaded according to their relative over- or under-representation in the data set as a whole (Figure 2). This communicates a story to the reader effectively, using appropriate encoding and coloring to convey a message. In this case, the story is that categories with high positive residuals are more frequent in the data than the independence model would predict, and categories with large negative residuals are less frequent.
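The residuals used for shading can be sketched in a few lines. The example below uses a two-way table rather than three variables to keep it short, the counts are hypothetical, and it assumes the expected counts come from a model of independence (row total times column total divided by the grand total). The function name `pearson_residuals` is our own, not a library call.

```python
from math import sqrt

# Hypothetical 2x2 counts, made up for illustration only.
observed = [
    [30, 10],   # dark hair:  brown eyes, blue eyes
    [15, 45],   # light hair: brown eyes, blue eyes
]


def pearson_residuals(observed):
    """Pearson residuals (observed - expected) / sqrt(expected) per cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        res_row = []
        for j, obs in enumerate(row):
            # Assumption: expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            res_row.append((obs - expected) / sqrt(expected))
        residuals.append(res_row)
    return residuals


residuals = pearson_residuals(observed)
for row in residuals:
    print([round(r, 2) for r in row])

# The squared residuals add up to the Pearson chi-squared statistic.
chi_squared = sum(r * r for row in residuals for r in row)
print(round(chi_squared, 2))
```

Positive residuals mark cells with more observations than expected and negative residuals mark cells with fewer, which is exactly the over- and under-representation that the shading is meant to show.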

Note that in Figure 2 and Figure 3, color is mapped to a continuous variable, something that we advised against earlier (Figure fig-Cont-Encoding). In this case it is permissible because the continuous scale (Pearson’s residuals) has been partitioned into discrete bins, making it an ordered categorical (interval) variable.
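The binning step can be illustrated as follows. This is a sketch under stated assumptions: the cut-offs of 2 and 4 in absolute value are a common convention for residual shading and are assumed here, not taken from the figures, and both the function `shade_bin` and the example residuals are hypothetical.

```python
def shade_bin(residual, cutoffs=(2.0, 4.0)):
    """Map a continuous residual to one of five ordered shading categories.

    Assumption: cut-offs at |2| and |4|; adjust to match the legend in use.
    """
    low, high = cutoffs
    magnitude = abs(residual)
    if magnitude < low:
        return "neutral"
    strength = "strong" if magnitude >= high else "moderate"
    direction = "over" if residual > 0 else "under"
    return f"{strength} {direction}-representation"


# Hypothetical residuals, made up for illustration only.
example_residuals = [-4.6, -2.3, 0.4, 2.8, 5.1]
bins = [shade_bin(r) for r in example_residuals]
for r, b in zip(example_residuals, bins):
    print(f"{r:5.1f} -> {b}")
```

Each bin then gets one fill color, so the reader only has to tell apart a handful of shades rather than read values off a continuous gradient.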

Figure 1: Mosaic plots with three variables (hair colour, eye colour and sex) depicting all possible combinations in the data set. Uniform shading.

Figure 2: Shading according to Pearson’s residuals.

We can go one step further in our representation of these three categorical variables and use area to represent the results from a Pearson \(\chi^{2}\) test as an association plot, as shown in Figure 3. The height of each box is proportional to the standardized Pearson residual and the width is proportional to the square root of the expected value for that category under the independence model. In each row, the baseline represents independence and each box is plotted on a common scale: boxes above the baseline are over-represented and boxes below it are under-represented. Here position and color clearly indicate which categories are over- and under-represented, and to what degree.
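The geometry of an association plot can be sketched directly from these definitions. The example below is illustrative only: it uses a hypothetical two-way table, assumes expected counts under independence, and uses the plain Pearson residual (observed minus expected, divided by the square root of expected) as the box height. The function name `association_boxes` is our own.

```python
from math import sqrt

# Hypothetical 2x2 counts, made up for illustration only.
observed = [
    [30, 10],   # dark hair:  brown eyes, blue eyes
    [15, 45],   # light hair: brown eyes, blue eyes
]


def association_boxes(observed):
    """Return one box per cell with its width and signed height."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    boxes = []
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Assumption: expected count under independence.
            expected = row_totals[i] * col_totals[j] / total
            boxes.append({
                "row": i,
                "col": j,
                "width": sqrt(expected),                      # box width
                "height": (obs - expected) / sqrt(expected),  # above or below baseline
                "difference": obs - expected,
            })
    return boxes


boxes = association_boxes(observed)
for box in boxes:
    signed_area = box["width"] * box["height"]
    print(box["row"], box["col"], round(box["width"], 2),
          round(box["height"], 2), round(signed_area, 2))
```

With this choice of width and height, the signed area of each box equals the difference between the observed and expected counts, so a large box far from the baseline means a large departure from independence.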

Figure 3: An association plot of all 32 possible hair colour/eye colour/sex combinations.
