Saturday, June 8, 2024

NewsHorizon



Foundations in Data Analysis for Undergraduate STEM Students


Photo of the Aurora Borealis in Iceland

The blog Editors’ Vox is produced by the Publications Department of AGU.

A core part of undergraduate STEM education is understanding sets of numerical values and the methods for manipulating, plotting, and comparing them. Yet most programs provide only a basic overview of data analysis techniques, data visualization, and the treatment of uncertainty.

AGU’s Advanced Textbook Series has released a new book, “Data Analysis for the Geosciences,” which offers a thorough introduction to data analysis, visualization, and comparing data models and metrics. The book also addresses the uncertainty surrounding data values. We interviewed the author for a brief overview of the textbook and its potential applications.

I understand that many students may complain about being required to take a statistics course. As a teacher and through the use of this textbook, how do you make the process of analyzing data more intriguing and captivating for students?

I often describe this class to my students as the “broccoli” of geoscience. Its content is a superfood for analysis that will make them better scientists, but it is best enjoyed with a generous serving of cheese. In this case, the “cheese” is a set of examples from Earth, atmosphere, space, and planetary studies, which are integrated throughout the book and the course. These examples appear not only in class assessments but also in every homework and exam question. My goal is to demonstrate the practical application of seemingly obscure statistical concepts, so that my students can see how these concepts will benefit their future research projects.

What sets your textbook apart from other standard undergraduate textbooks and courses on statistics and applied statistics for STEM students?


I concentrate on when, where, and how statistical formulas should be applied. When introducing a concept, I stress the scenarios in which each statistical method works best and provide guidance on selecting the most appropriate one for a given situation. I include derivations of equations only where they help students understand the limitations of these tools.

There are numerous statistical formulas available for analyzing data sets or comparing sets of numbers. Therefore, I make an effort to clarify the limitations that must be considered when using and interpreting the results obtained from these equations.

Why is it crucial for STEM students to possess a wide range of data analysis tools and techniques?

In careers related to science, technology, engineering, and math, we often rely on analyzing or comparing data sets in order to make decisions. However, the human mind struggles to comprehend a long list of numbers, whether it contains a million values or even a hundred. In these cases, we must use techniques that condense the data into a few understandable numbers. While there are commonly used formulas for this purpose, it is important to acknowledge their limitations and understand the assumptions behind them. If these assumptions are not met, our interpretation of the data may be inaccurate. Additionally, the decision we need to make may not be well served by the standard metrics, in which case alternative measures should be considered. Being aware of these alternative techniques makes it possible to choose the most suitable one for a specific purpose.
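As a minimal sketch of this point, the Python snippet below condenses a short list of numbers into a few summary values. The data are hypothetical and chosen only for illustration: nine similar values plus one extreme value. The mean and standard deviation implicitly assume the values are spread roughly symmetrically around a typical value; when that assumption fails, as it does here, the median and interquartile range are one possible alternative. The example is not taken from the book.

```python
import statistics

# Hypothetical measurements (illustrative only): nine similar values
# and one extreme value that breaks the "roughly symmetric" assumption.
values = [2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 1.8, 2.0, 2.1, 50.0]

# Standard summary: mean and sample standard deviation.
mean = statistics.mean(values)
stdev = statistics.stdev(values)

# Alternative summary that is less sensitive to extreme values:
# median and interquartile range (IQR).
median = statistics.median(values)
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

print(f"mean   = {mean:.2f}, stdev = {stdev:.2f}")
print(f"median = {median:.2f}, IQR   = {iqr:.2f}")
```

In this made-up list, the single extreme value pulls the mean well above every other measurement, while the median stays close to the bulk of the data. Which summary is appropriate depends on the question being asked.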

How does uncertainty factor into data analysis?

Without knowing the uncertainty of each number, it is not possible to compare them. For instance, if I analyze two seemingly similar data sets and find their means to be 8 and 10, can I consider them equivalent? It is impossible to say without uncertainty estimates. If the uncertainties are 5 and 6, the two means are consistent with each other. If the uncertainties are 0.5 and 0.6, they are clearly different. In essence, any statistical analysis relies on uncertainty estimates to give meaning to the underlying values.
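The comparison of 8 and 10 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the book's method: the uncertainties are treated as independent one-sigma values that add in quadrature, and the cutoff of 2 combined sigmas is an arbitrary illustrative threshold. The function names are hypothetical.

```python
import math


def separation_in_sigma(mean_a, sigma_a, mean_b, sigma_b):
    """Return the gap between two values in units of their combined uncertainty.

    Assumption: the uncertainties are independent one-sigma values,
    so they are combined in quadrature.
    """
    combined = math.sqrt(sigma_a**2 + sigma_b**2)
    return abs(mean_a - mean_b) / combined


def distinguishable(mean_a, sigma_a, mean_b, sigma_b, threshold=2.0):
    """True if the gap exceeds `threshold` combined sigmas.

    The default threshold of 2.0 is an illustrative choice, not a value
    from the article.
    """
    return separation_in_sigma(mean_a, sigma_a, mean_b, sigma_b) > threshold


# Means of 8 and 10 with large uncertainties (5 and 6).
wide = separation_in_sigma(8, 5, 10, 6)
# The same means with small uncertainties (0.5 and 0.6).
narrow = separation_in_sigma(8, 0.5, 10, 0.6)

print(f"uncertainties 5 and 6:     {wide:.2f} sigma")    # 0.26 sigma
print(f"uncertainties 0.5 and 0.6: {narrow:.2f} sigma")  # 2.56 sigma
```

With the larger uncertainties, the gap of 2 is a small fraction of the combined uncertainty, so the means cannot be told apart. With the smaller uncertainties, the same gap is about two and a half times the combined uncertainty.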

Why is having a solid understanding of data analysis, comparison, and visualization crucial for students?

My research primarily involves analyzing data sets and numerical model output. I frequently compare simulation runs to real event data or compile data from longer time periods.


Being able to analyze a group of numbers using charts, basic calculations, and advanced comparison methods is crucial in scientific studies. A course that teaches students how to visualize and compare sets of numbers not only lays a foundation for a successful STEM career but also helps them evaluate the statistics we all encounter in everyday life.

Who else might benefit from your book besides STEM students?

The method of beginning with a plot, then progressing to fundamental analysis, and ultimately to more specialized comparative approaches can be applied to various fields. For instance, it serves as the foundation for engineering system analysis and making financial choices.

I hope the book appeals to anyone who compares two sets of numbers for any data science application, not only in the traditional areas of STEM but also in medicine, the social sciences, and economics.

I also hope that students from other fields of study find the book’s Earth and space science examples to be engaging glimpses of the natural environment.

Data Analysis for the Geosciences: Essentials of Uncertainty, Comparison, and Visualization, 2023. ISBN: 978-1-119-74789-5. List price: $149.95 (hardcover), $120.00 (e-book)

You can access Chapter 1 for free. Just go to the book’s page on Wiley.com and select “Read an Excerpt” located below the cover image.


M. W. Liemohn (ORCID 0000-0002-7039-2631), University of Michigan, United States

AGU Publications has a policy of requesting that the authors or editors of recently published books compose a summary for Eos Editors’ Vox.

Reference: Liemohn, M. W. (2023), Data analysis fundamentals for undergraduate STEM students, Eos, 104, https://doi.org/10.1029/2023EO235031. Published on October 31, 2023.

The views expressed in this article do not reflect those of AGU, Eos, or any related organizations. They are solely the opinions of the author(s).

The text is copyright 2023 by the authors and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 license.

Unless explicitly stated, images are protected by copyright and cannot be used without the explicit permission of the copyright owner.