Kenan Fellows Program

Understanding Data Mining: Extracting, Organizing, and Analyzing Large Sets of Data

Summary

Large sets of data, made accessible by new technology, are essential to forecasting trends in business and economics. In Algebra I, students typically study data sets with one predictor variable and one response variable. In the real world, however, most response variables have many predictors, each of which may significantly affect the response. It is important to be able to identify the effects of these predictors and use them appropriately to make sound, valid predictions.
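As a point of reference for the one-predictor case described above, here is a minimal sketch of a least-squares line fit. It is written in plain Python rather than R so that it runs without extra packages. The data values and the names `hours`, `scores`, and `fit_line` are invented for illustration and do not come from the lessons.

```python
# Hypothetical data, invented for illustration: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of the predictor.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)

# Use the fitted line to predict the response for a new predictor value.
predicted = slope * 6 + intercept

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

With several predictors, the same idea extends to multiple regression, where each predictor receives its own coefficient.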

As a result of this project, mathematics students in grades nine through twelve will be able to extract useful information from large sets of data that represent multiple disciplines. Using these real-world applications, students will analyze data and use their findings to make predictions and to provide solutions to problems.

These lessons are designed to help Algebra I students learn the basics of data mining and then determine which variables are most influential in a given situation. Students will also use the R statistical software to help with variable selection.
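One simple way to get a first impression of which variables are most influential is to compare how much of the variation in the response each predictor explains on its own. The sketch below ranks predictors by the squared correlation coefficient (r squared). This is an assumed screening approach for illustration, not necessarily the procedure used in the lessons. It is written in plain Python rather than R, and the data set and the names `data`, `test_score`, `correlation`, and `rank_predictors` are hypothetical.

```python
from math import sqrt

# Hypothetical data set, invented for illustration only.
data = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "hours_slept": [8, 6, 7, 5, 7, 6],
    "shoe_size": [9, 7, 10, 8, 9, 8],
}
test_score = [55, 60, 68, 70, 80, 83]


def correlation(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def rank_predictors(predictors, response):
    """Return (name, r_squared) pairs sorted from most to least explanatory."""
    scored = [
        (name, correlation(values, response) ** 2)
        for name, values in predictors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_predictors(data, test_score)
for name, r_squared in ranking:
    print(f"{name:15s} r^2 = {r_squared:.3f}")
```

Ranking predictors one at a time ignores relationships among the predictors themselves, so it is only a starting point. In R, a model with several predictors is typically fit with `lm()`.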

Learning Outcomes

The following goals from the North Carolina Standard Course of Study are addressed:

Algebra I:

3.03 Create linear models for sets of data to solve problems.

Algebra II:

2.04 Create and use best-fit mathematical models of linear functions to solve problems involving sets of data.

Technical Math II:

2.03 Create, interpret, and analyze best-fit models of linear functions to solve problems.

Discrete Mathematics:

1.01 Create and use calculator-generated models of linear functions of bivariate data to solve problems.

Pre-Calculus:

2.03 For sets of data, create and use calculator-generated models of linear functions.

Integrated Math I:

3.03 Create linear models, for sets of data, to solve problems.

AP Statistics:

4.01 Analyze bivariate data.

The following principles and standards of the National Council of Teachers of Mathematics are supported:

  • Use mathematical models to represent and understand quantitative relationships
  • Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them
  • Select and use appropriate statistical methods to analyze data
  • Develop and evaluate inferences and predictions that are based on data
  • Build new mathematical knowledge through problem solving
  • Recognize and apply mathematics in contexts outside of mathematics