Q1 Ch 1 AP Statistics Objectives

Mr. Rogers - AP Statistics Objectives



Chapter 1: Exploring Data

AP Statistics Standards
I. Exploring Data: Describing patterns and departures from patterns (20%–30%)

A. Constructing and interpreting graphical displays of distributions of univariate data (dotplot, stemplot, histogram, cumulative frequency plot)

Center and spread

Clusters and gaps

Outliers and other unusual features

Shape

B. Summarizing distributions of univariate data

Measuring center: median, mean

Measuring spread: range, interquartile range, standard deviation

Measuring position: quartiles, percentiles, standardized scores (z-scores)

Using boxplots

The effect of changing units on summary measures

C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots)

Comparing center and spread: within group, between group variation

Comparing clusters and gaps

Comparing outliers and other unusual features

Comparing shapes

Objectives

Essential Question: How many numbers are needed to describe a complex event or object?

Introduction--what is statistics about?

Given a complex system or object, describe it adequately with a limited number of indicators or measurements.

exercise

describe yourself with 2 words and 2 numbers. See if your classmates can ID you from the description.

Formative assessment: What did you learn about the power of indicators or measurements from doing the above exercise?

State the key elements used for answering a research question in a statistically acceptable manner. Statistical analysis is an internationally recognized way of answering research questions and communicating data. It is a powerful international communication tool.

Design--the systematic way in which the data is collected.

Analysis--the systematic use of graphical and mathematical tools to describe and evaluate the data.

Conclusions--the systematic manner in which inferences are drawn from the data and uncertainties are evaluated.
Evaluate information to determine if it is anecdotal evidence. Anecdotal evidence is based on data that's collected in a haphazard manner. It usually consists of a small sample size, often a single data point, frequently chosen for emotional impact.

Evidence consisting of a single data point is always considered anecdotal

Conclusions based on anecdotal evidence are not statistically defensible

Homefun (formative/summative assessment): Find an article that uses anecdotal evidence. Briefly describe the evidence and how it is used. Provide a reference to the source.

Essential Question: Is data always expressed as numbers?

State the difference between categorical and quantitative variables and give examples of each.

quantitative variable: consists of numerical values that could reasonably be averaged.

height

weight

age

categorical variable: a classification system

zip codes

grade (freshmen, sophomores, juniors, seniors)

size (small, medium, large)

Note: Categorical data is displayed only with bar graphs or pie charts.

Evaluate the effectiveness of bar charts and other graphs.

exercise

Write a conclusion inferred from each chart: Chart 1, Chart 2

Formative assessment: Evaluate the effectiveness of the above charts

Create frequency tables for categorical data. In other words, convert the "count" data to % data.
Convert the above tables into bar charts. A 2-way table will contain:

a vertical and a horizontal marginal distribution

multiple conditional distributions
Use conditional distributions based on relative frequencies to establish associations. This is typically done by comparing bar charts of the distributions.

associations: a pattern exists between the values of one variable and the values of another. Association does not establish that one variable causes the other.
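
The arithmetic behind marginal and conditional distributions can be sketched in a few lines of Python. This is only an illustration: the counts, the category names, and the helper function names are all made up for the example and do not come from the textbook exercises.

```python
# Hypothetical counts, invented only to illustrate the arithmetic.
# Rows are one categorical variable (grade), columns another (ride to school).
counts = {
    "freshmen": {"bus": 30, "car": 10},
    "seniors": {"bus": 10, "car": 30},
}

def row_totals(table):
    """Total count in each row."""
    return {row: sum(cells.values()) for row, cells in table.items()}

def column_totals(table):
    """Total count in each column."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return totals

def to_percent(totals):
    """Convert counts to percentages of the grand total (a marginal distribution)."""
    grand = sum(totals.values())
    return {key: 100 * n / grand for key, n in totals.items()}

def conditional_distributions(table):
    """Percent in each column, computed separately within each row."""
    return {
        row: {col: 100 * n / sum(cells.values()) for col, n in cells.items()}
        for row, cells in table.items()
    }

row_marginal = to_percent(row_totals(counts))
column_marginal = to_percent(column_totals(counts))
conditional = conditional_distributions(counts)

print("marginal distribution of grade:", row_marginal)
print("marginal distribution of ride:", column_marginal)
for grade, dist in conditional.items():
    print("ride, given", grade, ":", dist)
```

In this made-up table the two conditional distributions differ sharply, which is the pattern that suggests an association between grade and ride. It says nothing about cause.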

Homefun (formative/summative assessment): Read section 1.1, work exercises 1, 11, 17 pages 22 to 24

Essential Question: Can data sets be added together to obtain a larger sample size and hence more meaningful conclusion?

Simpson's Paradox

Analyze data for Simpson's paradox.

Conclusions based on parts can be reversed when considering the whole
Conclusions based on parts are more likely to be valid.

State two conditions which must exist for Simpson's Paradox to occur.

One or more lurking variables

Data from unequal sized groups being combined into a single group.
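
A small numerical sketch can make the reversal concrete. The numbers below are hypothetical, invented so that both conditions hold: course level acts as the lurking variable, and the groups being combined are very unequal in size. The method names X and Y are placeholders.

```python
# Hypothetical (passes, attempts) for two teaching methods, X and Y,
# split by course level. The numbers are invented for illustration.
results = {
    "regular course": {"X": (9, 10), "Y": (80, 100)},
    "honors course": {"X": (30, 100), "Y": (2, 10)},
}

def rate(passes, attempts):
    """Pass rate as a fraction."""
    return passes / attempts

def combined(table, method):
    """Add the parts together for one method, ignoring course level."""
    passes = sum(level[method][0] for level in table.values())
    attempts = sum(level[method][1] for level in table.values())
    return passes, attempts

# Within each part, X has the higher pass rate.
for level, by_method in results.items():
    print(level, "X:", rate(*by_method["X"]), "Y:", rate(*by_method["Y"]))

# For the whole, the comparison reverses.
print("combined X:", round(rate(*combined(results, "X")), 3))
print("combined Y:", round(rate(*combined(results, "Y")), 3))
```

Method X was used mostly in the course with the lower pass rates, so combining the unequal groups drags its overall rate down even though it does better within each course.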

Homefun (formative/summative assessment):

Read Simpson's Paradox - When Big Data Sets Go Bad
Read "A closer Look at SAT Scores Decline". Summarize in a paragraph how Simpson's paradox might be involved.
work exercises 20, 35 pages 25-26

Essential Question: When using a number to describe a complex event or object is there a difference between using a single number and using a single data point?

Ch1.2 Describing Distributions

Define distribution and state two key pieces of information required to produce a distribution.

The pattern of variation of a single variable

Quantitative data (numbers along horizontal or x-axis)

Frequency--How often various values are expected (along vertical or y-axis)

State the 3 key ways a distribution can be described.

Central tendency or center

Spread or variability

Shape

Name and define the 3 key measures of central tendency.
Mean - numerical average

Mean = Σx_i / n or

Mean = ( x₁+ x₂ + x₃ + ... + x_n) / n

Median - midpoint, 50% above, 50% below

Mode - most common data point or highest peak

Given a set of data determine the mean, median and mode.
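
As a quick check on hand calculations, the three measures can be computed with Python's standard `statistics` module. The data set here is made up for the example.

```python
import statistics

# Hypothetical data set, invented for illustration.
data = [2, 3, 3, 5, 7]

# Mean: add the values and divide by how many there are (sum of x_i over n).
mean_value = sum(data) / len(data)

# Median: the middle value of the sorted data
# (the average of the two middle values when n is even).
median_value = statistics.median(data)

# Mode: the most common value.
mode_value = statistics.mode(data)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```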
Define and ID outliers. Outliers are data points that are thought to belong to a different distribution, hence, any influence they have on the properties of a distribution causes errors.

Data point not in distribution

Gaps

Outliers and skew are not the same thing. Skew is part of a distribution; outliers are not.

Conclusions unduly influenced by a single data point are statistically indefensible! These data points are outliers.

State which measure of central tendency is generally most influenced by outliers.

Using the Mr. Rogers Rat Tail Rule, state whether a distribution is skewed left or right, high or low.

The Mr. Rogers Rat Tail Rule--FAQ

Skewed distributions often look like a rat with a long tail. The tail points in the direction of skew.

What gets skewed? The mean gets skewed or moved in the direction the rat tail points.

Why does skew matter? For a skewed distribution, the mean poorly represents the bulk of the data points.

What gets skewed very little? The median. It represents the bulk of the data points better than the mean.
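
The rat tail idea can be sketched numerically by comparing the mean with the median. The data below is hypothetical, and the `tail_direction` helper is an illustrative name, not a standard function. Comparing mean and median is only a rough heuristic; a plot is still the better way to judge skew.

```python
import statistics

def tail_direction(data):
    """Rough heuristic: the mean is pulled toward the long tail, the median much less."""
    mean_value = statistics.mean(data)
    median_value = statistics.median(data)
    if mean_value > median_value:
        return "right"
    if mean_value < median_value:
        return "left"
    return "roughly symmetric"

# Hypothetical income-like data with a long tail toward high values.
incomes = [20, 22, 25, 25, 28, 30, 35, 45, 60, 90, 140]

# Mirror image of the same data, giving a long tail toward low values
# (like scores on an easy test).
easy_test = [160 - x for x in incomes]

print("incomes: mean", round(statistics.mean(incomes), 1),
      "median", statistics.median(incomes),
      "tail points", tail_direction(incomes))
print("easy test: mean", round(statistics.mean(easy_test), 1),
      "median", statistics.median(easy_test),
      "tail points", tail_direction(easy_test))
```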

Give examples of data that would tend to be symmetrical and data that would be skewed left or right.

Easy Test - skewed left or skewed low

Hard Test - skewed right or skewed high

Normal Test - symmetrical

Incomes - skewed right or skewed high

Homefun (formative/summative assessment): Read section 1.2

Stats Investigation: Investigation School Evaluation - time approx 3 class periods (individual work)

Purpose: Determine if it is reasonable for 50% of all schools receiving a school report card to be scored below average.

Instructions: Perform the simulation of school ratings using the Excel Spread Sheet provided.

Questions /Conclusions: (see Excel spread sheet.)

Essential Question: Is there a difference between looking at tables of numbers and looking at plots or graphs of numbers?

Make dot plots.

gasoline consumption analysis

Old Faithful analysis

foreign born analysis

Is IQ a bell-curve distribution?
Make histograms using the TI-83 calculator and in Minitab.
State the key weakness of histograms (see "Four Histograms").
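
One weakness worth seeing for yourself is that the apparent shape of a histogram depends on the interval (bin) width chosen. The sketch below bins the same hypothetical data two ways. The `bin_counts` helper is an illustrative name and assumes every value is at least `start`.

```python
def bin_counts(data, width, start=0):
    """Count the values in each bin [edge, edge + width), keeping empty bins."""
    counts = {}
    edge = start
    while edge <= max(data):
        counts[edge] = 0
        edge += width
    for x in data:
        lower = start + ((x - start) // width) * width
        counts[lower] += 1
    return counts

# Hypothetical data with two clusters and a gap between them.
data = [1, 2, 2, 3, 3, 3, 4, 11, 12, 12, 13, 13, 13, 14]

# Print a sideways text histogram for a narrow and a wide bin width.
for width in (5, 15):
    print("bin width", width)
    for edge, count in bin_counts(data, width).items():
        print(f"  {edge:>2} to {edge + width:>2} | " + "*" * count)
```

With a width of 5 the empty middle bin shows the gap. With a width of 15 everything lands in a single bar and the gap disappears.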

Homefun (formative/summative assessment): work exercises 37, 41, 55, 57 pages 42-46

Essential Question: Can the type of plot influence the conclusions drawn and if so how can this be prevented?

Stem and Leaf Plots

Draw and interpret stem and leaf plots.

clusters
skew
gaps
multiple modes--these imply that the data comes from more than one distribution.
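
A stem and leaf plot is simple enough to build in a few lines of Python, which can help when checking a hand-drawn one. This is a minimal sketch for non-negative whole numbers, using tens as stems and ones as leaves. The data and the `stem_and_leaf` name are made up for the example.

```python
def stem_and_leaf(data):
    """Return the lines of a stem and leaf plot (stem = tens, leaf = ones)."""
    stems = {}
    for x in sorted(data):
        stems.setdefault(x // 10, []).append(x % 10)
    lines = []
    # Walk every stem from lowest to highest so empty stems (gaps) still appear.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical test scores, invented for illustration.
scores = [52, 55, 58, 61, 63, 64, 67, 68, 91, 95]

plot_lines = stem_and_leaf(scores)
for line in plot_lines:
    print(line)
```

Keeping the empty stems is a deliberate choice: leaving them out would hide the gap between the 60s and the 90s.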

Draw and interpret back to back stem and leaf plots.
State why a time plot should always be used in an analysis of data. Virtually everything is a function of time.

Homefun (formative/summative assessment): read section 1.3; exercises 45, 47, 49, pages 44-45

Essential Question: Is there a difference between skew and outliers?

Box and Whiskers Plots

Calculate quartiles, Q1 and Q3.
Interpret 5 number summaries: Low, Q1, Med., Q3, Hi
Find the IQR or interquartile range for a data set.

IQR = Q3 - Q1
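
The five number summary and IQR can be sketched as below. One assumption to flag: quartiles are computed here as the medians of the lower and upper halves of the sorted data, leaving out the overall median when the number of data points is odd. Other software may use a different quartile convention and give slightly different values. The data and function names are made up for the example.

```python
from statistics import median

def five_number_summary(data):
    """Low, Q1, median, Q3, high, with quartiles as medians of the two halves."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # When n is odd, skip the middle value so it belongs to neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    return {
        "low": values[0],
        "q1": median(lower),
        "median": median(values),
        "q3": median(upper),
        "high": values[-1],
    }

def iqr(data):
    """Interquartile range: IQR = Q3 - Q1."""
    summary = five_number_summary(data)
    return summary["q3"] - summary["q1"]

# Hypothetical data set, invented for illustration.
data = [1, 3, 4, 6, 7, 9, 10, 12, 15]

summary = five_number_summary(data)
print(summary)
print("IQR:", iqr(data))
```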

Draw a box and whiskers plot.
State the Mr. Rogers Rat Whisker Rule for determining skew using a box and whiskers plot. Long whisker indicates direction of skew.
State the % of the data expected in each whisker and in the box for a box and whiskers plot: 25% in each whisker and 50% in the box.

Homefun (formative/summative assessment):

Essential Question: Why are outliers important?

Modified Box and Whiskers Plot

Identify outliers using a modified box and whiskers plot.

Whisker's End = most extreme data pt still within 1.5 IQR beyond Q3 (upper whisker) or beyond Q1 (lower whisker)
Outlier = data pt beyond the whisker's end
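
The 1.5 IQR rule can be sketched in Python as follows. As an assumption, quartiles are taken as the medians of the lower and upper halves of the sorted data (skipping the overall median when n is odd); a calculator or software package may use a slightly different quartile convention. The data and function names are invented for the example.

```python
from statistics import median

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of sorted data."""
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(upper)

def find_outliers(data):
    """Apply the 1.5 IQR rule: report the whisker ends and the outliers."""
    values = sorted(data)
    q1, q3 = quartiles(values)
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step
    # Whiskers end at the most extreme points still inside the fences.
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return {
        "fences": (low_fence, high_fence),
        "whisker_ends": (inside[0], inside[-1]),
        "outliers": outliers,
    }

# Hypothetical data with one suspiciously large value.
data = [1, 3, 4, 6, 7, 9, 10, 12, 40]

result = find_outliers(data)
print(result)
```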

Create box and whisker plots on the TI-83.
Create and interpret parallel box and whisker plots on the TI-83 and in Minitab. Note that a box and whiskers plot cannot detect gaps, clusters, or multi-modes, but here's the problem with other types of graphs such as dot plots, stem and leaf plots, and histograms: the ability to detect patterns depends on the interval size. There's no perfect plot for visualizing distributions.

Formative assessments:

Which type of plot(s) is(are) best at identifying clusters?
Which type of plot(s) is(are) best at identifying multiple-modes?
Which type of plot(s) is(are) best at identifying gaps?

Homefun (formative/summative assessment): exercises 91, 93, 95 p. 71. Work the Chapter 1 Practice Test TI.1 to TI.15, pages 78-81.

Essential Question: Ideally, how many data points in a set of data are needed to characterize spread?

Standard Deviation

Quantities represented as Greek alphabet symbols are considered true (known by Zeus).
Quantities represented in our normal alphabet (known by mere mortals) are estimates of the ones represented as Greek alphabet symbols.

Calculate the range and explain why it is a poor indicator of spread.
- range = (highest) - (lowest)
- Range uses only 2 data points to characterize spread. Only one of these points needs to be an outlier to give a misleading indication.
Write the mathematical definition for standard deviation from memory and explain its meaning.

Calculated from an entire population:
σ = [ Σ(x_i - μ)² / n ]^(1/2)

Calculated from a sample:
s = [ Σ(x_i - xbar)² / (n - 1) ]^(1/2)

The standard deviation expresses how much a typical data point differs from the mean, but it is weighted so that large deviations have more influence.

State how standard deviation and variance are related.

variance = (standard deviation)²
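
The two definitions can be written out directly in Python and checked against the standard `statistics` module. The data set is hypothetical, and the function names `population_std` and `sample_std` are just labels chosen for this sketch.

```python
import math
import statistics

def population_std(values):
    """sigma: divide the sum of squared deviations from the mean by n."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_std(values):
    """s: divide the sum of squared deviations from xbar by (n - 1)."""
    xbar = sum(values) / len(values)
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (len(values) - 1))

# Hypothetical data set, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = population_std(data)  # treats the data as the entire population
s = sample_std(data)          # treats the data as a sample from a larger population

print("sigma:", sigma, " variance:", sigma ** 2)
print("s:", round(s, 3), " variance:", round(s ** 2, 3))

# The library versions should agree with the hand-written formulas.
print("pstdev:", statistics.pstdev(data), " stdev:", round(statistics.stdev(data), 3))
```

Squaring the deviations is what gives large deviations extra weight, and dividing by n - 1 instead of n makes s slightly larger than sigma would be for the same numbers.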

Calculate standard deviations by hand and with a calculator
Explain the difference between s and sigma (σ).
- s = an estimate of a population's std dev based on a sample
- sigma (σ) = the actual standard deviation of a population
State why the standard deviation is a better indicator of spread than range. Std dev uses all the data points, range uses only 2 pts.
State an approximate relationship between range and standard deviation. (For a roughly bell-shaped distribution, range roughly = 6 sigma.)

exercise

Rank the distributions shown here from lowest to highest standard deviation.

Formative assessment: What does a distribution with high or low standard deviation look like?

Homefun (formative/summative assessment): exercise 97, 99 p. 72; Work the Chapter 1 practice Test TI.1 to TI.15 78-81

Essential Question: How can I make an "A" on the test?

Exploring Data Review

Work the practice test.
Review the objectives.
Correctly interpret 5 number summaries.
Look over free response problems from previous years.
Memorize the mathematical definitions of variance and standard deviation for samples and populations.
Master the vocabulary (see example below).

Descriptive Term        Comments

Central Tendency
  Mean                  Sensitive to outliers & skew
  Median                Good when outliers or skew present
  Mode                  Rarely used

Spread
  Range                 Very sensitive to outliers & skew
  Variance              Sensitive to outliers & skew
  Standard deviation    Sensitive to outliers & skew
  IQR                   Good when outliers or skew present

Shape
  Symmetrical           Can have multiple peaks
  Skewed left           Skewed low, easy test
  Skewed right          Skewed high, hard test, income

Summative Assessment: Test--Objectives 1-36
