Stats Investigation: Meaning of R-Square (Individuals)

Purpose: Determine whether a regression analysis on random numbers can yield an r-square value of 50% or more.

Instructions: Set up a regression analysis in Excel using the integer x-values 0 through 9. For each x-value, generate a random number from 0 to 10 as the y-value. Run this simulation 100 times. Calculate the average r-square and record the highest r-square value. As a class, record the three highest r-square values obtained.
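
For anyone who wants to cross-check the Excel results, the procedure above can be sketched in Python. This is only a rough sketch under stated assumptions: the y-values are drawn as continuous uniform numbers between 0 and 10 (similar to RAND()*10 in Excel; the handout does not say whether integers are intended), and the function names r_squared and simulate are made up for illustration.

```python
import random


def r_squared(xs, ys):
    """Return r-square for the least squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if syy == 0:
        # All y-values identical: there is no variation to explain.
        return 0.0
    # For simple linear regression, r-square is the squared correlation.
    return sxy ** 2 / (sxx * syy)


def simulate(trials=100, seed=None):
    """Run the random-y regression `trials` times; return all r-square values."""
    rng = random.Random(seed)
    xs = list(range(10))  # integer x-values 0 through 9
    results = []
    for _ in range(trials):
        # Assumption: continuous uniform y between 0 and 10.
        ys = [rng.uniform(0, 10) for _ in xs]
        results.append(r_squared(xs, ys))
    return results


if __name__ == "__main__":
    values = simulate(trials=100)
    print(f"average r-square: {sum(values) / len(values):.3f}")
    print(f"highest r-square: {max(values):.3f}")
```

Each run draws new random numbers, so the printed values change from run to run, just as they do when the Excel sheet recalculates. Passing a seed, for example simulate(trials=100, seed=1), makes a run repeatable.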

Questions/Conclusions:

  1. Based on your data, does a high r-square value by itself indicate a meaningful association or causation?
  2. Is the random number generator used in this investigation truly random?
  3. Is it possible to get a high r-square value merely from random events?
  4. What does it really mean when we say that r-square represents the fraction of the variation in the values of y that is "explained" by the least squares regression of y on x? Discuss quantities such as the model sum of squares (SSM) and the error sum of squares (SSE).
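
As a starting point for question 4, the standard sum-of-squares decomposition for a least squares line with an intercept can be written as follows. The symbols $\hat{y}_i$ (fitted value), $\bar{y}$ (mean of the observed y-values), and SST (total sum of squares) are notation introduced here for illustration and do not appear in the handout.

```latex
\begin{align*}
\mathrm{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2, \\
\mathrm{SSM} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \\
\mathrm{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \\
\mathrm{SST} &= \mathrm{SSM} + \mathrm{SSE}, \\
r^2 &= \frac{\mathrm{SSM}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}.
\end{align*}
```

Read this way, "explained" is a statement about how the total variation splits between the fitted line and the residuals. It does not by itself say why y varies.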