Chapter 1 Terminology
Fathom Software
- Toolbar
- Inspection Window
- Formula Editor
- Collection
- Case or Cases
- Table (or Case Table)
- Graph
- Summary Table
- Attribute
- Value
Excel Software
- Graphs & Labeling (horizontal and vertical axes, title)
Statistical Terms
- Data
- Summary Statistic (mean, median, mode, etc.)
- Analyze
Chapter 2 Terminology
- univariate data
- quantitative variable (aka attribute in Fathom software)
- categorical (or qualitative) variable (aka attribute in Fathom software)
- graphs (plots)
- bar chart
- dotplot
- stemplot
- histogram or relative frequency plot
- boxplot (box-and-whisker plot) and modified boxplot
- outlier (resistant vs sensitive to), clusters, gaps
- shape of distribution
- symmetric
- uniform (or rectangular) distribution
- normal distribution
- skewness
- left skewed distribution
- right skewed distribution
- other
- bimodal distribution
- CENTER [summary statistic: single value measurement or computation]
- mean
- median
- mode (only for bimodal)
- SPREAD [summary statistic: single or multiple valued measurement or computation]
- deviation (or residual)
- standard deviation
- variance
- quartiles (ranges)
- 5-number summary [min, Q1, Q2 (median), Q3, max]
- interquartile range (IQR)
- statistical (math or calculator) symbols and terms
- x-bar [also the formula to calculate]
- s or SD [also the formula to calculate]
- z-score [also the formula to calculate, as well as re-centering and re-scaling]
- percentages calculated from normal curve
- normalcdf ( leftbound, rightbound, mean, SD )
- invNorm ( area, mean, SD )
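The z-score, normalcdf, and invNorm entries above can be mimicked with Python's standard library. This is a minimal sketch, not the calculator's own implementation: the function names z_score, normal_cdf, and inv_norm are chosen here for illustration, and the example values (mean 100, SD 15) are made up.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Re-center (subtract the mean), then re-scale (divide by the SD)."""
    return (x - mean) / sd


def normal_cdf(leftbound, rightbound, mean=0.0, sd=1.0):
    """Area under the normal curve between the two bounds."""
    dist = NormalDist(mean, sd)
    return dist.cdf(rightbound) - dist.cdf(leftbound)


def inv_norm(area, mean=0.0, sd=1.0):
    """Value that has the given area to its left under the normal curve."""
    return NormalDist(mean, sd).inv_cdf(area)


# Hypothetical example: scores with mean 100 and SD 15.
z = z_score(130, 100, 15)                      # 2.0 SDs above the mean
within_one_sd = normal_cdf(85, 115, 100, 15)   # about 0.68
median_score = inv_norm(0.5, 100, 15)          # half the area lies below the mean
```

The area within one SD of the mean comes out near 68%, which matches the usual rule of thumb for percentages calculated from the normal curve.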
Chapter 3 Terminology
| | Chapter 2: One Variable (univariate data : shape -> center -> spread) | Chapter 3: Two Variables (bivariate data : shape -> trend -> strength -> variability) |
| Key Idea | Distribution | Relationship (association) |
| Plots/Graphs | Dot plot | Scatter plot |
| Shape | Normal, uniform, or skewed; symmetric; clusters, gaps, and outliers | Linear or curved; constant strength; clusters, gaps, and outliers |
| Ideal Shape | Normal | Linear (oval/ellipse) |
| Measure of Center | Mean, Median | Regression Line (LSRL) |
| Measure of Spread | Standard Deviation, Interquartile Range | Correlation |
- bivariate data (from Chapter 3) : shape -> trend -> strength -> variability
- plausible explanation : causation, common response, or confounding
- lurking variable
- residual plots
- outliers
- scatter plots : shape -> trend -> strength -> variability
- data's shape : linear, curved, or none
- y = a1 + b1*x [equivalent to algebra equation of line : y = mx + b]
- shape's trend : positive slope, negative slope, or none
- b1 : measure of slope
- trend's strength : strong trend (tight cluster), moderate trend (some clustering), or weak trend (no cluster)
- correlation : a measure of a trend's strength
- strength's variability : uniform or heteroscedasticity (fan-shaped)
- summary line
- Least Squares Regression Line (also called LSRL)
- Line of Best Fit (best guess or LSRL)
- Regression Line (LSRL)
- Trend Line (LSRL)
- Fitted Line (best guess)
- statistical (math or calculator) symbols and terms
- randomNormal ( mean, SD )
- least squares regression line ("regression line" or just LSRL) : y = a1 + b1*x
- explanatory variable or predictor variable, x
- response variable or observed variable, y
- slope, b1
- y-intercept, a1
- predicted value, ŷ
- interpolation, extrapolation
- influential point
- residual
- sum of squared errors (SSE)
- r : correlation coefficient
- r^2 : coefficient of determination
- "correlation does not imply causation" due to lurking variable
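The regression terms above (slope b1, y-intercept a1, predicted value, residual, SSE, r, and r^2) fit together in one short calculation. The sketch below uses the standard least-squares formulas; the function names and the five data points are hypothetical and serve only as an illustration.

```python
from math import sqrt


def least_squares_line(xs, ys):
    """Return (a1, b1, r) for the line y = a1 + b1*x fitted by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                # slope
    a1 = y_bar - b1 * x_bar       # y-intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return a1, b1, r


def residuals(xs, ys, a1, b1):
    """Residual = observed y minus predicted y for each point."""
    return [y - (a1 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical data: explanatory variable x, response variable y.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a1, b1, r = least_squares_line(xs, ys)
res = residuals(xs, ys, a1, b1)
sse = sum(e ** 2 for e in res)    # sum of squared errors
r_squared = r ** 2                # coefficient of determination
```

For this data the fitted line is y = 2.2 + 0.6*x, and r^2 equals 1 - SSE / (total squared deviation of y from its mean), which is one way to read the coefficient of determination.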
Chapter 4 Terminology
- units (and population size)
- population
- census & sample
- parameter & statistic
- sample bias
- selection bias
- size bias
- volunteer bias
- convenience bias
- judgement bias
- nonresponse bias
- questionnaire bias
- wording or language bias
- incorrect response bias
- samples : unbiased representation of population
- simple random sample (SRS)
- stratified random sample
- cluster sample
- two-stage cluster sample
- systematic sample with random start
- experiment vs observational study
- blind experiment, double blind experiment
- treatment
- factor (categorical)
- level
- experimental unit
- response variable
- designs of experiments
- completely randomized design
- randomized paired comparison design
- randomized block design
- variables in experiments
- explanatory variable or factor (if categorical)
- response variable
- lurking variable
- confounding variable
- variability
- between-treatment
- within-treatment
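Three of the sampling methods named in this chapter (simple random sample, stratified random sample, and systematic sample with random start) can be sketched in a few lines. This is an illustration under stated assumptions: the population of 100 numbered units, the two strata, and all function names are hypothetical.

```python
import random


def simple_random_sample(units, n, rng):
    """SRS: every group of n units is equally likely to be chosen."""
    return rng.sample(units, n)


def stratified_random_sample(strata, n_per_stratum, rng):
    """Take a separate SRS of the same size from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then every k-th unit."""
    start = rng.randrange(k)
    return units[start::k]


rng = random.Random(1)               # fixed seed so the sketch is repeatable
population = list(range(1, 101))     # hypothetical units numbered 1..100
strata = {"low": population[:50], "high": population[50:]}

srs = simple_random_sample(population, 10, rng)
strat = stratified_random_sample(strata, 5, rng)
syst = systematic_sample(population, 10, rng)
```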
Chapter 5 Terminology
- event
- table of random digits
- Venn diagrams
- Probability
- event, P(A)
- complement of event, P(A^c)
- distribution
- model
- mutually exclusive [disjoint categories]
- conditional events
- independent events
- "of at least one"
- variance & standard deviation
- Sampling
- space
- with Replacement
- without Replacement
- Mathematical
- Law of Large Numbers
- Fundamental Counting Principle
- Terminology with Probabilities
- Complement of Event : P(A^c) = 1 - P(A)
- P("of at least one") = 1 - P("exactly none")
- Mutually Exclusive Events [i.e. disjoint] : P(A and B) = 0
- Conditional Events : P(A|B) = P(A and B) / P(B) often written as P(A and B) = P(A|B) * P(B)
- Independent Events : P(A|B) = P(A)
- Addition Rules ["or"]
- Full Rule : P(A or B) = P(A) + P(B) - P(A and B)
- Simplified Rule for Mutually Exclusive Events [disjoint] : P(A or B) = P(A) + P(B)
- Multiplication Rules ["and"]
- Full Rule : P(A and B) = P(A) * P(B|A) can also be written as P(A and B) = P(B) * P(A|B)
- Simplified Rule for Independent Events : P(A and B) = P(A)*P(B)
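The probability rules above can be checked by counting outcomes in a small sample space. The sketch below assumes two fair six-sided dice; the events and helper names are hypothetical choices for illustration, and exact fractions are used so each rule holds without rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
SPACE = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) = favourable outcomes / total outcomes."""
    favourable = sum(1 for outcome in SPACE if event(outcome))
    return Fraction(favourable, len(SPACE))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def first_is_six(o):            # event A
    return o[0] == 6


def total_at_least_ten(o):      # event B
    return o[0] + o[1] >= 10


def second_is_even(o):          # event C, independent of A
    return o[1] % 2 == 0


p_a = prob(first_is_six)
p_b = prob(total_at_least_ten)
p_a_and_b = prob(lambda o: first_is_six(o) and total_at_least_ten(o))
p_a_or_b = prob(lambda o: first_is_six(o) or total_at_least_ten(o))

# "At least one six" via the complement of "exactly none".
p_at_least_one_six = 1 - prob(lambda o: 6 not in o)
```

Here P(B|A) is 1/2 while P(B) is 1/6, so A and B are not independent and the full multiplication rule is needed. A and C are independent, so P(A and C) equals P(A)*P(C).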
Chapter 6 Terminology
- Probability Distributions
- Random Variable, X
- Expected Value, E(X)
- mean, μ_X
- standard deviation, σ_X
- from Collected Data
- using Known Data Frequencies that model Your Situation
- simulation using Random selection from a known data
- from Theory
- assumptions + Basic Mathematical Principles
- Binomial Distributions
- n = number of trials
- p = probability of success on any one trial
- 1-p = q = probability of failure on any one trial
- P(X = k) = nCk * p^k * (1 - p)^(n - k)
- binompdf(number of trials, probability of success, number of successes)
- Normal Distribution ~ BINS (binomial, independent, number of trials is fixed, success probability is known)
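The binomial formula and the expected value and standard deviation of a random variable can be computed directly. This is a minimal sketch: binompdf here is a plain Python function with the same argument order as the calculator entry above, not the calculator's own code, and n = 10, p = 0.5 are example values chosen for illustration.

```python
from math import comb, sqrt


def binompdf(n, p, k):
    """P(X = k) = nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def expected_value(dist):
    """E(X): sum of each value times its probability."""
    return sum(x * px for x, px in dist.items())


def standard_deviation(dist):
    """Square root of the probability-weighted squared deviations from E(X)."""
    mu = expected_value(dist)
    return sqrt(sum((x - mu) ** 2 * px for x, px in dist.items()))


# Hypothetical example: number of heads in 10 flips of a fair coin.
n, p = 10, 0.5
dist = {k: binompdf(n, p, k) for k in range(n + 1)}

p_three = binompdf(n, p, 3)        # 120/1024
mu_x = expected_value(dist)        # agrees with the shortcut n*p
sigma_x = standard_deviation(dist) # agrees with sqrt(n*p*(1 - p))
```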