Submitted by: mohk1234
Category: Business and Industry
Date Submitted: 04/24/2014 09:08 PM
Mean, Median, and Mode - Just as we have summary statistics like the mean, median, and mode to give us a sense of the 'central tendency' of a data set, we need a summary statistic that captures the level of dispersion in a set of data.
Mean – generally used unless the data has outliers or a skewed distribution
Median – used when there are outliers or the data is skewed
Mode – useful when there is more than one peak (bimodal distribution)
Skewed – named for the direction in which the tail leads
* Right-skewed (tail to the right): typically the mean > the median
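To make the points above concrete, here is a minimal sketch using Python's standard `statistics` module. The salary figures and the two-peak data set are made up for illustration; they are not from the notes.

```python
import statistics

# Hypothetical salaries (in thousands); the last value is an outlier
# that creates a long right tail.
salaries = [40, 42, 45, 45, 48, 50, 250]

mean = statistics.mean(salaries)      # pulled upward by the outlier
median = statistics.median(salaries)  # middle value, barely affected
mode = statistics.mode(salaries)      # most frequent value

print(f"mean={mean:.2f} median={median} mode={mode}")

# Hypothetical data with two peaks: multimode() reports every value
# tied for most frequent, which is how a bimodal shape shows up.
two_peaks = [1, 2, 2, 2, 3, 7, 8, 8, 8, 9]
peaks = statistics.multimode(two_peaks)
print("peaks:", peaks)
```

In the salary data the single outlier drags the mean well above the median, which is the "mean > median" pattern of a right tail.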
Standard Deviation
* A large standard deviation means that the data are widely dispersed
* A small standard deviation means that the data are clustered close together
* Variance
* For each data point, square its difference from the mean
* (Data Point – Mean)^2
* Sum the squared differences over all data points and then divide by (n-1)
* Take the square root of the variance to find Standard Deviation
* A smaller standard deviation indicates that more data points are near the mean, and that the mean is more representative of the data.
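The variance and standard deviation steps can be sketched as follows. The data values are hypothetical, and the result is checked against the standard library's `statistics.variance` and `statistics.stdev`, which also use the (n-1) divisor.

```python
import math
import statistics

# Hypothetical data set, chosen so the mean is a whole number.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Step 1: square each data point's difference from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 2: sum the squared differences and divide by (n - 1).
variance = sum(squared_diffs) / (n - 1)

# Step 3: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean={mean} variance={variance:.4f} std_dev={std_dev:.4f}")

# Cross-check against the standard library (sample statistics).
lib_variance = statistics.variance(data)
lib_std_dev = statistics.stdev(data)
```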
Coefficient of Variation
* Compares the standard deviation of the data to the data’s mean
* Coefficient of Variation = standard deviation / mean
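Because the ratio has no units, it allows dispersion to be compared across data sets measured on different scales. A small sketch, assuming two made-up data sets (heights and weights) and a helper name, `coefficient_of_variation`, chosen here for illustration:

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(data) / statistics.mean(data)

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights CV={cv_heights:.3f}, weights CV={cv_weights:.3f}")
```

Here the weights vary more relative to their mean than the heights do, even though the raw standard deviations are in different units and cannot be compared directly.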
Causality – scatter plots never prove that one variable causes another
Correlation Coefficient (e.g., # of absences and temperature outside)
* Gives more weight to outlier data, so a few extreme points can strongly affect it
* A value near -1 or 1 shows a strong negative or positive linear relationship
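A rough sketch of the correlation coefficient, computed by hand with the usual Pearson formula. The temperature and absence figures are invented for illustration, and `pearson_r` is a name chosen here. The second calculation adds one extreme point to show how sensitive the coefficient is to outliers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_dev / (dev_x * dev_y)

# Hypothetical data: outside temperature and number of absences.
temps = [50, 55, 60, 65, 70, 75, 80]
absences = [2, 3, 3, 4, 5, 5, 6]
r = pearson_r(temps, absences)

# Same data plus one outlier: a hot day with no absences.
r_with_outlier = pearson_r(temps + [85], absences + [0])

print(f"r={r:.3f}, r with outlier={r_with_outlier:.3f}")
```

Even a value of r close to 1 describes only a linear association; it does not show that temperature causes absences.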
Generating Samples – Unbiased and Representative
* Random Sample
* Sample Size
* Depends on the level of accuracy
* Size does not really matter as long as sample is representative of the entire population
* Response Rate
* A low response rate is susceptible to bias
* Demonstrate that non-respondents’ opinions do not differ from those of respondents
* Raise response rate
* Classic Mistakes
* Unrepresentative Sample
* Low response rate
* Biased respondents
* Biased Questions
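The difference between a random sample and an unrepresentative one can be sketched as follows. The population values, the sample size of 100, and the fixed seed are all assumptions made for illustration.

```python
import random
import statistics

# Hypothetical population: 1,000 values, stored in sorted order.
population = list(range(1, 1001))
population_mean = statistics.mean(population)

# Fixed seed so the illustration is repeatable.
rng = random.Random(42)

# Random sample: every member has the same chance of selection,
# and sampling is without replacement.
random_sample = rng.sample(population, 100)

# Unrepresentative "convenience" sample: just the first 100 entries.
convenience_sample = population[:100]

random_error = abs(statistics.mean(random_sample) - population_mean)
convenience_error = abs(statistics.mean(convenience_sample) - population_mean)

print(f"population mean={population_mean}")
print(f"random sample error={random_error:.1f}")
print(f"convenience sample error={convenience_error:.1f}")
```

Both samples have the same size, but only the random one gives an estimate close to the population mean, which is why representativeness matters more than size alone.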
Confidence Interval...