Standardize your_data (in other words, find the z-score for each observation). After finding the z-score for each value, make a so-called QQ-plot.

Description

Question 1. Standardize your_data (in other words, find the z-score for each observation). After finding the z-score for each value, make a so-called QQ-plot. A QQ-plot is a graph with the normal score (or z-score, or standardized score; call it whichever you prefer) on the y-axis and the observation corresponding to each z-score on the x-axis. We have talked a bit about it. If you are unsure, please use Google; the lecture notes also contain such graphs. Does this graph suggest normality of your dataset or not? Find the probabilities corresponding to all 100 z-scores you have calculated from your_data. Use any software you want. There will certainly be many repeated values.
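The steps in Question 1 can be sketched in Python using only the standard library. This is a minimal sketch, not the course's prescribed method: the ten values below are a hypothetical stand-in for the 100 values in your_data, the sample standard deviation (n - 1 divisor) is an assumption, and the plotting positions (i - 0.5)/n used for the normal scores are one common convention among several.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical stand-in for your_data (the real dataset has 100 values).
your_data = [2, 4, 4, 6, 4, 5, 7, 3, 8, 5]

m = mean(your_data)
s = stdev(your_data)  # assumption: sample standard deviation (n - 1 divisor)

# z-score of each observation: distance from the mean in standard deviations.
z_scores = [(x - m) / s for x in your_data]

# Probability Prob(Z < z) for each z-score under the standard normal curve.
std_normal = NormalDist()  # mean 0, standard deviation 1
probabilities = [std_normal.cdf(z) for z in z_scores]

# QQ-plot coordinates: sorted observations (x-axis) against normal scores
# (y-axis). The i-th normal score is the standard normal quantile at the
# plotting position (i - 0.5) / n, which is an assumed convention.
n = len(your_data)
sorted_obs = sorted(your_data)
normal_scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
qq_points = list(zip(sorted_obs, normal_scores))

for obs, score in qq_points:
    print(f"{obs:>4}  {score: .3f}")
```

If the points in qq_points fall roughly on a straight line when plotted, that is usually read as evidence consistent with normality; strong curvature suggests otherwise.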
Question 2. What percentage of observations lie within i) 1 standard deviation on either side of the mean; ii) 1.5 standard deviations; iii) 2.5 standard deviations; iv) 3 standard deviations of the mean? Does this comply with Chebyshev's rule? Why do z-score tables not give probabilities for z-scores greater than 3 (sometimes 3.4)?
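One way to check Question 2 numerically is to count the observations within k standard deviations of the mean and compare the share with the Chebyshev lower bound 1 - 1/k^2. The sketch below assumes a hypothetical sample in place of your_data and uses the population standard deviation; the helper names share_within and chebyshev_bound are illustrative, not from the course material.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical stand-in for your_data (the real dataset has 100 values).
your_data = [2, 4, 4, 6, 4, 5, 7, 3, 8, 5]

def share_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    m = mean(data)
    s = pstdev(data)  # assumption: population standard deviation
    return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

def chebyshev_bound(k):
    """Chebyshev's lower bound on that fraction: 1 - 1/k^2 (0 when k <= 1)."""
    return max(0.0, 1 - 1 / k**2)

for k in (1, 1.5, 2.5, 3):
    observed = share_within(your_data, k)
    bound = chebyshev_bound(k)
    print(f"k={k}: observed {observed:.1%}, Chebyshev bound {bound:.1%}")

# Area left in the upper tail of the standard normal beyond z = 3 and z = 3.4,
# which hints at why printed z-tables stop around those values.
for z in (3, 3.4):
    print(f"Prob(Z > {z}) = {1 - NormalDist().cdf(z):.5f}")
```

Note that for k = 1 the Chebyshev bound is 0, so the rule says nothing useful at one standard deviation.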
Question 3. Create a new dataset from your_data by dividing each value in the sequence by 10, and call it new_dataset. For example, the values in my dataset are 2, 4, 4, 6, 4, … My corresponding new_dataset is therefore 0.2, 0.4, 0.4, 0.6, 0.4, … As you can see, all values in new_dataset will be between 0 and 1. Let us assume that they are probabilities. Now find the z-scores corresponding to all 100 probabilities in new_dataset (in other words, for each probability p in new_dataset, find the value z such that Prob(Z < z) = p). Plot the probabilities on the y-axis and the z-scores on the x-axis. You can ignore repeated z-scores: you should have 10 unique z-scores, and the rest will be repeats.
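Question 3 is the reverse lookup of Question 1: given a probability p, find z with Prob(Z < z) = p. A minimal sketch using the standard library's inverse normal CDF follows. It uses only the five values quoted in the question, not a full dataset, and it skips any probability equal to 0 or 1, for which no finite z-score exists; that filtering is an assumption about how to handle such values.

```python
from statistics import NormalDist

# The first values quoted in the question; the full dataset has 100 values.
your_data = [2, 4, 4, 6, 4]

# Divide each value by 10 so that it can be read as a probability.
new_dataset = [x / 10 for x in your_data]

std_normal = NormalDist()  # mean 0, standard deviation 1

# Repeated probabilities give repeated z-scores, so work with unique values.
unique_probs = sorted(set(new_dataset))

# Inverse CDF: the z for which Prob(Z < z) equals p. Probabilities of exactly
# 0 or 1 are skipped because inv_cdf is only defined for 0 < p < 1.
z_for_prob = {p: std_normal.inv_cdf(p) for p in unique_probs if 0 < p < 1}

for p, z in z_for_prob.items():
    print(f"p = {p:.1f}  ->  z = {z: .4f}")
```

Plotting the keys of z_for_prob on the y-axis against its values on the x-axis traces points on the standard normal cumulative distribution curve.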
