Dr. H. Friedman

STATISTICS

Midterm Review I

 

The following problems are designed to help you study for the midterm examination. Solutions are at the bottom of the page.

(1)  For each of the following, indicate whether the data is measured on a nominal, ordinal, interval, or ratio scale.

_______ a)  waist sizes of 65-year-old men
_______ b) amount of alcohol in a can of Budweiser beer
_______ c) countries of origin of CUNY students
_______ d) temperature in centigrade of Starbucks coffee
_______ e) religions of CUNY college students
_______ f) social class of CUNY students
_______ g) weights of 65-year-old women
_______ h) IQ of CUNY students
_______ i) scores on the CPA exam for CUNY students
_______ j) speeds of fastballs thrown by American League pitchers
_______ k) strength categories of hurricanes (1, 2, …, 5)
_______ l) genders of Wal-Mart employees 
_______ m) rank of college professors (Instructor, Assistant Professor, Associate Professor, Full Professor)
_______ n) class standing of CUNY students (Freshman, Sophomore, … )

 

(2)  Indicate  which of the following are discrete measurements and which are continuous measurements:

Discrete Continuous

___            ___      a) the number of defective laptops in batches of 30
___            ___      b) life of a Duracell battery
___            ___      c) mileage of Toyota Prius cars
___            ___      d) weight of guinea pigs
___            ___      e) span of butterfly wings
___            ___      f) number of left-handed people on basketball teams
___            ___      g) time to complete the New York City Marathon
___            ___      h) amount of alcohol in a can of beer
___            ___      i) number of foreign students in each statistics class
___            ___      j) height of basketball players
___            ___      k) number of pens in backpacks of college students

 

(3)   Indicate which of the following is a parameter and which is a statistic:

Parameter Statistic

___            ___      a) sample mean
___            ___      b) population standard deviation
___            ___      c) sample standard deviation
___            ___      d) population variance
___            ___      e) population median
___            ___      f) population mean
___            ___      g) a mean obtained from the U.S. census
 

 (4)  For each of the following, indicate the appropriate statistical measures that may be used for analysis (e.g., proportions, median, quantiles, mode, mean, …).  List as many as are appropriate.

a) For data measured on a nominal scale, you may use: ___________________________
b) For data measured on an ordinal scale, you may use: ___________________________
c) For data measured on an interval scale, you may use: ___________________________

 (5)  CUNY has 200,000 students.  A researcher wants a sample of 2,000 students.  Students are assigned numbers and then random numbers are used so that every student has an equal chance of being selected (1%). 

a) This is known as a: ___________________. 
b) Measurements obtained from this sample are known as: ___________________.
c) Suppose the researcher decides to survey all 200,000 students.  This is called a _________________. 
d) Any measurement obtained from all 200,000 students is known as a ________________.

 (6)  A researcher has converted all grades on this year’s CPA exams into Z-scores. 

a) The average Z-score will be: ________
b) If CPA exam scores are normally distributed, about ______ % of scores will be between +1 and -1.
c) If CPA exam scores are skewed and not normally distributed, we still would expect at least __________ % of the Z-scores to be between +2 and -2.
d) You find out that the exam scores are normally distributed, and that your Z-score on the exam is exactly +1.96.  This means you scored higher than ______ % of the individuals who took the exam.

(7)  The life of a Hyundai Excel car is normally distributed with a mean life of 15 years and a population standard deviation of 2 years.  What proportion of Hyundai Excels will die within 10 years?

(8) a) If P(A and B) = 0, then A and B are: _________
b) If P(A|B) = P(A), then A and B are: ________
c) Is P(A|B) always equal to P(B|A)? _____
d) Is P(A and B) always equal to P(B and A)?____

(9) A manufacturer of computers has lowered prices for her product. A random sample of 16 stores shows the following sales (in units) during the past week:
0, 10, 2, 10, 3, 10, 3, 10, 6, 6, 8, 8, 4, 4, 5, 6

(a) Calculate the mean, median, Q1, Q3, mode, range, IQR, standard deviation, variance, and the coefficient of variation.
(b) Standardize the 10 (i.e., convert the 10 into a Z-score).

(10) The following is a frequency distribution showing the amount of time it took a sample of employees to complete a certain job:

Number of Days    Frequency
      2               10
      4               17
      5               18
      7               12
      9               40
     10                3


(a) Calculate the mean, median, Q1, Q3, mode, and range.

(11) A study of smoking and sex found the following relationship:

 

                  Male    Female
Smokes             150       130
Does Not Smoke     250       470


(a) Compute these probabilities: P(male); P(female and smoker); P(female or smoker); P(male | smoker); P(smoker | male).
(b) Show that smoking and sex are not independent.

(12) Ten percent of the population is left-handed. What is the probability of 2 lefties in a group of 10?

(13) The mileage of cars is normally distributed with a mean of 20 mpg (miles per gallon) and a standard deviation of 4 mpg. Calculate the following:
(a) The percentage of cars under 15 mpg.
(b) The percentage of cars above 22 mpg.
(c) The probability that a car will have a mileage between 22 and 28 mpg.
(d) The mpg that marks the top 10% of cars (i.e., the 90th percentile).
(e) The 9th percentile.

(14) Define each of the following: (a) parameter and statistic (b) population mean (c) mutually exclusive (d) interquartile range (e) histogram and bar chart (f) skewness (g) symmetric (h) independence (i) permutation (j) combination (k) probability density function (l) percentile (m) nominal, ordinal, interval, and ratio scales (n) discrete and continuous measurements (o) frequency distribution.

Nominal Scale: Classifications, e.g., sex, race, religion (appropriate statistics = mode, percentages)
Ordinal Scale: Rankings without equal intervals, e.g., social class (appropriate statistic = median)
Interval Scale: Equal intervals but no "true" zero, e.g., temperature, IQ (appropriate statistic = mean)
Ratio Scale: Equal intervals and a "true" zero, e.g., height, weight (may say twice as heavy, twice as tall, etc.)

Solutions:
(1) a) ratio  b) ratio  c) nominal  d) interval  e) nominal  f) ordinal  g) ratio  h) interval  i) interval  j) ratio  k) ordinal  l) nominal  m) ordinal  n) ordinal
(Hint: If you can say "twice as," then it is a ratio scale. If your fastball is clocked at 90 MPH and mine at 45 MPH, yours is twice as fast. A waistline of 48" is twice as large as a 24" waistline.)
(2) a) Discrete  b) Continuous  c) Continuous  d) Continuous  e) Continuous  f) Discrete  g) Continuous  h) Continuous  i) Discrete  j) Continuous  k) Discrete
(3) a) Statistic  b) Parameter  c) Statistic  d) Parameter  e) Parameter  f) Parameter  g) Parameter
(4) a) mode, frequencies, proportions  b) mode, frequencies, proportions, median, percentiles and other quantiles  c) mode, frequencies, proportions, median, percentiles and other quantiles, mean, standard deviation
(5) a) simple random sample  b) statistics  c) census  d) parameter
(6) a) 0  b) 68.26%  c) 75% (Chebyshev's Theorem)  d) 97.5%
(7) Z = (10 - 15) / 2 = -2.5. The proportion of Hyundai Excels that will die within 10 years is 0.62%.
(8) a) mutually exclusive  b) independent  c) no  d) yes
(9) (a) mean = 5.938 units, median = 6 units, Q1 = 3.5 units, Q3 = 9 units, mode = 10 units, range = 10 units, IQR = 5.5 units, s.d. = 3.17 units, variance = 10.06 units squared, C.V. = 53.4%  (b) Z-score for 10 is 1.28 [note that the Z-score is a "pure" number]
(10) mean = 6.52 days, median = 7 days, Q1= 4 days, Q3=9 days, mode = 9 days, range = 8 days.
(11) (a) P(male) = .40, P(female and smoker) = .13, P(female or smoker) = .75, P(male | smoker) = .5357, P(smoker | male) = .375  (b) Smoking and sex are not independent since P(smoker | male) is .375 and P(smoker | female) is .217. This indicates that the probability of being a smoker is higher for men than for women.
(12) .19371 [binomial with n = 10 and p = .10: C(10,2) (.10)^2 (.90)^8]
(13) (a) .1056 (b) .3085 (c) .2857 (d) 25.12 mpg [ (x - 20) / 4 = 1.28, solve for x] (e) 14.64 mpg [(x - 20) / 4 = -1.34, solve for x ]
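
The descriptive statistics in solutions (9) and (10) can be checked with a short Python sketch. This is not part of the original handout. The helper names quartiles_by_halves and summarize are made up for illustration, and the quartile rule (Q1 and Q3 as the medians of the lower and upper halves of the sorted data) is an assumption chosen because it reproduces the answers above; other textbooks and software use slightly different rules. The 9-day row of the frequency table in problem (10) is also an assumption, inferred from the posted answers (mode = 9 days, mean = 6.52 days).

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: when n is odd the middle value belongs to neither half.
    Needs at least two values.
    """
    s = sorted(values)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[len(s) - half:])


def summarize(values):
    """Collect the summary measures asked for in problems (9) and (10)."""
    q1, q3 = quartiles_by_halves(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "sd": sd,
        "cv_percent": 100 * sd / mean,
    }


# Problem (9): weekly sales at 16 stores.
sales = [0, 10, 2, 10, 3, 10, 3, 10, 6, 6, 8, 8, 4, 4, 5, 6]
p9 = summarize(sales)
z_for_10 = (10 - p9["mean"]) / p9["sd"]

# Problem (10): expand the frequency table into one value per employee.
# The 9-day row is inferred from the posted answers, not read from the table.
freq = {2: 10, 4: 17, 5: 18, 7: 12, 9: 40, 10: 3}
days = [d for d, count in freq.items() for _ in range(count)]
p10 = summarize(days)

print(p9)
print("z-score for 10:", round(z_for_10, 2))
print(p10)
```

Expanding the frequency table into a plain list keeps the same summarize helper usable for both problems, at the cost of some memory for large tables.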

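The probability answers in solutions (6), (7), (11), (12), and (13) can be checked the same way. The sketch below is not part of the original handout and uses only the Python standard library (NormalDist and math.comb, both available from Python 3.8); all variable names are illustrative. Software computes exact normal areas, so results can differ in the last digit from answers read off a Z-table (for example, 25.13 rather than 25.12 mpg for the 90th percentile).

```python
from math import comb
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

# Problem (6)
within_one_sd = std.cdf(1) - std.cdf(-1)   # share of Z-scores between -1 and +1
chebyshev_two_sd = 1 - 1 / 2**2            # Chebyshev lower bound for k = 2
below_196 = std.cdf(1.96)                  # share scoring below Z = +1.96

# Problem (7): car life ~ Normal(mean 15 years, sd 2 years)
cars = NormalDist(mu=15, sigma=2)
die_within_10 = cars.cdf(10)

# Problem (11): two-way table of counts
counts = {
    ("male", "smokes"): 150, ("female", "smokes"): 130,
    ("male", "no"): 250, ("female", "no"): 470,
}
total = sum(counts.values())
males = counts[("male", "smokes")] + counts[("male", "no")]
females = counts[("female", "smokes")] + counts[("female", "no")]
smokers = counts[("male", "smokes")] + counts[("female", "smokes")]

p_male = males / total
p_female_and_smoker = counts[("female", "smokes")] / total
# Addition rule: P(F or S) = P(F) + P(S) - P(F and S)
p_female_or_smoker = (females + smokers - counts[("female", "smokes")]) / total
p_male_given_smoker = counts[("male", "smokes")] / smokers
p_smoker_given_male = counts[("male", "smokes")] / males
p_smoker_given_female = counts[("female", "smokes")] / females
# Independence would require P(smoker | male) to equal P(smoker | female).
independent = abs(p_smoker_given_male - p_smoker_given_female) < 1e-12

# Problem (12): binomial probability of exactly 2 lefties among 10, p = 0.10
p_two_lefties = comb(10, 2) * 0.10**2 * 0.90**8

# Problem (13): mileage ~ Normal(mean 20 mpg, sd 4 mpg)
mpg = NormalDist(mu=20, sigma=4)
under_15 = mpg.cdf(15)
above_22 = 1 - mpg.cdf(22)
between_22_and_28 = mpg.cdf(28) - mpg.cdf(22)
p90 = mpg.inv_cdf(0.90)
p09 = mpg.inv_cdf(0.09)

print(round(die_within_10, 4), round(p_two_lefties, 5), round(p90, 2), round(p09, 2))
```
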
E-mail:  x.friedman@att.net