Module #6 Assignment


A. Consider a population consisting of the following values, which represent the number of ice cream purchases during the academic year for each of five housemates.

8, 14, 16, 10, 11









a. Compute the mean of this population.
(8 + 14 + 16 + 10 + 11) / 5 = 59 / 5 = 11.8






b. Select a random sample of size 2 out of the five members. See the example used in the PowerPoint presentation, slide #13.

Random sample: 14 and 10

The random sample selected from the five members is 14 and 10.
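One way such a draw could be made is sketched below in Python. The seed value is an arbitrary assumption chosen only so the draw is repeatable, and the result will generally differ from the 14 and 10 used in this post.

```python
import random

# Population from part A: ice cream purchases for the five housemates.
population = [8, 14, 16, 10, 11]

# Assumption: a fixed seed, used only to make the example repeatable.
rng = random.Random(6)

# random.sample draws without replacement, so no housemate is picked twice.
sample = rng.sample(population, 2)
print(sample)
```

Drawing without replacement matches the idea of picking two different housemates out of the five.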


c. Compute the mean and standard deviation of your sample.

(14 + 10) / 2 = 24 / 2 = 12

The sample mean is 12.

sd = √( ((14 - 12)^2 + (10 - 12)^2) / (2 - 1) )

sd = √( ((2)^2 + (-2)^2) / 1 )

sd = √( (4 + 4) / 1 )

sd = √( 8 / 1 )

sd = √8

sd = 2.8284

The sample standard deviation is 2.8284.
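The arithmetic above can be checked with a short Python sketch. It repeats the same steps by hand (squared deviations divided by n - 1) and then compares the result with `statistics.stdev`, which uses the same n - 1 divisor. The variable names are illustrative only.

```python
import math
import statistics

sample = [14, 10]

# Sample mean: sum of the values divided by the sample size.
sample_mean = sum(sample) / len(sample)

# Squared deviations from the sample mean, divided by n - 1, then square-rooted.
squared_devs = [(x - sample_mean) ** 2 for x in sample]
sample_sd = math.sqrt(sum(squared_devs) / (len(sample) - 1))

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.4f}")
# prints: mean = 12.00, sd = 2.8284

# The library function for the sample standard deviation agrees.
library_sd = statistics.stdev(sample)
```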



d. Compare the mean and standard deviation of your sample to those of the entire population (8, 14, 16, 10, 11).

sd = √( ((8 - 11.8)^2 + (14 - 11.8)^2 + (16 - 11.8)^2 + (10 - 11.8)^2 + (11 - 11.8)^2) / 5 )

sd = √( ((-3.8)^2 + (2.2)^2 + (4.2)^2 + (-1.8)^2 + (-0.8)^2) / 5 )

sd = √( (14.44 + 4.84 + 17.64 + 3.24 + 0.64) / 5 )

sd = √( 40.8 / 5 )

sd = √8.16

sd = 2.85657

The population standard deviation is 2.85657.
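As a cross-check of the population figures, here is a minimal Python sketch using the standard library. `statistics.pstdev` divides by N (the population size), which matches the division by 5 in the calculation above.

```python
import statistics

population = [8, 14, 16, 10, 11]

# Population mean: 59 / 5.
pop_mean = statistics.mean(population)

# Population standard deviation: divides the squared deviations by N, not N - 1.
pop_sd = statistics.pstdev(population)

print(f"mean = {pop_mean:.1f}, sd = {pop_sd:.5f}")
# prints: mean = 11.8, sd = 2.85657
```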


The mean of the population is 11.8, while the sample mean is 12, so the sample mean is slightly higher than the population mean.
The population standard deviation is 2.85657, while the sample standard deviation is 2.8284.
For this particular sample, the standard deviation is very close to the population value, which indicates that the sample's variability is similar to that of the entire population. Note that the sample calculation divides by n - 1 = 1, while the population calculation divides by N = 5.

B. 

Suppose that the sample size n = 100 and the population proportion p = 0.95.

1. Does the sample proportion p have approximately a normal distribution? Explain.

To decide whether the sample proportion p is approximately normally distributed, we check that np and nq are both at least 5. This rule of thumb rests on the Central Limit Theorem, a key concept in statistics which says that when the sample size is large enough, the sampling distribution of a sample average (here, the sample proportion) looks like a normal distribution (a bell-shaped curve), even if the original data does not.

n is the sample size, which is 100.
p is the population proportion, which is 0.95.
q is the complementary probability, which is 1 - p = 1 - 0.95 = 0.05.

np = n x p = 100 x 0.95 = 95 [This is greater than 5]

nq = n x q = 100 x 0.05 = 5 [This is exactly 5, so the condition nq ≥ 5 is just met]


Because np and nq are both at least 5, the sample proportion p has an approximately normal distribution.
The expected number of successes is 95 and the expected number of failures is 5, so both conditions for the normal approximation are satisfied, although the failure count sits right at the threshold. With a large enough sample size and enough expected successes and failures, normal distribution methods can be used to make predictions and decisions based on the sample.
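The check described above can be sketched as a small Python helper. The function name `normal_approx_ok` is hypothetical, and exact fractions are used here (an illustrative choice) so that the boundary case nq = 5 is not disturbed by floating-point rounding.

```python
from fractions import Fraction

def normal_approx_ok(n, p, threshold=5):
    """Return (np, nq, ok) for the rule of thumb np >= threshold and nq >= threshold."""
    q = 1 - p
    expected_successes = n * p
    expected_failures = n * q
    ok = expected_successes >= threshold and expected_failures >= threshold
    return expected_successes, expected_failures, ok

# p = 0.95 written as an exact fraction.
p = Fraction(95, 100)
successes, failures, ok = normal_approx_ok(100, p)
print(successes, failures, ok)
# prints: 95 5 True
```

With a slightly smaller sample, for example n = 99, the expected number of failures drops below 5 and the same helper returns False.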


2. What is the smallest value of n for which the sampling distribution of p is approximately normal?

np ≥ 5
np = n x p

n x 0.95 ≥ 5
0.95n ≥ 5
n ≥ 5/0.95
n ≥ 5.26315


n(1 - p) ≥ 5 or nq ≥ 5
nq = n x q

n x 0.05 ≥ 5
0.05n ≥ 5
n ≥ 5/0.05
n ≥ 100


From the first condition, n must be at least 6 (5.26315 rounded up to the next whole number).
From the second condition, n must be at least 100.
The smallest value of n that satisfies both conditions is 100. Thus, 100 is the minimum sample size needed for the sampling distribution of p to be approximately normal. With a sample of only 6, we would expect many successes but almost no failures (6 x 0.05 = 0.3 on average), which would make the approximation unreliable. A sample size of 100 ensures enough of both successes and failures for the normal distribution to approximate the sampling distribution well.
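The two bounds can also be computed directly. The Python sketch below solves each inequality for n, rounds up to a whole number, and takes the larger of the two. Exact fractions are again an illustrative choice to keep 5/0.05 at exactly 100.

```python
import math
from fractions import Fraction

p = Fraction(95, 100)
q = 1 - p
threshold = 5

# np >= 5  ->  n >= 5 / p, rounded up to a whole number.
n_from_successes = math.ceil(threshold / p)

# nq >= 5  ->  n >= 5 / q, rounded up to a whole number.
n_from_failures = math.ceil(threshold / q)

# Both conditions must hold, so the larger bound decides.
smallest_n = max(n_from_successes, n_from_failures)
print(n_from_successes, n_from_failures, smallest_n)
# prints: 6 100 100
```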




The sample mean x̄ from a group of observations is an estimate of the population mean μ. Given a sample of size n, consider n independent random variables X1, X2, ..., Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean μ and standard deviation σ.
A. Population mean = (8+14+16+10+11)/__
B. Sample of size n= ___
C. 
Mean of sample distribution: ____
sample 1=
sample 2=
sample 3 and so on and so forth…
And the standard error: σm = σ/√n = 4.4/√5 =
D. I am looking for a table with the following variables: X, x=u, and (x-u)^2



The table has the variables X, x=u, and (x-u)^2.

X represents the individual data points or observations from the population or sample. x=u represents the average value of the data set, calculated by summing all the values and dividing by the number of values. (x-u)^2 represents the squared difference between each individual observation (X) and the population mean (u). It measures how far each observation is from the mean.
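A table of this kind could be built as in the Python sketch below, using the five values from part A. The column layout is an assumption: it lists the mean u, the deviation X - u, and the squared deviation (X - u)^2 side by side, since the middle column of the requested table can be read either as the mean or as the deviation from it.

```python
population = [8, 14, 16, 10, 11]

# Mean of the data set: sum of the values divided by the number of values.
mu = sum(population) / len(population)

# One row per observation: X, mean, deviation, squared deviation.
rows = [(x, mu, x - mu, (x - mu) ** 2) for x in population]

print(f"{'X':>4} {'u':>6} {'X-u':>6} {'(X-u)^2':>8}")
for x, m, dev, dev_sq in rows:
    print(f"{x:>4} {m:>6.1f} {dev:>6.1f} {dev_sq:>8.2f}")

# The squared deviations add up to the numerator of the variance.
total_sq = sum(row[3] for row in rows)
print(f"sum of squared deviations = {total_sq:.2f}")
```

The final sum, 40.80, is the same total that appears inside the square root of the population standard deviation in part A.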





C.
From our textbook, Chapter 3: Probability, Exercise 3.4 (p. 65 in the 2nd edition)
Simulated coin tossing: is probability better done using function called rbinom than using function called sample?  Explain.




The rbinom function generates random numbers that follow a binomial distribution. It tells you how many heads you get when you flip a coin as many times as you choose. For a fair coin, each flip has a 50% chance of landing on heads and a 50% chance of landing on tails.



The sample function randomly selects elements from a specified vector, so it lets you see the result of each flip.
rbinom tells you how many heads you get after a number of flips, while sample shows the actual sequence of flips. sample is a general-purpose selection function rather than one designed around a probability distribution. If your goal is to count the number of heads, use rbinom. If you want to see what happened on each flip, use sample. In addition, sample requires an extra step to count the successes after sampling, whereas rbinom returns the count directly, which makes it simpler and more efficient for this purpose.
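The difference between the two approaches can be illustrated with a Python analogue. This is a sketch and not the R code from the exercise: in R the corresponding calls would be along the lines of `sample(c("H", "T"), 10, replace = TRUE)` and `rbinom(1, 10, 0.5)`. The seed and the number of flips are arbitrary assumptions.

```python
import random

# Assumption: fixed seed and 10 flips, chosen only for a repeatable example.
rng = random.Random(42)
n_flips = 10

# Analogue of sample(): keep every individual flip, then count heads afterwards.
flips = rng.choices(["H", "T"], k=n_flips)
heads_from_sequence = flips.count("H")

# Analogue of rbinom(): only the number of heads is produced,
# here as the sum of n independent fair-coin (Bernoulli 0.5) draws.
heads_only = sum(rng.random() < 0.5 for _ in range(n_flips))

print(flips, heads_from_sequence)
print(heads_only)
```

The first approach needs the extra counting step but preserves the sequence of flips. The second returns the count in one step and discards the sequence.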




Summary of this week's work: 

This week I learned important statistical concepts related to calculating the mean and standard deviation for both populations and samples. I explored the conditions necessary for sample proportions to be approximately normally distributed, focusing on the significance of having enough successes and failures in a sample. I also learned about using different R functions for simulating probability distributions, specifically comparing the rbinom function to the sample function. 









