Module #11 Assignment
From our textbook, Introductory Statistics with R: p. 224, Questions 12.1 and 12.3
12.1: Set up an additive model for the ashina data, which is part of the ISwR package.
The data allow for additive effects of subject, period, and treatment. Compare the results with those obtained from t tests.
Hint
ashina$subject <- factor(1:16)
attach(ashina)
act <- data.frame(vas=vas.active, subject, treat=1, period=grp)
plac <- data.frame(vas=vas.plac, subject, treat=0, period=3-grp)
12.3. Consider the following definitions
a <- gl(2, 2, 8)
b <- gl(2, 4, 8)
x <- 1:8
y <- c(1:4, 8:5)
z <- rnorm(8)
Note:
rnorm() is a built-in R function that generates a vector of normally distributed random numbers: rnorm(n) takes the sample size n as input and returns n draws, by default from the standard normal distribution (mean 0, standard deviation 1).
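As a small illustration (the seed value 42 is arbitrary and used only to make the draw reproducible):

```r
set.seed(42)     # fix the random seed so the draw is reproducible
z <- rnorm(8)    # eight draws from the standard normal, mean 0 and sd 1
length(z)        # 8
```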
Your assignment
Generate the model matrices for models z ~ a*b, z ~ a:b, etc. In your blog posting, discuss the implications. Carry out the model fits and notice which models contain singularities.
Hint: We are looking for:
model.matrix(~ a:b)
lm(z ~ a:b)
12.1
The code in the hint above sets up the linear regression model for the ashina data set.
Before continuing, here is a breakdown of the ashina data set:
Subjects: This represents each person in the study, identified by an ID; there are 16 subjects in total.
Treatment: This represents whether the subject received an active treatment or a placebo.
Period: This is the time period or group the subject was situated in during the treatment.
The goal of the analysis is to set up an additive model that includes the effects of subject, period, and treatment.
The data from act and plac will be combined to fit a linear model with an additive structure. The model will be able to predict the VAS scores based on the effects of subject, period, and treatment. This was done with the rbind function.
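Following the book's hint, the combined data frame and additive fit can be sketched as below; this is a minimal sketch assuming the ISwR package is installed, and the period = 3 - grp line reflects that the placebo score was recorded in the opposite period from the active one (grp is 1 or 2):

```r
library(ISwR)                      # provides the ashina data set

ashina$subject <- factor(1:16)     # one ID per person
attach(ashina)
act  <- data.frame(vas = vas.active, subject, treat = 1, period = grp)
plac <- data.frame(vas = vas.plac,  subject, treat = 0, period = 3 - grp)

both <- rbind(act, plac)           # stack the active and placebo rows
fit  <- lm(vas ~ subject + factor(period) + factor(treat), data = both)
summary(fit)
```

With 16 subjects measured twice, the stacked data frame has 32 rows, and the model estimates an intercept, 15 subject contrasts, one period contrast, and one treatment contrast.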
A t-test will be performed to compare the means of the VAS scores between the active and placebo treatments.
Interpretation of Results (model summary output):
The summary shows several pieces of information. For example, the estimated coefficient of -42.87 corresponds to a decrease of about 42.9 points in the VAS score for that term. The F-statistic and its p-value show the model as a whole to be statistically significant.
The R-squared value measures the proportion of the variance in the response variable that is explained by the predictors in the model; here it indicates that 75.66% of the variability in vas is accounted for by subject, period, and treatment.
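The two t tests can be sketched as follows, again assuming the attached ashina data from the ISwR package; the period test here compares the active-minus-placebo differences between the two treatment-order groups, which is one standard way to test a period effect in a two-period crossover:

```r
library(ISwR)
attach(ashina)

# Treatment effect: paired comparison of active vs. placebo scores
t.test(vas.active, vas.plac, paired = TRUE)

# Period effect: compare the active-minus-placebo differences between
# the two order groups (grp codes which treatment came first)
dd <- vas.active - vas.plac
t.test(dd[grp == 1], dd[grp == 2])
```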
Code results for the two t tests on the period and treat variables:
For the period comparison, the 95% confidence interval for the difference in means runs from -64.613 to 2.347. Since this interval contains zero and the p-value is greater than 0.05, there is only limited evidence of a difference in mean VAS scores between the period groups.
For the treatment comparison, the 95% confidence interval for the difference in means runs from 7.05 to 78.70. Since this interval excludes zero and the p-value is below 0.05, the difference in mean VAS scores between the active and placebo treatments is statistically significant.
Summary of Interpretation:
Period Groups: No statistically significant difference in VAS scores was found between the period groups, since the confidence interval includes zero and the p-value is greater than 0.05.
Treatment Groups: There is a statistically significant difference in VAS scores between active and placebo treatments, as indicated by the p-value and the positive confidence interval range.
12.3
Model 1
model.matrix() generates the design matrix for the specified model formula (~ a*b). The a*b formula contains the main effects of a and b together with their interaction; if a and b interact, the effect of one factor depends on the level of the other.
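Using the definitions from 12.3 above (with an arbitrary seed added so z is reproducible), the matrix and fit for Model 1 can be sketched as:

```r
a <- gl(2, 2, 8)          # factor levels: 1 1 2 2 1 1 2 2
b <- gl(2, 4, 8)          # factor levels: 1 1 1 1 2 2 2 2
set.seed(1); z <- rnorm(8)

model.matrix(~ a * b)     # columns: (Intercept), a2, b2, a2:b2
lm(z ~ a * b)             # main effects plus the a:b interaction
```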
Model 2
This model builds a design matrix for the interaction between a and b only, without considering their individual main effects.
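A sketch of the pure-interaction form (same definitions, arbitrary seed); note that the design matrix gains a redundant column:

```r
a <- gl(2, 2, 8); b <- gl(2, 4, 8)
set.seed(1); z <- rnorm(8)

model.matrix(~ a:b)   # intercept plus four cell-indicator columns
lm(z ~ a:b)           # only four distinct cells, so one coefficient is NA
```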
Model 3
This model includes the main effects of a and b, but no interaction. It assumes that a and b independently affect z. The effect of a on z doesn't depend on the level of b.
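The additive form can be sketched the same way (same definitions, arbitrary seed):

```r
a <- gl(2, 2, 8); b <- gl(2, 4, 8)
set.seed(1); z <- rnorm(8)

model.matrix(~ a + b)   # columns: (Intercept), a2, b2
lm(z ~ a + b)           # additive model, no interaction column
```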
Model 4
This model adds y to the additive model. It looks at the main effects of a, b, and y, treating them independently of each other.
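A sketch of the additive model with y included as a numeric covariate (same definitions, arbitrary seed):

```r
a <- gl(2, 2, 8); b <- gl(2, 4, 8)
y <- c(1:4, 8:5)
set.seed(1); z <- rnorm(8)

model.matrix(~ a + b + y)   # adds a numeric column for y
lm(z ~ a + b + y)
```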
Model 5
This model includes the interaction between a and b and also adds y as a main effect. The a:b interaction is considered, and the effect of y is modeled independently.
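A sketch of the full-factorial model with y added (same definitions, arbitrary seed):

```r
a <- gl(2, 2, 8); b <- gl(2, 4, 8)
y <- c(1:4, 8:5)
set.seed(1); z <- rnorm(8)

model.matrix(~ a * b + y)   # a and b main effects, y, and the a:b column
lm(z ~ a * b + y)
```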
Singularities occur when one predictor column in the design matrix is an exact linear combination of the others, meaning the columns carry redundant information.
The function alias() in R identifies such dependent terms in a fitted linear model. Aliasing occurs when the model cannot distinguish between linearly dependent predictors. For example, alias(lm(z ~ a:b)) lists the coefficient that is completely aliased in that fit.
Fitting the models shows that the full-factorial and additive models contain no singularities: their design matrices have full rank, so the coefficients are well defined and interpretable. The pure-interaction model z ~ a:b, however, is singular: its design matrix has an intercept plus four cell-indicator columns but only four distinct cells, so lm() leaves one coefficient as NA and reports "1 not defined because of singularities". When a singularity is present, the model cannot separate the individual effects of the dependent columns, so some regression coefficients cannot be computed.
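The singularity check across all five fits can be sketched in one pass (definitions repeated so the block is self-contained; an NA coefficient marks an aliased term):

```r
a <- gl(2, 2, 8); b <- gl(2, 4, 8)
y <- c(1:4, 8:5)
set.seed(1); z <- rnorm(8)

fits <- list(full  = lm(z ~ a * b),
             inter = lm(z ~ a:b),
             add   = lm(z ~ a + b),
             addy  = lm(z ~ a + b + y),
             fully = lm(z ~ a * b + y))

# Count the NA (aliased) coefficients in each fit
counts <- sapply(fits, function(m) sum(is.na(coef(m))))
counts   # only 'inter' (z ~ a:b) has an aliased coefficient
```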