Module #7 Assignment
1. In this segment of the assignment, we will use the following regression equation: Y = a + bX + e
Where:
Y is the value of the Dependent variable (Y), what is being predicted or explained
a or Alpha, a constant; equals the value of Y when the value of X=0
b or Beta, the coefficient of X; the slope of the regression line; how much Y changes for each one-unit change in X.
X is the value of the Independent variable (X), what is predicting or explaining the value of Y
e is the error term; the error in predicting the value of Y, given the value of X (it is not displayed in most regression equations).
The data in this assignment:
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
1.1 Define the relationship model between the predictor (x) and the response (Y) variable:
The relationship model between the predictor and the response variable is a linear regression model. Here, the predictor X is the independent variable, and the response variable Y is dependent on X; the data show a positive, roughly linear relationship between them.
We assume a linear regression model because changes in X are expected to produce straight-line changes in Y. This model is useful for analyzing and predicting Y based on changes in X, making it well suited to data that follow a linear trend.
1.2 Calculate the coefficients
The coefficients are:
a = 19.205597
b = 3.269107
Substituted into the formula, this gives Y = 19.205597 + 3.269107X.
The a value is the intercept. This means that when X = 0, the predicted value of Y is 19.205597.
The b value is the slope. For each one-unit increase in X, the predicted value of Y increases by 3.269107. The positive slope shows a direct relationship between X and Y: as X goes up, the predicted Y goes up.
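As a minimal sketch of how these coefficients can be obtained, the block below fits the model with lm() on the assignment data and cross-checks the result against the closed-form least-squares formulas. The variable names fit, coefs, a_manual and b_manual are chosen here for illustration only.

```r
# Data from the assignment
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit Y = a + bX by ordinary least squares
fit <- lm(y ~ x)
coefs <- coef(fit)   # named vector: "(Intercept)" is a, "x" is b
print(coefs)

# Cross-check with the closed-form least-squares formulas:
#   b = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   a = mean(y) - b * mean(x)
b_manual <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_manual <- mean(y) - b_manual * mean(x)
print(c(a = a_manual, b = b_manual))
```

Both approaches should agree with the a and b values reported above, up to rounding.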
2. Problem
Apply the simple linear regression model (see the above formula) to the data set called "visit" (see below), and estimate the discharge duration if the waiting time since the last eruption has been 80 minutes.
> head(visit)
  discharge waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
Use the formula discharge ~ waiting with data = visit, that is, lm(discharge ~ waiting, data = visit).
2.1 Define the relationship model between the predictor and the response variable.
The relationship model between the predictor and the response variable is simple linear regression.
Waiting time is the independent variable (predictor), while discharge duration is the dependent variable (response). The model assumes that the predicted discharge duration changes linearly with waiting time.
2.2 Extract the parameters of the estimated regression equation with the coefficients function.
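A rough sketch of this step follows. It assumes that only the six rows printed by head(visit) are available, so the data frame below is a stand-in for the full "visit" data set, and the estimates it prints apply to those six rows only. The names visit_fit and visit_coefs are illustrative.

```r
# Assumption: these are only the six rows shown by head(visit);
# the full "visit" data set is not reproduced in this post.
visit <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)

# Fit the simple linear regression of discharge on waiting
visit_fit <- lm(discharge ~ waiting, data = visit)

# coefficients() returns a named vector:
# "(Intercept)" is a and "waiting" is b in Y = a + bX
visit_coefs <- coefficients(visit_fit)
print(visit_coefs)
```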
2.3 Determine the fit of the eruption duration using the estimated regression equation.
The estimated discharge duration for a waiting time of 80 minutes is 3.87 minutes. In other words, if the waiting time is 80 minutes, we expect the geyser to erupt for about 3.87 minutes. The value is obtained by substituting waiting = 80 into the estimated regression equation.
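The prediction can be sketched as below, again assuming the model is fitted to just the six rows shown by head(visit); on those rows the result appears to come out at about 3.87 minutes, in line with the figure above. The names new_wait, pred_80 and by_hand are illustrative.

```r
# Assumption: only the six rows shown by head(visit) are used here.
visit <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)
visit_fit <- lm(discharge ~ waiting, data = visit)

# predict() expects a data frame whose column name matches the predictor
new_wait <- data.frame(waiting = 80)
pred_80 <- predict(visit_fit, newdata = new_wait)
print(round(pred_80, 2))

# Same calculation by hand: a + b * 80
cf <- coef(visit_fit)
by_hand <- cf[["(Intercept)"]] + cf[["waiting"]] * 80
print(by_hand)
```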
3. Multiple regression
We will use a very famous dataset in R called mtcars. This dataset was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
This data frame contains 32 observations on 11 (numeric) variables.
[, 1] | mpg | Miles/(US) gallon |
[, 2] | cyl | Number of cylinders |
[, 3] | disp | Displacement (cu.in.) |
[, 4] | hp | Gross horsepower |
[, 5] | drat | Rear axle ratio |
[, 6] | wt | Weight (1000 lbs) |
[, 7] | qsec | 1/4 mile time |
[, 8] | vs | Engine (0 = V-shaped, 1 = straight) |
[, 9] | am | Transmission (0 = automatic, 1 = manual) |
[,10] | gear | Number of forward gears |
[,11] | carb | Number of carburetors |
To call mtcars data in R
R comes with several built-in data sets, which are generally used as demo data for playing with R functions. One of the datasets built into R is mtcars.
In this question, we will use 4 of the variables found in mtcars, selected with the following code:
input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))
3.1 Examine the multiple regression model stated above and its coefficients using 4 variables from mtcars (mpg, disp, hp and wt).
Report on the result and explain what the multiple regression model and its coefficients tell us about the data.
input <- mtcars[,c("mpg","disp","hp","wt")]
lm(formula = mpg ~ disp + hp + wt, data = input)
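One way to inspect the fitted model is sketched below: store the lm() result, then read off the coefficients and the summary. Each slope is the estimated change in mpg for a one-unit increase in that predictor with the other two held fixed; on the built-in mtcars data all three slopes should come out negative. The names model, cf and new_car, and the values in new_car, are hypothetical and used only for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Store the fitted model so it can be inspected and reused
model <- lm(mpg ~ disp + hp + wt, data = input)

# Intercept plus one slope per predictor
cf <- coef(model)
print(cf)

# Standard errors, t-tests and R-squared for the fit
print(summary(model))

# Hypothetical car (illustrative values, not taken from the assignment)
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)
print(predict(model, newdata = new_car))
```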
With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg?
The rmr data set is in the book's R package, ISwR, so make sure to install it first. After installing the ISwR package, load it with library(ISwR) to access rmr.
With the rmr data set, the predicted metabolic rate for a body weight of 70 kg is roughly 1305.394. This value comes from fitting a linear regression of metabolic rate on body weight and evaluating the fitted line at 70 kg. Unfortunately, the model output alone gave only a semblance of an answer and was not specific enough, so the prediction has to be requested explicitly for 70 kg. The plot visualizes the relationship between body weight and metabolic rate, and the predicted value is the model's estimate of daily calorie expenditure at that weight.
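The steps described above might look like the following sketch. It assumes the ISwR package is installed and that rmr has the columns body.weight and metabolic.rate; the names rmr_fit and pred_70 are illustrative.

```r
# Assumption: ISwR is installed, e.g. via install.packages("ISwR")
if (requireNamespace("ISwR", quietly = TRUE)) {
  data("rmr", package = "ISwR")

  # Scatter plot of metabolic rate against body weight
  plot(metabolic.rate ~ body.weight, data = rmr,
       xlab = "Body weight (kg)", ylab = "Metabolic rate")

  # Fit the linear regression and overlay the fitted line
  rmr_fit <- lm(metabolic.rate ~ body.weight, data = rmr)
  abline(rmr_fit)

  # Predicted metabolic rate at a body weight of 70 kg
  pred_70 <- predict(rmr_fit, newdata = data.frame(body.weight = 70))
  print(pred_70)
} else {
  message("ISwR is not installed; run install.packages('ISwR') first.")
}
```

Printing rmr_fit on its own shows only the intercept and slope, so predict() is what turns the fitted line into a value for a specific body weight.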