Module #7 Assignment


1. In this assignment's segment, we will use the following regression equation: Y = a + bX + e

Where:
Y is the value of the Dependent variable (Y), what is being predicted or explained

a or Alpha, a constant; equals the value of Y when the value of X=0

b or Beta, the coefficient of X; the slope of the regression line; how much Y changes for each one-unit change in X.

X is the value of the Independent variable (X), what is predicting or explaining the value of Y

e is the error term; the error in predicting the value of Y, given the value of X (it is not displayed in most regression equations).


The data in this assignment:

x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)


1.1 Define the relationship model between the predictor (x) and the response (Y) variable:

The relationship model between the predictor X and the response variable Y is a linear regression model. Here, the predictor X is the independent variable, and the data show a roughly linear, positive relationship with the response variable Y, which is dependent on X.

We assume a linear regression model because each one-unit change in X is expected to produce a constant, straight-line change in Y. This model is useful for analyzing and predicting Y based on changes in X, making it well suited to data that follow a linear trend.
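One quick way to check how linear the relationship is would be the Pearson correlation between the two vectors. The snippet below is only a sketch using the assignment's x and y; the variable name r is my own choice.

```r
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Pearson correlation: values near +1 or -1 suggest a strong linear trend,
# values near 0 suggest little linear association
r <- cor(x, y)
print(round(r, 3))
```

A positive value well above zero supports using a straight-line model, although it does not prove the relationship is exactly linear.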



1.2 Calculate the coefficients



The coefficients are:

a = 19.205597

b = 3.269107


In the formula, it would be Y = 19.205597 + 3.269107X.

The a value is the intercept. This means that when X = 0, the predicted value of Y is 19.205597. 

The b value is the slope. This indicates that for each one-unit increase in X, the predicted value of Y increases by 3.269107. This positive slope shows a positive linear relationship between X and Y: as X goes up, the predicted Y increases.
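The coefficients above can be reproduced with lm(). This is a minimal sketch using the x and y vectors from this assignment; the names fit, coefs, b_manual and a_manual are mine, and the hand calculation is included only to show where the numbers come from.

```r
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit Y = a + bX by ordinary least squares
fit <- lm(y ~ x)
coefs <- coef(fit)   # named vector: "(Intercept)" is a, "x" is b
print(coefs)

# The same estimates from the least-squares formulas
b_manual <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_manual <- mean(y) - b_manual * mean(x)
print(c(a = a_manual, b = b_manual))
```

Both approaches should agree, since lm() solves the same least-squares problem.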



Problem -

Apply the simple linear regression model (see the above formula) to the data set called "visit" (see below), and estimate the discharge duration if the waiting time since the last eruption has been 80 minutes.
> head(visit) 
  discharge  waiting 
1     3.600      79 
2     1.800      54 
3     3.333      74 
4     2.283      62 
5     4.533      85 
6     2.883      55 


Employ the following formula: discharge ~ waiting, with data = visit.

2.1 Define the relationship model between the predictor and the response variable.


The relationship model between the predictor and the response variable is simple linear regression. 

Waiting time is the independent variable (predictor), while the discharge duration is the dependent variable (response). This model assumes that a change in waiting time is associated with a linear change in the expected discharge duration.



2.2 Extract the parameters of the estimated regression equation with the coefficients function.




The intercept represents the expected discharge duration when the waiting time is zero minutes.
The slope indicates how much discharge duration is expected to increase for each additional minute of waiting time. A slope of 0.067557 means that for every minute of waiting time added, the discharge duration is expected to increase by 0.067557 minutes.
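As a sketch of this step, the code below rebuilds a small visit data frame and extracts the coefficients. It assumes that only the six rows printed by head(visit) are available, since the full data set is not shown in this post. A fit on those six rows gives a slope of about 0.0676, consistent with the value quoted above. The names visit_fit and visit_coefs are my own.

```r
# ASSUMPTION: only the six rows shown by head(visit) are used here
visit <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)

# Simple linear regression: discharge explained by waiting
visit_fit <- lm(discharge ~ waiting, data = visit)

# coefficients() returns the intercept and the slope for waiting
visit_coefs <- coefficients(visit_fit)
print(visit_coefs)
```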


2.3 Determine the fit of the eruption duration using the estimated regression equation.




The estimated discharge duration for a waiting time of 80 minutes is 3.87 minutes. This means if the waiting time is 80 minutes, we expect the geyser to erupt for 3.87 minutes. This is computed through the use of the regression equation. 
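The prediction can be sketched with predict(). As before, this assumes the six rows from head(visit) stand in for the full data set, and the names new_obs, pred_80 and by_hand are only illustrative.

```r
# ASSUMPTION: only the six rows shown by head(visit) are used here
visit <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)
visit_fit <- lm(discharge ~ waiting, data = visit)

# Predict the discharge duration for a waiting time of 80 minutes
new_obs <- data.frame(waiting = 80)
pred_80 <- predict(visit_fit, newdata = new_obs)
print(round(pred_80, 2))

# The same value by hand: intercept + slope * 80
by_hand <- unname(coef(visit_fit)[1] + coef(visit_fit)[2] * 80)
print(round(by_hand, 2))
```

The column name in new_obs has to match the predictor name in the formula (waiting), otherwise predict() cannot find the value to plug in.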


3.  Multiple regression

We will use a very famous data set in R called mtcars. This data set was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

This data frame contains 32 observations on 11 (numeric) variables.

[, 1]  mpg   Miles/(US) gallon
[, 2]  cyl   Number of cylinders
[, 3]  disp  Displacement (cu.in.)
[, 4]  hp    Gross horsepower
[, 5]  drat  Rear axle ratio
[, 6]  wt    Weight (1000 lbs)
[, 7]  qsec  1/4 mile time
[, 8]  vs    Engine (0 = V-shaped, 1 = straight)
[, 9]  am    Transmission (0 = automatic, 1 = manual)
[,10]  gear  Number of forward gears
[,11]  carb  Number of carburetors

To call mtcars data in R
R comes with several built-in data sets, which are generally used as demo data for playing with R functions. One of the data sets built into R is mtcars.
In this question, we will use 4 of the variables found in mtcars by using the following code:

input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))



3.1 Examine the multiple regression model stated above and its coefficients, using 4 variables from mtcars (mpg, disp, hp and wt).
Report the result and explain what the multiple regression model and its coefficients tell us about the data.
 

input <- mtcars[,c("mpg","disp","hp","wt")]  
lm(formula = mpg ~ disp + hp + wt, data = input)



The Y-intercept is 37.105. The coefficients for disp, hp, and wt are all negative, so mpg is negatively related to each of them. With this information, we can conclude that, holding the other two predictors constant, an increase in disp, hp, or wt is associated with a decrease in predicted mpg. The multiple regression model helps us understand how displacement, horsepower, and weight relate to fuel efficiency in cars. For example, the model tells us that heavier cars and those with more horsepower tend to have lower mpg.

In summary: 
As displacement increases, mpg decreases.
As horsepower increases, mpg decreases.
As weight increases, mpg decreases.
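To see the numbers behind this summary, the fitted model can be stored and its coefficients pulled out. This is a small sketch built on the same lm() call as above; the names model and cf are my own.

```r
# mtcars ships with base R, so no extra package is needed
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Multiple regression: mpg explained by disp, hp and wt together
model <- lm(mpg ~ disp + hp + wt, data = input)

# Intercept plus one slope per predictor
cf <- coef(model)
print(cf)

# Sign of each slope: -1 means mpg is predicted to fall as that predictor rises
print(sign(cf[c("disp", "hp", "wt")]))
```

Each slope is interpreted with the other two predictors held fixed, which is why the values can differ from what three separate simple regressions would give.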



4.  From our textbook pp. 124, 6.5-Exercises # 6.1
With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg? 
The data set rmr is in R; make sure to install the book's R package, ISwR. After installing the ISwR package, here is a simple illustration of how to set up the problem.


Model

Code to find exact predicted metabolic rate



With the rmr data set, the predicted metabolic rate for a body weight of 70 kg is roughly 1305.394. The plot gives a visualization of the relationship between body weight and metabolic rate, but reading the value off the fitted line gave only an approximate answer that was not specific enough. Evaluating the fitted model at a body weight of 70 kg with code gives the exact prediction. The output is the estimated daily calorie expenditure based on the model.
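A minimal sketch of the prediction step is shown below. It assumes the ISwR package is installed and that the rmr columns are named body.weight and metabolic.rate, as in the package documentation. The guard simply skips the fit when the package is missing, and the names rmr_fit and pred_70 are my own.

```r
# ASSUMPTION: ISwR is installed; otherwise this block does nothing
if (requireNamespace("ISwR", quietly = TRUE)) {
  data("rmr", package = "ISwR")

  # Linear regression of metabolic rate on body weight
  rmr_fit <- lm(metabolic.rate ~ body.weight, data = rmr)

  # Predicted metabolic rate for a body weight of 70 kg
  pred_70 <- predict(rmr_fit, newdata = data.frame(body.weight = 70))
  print(pred_70)
}
```

If the package is not installed, running install.packages("ISwR") first should make the block work.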






Summary:


From this week's lesson, I learned about linear and multiple regression techniques. By learning these techniques, I gained a better understanding of modeling the relationship between a dependent variable and one or more independent variables. I also learned more about data visualization through the use of scatter plots to identify relationships between variables. Using what I learned, I plan to apply these concepts throughout the course to improve my assignments and understanding.


