R Multiple Regression
- Multiple regression extends simple linear regression to model the relationship between one response variable and two or more predictor variables.
- In simple linear regression we have one predictor and one response variable, but in multiple regression we have more than one predictor variable and one response variable.
- The general mathematical equation for multiple regression is y = a + b1*x1 + b2*x2 + ... + bn*xn, where:
- y is the response variable.
- a, b1, b2, ..., bn are the coefficients (a is the intercept).
- x1, x2, ..., xn are the predictor variables.
- We create the regression model using the lm() function in R.
- The model determines the value of the coefficients using the input data.
- Next we can predict the value of the response variable for a given set of predictor variables using these coefficients.
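The workflow above can be sketched on simulated data. The sample size, the true coefficients (1, 2 and -3) and the variable names below are assumptions chosen for illustration; they are not part of the mtcars example used later.

```r
# Simulate data from a known linear relationship: y = 1 + 2*x1 - 3*x2 + noise
set.seed(42)                       # make the random draws reproducible
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n, sd = 0.1)
sim <- data.frame(y, x1, x2)

# Fit the multiple regression; lm() estimates a, b1 and b2 from the data
fit <- lm(y ~ x1 + x2, data = sim)
print(coef(fit))                   # estimates should be close to 1, 2 and -3

# Predict the response for a new set of predictor values
new_obs <- data.frame(x1 = 0.5, x2 = -1)
print(predict(fit, newdata = new_obs))  # close to 1 + 2*0.5 - 3*(-1) = 5
```

Because the noise is small relative to the signal, the estimated coefficients land near the values used to generate the data, which is a quick way to check that the model is specified as intended.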
lm() Function
- This function fits the relationship model between the predictor variables and the response variable.
Syntax
- The basic syntax for the lm() function in multiple regression is lm(formula, data), for example lm(y ~ x1 + x2 + x3, data).
The parameters are described below:
- formula is a formula object describing the relation between the response variable and the predictor variables, with the response on the left of ~ and the predictors on the right.
- data is the data frame whose columns the formula refers to.
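As a minimal sketch of these two arguments, the formula can be built as an object in its own right and passed to lm() together with a data frame. The data frame `df`, its values and its column names are hypothetical.

```r
# A small hypothetical data frame: one response (y) and two predictors
df <- data.frame(
  y  = c(3.1, 4.9, 7.2, 8.8, 11.1, 13.0),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# The formula reads "y is modelled by x1 and x2"; the intercept is implicit
f <- y ~ x1 + x2
print(class(f))      # "formula"
print(all.vars(f))   # the variable names the formula refers to

# lm() looks up the formula's variables as columns of the data frame
fit <- lm(formula = f, data = df)
print(names(coef(fit)))
```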
Example
Input Data
- Consider the data set "mtcars" available in the R environment.
- It compares different car models in terms of miles per gallon ("mpg"), engine displacement ("disp"), horsepower ("hp"), weight of the car ("wt") and some more parameters.
- The goal of the model is to establish the relationship between "mpg" as the response variable and "disp", "hp" and "wt" as the predictor variables.
- We create a subset of these variables from the mtcars data set for this purpose.
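A minimal sketch of this subsetting step follows; the variable name `input` is an assumption.

```r
# Keep only the response (mpg) and the three predictors from mtcars
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Show the first six rows of the subset
print(head(input))
```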
Calling head() on the subset prints its first six rows, with the columns mpg, disp, hp and wt.
Create Relationship Model & get the Coefficients
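One way to fit the model and pull out the coefficients is sketched below. The object names (`input`, `model`, `a`, `b_disp` and so on) are assumptions, not names fixed by R.

```r
# Subset of mtcars with the response and the three predictors
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Fit mpg as a linear function of disp, hp and wt
model <- lm(mpg ~ disp + hp + wt, data = input)
print(model)

# Extract the intercept and the coefficient of each predictor by name
coefs  <- coef(model)
a      <- coefs["(Intercept)"]
b_disp <- coefs["disp"]
b_hp   <- coefs["hp"]
b_wt   <- coefs["wt"]
print(coefs)
```

Indexing coef(model) by name rather than by position keeps the code correct if the order of the predictors in the formula changes.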
Printing the fitted model shows the intercept and one coefficient for each of disp, hp and wt.
Create Equation for Regression Model
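- Based on the intercept and coefficient values of the fitted model, we create the mathematical equation mpg = a + b1*disp + b2*hp + b3*wt, where a is the intercept and b1, b2 and b3 are the coefficients of disp, hp and wt.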
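The equation can also be assembled directly from the model object rather than copied by hand. This sketch refits the model so that it runs on its own; the rounding to four significant digits and the name `equation` are arbitrary choices.

```r
# Refit the model so this block is self-contained
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
b <- signif(coef(model), 4)

# Build a readable equation string: mpg = a + b1*disp + b2*hp + b3*wt
equation <- sprintf("mpg = %s + (%s)*disp + (%s)*hp + (%s)*wt",
                    b["(Intercept)"], b["disp"], b["hp"], b["wt"])
print(equation)
```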
Apply Equation for predicting New Values
- We can use the regression equation created above to predict the mileage when a new set of values for displacement, horsepower and weight is provided.
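As a sketch of this step, the prediction can be computed by hand from the coefficients and checked against predict(). The new car's values (disp = 221, hp = 102, wt = 2.91) are illustrative assumptions.

```r
# Refit the model so this block is self-contained
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
b <- coef(model)

# Hypothetical new car: displacement, horsepower and weight (in 1000 lbs)
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)

# Apply the regression equation by hand
manual <- b["(Intercept)"] + b["disp"] * new_car$disp +
          b["hp"] * new_car$hp + b["wt"] * new_car$wt

# predict() evaluates the same equation from the fitted model
predicted <- predict(model, newdata = new_car)

print(unname(manual))
print(unname(predicted))
```

Both values should agree. In practice predict() is usually the safer choice, since it uses the unrounded coefficients and matches predictors by column name.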