Regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables.
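library(UsingR) # assumption: galton is the data set shipped with the UsingR package
plot(jitter(child, 4) ~ parent, data = galton) # scatter plot of y ~ x; jitter() separates overlapping points
regrline <- lm(child ~ parent, data = galton) # lm = linear model, fitted as y ~ x
abline(regrline, lwd = 3, col = 'red') # draw the fitted regression line on the plot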
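COMMON MISTAKE: Reading lm(x ~ y, data) as “x is the input, y is the output”. It's the other way round: the variable to the left of ~ is the dependent variable (the response) and the variable to the right is the independent variable (the predictor), so here y is the independent variable.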
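A quick way to see which side of the formula is which is to look at the coefficient names of a fitted model. The sketch below assumes R's built-in `cars` data set (columns `speed` and `dist`) in place of galton, so that it runs without extra packages; the object names `fit_dist` and `fit_speed` are only illustrative.

```r
# Built-in data set `cars`: speed and stopping distance.
# dist is on the left of ~, so it is the response; speed is the predictor.
fit_dist <- lm(dist ~ speed, data = cars)
names(coef(fit_dist))   # the slope coefficient is named after the predictor

# Swapping the two sides fits a different model: speed as a function of dist.
fit_speed <- lm(speed ~ dist, data = cars)
names(coef(fit_speed))
```

The slope is always labelled with the variable on the right-hand side, which is the independent variable.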
The fitted regression line always passes through (mean(x), mean(y)), provided the model includes an intercept (the default in lm).
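This property can be checked numerically. The sketch below again assumes the built-in `cars` data set rather than galton, and the variable names are illustrative only: it evaluates the fitted line at mean(x) and compares the result with mean(y).

```r
fit <- lm(dist ~ speed, data = cars)

x_bar <- mean(cars$speed)
y_bar <- mean(cars$dist)

# Height of the fitted line at x = mean(x)
y_at_x_bar <- predict(fit, newdata = data.frame(speed = x_bar))

# TRUE if the line passes through (mean(x), mean(y)) up to rounding error
passes_through_means <- isTRUE(all.equal(unname(y_at_x_bar), y_bar))
passes_through_means
```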
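The slope of the regression line of y on x is:
$$ \text{Slope } (\beta_{1}) = r_{xy} \times \frac{\sigma_{y}}{\sigma_{x}} $$
where $r_{xy}$ is the correlation between x and y, and $\sigma_{x}$, $\sigma_{y}$ are their standard deviations.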
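As a sanity check, the slope formula can be compared with the slope that lm reports. This is a minimal sketch assuming the built-in `cars` data set, with `x` and `y` chosen here purely for illustration.

```r
x <- cars$speed
y <- cars$dist

# Slope from the formula: correlation times the ratio of standard deviations
slope_formula <- cor(x, y) * sd(y) / sd(x)

# Slope estimated by lm for the regression of y on x
slope_lm <- unname(coef(lm(y ~ x))["x"])

# TRUE if the two agree up to rounding error
slopes_match <- isTRUE(all.equal(slope_formula, slope_lm))
slopes_match
```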
residuals(model)
model$residuals
Both of these return the residuals of a fitted model, i.e. the observed values minus the fitted values.
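The two accessors can be compared directly, and the residuals can also be rebuilt by hand. The sketch below assumes a model fitted to the built-in `cars` data set; `model`, `r1`, `r2` and `manual` are illustrative names.

```r
model <- lm(dist ~ speed, data = cars)

r1 <- residuals(model)   # accessor function
r2 <- model$residuals    # component stored in the fitted object

# Residual = observed value minus fitted value
manual <- cars$dist - fitted(model)

same_accessors <- isTRUE(all.equal(r1, r2))
same_as_manual <- isTRUE(all.equal(unname(r1), unname(manual)))
c(same_accessors, same_as_manual)
```

For this data set, which has no missing values, both comparisons should come out TRUE.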
OLS (Ordinary Least Squares) finds the best-fit line by minimising the sum of squared residuals (errors). lm fits the model by least squares, so this minimisation is done for you.
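To make the minimisation concrete, the sketch below computes the sum of squared residuals (SSR) for the lm estimates and for slightly shifted coefficients. It assumes the built-in `cars` data set; the helper `ssr` and the shift sizes are hypothetical choices made for illustration.

```r
# Sum of squared residuals for a candidate line with intercept b0 and slope b1
ssr <- function(b0, b1) sum((cars$dist - (b0 + b1 * cars$speed))^2)

fit <- lm(dist ~ speed, data = cars)
b <- unname(coef(fit))          # b[1] = intercept, b[2] = slope
ssr_ols <- ssr(b[1], b[2])

# Moving either coefficient away from the lm estimate increases the SSR
worse_intercept <- ssr(b[1] + 1, b[2])
worse_slope <- ssr(b[1], b[2] + 0.1)
c(ssr_ols < worse_intercept, ssr_ols < worse_slope)

# The SSR at the lm estimates equals the sum of the squared model residuals
ssr_matches <- isTRUE(all.equal(ssr_ols, sum(residuals(fit)^2)))
ssr_matches
```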