For this exercise, we will use the variables igf1, age and tanner belonging to the dataset juul from the package ISwR.

EXE 4.1

Multiple regression

First fit a simple regression of igf1 on each predictor separately, then a multiple regression on both predictors, and inspect how the parameter estimates change:

## (Intercept)         age 
##  360.860182   -1.191814
## (Intercept)      tanner 
##   165.23424    64.95695
## (Intercept)         age      tanner 
##  194.118206   -4.832264   76.534903
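A minimal sketch of how coefficients of this kind can be obtained. The object names (fit_age, fit_tanner, fit_both) are illustrative, and fitting each model on the full juul data frame is an assumption about how the output above was produced.

```r
library(ISwR)   # provides the juul data set
data(juul)

# Simple regressions: one predictor at a time
fit_age    <- lm(igf1 ~ age,    data = juul)
fit_tanner <- lm(igf1 ~ tanner, data = juul)

# Multiple regression: both predictors together
fit_both   <- lm(igf1 ~ age + tanner, data = juul)

coef(fit_age)
coef(fit_tanner)
coef(fit_both)
```

Each lm call drops the rows that have a missing value in its own variables, so the three models are not necessarily fitted on the same observations. Keep this in mind when comparing the estimates.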

Build a vector of the dependent variable and a matrix of the two independent variables, with a vector of 1’s as the first column. Apply the formula from the slides. Use the solve function to calculate the inverse and the t function to calculate the transpose, and remember that lm uses listwise deletion, so rows with a missing value on any of the three variables must be removed first!

##            [,1]
## [1,] 194.118206
## [2,]  -4.832264
## [3,]  76.534903
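The steps above could be sketched as follows, assuming the formula from the slides is the usual least-squares estimator (X'X)^{-1} X'y. The names d, y, X and beta are illustrative.

```r
library(ISwR)
data(juul)

# Keep only rows without missing values on the three variables,
# mirroring the listwise deletion that lm() performs
d <- juul[complete.cases(juul[, c("igf1", "age", "tanner")]), ]

y <- d$igf1                      # dependent variable
X <- cbind(1, d$age, d$tanner)   # column of 1's, then the two predictors

# Least-squares estimate: (X'X)^{-1} X'y
beta <- solve(t(X) %*% X) %*% t(X) %*% y
beta
```

The result is a 3 x 1 matrix whose rows correspond to the intercept, age and tanner, in the order of the columns of X.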

Inspect assumptions

Check the normality of age and tanner, and check igf1 for outliers at each level of tanner (see 4.4.2 Dalgaard). Show all plots in one window.
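One possible layout, as a sketch: Q-Q plots and a histogram for the normality check, and a boxplot of igf1 per tanner level for the outliers. The choice of these particular plots and the 2 x 2 grid is an assumption, and other diagnostics would serve equally well.

```r
library(ISwR)
data(juul)

op <- par(mfrow = c(2, 2))   # 2 x 2 grid: all plots in one window

qqnorm(juul$age, main = "Q-Q plot: age")
qqline(juul$age)
qqnorm(juul$tanner, main = "Q-Q plot: tanner")
qqline(juul$tanner)
hist(juul$age, main = "Histogram: age", xlab = "age")
boxplot(igf1 ~ tanner, data = juul,
        main = "igf1 by tanner", xlab = "tanner", ylab = "igf1")

par(op)                      # restore the previous layout
```

Points drawn beyond the whiskers of a box are the candidate outliers for that tanner level.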

EXE 4.2

a. Mean centering & moderation

Calculate the mean-centered variables and add both of them to the dataset in a single step. Compare the models with and without mean centering.

## (Intercept)         age      tanner 
##  194.118206   -4.832264   76.534903
## (Intercept)        Mage     Mtanner 
##  358.628788   -4.832264   76.534903
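A sketch of the centering step. The variable names Mage and Mtanner follow the output above. Centering on the complete cases only, and the names d, fit_raw and fit_centered, are assumptions.

```r
library(ISwR)
data(juul)

# Assumption: restrict to complete cases so both models use the same rows
d <- juul[complete.cases(juul[, c("igf1", "age", "tanner")]), ]

# transform() adds both centered variables in a single call
d <- transform(d,
               Mage    = age    - mean(age),
               Mtanner = tanner - mean(tanner))

fit_raw      <- lm(igf1 ~ age  + tanner,  data = d)
fit_centered <- lm(igf1 ~ Mage + Mtanner, data = d)

coef(fit_raw)
coef(fit_centered)
```

Centering leaves the slopes unchanged. Only the intercept moves: it becomes the predicted igf1 at average age and average tanner, which under this setup equals the mean of igf1 in the rows used.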

b. Calculate the interaction term directly in the model

See 10.6 Dalgaard for how to do this. Compare the models with and without the interaction effect: look at the coefficients, the model fit (i.e. R2), and the standard errors. I used the uncentered variables below.

## (Intercept)         age      tanner 
##  194.118206   -4.832264   76.534903
## (Intercept)         age      tanner  age:tanner 
## -106.694684   27.317193  162.418264   -8.023521
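A sketch of the comparison, using the formula operator * to create the interaction inside the model. The object names fit_main and fit_int are illustrative.

```r
library(ISwR)
data(juul)

fit_main <- lm(igf1 ~ age + tanner, data = juul)
# age * tanner expands to age + tanner + age:tanner
fit_int  <- lm(igf1 ~ age * tanner, data = juul)

coef(fit_main)
coef(fit_int)

# Model fit
summary(fit_main)$r.squared
summary(fit_int)$r.squared

# Standard errors, taken from the coefficient table
summary(fit_main)$coefficients[, "Std. Error"]
summary(fit_int)$coefficients[, "Std. Error"]
```

With uncentered variables, the coefficient of age in the interaction model is the slope of age when tanner equals 0, a value outside the observed range. This is one reason the main-effect estimates shift so strongly.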

EXE 4.3

Compare the strength of the coefficients calculated under 4.1. To be comparable, the coefficients need to be expressed in standard deviation units. Standardize the variables, add the standardized variables to the dataset, and estimate the standardized coefficients.

##  (Intercept)         Zage      Ztanner Zage:Ztanner 
##    0.3655958    0.1198682    0.6005949   -0.5088640
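A sketch of manual standardization. Zage and Ztanner follow the names in the output above. Zigf1, the restriction to complete cases, and the inclusion of the interaction term are assumptions based on that output.

```r
library(ISwR)
data(juul)

d <- juul[complete.cases(juul[, c("igf1", "age", "tanner")]), ]

# z-score: subtract the mean, then divide by the standard deviation
d <- transform(d,
               Zigf1   = (igf1   - mean(igf1))   / sd(igf1),
               Zage    = (age    - mean(age))    / sd(age),
               Ztanner = (tanner - mean(tanner)) / sd(tanner))

fit_z <- lm(Zigf1 ~ Zage * Ztanner, data = d)
coef(fit_z)
```

Because every variable is now measured in standard deviations, the absolute sizes of the coefficients can be compared directly.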

Check your result with the scale function. Note that a standardized variable has mean zero and standard deviation 1. First, standardize the variable with the scale function. Second, use summary to compare the range of values of your manually standardized variable with the result obtained from scale.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -1.9070 -0.8135 -0.0181  0.0000  0.7152  3.2190
##        V1         
##  Min.   :-1.9069  
##  1st Qu.:-0.8135  
##  Median :-0.0181  
##  Mean   : 0.0000  
##  3rd Qu.: 0.7152  
##  Max.   : 3.2186
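A sketch of this check. Using age as the variable is an assumption for illustration, and the names z_manual and z_scale are hypothetical.

```r
library(ISwR)
data(juul)

age <- juul$age[!is.na(juul$age)]   # drop missing values first

z_manual <- (age - mean(age)) / sd(age)
z_scale  <- scale(age)              # returns a one-column matrix

summary(z_manual)
summary(z_scale)

# Numerical comparison of the two versions
all.equal(z_manual, as.numeric(z_scale))
```

Since scale returns a matrix, summary prints its result as a column table rather than as a single row, which explains the two different layouts.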

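Note that the scale function also allows you to only center (i.e. subtract the mean from all observations) or only rescale (i.e. divide by the standard deviation), through its center and scale arguments. Be aware that with center = FALSE, scale = TRUE divides by the root mean square rather than the standard deviation. For a more detailed discussion, see summary.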
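A small sketch of the two options on a made-up vector x, which is not taken from the exercise data. To divide by the standard deviation without centering, the standard deviation is passed explicitly as the scale argument.

```r
x <- c(2, 4, 6, 8)   # small made-up vector

# Subtract the mean only
centered <- scale(x, center = TRUE, scale = FALSE)

# Divide by the standard deviation only
scaled <- scale(x, center = FALSE, scale = sd(x))

as.numeric(centered)
as.numeric(scaled)
```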

Estimate a simple regression (i.e. including only one independent variable) on the standardized variables and compare the slope with the correlation coefficient, computed with the cor function. The second result below is the one obtained with cor.

##  (Intercept)         Zage 
## 5.005913e-16 4.089242e-01
## [1] 0.4089242
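A sketch of this comparison, assuming the regression is of standardized igf1 on standardized age and that the complete cases on all three variables are used. The names d, Zigf1 and fit_simple are illustrative.

```r
library(ISwR)
data(juul)

# Assumption: same rows as in the multiple regression
d <- juul[complete.cases(juul[, c("igf1", "age", "tanner")]), ]

d$Zigf1 <- as.numeric(scale(d$igf1))
d$Zage  <- as.numeric(scale(d$age))

fit_simple <- lm(Zigf1 ~ Zage, data = d)
coef(fit_simple)

cor(d$igf1, d$age)
```

With a single standardized predictor and a standardized outcome, the slope equals the Pearson correlation, and the intercept is zero up to rounding error.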