Latent Growth Curves

2018/04/15

I will progress through three models: linear growth, quadratic growth, and latent basis. Every example uses a simulated sample of 400 people measured at 6 time points, with ‘affect’ as the variable of interest.

Don’t forget that multiplying by time is different from describing over time.

1) Linear

The data generating process:

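\[\begin{equation} y_{it} = 4 - 0.6t + u_{i} + e_{it} \end{equation}\]

where \(u_{i}\) is a stable person-specific deviation (unob_het_affect in the code) and \(e_{it}\) is the residual.
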
library(tidyverse) # attaches dplyr and ggplot2
# MASS is not loaded: nothing below uses it, and it masks dplyr::select()

N <- 400
time <- 6

intercept <- 4
linear_growth <- -0.6

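# One row per person-by-time observation: id, time, affect
df_matrix <- matrix(NA_real_, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:N){

  # Stable person-specific deviation (unobserved heterogeneity), SD = 3
  unob_het_affect <- rnorm(1,0,3)

  for(j in 1:time){

    count <- count + 1

    df_matrix[count, 1] <- i
    df_matrix[count, 2] <- j

    if(j == 1){
      # No growth term is applied at time 1, so its expected value is the intercept (4)
      df_matrix[count, 3] <- intercept + unob_het_affect + rnorm(1,0,1)
    }else{
      # From time 2 onward growth is linear_growth*j, e.g. 4 - 0.6*2 = 2.8 at time 2
      df_matrix[count, 3] <- intercept + linear_growth*j + unob_het_affect + rnorm(1,0,1)
    }
  }
}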

df <- data.frame(df_matrix)
names(df) <- c('id', 'time', 'affect')

# Sample 5 distinct people; sampling df$id directly could draw the same id twice
random_ids <- sample(unique(df$id), 5)

random_df <- df %>%
  filter(id %in% random_ids)

ggplot(df, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Formatting the data:

df_wide <- reshape(df, idvar = 'id', timevar = 'time', direction = 'wide')
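
To make the reshaping step concrete, here is a small sketch on made-up data. The toy_long and toy_wide objects and their values are mine and are not part of the simulation. It shows the column names that reshape() produces, which are the names the lavaan syntax below refers to.

```r
# Toy long-format data: 2 people, 3 time points (values are made up for illustration)
toy_long <- data.frame(
  id     = rep(1:2, each = 3),
  time   = rep(1:3, times = 2),
  affect = c(4.1, 3.2, 2.9, 5.0, 4.4, 3.7)
)

# Same call as above: one row per id, one affect column per time point
toy_wide <- reshape(toy_long, idvar = 'id', timevar = 'time', direction = 'wide')

print(names(toy_wide))
print(nrow(toy_wide))
```

The wide data has one row per person and columns id, affect.1, affect.2, and affect.3.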

First, an intercept only (no change) model:

library(lavaan)

no_change_string <- '

# Latent intercept factor

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6

# Variance of latent intercept factor (growth() estimates its mean by default)

intercept_affect ~~ intercept_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

no_change_model <- growth(no_change_string, data = df_wide)
summary(no_change_model, fit.measures = TRUE)
## lavaan (0.5-23.1097) converged normally after  19 iterations
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic             1969.527
##   Degrees of freedom                                24
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             3851.438
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.493
##   Tucker-Lewis Index (TLI)                       0.683
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5174.353
##   Loglikelihood unrestricted model (H1)      -4189.589
## 
##   Number of free parameters                          3
##   Akaike (AIC)                               10354.705
##   Bayesian (BIC)                             10366.680
##   Sample-size adjusted Bayesian (BIC)        10357.161
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.450
##   90 Percent Confidence Interval          0.433  0.467
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.195
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    1.951    0.148   13.206    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           8.292    0.618   13.422    0.000
##    .affct.1 (rs_v)    2.657    0.084   31.623    0.000
##    .affct.2 (rs_v)    2.657    0.084   31.623    0.000
##    .affct.3 (rs_v)    2.657    0.084   31.623    0.000
##    .affct.4 (rs_v)    2.657    0.084   31.623    0.000
##    .affct.5 (rs_v)    2.657    0.084   31.623    0.000
##    .affct.6 (rs_v)    2.657    0.084   31.623    0.000

Now, a linear growth model centered at time point 1. The intercept factor estimate, therefore, is the estimated average affect at time 1.

library(lavaan)

linear_change_string <- '

# Latent intercept and slope factors

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + 1*affect.2 + 2*affect.3 + 3*affect.4 + 4*affect.5 + 5*affect.6

# Variances of latent factors (growth() estimates their means by default)

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

linear_change_model <- growth(linear_change_string, data = df_wide)
summary(linear_change_model, fit.measures = TRUE)
## lavaan (0.5-23.1097) converged normally after  34 iterations
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic              102.840
##   Degrees of freedom                                21
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             3851.438
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.979
##   Tucker-Lewis Index (TLI)                       0.985
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4241.009
##   Loglikelihood unrestricted model (H1)      -4189.589
## 
##   Number of free parameters                          6
##   Akaike (AIC)                                8494.018
##   Bayesian (BIC)                              8517.967
##   Sample-size adjusted Bayesian (BIC)         8498.928
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.099
##   90 Percent Confidence Interval          0.080  0.118
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.034
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
##   slope_affect =~                                        
##     affect.1             0.000                           
##     affect.2             1.000                           
##     affect.3             2.000                           
##     affect.4             3.000                           
##     affect.5             4.000                           
##     affect.6             5.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect         0.007    0.037    0.200    0.842
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    3.648    0.151   24.191    0.000
##     slope_affect     -0.679    0.012  -56.819    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           8.543    0.643   13.275    0.000
##     slp_ffc          -0.003    0.005   -0.728    0.467
##    .affct.1 (rs_v)    1.057    0.037   28.284    0.000
##    .affct.2 (rs_v)    1.057    0.037   28.284    0.000
##    .affct.3 (rs_v)    1.057    0.037   28.284    0.000
##    .affct.4 (rs_v)    1.057    0.037   28.284    0.000
##    .affct.5 (rs_v)    1.057    0.037   28.284    0.000
##    .affct.6 (rs_v)    1.057    0.037   28.284    0.000
inspect(linear_change_model, 'cov.lv')
##                  intrc_ slp_ff
## intercept_affect  8.543       
## slope_affect      0.007 -0.003

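This model does an adequate job recovering the intercept and slope parameters: the estimated mean intercept is 3.648 (true value 4) and the estimated mean slope is -0.679 (true value -0.6). The small gaps arise because the simulation applies no growth at time 1 and then -0.6 times the time index from time 2 onward, so the average trajectory drops by 1.2 between times 1 and 2 and by 0.6 thereafter. The slope variance estimate is slightly negative (-0.003) and not significant, which is consistent with data simulated without individual differences in slope.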
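
One way to see what values the linear model should land on is to compute the expected affect at each time point directly from the simulation code and fit a straight line through those six means. The sketch below does that. The names expected_mean, time_code, and implied are mine, and the least-squares fit is only a stand-in for lavaan's ML estimation.

```r
# Expected affect at each time point under the simulation code:
# no growth at time 1, then intercept + linear_growth * j for j = 2..6
intercept <- 4
linear_growth <- -0.6
j <- 1:6
expected_mean <- ifelse(j == 1, intercept, intercept + linear_growth * j)

# Time coded 0..5, matching the slope loadings in the lavaan model
time_code <- j - 1

# Least-squares line through the six expected means (illustration only;
# this is not the ML estimator lavaan uses)
implied <- coef(lm(expected_mean ~ time_code))
print(round(implied, 3))
```

The line through the expected means has an intercept of about 3.71 and a slope of about -0.69, close to the estimates of 3.648 and -0.679 in the output.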

If I wanted to center the model at time point 3, the latent intercept term would be interpreted as the estimated average affect at time 3, and the slope line of the syntax would change to:

'
slope_affect =~ -2*affect.1 + -1*affect.2 + 0*affect.3 + 1*affect.4 + 2*affect.5 + 3*affect.6

'
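
Recentering only re-expresses the same straight line, so the intercept mean under time 3 centering can be anticipated from the time 1 centered estimates printed earlier (3.648 and -0.679). As a rough check, with \(\mu_{I}\) and \(\mu_{S}\) as my own shorthand for the intercept and slope factor means and the superscript marking the centering point:

\[\begin{equation} \mu_{I}^{(3)} = \mu_{I}^{(1)} + 2\mu_{S} = 3.648 + 2(-0.679) = 2.290 \end{equation}\]

The slope mean is unchanged by the recentering; only the intercept mean, and the intercept's variance and covariance with the slope, should shift.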

2) Quadratic

The data generating process:

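\[\begin{equation} y_{it} = 4 + 0.2t + 0.7t^2 + u_{i} + e_{it} \end{equation}\]

where \(u_{i}\) is a stable person-specific deviation (unob_het_affect in the code) and \(e_{it}\) is the residual.
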
library(tidyverse) # attaches dplyr and ggplot2
# MASS is not loaded: nothing below uses it, and it masks dplyr::select()

N <- 400
time <- 6



intercept_mu <- 4
linear_growth2 <- 0.2
quad_growth <- 0.7

# One row per person-by-time observation: id, time, affect
df_matrix2 <- matrix(NA_real_, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:N){

  # Stable person-specific deviation (unobserved heterogeneity), SD = 3
  unob_het_affect <- rnorm(1,0,3)

  for(j in 1:time){

    count <- count + 1

    df_matrix2[count, 1] <- i
    df_matrix2[count, 2] <- j

    if(j == 1){
      # Time 1 has no growth terms and uses a fresh rnorm(1,0,1) draw
      # where the other time points use unob_het_affect
      df_matrix2[count, 3] <- intercept_mu + rnorm(1,0,1) + rnorm(1,0,1)
    }else{
      df_matrix2[count, 3] <- intercept_mu + linear_growth2*j + quad_growth*(j^2) + unob_het_affect + rnorm(1,0,1)
    }
  }
}

df2 <- data.frame(df_matrix2)
names(df2) <- c('id', 'time', 'affect')

# Sample 5 distinct people; sampling df2$id directly could draw the same id twice
random_ids2 <- sample(unique(df2$id), 5)

random_df2 <- df2 %>%
  filter(id %in% random_ids2)

ggplot(df2, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df2, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df2, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Quadratic growth model:

df_wide2 <- reshape(df2, idvar = 'id', timevar = 'time', direction = 'wide')


library(lavaan)

quad_change_string <- '

# Latent intercept, linear slope, and quad slope factors

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + 1*affect.2 + 2*affect.3 + 3*affect.4 + 4*affect.5 + 5*affect.6
quad_slope_affect =~ 0*affect.1 + 1*affect.2 + 4*affect.3 + 9*affect.4 + 16*affect.5 + 25*affect.6

# Variances of latent factors (growth() estimates their means by default)

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect
quad_slope_affect ~~ quad_slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect
intercept_affect ~~ quad_slope_affect
slope_affect ~~ quad_slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

quad_change_model <- growth(quad_change_string, data = df_wide2)
summary(quad_change_model, fit.measures = TRUE)
## lavaan (0.5-23.1097) converged normally after 106 iterations
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic              505.403
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             2972.004
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.835
##   Tucker-Lewis Index (TLI)                       0.854
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4576.979
##   Loglikelihood unrestricted model (H1)      -4324.277
## 
##   Number of free parameters                         10
##   Akaike (AIC)                                9173.957
##   Bayesian (BIC)                              9213.872
##   Sample-size adjusted Bayesian (BIC)         9182.141
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.268
##   90 Percent Confidence Interval          0.248  0.288
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.224
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                        Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                     
##     affect.1              1.000                           
##     affect.2              1.000                           
##     affect.3              1.000                           
##     affect.4              1.000                           
##     affect.5              1.000                           
##     affect.6              1.000                           
##   slope_affect =~                                         
##     affect.1              0.000                           
##     affect.2              1.000                           
##     affect.3              2.000                           
##     affect.4              3.000                           
##     affect.5              4.000                           
##     affect.6              5.000                           
##   quad_slope_affect =~                                    
##     affect.1              0.000                           
##     affect.2              1.000                           
##     affect.3              4.000                           
##     affect.4              9.000                           
##     affect.5             16.000                           
##     affect.6             25.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect         0.749    0.144    5.198    0.000
##     quad_slop_ffct      -0.116    0.023   -5.065    0.000
##   slope_affect ~~                                        
##     quad_slop_ffct      -0.443    0.048   -9.212    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    4.093    0.067   60.784    0.000
##     slope_affect      1.999    0.102   19.513    0.000
##     quad_slop_ffct    0.644    0.016   39.326    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           0.461    0.140    3.299    0.001
##     slp_ffc           3.002    0.301    9.976    0.000
##     qd_slp_           0.063    0.008    8.106    0.000
##    .affct.1 (rs_v)    1.648    0.067   24.495    0.000
##    .affct.2 (rs_v)    1.648    0.067   24.495    0.000
##    .affct.3 (rs_v)    1.648    0.067   24.495    0.000
##    .affct.4 (rs_v)    1.648    0.067   24.495    0.000
##    .affct.5 (rs_v)    1.648    0.067   24.495    0.000
##    .affct.6 (rs_v)    1.648    0.067   24.495    0.000

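This model recovers the quadratic parameter (0.644 against a true value of 0.7) and gives an intercept near 4 (4.093), but not the linear growth parameter (1.999 against 0.2). Part of the gap is a coding difference: the simulation indexes time from 1 to 6 while the slope loadings run from 0 to 5, which changes the linear coefficient the model is actually estimating. In addition, the time 1 observations are generated without the growth terms or the person-level effect, which also helps explain the poor fit (CFI = 0.835, RMSEA = 0.268).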
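
The mismatch in the linear coefficient can be worked out by hand. As a sketch, write \(c = t - 1\) for the time code used in the slope loadings (0 through 5). The symbol \(c\) is mine, and the substitution ignores the special handling of time 1 in the simulation code:

\[\begin{equation} 4 + 0.2(c + 1) + 0.7(c + 1)^2 = 4.9 + 1.6c + 0.7c^2 \end{equation}\]

Under the model's time coding the implied linear coefficient is therefore 1.6 and the implied intercept is 4.9, rather than 0.2 and 4. The quadratic coefficient stays at 0.7.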

3) Latent Basis

This model allows us to see where a majority of the change occurs in the process. For example, does more change occur between time points 2 and 3 or 5 and 6? In this model we are not trying to recover the parameters, but describe the change process in detail.

Data generating process:

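Time 1 - Time 3: \[\begin{equation} y_{it} = 4 + 0.2t + u_{i} + e_{it} \end{equation}\]

Time 4 - Time 6: \[\begin{equation} y_{it} = 4 + 0.8t + u_{i} + e_{it} \end{equation}\]

where \(u_{i}\) is a stable person-specific deviation (unob_het_affect in the code) and \(e_{it}\) is the residual.
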
library(tidyverse) # attaches dplyr and ggplot2
# MASS is not loaded: nothing below uses it, and it masks dplyr::select()

N <- 400
time <- 6


intercept_mu <- 4
growth_1 <- 0.2
growth_2 <- 0.8


# One row per person-by-time observation: id, time, affect
df_matrix3 <- matrix(NA_real_, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:N){

  # Stable person-specific deviation (unobserved heterogeneity), SD = 3
  unob_het_affect <- rnorm(1,0,3)

  for(j in 1:time){

    count <- count + 1

    df_matrix3[count, 1] <- i
    df_matrix3[count, 2] <- j

    if(j < 4){
      # Times 1 to 3: shallow growth
      df_matrix3[count, 3] <- intercept_mu + growth_1*j + unob_het_affect + rnorm(1,0,1)
    }else{
      # Times 4 to 6: steeper growth, which creates a jump between times 3 and 4
      df_matrix3[count, 3] <- intercept_mu + growth_2*j + unob_het_affect + rnorm(1,0,1)
    }
  }
}

df3 <- data.frame(df_matrix3)
names(df3) <- c('id', 'time', 'affect')

# Sample 5 distinct people; sampling df3$id directly could draw the same id twice
random_ids3 <- sample(unique(df3$id), 5)

random_df3 <- df3 %>%
  filter(id %in% random_ids3)

ggplot(df3, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df3, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df3, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Latent basis:

Similar to a linear growth model, but we freely estimate the intermediate basis coefficients. Remember to constrain the first basis coefficient to 0 and the last to 1. With those constraints the slope factor mean is the average total change from time 1 to time 6.
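
Written out as an equation, the latent basis model looks like the sketch below. The symbols \(I_{i}\), \(S_{i}\), and \(\lambda_{t}\) are my own notation for the intercept factor, the slope factor, and the basis coefficient at time \(t\); in the lavaan syntax that follows, \(\lambda_{2}\) through \(\lambda_{5}\) are labelled bc1 through bc4.

\[\begin{equation} y_{it} = I_{i} + \lambda_{t} S_{i} + e_{it}, \quad \lambda_{1} = 0, \quad \lambda_{6} = 1 \end{equation}\]

The only difference from the linear model is that \(\lambda_{2}\) through \(\lambda_{5}\) are estimated rather than fixed to equally spaced values.
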

df_wide3 <- reshape(df3, idvar = 'id', timevar = 'time', direction = 'wide')


library(lavaan)

lb_string <- '

# Latent intercept and slope terms with intermediate time points freely estimated

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + bc1*affect.2 + bc2*affect.3 + bc3*affect.4 + bc4*affect.5 + 1*affect.6

# Variances of latent factors (growth() estimates their means by default)

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

lb_model <- growth(lb_string, data = df_wide3)
summary(lb_model, fit.measures = TRUE)
## lavaan (0.5-23.1097) converged normally after  73 iterations
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic               17.016
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.453
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             4051.775
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4160.185
##   Loglikelihood unrestricted model (H1)      -4151.677
## 
##   Number of free parameters                         10
##   Akaike (AIC)                                8340.369
##   Bayesian (BIC)                              8380.284
##   Sample-size adjusted Bayesian (BIC)         8348.553
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.002
##   90 Percent Confidence Interval          0.000  0.045
##   P-value RMSEA <= 0.05                          0.974
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.015
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
##   slope_affect =~                                        
##     affect.1             0.000                           
##     affect.2 (bc1)       0.027    0.015    1.782    0.075
##     affect.3 (bc2)       0.079    0.015    5.408    0.000
##     affect.4 (bc3)       0.671    0.013   49.928    0.000
##     affect.5 (bc4)       0.846    0.014   59.563    0.000
##     affect.6             1.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect        -0.187    0.154   -1.214    0.225
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    4.241    0.160   26.510    0.000
##     slope_affect      4.527    0.069   65.491    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           9.287    0.681   13.647    0.000
##     slp_ffc           0.011    0.074    0.144    0.886
##    .affct.1 (rs_v)    0.950    0.034   28.284    0.000
##    .affct.2 (rs_v)    0.950    0.034   28.284    0.000
##    .affct.3 (rs_v)    0.950    0.034   28.284    0.000
##    .affct.4 (rs_v)    0.950    0.034   28.284    0.000
##    .affct.5 (rs_v)    0.950    0.034   28.284    0.000
##    .affct.6 (rs_v)    0.950    0.034   28.284    0.000

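bc1 is the proportion of the total change (from time 1 to time 6) that the average individual has completed by time 2. Likewise, bc2 is the proportion completed by time 3, bc3 by time 4, and bc4 by time 5. Here the estimates are 0.027, 0.079, 0.671, and 0.846, so about 59% of the total change (0.671 - 0.079) occurs between times 3 and 4, which matches the switch in growth rates in the data generating process.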
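
As a rough check on the estimated basis coefficients, the values implied by the data generating process can be computed directly. This is a sketch under the piecewise equations given earlier; the names expected_mean, basis, and interval_share are mine.

```r
# Expected affect at each time point under the piecewise data generating process
intercept_mu <- 4
growth_1 <- 0.2   # times 1 to 3
growth_2 <- 0.8   # times 4 to 6
j <- 1:6
expected_mean <- ifelse(j < 4, intercept_mu + growth_1 * j, intercept_mu + growth_2 * j)

# Basis coefficient = share of the total time 1 to time 6 change reached by each time point
basis <- (expected_mean - expected_mean[1]) / (expected_mean[6] - expected_mean[1])

# Share of the total change that falls in each interval (1-2, 2-3, ..., 5-6)
interval_share <- diff(basis)

print(round(basis, 3))
print(round(interval_share, 3))
```

The implied coefficients for times 2 through 5 are roughly 0.04, 0.09, 0.65, and 0.83, in line with the estimates of 0.027, 0.079, 0.671, and 0.846, and the largest interval share falls between times 3 and 4.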

Bo\(^2\)m =)