Latent Growth Curves

I will progress through three models: linear, quadratic growth, and latent basis. In every example I use a sample of 400, 6 time points, and ‘affect’ as the variable of interest.

A note on notation. A coefficient multiplied by time

  • \(0.6t\)

is different from a quantity indexed by time

  • \(0.6_t\).
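A minimal sketch of the distinction in R. The values in indexed are made up purely for illustration and do not come from any model in this post.

```r
time <- 1:6

# Multiplying by time: one coefficient, scaled by t
multiplied <- 0.6 * time

# Indexed by time: a separate value at each t
# (hypothetical values, chosen only for illustration)
indexed <- c(0.6, 0.1, -0.3, 0.6, 0.2, 0.9)

data.frame(time, multiplied, indexed)
```

The first column grows in fixed steps of 0.6, while the second can take any shape because each time point has its own value.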

1) Linear

The data generating process:

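\[\begin{equation} y_{it} = 4 - 0.6t + u_i + e_{it} \end{equation}\]

where \(u_i\) is a person-specific offset (unob_het_affect in the code, SD 3) and \(e_{it}\) is occasion-level noise (SD 1). Note that the loop below adds the growth term only from time 2 onward.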

library(tidyverse)
library(ggplot2)
library(MASS)

N <- 400
time <- 6

intercept <- 4
linear_growth <- -0.6

df_matrix <- matrix(, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:400){
  
  unob_het_affect <- rnorm(1,0,3)

  
  for(j in 1:6){
    
    count <- count + 1
    
    if(j == 1){
      
      df_matrix[count, 1] <- i
      df_matrix[count, 2] <- j
      df_matrix[count, 3] <- intercept + unob_het_affect + rnorm(1,0,1)
    }else{
      
      
      df_matrix[count, 1] <- i
      df_matrix[count, 2] <- j
      df_matrix[count, 3] <- intercept + linear_growth*j + unob_het_affect + rnorm(1,0,1)
      
    }
    
    
    
  }
  
  
}

df <- data.frame(df_matrix)
names(df) <- c('id', 'time', 'affect')
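The same simulation can also be written without loops. This is only a sketch: the names sim, n_time, person_effect, and growth are my own, the seed is an assumption added for reproducibility, and the random draws will not match the ones produced by the loop above. It mirrors the loop's behavior of adding no growth term at time 1.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

N <- 400
n_time <- 6
intercept <- 4
linear_growth <- -0.6

# One row per person-by-time combination, time varying fastest within id
sim <- expand.grid(time = seq_len(n_time), id = seq_len(N))

# One person-specific offset per id (the role of unob_het_affect in the loop)
person_effect <- rnorm(N, 0, 3)

# Mirror the loop: no growth term at time 1, linear_growth * time afterwards
growth <- ifelse(sim$time == 1, 0, linear_growth * sim$time)

sim$affect <- intercept + growth + person_effect[sim$id] + rnorm(nrow(sim), 0, 1)
sim <- sim[, c('id', 'time', 'affect')]

head(sim)
```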

# Sample from the unique ids so that five distinct people are highlighted
random_ids <- sample(unique(df$id), 5)

random_df <- df %>%
  filter(id %in% random_ids)
  

ggplot(df, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Formatting the data:

df_wide <- reshape(df, idvar = 'id', timevar = 'time', direction = 'wide')
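To see what this reshape produces, and where the variable names used in the lavaan syntax come from, here is a small sketch on a toy data set. The data frame toy and its values are hypothetical.

```r
# Toy long-format data: 2 people, 3 time points (values are made up)
toy <- data.frame(
  id     = rep(1:2, each = 3),
  time   = rep(1:3, times = 2),
  affect = c(4.1, 3.5, 2.9, 5.0, 4.2, 3.9)
)

# Same call as above: one row per id, one affect column per time point
toy_wide <- reshape(toy, idvar = 'id', timevar = 'time', direction = 'wide')

names(toy_wide)
```

Each measured variable gets the time value appended after a dot, which is why the models refer to affect.1 through affect.6.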

First, an intercept only (no change) model:

library(lavaan)

no_change_string <- '

# Latent intercept factor

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6

# Mean and variance of latent intercept factor

intercept_affect ~~ intercept_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

no_change_model <- growth(no_change_string, data = df_wide)
summary(no_change_model, fit.measures = T)
## lavaan 0.6-2 ended normally after 20 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          8
##   Number of equality constraints                     5
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                    2107.318
##   Degrees of freedom                                24
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             4004.779
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.478
##   Tucker-Lewis Index (TLI)                       0.674
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5202.480
##   Loglikelihood unrestricted model (H1)      -4148.821
## 
##   Number of free parameters                          3
##   Akaike (AIC)                               10410.961
##   Bayesian (BIC)                             10422.935
##   Sample-size adjusted Bayesian (BIC)        10413.416
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.466
##   90 Percent Confidence Interval          0.449  0.483
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.198
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    2.075    0.151   13.779    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           8.615    0.641   13.434    0.000
##    .affct.1 (rs_v)    2.712    0.086   31.623    0.000
##    .affct.2 (rs_v)    2.712    0.086   31.623    0.000
##    .affct.3 (rs_v)    2.712    0.086   31.623    0.000
##    .affct.4 (rs_v)    2.712    0.086   31.623    0.000
##    .affct.5 (rs_v)    2.712    0.086   31.623    0.000
##    .affct.6 (rs_v)    2.712    0.086   31.623    0.000

Now, a linear growth model centered at time point 1. The intercept factor estimate, therefore, is the estimated average affect at time 1.

library(lavaan)

linear_change_string <- '

# Latent intercept and slope factors

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + 1*affect.2 + 2*affect.3 + 3*affect.4 + 4*affect.5 + 5*affect.6

# Mean and variance of latent factors

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

linear_change_model <- growth(linear_change_string, data = df_wide)
summary(linear_change_model, fit.measures = T)
## lavaan 0.6-2 ended normally after 37 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         11
##   Number of equality constraints                     5
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                      99.847
##   Degrees of freedom                                21
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             4004.779
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.980
##   Tucker-Lewis Index (TLI)                       0.986
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4198.745
##   Loglikelihood unrestricted model (H1)      -4148.821
## 
##   Number of free parameters                          6
##   Akaike (AIC)                                8409.489
##   Bayesian (BIC)                              8433.438
##   Sample-size adjusted Bayesian (BIC)         8414.400
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.097
##   90 Percent Confidence Interval          0.078  0.116
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.036
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
##   slope_affect =~                                        
##     affect.1             0.000                           
##     affect.2             1.000                           
##     affect.3             2.000                           
##     affect.4             3.000                           
##     affect.5             4.000                           
##     affect.6             5.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect         0.016    0.037    0.432    0.666
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    3.826    0.153   25.006    0.000
##     slope_affect     -0.701    0.012  -60.023    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           8.838    0.662   13.342    0.000
##     slp_ffc          -0.003    0.004   -0.666    0.505
##    .affct.1 (rs_v)    1.004    0.036   28.284    0.000
##    .affct.2 (rs_v)    1.004    0.036   28.284    0.000
##    .affct.3 (rs_v)    1.004    0.036   28.284    0.000
##    .affct.4 (rs_v)    1.004    0.036   28.284    0.000
##    .affct.5 (rs_v)    1.004    0.036   28.284    0.000
##    .affct.6 (rs_v)    1.004    0.036   28.284    0.000
inspect(linear_change_model, 'cov.lv')
##                  intrc_ slp_ff
## intercept_affect  8.838       
## slope_affect      0.016 -0.003

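This model does an adequate job recovering the generating parameters: the estimated intercept mean is 3.826 (generating value 4) and the estimated slope mean is -0.701 (generating value -0.6). The slope is somewhat steeper than -0.6, plausibly because the simulation omits the growth term at time 1, so the average drop from time 1 to time 2 is 1.2 rather than 0.6. The slope variance estimate is slightly negative (-0.003) and not distinguishable from zero, which is consistent with a simulation that gives every person the same slope.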
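Two quick checks can be done by hand from the printed output. The sketch below computes the model-implied mean at each time point (intercept mean plus slope loading times slope mean) and a chi-square difference between the no-change and linear models. All numbers are copied from the summaries above. Treat the p-value as approximate, since the no-change model fixes a variance at zero.

```r
# Estimates copied from the linear model summary above
intercept_mean <- 3.826
slope_mean <- -0.701
slope_loadings <- 0:5  # time 1 is coded 0

# Model-implied average affect at time 1 through time 6
implied_means <- intercept_mean + slope_loadings * slope_mean
round(implied_means, 3)

# Chi-square difference: no-change model versus linear model
chisq_diff <- 2107.318 - 99.847
df_diff <- 24 - 21
p_value <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)

c(chisq_diff = chisq_diff, df_diff = df_diff, p_value = p_value)
```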

If I wanted to center the model at time point 3, the latent intercept would be interpreted as the estimated average affect at time 3, and the slope line of the syntax would change to:

'
slope_affect =~ -2*affect.1 + -1*affect.2 + 0*affect.3 + 1*affect.4 + 2*affect.5 + 3*affect.6

'
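The loadings are just the time points minus the centering point. As a sketch, a small helper can build the slope line for any center; make_slope_line is a hypothetical function written for this illustration, not part of lavaan.

```r
# Hypothetical helper: build the slope line of the lavaan syntax
# for a chosen centering time point
make_slope_line <- function(center, n_time = 6) {
  time_points <- seq_len(n_time)
  loadings <- time_points - center
  terms <- paste0(loadings, '*affect.', time_points, collapse = ' + ')
  paste0('slope_affect =~ ', terms)
}

make_slope_line(3)
make_slope_line(1)  # reproduces the time-1 centering used earlier
```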

2) Quadratic

The data generating process:

\[\begin{equation} y_{it} = 4 + 0.2t + 0.7t^2 + u_i + e_{it} \end{equation}\]

(\(u_i\): person-specific offset, SD 3; \(e_{it}\): occasion-level noise, SD 1.)

library(tidyverse)
library(ggplot2)
library(MASS)

N <- 400
time <- 6



intercept <- 4  # the loop below refers to this object as intercept
linear_growth2 <- 0.2
quad_growth <- 0.7

df_matrix2 <- matrix(, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:400){
  
  unob_het_affect <- rnorm(1,0,3)

  
  for(j in 1:6){
    
    count <- count + 1
    
    if(j == 1){
      
      df_matrix2[count, 1] <- i
      df_matrix2[count, 2] <- j
      # Time 1: no growth terms; a fresh rnorm() draw is used here instead of unob_het_affect
      df_matrix2[count, 3] <- intercept + rnorm(1,0,1) + rnorm(1,0,1)
    }else{
      
      
      df_matrix2[count, 1] <- i
      df_matrix2[count, 2] <- j
      df_matrix2[count, 3] <- intercept + linear_growth2*j + quad_growth*(j^2) + unob_het_affect + rnorm(1,0,1)
      
    }
    
    
    
  }
  
  
}

df2 <- data.frame(df_matrix2)
names(df2) <- c('id', 'time', 'affect')

# Sample from the unique ids so that five distinct people are highlighted
random_ids2 <- sample(unique(df2$id), 5)

random_df2 <- df2 %>%
  filter(id %in% random_ids2)
  

ggplot(df2, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df2, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df2, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Quadratic growth model:

df_wide2 <- reshape(df2, idvar = 'id', timevar = 'time', direction = 'wide')


library(lavaan)

quad_change_string <- '

# Latent intercept, linear slope, and quad slope factors

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + 1*affect.2 + 2*affect.3 + 3*affect.4 + 4*affect.5 + 5*affect.6
quad_slope_affect =~ 0*affect.1 + 1*affect.2 + 4*affect.3 + 9*affect.4 + 16*affect.5 + 25*affect.6

# Mean and variance of latent factors

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect
quad_slope_affect ~~ quad_slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect
intercept_affect ~~ quad_slope_affect
slope_affect ~~ quad_slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

quad_change_model <- growth(quad_change_string, data = df_wide2)
summary(quad_change_model, fit.measures = T)
## lavaan 0.6-3 ended normally after 106 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         15
##   Number of equality constraints                     5
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                     575.327
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             3109.033
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.820
##   Tucker-Lewis Index (TLI)                       0.841
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4578.013
##   Loglikelihood unrestricted model (H1)      -4290.350
## 
##   Number of free parameters                         10
##   Akaike (AIC)                                9176.026
##   Bayesian (BIC)                              9215.941
##   Sample-size adjusted Bayesian (BIC)         9184.210
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.287
##   90 Percent Confidence Interval          0.267  0.307
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.235
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                        Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                     
##     affect.1              1.000                           
##     affect.2              1.000                           
##     affect.3              1.000                           
##     affect.4              1.000                           
##     affect.5              1.000                           
##     affect.6              1.000                           
##   slope_affect =~                                         
##     affect.1              0.000                           
##     affect.2              1.000                           
##     affect.3              2.000                           
##     affect.4              3.000                           
##     affect.5              4.000                           
##     affect.6              5.000                           
##   quad_slope_affect =~                                    
##     affect.1              0.000                           
##     affect.2              1.000                           
##     affect.3              4.000                           
##     affect.4              9.000                           
##     affect.5             16.000                           
##     affect.6             25.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect         0.969    0.141    6.874    0.000
##     quad_slop_ffct      -0.150    0.023   -6.644    0.000
##   slope_affect ~~                                        
##     quad_slop_ffct      -0.467    0.050   -9.372    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    4.171    0.065   64.146    0.000
##     slope_affect      2.181    0.104   20.975    0.000
##     quad_slop_ffct    0.610    0.017   36.403    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           0.333    0.132    2.524    0.012
##     slp_ffc           3.122    0.310   10.083    0.000
##     qd_slp_           0.068    0.008    8.344    0.000
##    .affct.1 (rs_v)    1.653    0.068   24.495    0.000
##    .affct.2 (rs_v)    1.653    0.068   24.495    0.000
##    .affct.3 (rs_v)    1.653    0.068   24.495    0.000
##    .affct.4 (rs_v)    1.653    0.068   24.495    0.000
##    .affct.5 (rs_v)    1.653    0.068   24.495    0.000
##    .affct.6 (rs_v)    1.653    0.068   24.495    0.000

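This model recovers the intercept mean (4.171 against a generating value of 4) and the quadratic mean (0.610 against 0.7) reasonably well, but not the linear mean (2.181 against 0.2). Two features of the setup matter here. First, the data are generated with time running from 1 to 6 while the model codes time from 0 to 5, so the model's linear coefficient is not on the same scale as the generating 0.2. Second, the time-1 observation in the simulation omits both growth terms and replaces the person-level offset with a fresh random draw, which likely explains the small intercept variance (0.333) and the poor fit (CFI = 0.820, RMSEA = 0.287).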
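One way to see how the time coding affects the linear coefficient is to re-express the noise-free generating equation in the model's time metric. The sketch below fits an ordinary regression to the exact curve, using s = t - 1 as the predictor. The names t, s, y, and fit are my own, and this ignores the special handling of time 1 in the simulation loop.

```r
# Noise-free generating curve on the original time scale (t = 1, ..., 6)
t <- 1:6
y <- 4 + 0.2 * t + 0.7 * t^2

# Time as coded in the lavaan model (0, ..., 5)
s <- t - 1

# Because y is an exact quadratic in s, the fit recovers the
# re-expressed coefficients up to floating-point error
fit <- lm(y ~ s + I(s^2))
round(coef(fit), 3)
```

Substituting t = s + 1 gives 4.9 + 1.6s + 0.7s^2, so on the model's scale the linear coefficient implied by the equation is 1.6 rather than 0.2, while the quadratic coefficient stays at 0.7.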

3) Latent Basis

This model allows us to see where a majority of the change occurs in the process. For example, does more change occur between time points 2 and 3 or 5 and 6? In this model we are not trying to recover the parameters, but describe the change process in detail.

Data generating process:

Time 1 - Time 3: \[\begin{equation} y_{it} = 4 + 0.2t + u_i + e_{it} \end{equation}\]

Time 4 - Time 6: \[\begin{equation} y_{it} = 4 + 0.8t + u_i + e_{it} \end{equation}\]

(\(u_i\): person-specific offset, SD 3; \(e_{it}\): occasion-level noise, SD 1.)

library(tidyverse)
library(ggplot2)
library(MASS)

N <- 400
time <- 6


intercept <- 4  # the loop below refers to this object as intercept
growth_1 <- 0.2
growth_2 <- 0.8


df_matrix3 <- matrix(, nrow = N*time, ncol = 3)

count <- 0

for(i in 1:400){
  
  unob_het_affect <- rnorm(1,0,3)
  
  
  for(j in 1:6){
    
    count <- count + 1
    
    if(j < 4){
      
      df_matrix3[count, 1] <- i
      df_matrix3[count, 2] <- j
      df_matrix3[count, 3] <- intercept + growth_1*j + unob_het_affect + rnorm(1,0,1)
      
    }else{
      
      
      df_matrix3[count, 1] <- i
      df_matrix3[count, 2] <- j
      df_matrix3[count, 3] <- intercept + growth_2*j + unob_het_affect + rnorm(1,0,1)
      
    }
    
    
    
  }
  
  
}

df3 <- data.frame(df_matrix3)
names(df3) <- c('id', 'time', 'affect')

# Sample from the unique ids so that five distinct people are highlighted
random_ids3 <- sample(unique(df3$id), 5)

random_df3 <- df3 %>%
  filter(id %in% random_ids3)
  

ggplot(df3, aes(x = time, y = affect, group = id)) + 
  geom_point(color = 'gray85') + 
  geom_line(color = 'gray85') + 
  geom_point(data = random_df3, aes(x = time, y = affect, group = id), color = 'blue') + 
  geom_line(data = random_df3, aes(x = time, y = affect, group = id), color = 'blue')

Estimating the model:

Latent basis:

Similar to a linear growth model but we freely estimate the intermediate basis coefficients. Remember to constrain the first basis coefficient to zero and the last to 1.

df_wide3 <- reshape(df3, idvar = 'id', timevar = 'time', direction = 'wide')


library(lavaan)

lb_string <- '

# Latent intercept and slope terms with intermediate time points freely estimated

intercept_affect =~ 1*affect.1 + 1*affect.2 + 1*affect.3 + 1*affect.4 + 1*affect.5 + 1*affect.6
slope_affect =~ 0*affect.1 + bc1*affect.2 + bc2*affect.3 + bc3*affect.4 + bc4*affect.5 + 1*affect.6

# Mean and variance of latent factors

intercept_affect ~~ intercept_affect
slope_affect ~~ slope_affect

# Covariance between latent factors

intercept_affect ~~ slope_affect

# Fix observed variable means to 0

affect.1 ~ 0
affect.2 ~ 0
affect.3 ~ 0
affect.4 ~ 0
affect.5 ~ 0
affect.6 ~ 0

# Constrain residual (error) variance of observed variables to equality across time

affect.1 ~~ res_var*affect.1
affect.2 ~~ res_var*affect.2
affect.3 ~~ res_var*affect.3
affect.4 ~~ res_var*affect.4
affect.5 ~~ res_var*affect.5
affect.6 ~~ res_var*affect.6


'

lb_model <- growth(lb_string, data = df_wide3)
summary(lb_model, fit.measures = T)
## lavaan 0.6-3 ended normally after 55 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         15
##   Number of equality constraints                     5
## 
##   Number of observations                           400
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       9.055
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.939
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             4006.479
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.002
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4197.926
##   Loglikelihood unrestricted model (H1)      -4193.398
## 
##   Number of free parameters                         10
##   Akaike (AIC)                                8415.852
##   Bayesian (BIC)                              8455.767
##   Sample-size adjusted Bayesian (BIC)         8424.036
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent Confidence Interval          0.000  0.010
##   P-value RMSEA <= 0.05                          1.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.012
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect =~                                    
##     affect.1             1.000                           
##     affect.2             1.000                           
##     affect.3             1.000                           
##     affect.4             1.000                           
##     affect.5             1.000                           
##     affect.6             1.000                           
##   slope_affect =~                                        
##     affect.1             0.000                           
##     affect.2 (bc1)       0.026    0.015    1.795    0.073
##     affect.3 (bc2)       0.095    0.014    6.641    0.000
##     affect.4 (bc3)       0.628    0.013   48.088    0.000
##     affect.5 (bc4)       0.788    0.014   57.862    0.000
##     affect.6             1.000                           
## 
## Covariances:
##                       Estimate  Std.Err  z-value  P(>|z|)
##   intercept_affect ~~                                    
##     slope_affect        -0.236    0.164   -1.437    0.151
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .affect.1          0.000                           
##    .affect.2          0.000                           
##    .affect.3          0.000                           
##    .affect.4          0.000                           
##    .affect.5          0.000                           
##    .affect.6          0.000                           
##     intercept_ffct    4.042    0.162   25.022    0.000
##     slope_affect      4.701    0.070   66.840    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     intrcp_           9.451    0.693   13.636    0.000
##     slp_ffc           0.009    0.082    0.106    0.916
##    .affct.1 (rs_v)    0.985    0.035   28.284    0.000
##    .affct.2 (rs_v)    0.985    0.035   28.284    0.000
##    .affct.3 (rs_v)    0.985    0.035   28.284    0.000
##    .affct.4 (rs_v)    0.985    0.035   28.284    0.000
##    .affct.5 (rs_v)    0.985    0.035   28.284    0.000
##    .affct.6 (rs_v)    0.985    0.035   28.284    0.000

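Each basis coefficient is the proportion of the total change from time 1 to time 6 that the average individual has accumulated by that time point. bc1 is the proportion accumulated between time 1 and time 2 (0.026, about 3%), bc2 between time 1 and time 3 (0.095), bc3 between time 1 and time 4 (0.628), and bc4 between time 1 and time 5 (0.788). The jump from bc2 to bc3 shows that most of the change happens between time 3 and time 4.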
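To answer the question posed at the start of this section (where does most of the change occur?), the cumulative basis coefficients can be differenced into per-interval shares. This is a sketch using the estimates copied from the output above; the names basis, increments, and implied_means are my own.

```r
# Basis coefficients from the summary above (first fixed to 0, last fixed to 1)
basis <- c(0, 0.026, 0.095, 0.628, 0.788, 1)

# Share of the total change that falls in each adjacent interval
increments <- diff(basis)
names(increments) <- paste0('time ', 1:5, ' -> ', 2:6)
round(increments, 3)

# Model-implied average affect at each time point:
# intercept mean + basis coefficient * slope mean
implied_means <- 4.042 + basis * 4.701
round(implied_means, 3)
```

The largest share falls between time 3 and time 4, and more change occurs between time 5 and time 6 than between time 2 and time 3.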

Bo\(^2\)m =)