Model and modeling

What is a model?

  • description, analogy
  • a usually miniature representation of something
  • a description or analogy used to help visualize something (such as an atom) that cannot be directly observed
  • a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs.

Merriam-Webster Online Dictionary

Model parameters

  • Parameter - a characteristic or feature that defines some system.
  • Model parameters - the true defining characteristics of a model, usually numerical. In the modeling context, these are only the unknown features of the model, e.g. a correlation.
  • Estimated parameters - approximations of the true parameter values, obtained by comparing the model to the data, e.g. a correlation coefficient.

Parsimony rule

a.k.a. Occam’s Razor: entities should not be multiplied unnecessarily.

The simpler, the better, unless complexity is necessary.

If two models have the same fit to the data, the simpler one should be preferred.

A higher number of parameters automatically increases the model’s fit to the data but decreases its fit to reality: the overfitting problem.

Empirical data vs Model structure/configuration

Data:

  • Experience and information based on the senses, collected in a systematic way
  • Usually separated from model
  • ❗️Data are not parameters. Parameters are unknowns, data are knowns.

Model:

  • Research question, hypothesis;
    • there are no right answers to the wrong questions.
  • Implicit parts of the model: defaults and assumptions:
    • e.g. linearity, causality, no measurement error, uncorrelatedness of factors;
  • ❗️models are often based on a different sort of information, e.g. pure theorizing.

Approaches to model development

  • Confirmatory
    • requires a well-established theory supported by many previous studies; usually tests small issues to add precision to the existing theory. Yields a binary result: confirmed/rejected.
  • Alternative models
    • Requires two or more contradictory hypotheses, each well grounded in established theories.
  • Model generation
    • Models are generated iteratively, based on data and intuition. Beneficial when there is no theory.

Latent Variables

Observed (manifest) vs Latent Variables

Is it reasonable to assume that a construct is measured without error?

  • Subjective well-being
  • Generalized trust
  • Education
  • Gender
  • Age
  • Height

Latent variable is

  • something you can’t directly observe;
  • the expected value of an observed variable, cleared of measurement error (the “true score”);
  • any variable with unknown values (Bollen);
  • the common variance of several indicators (local independence axiom) - more on this in a minute.

Classifications of latent variables

  • a priori and a posteriori (whether or not they are part of an existing theory)
  • reflective and formative (whether the indicators are consequences or causes of the latent construct)
  • continuous or categorical (e.g. factors vs. clusters)
  • identified or non-identified (whether it is possible to measure them).

Measurement of latent variables

Latent variables are included in many models

  • Linear regressions (residuals are LVs)
  • Multinomial regressions (transformed dependent variable is LV)
  • Factor analysis (factors are LVs)
  • Latent growth curves (curves are LVs)
  • Latent class analysis (classes are LVs)
  • Item Response Theory (ability is LV)
  • Structural models with LVs

Thurstone’s Common Factor Model

The aim of factor analysis is to find the parameters of the latent variable(s) that explain all the covariances between indicators, by splitting the variance of each indicator into common and unique parts.

Conceptual

Every indicator is a linear combination of one or more latent factors plus the unique variance of the indicator (residual/error).
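In equation form, for an indicator \(y_i\) loading on factors \(F_1, \dots, F_m\): \[ y_i = \lambda_{i1}F_1 + \lambda_{i2}F_2 + \dots + \lambda_{im}F_m + \epsilon_i, \] where the \(\lambda\)s are factor loadings and \(\epsilon_i\) is the unique part (residual/error) of the indicator.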

The same, but as a factor analysis diagram.

Principal component analysis is not a factor analysis

because

  • it doesn’t decompose the variance of observed variables into common (factors) and unique parts;
  • measurement error is not accounted for;
  • it searches for component loadings that maximize the explained variance of the indicators rather than their covariance. The main point of factor analysis techniques is to explain covariances.

The main purpose of PCA is to reduce dimensionality rather than to find latent variables. Example: image compression, where most of the pixels are described with only a few variables.

PCA doesn’t assume causal relations between variables, therefore:

  • It works with very small samples
  • It doesn’t require theory about latent variable
  • It’s not appropriate for weakly related observed variables.
  • It’s not appropriate with a small number of observed variables.
  • It’s not appropriate as a preliminary analysis before confirmatory factor analysis.
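A minimal sketch of this difference in R using the psych package (the data frame `df` and the choice of two dimensions are illustrative assumptions):

```r
library(psych)  # provides principal() and fa()

# PCA: components are weighted sums that maximize explained variance
pca_fit <- principal(df, nfactors = 2, rotate = "none")

# EFA: factors explain covariances, splitting each indicator's
# variance into common (communality) and unique parts
efa_fit <- fa(df, nfactors = 2, fm = "ml", rotate = "none")

pca_fit$loadings     # component loadings
efa_fit$loadings     # factor loadings
efa_fit$communality  # common variance of each indicator
```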

A misconception fueled by SPSS (whose “Factor Analysis” menu runs PCA by default)

Exploratory Factor Analysis

  1. EFA provides guesses about the underlying latent variable(s) by extracting the common covariance (communalities). The main purpose of this first stage is to find the number of factors (how many there are).

  2. Factor rotation, which simplifies the factor structure.

This results in a theory regarding the underlying latent variables and their structure.

Prerequisites

  • Some expectations regarding continuous latent variables
  • Continuous data (or assumed to be continuous)

Determining the number of factors

Common criteria:

  • Kaiser’s rule: the number of eigenvalues greater than 1 (problematic)
  • Interpretability (ok, but subjective)
  • Scree test of eigenvalues (Cattell, 1966): whether the eigenvalue plot looks steep enough
  • Acceleration factor - determines the number of factors using the second derivative (don’t ask); detects the location of the elbow on the eigenvalue graph.
  • Optimal coordinates - a regression line is fitted to each pair of points on the eigenvalue graph; if it predicts the next point correctly, the optimal number of factors has been reached.
  • Parallel analysis - an algorithm generates random data of the same size as the empirical data, runs factor analysis on them, and compares that factor solution to the empirical one. This way, purely random features can be separated from the systematic ones specific to our data. Sometimes considered the most reliable method (see the sketch after this list).
  • Chi-square and other fit measures can also be used.
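A sketch of how several of these criteria can be obtained in R; the psych and nFactors packages are assumed to be installed, and `df` is a hypothetical data frame of indicators:

```r
library(psych)
library(nFactors)

# Parallel analysis: compares empirical eigenvalues with eigenvalues
# of random data of the same size
fa.parallel(df, fm = "ml", fa = "fa")

# Kaiser's rule, acceleration factor, and optimal coordinates
ev <- eigen(cor(df, use = "pairwise.complete.obs"))$values
ns <- nScree(x = ev)
ns            # prints the number of factors suggested by each criterion
plotnScree(ns)
```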

Unrotated solution (basic values)

Rotation

  • Simplifies the factor solution.
  • Are factors expected to correlate or not? And how much?
    • Initially, unrotated factors are uncorrelated (orthogonal). A huge assumption.
  • The model fit doesn’t change after rotation.
  • Orthogonal rotation keeps factors unrelated.
  • Oblique rotation allows the rotated factors to correlate, but requires specifying the degree of correlation.

Rotated loadings (“oblimin” rotation)
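For example, in R with the psych package (hypothetical data frame `df`; oblimin rotation requires the GPArotation package):

```r
library(psych)

# oblique rotation: factors are allowed to correlate
efa_obl <- fa(df, nfactors = 2, fm = "ml", rotate = "oblimin")

print(efa_obl$loadings, cutoff = 0.3)  # rotated pattern loadings
efa_obl$Phi                            # estimated factor correlations
```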

Confirmatory Factor Analysis

Exploratory vs confirmatory factor analysis

Compared to exploratory factor analysis, confirmatory factor analysis:

  • is very straightforward;
  • follows the parsimony rule by using fewer parameters;
  • fixes cross-loadings to zero initially (but you can set them free as well);
  • needs no rotation, because the simple structure is reached by explicitly fixing loadings;
  • allows covariances of residuals to be included;
  • allows hypotheses to be tested.
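A minimal CFA sketch in R with the lavaan package, using the six indicators from the example below; the two-factor assignment and the factor names are illustrative assumptions:

```r
library(lavaan)

model <- '
  # cross-loadings are fixed to zero simply by being omitted
  f1 =~ ipcrtiv + impfree + ipadvnt
  f2 =~ impfun + ipgdtim + impdiff

  # a residual covariance, explicitly freed (hypothetical)
  impfun ~~ ipgdtim
'

fit <- cfa(model, data = df)
summary(fit, fit.measures = TRUE, standardized = TRUE)
```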

Types of factor analysis

| Exploratory (EFA) | Confirmatory (CFA) |
|---|---|
| Usually no theory before the analysis. | An a priori theory is required. |
| Closely follows the data. | Tests theory with the data. |
| The aim is to describe the data. | The aim is to test hypotheses. |
| The number of factors is unknown. | The number of factors is known from theory. |
| Applied at earlier stages of scale development. | Applied later in the scale development cycle, when the indicators are already known to manifest specific factors. |
| All factor loadings are non-zero (harder to interpret). | Some factor loadings are fixed to zero. |

Local independence axiom

All the common covariance between indicators is due to a latent factor.

Local independence - when the indicator residuals are not correlated, i.e. when the only thing the indicators have in common is the factor.

Quite often this assumption is unrealistic, because indicators have many sources of covariance, e.g.:

  • mode of survey;
  • question format;
  • question wording;
  • data source;
  • question order;
  • content (points to presence of another latent factor).

Indicator residuals can be allowed to correlate only in CFA

Fundamental equations of CFA

Variance of an indicator = the squared factor loading multiplied by the factor variance, plus the residual variance. \[ Var(y_1) = F.loading_{y_1}*F.loading_{y_1} * Var_{F_1} + Residual_{y_1} \]

Covariance between two indicators of a single factor = simply the product of the factor loadings and the factor variance. \[ Covar_{y_1,y_2} = F.loading_{y_1}*F.loading_{y_2}*Var_{F}\]

Covariance between two indicators of a single factor, with a covariance between residuals = the product of the two factor loadings and the factor variance, plus the covariance of the residuals. \[ Covar_{y_1,y_2} = F.loading_{y_1}*F.loading_{y_2}*Var_{F}+Covar_{Residuals(y_1,y_2)} \]

Covariance between two indicators of two different factors = the product of the corresponding factor loadings and the covariance between the factors. \[ Covar_{y_1(F_1),y_3(F_2)} = F_1.loading_{y_1} * F_2.loading_{y_3} * Covar_{F_1,F_2} \]
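For instance, with hypothetical values \(F.loading_{y_1} = 0.7\), \(F.loading_{y_2} = 0.8\), and \(Var_{F} = 1.5\), the single-factor formula gives \(Covar_{y_1,y_2} = 0.7 * 0.8 * 1.5 = 0.84\).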

Example

Model-implied variance-covariance table
ipadvnt impfun impdiff
ipadvnt 1.99 0.60 0.95
impfun 0.60 1.51 0.71
impdiff 0.95 0.71 1.76

☑️ Task

Using the formulas and the parameter estimates on the diagram, compute:

  • variance of ipcrtiv
  • covariance between impdiff and ipgdtim

Answers

  • variance of ipcrtiv = 1.5795862
  • covariance between impdiff and ipgdtim = 0.6879641

Model identification

Identification of a model is a characteristic of the model (i.e. of the theory, not the data) that allows its parameters to be uniquely estimated (identified).

Example of non-identified model:

\[ x + y = 10 \]

\(x\) and \(y\) can take an infinite number of values (e.g. 1 and 9, 5 and 5, 4.5 and 6.5, etc.) and the equation will still hold. Non-identification means there is not enough information in the model.

Example of identified model:

\[ x + 1 = 10 \]

Here, \(x\) can take only one unique value, 9.

Conditions of model identification

  • The number of degrees of freedom is equal to or greater than 0: df ≥ 0.
  • Regression models always have df = 0, so they are always identified.
  • Recursive path models are always identified.
  • CFA models have their own rules of identification (below).

Degrees of freedom

  • DF is the difference between the number of known pieces of information and the number of unknown parameters.
  • Every degree of freedom can be seen as an opportunity to find one extra unknown parameter.

The model \(x + y = 10\) has 2 unknown parameters and 1 known piece of information, therefore df = 1 - 2 = -1 and the model is not identified. The equation \(x + 1 = 10\) has 1 unknown parameter and 2 known pieces, therefore df = 2 - 1 = 1, and this model is identified.

In CFA and other structural equation models the counted information is a number of unique elements in variance-covariance matrix of observed variables.

The number of unique elements in the variance-covariance matrix of observed variables is \[ N_{obs} = {k*(k+1)} / 2, \] where \(k\) is the number of observed variables. For example, for 4 observed variables there are \[ N_{obs} = {4*(4+1)} / 2 = 10 \] ten unique pieces of information.

Identification of CFA model

\[ N_{parameters} = (N_{factors} * (N_{factors}+1))/2 + N_{obs.variables}*N_{factors} + N_{obs.variables} - N_{factors} \]

Parameters in CFA include:

  • variances and covariances of factors \((N_{factors} * (N_{factors}+1))/2\),
  • residuals of observed variables/indicators \(N_{obs.variables}\),
  • factor loadings \(N_{obs.variables}*N_{factors}\),
  • ❗️minus the fixed parameters that set the factor metrics: \(N_{factors}\)
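For example, a one-factor model with 4 indicators, identified with a marker variable, has \(N_{parameters} = 1 + 4*1 + 4 - 1 = 8\), while the variance-covariance matrix supplies \(4*(4+1)/2 = 10\) unique pieces of information, so df = 10 - 8 = 2 and the model is identified.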

Three ways to identify simple CFA

Every factor in CFA should be given a metric to be identified.

  • Fix one factor loading to 1 (“marker variable” method)
  • Fix factor variance to 1
  • Fix the average loading to 1 (“effect coding” method).
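In lavaan, the three options look roughly like this (a sketch; `f` and `y1`-`y3` are placeholder names):

```r
library(lavaan)

# 1. Marker variable: lavaan fixes the first loading to 1 by default
fit1 <- cfa('f =~ y1 + y2 + y3', data = df)

# 2. Fix the factor variance to 1 and free all loadings
fit2 <- cfa('f =~ y1 + y2 + y3', data = df, std.lv = TRUE)

# 3. Effect coding: loadings are constrained to average 1
fit3 <- cfa('f =~ y1 + y2 + y3', data = df, effect.coding = "loadings")
```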

Simple rules of identification in CFA:

Make sure that:

  • every factor has been assigned a metric (the marker-variable method is preferred in most cases);
  • there are at least 3 indicators in a 1-factor model and at least 2 indicators per factor in multifactor models;
  • residual covariances consume degrees of freedom, so check that df ≥ 0.

Sorts of identification

  • non-identified, df < 0, impossible to estimate parameters;
  • overidentified, df > 0. We should strive for this;
  • just-identified, df = 0, it’s ok, but the fit cannot be assessed.

“Weakly identified” or “empirically underidentified” models can occur in the case of very high correlations between observed variables (roughly > 0.9).

How to choose factor metric?

Use the indicator that is the most reliable and most closely related to the latent construct.

It sets the measurement unit of the latent variable.

Diagrams 1-3 illustrate the three ways of setting a factor metric.

Non-identified example 1: -1 degrees of freedom.

Non-identified example 2: -1 degrees of freedom.

Some common problems of CFA

  • Empirical underidentification. It can happen when two or more indicators are very highly correlated (multicollinearity);
  • Insufficient sample size. CFA requires “large” samples. Some say more than 20 cases per free parameter (so the simplest one-factor, three-indicator model, with 6 free parameters, would require 120 cases).
  • No convergence. Various reasons, but most often:
    • poor identification (sometimes due to a very complex model);
    • poor fit to the data.
  • Negative variance estimates (inadmissible solutions, Heywood cases). Negative variance doesn’t exist; it’s a mathematical artifact, so look out for negative estimates. It may be a symptom of underidentification.

Testing hypotheses with CFA

Fixing parameters

  • fix a parameter to some value, usually 1 or 0, and check whether the overall model fit becomes worse:
    • to save degrees of freedom (e.g. in order to identify the model);
    • to test whether some parameter is actually zero (e.g. a cross-loading, a covariance of residuals).
  • constrain a parameter to some function of another parameter:
    • force equality with another parameter (e.g. all factor loadings are equal);
    • fix the range of possible values (e.g. factor variance is greater than 0);
    • in multiple-group models, fix parameters to equality across groups (e.g. to test whether factor loadings are the same across groups); see the sketch below.
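In lavaan model syntax, such fixes and constraints look roughly like this (placeholder names; a sketch, not a complete model):

```r
model <- '
  f1 =~ y1 + y2 + y3
  f2 =~ y4 + a*y5 + a*y6  # equal loadings via a shared label

  f1 =~ 0*y4              # cross-loading fixed to zero
  y1 ~~ 0*y2              # residual covariance fixed to zero

  f1 ~~ v*f1              # label the factor variance...
  v > 0                   # ...and constrain it to be positive
'
# in multiple-group models, equality across groups can be requested as:
# cfa(model, data = df, group = "country", group.equal = "loadings")
```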

Fixed/constrained parameters and global fit comparison

To be able to compare the models, they must be nested.

The models are nested if one can be derived from the other by adding or removing parameters. The model with more parameters is called “full”; the model with fewer parameters is called “reduced”.

In CFA, the typical nested models include:

  • one- and two-factor models (but not two- vs. three- or more-factor models);
  • correlated and non-correlated factors;
  • with residual covariance and without it.

Models are not nested if:

  • some parameters were added and some were dropped;
    • given the same set of observed variables, such models can still be compared with AIC and BIC (smaller values indicate better fit);
  • they use different sets of observed variables;
    • these cannot be compared at all.

Example

Diagrams: the full model and a nested (reduced) model.

Model fit

The quality of a model is the similarity of the estimated parameters to the true population parameters (given that the model is correct). This is impossible to check directly.

Model fit indicates the similarity between the observed variance-covariance matrix and the implied one (i.e. predicted by the model).

There are many (>50) fit indices, but only a few are conventionally used: chi-square, CFI, TLI, RMSEA, SRMR.

Predicted and observed variance-covariance matrices

Observed variance-covariance matrix

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 1.99 0.67 0.99 0.49 0.50 0.17
impfun 0.67 1.51 0.68 0.77 0.42 0.33
impdiff 0.99 0.68 1.76 0.66 0.68 0.38
ipgdtim 0.49 0.77 0.66 1.41 0.52 0.47
ipcrtiv 0.50 0.42 0.68 0.52 1.58 0.50
impfree 0.17 0.33 0.38 0.47 0.50 1.30

Model implied variance-covariance matrix

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 1.99 0.60 0.95 0.58 0.57 0.33
impfun 0.60 1.51 0.71 0.77 0.43 0.25
impdiff 0.95 0.71 1.76 0.69 0.68 0.40
ipgdtim 0.58 0.77 0.69 1.41 0.41 0.24
ipcrtiv 0.57 0.43 0.68 0.41 1.58 0.50
impfree 0.33 0.25 0.40 0.24 0.50 1.30

Difference between model-implied and observed, a.k.a. unstandardized residuals

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 0.00 -0.07 -0.04 0.09 0.07 0.16
impfun -0.07 0.00 0.03 0.00 0.01 -0.08
impdiff -0.04 0.03 0.00 0.03 0.00 0.02
ipgdtim 0.09 0.00 0.03 0.00 -0.11 -0.23
ipcrtiv 0.07 0.01 0.00 -0.11 0.00 0.00
impfree 0.16 -0.08 0.02 -0.23 0.00 0.00

χ2 - Chi-square of the model

  • Test of exact fit of observed variance-covariance matrix and the model-implied matrix.
  • We are usually interested in their equality, so χ2 should be insignificant.
  • Large χ2 indicates that the two matrices are different.
  • The value of χ2 is not meaningful by itself: the smaller, the better.

In just-identified models, χ2 = 0 always, but this only means that we cannot assess the model fit.

χ2 is sensitive to sample size: in large samples (N > 1,000) it is almost always large and significant. For this reason, its interpretation is typically limited to small samples.

\[\chi^2 = F_{max.lik.}(N-1) \] with the model’s degrees of freedom, where \(F_{max.lik.}\) is the final value of the maximum likelihood fit function and \(N\) is the sample size.

Two different chi-squares

Chi-square of the factor [tested] model compares the observed variance-covariance matrix with the model-implied one.

Chi-square of the baseline [independence] model compares the observed variance-covariance matrix with zeros. It measures the probability that the data were generated by the independence model. It is usually very large and significant (and that’s ok).

Do not confuse these two!

Standardized Root Mean Square Residual - SRMR

Absolute fit: literally just the aggregated and standardized residuals.

Recommended values: <0.08

  • The averaged standardized residual - the difference between the model-implied and observed variance-covariance matrices.
  • If SRMR is too high, check the matrix of residuals to identify the source of the problem.

Comparative Fit index - CFI

Recommended values: >0.90 or >0.95

Compares chi-squares of the model and of the imaginary “independence model” in which variables are unrelated to each other.

The null hypothesis is weak, so CFI values are usually very high.

Theoretical range is 0 to 1, where 0 means equality to the “independence model”, and 1 means zero probability that the tested model is the “independence model”.

\[ CFI = 1- \frac{\chi^2_{model}-df_{model}}{\chi^2_{independence}-df_{independence}}\]

Tucker-Lewis Index - TLI

TLI is very similar to CFI, though it can be slightly higher than 1. Simulations have shown that this index may be more robust than CFI.

\[ TLI = \frac{\chi^2_{independence}/df_{independence} - \chi^2_{model}/df_{model}}{\chi^2_{independence}/df_{independence} - 1} \]

Root Mean Squared Error of Approximation - RMSEA

Parsimony-corrected fit index, i.e. corrects for the model complexity.

Recommended values: <0.08; <0.05.

Unlike other fit indices RMSEA has a confidence interval. The upper bound should not exceed 0.08.

PCLOSE – the probability that RMSEA is no greater than 0.05; PCLOSE should be greater than .05.

It is an inverse index of fit, i.e. higher values mean worse fit.

It works better in larger samples.

\[ RMSEA = \sqrt\frac{\chi^2_{model} - df_{model}}{df_{model}*(N-1)} \]

Use all of the model fit indices

Well, at least three, preferably from different classes of fit indices, to draw on their respective strengths.

Remember: every index has its benefits and downsides. If at least one fit index is beyond the recommended values, the model cannot be accepted.

  • Fit indices do not reflect the predictive validity of the model (they aren’t like \(R^2\))
  • High model fit does not guarantee theoretical soundness of the model.
  • There are usually many models that fit the data similarly well, so you have found only one of them.
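In lavaan, the conventional set can be extracted in one call (a sketch, assuming a fitted object `fit`):

```r
fitMeasures(fit, c("chisq", "df", "pvalue",
                   "cfi", "tli",
                   "rmsea", "rmsea.ci.upper", "rmsea.pvalue",  # rmsea.pvalue = PCLOSE
                   "srmr"))
```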

❗️ If the model fit is poor, estimated model parameters should not be taken seriously.

Which part of the model causes poor fit?

Chi-square, CFI, TLI, and RMSEA are global fit indices; they show the fit of the whole model.

Standardized residuals may point to the problematic relations. Modification indices can help find missing parameters.

Modification indices

Preliminary estimates of parameters before they are actually included in the model.

A list of all the possible parameters not yet included in the model. Useful diagnostic information, for example:

  • MI – modification index (the degree to which the chi-square would be reduced after adding this parameter to the model)
  • EPC – expected parameter change – what would be the parameter estimate.

NB: this works only for single parameters; that is, the indices change every time a new parameter is included in the model.
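In lavaan (a sketch, assuming a fitted object `fit`):

```r
# standardized (correlation-metric) residuals: where do the observed
# and model-implied matrices disagree?
resid(fit, type = "cor")

# the ten largest modification indices with expected parameter changes
modindices(fit, sort. = TRUE, maximum.number = 10)
```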

Model comparison

Chi-square test of nested models

The simple difference between the chi-squares of two nested models shows whether adding/removing parameters significantly improved/worsened the model’s fit to the data.

If the chi-square difference \(\chi^2_{difference} = \chi^2_{reduced} - \chi^2_{full}\) is significant, then the full model has a better fit. If it is insignificant, then the reduced and full models have the same fit, so, following the parsimony rule, the reduced model is preferred.
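In lavaan, this test is available through anova() (a sketch; `model_full` and `model_reduced` are placeholder model strings):

```r
fit_full    <- cfa(model_full,    data = df)
fit_reduced <- cfa(model_reduced, data = df)

# chi-square (likelihood-ratio) difference test of the nested models
anova(fit_reduced, fit_full)
```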

Nested models comparison using CFI, RMSEA, SRMR

Difficult to test statistically, but a larger CFI and TLI and a smaller RMSEA and SRMR point to better models.

Comparison of non-nested models

Information criteria are just chi-squares adjusted in different ways for model complexity, sample size, and/or the number of observed variables.

  • ❗️ Different software uses different formulas.
  • not scaled, do not mean much by themselves;
  • smaller values indicate better model.

Akaike Information Criterion - AIC \[ AIC = \chi^2 - 2*df \]

Bayesian Information Criterion - BIC \[ BIC = \chi^2 + \log(N_{samp})*(N_{vars}(N_{vars} + 1)/2 - df) \]

The Sample-Size Adjusted BIC \[ SABIC = \chi^2 + [(N_{samp} + 2)/24]*[N_{par}*(N_{par} + 1)/2 - df] \]

\(df\) - degrees of freedom, \(N_{vars}\) - number of variables in the model, \(N_{par}\) - number of free parameters in the model, \(N_{samp}\) - sample size.

Applications of CFA

Validity and reliability

The main focus of psychometrics

Reliability of measurement – how precise

Degree to which latent variable scores are free of measurement error.

Common types of reliability:

  • Reliability-consistency (Cronbach’s Alpha) – overall intercorrelation between indicators;
  • Test-retest reliability – does the instrument provide the same results under different measurement conditions? E.g. measuring depression in the morning and in the evening.
  • Reliability of parallel forms – do different forms of the same instrument measure the same construct? E.g. can we measure (and compare) a level of religiosity with frequency of prayer and with church attendance? Will they reveal the same scores?
  • DIF – differential item functioning – does the instrument work in the same way across different groups? Our main focus*

* Some researchers think DIF is a validity problem rather than a reliability problem.

McDonald’s Omega: a measure of reliability-consistency based on CFA (a modern version of Cronbach’s Alpha)

Higher values indicate higher consistency of indicators - the share of common variance in the total variance of all indicators. \(\omega\) can be used only with indicators measured on the same scales.

\[ \omega_{McDonald} = \frac{(Sum~of~loadings)^2} {(Sum~of~loadings)^2+Sum~of~residuals +Covariance~of~residuals}\]
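A sketch of computing \(\omega\) from a fitted one-factor lavaan model following the formula above (the model and variable names are placeholders; the factor variance is fixed to 1 so the loadings are on a common metric):

```r
library(lavaan)

fit <- cfa('f =~ y1 + y2 + y3 + y4', data = df, std.lv = TRUE)

est    <- lavInspect(fit, "est")
lambda <- est$lambda[, 1]  # factor loadings
theta  <- est$theta        # residual variances (and covariances, if any)

# (sum of loadings)^2 / ((sum of loadings)^2 + residuals + their covariances)
omega <- sum(lambda)^2 / (sum(lambda)^2 + sum(theta))
omega
```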

Validity of measurement – whether instrument measures what it is supposed to measure

  • Construct validity indicates whether the measured score actually captures our construct. It is the most important validity; all other types of validity are proxy measures of construct validity. a.k.a. simply “validity”.

Content

the indicators capture all theoretically relevant aspects of the construct

Convergent

another instrument aimed at measuring the same construct gives similar results

Discriminant

the measure does not correlate with another construct to which it is theoretically unrelated or only weakly related

Criterion

the measure is aligned with some other variable believed to be crucial for the construct. The criterion can usually be directly observed or validly measured.

Predictive

how well the measure can predict some theoretically relevant outcomes, e.g. ability to learn -> academic performance after 1 year at university

DIF – differential item functioning

when the relations between the latent construct (factor) and the items (indicators) differ across groups

Reliability-Validity paradox

Increase in reliability can deteriorate validity, and vice versa.

Example:

Maksim wants a good measure of outgroup enmity. In the pilot questionnaire, he included five items on attitudes toward immigrants and five items on attitudes toward various social minorities. He found that the Cronbach’s Alpha reliability coefficient was not high enough (only 0.5!). In the next version of the questionnaire, he replaced the second five items with more items on attitudes toward migrants. Reliability increased to 0.85. Maksim is very happy now 🤓 But what happened to the validity of the outgroup enmity scale?

Many other applications of CFA

Higher-order factor models

second-order factor model

Second-order factors replace the covariances of first-order factors; the first-order factors serve as indicators of the second-order factors.

Structural models with latent variables




Maksim Rudnev, 2019 using RMarkdown.