Model and modeling

What is a model?

  • description, analogy
  • a usually miniature representation of something
  • a description or analogy used to help visualize something (such as an atom) that cannot be directly observed
  • a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs.

Merriam-Webster Online Dictionary

Model parameters

  • Parameter - a characteristic or feature that defines some system.
  • Model parameters - the true defining characteristics of a model, usually numerical. In the modeling context, these are only the unknown features of the model, e.g. a correlation.
  • Estimated parameters - approximations of the true parameter values, obtained by comparing the model to the data, e.g. a correlation coefficient.

Parsimony rule

a.k.a. Occam’s Razor: entities should not be multiplied unnecessarily.

The simpler, the better, unless complexity is necessary.

If two models have the same fit to the data, the simpler one should be preferred.

A higher number of parameters automatically increases the model’s fit to the data but decreases its fit to reality: the overfitting problem.

Empirical data vs Model structure/configuration

Data:

  • Experience and information based on the senses, collected in a systematic way
  • Usually separated from model
  • ❗️Data are not parameters. Parameters are unknowns, data are knowns.

Model:

  • Research question, hypothesis;
    • there are no right answers to the wrong questions.
  • Implicit parts of the model: defaults and assumptions:
    • e.g. linearity, causality, no measurement error, uncorrelatedness of factors;
  • ❗️models are often based on a different sort of information, e.g. pure theorizing.

Approaches to model development

  • Confirmatory
    • requires a well-established theory supported by many previous studies; usually tests small issues to add precision to the existing theory. Yields a binary result: confirmed/rejected.
  • Alternative models
    • Requires two or more contradictory hypotheses, each well grounded in established theories.
  • Model generation
    • Models are generated iteratively, based on data and intuition. Beneficial when there is no theory.

Latent Variables

Observed (manifest) vs Latent Variables

Is it reasonable to assume that a construct is measured without error?

  • Subjective well-being
  • Generalized trust
  • Education
  • Gender
  • Age
  • Height

Latent variable is

  • something you can’t directly observe;
  • the expected value of an observed variable, cleared of measurement error (the “true score”);
  • any variable with unknown values (Bollen);
  • the common variance of several indicators (local independence axiom) - more on this in a minute.

Classifications of latent variables

  • a priori and a posteriori (whether or not they are part of an existing theory)
  • reflective and formative (whether the indicators are consequences or causes of the latent construct)
  • continuous or categorical (e.g. factors vs. clusters)
  • identified or non-identified (whether it is possible to measure them).

Measurement of latent variables

Latent variables are included in many models

  • Linear regressions (residuals are LVs)
  • Multinomial regressions (transformed dependent variable is LV)
  • Factor analysis (factors are LVs)
  • Latent growth curves (curves are LVs)
  • Latent class analysis (classes are LVs)
  • Item Response Theory (ability is LV)
  • Structural models with LVs

Thurstone’s Common Factor Model

The aim of factor analysis is to find the parameters of the latent variable(s) that explain all the covariances between indicators, by splitting the variance of each indicator into common and unique parts.

Conceptual

Every indicator is a linear combination of one or more latent factors plus the unique variance of the indicator (residual/error).
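In equation form, for an indicator \(y_i\) loading on factors \(F_1, \dots, F_m\): \[ y_i = \lambda_{i1}F_1 + \lambda_{i2}F_2 + \dots + \lambda_{im}F_m + \epsilon_i, \] where the \(\lambda\)s are factor loadings and \(\epsilon_i\) is the unique part (residual/error) of the indicator.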

The same, but as a factor analysis diagram.

Principal component analysis is not a factor analysis

because

  • it doesn’t decompose the variance of observed variables into common (factors) and unique parts;
  • measurement error is not accounted for;
  • it searches for component loadings that maximize the explained variance of the indicators rather than their covariance. The main point of factor analysis techniques is to explain covariances.

The main purpose of PCA is to reduce dimensionality rather than to find latent variables. Example: image compression, where most of the pixels are described with only a few variables.

PCA doesn’t assume causal relations between variables, therefore:

  • It works with very small samples
  • It doesn’t require theory about latent variable
  • It’s not appropriate for weakly related observed variables.
  • It’s not appropriate with a small number of observed variables.
  • It’s not appropriate as a preliminary analysis before confirmatory factor analysis.
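A minimal sketch of this difference in R using the psych package (the data frame `df` and the choice of two dimensions are illustrative assumptions):

```r
library(psych)  # provides principal() and fa()

# PCA: components are weighted sums that maximize explained variance
pca_fit <- principal(df, nfactors = 2, rotate = "none")

# EFA: factors explain covariances, splitting each indicator's
# variance into common (communality) and unique parts
efa_fit <- fa(df, nfactors = 2, fm = "ml", rotate = "none")

pca_fit$loadings     # component loadings
efa_fit$loadings     # factor loadings
efa_fit$communality  # common variance of each indicator
```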

A misconception fueled by SPSS (whose “Factor Analysis” menu runs PCA by default)

Exploratory Factor Analysis

  1. EFA provides guesses about the underlying latent variable(s) by extracting the common covariance (communalities). The main purpose of this first stage is to find the number of factors (how many there are).

  2. Factor rotation, which simplifies the factor structure.

This results in a theory regarding the underlying latent variables and their structure.

Prerequisites

  • Some expectations regarding continuous latent variables
  • Continuous data (or assumed to be continuous)

Determining the number of factors

Common criteria:

  • Kaiser’s rule: the number of eigenvalues greater than 1 (problematic)
  • Interpretability (ok, but subjective)
  • Scree test of eigenvalues (Cattell, 1966): whether the eigenvalue plot looks steep enough
  • Acceleration factor - determines the number of factors using the second derivative (don’t ask); detects the location of the elbow on the eigenvalue graph.
  • Optimal coordinates - a regression line is fitted to each pair of points on the eigenvalue graph; if it predicts the next point correctly, the optimal number of factors has been reached.
  • Parallel analysis - an algorithm generates random data of the same size as the empirical data, runs factor analysis on them, and compares that factor solution to the empirical one. This way, purely random features can be separated from the systematic ones specific to our data. Sometimes considered the most reliable method (see the sketch after this list).
  • Chi-square and other fit measures can also be used.
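A sketch of how several of these criteria can be obtained in R; the psych and nFactors packages are assumed to be installed, and `df` is a hypothetical data frame of indicators:

```r
library(psych)
library(nFactors)

# Parallel analysis: compares empirical eigenvalues with eigenvalues
# of random data of the same size
fa.parallel(df, fm = "ml", fa = "fa")

# Kaiser's rule, acceleration factor, and optimal coordinates
ev <- eigen(cor(df, use = "pairwise.complete.obs"))$values
ns <- nScree(x = ev)
ns            # prints the number of factors suggested by each criterion
plotnScree(ns)
```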

Unrotated solution (basic values)

Rotation

  • Simplifies the factor solution.
  • Are factors expected to correlate or not? And how much?
    • Initially, unrotated factors are uncorrelated (orthogonal). A huge assumption.
  • The model fit doesn’t change after rotation.
  • Orthogonal rotation keeps factors unrelated.
  • Oblique rotation allows the rotated factors to correlate, but requires specifying the degree of correlation.

Rotated loadings (“oblimin” rotation)
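For example, in R with the psych package (hypothetical data frame `df`; oblimin rotation requires the GPArotation package):

```r
library(psych)

# oblique rotation: factors are allowed to correlate
efa_obl <- fa(df, nfactors = 2, fm = "ml", rotate = "oblimin")

print(efa_obl$loadings, cutoff = 0.3)  # rotated pattern loadings
efa_obl$Phi                            # estimated factor correlations
```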

Confirmatory Factor Analysis

Exploratory vs confirmatory factor analysis

Compared to exploratory factor analysis, confirmatory factor analysis:

  • is very straightforward;
  • follows the parsimony rule by using fewer parameters;
  • fixes cross-loadings to zero initially (but you can set them free as well);
  • needs no rotation, because the simple structure is reached by explicitly fixing loadings;
  • allows covariances of residuals to be included;
  • allows hypotheses to be tested.
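A minimal CFA sketch in R with the lavaan package, using the six indicators from the example below; the two-factor assignment and the factor names are illustrative assumptions:

```r
library(lavaan)

model <- '
  # cross-loadings are fixed to zero simply by being omitted
  f1 =~ ipcrtiv + impfree + ipadvnt
  f2 =~ impfun + ipgdtim + impdiff

  # a residual covariance, explicitly freed (hypothetical)
  impfun ~~ ipgdtim
'

fit <- cfa(model, data = df)
summary(fit, fit.measures = TRUE, standardized = TRUE)
```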

Types of factor analysis

| Exploratory (EFA) | Confirmatory (CFA) |
|---|---|
| Usually no theory before the analysis. | An a priori theory is required. |
| Closely follows the data. | Tests theory with the data. |
| The aim is to describe the data. | The aim is to test hypotheses. |
| The number of factors is unknown. | The number of factors is known from theory. |
| Applied at earlier stages of scale development. | Applied later in the scale development cycle, when the indicators are already known to manifest specific factors. |
| All factor loadings are non-zero (harder to interpret). | Some factor loadings are fixed to zero. |

Local independence axiom

All the common covariance between indicators is due to a latent factor.

Local independence - when the indicator residuals are not correlated, i.e. when the only thing the indicators have in common is the factor.

Quite often this assumption is unrealistic, because indicators have many sources of covariance, e.g.:

  • mode of survey;
  • question format;
  • question wording;
  • data source;
  • question order;
  • content (points to presence of another latent factor).

Indicator residuals can be allowed to correlate only in CFA

Fundamental equations of CFA

Variance of an indicator = the squared factor loading multiplied by the factor variance, plus the residual variance. \[ Var(y_1) = F.loading_{y_1}*F.loading_{y_1} * Var_{F_1} + Residual_{y_1} \]

Covariance between two indicators of a single factor = simply the product of the factor loadings and the factor variance. \[ Covar_{y_1,y_2} = F.loading_{y_1}*F.loading_{y_2}*Var_{F}\]

Covariance between two indicators of a single factor, with a covariance between residuals = the product of the two factor loadings and the factor variance, plus the covariance of the residuals. \[ Covar_{y_1,y_2} = F.loading_{y_1}*F.loading_{y_2}*Var_{F}+Covar_{Residuals(y_1,y_2)} \]

Covariance between two indicators of two different factors = the product of the corresponding factor loadings and the covariance between the factors. \[ Covar_{y_1(F_1),y_3(F_2)} = F_1.loading_{y_1} * F_2.loading_{y_3} * Covar_{F_1,F_2} \]
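For instance, with hypothetical values \(F.loading_{y_1} = 0.7\), \(F.loading_{y_2} = 0.8\), and \(Var_{F} = 1.5\), the single-factor formula gives \(Covar_{y_1,y_2} = 0.7 * 0.8 * 1.5 = 0.84\).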

Example

Model-implied variance-covariance table
ipadvnt impfun impdiff
ipadvnt 1.99 0.60 0.95
impfun 0.60 1.51 0.71
impdiff 0.95 0.71 1.76

☑️ Task

Using the formulas and the parameter estimates on the diagram, compute:

  • variance of ipcrtiv
  • covariance between impdiff and ipgdtim

Answers

  • variance of ipcrtiv = 1.5795862
  • covariance between impdiff and ipgdtim = 0.6879641

Model identification

Identification of a model is a characteristic of the model (i.e. of the theory, not the data) that allows its parameters to be uniquely estimated (identified).

Example of non-identified model:

\[ x + y = 10 \]

\(x\) and \(y\) can take an infinite number of values (e.g. 1 and 9, 5 and 5, 4.5 and 6.5, etc.) and the equation will still hold. Non-identification means there is not enough information in the model.

Example of identified model:

\[ x + 1 = 10 \]

Here, \(x\) can take only one unique value, 9.

Conditions of model identification

  • The number of degrees of freedom is equal to or greater than 0: df ≥ 0.
  • Regression models always have df = 0, so they are always identified.
  • Recursive path models are always identified.
  • CFA models have their own rules of identification (below).

Degrees of freedom

  • DF is the difference between the number of known pieces of information and the number of unknown parameters.
  • Every degree of freedom can be seen as an opportunity to find one extra unknown parameter.

The model \(x + y = 10\) has 2 unknown parameters and 1 known piece of information, therefore df = 1 - 2 = -1 and the model is not identified. The equation \(x + 1 = 10\) has 1 unknown parameter and 2 known pieces, therefore df = 2 - 1 = 1, and this model is identified.

In CFA and other structural equation models the counted information is a number of unique elements in variance-covariance matrix of observed variables.

The number of unique elements in the variance-covariance matrix of observed variables is \[ N_{obs} = {k*(k+1)} / 2, \] where \(k\) is the number of observed variables. For example, for 4 observed variables there are \[ N_{obs} = {4*(4+1)} / 2 = 10 \] ten unique pieces of information.

Identification of CFA model

\[ N_{parameters} = (N_{factors} * (N_{factors}+1))/2 + N_{obs.variables}*N_{factors} + N_{obs.variables} - N_{factors} \]

Parameters in CFA include:

  • variances and covariances of factors \((N_{factors} * (N_{factors}+1))/2\),
  • residuals of observed variables/indicators \(N_{obs.variables}\),
  • factor loadings \(N_{obs.variables}*N_{factors}\),
  • ❗️minus the fixed parameters that set the factor metrics: \(N_{factors}\)
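For example, a one-factor model with 4 indicators, identified with a marker variable, has \(N_{parameters} = 1 + 4*1 + 4 - 1 = 8\), while the variance-covariance matrix supplies \(4*(4+1)/2 = 10\) unique pieces of information, so df = 10 - 8 = 2 and the model is identified.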

Three ways to identify simple CFA

Every factor in CFA should be given a metric to be identified.

  • Fix one factor loading to 1 (“marker variable” method)
  • Fix factor variance to 1
  • Fix the average loading to 1 (“effect coding” method).
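In lavaan, the three options look roughly like this (a sketch; `f` and `y1`-`y3` are placeholder names):

```r
library(lavaan)

# 1. Marker variable: lavaan fixes the first loading to 1 by default
fit1 <- cfa('f =~ y1 + y2 + y3', data = df)

# 2. Fix the factor variance to 1 and free all loadings
fit2 <- cfa('f =~ y1 + y2 + y3', data = df, std.lv = TRUE)

# 3. Effect coding: loadings are constrained to average 1
fit3 <- cfa('f =~ y1 + y2 + y3', data = df, effect.coding = "loadings")
```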

Simple rules of identification in CFA:

Make sure that:

  • every factor has been assigned a metric (the marker-variable method is preferred in most cases);
  • there are at least 3 indicators in a 1-factor model and at least 2 indicators per factor in multifactor models;
  • residual covariances consume degrees of freedom, so check that df ≥ 0.

Sorts of identification

  • non-identified, df < 0, impossible to estimate parameters;
  • overidentified, df > 0. We should strive for this;
  • just-identified, df = 0, it’s ok, but the fit cannot be assessed.

“Weakly identified” or “empirically underidentified” models can occur in the case of very high correlations between observed variables (roughly > 0.9).

How to choose factor metric?

Use the indicator that is the most reliable and most closely related to the latent construct.

It sets the measurement unit of the latent variable.

Diagrams 1-3 illustrate the three ways of setting a factor metric.

Non-identified example 1: -1 degrees of freedom.

Non-identified example 2: -1 degrees of freedom.

Some common problems of CFA

  • Empirical underidentification. It can happen when two or more indicators are very highly correlated (multicollinearity);
  • Insufficient sample size. CFA requires “large” samples. Some say more than 20 cases per free parameter (so the simplest one-factor, three-indicator model, with 6 free parameters, would require 120 cases).
  • No convergence. Various reasons, but most often:
    • poor identification (sometimes due to a very complex model);
    • poor fit to the data.
  • Negative variance estimates (inadmissible solutions, Heywood cases). Negative variance doesn’t exist; it’s a mathematical artifact, so look out for negative estimates. It may be a symptom of underidentification.

Testing hypotheses with CFA

Fixing parameters

  • fix a parameter to some value, usually 1 or 0, and check whether the overall model fit becomes worse:
    • to save degrees of freedom (e.g. in order to identify the model);
    • to test whether some parameter is actually zero (e.g. a cross-loading, a covariance of residuals).
  • constrain a parameter to some function of another parameter:
    • force equality with another parameter (e.g. all factor loadings are equal);
    • fix the range of possible values (e.g. factor variance is greater than 0);
    • in multiple-group models, fix parameters to equality across groups (e.g. to test whether factor loadings are the same across groups); see the sketch below.
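In lavaan model syntax, such fixes and constraints look roughly like this (placeholder names; a sketch, not a complete model):

```r
model <- '
  f1 =~ y1 + y2 + y3
  f2 =~ y4 + a*y5 + a*y6  # equal loadings via a shared label

  f1 =~ 0*y4              # cross-loading fixed to zero
  y1 ~~ 0*y2              # residual covariance fixed to zero

  f1 ~~ v*f1              # label the factor variance...
  v > 0                   # ...and constrain it to be positive
'
# in multiple-group models, equality across groups can be requested as:
# cfa(model, data = df, group = "country", group.equal = "loadings")
```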

Fixed/constrained parameters and global fit comparison

To be able to compare the models, they must be nested.

The models are nested if one can be derived from the other by adding or removing parameters. The model with more parameters is called “full”; the model with fewer parameters is called “reduced”.

In CFA, the typical nested models include:

  • one- and two-factor models (but not two- vs. three- or more-factor models);
  • correlated and non-correlated factors;
  • with residual covariance and without it.

Models are not nested if:

  • some parameters were added and some were dropped;
    • given the same set of observed variables, such models can still be compared with AIC and BIC (smaller values indicate better fit);
  • they use different sets of observed variables;
    • these cannot be compared at all.

Example

Diagrams: the full model and a nested (reduced) model.

Model fit

The quality of a model is the similarity of the estimated parameters to the true population parameters (given that the model is correct). This is impossible to check directly.

Model fit indicates the similarity between the observed variance-covariance matrix and the implied one (i.e. predicted by the model).

There are many (>50) fit indices, but only a few are conventionally used: chi-square, CFI, TLI, RMSEA, SRMR.

Predicted and observed variance-covariance matrices

Observed variance-covariance matrix

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 1.99 0.67 0.99 0.49 0.50 0.17
impfun 0.67 1.51 0.68 0.77 0.42 0.33
impdiff 0.99 0.68 1.76 0.66 0.68 0.38
ipgdtim 0.49 0.77 0.66 1.41 0.52 0.47
ipcrtiv 0.50 0.42 0.68 0.52 1.58 0.50
impfree 0.17 0.33 0.38 0.47 0.50 1.30

Model implied variance-covariance matrix

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 1.99 0.60 0.95 0.58 0.57 0.33
impfun 0.60 1.51 0.71 0.77 0.43 0.25
impdiff 0.95 0.71 1.76 0.69 0.68 0.40
ipgdtim 0.58 0.77 0.69 1.41 0.41 0.24
ipcrtiv 0.57 0.43 0.68 0.41 1.58 0.50
impfree 0.33 0.25 0.40 0.24 0.50 1.30

Difference between model-implied and observed, a.k.a. unstandardized residuals

ipadvnt impfun impdiff ipgdtim ipcrtiv impfree
ipadvnt 0.00 -0.07 -0.04 0.09 0.07 0.16
impfun -0.07 0.00 0.03 0.00 0.01 -0.08
impdiff -0.04 0.03 0.00 0.03 0.00 0.02
ipgdtim 0.09 0.00 0.03 0.00 -0.11 -0.23
ipcrtiv 0.07 0.01 0.00 -0.11 0.00 0.00
impfree 0.16 -0.08 0.02 -0.23 0.00 0.00

χ2 - Chi-square of the model

  • Test of exact fit of observed variance-covariance matrix and the model-implied matrix.
  • We are usually interested in their equality, so χ2 should be insignificant.
  • Large χ2 indicates that the two matrices are different.
  • The value of χ2 is not meaningful by itself: the smaller, the better.

In just-identified models, χ2 = 0 always, but this only means that we cannot assess the model fit.

χ2 is sensitive to sample size: in large samples (N > 1,000) it is almost always large and significant. For this reason, its interpretation is typically limited to small samples.

\[\chi^2 = F_{max.lik.}(N-1) \] with the model’s degrees of freedom, where \(F_{max.lik.}\) is the final value of the maximum likelihood fit function and \(N\) is the sample size.

Two different chi-squares

Chi-square of the factor [tested] model compares the observed variance-covariance matrix with the model-implied one.

Chi-square of the baseline [independence] model compares the observed variance-covariance matrix with zeros. It measures the probability that the data were generated by the independence model. It is usually very large and significant (and that’s ok).

Do not confuse these two!

Standardized Root Mean Square Residual - SRMR

Absolute fit: literally just the aggregated and standardized residuals.

Recommended values: <0.08

  • The averaged standardized residual - the difference between the model-implied and observed variance-covariance matrices.
  • If SRMR is too high, check the matrix of residuals to identify the source of the problem.

Comparative Fit index - CFI

Recommended values: >0.90 or >0.95

Compares chi-squares of the model and of the imaginary “independence model” in which variables are unrelated to each other.

The null hypothesis is weak, so CFI values are usually very high.

Theoretical range is 0 to 1, where 0 means equality to the “independence model”, and 1 means zero probability that the tested model is the “independence model”.

\[ CFI = 1- \frac{\chi^2_{model}-df_{model}}{\chi^2_{independence}-df_{independence}}\]

Tucker-Lewis Index - TLI

TLI is very similar to CFI, though it can be slightly higher than 1. Simulations have shown that this index may be more robust than CFI.

\[ TLI = \frac{\chi^2_{independence}/df_{independence} - \chi^2_{model}/df_{model}}{\chi^2_{independence}/df_{independence} - 1} \]

Root Mean Squared Error of Approximation - RMSEA

Parsimony-corrected fit index, i.e. corrects for the model complexity.

Recommended values: <0.08; <0.05.

Unlike other fit indices RMSEA has a confidence interval. The upper bound should not exceed 0.08.

PCLOSE – the probability that RMSEA is no greater than 0.05; PCLOSE should be greater than .05.

It is an inverse index of fit, i.e. higher values mean worse fit.

It works better in larger samples.

\[ RMSEA = \sqrt\frac{\chi^2_{model} - df_{model}}{df_{model}*(N-1)} \]

Use all of the model fit indices

Well, at least three, preferably from different classes of fit indices, to draw on their respective strengths.

Remember: every index has its benefits and downsides. If at least one fit index is beyond the recommended values, the model cannot be accepted.

  • Fit indices do not reflect the predictive validity of the model (they aren’t like \(R^2\))
  • High model fit does not guarantee theoretical soundness of the model.
  • There are usually many models that fit the data similarly well, so you have found only one of them.
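In lavaan, the conventional set can be extracted in one call (a sketch, assuming a fitted object `fit`):

```r
fitMeasures(fit, c("chisq", "df", "pvalue",
                   "cfi", "tli",
                   "rmsea", "rmsea.ci.upper", "rmsea.pvalue",  # rmsea.pvalue = PCLOSE
                   "srmr"))
```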

❗️ If the model fit is poor, estimated model parameters should not be taken seriously.

Which part of the model causes poor fit?

Chi-square, CFI, TLI, and RMSEA are global fit indices; they show the fit of the whole model.

Standardized residuals may point to the problematic relations. Modification indices can help find missing parameters.

Modification indices

Preliminary estimates of parameters before they are actually included in the model.

A list of all the possible parameters not yet included in the model. Useful diagnostic information, for example:

  • MI – modification index (the degree to which the chi-square would be reduced after adding this parameter to the model)
  • EPC – expected parameter change – what would be the parameter estimate.

NB: this works only for single parameters; that is, the indices change every time a new parameter is included in the model.
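In lavaan (a sketch, assuming a fitted object `fit`):

```r
# standardized (correlation-metric) residuals: where do the observed
# and model-implied matrices disagree?
resid(fit, type = "cor")

# the ten largest modification indices with expected parameter changes
modindices(fit, sort. = TRUE, maximum.number = 10)
```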

Model comparison

Chi-square test of nested models

The simple difference between the chi-squares of two nested models shows whether adding/removing parameters significantly improved/worsened the model’s fit to the data.

If the chi-square difference \(\chi^2_{difference} = \chi^2_{reduced} - \chi^2_{full}\) is significant, then the full model has a better fit. If it is insignificant, then the reduced and full models have the same fit, so, following the parsimony rule, the reduced model is preferred.
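In lavaan, this test is available through anova() (a sketch; `model_full` and `model_reduced` are placeholder model strings):

```r
fit_full    <- cfa(model_full,    data = df)
fit_reduced <- cfa(model_reduced, data = df)

# chi-square (likelihood-ratio) difference test of the nested models
anova(fit_reduced, fit_full)
```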

Nested models comparison using CFI, RMSEA, SRMR

Difficult to test statistically, but a larger CFI and TLI and a smaller RMSEA and SRMR point to better models.

Comparison of non-nested models

Information criteria are just chi-squares adjusted in different ways for model complexity, sample size, and/or the number of observed variables.

  • ❗️ Different software uses different formulas.
  • not scaled, do not mean much by themselves;
  • smaller values indicate better model.

Akaike Information Criterion - AIC \[ AIC = \chi^2 - 2*df \]

Bayesian Information Criterion - BIC \[ BIC = \chi^2 + \log(N_{samp})*(N_{vars}(N_{vars} + 1)/2 - df) \]

The Sample-Size Adjusted BIC \[ SABIC = \chi^2 + [(N_{samp} + 2)/24]*[N_{par}*(N_{par} + 1)/2 - df] \]

\(df\) - degrees of freedom, \(N_{vars}\) - number of variables in the model, \(N_{par}\) - number of free parameters in the model, \(N_{samp}\) - sample size.

Applications of CFA

Validity and reliability

The main focus of psychometrics

Reliability of measurement – how precise

Degree to which latent variable scores are free of measurement error.

Common types of reliability:

  • Reliability-consistency (Cronbach’s Alpha) – overall intercorrelation between indicators;
  • Test-retest reliability – does the instrument provide the same results under different measurement conditions? E.g. measuring depression in the morning and in the evening.
  • Reliability of parallel forms – do different forms of the same instrument measure the same construct? E.g. can we measure (and compare) a level of religiosity with frequency of prayer and with church attendance? Will they reveal the same scores?
  • DIF – differential item functioning – does the instrument work in the same way across different groups? Our main focus*

* Some researchers think DIF is a validity problem rather than a reliability problem.

McDonald’s Omega: a measure of reliability-consistency based on CFA (a modern version of Cronbach’s Alpha)

Higher values indicate higher consistency of indicators - the share of common variance in the total variance of all indicators. \(\omega\) can be used only with indicators measured on the same scales.

\[ \omega_{McDonald} = \frac{(Sum~of~loadings)^2} {(Sum~of~loadings)^2+Sum~of~residuals +Covariance~of~residuals}\]
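A sketch of computing \(\omega\) from a fitted one-factor lavaan model following the formula above (the model and variable names are placeholders; the factor variance is fixed to 1 so the loadings are on a common metric):

```r
library(lavaan)

fit <- cfa('f =~ y1 + y2 + y3 + y4', data = df, std.lv = TRUE)

est    <- lavInspect(fit, "est")
lambda <- est$lambda[, 1]  # factor loadings
theta  <- est$theta        # residual variances (and covariances, if any)

# (sum of loadings)^2 / ((sum of loadings)^2 + residuals + their covariances)
omega <- sum(lambda)^2 / (sum(lambda)^2 + sum(theta))
omega
```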

Validity of measurement – whether instrument measures what it is supposed to measure

  • Construct validity indicates whether the measured score actually captures our construct. It is the most important validity; all other types of validity are proxy measures of construct validity. a.k.a. simply “validity”.

Content

the indicators capture all theoretically relevant aspects of the construct

Convergent

another instrument aimed at measuring the same construct gives similar results

Discriminant

the measure does not correlate with another construct to which it is theoretically unrelated or only weakly related

Criterion

the measure is aligned with some other variable believed to be crucial for the construct. The criterion can usually be directly observed or validly measured.

Predictive

how well the measure can predict some theoretically relevant outcomes, e.g. ability to learn -> academic performance after 1 year at university

DIF – differential item functioning

when the relations between the latent construct (factor) and the items (indicators) differ across groups

Reliability-Validity paradox

Increase in reliability can deteriorate validity, and vice versa.

Example:

Maksim wants a good measure of outgroup enmity. In the pilot questionnaire, he included five items on attitudes toward immigrants and five items on attitudes toward various social minorities. He found that the Cronbach’s Alpha reliability coefficient was not high enough (only 0.5!). In the next version of the questionnaire, he replaced the second five items with more items on attitudes toward migrants. Reliability increased to 0.85. Maksim is very happy now 🤓 But what happened to the validity of the outgroup enmity scale?

Many other applications of CFA

Higher-order factor models

second-order factor model

Second-order factors replace the covariances of first-order factors; the first-order factors serve as indicators of the second-order factors.

Structural models with latent variables




Maksim Rudnev, 2019 using RMarkdown.