# Alignment method for measurement invariance: Tutorial

It’s been a while since the alignment method for measurement invariance was introduced in 2014, but not many researchers have applied it in practice. Among ~200 citations (as of May 2019) of the original alignment paper there were only a few substantive applications. It is a pity, because alignment usually gives more optimistic results than the conventional (frequentist, exact) measurement invariance techniques. I guess this is due to its statistical complexity and a lack of simple guidelines. In this post I summarize, in an approachable way, the steps necessary to apply the alignment procedure. In addition, I provide a couple of R functions that automate the preparation of Mplus code and the extraction of useful information from the outputs.

[February 26, 2022]: Some minor errors were fixed.
[November 13, 2020]: The post was updated to make it fully reproducible.

## Intro

The typical start for applying alignment procedure is: ok, I tested my model for invariance with multiple group confirmatory factor analysis (MGCFA) and the equality of loadings and/or intercepts was rejected, so what’s next? Next goes one or some of the following:

If you believe that most parameters are invariant and few are non-invariant, try alignment, which will show you the possible set of groups in which invariance holds. It will also provide approximate latent means, even if there is no exact measurement invariance.

If you believe the model is likely to be invariant, but your data are noisy (i.e. many small meaningless residual correlations, cross-loadings, etc.), consider applying approximate invariance. It is based on Bayesian statistics and allows small differences in loadings/intercepts across groups. One sign that the approximate approach is worth trying is when you run into problems even with the configural invariance model (but you are sure your model is properly specified).

If you believe that most parameters are invariant AND your data are noisy, there is an option of Bayesian alignment, which combines approximate invariance (or simply Bayesian estimation) with the alignment approach. I describe it in the “Bayesian estimation” section below.

In case your data contain very many groups (>100), consider using multilevel CFA with random loadings and intercepts (random effects model).

At some point you might admit there is no invariance, at least for some groups. In that case I would usually try to explain why. You may speculate about differences in meaning, conduct cognitive interviews to understand them, or explain non-invariance by applying multilevel CFA with a group-level covariate, as in Davidov et al. 2012.

Below, I focus on the alignment method in Mplus. Alignment estimates a configural invariance model and then modifies the factor loadings and intercepts to make them as similar across groups as possible without deteriorating the model fit. Conceptually, the procedure is akin to target factor rotation, where the target is across-group similarity of loadings and intercepts.

## Step 1. Find an acceptable configural invariance model

This is crucial, because the alignment procedure is based on the configural model, and the models with aligned parameters have the same fit (alignment doesn’t affect fit, similar to factor rotation). If the fit of the configural model is not good enough, consider fitting it with a Bayesian approach and testing approximate Bayesian invariance (probably with further alignment). Dropping items or groups is a drastic measure; apply it only if it is justified substantively.

My example: I use the data from World Values Survey wave 5 (the sample was reduced to 10 countries for the speed of computation). The model is a single factor of sexual and reproductive morality. It has 4 indicators: justifiability of homosexuality, prostitution, abortion, and divorce. The residuals of the abortion and divorce items are allowed to covary. The configural measurement invariance model shows an acceptable fit (CFI = 0.995, RMSEA = 0.072). However, constraining factor loadings across groups, that is, setting up a metric invariance model, ruins the fit (CFI = 0.969, RMSEA = 0.094). So I have to reject the metric invariance hypothesis. Still, the good fit of the configural model gives some hope, so I can reach for the alignment procedure.

Before going further, keep in mind that alignment cannot handle cross-loadings (or anything besides a factor model), but it is fine to have residual covariances. It can also deal with categorical (binary or ordinal) indicators.

## Step 2. Set up “FREE” alignment model in Mplus

There are two kinds of alignment models, Free and Fixed; they differ in the set of constraints placed on the MGCFA model. It is advised to run Free alignment first and then go for the Fixed one.

In general, the Free model works better when non-invariance is large. If this model doesn’t converge, skip to the next step.

Mplus code would look like this:

```
DATA: file = 'mplus_data.tab';
VARIABLE:
NAMES = country prostit homosex abortion divorce;
MISSING=.;
classes = c(10); ! Type the number of groups in your data in parentheses
knownclass = c(country =  36   ! Australia
76   ! Brazil
124  ! Canada
170  ! Colombia
380  ! Italy
554  ! New Zealand
642  ! Romania
643  ! Russia
792  ! Turkey
840  ! United States
);
! The classes are not actually latent, they are *known* and
! it is just a grouping variable. So place the grouping variable
! on the place of 'country' above and list the categories in
! this variable (list all groups). Also, Mplus doesn't handle
! string names of groups, so we have to deal with numeric codes

ANALYSIS:
TYPE = mixture; ! Actually it is a multiple group model,
! but for technical reasons it is specified as a mixture.
ESTIMATOR = ml; ! it can be mlr or mlf, or Bayes as well (see the "Bayesian estimation" section)
ALIGNMENT = FREE; ! this line makes Mplus actually run alignment.

MODEL:
%OVERALL% ! it means the CFA model specified below is applicable in every group
Moral BY prostit homosex abortion divorce;
abortion WITH divorce;

OUTPUT:
align; ! This line requests the detailed info on alignment
```
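Typing the group codes by hand is error-prone, so here is a minimal R sketch of how the data file and the `classes`/`knownclass` lines could be generated from a data frame. The toy data and the object names (`dat`, `group_var`, `variable_section`) are made up for illustration; this is not how `runAlignment()` does it internally.

```r
# Toy data standing in for the real survey file (values are made up)
dat <- data.frame(
  country  = c(36, 36, 76, 124, 792, 840),
  prostit  = c(3, 5, 1, 4, NA, 2),
  homosex  = c(7, 6, 2, 8, 1, 5),
  abortion = c(5, 4, 1, 6, 2, 3),
  divorce  = c(8, 7, 3, 9, 2, 6)
)

group_var <- "country"
codes <- sort(unique(dat[[group_var]]))  # numeric group codes, ascending

# Compose the VARIABLE section with one known class per group code
variable_section <- paste0(
  "VARIABLE:\n",
  "NAMES = ", paste(names(dat), collapse = " "), ";\n",
  "MISSING = .;\n",
  "CLASSES = c(", length(codes), ");\n",
  "KNOWNCLASS = c(", group_var, " = ", paste(codes, collapse = " "), ");\n"
)
cat(variable_section)

# Write the data without a header, tab-separated, with "." for missing values
data_file <- file.path(tempdir(), "mplus_data.tab")
write.table(dat, data_file, sep = "\t", na = ".",
            row.names = FALSE, col.names = FALSE, quote = FALSE)
```

The number in `CLASSES` and the list in `KNOWNCLASS` come from the same vector, so they cannot get out of sync.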

## Step 3. Set up “FIXED” alignment model

In the output of the previous “free alignment” model you can find a message:

```
STANDARD ERROR COMPARISON INDICATES THAT THE FREE ALIGNMENT MODEL MAY BE POORLY IDENTIFIED.
USING THE FIXED ALIGNMENT OPTION MAY RESOLVE THIS PROBLEM.
TO AVOID MISSPECIFICATION USE THE GROUP WITH VALUE 792 AS THE BASELINE GROUP.
```

It is self-explanatory: in the ANALYSIS section of the Mplus input above, replace the line ALIGNMENT = FREE; with ALIGNMENT = FIXED(792);, where the number in parentheses is the group recommended by Mplus (it is simply the group with the smallest estimated latent mean). Sometimes Mplus doesn’t suggest a specific group; in that case choose the group with the smallest latent mean(s) yourself. Then run the new input.
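If you generate the inputs from R, the switch from Free to Fixed is a one-line substitution. The sketch below picks the baseline group as the one with the smallest latent mean and rewrites the ALIGNMENT line; the means are illustrative numbers typed in by hand, and the object names are hypothetical.

```r
# Latent means by group code; in practice take them from the free alignment output.
# These numbers are illustrative only.
latent_means <- c("36" = 2.43, "554" = 2.15, "642" = 0.30, "792" = 0.00, "840" = 1.55)

# The baseline is the group with the smallest latent mean
baseline <- names(latent_means)[which.min(latent_means)]

free_input <- c("ANALYSIS:", "TYPE = mixture;", "ESTIMATOR = ml;", "ALIGNMENT = FREE;")

# Swap the ALIGNMENT line, leave everything else untouched
fixed_input <- sub("ALIGNMENT = FREE;",
                   paste0("ALIGNMENT = FIXED(", baseline, ");"),
                   free_input, fixed = TRUE)
writeLines(fixed_input)
```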

## Step 4. Interpret the “Approximate measurement invariance” output

In the output you will find this specific section of the alignment results. Here’s an excerpt from the fixed alignment:

```
APPROXIMATE MEASUREMENT INVARIANCE (NONINVARIANCE) FOR GROUPS

Intercepts/Thresholds
PROSTIT     36 76 124 170 380 554 (642) 643 792 840
HOMOSEX     36 (76) (124) (170) 380 554 (642) (643) 792 840
ABORTION    36 (76) 124 (170) 380 554 (642) (643) 792 840
DIVORCE     36 (76) 124 170 380 554 642 (643) (792) 840

PROSTIT     (36) 76 (124) (170) (380) 554 642 643 792 (840)
HOMOSEX     36 76 124 170 380 554 (642) 643 (792) 840
ABORTION    (36) 76 (124) 170 380 554 642 643 (792) (840)
DIVORCE     36 76 124 (170) 380 554 642 643 792 840
```

It can be hard to read, but it is meant to simplify the results: this is a table of the intercepts (first block) and the loadings (second block, after the empty line) compared across groups. The groups in which the given parameter is NOT invariant even after alignment are in parentheses. In my example, the intercept of the PROSTIT indicator is significantly different in group 642, while in the other groups it is approximately the same. Likewise, the loading of DIVORCE is non-invariant in group 170 and invariant in all the other groups. The parameters are compared across groups with significance tests; judging by the table in section 6.3, where groups with p-values of 0.003 still count as invariant, the threshold is stricter than the conventional 5% level.
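With many groups it is easier to let R pick out the bracketed codes. A minimal sketch, assuming the lines of this output section were copied into a character vector (the `lines` object below holds the intercept block of my example):

```r
lines <- c(
  "PROSTIT     36 76 124 170 380 554 (642) 643 792 840",
  "HOMOSEX     36 (76) (124) (170) 380 554 (642) (643) 792 840",
  "ABORTION    36 (76) 124 (170) 380 554 (642) (643) 792 840",
  "DIVORCE     36 (76) 124 170 380 554 642 (643) (792) 840"
)

# For every line, extract the group codes written in parentheses
noninvariant <- lapply(lines, function(x) {
  hits <- regmatches(x, gregexpr("\\([0-9]+\\)", x))[[1]]
  gsub("[()]", "", hits)
})

# Name the list elements by the indicator (the first word of each line)
names(noninvariant) <- sub(" .*$", "", lines)
noninvariant
```

Each element of `noninvariant` lists the groups in which the intercept of that indicator is non-invariant.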

## Step 5. Interpret “FACTOR MEAN COMPARISON” output

```
FACTOR MEAN COMPARISON AT THE 5% SIGNIFICANCE LEVEL IN DESCENDING ORDER

Results for Factor MORAL1

           Latent    Group     Factor
Ranking    Class     Value     Mean      Groups With Significantly Smaller Factor Mean
   1          1        36      2.432     554 124 840 76 380 643 170 642 792
   2          6       554      2.154     124 840 76 380 643 170 642 792
   3          3       124      1.773     840 76 380 643 170 642 792
   4         10       840      1.545     76 380 643 170 642 792
   5          2        76      0.855     643 170 642 792
   6          5       380      0.823     643 170 642 792
   7          8       643      0.522     170 642 792
   8          4       170      0.347     792
   9          7       642      0.303     792
  10          9       792      0.000
```

For each factor in the model, alignment produces this comparison of the estimated means. The same information in a different form can be requested with the RANKING option of the SAVEDATA section (see “Ranking table” below).

!! Be careful: these latent means are estimated ignoring measurement non-invariance. It doesn’t mean they are reliable or fully invariant; they were estimated just for reference. They can be taken seriously only if the other tests support approximate measurement invariance.

The first column is the rank of the mean, the second is the internal number of the group, the third is your code of the group, and the fourth is the estimated factor mean.

The last column of the table provides pairwise comparison of every group’s mean  with all the other groups’ means.  In the example, country 36 (Australia) has the highest values on the latent mean, and it significantly differs from all the other countries.
You can find the same means in the upper parts of the output, in the MODEL RESULTS section.

## Step 6. Interpret “ALIGNMENT OUTPUT”

This section is produced if you add the line OUTPUT: ALIGN; to the input code. It provides detailed information on the results of alignment. For each parameter, it shows three things: pairwise comparisons, summarized invariance information, and the parameter values that were aligned across groups.

### 6.1. Pairwise comparison

First, it is a large table which begins like this:

```
ALIGNMENT OUTPUT

INVARIANCE ANALYSIS

Intercepts/Thresholds
Intercept for PROSTIT
Group     Group    Value      Value      Difference   SE         P-value
76        36       1.613      1.606       0.006       0.056      0.912
124       36       1.627      1.606       0.021       0.077      0.788
124       76       1.627      1.613       0.015       0.043      0.735
170       36       1.650      1.606       0.043       0.066      0.508
170       76       1.650      1.613       0.037       0.029      0.204
170       124      1.650      1.627       0.023       0.055      0.680
380       36       1.638      1.606       0.032       0.074      0.669
380       76       1.638      1.613       0.025       0.042      0.543
380       124      1.638      1.627       0.011       0.059      0.853
380       170      1.638      1.650      -0.012       0.051      0.819
554       36       1.532      1.606      -0.074       0.104      0.476
<...>
```

This table compares parameters and statistically tests their equality across each possible pair of groups. The first line in this example compares the intercept for PROSTIT in group 76 and in group 36: the intercept in group 76 is 1.613, the intercept in group 36 is 1.606, and the difference of 0.006 is far from being significant. Yihaa, we found one parameter that is invariant across two groups. If only it always worked like this.
This table can be really large, because it contains every possible pair of groups, so for 10 groups there will be 45 lines for each parameter. It is very detailed, so I would ignore it at this stage and return to it only if other things fail to help.
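The arithmetic behind the table is easy to reproduce in R. The 45 lines are simply the number of unordered pairs of 10 groups. The reported p-values also look consistent with a two-sided z test of the difference against its standard error; that is my reading of the numbers and an assumption, not something stated in the output.

```r
groups <- c(36, 76, 124, 170, 380, 554, 642, 643, 792, 840)

# Number of unordered pairs of groups = number of lines per parameter
n_pairs <- choose(length(groups), 2)
pairs <- t(combn(groups, 2))  # one row per pair

# Row "170 vs 76" of the excerpt: difference and its standard error
difference <- 0.037
se <- 0.029

# Two-sided z test (assumed); compare with the reported p-value of 0.204
z <- difference / se
p_value <- 2 * pnorm(-abs(z))
round(p_value, 3)
```

The small gap between the recomputed and the reported p-value comes from the rounding of the difference and the standard error in the printed table.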

### 6.2. Summarized invariance info

```
Approximate Measurement Invariance Holds For Groups:
36 76 124 170 380 554 643 792 840
```

Below the pairwise comparisons there is a list of groups in which the given parameter was found invariant after alignment. We have already seen this information above, at Step 4. Sometimes it is not very useful, especially if you have many groups and only a few of them are non-invariant: imagine trying to identify the group(s) absent from the list (the answer here is group 642). So just ignore it and refer to Step 4 above.
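If you do need to find the absent groups from this list, a set difference does the job. A tiny sketch with the group codes of my example typed in by hand:

```r
all_groups <- c(36, 76, 124, 170, 380, 554, 642, 643, 792, 840)

# The list printed under "Approximate Measurement Invariance Holds For Groups"
invariant <- c(36, 76, 124, 170, 380, 554, 643, 792, 840)

# Groups that are in the data but absent from the list are the non-invariant ones
missing_groups <- setdiff(all_groups, invariant)
missing_groups
```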

`Weighted Average Value Across Invariant Groups:       1.628`

This is an aligned value that can be considered common for all the invariant groups listed on the previous line. Note that this value is applicable only to the invariant groups!

`R-square/Explained variance/Invariance index:       0.916`

This R² indicates the degree of invariance of the given parameter. Muthén interpreted this index as the degree to which “the variation across groups in the configural model intercepts and loadings for this item is explained by variation in the factor mean and factor variance [respectively] across groups.” What can be a little confusing is that this R² can be really small even if the corresponding parameter is highly invariant*. In my example, the intercept of the PROSTIT indicator was shown to be invariant across 9 out of 10 groups, and accordingly R² is quite high (0.916).

### 6.3. Aligned parameter values

```
Invariant Group Values, Difference to Average and Significance
Group    Value      Difference   SE         P-value
36       1.606      -0.022       0.054      0.687
76       1.613      -0.016       0.012      0.184
124      1.627      -0.001       0.039      0.976
170      1.650       0.022       0.028      0.433
380      1.638       0.010       0.041      0.813
554      1.532      -0.096       0.067      0.152
643      1.595      -0.034       0.011      0.003
792      1.753       0.125       0.042      0.003
840      1.600      -0.028       0.049      0.566
```

Here, the parameter estimates are listed, but only for those groups which were found to be invariant (e.g. line with group 642 isn’t here). This table is meant to demonstrate the invariance of the invariant parameter. That’s why the values of parameters in the non-invariant groups are not included in this table (but you can find them in the main output “MODEL RESULTS” where all the parameters are listed).

### 6.4. Average Invariance index

The tables from 6.1-6.3 are repeated for each factor loading and each indicator intercept. At the very end of the Alignment Output you will find

`Average Invariance index: 0.671`

This is an average R² across all the parameters. It is a handy global score of both metric and scalar invariance. Here, 1 stands for perfect scalar invariance, 0 for (likely impossible) full non-invariance. In general, one may interpret this index as a degree of confidence to which the means can be meaningfully compared across the given set of groups.

## Step 7. Checking the reliability of the results with simulation

The issue with alignment is that it is tied to the current dataset, so its external validity is questionable. For example, if you have small samples within groups, the standard errors of the loadings may be underestimated, so alignment can find invariance where there is none. To check whether this is the case, it is recommended to run a simulation study. Mplus makes it quite easy.

### 7.1. Set up a simulation study

First, you need to re-run your last alignment model, adding the SVALUES command to the OUTPUT section; it prints the parameter estimates in the form of input commands for a simulation study. After running this updated code, navigate to the output section “MODEL COMMAND WITH FINAL ESTIMATES USED AS STARTING VALUES” and copy the whole section (it is usually very large).

Next, you need to make several modifications to it: (a) remove the class intercepts (the lines of the form [ c#1*... ];) from the %OVERALL% section, (b) add starting values to the loadings in this section, and (c) replace C# with G# in the names of classes (e.g. %C#1% becomes %G#1%) throughout the code.

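```
%OVERALL%

! (b) starting values (*1) added to the loadings;
! the factor name must match the one used in the class-specific parts
moral1 BY prostit*1;
moral1 BY homosex*1;
moral1 BY abortion*1;
moral1 BY divorce*1;

! (a) the class intercepts below must be deleted
! (they are commented out here only to show what they look like)
! [ c#1*0.16328 ];
! [ c#2*0.23189 ];
! [ c#3*0.59097 ];
! [ c#4*0.93735 ];
! [ c#5*-0.16690 ];
! [ c#6*-0.29107 ];
! [ c#7*0.36642 ];
! [ c#8*0.51571 ];
! [ c#9*0.12452 ];

%g#1% ! (c) was %C#1%; this change should be done for the rest of the code as well

moral1 BY prostit*1.26506;
moral1 BY homosex*1.51873;
moral1 BY abortion*1.37710;
moral1 BY divorce*1.07074;

abortion WITH divorce*1.03434;

[ prostit*1.60633 ];
<...>
```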

Okay, now we are ready to combine it with the simulation code.
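The two mechanical edits (deleting the class intercepts and renaming the class labels) can be done with regular expressions instead of by hand. This is a rough sketch on a shortened, partly made-up copy of the SVALUES text; the object names are hypothetical, and the starting values for the %OVERALL% loadings still have to be added separately.

```r
# A shortened stand-in for the copied SVALUES section (second class is made up)
svalues <- c(
  "%OVERALL%",
  "[ c#1*0.16328 ];",
  "[ c#2*0.23189 ];",
  "%C#1%",
  "moral1 BY prostit*1.26506;",
  "[ prostit*1.60633 ];",
  "%C#2%",
  "moral1 BY prostit*1.20000;"
)

# (1) Drop the class-intercept lines such as "[ c#1*0.16328 ];"
is_class_intercept <- grepl("^[[:space:]]*\\[[[:space:]]*c#[0-9]+\\*",
                            svalues, ignore.case = TRUE)
edited <- svalues[!is_class_intercept]

# (2) Rename the class labels: %C#k% becomes %g#k%
edited <- sub("^([[:space:]]*)%c#([0-9]+)%", "\\1%g#\\2%",
              edited, ignore.case = TRUE)
writeLines(edited)
```

Indicator intercepts such as `[ prostit*1.60633 ];` are left alone, because the first pattern only matches brackets that start with `c#`.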

Next, create a new input file:

```
MONTECARLO:
NAMES = prostit homosex abortion divorce; ! Names of indicator variables (only)
ngroups = 10; ! Your number of groups
NOBSERVATIONS = 10(100); ! The number of groups again, with the sample size of each group in parentheses.
NREPS = 500; ! This is how many times the data generation and analysis should be repeated.

ANALYSIS:
TYPE = MIXTURE;
ESTIMATOR = ml;
alignment = fixed(9); ! The order number of the group in which the mean was fixed to 0
! (it was 792 = Turkey in Step 3, and it's the 9th group in the knownclass specification)

MODEL POPULATION: ! This section includes a model to generate data

! Paste here the code that we created just before using svalues output
! - it looks like this:
%OVERALL%
! the factor name must match the one used in the class-specific parts
moral1 BY prostit*1;
moral1 BY homosex*1;
moral1 BY abortion*1;
moral1 BY divorce*1;

%g#1%

moral1 BY prostit*1.26506;
moral1 BY homosex*1.51873;
<...>

MODEL: ! This section includes a model to analyze

! AND again, paste here the same edited svalues code -

%OVERALL%
moral1 BY prostit*1;
moral1 BY homosex*1;
moral1 BY abortion*1;
moral1 BY divorce*1;

%g#1%

moral1 BY prostit*1.26506;
moral1 BY homosex*1.51873;
<...>
```

Run it; it will take some time. Save the output, change the sample size in the parentheses, e.g. NOBSERVATIONS = 10(500);, and run again. Then change the sample size once more and run again. You will end up with three or more outputs based on different sample sizes. They will show whether the model is able to reproduce the latent means with data of different sample sizes.
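Rather than editing the sample size by hand, the inputs for all sample sizes can be written in a loop. A minimal sketch, assuming the simulation input is held in a character vector (only its first lines are shown in `template`; the file names are made up):

```r
# First lines of the simulation input; the rest of the file is omitted here
template <- c(
  "MONTECARLO:",
  "NAMES = prostit homosex abortion divorce;",
  "NGROUPS = 10;",
  "NOBSERVATIONS = 10(100);",
  "NREPS = 500;"
)

sample_sizes <- c(100, 500, 1000)
out_dir <- tempdir()

# Write one input file per sample size, changing only the NOBSERVATIONS line
files <- vapply(sample_sizes, function(n) {
  inp <- sub("NOBSERVATIONS = 10\\([0-9]+\\);",
             sprintf("NOBSERVATIONS = 10(%d);", n), template)
  path <- file.path(out_dir, sprintf("sim_%d.inp", n))
  writeLines(inp, path)
  path
}, character(1))
files
```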

### 7.2. Interpret outputs of the simulation

Locate in the output file the following tables:

```
CORRELATIONS AND MEAN SQUARE ERROR OF POPULATION AND ESTIMATE VALUES

                      CORRELATIONS              MEAN SQUARE ERROR
                  Average    Std. Dev.       Average    Std. Dev.
MORAL1
  Mean             0.9716      0.0151         0.2854       0.101
  Variance         0.7738      0.1566         1.2841      14.002

CORRELATION AND MEAN SQUARE ERROR OF THE AVERAGE ESTIMATES

MORAL1 Mean                    0.999        0.045
MORAL1 Variance                0.457        0.724
```

These are two sets of measures of the reliability of the latent means estimated with alignment in the previous steps. The first table results from a two-stage computation:

1. it takes the latent means of all groups estimated in a single simulation run and correlates them with the true (population) means, and then
2. these correlations are averaged across all the simulation runs (in my case 500).

The measure for the latent variances is found in the same way. Std. Dev. here refers to the standard deviation of the correlations (or mean square errors) across simulation runs. It seems correct to interpret these scores as a measure of reliability of the latent means estimated by alignment. These correlations are typically very high, so I would be worried when they are less than 0.95 (Asparouhov and Muthen, 2013 suggest that correlations should be not less than 0.98). In my example, there is something disturbing going on with the estimated variances. However, when I run a simulation with 1500 cases in each group (which is closer to my actual data), this correlation gets very close to 1 (0.998). It means that such a model wouldn’t work if I had fewer respondents.

Additionally, the mean square error is calculated, which is an absolute, inverse measure of agreement: the lower, the better.

Second table is a product of

1. averaging latent means across all the simulation runs (500 in my case), and
2. correlating it with the true values.

The first column lists these correlations; the second column is (apparently) the mean square error. These measures seem to indicate the reliability of the simulation itself and of the measurement model in general.

Sometimes all these correlations are zeros. If this is the case, scroll down to the errors section; it might be that the model was misspecified somehow or that none of the models converged.
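To make the difference between the two tables concrete, here is a toy R re-creation of both computations. It does not call Mplus: the "estimates" are the true means plus random noise, the noise level (`sd = 0.3`) is an arbitrary assumption, and the mean square error formula is my guess at what Mplus reports.

```r
set.seed(1)

# "Population" means of the 10 groups (taken from the Step 5 table)
true_means <- c(2.432, 2.154, 1.773, 1.545, 0.855, 0.823, 0.522, 0.347, 0.303, 0)
n_reps <- 500

# One row per replication: true means plus random estimation error
estimates <- matrix(
  rnorm(n_reps * length(true_means),
        mean = rep(true_means, each = n_reps), sd = 0.3),
  nrow = n_reps
)

# First table: correlate within each replication, then average
cors <- apply(estimates, 1, cor, y = true_means)
mses <- apply(estimates, 1, function(est) mean((est - true_means)^2))
first_table <- c(cor_avg = mean(cors), cor_sd = sd(cors),
                 mse_avg = mean(mses), mse_sd = sd(mses))

# Second table: average the estimates first, then correlate once
avg_est <- colMeans(estimates)
second_table <- c(cor = cor(avg_est, true_means),
                  mse = mean((avg_est - true_means)^2))

round(first_table, 3)
round(second_table, 3)
```

Averaging over replications cancels most of the random error, which is why the second table usually looks much better than the first one.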

That’s it.

If the news is good and alignment helped to locate problematic parameters/groups, you may proceed with dropping groups or modifying the model accordingly to achieve higher levels of invariance. If you are happy with what you get with alignment, the next step might be predicting factor scores based on alignment and then using them as a reliable (though not perfect) substitute for the latent variable in further analyses. It can be done in the standard Mplus way by adding SAVE = FSCORES; (together with a FILE = line) to the SAVEDATA: section.

## Example Mplus files

Here is the list of the files used in the examples above

### Bayesian estimation

This is pretty much uncharted territory, because only a few publications have explored this analysis. One may consider using Bayesian alignment if the data are noisy and even the configural model does not show a great fit to the data. The next step in this case would be setting up a Bayesian approximate invariance model with small prior variances of parameters across groups, and then running the alignment to find a better solution. Check this paper with its supplementary materials for the full example. Another option is to simply substitute maximum likelihood with Bayesian estimation (it requires adding ESTIMATOR = BAYES; to the ANALYSIS: section). From my experience, the latter way produces fewer errors and problems overall compared to ML estimation.

### Ranking table

You can request it by adding SAVEDATA: RANKING IS ranking.dat; in the input file of fixed or free alignment (not simulation). The rankings of groups are based on the freely estimated and aligned group factor means, the differences are determined by the significance of the factor mean differences. It is also listed in the standard output, but in a bit different form, as shown in Step 5 “Factor mean comparison”.

```
Ranking table for MORAL1

,36,554,124,840,76,380,643,170,642,792,
36,X,>,>,>,>,>,>,>,>,>,
554,<,X,>,>,>,>,>,>,>,>,
124,<,<,X,>,>,>,>,>,>,>,
840,<,<,<,X,>,>,>,>,>,>,
76,<,<,<,<,X,,>,>,>,>,
380,<,<,<,<,,X,>,>,>,>,
643,<,<,<,<,<,<,X,>,>,>,
170,<,<,<,<,<,<,<,X,,>,
642,<,<,<,<,<,<,<,,X,>,
792,<,<,<,<,<,<,<,<,<,X,
```
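The ranking file is comma-separated, so it can be turned into a matrix in R with base functions only. A minimal sketch, assuming the comma-separated lines of the ranking table shown in this section were read into a character vector (for example with `readLines()`, after dropping the title line):

```r
ranking_txt <- c(
  ",36,554,124,840,76,380,643,170,642,792,",
  "36,X,>,>,>,>,>,>,>,>,>,",
  "554,<,X,>,>,>,>,>,>,>,>,",
  "124,<,<,X,>,>,>,>,>,>,>,",
  "840,<,<,<,X,>,>,>,>,>,>,",
  "76,<,<,<,<,X,,>,>,>,>,",
  "380,<,<,<,<,,X,>,>,>,>,",
  "643,<,<,<,<,<,<,X,>,>,>,",
  "170,<,<,<,<,<,<,<,X,,>,",
  "642,<,<,<,<,<,<,<,,X,>,",
  "792,<,<,<,<,<,<,<,<,<,X,"
)

# Split every line at the commas (the trailing empty field is dropped)
cells <- strsplit(ranking_txt, ",", fixed = TRUE)
groups <- cells[[1]][-1]  # column labels from the header line

# Rows = groups, columns = groups, cells = ">", "<", "X" or "" (no difference)
ranking <- t(sapply(cells[-1], function(r) r[-1]))
dimnames(ranking) <- list(sapply(cells[-1], `[`, 1), groups)

# Number of groups with a significantly smaller mean, per group
larger_than <- rowSums(ranking == ">")
larger_than
```

An empty cell means that the two means do not differ significantly, as for groups 76 and 380.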

### Fit  function contribution

Some papers report the Fit Function Contribution of every between-group parameter constraint, that is, how much each parameter contributed to the fit function that aligned these same parameters. Simply put, the smaller the fit contribution, the more invariant the parameter is. I find this statistic a bit challenging to use, because it doesn’t have a clear unit and its comparability across parameters and models is questionable. R² already does this job for you. Still, you can request it by adding OUTPUT: TECH8; to the input.

The TECH8 section, closer to the end of the output file, contains very detailed information, so scroll directly to these parts:

```
TECHNICAL 8 OUTPUT
<...>

ALIGNMENT RESULTS FOR MORAL
<...>

-24.259
-16.643
-17.836
-20.312
<...>
Fit Function Intercepts Contribution By Variable
-14.943
-26.055
-28.765
-24.940
```

These numbers are the contributions to the fit function that came from every parameter in alignment. The order of variables follows the data, so it is the same as in my VARIABLE: NAMES statement: prostit homosex abortion divorce.

Mplus also provides fit contributions from every group, but those depend on the sample size (somewhat like group-specific chi-square contributions in multiple group CFA), so if you have an unbalanced sample, this part might be quite useless.

### Estimation fine-tuning

The user has a lot of control over alignment optimization. There are several options that you can add to the ANALYSIS section to tune the alignment optimization algorithm. The following is quoted from the Mplus User’s Guide, version 8 (Muthén & Muthén, 1998-2017):

The ASTARTS option is used to specify the number of random sets of starting values to use for the alignment optimization. The default is 30.

The AITERATIONS option is used to specify the maximum number of iterations in the alignment optimization. The default is 5000.

The ACONVERGENCE option is used to specify the convergence criterion for the derivatives of the alignment optimization. The default is 0.001.

Besides this, it is possible to choose the alignment function itself:

The SIMPLICITY option has two settings: SQRT and FOURTHRT. SQRT is the default. The SQRT setting takes the square root of the weighted component loss function. The FOURTHRT setting takes the double square root of the weighted component loss function. It may in some cases further reduce small significant differences.

The precision of alignment can be boosted by lowering the tolerance value, but you risk losing convergence, i.e. the run might end with no solution at all.

The TOLERANCE option is used to specify the simplicity tolerance value of the alignment optimization which must be positive. The default is 0.01.

The METRIC option  is not related to metric invariance! This option identifies a set of constraints applied to identify the model:

The METRIC option is used to specify the factor variance metric of the alignment optimization. The METRIC option has two settings: REFGROUP and PRODUCT. REFGROUP is the default where the factor variance is fixed at one in the reference group. The PRODUCT setting sets the product of the factor variances in all of the groups to one. The PRODUCT setting is not allowed with ALIGNMENT=FIXED.
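If the inputs are generated from R, these options can simply be appended to the ANALYSIS section. Below is a small sketch of a helper that does this; `make_analysis()` is a hypothetical function written for this post (it is not part of any package), and the option values in the example are arbitrary. It also refuses the combination of METRIC = PRODUCT with fixed alignment, following the restriction quoted above.

```r
# Hypothetical helper: composes the ANALYSIS section with optional tuning options
make_analysis <- function(alignment = "FREE", estimator = "ml", ...) {
  opts <- list(...)

  # METRIC = PRODUCT is not allowed with ALIGNMENT = FIXED (see the quote above)
  if (grepl("^FIXED", alignment, ignore.case = TRUE) &&
      identical(toupper(as.character(opts$METRIC)), "PRODUCT")) {
    stop("METRIC = PRODUCT is not allowed with ALIGNMENT = FIXED")
  }

  lines <- c("ANALYSIS:",
             "TYPE = mixture;",
             paste0("ESTIMATOR = ", estimator, ";"),
             paste0("ALIGNMENT = ", alignment, ";"))

  # Every extra named argument becomes an "OPTION = value;" line
  if (length(opts) > 0) {
    lines <- c(lines, paste0(names(opts), " = ", unlist(opts), ";"))
  }
  lines
}

analysis <- make_analysis("FREE", ASTARTS = 100, TOLERANCE = 0.001,
                          SIMPLICITY = "FOURTHRT")
writeLines(analysis)
```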

### Categorical indicators

If (some of) the indicators are binary or ordinal, it is still possible to apply alignment, and all the steps above stay the same with minor differences. In the input files for alignment you only need to list the categorical indicators in a new line CATEGORICAL = of the VARIABLE: section and add algorithm = integration; to the ANALYSIS: section. Because the parameters are then estimated with numerical integration, it can take a substantial amount of time to compute.

In simulations, everything is the same as well, with a couple of additions. As I just mentioned, add the new line CATEGORICAL = (a simulation input has no VARIABLE: section, so it goes into the MONTECARLO: section) and algorithm = integration; to the ANALYSIS: section. In addition, list all the categorical variables in the new line GENERATE = of the MONTECARLO: section, putting the number of categories minus one in parentheses, something like `GENERATE = homosex (9) prostit (9);` where both variables have 10 categories.
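The "number of categories minus one" can be counted from the data. A minimal R sketch with a made-up data frame; note that it counts the categories actually observed, so if some response option was never chosen, the number has to be corrected by hand.

```r
# Made-up data: both indicators take the values 1..10
dat <- data.frame(
  homosex = rep(1:10, times = 2),
  prostit = rep(1:10, each = 2)
)
categorical <- c("homosex", "prostit")

# Number of observed categories minus one, per indicator
n_minus_one <- vapply(dat[categorical],
                      function(x) length(unique(na.omit(x))) - 1L,
                      integer(1))

categorical_line <- paste0("CATEGORICAL = ",
                           paste(categorical, collapse = " "), ";")
generate_line <- paste0("GENERATE = ",
                        paste0(categorical, " (", n_minus_one, ")",
                               collapse = " "), ";")
writeLines(c(categorical_line, generate_line))
```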

The automation in R described below simplifies these modifications a lot.

## Software

So far, the full alignment analysis described here is available only in Mplus.

The R package “sirt” contains the function invariance.alignment(), which provides a similar procedure.

### Automation in R

I wrote three functions that make it possible to quickly create and run all the models required for the alignment analysis (free, fixed, and simulations).

```
# The functions are part of the MIE package (see the link below);
# wvs.s is the data frame with the indicators and the grouping variable
runAlignment(
  model = "Moral BY prostit homosex abortion divorce;", # Formula in Mplus format
  group = "country",   # grouping variable
  categorical = NULL,  # which indicators are ordinal/binary? supply a character vector
  dat = wvs.s,
  sim.samples = c(100, 500, 1000), # Group sample sizes for simulation;
                                   # the length of this vector also determines
                                   # the number of simulation studies.
                                   # Set to NULL to skip simulations.
  sim.reps = 500,      # The number of simulated datasets in each simulation
  Mplus_com = "Mplus", # Sometimes you don't have direct access to Mplus, so
                       # this argument specifies what to send to a system command line.
  path = getwd(),      # where all the .inp, .out, and .dat files will be stored
  summaries = TRUE     # whether extractAlignment() and extractAlignmentSim() should
                       # be run after all the Mplus work is done.
)
```

Another function, extractAlignment(), summarizes the alignment output. It has a single argument, the path to an Mplus .out file; it prints the summary of alignment in a nice way and returns a list with all the alignment info in an R-manageable format.

And finally, the extractAlignmentSim() function helps with summarizing multiple simulation outputs. It extracts only the information described in Step 7.2.

These functions are now part of the `MIE` R package, see https://github.com/MaksimRudnev/MIE.package

## Resources

Original paper that suggested the alignment method:

Asparouhov, T., & Muthén, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21(4), 495-508. (also known as Webnote 18, version 3) http://www.statmodel.com/examples/webnotes/webnote18_3.pdf

An example with categorical indicators (IRT models):

Muthén, B., & Asparouhov, T. (2014). IRT studies of many groups: the alignment method. Frontiers in Psychology, 5, 978. https://doi.org/10.3389/fpsyg.2014.00978

Another clarification with an example:

Muthén, B., & Asparouhov, T. (2018). Recent methods for the study of measurement invariance with many groups: alignment and random effects. Sociological Methods & Research, 47(4), 637-664. https://doi.org/10.1177/0049124117701488

Extension of alignment to test for equality of residuals and variances (idk why):

Marsh, H. W., Guo, J., Parker, P. D., Nagengast, B., Asparouhov, T., Muthén, B., & Dicke, T. (2018). What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups. Psychological Methods, 23(3), 524-545. http://dx.doi.org/10.1037/met0000113

### Nice applications

Munck, I., Barber, C., & Torney-Purta, J. (2018). Measurement invariance in comparing attitudes toward immigrants among youth across Europe in 1999 and 2009: The alignment method applied to IEA CIVED and ICCS. Sociological Methods & Research, 47(4), 687-728. https://doi.org/10.1177/0049124117729691

Lomazzi, V. (2018). Using Alignment Optimization to test the measurement invariance of gender role attitudes in 59 Countries. Methods, data, analyses: a journal for quantitative methods and survey methodology (mda), 12(1), 77-103. https://www.ssoar.info/ssoar/handle/document/56055

### Another tutorial

Byrne, B. M., & van de Vijver, F. J. (2017). The maximum likelihood alignment approach to testing for approximate measurement invariance: A paradigmatic cross-cultural application. Psicothema, 29(4). https://doi.org/10.7334/psicothema2017.178

### Footnote

* One of the authors of the alignment method listed the following reasons for the lack of correspondence between R² and the number of invariant groups:

Tihomir Asparouhov posted on Friday, December 23, 2016 – 9:55 am
There could be several different reasons.

1. The one threshold that is non-invariant is large (due to non-occurrence of a particular category in one group) and that accounts for the majority of the variability in the threshold.

2. The factor mean variability is small