PSYC 3032 M
Mixing Categorical and Continuous Predictors
Recall the model in which reading score is regressed on two dummy codes (control is the reference group):\[\hat{Reading \ score}_i = {\color{deeppink} {\beta_0}} + {\color{darkcyan} {\beta_1}}{\color{darkgrey} {D1_i}} + {\color{gold} {\beta_2}}{\color{lightblue} {D2_i}}\]
\[\hat{Reading \ score}_i = {\color{deeppink} {6.68}} + {\color{darkcyan} {3.09}}{\color{darkgrey} {D1_i}} + {\color{gold} {1.09}}{\color{lightblue} {D2_i}}\]
\[\hat{Reading \ score}_i = {\color{deeppink} {6.68}} + {\color{darkcyan} {3.09}}{\color{darkgrey} {(DRTA \ vs. \ Control)_i}} + {\color{gold} {1.09}}{\color{lightblue} {(TA \ vs. \ Control)_i}}\]
Regression with a Categorical Predictor is a One-Way ANOVA
\[\hat{Reading \ score}_i = {\color{deeppink} {\beta_0}} + {\color{darkcyan} {\beta_1}}{\color{darkgrey} {D1_i}} + {\color{gold} {\beta_2}}{\color{lightblue} {D2_i}} \\ \hat{Reading \ score}_i = {\color{deeppink} {6.68}} + {\color{darkcyan} {3.09}}{\color{darkgrey} {D1_i}} + {\color{gold} {1.09}}{\color{lightblue} {D2_i}}\]
What about including other predictors beyond a single categorical variable?
If the dummy-coding approach is analogous to a one-way ANOVA, then an MLR model with at least one categorical predictor and at least one continuous predictor is analogous to an ANCOVA (Analysis of Covariance)!
So, an ANCOVA model is just a special case of the multiple regression model that includes both categorical and continuous predictors…
Typically, ANCOVA is used for comparing group means on an outcome variable while controlling for some continuous variable
It’s common for researchers to use the word “covariate” when referring to the continuous variable in ANCOVA
If ANOVA/regression with a categorical variable is commonly presented as a method for comparing group means, ANCOVA is often presented as a method for comparing adjusted means across groups (AKA conditional means)
The interpretation of the slope associated with the categorical predictor will change slightly, but you’re already familiar with this change!
This is the same shift in interpretation that applies to any predictor when you move from SLR to MLR
QUICK EXAMPLE
Say we estimate a model where life satisfaction is regressed on different types of meditation interventions (e.g., control, mindfulness, mantra-based, and gratitude-based), conditioning on socio-economic status (SES)
Then, we can interpret the dummy code representing one of the comparisons between meditation types as the mean difference in life satisfaction between, say, mindfulness and control, adjusted for SES (or holding SES constant)
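This adjusted mean difference is easy to see in a quick simulation. A minimal sketch, assuming made-up variable names and effect sizes (e.g., a mindfulness effect of 5 points is invented purely for illustration):

```r
set.seed(7)
n <- 50
# Four simulated intervention groups; "control" is the (alphabetical) reference level
group <- factor(rep(c("control", "mindfulness", "mantra", "gratitude"), each = n))
ses <- rnorm(4 * n, mean = 50, sd = 10)
# Life satisfaction depends on SES plus a group effect (all values are illustrative)
satisfaction <- 20 + 0.3 * ses +
  c(control = 0, mindfulness = 5, mantra = 3, gratitude = 2)[as.character(group)] +
  rnorm(4 * n, 0, 4)
ancova <- lm(satisfaction ~ group + ses)
# The mindfulness dummy code = the mindfulness-vs-control mean difference,
# adjusted for SES; it should recover roughly the simulated value of 5
coef(ancova)["groupmindfulness"]
```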
But, there’s this other thing…
ANCOVA has an additional critical assumption which is known as homogeneity of regression, or parallelism
Parallelism means that the relationship between the continuous variable, \(X\), and the outcome, \(Y\), is assumed to be constant, or homogeneous, across the levels of the categorical predictor
Put simply, the traditional ANCOVA model assumes that there is no interaction between the continuous and categorical variables
But, if we reframe ANCOVA as MLR, it’s easy to relax this assumption by including an interaction term between the continuous variable and the dummy-coded variables representing group membership (we will address interactions in Module 7)
Another way to think about the parallelism assumption from a regression framework is the assumption that your model is properly specified
Yes, the reading comprehension example, again!
Baumann et al. (1992) were actually interested in how the groups differed in their post-intervention reading test scores (post-test) over and above any differences on a reading test score administered before the intervention (pre-test)
Let’s explore the pre-test scores…
| Group | Count | Mean | SD | Min | Max | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|
| control | 22 | 10.500000 | 2.972092 | 4 | 16 | -0.2181529 | -0.6538804 |
| DRTA | 22 | 9.727273 | 2.693587 | 6 | 16 | 0.8073246 | -0.4021767 |
| TA | 22 | 9.136364 | 3.342304 | 4 | 14 | 0.0036221 | -1.5433129 |
Baumann et al. (1992) were actually interested in how the groups differed in their post-intervention reading test scores over and above any differences on a reading test score administered before the intervention
\[\hat{Post}_i=\beta_0 + \beta_1D1_i + \beta_2D2_i + \beta_3Pre_i\]
Using this ANCOVA approach, the omnibus, overall effect of the intervention variable is the joint effect of D1 and D2, taken together
To obtain this joint effect and its statistical significance, we can follow a hierarchical regression procedure
\(\text{Model 1}: \hat{Post}_i= {\color{deeppink} {\beta_0 + \beta_1Pre_i}}\)
vs.
\(\text{Model 2 (ANCOVA)}: \hat{Post}_i= {\color{deeppink} {\beta_0 + \beta_1Pre_i}} + \beta_2D1_i + \beta_3D2_i\)
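The steps above can be sketched in R. Because the Baumann dataset isn’t reproduced here, this sketch uses a simulated stand-in with the same column names (`posttest1`, `pretest1`, `group`); the simulated effects are illustrative, not the published values:

```r
set.seed(1)
# Simulated stand-in for the reading data: 22 students per group
read <- data.frame(
  pretest1 = rnorm(66, 10, 3),
  group    = factor(rep(c("control", "DRTA", "TA"), each = 22))
)
read$posttest1 <- -0.6 + 0.7 * read$pretest1 +
  c(control = 0, DRTA = 3.6, TA = 2.0)[as.character(read$group)] +
  rnorm(66, 0, 2.3)

mod1 <- lm(posttest1 ~ pretest1, data = read)          # Model 1: covariate only
mod2 <- lm(posttest1 ~ pretest1 + group, data = read)  # Model 2: add the dummy codes
anova(mod1, mod2)  # F test for the joint (omnibus) effect of D1 and D2
```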
And, for the actual model comparison, we use the F test; the resulting ANCOVA (Model 2) coefficient estimates and 95% confidence intervals:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.5966478 1.1845062 -0.5037101 6.162502e-01
pretest1 0.6931872 0.1014697 6.8314735 4.205253e-09
groupDRTA 3.6265538 0.7361861 4.9261371 6.553543e-06
groupTA 2.0361644 0.7449616 2.7332475 8.161831e-03
2.5 % 97.5 %
(Intercept) -2.9644420 1.7711465
pretest1 0.4903523 0.8960222
groupDRTA 2.1549387 5.0981689
groupTA 0.5470074 3.5253214
library(emmeans)
# Estimate the groups' marginal means
emm <- emmeans(mod2, ~ group)
# Pairwise comparisons for every unique pair
pairs(emm, adjust = "none") # adjust= refers to multiplicity control (e.g., Tukey), but I don't know if I believe in MC
 contrast       estimate    SE df t.ratio p.value
 control - DRTA    -3.63 0.736 62  -4.926  <.0001
 control - TA      -2.04 0.745 62  -2.733  0.0082
 DRTA - TA          1.59 0.734 62   2.165  0.0342
What about Using Change Scores Instead of Conditioning on Pretest?
Direct Measurement of Change: Change scores directly model the amount of change from pre-test to post-test, providing a clear and intuitive measure of individual differences over time.
Simplicity and Interpretability: Change scores are straightforward and easy to communicate, ideal for situations where stakeholders prefer clear, direct metrics of impact or change.
Potential for Bias: In non-randomized studies, change scores can introduce bias if baseline differences that influence outcomes are meaningful and not controlled for
Amplification of Measurement Error: Change scores can increase the impact of measurement error, especially if the measurement tools used at pre- and post-test are not highly reliable. This is due to their reliance on the accuracy of two separate measurements instead of one.
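The change-score alternative is straightforward to fit. A minimal sketch with simulated pre/post data (the treatment effect of 2 points and group names are invented for illustration):

```r
set.seed(3)
# Simulated pre/post design: 40 participants per group
pre <- rnorm(80, 10, 3)
grp <- factor(rep(c("control", "treatment"), each = 40))
post <- pre + ifelse(grp == "treatment", 2, 0) + rnorm(80, 0, 2)

change <- post - pre
t.test(change ~ grp)       # two-sample t test on the change scores
summary(lm(change ~ grp))  # the equivalent regression formulation
```

Note that the t test on change scores and the regression of `change` on group give the same group comparison; neither conditions on the pre-test the way ANCOVA does.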
Although ANCOVA and change scores represent different approaches to handling the same research design, researchers may actually arrive at different conclusions if they were to use both approaches on the same dataset; this phenomenon is called Lord’s Paradox (Lord, 1967)
Lord’s original experiment was meant to evaluate how young men and women differ on weight change over the course of a semester
But, men obviously start at a much higher average weight than women
In Lord’s dataset, men and women do not change at all over time (mean change = ~0 in both groups). Here’s a simulated illustration…
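A hedged sketch of how data like these could be simulated (the weights and seed are illustrative, so the printout will not match the numbers below exactly; the key ingredient is that final weight regresses toward each group’s own mean, so mean change is ~0 in both groups):

```r
set.seed(42)
n <- 100
gender <- factor(rep(c("Women", "Men"), each = n), levels = c("Women", "Men"))
mu <- ifelse(gender == "Men", 170, 130)  # group-specific mean weights (illustrative)
initial <- rnorm(2 * n, mu, 10)
# Final weight regresses toward the group mean: zero average change in each group
final <- mu + 0.5 * (initial - mu) + rnorm(2 * n, 0, 5)
dat <- data.frame(gender, initial, final, change = final - initial)

summary(lm(change ~ gender, data = dat))           # change scores: ~no gender effect
summary(lm(final ~ gender + initial, data = dat))  # ANCOVA: large gender effect
```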
Call:
lm(formula = change ~ gender, data = dat)
Residuals:
Min 1Q Median 3Q Max
-15.5233 -3.3977 0.6239 3.9303 13.8364
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.1692 0.5190 2.253 0.0254 *
genderMen 0.1209 0.7340 0.165 0.8694
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.19 on 198 degrees of freedom
Multiple R-squared: 0.0001369, Adjusted R-squared: -0.004913
F-statistic: 0.02711 on 1 and 198 DF, p-value: 0.8694
Call:
lm(formula = final ~ gender + initial, data = dat)
Residuals:
Min 1Q Median 3Q Max
-9.4746 -1.9902 0.1751 1.8293 8.0446
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.79411 1.67270 20.20 <2e-16 ***
genderMen 13.42333 0.79429 16.90 <2e-16 ***
initial 0.44538 0.02797 15.92 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.006 on 197 degrees of freedom
Multiple R-squared: 0.9463, Adjusted R-squared: 0.9457
F-statistic: 1734 on 2 and 197 DF, p-value: < 2.2e-16
Results
Using a t test on the change scores, we do not conclude a difference between men and women
Using ANCOVA, we conclude a difference between men and women in final weight controlling for initial weight
Change scores address the question, “What is the difference in weight change between men and women?”
ANCOVA answers, “Is there still a difference between men and women in their final weights, after accounting for where each individual started (initial weight)?”
ANCOVA Assumptions
The assumptions of OLS regression apply equivalently to models with discrete predictors, and the same diagnostic procedures presented in earlier modules can be used

library(car)
mod2 <- lm(posttest1 ~ pretest1 + haven::as_factor(group), data = read)
stud_resid <- rstudent(mod2) # Studentize the model residuals
scatterplot(stud_resid ~ read$group, boxplot = FALSE)
Recall, LINE:
And, of course:
\(\hat{Post}_i= {\color{deeppink} {\beta_0}} + {\color{deeppink} {\beta_1D1_i}} + {\color{deeppink} {\beta_2D2_i}} + {\color{deeppink} {\beta_3Pre_i}}\) vs. \(\hat{Post}_i={\color{deeppink} {\beta_0}} + {\color{deeppink} {\beta_1D1_i}} + {\color{deeppink} {\beta_2D2_i}} + {\color{deeppink} {\beta_3Pre_i}} + \beta_4D1_i \times Pre_i + \beta_5D2_i \times Pre_i\)
library(ggplot2)
read$group <- haven::as_factor(read$group) # Ensure the group variable is treated as a factor
interaction_mod <- lm(posttest1 ~ pretest1 * group, data = read) # By using * we automatically add all the terms, beta1 through beta5, in the model above!
no_int_mod <- lm(posttest1 ~ pretest1 + group, data = read) # the same as mod2 from earlier

ggplot(read, aes(x = pretest1, y = posttest1, color = group)) +
  geom_smooth(method = "lm", se = TRUE, aes(fill = group), alpha = 0.25) + # Add linear regression lines with semi-transparent confidence bands
  geom_point(size = 2, alpha = 0.6) + # Plot the points with slight transparency
  labs(x = "Pretest Score", y = "Posttest Score", title = "Regression Slopes by Group") +
  theme_classic() +
  theme(legend.position = "none") # theme() must come after theme_classic(), or the complete theme overrides the legend setting
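The parallelism assumption can be tested formally by comparing the two models with and without the interaction terms. A self-contained sketch using a simulated stand-in with the same column names as the reading data (simulated with truly parallel slopes, so the test should usually be non-significant):

```r
set.seed(2)
# Simulated stand-in: same slope of pretest1 in every group (parallel by construction)
d <- data.frame(
  pretest1 = rnorm(66, 10, 3),
  group    = factor(rep(c("control", "DRTA", "TA"), each = 22))
)
d$posttest1 <- 0.7 * d$pretest1 +
  c(control = 0, DRTA = 3.6, TA = 2.0)[as.character(d$group)] + rnorm(66, 0, 2.3)

no_int_mod      <- lm(posttest1 ~ pretest1 + group, data = d)  # parallel slopes
interaction_mod <- lm(posttest1 ~ pretest1 * group, data = d)  # group-specific slopes
# F test of the interaction terms (beta4 and beta5): a significant result would
# indicate that the slopes differ across groups, violating parallelism
anova(no_int_mod, interaction_mod)
```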
Assignment 1 Review
“The researcher hypothesizes that these three focal predictors, as a set, will explain variability in career satisfaction above and beyond the effects of age and sex.”
— PSYC3032M Assignment 1
Most appropriate to use hierarchical regression, why?
Because the research hypothesis was that work climate, respect, and influence predict career satisfaction over and above age and sex, so:
\(Model \ 1: \hat{Satisfaction}_i={\color{deeppink} {\beta_0}}+{\color{deeppink} {\beta_1Age}} + {\color{deeppink} {\beta_2Sex}}\)
vs.
\(Model \ 2: \hat{Satisfaction}_i={\color{deeppink} {\beta_0}}+{\color{deeppink} {\beta_1Age}} + {\color{deeppink} {\beta_2Sex}} + \beta_3Climate+\beta_4Respect+\beta_5Influence\)
“She also believes that each predictor will be meaningfully associated with career satisfaction when adjusting for the rest of the focal predictors as well as the covariates. Specifically, higher scores on each of the three predictors are expected to be associated with greater career satisfaction.”
— PSYC3032M Assignment 1
For this hypothesis, we can just interpret the micro/predictor-level effects of the focal predictors in Model 2
For example, unstandardized regression coefficients (\(\hat{\beta} s\)), CIs, semipartial correlation squared (\(sr^2\)), significance tests, etc.
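A quick sketch of computing \(sr^2\) for one focal predictor as an \(R^2\) increment. The data are simulated and the variable names merely mirror the assignment’s (the coefficients are invented for illustration):

```r
set.seed(5)
n <- 200
# Simulated predictors named after the assignment's variables (illustrative only)
d <- data.frame(age = rnorm(n, 40, 10), sex = rbinom(n, 1, 0.5),
                climate = rnorm(n), respect = rnorm(n), influence = rnorm(n))
d$satisfaction <- 0.1 * d$age + 0.5 * d$climate + 0.4 * d$respect +
  0.3 * d$influence + rnorm(n)

full    <- lm(satisfaction ~ age + sex + climate + respect + influence, data = d)
reduced <- lm(satisfaction ~ age + sex + climate + respect, data = d) # drop influence
# sr^2 for influence = unique variance it explains over all other predictors
sr2_influence <- summary(full)$r.squared - summary(reduced)$r.squared
sr2_influence
```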
But, the phrasing “meaningfully associated” refers to the effect sizes, their precision, and their practical implications, more than to the significance test alone, though significance should also be taken into account
“the researcher also wishes to explore potential differences in career satisfaction based on sex, hypothesizing that men may report higher career satisfaction than women.”
— PSYC3032M Assignment 1
Module 6 (Part 2)