Let’s use the ToothGrowth dataset. Make sure you first
load the tidyverse package, along with the other packages used below (psych, car, effectsize, lsr, gplots, haven, ppcor, GGally, and plotly).
ToothGrowth {datasets}
The Effect of Vitamin C on Tooth Growth in Guinea Pigs
The outcome (dependent variable) is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).
We’ll first look at the difference in length between the two supplement options, OJ and VC. Shall we start with some descriptive statistics and assumption checks?
tg <- ToothGrowth # renaming to a shorter name for easier typing
dim(tg) # dimensions of the dataset
[1] 60 3
summary(tg)
      len        supp         dose
Min. : 4.20 OJ:30 Min. :0.500
1st Qu.:13.07 VC:30 1st Qu.:0.500
Median :19.25 Median :1.000
Mean :18.81 Mean :1.167
3rd Qu.:25.27 3rd Qu.:2.000
Max. :33.90 Max. :2.000
describe(tg) # from the psych package
     vars  n  mean   sd median trimmed  mad min  max range  skew kurtosis   se
len 1 60 18.81 7.65 19.25 18.95 9.04 4.2 33.9 29.7 -0.14 -1.04 0.99
supp* 2 60 1.50 0.50 1.50 1.50 0.74 1.0 2.0 1.0 0.00 -2.03 0.07
dose 3 60 1.17 0.63 1.00 1.15 0.74 0.5 2.0 1.5 0.37 -1.55 0.08
tg %>%
group_by(supp) %>%
summarise(sample_size = n(), mean_length = mean(len), SD = sd(len))
# A tibble: 2 × 4
supp sample_size mean_length SD
<fct> <int> <dbl> <dbl>
1 OJ 30 20.7 6.61
2 VC 30 17.0 8.27
qplot(supp, len, data = tg, main = "Tooth growth of guinea pigs by supplement type",
    xlab = "Supplement type", ylab = "Tooth length") + geom_boxplot(aes(fill = supp))
ggplot(tg, aes(x = len, fill = supp, color = supp)) + geom_histogram(bins = 8) +
    facet_wrap(~supp)
Is there a statistically significant difference in tooth length between the two delivery methods, orange juice (OJ) and ascorbic acid (VC)?
# Making a one-column data frame with only the OJ lengths
OJ <- tg %>%
filter(supp == "OJ") %>%
dplyr::select(len) %>%
rename(OJ = len)
# Making a one-column data frame with only the VC lengths
VC <- tg %>%
filter(supp == "VC") %>%
dplyr::select(len) %>%
rename(VC = len)
# Combining into a simpler dataset
data <- data.frame(OJ, VC)
# NORMALITY
qqnorm(data$OJ)
qqnorm(data$VC)
hist(data$OJ, breaks = 5)
hist(data$VC, breaks = 10)
# Relying on the Shapiro-Wilk test alone is not recommended, but it can be used as a
# supplement
shapiro.test(data$OJ)
Shapiro-Wilk normality test
data: data$OJ
W = 0.91784, p-value = 0.02359
shapiro.test(data$VC)
Shapiro-Wilk normality test
data: data$VC
W = 0.96567, p-value = 0.4284
# Homogeneity of Variance
ggplot(tg, aes(x = len, colour = supp)) + geom_density()
# NHSTs - Looking for statistical NON-significance
leveneTest(data = tg, y = tg$len, group = tg$supp)
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 1 1.2136 0.2752
58
var.test(len ~ supp, data = tg) # sensitive to non-normal data
F test to compare two variances
data: len by supp
F = 0.6386, num df = 29, denom df = 29, p-value = 0.2331
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.3039488 1.3416857
sample estimates:
ratio of variances
0.6385951
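The F statistic that var.test() reports is simply the ratio of the two sample variances. As a sanity check, here is that ratio computed by hand in base R (using the built-in ToothGrowth data directly):

```r
# Pull the two groups straight from the built-in dataset
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# var.test()'s F statistic is just var(x) / var(y)
f_manual <- var(oj) / var(vc)
f_manual  # ~0.639, matching the var.test() output above
```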
# Note above that the outcome/dependent variable goes to the left of the ~ and
# independent variable goes to the right of the ~
# Fligner-Killeen test
fligner.test(len ~ supp, data = tg)
Fligner-Killeen test of homogeneity of variances
data: len by supp
Fligner-Killeen:med chi-squared = 0.97034, df = 1, p-value = 0.3246
# When assumptions are met
t.test(x = data$OJ, y = data$VC, var.equal = T)
Two Sample t-test
data: data$OJ and data$VC
t = 1.9153, df = 58, p-value = 0.06039
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1670064 7.5670064
sample estimates:
mean of x mean of y
20.66333 16.96333
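To see where t = 1.9153 comes from, here is the pooled two-sample t statistic computed by hand (a base-R sketch):

```r
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n1 <- length(oj); n2 <- length(vc)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(oj) + (n2 - 1) * var(vc)) / (n1 + n2 - 2)

# t = mean difference / standard error of the difference
t_manual <- (mean(oj) - mean(vc)) / sqrt(sp2 * (1 / n1 + 1 / n2))
t_manual  # ~1.92, matching the t.test() output above
```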
# Raw effect size, very important! (often more important than the standardized effect size)
mean(data$OJ) - mean(data$VC)
[1] 3.7
# a 3.7-micron difference in length between OJ and VC...
# Standardized Effect Size: Cohen's d
cohens_d(len ~ supp, data = tg, pooled_sd = T)
Cohen's d |        95% CI
-------------------------
0.49 | [-0.02, 1.01]
- Estimated using pooled SD.
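Cohen's d with a pooled SD is just the raw mean difference divided by the pooled standard deviation, so the 0.49 above can be reproduced by hand:

```r
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n1 <- length(oj); n2 <- length(vc)

# Pooled standard deviation
sp <- sqrt(((n1 - 1) * var(oj) + (n2 - 1) * var(vc)) / (n1 + n2 - 2))

# Cohen's d: the raw effect size expressed in SD units
d_manual <- (mean(oj) - mean(vc)) / sp
d_manual  # ~0.49
```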
# When the assumption of homogeneity of variance is NOT met
t.test(x = data$OJ, y = data$VC, var.equal = F) # unequal variances
Welch Two Sample t-test
data: data$OJ and data$VC
t = 1.9153, df = 55.309, p-value = 0.06063
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1710156 7.5710156
sample estimates:
mean of x mean of y
20.66333 16.96333
cohens_d(len ~ supp, data = tg, pooled_sd = F) # pooled_sd=F
Cohen's d |        95% CI
-------------------------
0.49 | [-0.02, 1.01]
- Estimated using un-pooled SD.
# Two sample Wilcoxon test (AKA the Mann-Whitney test), for when the data are
# non-normally distributed
wilcox.test(x = data$OJ, y = data$VC, exact = F)
Wilcoxon rank sum test with continuity correction
data: data$OJ and data$VC
W = 575.5, p-value = 0.06449
alternative hypothesis: true location shift is not equal to 0
# paired t-test (dependent samples): Doesn't apply here, but for your reference
t.test(x = data$OJ, y = data$VC, var.equal = T, paired = T) # paired=T
Paired t-test
data: data$OJ and data$VC
t = 3.3026, df = 29, p-value = 0.00255
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
1.408659 5.991341
sample estimates:
mean of the differences
3.7
cohens_d(len ~ supp, data = tg, pooled_sd = T, paired = T) # paired=T
Cohen's d |       95% CI
------------------------
0.60 | [0.21, 1.01]
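For reference, the paired t statistic above is just a one-sample t test on the pairwise differences; a by-hand sketch (remember the pairing is artificial here, since the 60 animals are independent):

```r
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Paired t test = one-sample t test on the differences
diffs <- oj - vc
t_manual <- mean(diffs) / (sd(diffs) / sqrt(length(diffs)))
t_manual  # ~3.30, matching the paired t.test() output above
```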
# let's make the dose variable into a factor:
tg$dose <- as.factor(tg$dose)
# Descriptives
tg %>%
group_by(dose) %>%
summarise(sample_size = n(), mean_length = mean(len),
          SD = sd(len))
# A tibble: 3 × 4
dose sample_size mean_length SD
<fct> <int> <dbl> <dbl>
1 0.5 20 10.6 4.50
2 1 20 19.7 4.42
3 2 20 26.1 3.77
plotmeans( formula = len ~ dose, # plot length by dose
data = tg, # the data frame
xlab = "Vit C Dose", # x-axis label
ylab = "Tooth Length", # y-axis label
           n.label = T ) # display sample size
anova.results <- aov(len ~ dose, data = tg)
# Note that the outcome (DV) is to the left of the ~ and the IV is to the right
# HOV: Equal Variance
leveneTest(anova.results) # HOV - Looking for NON-significant results!
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 2 0.6457 0.5281
57
# Normality
anova.residuals <- residuals(object = anova.results) # extract the residuals
hist(anova.residuals) # plot a histogram
qqnorm(anova.residuals)
shapiro.test(anova.residuals) # run Shapiro-Wilk test
Shapiro-Wilk normality test
data: anova.residuals
W = 0.96731, p-value = 0.1076
summary(anova.results)
            Df Sum Sq Mean Sq F value   Pr(>F)
dose 2 2426 1213 67.42 9.53e-16 ***
Residuals 57 1026 18
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Effect Sizes
etaSquared(anova.results)
        eta.sq eta.sq.part
dose 0.7028642 0.7028642
omega_squared(anova.results)
# Effect Size for ANOVA
Parameter | Omega2 | 95% CI
---------------------------------
dose | 0.69 | [0.57, 1.00]
- One-sided CIs: upper bound fixed at (1).
eta_squared(anova.results, ci = 0.95)
# Effect Size for ANOVA
Parameter | Eta2 | 95% CI
-------------------------------
dose | 0.70 | [0.59, 1.00]
- One-sided CIs: upper bound fixed at (1).
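Both effect sizes can be recovered from the ANOVA table itself: eta-squared is SS_between / SS_total, and omega-squared adjusts that ratio for the number of groups. A base-R sketch:

```r
tg <- ToothGrowth
tg$dose <- as.factor(tg$dose)
anova.results <- aov(len ~ dose, data = tg)

tab <- summary(anova.results)[[1]]
ss <- tab[["Sum Sq"]]   # row 1 = dose (between), row 2 = Residuals (within)
df <- tab[["Df"]]
ms <- tab[["Mean Sq"]]

eta2   <- ss[1] / sum(ss)                            # ~0.70
omega2 <- (ss[1] - df[1] * ms[2]) / (sum(ss) + ms[2]) # ~0.69
c(eta2 = eta2, omega2 = omega2)
```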
posthocPairwiseT(anova.results, p.adjust.method = "bonferroni")
Pairwise comparisons using t tests with pooled SD
data: len and dose
0.5 1
1 2.0e-08 -
2 4.4e-16 4.3e-05
P value adjustment method: bonferroni
# posthocPairwiseT() is a wrapper around the same base function:
pairwise.t.test( x = tg$len, # outcome variable
g = tg$dose , # grouping variable
p.adjust.method = "holm") # which correction to use?
Pairwise comparisons using t tests with pooled SD
data: tg$len and tg$dose
0.5 1
1 1.3e-08 -
2 4.4e-16 1.4e-05
P value adjustment method: holm
# Tukey HSD
TukeyHSD(anova.results)
  Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = len ~ dose, data = tg)
$dose
diff lwr upr p adj
1-0.5 9.130 5.901805 12.358195 0.00e+00
2-0.5 15.495 12.266805 18.723195 0.00e+00
2-1 6.365 3.136805 9.593195 4.25e-05
# If the HOV assumption is violated, you can run the Welch one-way test
oneway.test(len ~ dose, data = tg)
One-way analysis of means (not assuming equal variances)
data: len and dose
F = 68.401, num df = 2.000, denom df = 37.743, p-value = 2.812e-13
# If normality assumption is violated
kruskal.test(len ~ dose, data = tg)
Kruskal-Wallis rank sum test
data: len by dose
Kruskal-Wallis chi-squared = 40.669, df = 2, p-value = 1.475e-09
# Visualizing the relationship
plot(data)
ggplot(data, aes(x = OJ, y = VC)) + geom_point() + geom_smooth(method = lm, se = T,
    linetype = "dashed", color = "darkred")
# Correlation test NHST
cor.test(data$OJ, data$VC)
Pearson's product-moment correlation
data: data$OJ and data$VC
t = 4.9134, df = 28, p-value = 3.515e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.4241432 0.8358147
sample estimates:
cor
0.6804377
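The t statistic in cor.test() follows directly from r: t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. By hand:

```r
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

r <- cor(oj, vc)
n <- length(oj)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)
c(r = r, t = t_manual)  # r ~0.68, t ~4.91, matching cor.test() above
```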
# If more than 2 variables -- Using a subset of the mtcars dataset
library(GGally)
sub.mtcars <- mtcars %>%
dplyr::select(mpg, disp, hp, drat, wt, qsec)
# Investigate pairs
ggpairs(sub.mtcars, diag = list(continuous = "barDiag"), title = "Correlations between variables in mtcars")
# Spearman's rank correlation coefficient (Spearman's ρ)
# for ordinal data
cor.test(mtcars$hp, mtcars$cyl, method = "spearman", exact = F)
Spearman's rank correlation rho
data: mtcars$hp and mtcars$cyl
S = 535.83, p-value = 1.868e-12
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.9017909
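Spearman's ρ is just Pearson's r computed on the ranks (ties receive average ranks), which we can verify directly:

```r
x <- mtcars$hp
y <- mtcars$cyl

rho_builtin <- cor(x, y, method = "spearman")
rho_manual  <- cor(rank(x), rank(y))  # Pearson's r on the ranks
c(builtin = rho_builtin, manual = rho_manual)  # both ~0.90
```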
Let’s load the aggression dataset
Aggression is measured using the Buss Perry Aggression Questionnaire (BPAQ) and impulsivity is measured by the Barratt Impulsivity Scale (BIS).
BPAQ = Aggression score
BIS = Impulsivity score
aggression <- read_sav("aggression.sav") # read_sav() is from the haven package
str(aggression)
tibble [275 × 8] (S3: tbl_df/tbl/data.frame)
$ age : num [1:275] 18 18 20 17 17 17 17 17 17 17 ...
..- attr(*, "format.spss")= chr "F2.0"
$ BPAQ : num [1:275] 2.62 2.24 2.72 1.93 2.72 ...
..- attr(*, "label")= chr "Aggression total score"
..- attr(*, "format.spss")= chr "F12.10"
..- attr(*, "display_width")= int 14
$ AISS : num [1:275] 2.65 2.85 3.05 2.65 2.95 1.95 2.55 2.3 2 2.15 ...
..- attr(*, "label")= chr "sensation seeking total score"
..- attr(*, "format.spss")= chr "F4.2"
$ alcohol: num [1:275] 28 NA 80 28 10 12 21 3 21 0 ...
..- attr(*, "label")= chr "alcohol consumption (in drinks)"
..- attr(*, "format.spss")= chr "F2.0"
$ BIS : num [1:275] 2.15 3.08 3 1.85 2.08 ...
..- attr(*, "label")= chr "Impulsivity total score"
..- attr(*, "format.spss")= chr "F12.10"
..- attr(*, "display_width")= int 14
$ NEOc : num [1:275] 2.83 2.5 2.75 3.42 3.58 ...
..- attr(*, "label")= chr "Conscientiousness total score"
..- attr(*, "format.spss")= chr "F12.10"
..- attr(*, "display_width")= int 14
$ gender : dbl+lbl [1:275] 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, ...
..@ label : chr "biological sex of participant"
..@ format.spss: chr "F1.0"
..@ labels : Named num [1:2] 0 1
.. ..- attr(*, "names")= chr [1:2] "male" "female"
$ NEOo : num [1:275] 2.92 4.17 3.92 4.17 3.5 ...
..- attr(*, "label")= chr "openness to experience total score"
..- attr(*, "format.spss")= chr "F12.10"
..- attr(*, "display_width")= int 14
describe(aggression[, c(1, 2, 5)]) # 1st, 2nd, 5th columns are the 3 variables of interest
     vars   n  mean   sd median trimmed  mad   min   max range skew kurtosis
age 1 275 20.21 4.96 18.00 19.01 1.48 17.00 50.00 33.00 3.70 15.43
BPAQ 2 275 2.61 0.52 2.62 2.61 0.56 1.34 4.03 2.69 0.01 -0.41
BIS 3 275 2.28 0.35 2.27 2.27 0.40 1.42 3.15 1.73 0.36 -0.22
se
age 0.30
BPAQ 0.03
BIS 0.02
corr.test(aggression[, c(1, 2, 5)])
Call: corr.test(x = aggression[, c(1, 2, 5)])
Correlation matrix
age BPAQ BIS
age 1.00 -0.21 -0.14
BPAQ -0.21 1.00 0.32
BIS -0.14 0.32 1.00
Sample Size
[1] 275
Probability values (Entries above the diagonal are adjusted for multiple tests.)
age BPAQ BIS
age 0.00 0 0.02
BPAQ 0.00 0 0.00
BIS 0.02 0 0.00
To see confidence intervals of the correlations, print with the short=FALSE option
car::scatter3d(BPAQ ~ BIS + age, data = aggression, rgl.printRglwidget = TRUE)
# may not work well in Markdown since interactive - try the command below
rglwidget()
library(plotly)
fig <- plot_ly(aggression, x = ~BIS, y = ~BPAQ, z = ~age)
fig <- fig %>%
add_markers(size = 0.02)
fig <- fig %>%
layout(scene = list(xaxis = list(title = "Impulsivity"), yaxis = list(title = "Aggression"),
zaxis = list(title = "Age")))
fig
agrs.mod <- lm(BPAQ ~ age + BIS, data = aggression)
stud_res <- rstudent(agrs.mod) # Studentized residuals (t distribution) * recommended
# Normality
hist(resid(agrs.mod))
plot(agrs.mod, which = 2)
shapiro.test(stud_res) # Looking for nonsignificant results
Shapiro-Wilk normality test
data: stud_res
W = 0.99497, p-value = 0.507
# Linearity
plot(agrs.mod, which = 1) # looking for a flat red line
residualPlots(agrs.mod) # looking for flat blue lines
           Test stat Pr(>|Test stat|)
age 0.1571 0.8753
BIS -0.4727 0.6368
Tukey test -0.7619 0.4461
# Homoscedasticity
plot(agrs.mod, which = 3) # looking for a flat red line
ncvTest(agrs.mod) # looking for nonsignificance
Non-constant Variance Score Test
Variance formula: ~ fitted.values
Chisquare = 0.5141766, Df = 1, p = 0.47334
# Multicollinearity
vif(agrs.mod) # from car package - looking for below 3 (conservative) or 10 (extremely liberal)
     age      BIS
1.019323 1.019323
# Outliers
plot(agrs.mod, which = 4)
# looking for below (extremely conservative) cutoff of 4/(n-P-1) [in our case
# 4/(275-2-1) = ~ 0.015] or below (extremely liberal) cutoff of 1.0
plot(cooks.distance(agrs.mod), ylab = "Cook's Distance", ylim = c(0, 0.2))
influencePlot(agrs.mod)
       StudRes         Hat       CookD
47 2.437728 0.004400059 0.008598094
261 -1.453120 0.110608718 0.087177965
262 1.061081 0.135907121 0.059000607
263 1.069090 0.139881372 0.061927065
264 2.768212 0.004344552 0.010879350
# You would be most concerned about larger bubbles (which is why their row
# numbers are identified), but especially so if they are close to the top-right
# and bottom-right corners (i.e., the observation has both high leverage and
# high discrepancy).
summary(agrs.mod)
Call:
lm(formula = BPAQ ~ age + BIS, data = aggression)
Residuals:
Min 1Q Median 3Q Max
-1.17933 -0.32561 -0.00454 0.35594 1.33877
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.954392 0.244922 7.980 4.12e-14 ***
age -0.017556 0.006032 -2.910 0.00391 **
BIS 0.443607 0.085060 5.215 3.64e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4906 on 272 degrees of freedom
Multiple R-squared: 0.1299, Adjusted R-squared: 0.1235
F-statistic: 20.31 on 2 and 272 DF, p-value: 6.016e-09
confint(agrs.mod)
                  2.5 %      97.5 %
(Intercept) 1.47220945 2.43657529
age -0.02943198 -0.00568055
BIS 0.27614693 0.61106696
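confint() is computed as estimate ± t_crit × SE, with t_crit taken from the residual degrees of freedom. Since the aggression data file may not be at hand, here is a sketch on mtcars (an illustrative model, not the one fitted above):

```r
# Illustrative model (mtcars stands in for the aggression data here)
m <- lm(mpg ~ wt + hp, data = mtcars)

est   <- coef(summary(m))[, "Estimate"]
se    <- coef(summary(m))[, "Std. Error"]
tcrit <- qt(0.975, df = df.residual(m))  # critical t for a two-sided 95% CI

ci_manual <- cbind(est - tcrit * se, est + tcrit * se)
ci_manual  # identical to confint(m)
```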
spcor(aggression[, c(1, 2, 5)]) # from ppcor package
$estimate
age BPAQ BIS
age 1.00000000 -0.1721304 -0.07521432
BPAQ -0.16460751 1.0000000 0.29496094
BIS -0.07282232 0.2986320 1.00000000
$p.value
age BPAQ BIS
age 0.000000000 0.0042686065 2.145742e-01
BPAQ 0.006315477 0.0000000000 6.654300e-07
BIS 0.229553815 0.0000004753 0.000000e+00
$statistic
age BPAQ BIS
age 0.000000 -2.881861 -1.243990
BPAQ -2.752321 0.000000 5.091128
BIS -1.204214 5.160655 0.000000
$n
[1] 275
$gp
[1] 1
$method
[1] "pearson"
# to get the semi-partial correlations squared you could use
spcors <- spcor(aggression[, c(1, 2, 5)])$estimate
spcors^2
            age       BPAQ         BIS
age 1.00000000 0.02962887 0.005657193
BPAQ 0.02709563 1.00000000 0.087001957
BIS 0.00530309 0.08918109 1.000000000
# The squared semi-partial correlation between X and Y tells you how much unique
# variance in Y X alone can explain.
# look at the row where the variable that is the DV has a 1.00 (i.e., 2nd row
# in these data)
mod1 <- lm(BPAQ ~ BIS + age, data = aggression)
summary(mod1)
Call:
lm(formula = BPAQ ~ BIS + age, data = aggression)
Residuals:
Min 1Q Median 3Q Max
-1.17933 -0.32561 -0.00454 0.35594 1.33877
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.954392 0.244922 7.980 4.12e-14 ***
BIS 0.443607 0.085060 5.215 3.64e-07 ***
age -0.017556 0.006032 -2.910 0.00391 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4906 on 272 degrees of freedom
Multiple R-squared: 0.1299, Adjusted R-squared: 0.1235
F-statistic: 20.31 on 2 and 272 DF, p-value: 6.016e-09
confint(mod1)
                  2.5 %      97.5 %
(Intercept) 1.47220945 2.43657529
BIS 0.27614693 0.61106696
age -0.02943198 -0.00568055
mod2 <- lm(BPAQ ~ BIS + age + NEOc + NEOo, data = aggression)
# Model 2 information
summary(mod2)
Call:
lm(formula = BPAQ ~ BIS + age + NEOc + NEOo, data = aggression)
Residuals:
Min 1Q Median 3Q Max
-1.20896 -0.32513 -0.00971 0.37144 1.28585
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.415691 0.494262 4.887 1.75e-06 ***
BIS 0.369988 0.106467 3.475 0.000595 ***
age -0.016666 0.006175 -2.699 0.007394 **
NEOc -0.072225 0.062888 -1.148 0.251790
NEOo -0.016360 0.059002 -0.277 0.781774
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4912 on 270 degrees of freedom
Multiple R-squared: 0.1342, Adjusted R-squared: 0.1214
F-statistic: 10.46 on 4 and 270 DF, p-value: 6.788e-08
confint(mod2)
                  2.5 %      97.5 %
(Intercept) 1.44259357 3.38878829
BIS 0.16037625 0.57959935
age -0.02882404 -0.00450888
NEOc -0.19603852 0.05158817
NEOo -0.13252309 0.09980236
spcor(aggression[, c(1, 2, 5, 6, 8)])
$estimate
age BPAQ BIS NEOc NEOo
age 1.000000000 -0.15701239 0.009032315 0.08803045 0.18171659
BPAQ -0.152833352 1.00000000 0.196785971 -0.06503439 -0.01570175
BIS 0.007324224 0.16393530 1.000000000 -0.52385023 -0.14475321
NEOc 0.073454407 -0.05574986 -0.539050578 1.00000000 -0.10266802
NEOo 0.182757037 -0.01622346 -0.179533369 -0.12374560 1.00000000
$p.value
age BPAQ BIS NEOc NEOo
age 0.000000000 0.009495037 8.821204e-01 1.476278e-01 0.002627745
BPAQ 0.011608464 0.000000000 1.104381e-03 2.851747e-01 0.796574189
BIS 0.904293418 0.006736558 0.000000e+00 1.429276e-20 0.016897433
NEOc 0.227242009 0.359704405 6.648811e-22 0.000000e+00 0.091044269
NEOo 0.002480182 0.789970076 2.963634e-03 4.141905e-02 0.000000000
$statistic
age BPAQ BIS NEOc NEOo
age 0.0000000 -2.6123792 0.1484221 1.452125 3.036462
BPAQ -2.5411619 0.0000000 3.2980112 -1.070891 -0.258038
BIS 0.1203525 2.7306749 0.0000000 -10.105227 -2.403856
NEOc 1.2102485 -0.9174907 -10.5161874 0.000000 -1.695970
NEOo 3.0544472 -0.2666137 -2.9987585 -2.049097 0.000000
$n
[1] 275
$gp
[1] 3
$method
[1] "pearson"
# to get the semi-partial correlations squared you could use
spcors2 <- spcor(aggression[, c(1, 2, 5, 6, 8)])$estimate
round(spcors2^2, 2)
     age BPAQ  BIS NEOc NEOo
age 1.00 0.02 0.00 0.01 0.03
BPAQ 0.02 1.00 0.04 0.00 0.00
BIS 0.00 0.03 1.00 0.27 0.02
NEOc 0.01 0.00 0.29 1.00 0.01
NEOo 0.03 0.00 0.03 0.02 1.00
# look at the row where the variable that is the DV has a 1.00 (i.e., 2nd row
# in these data)
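A useful way to see what a semi-partial correlation is: residualize the predictor on the other predictors, then correlate those residuals with the DV. Its square equals the R² increment from adding that predictor to the model. A sketch on mtcars (illustrative variables, not the aggression data):

```r
# Semi-partial r of wt with mpg, with hp partialled out of wt only
full    <- lm(mpg ~ wt + hp, data = mtcars)
reduced <- lm(mpg ~ hp, data = mtcars)

wt_resid <- resid(lm(wt ~ hp, data = mtcars))  # the part of wt independent of hp
sr <- cor(mtcars$mpg, wt_resid)

# Squared semi-partial = unique variance wt explains = R^2 increment
c(sr2 = sr^2,
  delta_r2 = summary(full)$r.squared - summary(reduced)$r.squared)
```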
# Comparing the 2 models
anova(mod1, mod2)
Analysis of Variance Table
Model 1: BPAQ ~ BIS + age
Model 2: BPAQ ~ BIS + age + NEOc + NEOo
Res.Df RSS Df Sum of Sq F Pr(>F)
1 272 65.461
2 270 65.138 2 0.32246 0.6683 0.5134
# R2 of .1342 - .1299 = .0043
# have R do the math:
summary(mod2)$r.squared - summary(mod1)$r.squared
[1] 0.004285952
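The F in anova(mod1, mod2) is the standard nested-model comparison, F = [(RSS1 - RSS2)/q] / [RSS2/df2], where q is the number of added predictors. A by-hand sketch on mtcars (illustrative models):

```r
m1 <- lm(mpg ~ wt, data = mtcars)              # reduced model
m2 <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # full model

rss1 <- sum(resid(m1)^2)
rss2 <- sum(resid(m2)^2)
q    <- df.residual(m1) - df.residual(m2)  # number of added predictors

f_manual <- ((rss1 - rss2) / q) / (rss2 / df.residual(m2))
f_manual  # identical to the F reported by anova(m1, m2)
```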
# information criteria
AIC(mod1)
[1] 393.7029
BIC(mod1)
[1] 408.17
AIC(mod2)
[1] 396.3449
BIC(mod2)
[1] 418.0455
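AIC and BIC both penalize the model's log-likelihood for its parameter count; only the penalty differs (2 per parameter vs log(n) per parameter). A quick check on an illustrative mtcars model:

```r
m  <- lm(mpg ~ wt, data = mtcars)
ll <- logLik(m)
k  <- attr(ll, "df")  # number of estimated parameters (incl. the error variance)

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(nobs(m)) * k
c(AIC = aic_manual, BIC = bic_manual)  # match AIC(m) and BIC(m)
```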