Tutorial: Choose Fitting Method

How should we estimate the model parameters?

Your model so far

Systematic Component
η = β₀ + β₁·Age + β₂·ExAng + β₃·STDep
Link Function
Identity: μ = η
Distribution
Gaussian (Normal)

How should we find the β coefficients?

We have our model structure defined. Now we need to estimate the parameters (β₀, β₁, β₂, β₃) that best fit the observed data.

Click on a card to select it.

🔄

Maximum Likelihood (IRLS)

Iteratively Reweighted Least Squares - the standard GLM fitting algorithm.

Iterative optimization • General purpose • Fisher scoring

📐

Closed-Form (OLS)

Direct analytical solution using matrix algebra: β = (X'X)⁻¹X'y

One-shot calculation • No iteration • Exact solution

🎲

Bayesian Estimation

Combine prior beliefs with data to get posterior distributions for parameters.

Prior specification • MCMC sampling • Full uncertainty

🎯

Penalised/Regularised

Add penalty terms to prevent overfitting (Ridge, Lasso, Elastic Net).

Shrinkage • Variable selection • Bias-variance tradeoff

✔ Correct!

Maximum Likelihood Estimation (MLE) via Iteratively Reweighted Least Squares (IRLS) is the canonical fitting method for GLMs. This is what glm() in R and statsmodels in Python use by default.

Why IRLS for GLMs?
GLMs can have non-Gaussian distributions and non-identity link functions. IRLS handles these by iteratively approximating the problem as a weighted least squares problem, converging to the maximum likelihood solution.

The algorithm works by:
  • Starting from an initial guess for the coefficients
  • Computing working responses and weights from the current fit
  • Solving a weighted least-squares problem to update the coefficients
  • Repeating until the estimates stop changing

For Gaussian + identity, IRLS converges in one iteration to the OLS solution - but the framework generalizes to all GLM types!
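Those steps can be sketched directly in NumPy. This is a minimal illustration for the Gaussian + identity case only (the weight and working-response formulas simplify here because the variance function is constant and the link derivative is 1); it is not how production libraries implement IRLS:

```python
import numpy as np

def irls_gaussian_identity(X, y, tol=1e-8, max_iter=25):
    """Minimal IRLS sketch for a Gaussian GLM with identity link."""
    beta = np.zeros(X.shape[1])
    for it in range(1, max_iter + 1):
        eta = X @ beta              # linear predictor
        mu = eta                    # identity link: mu = eta
        W = np.ones_like(y)         # Gaussian: variance function V(mu) = 1
        z = eta + (y - mu)          # working response (here it is just y)
        # Weighted least-squares update (all weights are 1 in this case)
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, it
        beta = beta_new
    return beta, max_iter
```

Because the weights never change in this case, the very first solve already returns the OLS estimate; the second pass merely confirms convergence.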

🔍 Want to see optimisation in action?
Our interactive visualisations show how algorithms navigate parameter space from 1D to 4D, including gradient descent and Newton-Raphson.

⭐ Bonus Points!

Excellent insight! For your specific model — Gaussian distribution with identity link — the closed-form OLS solution is actually optimal.

Why is OLS optimal here?
When the response is normally distributed and the link is identity, the maximum likelihood solution is the OLS solution: β = (X'X)⁻¹X'y. No iteration needed!
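As a quick check, the formula can be evaluated directly on simulated stand-ins for Age, ExAng and STDep (the data and true coefficients below are invented for illustration):

```python
import numpy as np

# Simulated stand-ins for the model's predictors (illustrative only)
rng = np.random.default_rng(42)
n = 200
age = rng.uniform(30, 75, n)
exang = rng.integers(0, 2, n).astype(float)   # exercise-induced angina (0/1)
stdep = rng.uniform(0.0, 4.0, n)              # ST depression
X = np.column_stack([np.ones(n), age, exang, stdep])  # column of 1s = intercept
y = X @ np.array([0.5, 0.02, 0.8, 0.3]) + rng.normal(scale=0.2, size=n)

# beta = (X'X)^(-1) X'y -- solving the normal equations is numerically
# preferable to forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

One solve, no iteration: `beta_hat` is the exact maximum likelihood estimate for this Gaussian + identity model.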

However, in the broader context of GLMs, this approach doesn't generalize:
  • Non-Gaussian distributions (e.g. binomial, Poisson) have no closed-form maximum likelihood solution
  • Non-identity links make the likelihood equations nonlinear in the coefficients

The standard approach is to use Maximum Likelihood via IRLS, which works for all GLM types. For Gaussian + identity, IRLS converges to OLS in one iteration anyway.

Select Maximum Likelihood (IRLS) to continue, as it's the general GLM approach.

🎲 A Different Philosophy

Bayesian estimation is a valid and powerful approach, but it represents a fundamentally different statistical philosophy from classical GLM fitting.

Bayesian vs Frequentist
  • Frequentist (GLM): Parameters are fixed unknowns; find point estimates
  • Bayesian: Parameters have probability distributions; combine prior beliefs with data

Bayesian approaches require:
  • Specifying a prior distribution for every parameter
  • Computing the posterior, typically via MCMC sampling
  • Summarising full posterior distributions rather than reporting point estimates

While Bayesian GLMs exist, the standard GLM framework uses maximum likelihood estimation.
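For intuition, the simplest conjugate case needs no MCMC at all: with a Gaussian prior on the coefficients and known noise variance, the posterior for a linear model is Gaussian in closed form. A textbook sketch (the prior scale `tau2` and noise variance `sigma2` below are arbitrary illustrative choices):

```python
import numpy as np

def bayes_linear_posterior(X, y, sigma2=1.0, tau2=10.0):
    """Posterior mean and covariance of beta under
    beta ~ N(0, tau2*I) and y | beta ~ N(X beta, sigma2*I).
    Conjugate Gaussian-Gaussian case, so no sampling is needed."""
    p = X.shape[1]
    precision = X.T @ X / sigma2 + np.eye(p) / tau2   # posterior precision
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ y) / sigma2
    return mean, cov
```

As the prior variance `tau2` grows, the posterior mean approaches the maximum likelihood (OLS) estimate; a tight prior shrinks it toward zero.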

For classical GLM fitting, select Maximum Likelihood (IRLS).

🎯 When You Need Regularisation

Penalised (regularised) regression adds penalty terms to the likelihood to prevent overfitting or perform variable selection.

Common penalty types:
  • Ridge (L2): Shrinks coefficients toward zero
  • Lasso (L1): Can set coefficients exactly to zero
  • Elastic Net: Combines L1 and L2 penalties

These methods are valuable when you have:
  • Many predictors relative to the number of observations
  • Strong multicollinearity among predictors
  • A need for automatic variable selection (Lasso, Elastic Net)

However, penalised regression introduces bias in exchange for reduced variance. For standard GLM estimation, we use IRLS without penalties.
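Ridge, for example, changes the closed-form solution by a single term: β = (X'X + λI)⁻¹X'y. A minimal sketch (for brevity it penalises every coefficient, including the intercept, which real implementations usually leave unpenalised):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge (L2-penalised) estimate: beta = (X'X + lam*I)^(-1) X'y.
    Simplification: the intercept is penalised too, unlike most libraries."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

Setting `lam = 0` recovers OLS; increasing `lam` shrinks the coefficient vector toward zero, which is the bias-variance tradeoff the card describes.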

For standard GLM fitting, select Maximum Likelihood (IRLS).