
Tutorial 3: The Link Function for Count Data

Choose how to connect your predictors to the expected bike rental count

Your model so far

g(E[RentalCount]) = β · (1, Temperature, Humidity, WindSpeed, WorkingDay, Weather), with the link g still to be chosen

The link function g() determines how the linear combination of predictors relates to the expected count of rentals.

The Key Constraint

We're predicting a count. This means our predictions must be non-negative:

$\mu = E[\text{Count}] \geq 0$

The inverse of the link function must map the unbounded linear predictor $\eta = \beta_0 + \beta_1 X_1 + \ldots$ (which can be any real number) to a non-negative expected count.

Choose the Link Function

For predicting bike rental counts (non-negative integers), which link function is most appropriate?


Link Function Selected: Log

With the log link, your model equation becomes:

$\ln(\mu) = \beta_0 + \beta_1 \cdot \text{Temp} + \beta_2 \cdot \text{Hum} + \beta_3 \cdot \text{Wind} + \beta_4 \cdot \text{Work} + \beta_5 \cdot \text{Weather}$
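As a rough illustration of the equation above, the sketch below computes the expected count for one day. The coefficient values and inputs are hypothetical, chosen only to show the mechanics, and Weather is treated as a single numeric code for simplicity.

```python
import math

# Hypothetical coefficients for illustration only; these are not fitted values.
BETA = {
    "intercept": 4.0,
    "temp": 0.03,
    "hum": -0.01,
    "wind": -0.02,
    "work": 0.10,
    "weather": -0.25,
}

def expected_rentals(temp, hum, wind, work, weather):
    """Return mu = exp(eta), the expected count under the log link."""
    eta = (
        BETA["intercept"]
        + BETA["temp"] * temp
        + BETA["hum"] * hum
        + BETA["wind"] * wind
        + BETA["work"] * work
        + BETA["weather"] * weather
    )
    return math.exp(eta)

# Assumed inputs: temperature 20, humidity 60, wind speed 10, a working day, weather code 1.
mu = expected_rentals(temp=20, hum=60, wind=10, work=1, weather=1)
print(mu)
```

The linear predictor is computed on the log scale and exponentiated only at the end, which is what keeps the result positive.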

What the coefficients mean

With the log link, the exponentiated coefficients $e^{\beta}$ are rate ratios (each holding the other predictors fixed):

  • $e^{\beta}$ gives the multiplicative effect on the expected count
  • $e^{\beta} = 1.5$ means 50% more rentals for a one-unit increase
  • $e^{\beta} = 0.8$ means 20% fewer rentals for a one-unit increase
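The conversion from a coefficient to a percent change can be sketched in a few lines. The helper name `percent_change` is hypothetical, and the two coefficients are chosen so that $e^{\beta}$ matches the examples in the list above.

```python
import math

def percent_change(beta):
    """Percent change in the expected count per one-unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0

# Coefficients picked so that exp(beta) equals 1.5 and 0.8 respectively.
beta_up = math.log(1.5)
beta_down = math.log(0.8)

print(round(percent_change(beta_up), 1))    # 50.0
print(round(percent_change(beta_down), 1))  # -20.0
```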

✔ Correct Choice!

The log link is the canonical link for the Poisson distribution and the standard choice for count data. Combined with the Poisson distribution, it gives us Poisson regression.

The log function ensures predictions are always positive:

$\ln(\mu) = \eta \quad \Rightarrow \quad \mu = e^\eta > 0$

Since $e^x > 0$ for all real $x$, our predicted counts can never be negative!

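Why log dominates for counts: it guarantees positive predictions, it is the canonical link for the Poisson distribution, and its coefficients can be read as rate ratios.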

⚠ Valid, But Not Preferred Today

The square root link is mathematically valid for count data - since $\mu = \eta^2 \geq 0$, it also ensures non-negative predictions. However, it's rarely used today.

Historical Context

Pre-GLM era (before the 1970s): The square root was commonly used as a "variance-stabilizing transformation" for count data.

Why sqrt was used: For Poisson data, $\text{Var}(Y) = \mu$, so the variance increases with the mean. Taking the square root of the counts approximately "stabilizes" this variance, making it roughly constant across different means.
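The stabilizing effect can be made concrete with a first-order (delta method) approximation. This is a sketch that assumes $\mu$ is large enough for the linear approximation of $\sqrt{Y}$ around $\mu$ to be reasonable:

$\text{Var}(\sqrt{Y}) \approx \left(\frac{d\sqrt{\mu}}{d\mu}\right)^2 \text{Var}(Y) = \frac{1}{4\mu} \cdot \mu = \frac{1}{4}$

The approximate variance no longer depends on $\mu$.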

Why log won: GLM theory showed that the log link is the "canonical" link for Poisson, and the variance issue is handled by the model itself. Plus, log coefficients have a cleaner interpretation (rate ratios).

The interpretation problem:

With sqrt link: $\sqrt{\mu} = \beta_0 + \beta_1 X$
What does $\beta_1$ mean? It's the change in $\sqrt{\mu}$ per unit change in $X$ - not intuitive!

With log link: $e^{\beta_1}$ is the rate ratio - much more interpretable.
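The difference in interpretability can be checked numerically. The sketch below uses the same hypothetical intercept and slope under both links and compares the ratio of expected counts for a one-unit increase in $X$ at several starting values.

```python
import math

# Hypothetical intercept and slope, reused for both links purely for illustration.
b0, b1 = 2.0, 0.5

def mu_log(x):
    """Expected count under the log link: mu = exp(eta)."""
    return math.exp(b0 + b1 * x)

def mu_sqrt(x):
    """Expected count under the square root link: mu = eta^2."""
    return (b0 + b1 * x) ** 2

for x in (0, 1, 2):
    ratio_log = mu_log(x + 1) / mu_log(x)     # same value at every x
    ratio_sqrt = mu_sqrt(x + 1) / mu_sqrt(x)  # changes with x
    print(x, round(ratio_log, 4), round(ratio_sqrt, 4))
```

Under the log link the ratio equals $e^{\beta_1}$ wherever you start, so a single number summarizes the effect. Under the square root link the ratio depends on the current value of $X$.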

For this tutorial, use the log link - it's the modern standard with better interpretation.

❌ Can Predict Negative Counts

The identity link doesn't constrain predictions to be positive:

$\mu = \beta_0 + \beta_1 \cdot \text{Temp} + \beta_2 \cdot \text{Humidity} + \ldots$

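The problem: Linear combinations can produce any real number, including negative values, so for some predictor values the model would predict a negative number of rentals.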

The identity link is appropriate for continuous responses that can be any real number (like temperature or weight change), but not for counts.
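To make the failure concrete, here is a minimal sketch with hypothetical coefficients and inputs (a cold, humid day). The same linear predictor is passed through the identity link and, for contrast, through the inverse of the log link.

```python
import math

# Hypothetical coefficients and inputs, for illustration only.
b0, b_temp, b_hum = 3.0, 0.2, -0.05
temp, hum = -5.0, 90.0  # assumed units: degrees Celsius, percent humidity

eta = b0 + b_temp * temp + b_hum * hum  # linear predictor, negative here

mu_identity = eta        # identity link: mu = eta, a negative "count"
mu_log = math.exp(eta)   # log link: mu = exp(eta), always positive

print(mu_identity, mu_log)
```

Nothing in the identity link stops $\eta$ from going below zero, whereas exponentiating always returns a positive value.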

❌ Wrong Domain

The logit link is designed for probabilities, not counts:

$\text{logit}(\mu) = \ln\left(\frac{\mu}{1-\mu}\right)$

This requires $0 < \mu < 1$ - it only makes sense for probabilities!
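The domain restriction can be seen directly in a small sketch. The `logit` helper below is a hypothetical hand-written function, not a library call, and 250 stands in for a plausible expected rental count.

```python
import math

def logit(mu):
    """Logit link; defined only for 0 < mu < 1."""
    if not 0.0 < mu < 1.0:
        raise ValueError(f"logit is undefined for mu = {mu}")
    return math.log(mu / (1.0 - mu))

print(logit(0.5))  # 0.0, a valid probability

try:
    logit(250.0)  # an expected count, far outside (0, 1)
except ValueError as err:
    print(err)
```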

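The problem: expected rental counts range from 0 into the thousands, so they fall outside the interval $(0, 1)$, and the logit is undefined for $\mu \geq 1$.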

Remember: Tutorial 2 used logit for heart disease (yes/no). This tutorial has counts (0, 1, 2, ..., thousands).