Your Complete Model
$\text{HeartDisease} \sim \text{Binomial}(1, p)$ where $\text{logit}(p) = \beta_0 + \beta_1 \cdot \text{Age} + \beta_2 \cdot \text{Sex} + \beta_3 \cdot \text{CP} + \beta_4 \cdot \text{MaxHR} + \beta_5 \cdot \text{STDep}$
Response: Heart Disease (0/1)
Method: IRLS (5 iterations)
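For reference, inverting the logit link recovers the probability from the linear predictor $\eta$ (this is just a restatement of the model above):

$$p = \frac{e^{\eta}}{1 + e^{\eta}}, \qquad \eta = \beta_0 + \beta_1 \cdot \text{Age} + \dots + \beta_5 \cdot \text{STDep}$$

Because the model is linear on the log-odds scale, a one-unit increase in a predictor multiplies the odds $p/(1-p)$ by $e^{\beta_j}$, which is why the steps below exponentiate the coefficients.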
1 Load the data
# "?" marks missing values in the raw file (columns ca and thal)
heart <- read.csv(
  "https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/processed.cleveland.data",
  header = FALSE, na.strings = "?"
)
names(heart) <- c("age", "sex", "cp", "trestbps", "chol", "fbs",
                  "restecg", "thalach", "exang", "oldpeak",
                  "slope", "ca", "thal", "target")

# Collapse disease severity codes (1-4) into a binary indicator
heart$target <- ifelse(heart$target > 0, 1, 0)
head(heart)
2 Fit the GLM
model <- glm(
  target ~ age + sex + cp + thalach + oldpeak,
  data = heart,
  family = binomial(link = "logit")
)
Note: family = binomial(link = "logit") specifies logistic regression.
The logit link is the default for binomial, so you could also write just family = binomial.
3 View results
summary(model)        # full coefficient table, deviances, AIC
coef(model)           # coefficients on the log-odds scale
exp(coef(model))      # odds ratios
exp(confint(model))   # 95% confidence intervals for the odds ratios
4 Assess model fit
model$deviance        # residual deviance
model$null.deviance   # deviance of the intercept-only model
AIC(model)
1 - (model$deviance / model$null.deviance)   # McFadden's pseudo R-squared

predicted <- ifelse(predict(model, type = "response") > 0.5, 1, 0)   # 0.5 threshold
table(Actual = heart$target, Predicted = predicted)
Expected Output
Call:
glm(formula = target ~ age + sex + cp + thalach + oldpeak, family = binomial(link = "logit"),
data = heart)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.165518 2.025288 -1.563 0.11805
age 0.035939 0.018825 1.909 0.05624 .
sex 1.674501 0.350569 4.777 1.78e-06 ***
cp 0.896265 0.170199 5.266 1.39e-07 ***
thalach -0.024663 0.007989 -3.087 0.00202 **
oldpeak 0.682883 0.152781 4.470 7.83e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 417.98 on 302 degrees of freedom
Residual deviance: 273.13 on 297 degrees of freedom
AIC: 285.13
Number of Fisher Scoring iterations: 5
1 Import libraries and load data
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/processed.cleveland.data"
cols = ["age", "sex", "cp", "trestbps", "chol", "fbs",
        "restecg", "thalach", "exang", "oldpeak",
        "slope", "ca", "thal", "target"]

# "?" marks missing values in the raw file (columns ca and thal)
heart = pd.read_csv(url, header=None, names=cols, na_values="?")

# Collapse disease severity codes (1-4) into a binary indicator
heart['target'] = (heart['target'] > 0).astype(int)
print(heart.head())
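As a quick sanity check of the binarization step (a standalone sketch; in the raw Cleveland file, target codes disease severity from 0 to 4):

```python
import pandas as pd

# Severity codes 0-4: any nonzero value means disease is present
raw = pd.Series([0, 1, 2, 3, 4])
binary = (raw > 0).astype(int)
print(binary.tolist())  # [0, 1, 1, 1, 1]
```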
2 Fit the GLM
model = smf.glm(
    formula="target ~ age + sex + cp + thalach + oldpeak",
    data=heart,
    family=sm.families.Binomial()
).fit()
Note: Binomial() uses the logit link by default.
You could explicitly specify Binomial(link=sm.families.links.Logit()).
3 View results
print(model.summary())            # full regression table
print(model.params)               # coefficients on the log-odds scale
print(np.exp(model.params))       # odds ratios
print(np.exp(model.conf_int()))   # 95% confidence intervals for the odds ratios
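To make the back-transformation concrete, here is the same arithmetic done by hand with the coefficient estimates from the expected output below; the patient profile is made up for illustration:

```python
import numpy as np

# Estimates from the fitted model (see expected output)
beta = {"Intercept": -3.1655, "age": 0.0359, "sex": 1.6745,
        "cp": 0.8963, "thalach": -0.0247, "oldpeak": 0.6829}

# Odds ratio: being male multiplies the odds of disease by about 5.3
print(np.exp(beta["sex"]))

# Hypothetical patient: 55-year-old male, cp = 4, max HR 150, ST depression 1.0
eta = (beta["Intercept"] + beta["age"] * 55 + beta["sex"] * 1
       + beta["cp"] * 4 + beta["thalach"] * 150 + beta["oldpeak"] * 1.0)
p = 1 / (1 + np.exp(-eta))   # inverse logit
print(round(p, 2))           # predicted probability of disease, about 0.74
```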
4 Assess model fit
print(f"Deviance: {model.deviance:.2f}")
print(f"AIC: {model.aic:.2f}")

# Deviance-based (McFadden) pseudo R-squared; note summary() reports
# the Cox-Snell version, which is a different quantity
pseudo_r2 = 1 - (model.deviance / model.null_deviance)
print(f"Pseudo R²: {pseudo_r2:.4f}")
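As a hand check on these statistics, using the values reported in the model summary (residual deviance 273.13, null deviance 417.98, log-likelihood -136.57):

```python
# Values copied from the fitted model's summary
residual_deviance = 273.13
null_deviance = 417.98
log_likelihood = -136.57

# For a 0/1 response the saturated model has log-likelihood 0,
# so the residual deviance equals -2 * log-likelihood
print(-2 * log_likelihood)  # 273.14

# Deviance-based (McFadden) pseudo R-squared
print(round(1 - residual_deviance / null_deviance, 4))  # 0.3465
```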
from sklearn.metrics import confusion_matrix, classification_report

predicted = (model.predict() > 0.5).astype(int)   # classify at a 0.5 threshold
print(confusion_matrix(heart['target'], predicted))
print(classification_report(heart['target'], predicted))
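The confusion matrix summarizes classification performance; here is how its standard metrics are computed, using made-up counts (the real table depends on the fit above):

```python
import numpy as np

# Hypothetical 2x2 confusion matrix: rows = actual, columns = predicted
cm = np.array([[140, 24],
               [ 29, 110]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # fraction classified correctly
sensitivity = tp / (tp + fn)      # recall for the disease class
specificity = tn / (tn + fp)      # recall for the no-disease class
print(round(accuracy, 3), round(sensitivity, 3), round(specificity, 3))
```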
Expected Output
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: target No. Observations: 303
Model: GLM Df Residuals: 297
Model Family: Binomial Df Model: 5
Link Function: Logit Scale: 1.0000
Method: IRLS Log-Likelihood: -136.57
Date: Thu, 11 Dec 2025 Deviance: 273.13
Time: 16:37:21 Pearson chi2: 309.
No. Iterations: 5 Pseudo R-squ. (CS): 0.3800
Covariance Type: nonrobust
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -3.1655 2.025 -1.563 0.118 -7.135 0.804
age 0.0359 0.019 1.909 0.056 -0.001 0.073
sex 1.6745 0.351 4.777 0.000 0.987 2.362
cp 0.8963 0.170 5.266 0.000 0.563 1.230
thalach -0.0247 0.008 -3.087 0.002 -0.040 -0.009
oldpeak 0.6829 0.153 4.470 0.000 0.383 0.982
==============================================================================