# Interpreting dummy-coded parameter estimates with and without a model intercept

Posted April 10, 2018

This post illustrates how the parameter estimates for dummy-coded factors differ when the model includes an intercept versus when it does not.

## Data

The dataset is 10k rows from Lending Club, covering the years 2007-2014. The variables used are:

- `loan_status`: The target. Values are `0 = Fully Paid` and `1 = Charged off (default)`. The goal is to predict whether a loan will default.
- `dti`: The borrower’s debt-to-income ratio: total monthly debt payments on total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income.
- `grade`: Lending Club-assigned loan grade, from “A” (best) to “G” (worst).

## Reference level with an intercept

If we fit a logistic regression model, an intercept is estimated by default. In R, the first level of a factor is also used as the reference level by default, which here is grade A.
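To make the coding scheme concrete, here is a minimal sketch of how dummy (treatment) coding turns a grade into a row of model inputs. It is written in plain Python rather than R, and the helper name `dummy_row` is made up for illustration; R builds the equivalent columns internally when it sees a factor.

```python
GRADES = ["A", "B", "C", "D", "E", "F", "G"]  # first level acts as the reference

def dummy_row(grade, intercept=True):
    """Return the design-matrix row for one loan with the given grade."""
    if intercept:
        # Column 0 is the intercept (always 1). The reference level, grade A,
        # gets no indicator column of its own, so it is represented by all zeros.
        return [1] + [1 if grade == g else 0 for g in GRADES[1:]]
    # Without an intercept, every level gets its own indicator column.
    return [1 if grade == g else 0 for g in GRADES]

print(dummy_row("A"))  # intercept only
print(dummy_row("B"))  # intercept plus the gradeB indicator
```

With an intercept there are seven columns (one intercept, six indicators); without one there are also seven, one indicator per grade.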

The model's log-odds prediction for any given instance starts at the intercept that the model estimated, in this case -2.75. Then, if the instance has a grade from B through G, the log-odds is adjusted up or down by that grade's estimate (the direction depends on the sign of the estimate). If the instance has a grade equal to the reference level (grade A), no adjustment is made. This is because the intercept is the estimated prediction for the reference level, grade A, and all other estimates are relative to grade A.
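The arithmetic described above can be sketched in a few lines of Python. Only the intercept of -2.75 comes from this post; the offsets for grades B and C are hypothetical placeholders, not the fitted estimates, and the function names are invented for the example.

```python
import math

INTERCEPT = -2.75  # from the post: estimated log-odds of default for grade A

# Hypothetical offsets relative to grade A (NOT the post's fitted estimates).
GRADE_OFFSETS = {"A": 0.0, "B": 0.8, "C": 1.3}

def predict_log_odds(grade):
    """Start at the intercept, then add the grade's offset (zero for grade A)."""
    return INTERCEPT + GRADE_OFFSETS[grade]

def log_odds_to_probability(log_odds):
    """Inverse logit: convert log-odds to a probability of default."""
    return 1.0 / (1.0 + math.exp(-log_odds))

for grade in GRADE_OFFSETS:
    lo = predict_log_odds(grade)
    print(grade, round(lo, 2), round(log_odds_to_probability(lo), 3))
```

For grade A the prediction is the intercept itself, which corresponds to a default probability of roughly 6%.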

Using a reference level makes the z-scores for the parameter estimates easier to interpret. Because each estimate is relative to the reference level, we can answer the question “Is the log-odds of default for grade B statistically significantly different from that of grade A?” The answer is found by looking at the z-score for the gradeB estimate, 5.954, which has a very small associated p-value. We can therefore reject the hypothesis that gradeA and gradeB have the same log-odds of default. All other grade parameter estimates can likewise be interpreted as pairwise comparisons between each level and gradeA. Comparisons between other pairs can be obtained by contrasting least-square means for each grade level.
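As a rough illustration of where such a z-score comes from: when grade is the only predictor, the maximum-likelihood estimates have a closed form, so the comparison against the reference level can be computed by hand. The sketch below is Python, not the R used for the models in this post, and the counts are invented for the example, so it will not reproduce the 5.954 reported above.

```python
import math

# Hypothetical (defaults, fully paid) counts per grade; NOT the Lending Club data.
COUNTS = {"A": (100, 1500), "B": (300, 2500)}

def log_odds(defaults, paid):
    """Observed log-odds of default for one grade."""
    return math.log(defaults / paid)

def wald_z_vs_reference(level, reference="A"):
    """Estimate, standard error, and Wald z for `level` relative to `reference`."""
    d1, p1 = COUNTS[level]
    d0, p0 = COUNTS[reference]
    # The dummy-coded estimate is the difference in log-odds between the two grades.
    estimate = log_odds(d1, p1) - log_odds(d0, p0)
    # Wald standard error of a difference of two independent log-odds.
    std_error = math.sqrt(1 / d1 + 1 / p1 + 1 / d0 + 1 / p0)
    return estimate, std_error, estimate / std_error

est, se, z = wald_z_vs_reference("B")
print(round(est, 4), round(se, 4), round(z, 3))
```

A z-score well beyond about 2 in absolute value corresponds to a small p-value, which is the basis for calling the two grades different.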

Least-square mean estimates:

And pairwise contrasts:

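The z-score for the estimate of the intercept can be interpreted as “is the intercept significantly different from 0?” Since a log-odds of 0 corresponds to a probability of 0.5, this asks whether the predicted probability of default for grade A differs from 0.5.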

## No intercept, no reference level

Now examine the parameter estimates for when we instruct our model to not estimate an intercept.

Take note that the estimate for grade A is equal to the previously estimated intercept. However, the estimates for all other grade levels are no longer interpreted relative to any other level. Instead, each z-score tests whether that level's estimate is different from 0. So, while the estimates in isolation are very different from what they were previously, each one is exactly equal to the previous relative estimate added to the previous intercept. The two models lead to exactly the same predictions.
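The equivalence can be checked numerically with a small sketch. It relies on the fact that, with grade as the only predictor, the fitted log-odds for each grade equals that grade's observed log-odds, so both parameterizations can be written down directly instead of being fit iteratively. This is Python rather than R, the counts are hypothetical, and the function names are invented for the example.

```python
import math

# Hypothetical (defaults, fully paid) counts per grade; NOT the Lending Club data.
COUNTS = {"A": (100, 1500), "B": (300, 2500), "C": (400, 2000)}

def fit_without_intercept(counts):
    """One estimate per grade: that grade's own log-odds of default."""
    return {g: math.log(d / p) for g, (d, p) in counts.items()}

def fit_with_intercept(counts, reference="A"):
    """Intercept = reference grade's log-odds; other estimates are offsets from it."""
    per_level = fit_without_intercept(counts)
    intercept = per_level[reference]
    offsets = {g: v - intercept for g, v in per_level.items() if g != reference}
    return intercept, offsets

intercept, offsets = fit_with_intercept(COUNTS)
no_intercept = fit_without_intercept(COUNTS)

for grade in COUNTS:
    # The reference grade has no offset, so its prediction is the intercept alone.
    with_intercept = intercept + offsets.get(grade, 0.0)
    print(grade, round(with_intercept, 4), round(no_intercept[grade], 4))
```

Each row prints the same value twice: the intercept plus the relative estimate from the first parameterization matches the standalone estimate from the second.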

Least-square means in this case are equivalent to what they were previously.

## More than one predictor

Note that least-square mean estimates are not reliable when interaction terms are present in the model. Also, when more than one predictor is used, least-square mean estimates are the mean effect for each level of the requested factor, holding all other predictors constant at their averages. In other words, they are mean estimates “controlling for” the effect of other predictors.

David Eargle is an Assistant Professor at the University of Colorado Boulder in the Leeds School of Business. He earned his Ph.D. degree in Information Systems from the University of Pittsburgh. His research interests include human-computer interaction and information security. He has coauthored several articles in these areas using neurophysiological and other methodologies in outlets such as the Journal of the Association for Information Systems, the European Journal of Information Systems, the International Conference on Information Systems, and the Hawaii International Conference on System Sciences, along with the Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI).
