# Short introduction to Generalized Linear Models

## Short introduction to Generalized Linear Models (GLM)

### Linear regression

In linear regression we assume a model of the form $Y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon$ with $Y$ a random variable, $x_1$, ..., $x_p$ fixed values and $\varepsilon$ a random variable describing the error term. The distribution of $Y$ is determined by the distribution of $\varepsilon$, which is usually assumed to be normal.
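As a minimal sketch of this setup, the following Python snippet simulates data from a model of the form $Y = \beta_0 + \beta_1 x + \varepsilon$ with normal errors and recovers the coefficients by least squares. The restriction to a single regressor, the coefficient values and the helper name `fit_simple_ols` are illustrative assumptions, not part of the text above.

```python
import random

def fit_simple_ols(x, y):
    """Closed-form least squares for y = b0 + b1 * x (one regressor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

random.seed(0)
beta0, beta1, sigma = 1.0, 2.0, 0.5        # assumed "true" values
x = [i / 10 for i in range(200)]           # fixed design points
y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]

b0_hat, b1_hat = fit_simple_ols(x, y)
print(b0_hat, b1_hat)                      # estimates close to beta0, beta1
```

With normal errors the estimates scatter around the assumed values; the zero-one case discussed next is where this setup stops being adequate.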

But what happens if the distribution of $Y$ is not normal, e.g. if $Y$ is a zero-one variable describing a failure (usually coded as 0) and a success (usually coded as 1)? Can we extend the linear model such that the easy interpretability of the coefficients is kept?

### Basic generalized linear model

The linear model can be extended in the following way:

$$g(\mu) = g(E(Y)) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

with a fixed link function $g$ depending on the distribution of $Y$ and all other parameters as in the linear model.

In contrast to the linear model we may not estimate the variable $Y$ directly but, as in the case of a zero-one variable, the probability $P(Y=1) = E(Y)$. This requires a framework for handling different distributions of $Y$.
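To illustrate the zero-one case, here is a small sketch that maps a linear predictor to a success probability. It assumes the logit link, which is one common choice for a zero-one response. The function names and coefficient values are hypothetical.

```python
import math

def inverse_logit(eta):
    """Inverse of the logit link: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def success_probability(beta, x):
    """P(Y = 1) for coefficients beta = (b0, b1, ..., bp) and regressors x."""
    eta = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))  # linear predictor
    return inverse_logit(eta)

beta = (-1.0, 0.5)                 # hypothetical coefficients
for x1 in (0.0, 2.0, 4.0):
    print(x1, success_probability(beta, (x1,)))
```

The linear part keeps its usual form; only the inverse link turns it into a quantity that is a valid probability.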

### Exponential family

$Y$ is called a member of the exponential family if we can write the density or probability function of $Y$ as

$$f(y;\theta,\psi) = \exp\left[\frac{y\theta - b(\theta)}{a(\psi)} + c(y,\psi)\right].$$

**Example (Normal distribution):**

$Y \sim N(\mu, \sigma^2)$ is a member of the exponential family with

<latex template="eqnarray.tex"> E(Y) &=& \mu,\ Var(Y)=\sigma^2\\ f(y) &=& \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)\\ &=& \exp\left[-\frac{1}{2}\log(2\pi\sigma^2)-\frac{(y-\mu)^2}{2\sigma^2}\right]\\ &=& \exp\left[\underbrace{-\frac{1}{2}\log(2\pi\sigma^2)-\frac{y^2}{2\sigma^2}}_{=c(y,\psi)} + \underbrace{\frac{1}{\sigma^2}}_{=1/a(\psi)} \left(y\underbrace{\mu}_{=\theta}-\underbrace{\frac{\mu^2}{2}}_{=b(\theta)}\right)\right]\\ \mu &=& \theta\\ \psi&=&\sigma\\ b(\theta) &=& \frac{\theta^2}{2}\\ a(\psi)&=& \psi^2\\ c(y,\psi)&=& -\frac{1}{2}\log(2\pi\psi^2)-\frac{y^2}{2\psi^2} </latex>
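The decomposition for the normal distribution can be checked numerically. The sketch below evaluates the density once directly and once as $\exp[(y\theta - b(\theta))/a(\psi) + c(y,\psi)]$ with the functions derived in the example. The function names are chosen here for illustration.

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density written in the usual form."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def normal_exp_family(y, mu, sigma):
    """Same density via theta = mu, psi = sigma, a, b and c from the example."""
    theta, psi = mu, sigma
    a = psi ** 2
    b = theta ** 2 / 2
    c = -0.5 * math.log(2 * math.pi * psi ** 2) - y ** 2 / (2 * psi ** 2)
    return math.exp((y * theta - b) / a + c)

for y in (-1.0, 0.0, 1.3):
    print(y, normal_pdf(y, 0.5, 2.0), normal_exp_family(y, 0.5, 2.0))
```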

**Example (Binomial distribution):**

$Y \sim B(n, \mu/n)$ with $n$ known is a member of the exponential family with

<latex template="eqnarray.tex"> E(Y) &=& n\frac{\mu}{n} =\mu,\ Var(Y)=n\frac{\mu}{n}\left(1-\frac{\mu}{n}\right) = \mu\left(1-\frac{\mu}{n}\right)\\ P(Y=y) &=& {n \choose y} \left(\frac{\mu}{n}\right)^y\left(1-\frac{\mu}{n}\right)^{n-y} = {n \choose y} \left(\frac{\frac{\mu}{n}}{1-\frac{\mu}{n}}\right)^y\left(1-\frac{\mu}{n}\right)^{n}\\ &=& \exp\left[\log{n \choose y} +y \log\left(\frac{\frac{\mu}{n}}{1-\frac{\mu}{n}}\right) + n \log\left(1-\frac{\mu}{n}\right) \right]\\ &=& \exp\left[\underbrace{\log{n \choose y}}_{=c(y,\psi)} +y \underbrace{\log\left(\frac{\frac{\mu}{n}}{1-\frac{\mu}{n}}\right)}_{=\theta} - \underbrace{-n \log\left(1-\frac{\mu}{n}\right)}_{=b(\theta)} \right]\\ \mu &=&\frac{n\exp(\theta)}{1+\exp(\theta)}\\ \psi&=&\mbox{ unused }\\ b(\theta) &=& n\log(1+\exp(\theta))\\ a(\psi) &=& 1\\ c(y,\psi) &=& \log{n \choose y} </latex>
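The binomial decomposition can be checked in the same numerical way. The sketch compares the probability function with $\exp[y\theta - b(\theta) + c(y,\psi)]$ and inverts $\theta$ back to $\mu$. The function names are illustrative.

```python
import math

def binomial_pmf(y, n, mu):
    """P(Y = y) for Y ~ B(n, mu/n), so that E(Y) = mu."""
    p = mu / n
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

def binomial_theta(n, mu):
    """Natural parameter theta as a function of mu."""
    return math.log((mu / n) / (1 - mu / n))

def binomial_exp_family(y, n, mu):
    """Same probability via theta, b(theta) and c(y, psi); a(psi) = 1."""
    theta = binomial_theta(n, mu)
    b = n * math.log(1 + math.exp(theta))
    c = math.log(math.comb(n, y))
    return math.exp(y * theta - b + c)

def mu_from_theta(theta, n):
    """Inverse mapping mu = n exp(theta) / (1 + exp(theta))."""
    return n * math.exp(theta) / (1 + math.exp(theta))

n, mu = 10, 3.0
for y in range(n + 1):
    print(y, binomial_pmf(y, n, mu), binomial_exp_family(y, n, mu))
```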

**Common properties**

We can derive (under some regularity conditions) some common properties:

- $E(Y) = \mu = b'(\theta)$ and $Var(Y) = b''(\theta)\,a(\psi)$.
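As a worked check, the standard exponential family identities $E(Y) = b'(\theta)$ and $Var(Y) = b''(\theta)\,a(\psi)$ can be verified on the binomial example, where $b(\theta) = n\log(1+\exp(\theta))$ and $a(\psi) = 1$. Differentiating reproduces the mean and variance stated in that example:

<latex template="eqnarray.tex"> b'(\theta) &=& \frac{n\exp(\theta)}{1+\exp(\theta)} = \mu = E(Y)\\ b''(\theta) &=& \frac{n\exp(\theta)}{(1+\exp(\theta))^2} = n\,\frac{\mu}{n}\left(1-\frac{\mu}{n}\right) = \mu\left(1-\frac{\mu}{n}\right)\\ Var(Y) &=& b''(\theta)\,a(\psi) = \mu\left(1-\frac{\mu}{n}\right) </latex>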

The log-likelihood can be maximized by iterative methods, e.g. the Newton-Raphson method.
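The following sketch shows what such an iteration can look like in the simplest case: a zero-one response with logit link, an intercept and one regressor. The data are made up for illustration and `fit_logistic_newton` is a hypothetical helper, not a routine from any library.

```python
import math

def fit_logistic_newton(x, y, max_iter=25, tol=1e-10):
    """Newton-Raphson for the model P(Y = 1) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0               # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0       # negative Hessian (information matrix)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)       # Bernoulli variance at the current fit
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system (negative Hessian) * step = gradient.
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0 += s0
        b1 += s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1

# Made-up data; not perfectly separable, so the maximum exists.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]
b0_hat, b1_hat = fit_logistic_newton(x, y)
print(b0_hat, b1_hat)
```

At the returned values the gradient of the log-likelihood is numerically zero, which is the condition the iteration is solving for. With perfectly separable data the maximum does not exist and this simple loop would not converge.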

### Link functions

The following table shows the parameters and link functions for some distributions:

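| Distribution | Range of $y$ | Canonical link $\theta(\mu)$ | $\psi$ | $a(\psi)$ |
|---|---|---|---|---|
| Bernoulli | $\{0, 1\}$ | $\log\frac{\mu}{1-\mu}$ | unused | 1 |
| Binomial, $n$ known | $\{0, 1, \dots, n\}$ | $\log\frac{\mu}{n-\mu}$ | unused | 1 |
| Poisson | $\{0, 1, 2, \dots\}$ | $\log\mu$ | unused | 1 |
| Negative Binomial, number of successes known | | | unused | 1 |
| Normal | $(-\infty, \infty)$ | $\mu$ | $\sigma$ | $\psi^2$ |
| Gamma | $(0, \infty)$ | $-\frac{1}{\mu}$ | | |
| Inverse Gaussian | $(0, \infty)$ | $-\frac{1}{2\mu^2}$ | | |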

Note: For all distributions in the table the parameters of the distribution of $Y$ are scaled such that $E(Y) = \mu$ (see example for Binomial distribution), and the densities and probability functions are taken from Rinne (2003).

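Even for the same distribution of $Y$ we can have different link functions, e.g.

- for Bernoulli
  - logit: $g(\mu) = \log\frac{\mu}{1-\mu}$ (see table above)
  - probit: $g(\mu) = \Phi^{-1}(\mu)$ with $\Phi$ the cumulative distribution function of the standard normal
  - complementary log-log: $g(\mu) = \log(-\log(1-\mu))$
- and in general (if $\mu$ is positive)
  - power: $g(\mu) = \mu^{\lambda}$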
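The links listed above can be sketched in a few lines of Python using their usual definitions: logit $\log(\mu/(1-\mu))$, probit $\Phi^{-1}(\mu)$, complementary log-log $\log(-\log(1-\mu))$ and power $\mu^{\lambda}$. The function names are illustrative. The standard normal distribution comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()          # standard normal: mean 0, sd 1

def logit(mu):
    return math.log(mu / (1.0 - mu))

def probit(mu):
    return _STD_NORMAL.inv_cdf(mu)

def cloglog(mu):
    return math.log(-math.log(1.0 - mu))

def power(mu, lam):
    """Power link for positive mu and a fixed exponent lam."""
    return mu ** lam

# Inverse links map the linear predictor eta back to the mean.
def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)

def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))

for mu in (0.1, 0.5, 0.9):
    print(mu, logit(mu), probit(mu), cloglog(mu))
```

Logit and probit are symmetric around $\mu = 0.5$, where both equal zero, while the complementary log-log link is not.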

## References

- W. Härdle, M. Müller, S. Sperlich, A. Werwatz (2004). Nonparametric and Semiparametric Models, Springer Verlag, Heidelberg
- P. McCullagh, J.A. Nelder (1989). Generalized Linear Models, Chapman & Hall, London
- H. Rinne (2003). Taschenbuch der Statistik, 3rd edition, Verlag Harri Deutsch
- B. Rönz (1999). Modelling the perception of current and prospective economic situation, Statistics Research Report No. 99.002, The Australian National University