The definitions given here are designed to help with the more technical parts of statistical regression analysis. I often find this type of work can help evidence relationships and, sometimes, inferred causality. Regressions are ultimately equations, so where relativism is not relevant or necessary for a particular research question, a more layered and appropriate approach using regressions can help. Equally, simple reductive reasoning from statistical findings can mislead the ‘real’ logic and intuitive depth of research. In short, exploring economics and the economy is both an art and a science. Italics indicate cross-references to other entries. A warning, though: statistics can bring out a sense of certainty rather than doubt – curb your self-righteousness.
The association between a variable and an earlier or a subsequent measure of the same variable.
A variable with just two values, for example, ‘yes’ and ‘no’.
Bivariate regression model
A regression model analysing two variables to establish the strength of the relationship between them.
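As an illustrative sketch (hypothetical data, plain Python), the two least-squares coefficients of a bivariate regression can be computed directly from the usual formulas:

```python
# Minimal sketch: fitting a bivariate (one-predictor) regression by
# least squares, using only the standard library.
from statistics import mean

def fit_bivariate(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    mx, my = mean(x), mean(y)
    # slope = covariance(x, y) / variance(x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    intercept = my - slope * mx
    return intercept, slope

# Exactly linear data (y = 1 + 2x), so the fit recovers those coefficients.
a, b = fit_bivariate([0, 1, 2, 3], [1, 3, 5, 7])
```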
A variable taking more than two ordered or unordered values, for example social class or ethnic group.
A regression model.
A variable not included in a regression model that might lead to biased causal conclusions.
A variable included in a regression model to reduce the effects of self-selection.
Difference in differences
A regression model in which the difference between two (or more) occasions in the outcome variable is related to the difference in the explanatory variable of causal interest.
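A hypothetical two-group, two-period sketch of the calculation behind this design: the change over time in the treated group, minus the change in the control group, nets out the shared time trend.

```python
# Difference in differences in its simplest two-by-two form
# (hypothetical numbers, not from any real study).
def did(treat_before, treat_after, control_before, control_after):
    return (treat_after - treat_before) - (control_after - control_before)

# Treated group rises 20 -> 35, control rises 20 -> 25:
# estimated treatment effect is 15 - 5 = 10.
effect = did(20, 35, 20, 25)
```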
One or more binary variables generated from a categorical variable in order to represent that variable as an explanatory variable in a regression model.
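As an illustrative sketch (hypothetical categories), dummy variables can be generated by coding each non-reference category as its own 0/1 column:

```python
# Sketch: turning a categorical variable into dummy (0/1) variables,
# dropping the first category as the reference level.
def make_dummies(values):
    levels = sorted(set(values))[1:]          # reference level omitted
    return {lev: [1 if v == lev else 0 for v in values] for lev in levels}

d = make_dummies(["red", "blue", "red", "green"])
```

Dropping one category avoids perfect collinearity with the regression intercept.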
The effect of an explanatory variable on an outcome variable, expressed in standardised units.
A variable that is part of the causal process generating an outcome variable.
A variable that is unrelated to the causal process generating an outcome variable.
The variable on the right-hand side of a regression model, also known as a predictor variable or, misleadingly, as an independent variable which, in some circumstances, is a cause of the outcome variable.
An effect associated with an explanatory variable that has a limited number of values. See also random effect.
A model that generates plausible values of a variable that has one or more missing values.
Instrument
A variable correlated with an explanatory variable of interest but uncorrelated with the residual in a regression model, and used in place of the explanatory variable in order to reduce bias.
The product of two or more explanatory variables, used to represent the situation when the effect of one explanatory variable varies according to the value of another explanatory variable.
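As a minimal sketch (hypothetical values), an interaction term is just the elementwise product of two explanatory variables, added as an extra column of the model:

```python
# Interaction term: the product of two explanatory variables.
x1 = [0, 1, 0, 1]          # e.g. a binary treatment indicator
x2 = [2.0, 3.0, 4.0, 5.0]  # e.g. a continuous covariate
interaction = [a * b for a, b in zip(x1, x2)]
```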
A scale, for example income, in which the same numerical difference between two scale values has the same meaning throughout the scale’s range.
An effect of an explanatory variable that does not vary according to its own value or with the value of other explanatory variables.
Logistic regression model
A regression model in which the outcome variable is a binary variable.
A method of estimation used with, for example, a logistic regression model.
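As an illustrative sketch (hypothetical data), the coefficients of a one-predictor logistic regression can be estimated by climbing the log-likelihood with plain gradient ascent; statistical software uses faster, more robust routines such as Newton–Raphson, but the idea of maximising the likelihood is the same.

```python
# Maximum likelihood for a one-predictor logistic regression,
# maximised here by simple gradient ascent (a pedagogical choice).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, rate=0.1, steps=5000):
    b0 = b1 = 0.0
    for _ in range(steps):
        # Gradient of the Bernoulli log-likelihood with respect to b0, b1.
        resid = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += rate * sum(resid)
        b1 += rate * sum(r * xi for r, xi in zip(resid, x))
    return b0, b1

# Hypothetical data: the outcome becomes more likely as x increases.
x = [-2, -1, -0.5, 0.5, 1, 2]
y = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(x, y)   # b1 should come out positive
```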
When the true and observed values of a variable differ.
Increasing or decreasing but not both.
A regression model in which the outcome variable can vary at two or more levels, for example between pupils within schools and between schools. These levels are represented as random effects.
A regression model with more than two outcome variables.
An effect of an explanatory variable that varies according to its own value or with the value of other explanatory variables.
The symmetric bell-shaped statistical distribution.
Ordinary least squares
A method of estimation often used with, for example, a regression model.
The variable on the left-hand side of a regression model, also known as a dependent variable or response which, in some circumstances, is an effect of an explanatory variable.
See statistical significance.
The actual estimate from the sample.
A transformation, for example square root or log, often used to bring an outcome variable closer to a Normal distribution.
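A quick sketch with hypothetical income values: a log transformation pulls in the long right tail of a skewed variable.

```python
# Power (log) transformation of a right-skewed variable.
import math

incomes = [12_000, 25_000, 30_000, 45_000, 1_000_000]
log_incomes = [math.log(v) for v in incomes]
# The extreme value dominates far less on the log scale.
```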
The expected value of an outcome variable from a regression model for fixed values of the explanatory variables.
The distribution, often assumed to be Normal, of a predicted value.
A regression model.
Pseudo R squared
One measure of the extent to which the explanatory variables account for the variability of a binary outcome in a logistic regression model.
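One common version is McFadden's pseudo R squared, sketched here with hypothetical fitted probabilities: one minus the ratio of the model's log-likelihood to that of an intercept-only model.

```python
# McFadden's pseudo R-squared (one of several definitions).
import math

def log_lik(y, p):
    """Bernoulli log-likelihood of binary outcomes y given probabilities p."""
    return sum(math.log(pi if yi == 1 else 1 - pi) for yi, pi in zip(y, p))

y = [0, 0, 1, 1]
p_model = [0.2, 0.3, 0.7, 0.8]        # fitted probabilities (hypothetical)
p_null = [sum(y) / len(y)] * len(y)   # intercept-only model: overall mean
pseudo_r2 = 1 - log_lik(y, p_model) / log_lik(y, p_null)
```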
A squared term, for example x².
The points that divide any statistical distribution into four equal sections.
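The three cut points can be computed with the standard library (exact values depend on the interpolation convention used):

```python
# Quartiles: the three points dividing a distribution into four
# equal sections (here via the statistics module's default method).
from statistics import quantiles

q1, q2, q3 = quantiles([1, 2, 3, 4, 5, 6, 7, 8], n=4)
```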
R squared (R²)
The proportion of the variance of an outcome variable accounted for by the explanatory variables in a regression model when the outcome is measured on an interval scale.
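A minimal sketch of the calculation (hypothetical outcome and fitted values): one minus the residual sum of squares over the total sum of squares.

```python
# R-squared: proportion of outcome variance accounted for by the model.
from statistics import mean

def r_squared(y, fitted):
    my = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

r2 = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```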
An effect associated with explanatory variables (for example, schools) chosen at random from a population having a large or infinite number of possible values.
A statistical model that relates one (or more) outcome variables to a set of one or more explanatory variables.
A variable in a regression model that represents the variance unexplained by the explanatory variables.
A statistical distribution often used to model variables such as income.
The square root of the variance.
A representation of the way in which an estimate, for example, from a regression model, varies from sample to sample.
A representation of the way in which a variable varies from case to case in a sample.
A representation, using the p-value, of how likely it is that the effect of interest will be as far or further from a chosen value (often zero) in other samples.
A statistical distribution in which all values are equally likely.
The variance in an outcome variable attributed to confounding variables.
A measure of the spread in a statistical distribution.
A test of statistical significance often applied to a set of estimates from a regression model.
A transformation of any statistical distribution so that the mean is zero and the variance is one.
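As an illustrative sketch (hypothetical values), standardisation subtracts the mean and divides by the standard deviation, producing z-scores:

```python
# Standardisation: rescale a variable to mean 0 and variance 1.
from statistics import mean, pstdev

def standardise(values):
    m, s = mean(values), pstdev(values)   # population standard deviation
    return [(v - m) / s for v in values]

z = standardise([2, 4, 6, 8])
```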