Regression analysis


 

Regression analysis is any statistical method in which the mean of one or more random variables is predicted conditional on other (measured) random variables.

Related Topics:
Mean - Random variable - Predicted

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Common forms include linear regression, logistic regression and Poisson regression, and regression is also one of the main tasks in supervised learning. Regression analysis is the statistical view of curve fitting: choosing a curve that best fits given data points.

Related Topics:
Linear regression - Logistic regression - Poisson regression - Supervised learning - Data point

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Sometimes there are only two variables. One of them, called X, can be regarded as constant, i.e., non-random, because it can be measured without substantial error and its values can even be chosen at will. For this reason it is called the independent or controlled variable. The other variable, called Y, is a random variable known as the dependent variable, because its values depend on X. In regression we are interested in how Y varies with X.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Typical examples are the dependence of the blood pressure Y on the age X of a person, or the dependence of the weight Y of certain animals on their daily ration of food X. This dependence is called the regression of Y on X.
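As an illustration of the regression of Y on X with a single controlled variable, here is a minimal sketch that fits a straight line using the standard closed-form least-squares formulas. The function name fit_simple_regression and the data are hypothetical: the numbers are synthetic and do not come from any real study of blood pressure or animal weight.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    # slope = sample covariance of (x, y) divided by sample variance of x
    slope = np.dot(x_centered, y - y.mean()) / np.dot(x_centered, x_centered)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Synthetic data (an assumption for illustration only):
# X is chosen at will, Y is X-dependent plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(20.0, 70.0, 50)
y = 100.0 + 0.5 * x + rng.normal(0.0, 2.0, size=x.size)

intercept, slope = fit_simple_regression(x, y)
print(f"estimated regression line: Y = {intercept:.2f} + {slope:.2f} * X")
```

The fitted line estimates the mean of Y for each value of X, which is what the regression of Y on X refers to.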

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

See also: multivariate normal distribution, important publications in regression analysis.

Related Topics:
Multivariate normal distribution - Important publications in regression analysis

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Regression is usually posed as an optimization problem: we seek the solution for which the error is at a minimum. The most common error measure is the sum of squared errors (least squares), which corresponds to a Gaussian likelihood of generating the observed data given the (hidden) random variable. In a certain sense, least squares is an optimal estimator: by the Gauss-Markov theorem, it is the best linear unbiased estimator when the errors have zero mean, equal variance and are uncorrelated.

Related Topics:
Optimization - Error - Minimum - Least squares - Gaussian likelihood - Gauss-Markov theorem
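The link between least squares and a Gaussian likelihood can be checked numerically. The sketch below assumes a linear model with independent Gaussian noise of known standard deviation sigma; the helper gaussian_nll and the synthetic data are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def gaussian_nll(params, X, y, sigma=1.0):
    """Negative log-likelihood of y under y ~ Normal(X @ params, sigma^2)."""
    residuals = y - X @ params
    n = y.size
    constant = 0.5 * n * np.log(2.0 * np.pi * sigma**2)
    # Apart from the constant, this is the sum of squared errors scaled by 1/(2 sigma^2)
    return constant + 0.5 * np.sum(residuals**2) / sigma**2

# Synthetic data for illustration
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=100)
X = np.column_stack([np.ones_like(x), x])  # design matrix with an intercept column
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=x.size)

# Ordinary least-squares solution
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Moving away from the least-squares solution lowers the Gaussian likelihood
nll_at_ls = gaussian_nll(beta_ls, X, y)
nll_perturbed = gaussian_nll(beta_ls + np.array([0.1, -0.05]), X, y)
print(nll_at_ls < nll_perturbed)
```

Because the negative log-likelihood differs from the squared error only by a constant and a positive scale factor, minimizing one minimizes the other.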

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The optimization problem in regression is typically solved by algorithms such as the gradient descent algorithm, the Gauss-Newton algorithm, and the Levenberg-Marquardt algorithm. Probabilistic algorithms such as RANSAC can be used to find a good fit for a sample set, given a parametrized model of the curve function.

Related Topics:
Gradient descent - Gauss-Newton algorithm - Levenberg-Marquardt algorithm - RANSAC
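To make the first of these algorithms concrete, here is a minimal sketch of plain gradient descent applied to the mean squared error of a linear model. The function name, learning rate and step count are assumptions chosen for this small synthetic problem; they would need tuning for other data.

```python
import numpy as np

def gradient_descent_least_squares(X, y, learning_rate=0.1, n_steps=2000):
    """Minimize the mean squared error of y ≈ X @ beta by gradient descent."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    for _ in range(n_steps):
        # Gradient of (1/n) * ||X beta - y||^2 with respect to beta
        gradient = (2.0 / n_samples) * X.T @ (X @ beta - y)
        beta -= learning_rate * gradient
    return beta

# Synthetic data for illustration
rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 - 2.0 * x + rng.normal(0.0, 0.3, size=x.size)

beta_gd = gradient_descent_least_squares(X, y)
beta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_gd, beta_exact, atol=1e-6))
```

For a linear model the exact solution is available directly, so the comparison with np.linalg.lstsq serves only as a check. Iterative methods matter more when the model is nonlinear in its parameters, which is the setting Gauss-Newton and Levenberg-Marquardt are designed for.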

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Regression can be expressed as a maximum likelihood method of estimating the parameters of a model. However, for small amounts of data, this estimate can have high variance. Some practitioners use maximum a posteriori (MAP) methods, which place a prior over the parameters and then choose the parameters that maximize the posterior. MAP methods are related to Occam's Razor: there is a preference for simplicity among a family of regression models (curves) just as there is a preference for simplicity among competing theories.

Related Topics:
Maximum likelihood - Maximum a posteriori - Prior - Posterior - Occam's Razor
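One common special case can be sketched as follows, assuming Gaussian noise and a zero-mean Gaussian prior on the parameters. Under those assumptions the MAP estimate has a closed form that coincides with ridge regression. The function name map_estimate, the variances and the data are illustrative assumptions, and for simplicity the prior is also placed on the intercept term.

```python
import numpy as np

def map_estimate(X, y, noise_var=1.0, prior_var=1.0):
    """MAP estimate for y ~ Normal(X @ beta, noise_var), beta ~ Normal(0, prior_var * I)."""
    lam = noise_var / prior_var  # strength of the pull towards zero
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# A small synthetic data set fitted with a degree-5 polynomial
rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 8)
X = np.vander(x, N=6, increasing=True)  # columns: 1, x, x^2, ..., x^5
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=x.size)

beta_ml, *_ = np.linalg.lstsq(X, y, rcond=None)         # maximum likelihood (least squares)
beta_map = map_estimate(X, y, noise_var=0.04, prior_var=1.0)

# The prior shrinks the coefficients, favouring a simpler curve
print(np.linalg.norm(beta_map) < np.linalg.norm(beta_ml))
```

As prior_var grows, the prior becomes less informative and the MAP estimate approaches the maximum likelihood estimate.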

~ ~ ~ ~ ~ ~ ~ ~ ~ ~