The method of least squares is one of the most important applications in the approximation of functions. The idea is to find a curve such that, given a set of ordered pairs, this function best approximates the data. The function can be a line, a quadratic curve, a cubic, etc.
The idea of the method consists of minimizing the sum of squares of the differences in the ordinate (Y component), between the points generated by the chosen function and the points belonging to the data set.
Least squares method
Before giving the method, we must first be clear about what “better approach” means. Suppose that we are looking for a line y = b + mx that is the one that best represents a set of n points, namely {(x1, y1), (x2, y2)…, (xn, yn)}.
As shown in the previous figure, if the variables x and y were related by the line y = b + mx, then for x = x1 the corresponding value of y would be b + mx1. However, this value is different from the true value of y, which is y = y1.
Remember that in the plane, the distance between two points is given by the following formula:
With this in mind, to determine the way to choose the line y = b + mx that best approximates the given data, it seems logical to use as a criterion the selection of the line that minimizes the sum of the squares of the distances between the points and the straight.
Since the distance between the points (x1, y1) and (x1, b + mx1) is y1- (b + mx1), our problem reduces to finding numbers m and b such that the following sum is minimal:
The line that fulfills this condition is known as the «approximation of the least squares line to the points (x1, y1), (x2, y2),…, (xn, yn)».
Once the problem is obtained, it only remains to choose a method to find the least squares approximation. If the points (x1, y1), (x2, y2),…, (xn, yn) are all on the line y = mx + b, we would have that they are collinear y:
In this expression:
Finally, if the points are not collinear, then y-Au = 0 and the problem can be translated into finding a vector u such that the Euclidean norm is minimal.
Finding the minimizing vector u is not as difficult as you might think. Since A is an nx2 matrix and u is a 2 × 1 matrix, we have that the vector Au is a vector in R n and belongs to the image of A, which is a subspace of R n with a dimension no greater than two.
We will assume that n = 3 to show which procedure to follow. If n = 3, the image of A will be a plane or a line through the origin.
Let v be the minimizing vector. In the figure we observe that y-Au is minimized when it is orthogonal to the image of A. That is, if v is the minimizing vector, then it happens that:
Then, we can express the above in this way:
This can only happen if:
Finally, solving for v, we have:
It is possible to do this since A t A is invertible as long as the n points given as data are not collinear.
Now, if instead of looking for a line we wanted to find a parabola (whose expression would be of the form y = a + bx + cx 2) that would be a better approximation to the n data points, the procedure would be as described below.
If the n data points were in this parabola, we would have:
Then:
Similarly we can write y = Au. If all the points are not in the parabola, we have that y-Au is different from zero for any vector u and our problem is again: find a vector u in R3 such that its norm --y-Au-- is as small as possible.
Repeating the previous procedure, we can arrive at that the vector sought is:
Solved exercises
Exercise 1
Find the line that best fits the points (1,4), (-2,5), (3, -1) and (4,1).
Solution
We have to:
Then:
Therefore, we conclude that the line that best fits the points is given by:
Exercise 2
Suppose an object is dropped from a height of 200 m. As it falls, the following steps are taken:
We know that the height of said object, after a time t has elapsed, is given by:
If we wish to obtain the value of g, we can find a parabola that is a better approximation to the five points given in the table, and thus we would have that the coefficient that accompanies t 2 will be a reasonable approximation to (-1/2) g if the measurements are accurate.
We have to:
And later:
So the data points are fit by the following quadratic expression:
So, you have to:
This is a value that is reasonably close to correct, which is g = 9.81 m / s 2. In order to obtain a more exact approximation of g, it would be necessary to start from more precise observations.
What is it for?
In the problems that occur in the natural or social sciences, it is convenient to write the relationships that exist between different variables by means of some mathematical expression.
For example, in economics we can relate cost (C), income (I), and profits (U) by means of a simple formula:
In physics, we can relate the acceleration caused by gravity, the time an object has been falling, and the height of the object by law:
In the previous expression s o is the initial height of said object and v o is its initial velocity.
However, finding formulas like these is not an easy task; it is usually up to the professional on duty to work with a lot of data and repeatedly perform several experiments (in order to verify that the results obtained are constant) to find relationships between the different data.
A common way to achieve this is to represent the data obtained in a plane as points and look for a continuous function that optimally approximates those points.
One of the ways to find the function that "best approximates" the given data is by the method of least squares.
Furthermore, as we also saw in the exercise, thanks to this method we can get fairly close approximations to physical constants.
References
- Charles W Curtis Linear Algebra. Springer-Velarg
- Kai Lai Chung. Elementary Proability Theory with Stochastic Processes. Springer-Verlag New York Inc
- Richar L Burden & J.Douglas Faires. Numerical Analysis (7ed). Thompson Learning.
- Stanley I. Grossman. Applications of Linear Algebra. MCGRAW-HILL / INTERAMERICANA DE MEXICO
- Stanley I. Grossman. Linear algebra. MCGRAW-HILL / INTERAMERICANA DE MEXICO