%\input math_mac.tex
%\setup {\sl Appendix IV: Derivation of the Least Squares Fit}
%{\sl
%\subsect {(C.2)} {Derivation of the Least Squares Fit}
%\noindent Following is a simple derivation of the least squares fit.
%\medskip
%
\noindent Suppose the relationship between the two experimental parameters
being studied is
\[
y = f(x)
\]
where $x$ is the independent parameter, which is varied, and
$y$ is the dependent parameter.
If $f(x)$ is a polynomial, or can be approximated
by a polynomial, then the least squares method is a
\emph{linear} one, because the unknown coefficients enter linearly,
and it will almost always give reliable answers.
If $f(x)$ cannot be expressed as a polynomial, but consists
of transcendental functions, the least squares method is
non-linear, and may or may not work reliably.
In some cases, a change of variables may result in a polynomial,
as in the exponential example above.
Likewise, a function like
\[
y = a + \frac{b}{x} + \frac{c}{x^2}
\]
is not a polynomial in $x$, but it is a polynomial in
the variable $z = 1/x$.
%\medskip
%
\par Suppose the functional relationship between $x$ and $y$
is a polynomial of degree $\ell$:
\begin{equation}
y = a_0 + a_1 x + a_2 x^2 + \cdots + a_\ell x^\ell \label{eq:lsfone}
\end{equation}
or
\begin{equation}
y = \sum_{j=0}^\ell a_j x^j \label{eq:lsftwo}
\end{equation}
and we have a set of $N$ data points $\{x_i, y_i\}$ obtained
by experiment.
The goal is to find the values of the $\ell+1$ parameters
$a_0, a_1, \ldots, a_\ell$ that give the best fit of
Equation~{\ref{eq:lsfone}} to our data points.
The first thing to note is that
\begin{equation}
N \geq \ell+1 \label{eq:lsfthr}
\end{equation}
or else we will not be able to determine the parameters uniquely.
For example, if $\ell=1$, we need at least two data points to
find the equation of the straight line.
To make any meaningful statistical statements, however,
we will need even more than $\ell+1$ points, as we shall
see later.
A good rule of thumb: if we wish to fit our data with
a polynomial of degree $\ell$ at a 95\% confidence level,
we should choose $N$ such that
\begin{equation}
N - (\ell+1) \geq 10 \label{eq:lsffou}
\end{equation}

The idea behind the linear least squares method is to
\emph{minimize} the sum of the squared deviations between each
measured $y_i$ and the polynomial evaluated at $x_i$:
\begin{equation}
S = \sum_{i=1}^N \left(y_i - \sum_{j=0}^\ell a_j x_i^j \right)^2
\label{eq:lsffiv}
\end{equation}
$S$ will be a minimum if its partial derivative with respect to
each coefficient vanishes:
\begin{equation}
\frac{\partial S}{\partial a_k} = 0 \qquad k = 0, 1, 2, \ldots, \ell \label{eq:lsfsix}
\end{equation}
Differentiating Equation~{\ref{eq:lsffiv}} with respect to $a_k$ gives
\[
\frac{\partial S}{\partial a_k} =
-2 \sum_{i=1}^N \left(y_i - \sum_{j=0}^\ell a_j x_i^j \right) x_i^k
\]
Setting this to zero and rearranging, the result is $\ell+1$ linear
equations in $\ell+1$ unknowns:
\begin{equation}
\sum_{j=0}^\ell a_j \left( \sum_{i=1}^N x_i^{j+k} \right) =
\sum_{i=1}^N x_i^k y_i \qquad k = 0, 1, \ldots, \ell \label{eq:lsfsev}
\end{equation}
which can be solved by standard matrix techniques for
the unknown coefficients $a_0, a_1, \ldots, a_\ell$.

As an example, let us consider the case where $\ell=1$, or
\[
y = m x + b
\]
so that $a_0 = b$ and $a_1 = m$.
In this case,
\[
S = \sum_{i=1}^N \left( y_i - ( m x_i + b ) \right)^2
\]
Writing out Equation~{\ref{eq:lsfsev}} for $k=0$ and $k=1$, we have
\begin{eqnarray}
b N + m \left( \sum_{i=1}^N x_i \right) & = & \sum_{i=1}^N y_i \\
%\noalign{\smallskip}
b \left( \sum_{i=1}^N x_i \right) + m \left( \sum_{i=1}^N x_i^2 \right) & = & \sum_{i=1}^N x_i y_i
\end{eqnarray}
Then the intercept $b$ and the slope $m$ can be found
from Cramer's rule, where every sum runs over $i = 1, \ldots, N$:
\begin{equation}
b = \frac{\left(\sum y_i\right)\left(\sum x_i^2\right) -
          \left(\sum x_i\right)\left(\sum x_i y_i\right)}
         {N\left(\sum x_i^2\right) - \left(\sum x_i\right)^2}
\end{equation}
and
\begin{equation}
m = \frac{N\left(\sum x_i y_i\right) -
          \left(\sum x_i\right)\left(\sum y_i\right)}
         {N\left(\sum x_i^2\right) - \left(\sum x_i\right)^2}
\end{equation}
%}
%\vfill\eject
%\vfill\eject\end
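
\par As a quick numerical check of the formulas for $b$ and $m$,
consider four made-up data points, chosen purely for illustration
and not taken from any experiment: $(1,2)$, $(2,3)$, $(3,5)$ and $(4,6)$.
With $N = 4$, the required sums are
\[
\sum x_i = 10, \qquad \sum y_i = 16, \qquad
\sum x_i^2 = 30, \qquad \sum x_i y_i = 47
\]
so the common denominator is
$N \sum x_i^2 - \left(\sum x_i\right)^2 = 4 \cdot 30 - 10^2 = 20$, and
\[
b = \frac{16 \cdot 30 - 10 \cdot 47}{20} = 0.5, \qquad
m = \frac{4 \cdot 47 - 10 \cdot 16}{20} = 1.4
\]
For these illustrative points the fitted line is therefore $y = 1.4\,x + 0.5$.
Its residuals $y_i - (m x_i + b)$ are $0.1$, $-0.3$, $0.3$ and $-0.1$,
which sum to zero, as the first of the two linear equations requires.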