Given a collection of numerical data \(\bigl\{(x_i, y_i)\bigr\},\)
we can use least-squares regression to determine a function
of any template we choose that “best fits” the data.
We call such a function \(f\) a model for the data,
and write \(y_i \sim f(x_i).\)
The numbers that appear in the formula for the model are called parameters.
The “goodness of fit” of the model to the data is often quantified
as a value \(R^2\) between \(0\) and \(1,\) a higher value meaning a better fit.
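As a rough illustration of how \(R^2\) can be computed, the sketch below assumes the common definition \(R^2 = 1 - SS_{\text{res}}/SS_{\text{tot}},\) which the text above does not spell out. The function name `r_squared` and the data are made up for the example.

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)    # variation left unexplained by the model
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation of y about its mean
    return 1.0 - ss_res / ss_tot

# Made-up data, for illustration only
y = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y, y)                      # model reproduces the data exactly
baseline = r_squared(y, [2.5, 2.5, 2.5, 2.5])  # model always predicts the mean of y
print(perfect, baseline)
```

Under this definition a model that reproduces the data exactly scores \(1,\) and a model that only ever predicts the mean of the \(y_i\) scores \(0.\)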
Beyond the \(R^2\) value, whether or not a specific type of function
serves as a good model for some data is rather subjective.
But certain functions have features
that, morally, must be considered before they are used as models.
A linear model \(y_i \sim ax_i + b,\) having a constant slope,
should be used for data where \(y_i\) is suspected to be changing
at a constant rate with respect to \(x_i.\)
This rate is the value of the parameter \(a.\)
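A linear fit might be sketched as follows, assuming NumPy is available. The data are invented to lie near the line \(y = 2x + 1,\) and `np.polyfit` is just one of several ways to carry out the least-squares computation.

```python
import numpy as np

# Made-up data lying near the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Degree-1 least-squares fit; polyfit returns coefficients from the highest power down
a, b = np.polyfit(x, y, 1)
print(f"rate of change a = {a:.3f}, intercept b = {b:.3f}")
```

The fitted \(a\) estimates the constant rate at which \(y_i\) changes per unit of \(x_i.\)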
A quadratic model \(y_i \sim ax_i^2 + bx_i + c,\)
whose slope changes at a constant rate,
should be used for data where \(y_i\) is suspected to be accelerating/decelerating
at a constant rate with respect to \(x_i.\)
This acceleration is the value \(2a,\) and \(b\) is the initial rate at \(x=0.\)
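The same approach extends to a quadratic model. In this sketch the data are generated exactly from \(y = 3x^2 - 2x + 1\) (a made-up example), so the fit should recover those parameters up to rounding error.

```python
import numpy as np

# Synthetic data generated exactly from y = 3x^2 - 2x + 1
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 3 * x**2 - 2 * x + 1

# Degree-2 least-squares fit; coefficients come back as (a, b, c)
a, b, c = np.polyfit(x, y, 2)

acceleration = 2 * a  # constant second derivative of a*x^2 + b*x + c
initial_rate = b      # first derivative 2*a*x + b evaluated at x = 0
print(f"acceleration = {acceleration:.3f}, initial rate = {initial_rate:.3f}")
```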
An exponential model \(y_i \sim \mathrm{e}^{k x_i}\)
should be used for data where \(y_i\) is suspected to be
changing at a rate proportional to the value of \(y_i\) itself,
since the derivative of \(\mathrm{e}^{kx}\) is \(k\mathrm{e}^{kx}.\)
The parameter \(k\) is the constant of proportionality,
\(k \gt 0\) corresponding to a positive/increasing relationship (growth)
and \(k \lt 0\) corresponding to a negative/decreasing relationship (decay).
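One simple way to estimate \(k\) is to take logarithms, so that \(\ln y_i \sim k x_i\) becomes a line through the origin. The sketch below makes that assumption: it minimizes squared errors in \(\ln y_i\) rather than in \(y_i,\) which is a simplification and not the same as a direct nonlinear least-squares fit. It also requires every \(y_i\) to be positive. The data are synthetic, generated with \(k = 0.5.\)

```python
import numpy as np

# Synthetic data generated exactly from y = e^(0.5 x)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.exp(0.5 * x)

# Log-transform: ln(y) ~ k * x, a line through the origin (needs every y_i > 0)
log_y = np.log(y)

# Least-squares slope for a line through the origin: sum(x * ln y) / sum(x^2)
k = np.sum(x * log_y) / np.sum(x**2)
print(f"k = {k:.3f}")
```

On noisy data the log-transformed fit and a direct fit in \(y_i\) will generally give slightly different values of \(k.\)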