# Fitting non-linear models

Nonlinear models are sometimes used in hydrology and R can be used for fitting and to calculate confidence intervals for parameters and predicted values.  In this post I’ll work through an example.

Previously we looked at the relationship between initial loss and Antecedent Precipitation Index.  Lets fit a model to data collected for the Bobo River by Cordery (1968) (Figure 1).

Figure 1: Data for initial loss and antecedent precipitation index for the Bobo River (digitized from Figure 5 of Cordery, 1968, units changed to mm)

Cordery proposed the following non-linear model.

$IL = IL_{max}(N)^{API}$       (1)

Where

• $IL$ is the initial loss
• $N$ is a fitted constant (less than unity)
• $IL_{max}$ is a fitted constant, the initial loss when API is zero, which we interpret physically as the maximum value of initial loss.

I’ll summarise the fitting approach below.  Details are available in this gist.

Fitting is achieved using the `nls` function in R.  We need to supply starting values for the parameters;  $IL_{max}$ looks like about 80 and $N$ is usually a bit less than one, say 0.98. We can check if these values are reasonable using the function `nlstools::preview` (Figure 2).

Figure 2: Comparison of model (red +) and observations with starting values $IL_{max}$ = 80, $N$ = 0.98

Using nls, parameter estimates and  95% confidence intervals are:

• $IL_{max}$   84.8 (73.1, 97.1)
• $N$   0.969 (0.964, 0.974)

The fitted curve is shown in Figure 3.

Figure 3: model fit

We could leave it at that, but as Howard Wainer says “the road to advancing knowledge runs through the recognition and measurement of uncertainty rather than through simply ignoring it”.  Therefore, lets check the residuals using the `nlstools::nlsResiduals` function.

Figure 4: Residual diagnostic plots

There is some autocorrelation in the residuals but this is an artifact of the way I obtained the data.  The IL and API values come from a digitised plot so data are sorted from high to low ILs values.  The variance of the residuals seems reasonable homogeneous and residuals look a bit heavy tailed but pass a normality test.

Calculating confidence and prediction intervals for nonlinear models is not straightforward and is currently not implemented in the `nls` function.  There is discussion of one approach (here) and there is also a function, predictNLS in the propagate package.  Also see this exchange on r-help.

I tried two approaches as documented in the gist and they have similar results.  The prediction interval limits are shown in Figure 5.  Two issues stand out.  Even though the relationship didn’t look too bad in Figure 1 there is a lot of uncertainty.  For example, if we calculated an API of 30 mm, there is a 95% change the initial loss would be between 14.4 and 51.2 mm.  This range of initial loss would have a substantial influence on flood peak. Secondly, the lower bound of the prediction interval is negative once API is greater than 48 mm.  That problem could be addressed by fitting a different model, for example we could try fitting log(IL) as a function of log(API) (Figure 3).  We’ve learned at least two things from fitting the model and looking at the uncertainty:

• Figure 1 shows that knowing API helps quantify initial loss but the prediction intervals show there is still a lot of unexplained variation.
• It may be worth exploring different models of the relationship between initial loss and API to obtain more physically realistic results.

The upshot is that we can use R to fit nonlinear models and look at model diagnostics which means we can do a better job of choosing  the best model for a particular application.

Figure 5:  Bobo River API – Initial Loss relationship using the model suggested by Cordery (1970) with 95% prediction interval

Figure 6: Bobo River API – Initial Loss relationship based on log-log model; 95% prediction intervals are shown

### References

Baty, F., Charles, S. Flandrois, J. Delignette-Muller, M. (2009) The R package nlstools: a toolbox for nonlinear regression.  (conference presentation).

Cordery, I. (1968) Improved techniques in flood estimation and modelling.  PhD, School of Civil Engineering, The University of NSW.  (see this link for a PDF).

Cordery, I. (1970) Initial loss for flood estimation and forecasting. Journal of the Hydraulics Division 96 (HY12):2447-2466 (link).

Ritz, C. and Streibig, J. C. (2008) Nonlinear regression with R.  Springer. (link).