Nonlinear models are sometimes used in hydrology and R can be used for fitting and to calculate confidence intervals for parameters and predicted values. In this post I’ll work through an example.
Cordery proposed the following non-linear model.
- is the initial loss
- is a fitted constant (less than unity)
- is a fitted constant, the initial loss when API is zero, which we interpret physically as the maximum value of initial loss.
I’ll summarise the fitting approach below. Details are available in this gist.
Fitting is achieved using the
nls function in R. We need to supply starting values for the parameters; looks like about 80 and is usually a bit less than one, say 0.98. We can check if these values are reasonable using the function
nlstools::preview (Figure 2).
Using nls, parameter estimates and 95% confidence intervals are:
- 84.8 (73.1, 97.1)
- 0.969 (0.964, 0.974)
The fitted curve is shown in Figure 3.
Figure 3: model fit
We could leave it at that, but as Howard Wainer says “the road to advancing knowledge runs through the recognition and measurement of uncertainty rather than through simply ignoring it”. Therefore, lets check the residuals using the
There is some autocorrelation in the residuals but this is an artifact of the way I obtained the data. The IL and API values come from a digitised plot so data are sorted from high to low ILs values. The variance of the residuals seems reasonable homogeneous and residuals look a bit heavy tailed but pass a normality test.
Calculating confidence and prediction intervals for nonlinear models is not straightforward and is currently not implemented in the
nls function. There is discussion of one approach (here) and there is also a function, predictNLS in the propagate package. Also see this exchange on r-help.
I tried two approaches as documented in the gist and they have similar results. The prediction interval limits are shown in Figure 5. Two issues stand out. Even though the relationship didn’t look too bad in Figure 1 there is a lot of uncertainty. For example, if we calculated an API of 30 mm, there is a 95% change the initial loss would be between 14.4 and 51.2 mm. This range of initial loss would have a substantial influence on flood peak. Secondly, the lower bound of the prediction interval is negative once API is greater than 48 mm. That problem could be addressed by fitting a different model, for example we could try fitting log(IL) as a function of log(API) (Figure 3). We’ve learned at least two things from fitting the model and looking at the uncertainty:
- Figure 1 shows that knowing API helps quantify initial loss but the prediction intervals show there is still a lot of unexplained variation.
- It may be worth exploring different models of the relationship between initial loss and API to obtain more physically realistic results.
The upshot is that we can use R to fit nonlinear models and look at model diagnostics which means we can do a better job of choosing the best model for a particular application.
Baty, F., Charles, S. Flandrois, J. Delignette-Muller, M. (2009) The R package nlstools: a toolbox for nonlinear regression. (conference presentation).
Cordery, I. (1968) Improved techniques in flood estimation and modelling. PhD, School of Civil Engineering, The University of NSW. (see this link for a PDF).
Cordery, I. (1970) Initial loss for flood estimation and forecasting. Journal of the Hydraulics Division 96 (HY12):2447-2466 (link).
Ritz, C. and Streibig, J. C. (2008) Nonlinear regression with R. Springer. (link).