
Recommended Practice for Hydrologic Investigations and Reporting

A comment on:

R. J. Nathan, and T. A. McMahon (2017) Recommended practice for hydrologic investigations and reporting.  Australasian Journal of Water Resources 21(1):3-19

This is a very useful paper which sets out a road map of what should be included in hydrologic reports.

A few highlights.


When using rainfall, evaporation and streamflow data, include:

  • gauge name and number (for flow data also include the stream name)
  • location (latitude and longitude)
  • period of record
  • amount of missing data
  • method used to impute missing data
  • (for rainfall data) the method used to check and adjust accumulated values.
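The period of record and the amount of missing data are easy to tabulate. Here is a minimal sketch, assuming a daily series held as a list of (date, value) pairs with None marking a missing value. The function name and the data layout are my own illustration, not something from the paper.

```python
from datetime import date, timedelta

def summarise_record(series):
    """Return period of record and missing-data statistics for a daily series.

    series: list of (date, value) pairs with unique dates; value is None
    when missing. Days absent from the list are also counted as missing.
    """
    dates = [d for d, _ in series]
    start, end = min(dates), max(dates)
    expected_days = (end - start).days + 1
    present = sum(1 for _, v in series if v is not None)
    missing = expected_days - present
    return {
        "start": start,
        "end": end,
        "expected_days": expected_days,
        "missing_days": missing,
        "missing_percent": 100.0 * missing / expected_days,
    }

# Hypothetical ten-day rainfall record: one flagged gap (None) and one day
# (index 5) that is absent from the file altogether.
first_day = date(2020, 1, 1)
values = [1.2, 0.0, None, 3.4, 0.0, 0.0, 5.1, 0.0, 2.2, 0.0]
record = [(first_day + timedelta(days=i), v) for i, v in enumerate(values) if i != 5]

summary = summarise_record(record)
print(summary)
```

Counting absent days as well as flagged gaps matters because some data sources simply omit days with no observation.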

The quality of rainfall and evaporation records can be checked using double-mass curve analysis.
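A double-mass curve plots the cumulative total at the gauge being checked against the cumulative total of a reference series (in practice, usually an average of several nearby gauges). A change in slope suggests an inconsistency in the record. The sketch below uses made-up annual totals and assumes the break point has already been picked by eye; the function names are hypothetical.

```python
from itertools import accumulate

def double_mass(test_totals, reference_totals):
    """Return cumulative (reference, test) pairs for a double-mass curve."""
    if len(test_totals) != len(reference_totals):
        raise ValueError("series must be the same length")
    return list(zip(accumulate(reference_totals), accumulate(test_totals)))

def segment_slope(points, first, last):
    """Slope of the double-mass curve between two point indices."""
    (x0, y0), (x1, y1) = points[first], points[last]
    return (y1 - y0) / (x1 - x0)

# Hypothetical annual rainfall totals (mm). The test gauge reads roughly
# 20% low from the fifth year onwards.
reference = [800, 750, 900, 820, 780, 860, 810, 790]
test = [790, 760, 890, 830, 620, 690, 650, 630]

curve = double_mass(test, reference)
early_slope = segment_slope(curve, 0, 3)  # before the suspected break
late_slope = segment_slope(curve, 3, 7)   # after the suspected break

# Factor that would bring the later record back onto the earlier slope
adjustment = early_slope / late_slope
print(early_slope, late_slope, adjustment)
```

Whether to adjust the earlier or the later part of the record depends on which period is believed to be correct, which the curve alone cannot tell you.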

For streamflow data, the quality of the rating is important so there needs to be information on:

  • number of gaugings used to define the rating curve
  • a comparison of the maximum gauged flow to the maximum recorded flow.
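The comparison of gauged and recorded maxima is a one-line calculation, but it helps to report how much of the record sits above the highest gauging. A small sketch with invented flows (m³/s); the function name and dictionary keys are hypothetical.

```python
def rating_extrapolation(flows, max_gauged_flow):
    """Compare recorded flows with the highest flow that was actually gauged."""
    recorded = [q for q in flows if q is not None]
    max_recorded = max(recorded)
    above = sum(1 for q in recorded if q > max_gauged_flow)
    return {
        "max_recorded_flow": max_recorded,
        "ratio_recorded_to_gauged": max_recorded / max_gauged_flow,
        "values_above_max_gauging": above,
    }

# Hypothetical annual maxima, with one missing year
flows = [12.0, 48.0, 450.0, 95.0, None, 130.0]
result = rating_extrapolation(flows, max_gauged_flow=120.0)
print(result)
```

A ratio well above 1 means the largest floods rely on an extrapolated rating.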

High-flow ratings can be checked and adjusted using hydraulic modelling (Tate and Russell, 2014).

My personal observation is that there is often more data available than can be found on the standard websites.  It may be worth chasing older published compilations of data.

Objective functions for modelling

The objective function used to find optimal parameter values will depend on which aspect of the model results is most important to get right.  The standard Nash-Sutcliffe approach may be fine for high flows, because the squared errors are dominated by the largest flows, but it may not provide a good fit for low flows.
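To illustrate, the sketch below compares the Nash-Sutcliffe coefficient on raw flows with the same statistic on log-transformed flows. The log transform is one common way to give low flows more weight; it is my choice for this example, not a recommendation taken from the paper. The flows and the small offset that avoids log(0) are invented.

```python
import math

def nse(observed, modelled):
    """Nash-Sutcliffe coefficient of efficiency."""
    mean_obs = sum(observed) / len(observed)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    sse = sum((m - o) ** 2 for o, m in zip(observed, modelled))
    return 1.0 - sse / sst

def log_nse(observed, modelled, offset=0.01):
    """NSE of log-transformed flows; the offset (an assumption) avoids log(0)."""
    log_obs = [math.log(o + offset) for o in observed]
    log_mod = [math.log(m + offset) for m in modelled]
    return nse(log_obs, log_mod)

# Hypothetical flows: the model matches the flood peak well but
# overestimates every low flow.
observed = [0.2, 0.3, 0.5, 40.0, 120.0, 15.0, 1.0, 0.4]
modelled = [3.0, 3.0, 3.0, 42.0, 115.0, 17.0, 4.0, 3.0]

e_raw = nse(observed, modelled)
e_log = log_nse(observed, modelled)
print(round(e_raw, 2), round(e_log, 2))
```

With these numbers the raw statistic is close to 1 while the log version is much lower, even though the model is wrong by an order of magnitude on the low flows.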

Model validation

“Whatever length of data is available, an optimisation technique needs to be applied that allows validation of the model parameters using ‘independent’ data.”
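One simple way to hold back 'independent' data is a split-sample test: calibrate on one part of the record and evaluate on the rest. The sketch below uses a deliberately trivial model (flow = c × rainfall) and invented data purely to show the mechanics. It is not the procedure from the paper, and all names are hypothetical.

```python
def nse(observed, modelled):
    """Nash-Sutcliffe coefficient of efficiency."""
    mean_obs = sum(observed) / len(observed)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    sse = sum((m - o) ** 2 for o, m in zip(observed, modelled))
    return 1.0 - sse / sst

def calibrate_runoff_coefficient(rain, flow):
    """Least-squares estimate of c in flow = c * rain."""
    return sum(p * q for p, q in zip(rain, flow)) / sum(p * p for p in rain)

def split_sample_test(rain, flow, split):
    """Calibrate on the first `split` values, validate on the remainder."""
    c = calibrate_runoff_coefficient(rain[:split], flow[:split])
    e_cal = nse(flow[:split], [c * p for p in rain[:split]])
    e_val = nse(flow[split:], [c * p for p in rain[split:]])
    return c, e_cal, e_val

# Hypothetical rainfall (mm) and runoff (mm) for ten events
rain = [10.0, 0.0, 25.0, 5.0, 40.0, 0.0, 15.0, 30.0, 2.0, 20.0]
flow = [3.1, 0.2, 7.4, 1.3, 12.5, 0.1, 4.2, 9.6, 0.9, 5.7]

c, e_cal, e_val = split_sample_test(rain, flow, split=5)
print(round(c, 3), round(e_cal, 3), round(e_val, 3))
```

The validation statistic is the one to report as evidence of model skill, since the calibration statistic is computed from the same data used to fit the parameter.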

Flood frequency analysis

There is good advice on flood frequency analysis in Australian Rainfall and Runoff.  One sage piece of advice by Nathan and McMahon relates to the inclusion of historic events (events that predate the gauging record).  These can be included when fitting probability models (see ARR, Book 3, Section 2.8.4).

“[the historic period] is commonly assumed to commence in the year that the earliest historic event occurred in, but it may be more appropriate to base this on the year in which reliable anecdotal evidence (in the form of newspaper or other extant records) can be assumed to be available.”

Use of regional information

We are fortunate in Australia to have the Regional Flood Frequency Analysis (RFFA) tool, which provides flood frequency estimates over the populated areas of Australia. Sometimes the relationships between flood size and site characteristics provided by the RFFA may also be useful for transposing other flow indices.


Nathan and McMahon provide a comprehensive list of what needs to be provided in a report.  The items that stood out for me were:

  • A map that shows the area to be modelled, waterways, catchments, sub-catchments,  and location of rainfall, evaporation and flow monitoring stations
  • Tables and charts that show the availability of data
  • Information on ratings for flow gauges
  • Clear description of what data were used and an explanation of any data that were excluded
  • Description of the data used in calibration and validation
  • Statistics and graphical descriptions of model performance
  • Values of all model parameters and how they were obtained
  • Details of flood maxima used at the key sites of interest including information on:
    • sources of data
    • dates of occurrence of floods
    • extent of extrapolation of the rating curve
    • outliers and how they were handled
    • historical data
  • Details of flood frequency analysis:
    • choice of probability model
    • method of fitting
    • the use of prior information for parameters
    • confidence limits for the fitted flood frequency distribution
  • All acronyms should be defined (and located in one place in the report)
  • It is not appropriate to leave key details out of reporting by referring to other reports that the reviewer can’t access.

Nathan and McMahon conclude that the issues in most need of additional consideration are:

  • Provision of more detail around the provenance and preparation of datasets
  • Increased rigour regarding the calibration and validation of models
  • The use of regional information and independent methods to derive ‘best estimates’ based on consideration of the relative sources of uncertainty
  • Assessment of salient uncertainties and their impact on the conclusion drawn.

Reporting needs to include enough detail to satisfy a technical reviewer.  It is fine for reports to be aimed at a general audience provided there is sufficient technical detail in appendices.


Tate, B. and Russell, K. (2014) Improving rating curves with 2D hydrodynamic modelling. Hydrology and Water Resources Symposium, pp. 873–880. Perth: Engineers Australia.

R. J. Nathan, and T. A. McMahon (2017) Recommended practice for hydrologic investigations and reporting.  Australasian Journal of Water Resources 21(1):3-19


Barma, D. and I. Varley (2012) Hydrological Modelling Practices for Estimating Low Flows – Guidelines.  Low Flows Report Series.  Canberra: National Water Commission.

Hydrologic modelling practice notes https://www.mdba.gov.au/publications/mdba-reports/hydrologic-modelling-practice-notes

Beven, K. and P. Young (2013) A guide to good practice in modelling semantics for authors and referees.  Water Resources Research 49(8):5092-5098.

Vaze, J., P. Jordan, R. Beecham, A. Frost and G. Summerell (2012) Guidelines for rainfall-runoff modelling – towards best practice model application.  eWater Cooperative Research Centre.

Jakeman, A. J., Letcher, R. A. and Norton, J. P. (2006) Ten iterative steps in development and evaluation of environmental models.  Environmental Modelling and Software 21(5):602-614.


Model performance based on coefficient of efficiency

The Nash-Sutcliffe coefficient of efficiency, E, is commonly used to assess the performance of rainfall-runoff models.

E = \frac{\sum \left(O_i - \bar{O} \right)^2 - \sum \left( M_i - O_i \right)^2 }{\sum \left(O_i - \bar{O} \right)^2 }

where O_i are the observed values, M_i are the modelled values and \bar{O} is the mean of the observed values.

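The maximum value of E is 1, which indicates a perfect fit.  A value of zero indicates that the model is only as good as using the mean of the observations.  E can be less than zero, in which case the mean of the observations is a better predictor than the model.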
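The three cases (perfect fit, no better than the mean, worse than the mean) can be checked directly. This is a minimal sketch that follows the equation above term by term; the function name and the data are made up for illustration.

```python
def coefficient_of_efficiency(observed, modelled):
    """Nash-Sutcliffe E = (SST - SSE) / SST, as in the equation above."""
    if len(observed) != len(modelled):
        raise ValueError("observed and modelled must be the same length")
    mean_obs = sum(observed) / len(observed)
    sst = sum((o - mean_obs) ** 2 for o in observed)             # sum (O_i - mean(O))^2
    sse = sum((m - o) ** 2 for o, m in zip(observed, modelled))  # sum (M_i - O_i)^2
    if sst == 0:
        raise ValueError("E is undefined when all observations are equal")
    return (sst - sse) / sst

observed = [2.0, 4.0, 6.0, 8.0]

e_perfect = coefficient_of_efficiency(observed, [2.0, 4.0, 6.0, 8.0])
e_mean = coefficient_of_efficiency(observed, [5.0, 5.0, 5.0, 5.0])
e_poor = coefficient_of_efficiency(observed, [8.0, 6.0, 4.0, 2.0])

print(e_perfect, e_mean, e_poor)  # 1.0 0.0 -3.0
```

There is no lower bound: the worse the model, the more negative E becomes.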

So what value of E indicates that a model is reasonable? Chiew and McMahon (1993) surveyed 93 hydrologists (63 responded) to find out which diagnostic plots and goodness-of-fit statistics they used, which were the most important, and how they were used to classify the quality of a model fit. The most important diagnostic graphs were time-series plots and scatter plots of simulated and recorded flows from the data used for calibration. R-squared and the Nash-Sutcliffe model efficiency coefficient (E) were the favoured goodness-of-fit statistics. Results were considered acceptable if E ≥ 0.8.

I adapted the values provided by Chiew and McMahon (1993) to create the following table (Ladson, 2008).

Table 1: Model performance based on coefficient of efficiency

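| Classification | Coefficient of efficiency | Coefficient of efficiency |
|----------------|---------------------------|---------------------------|
| Excellent      | E ≥ 0.93                  | E ≥ 0.93                  |
| Good           | 0.8 ≤ E < 0.93            | 0.8 ≤ E < 0.93            |
| Satisfactory   | 0.7 ≤ E < 0.8             | 0.6 ≤ E < 0.8             |
| Passable       | 0.6 ≤ E < 0.7             | 0.3 ≤ E < 0.6             |
| Poor           | E < 0.6                   | E < 0.3                   |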
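Table 1 is easy to apply as a lookup. The sketch below encodes the lower bound of each class for the two sets of thresholds in Table 1; the names FIRST_COLUMN, SECOND_COLUMN and classify are my own and purely illustrative.

```python
# Lower bounds for each class, taken from the two threshold columns of Table 1.
# Anything below the last bound is classed as "Poor".
FIRST_COLUMN = [(0.93, "Excellent"), (0.8, "Good"), (0.7, "Satisfactory"), (0.6, "Passable")]
SECOND_COLUMN = [(0.93, "Excellent"), (0.8, "Good"), (0.6, "Satisfactory"), (0.3, "Passable")]

def classify(e, thresholds):
    """Return the Table 1 class for a coefficient of efficiency e."""
    for lower_bound, label in thresholds:
        if e >= lower_bound:
            return label
    return "Poor"

print(classify(0.65, FIRST_COLUMN))   # Passable
print(classify(0.65, SECOND_COLUMN))  # Satisfactory
print(classify(0.25, SECOND_COLUMN))  # Poor
```

The same value of E can land in different classes depending on which set of thresholds applies, so a report should state which one was used.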

Others have suggested different ranges.  For example, when assessing an ecosystem model of the North Sea, the following categorisation was used: E > 0.65 excellent, 0.5 to 0.65 very good, 0.2 to 0.5 good, and < 0.2 poor (Allen et al., 2007).  Moriasi et al. (2015) provide the performance evaluation criteria shown in Table 2.

Table 2: Criteria for Nash-Sutcliffe coefficient of efficiency (Moriasi et al., 2015)

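| Component | Temporal scale         | Very good | Good            | Satisfactory     | Not satisfactory |
|-----------|------------------------|-----------|-----------------|------------------|------------------|
| Flow      | Daily, monthly, annual | E > 0.80  | 0.7 < E ≤ 0.8   | 0.50 < E ≤ 0.70  | E ≤ 0.50         |
| Sediment  | Monthly                | E > 0.80  | 0.7 < E ≤ 0.8   | 0.45 < E ≤ 0.7   | E ≤ 0.45         |
|           | Monthly                | E > 0.65  | 0.50 < E ≤ 0.65 | 0.35 < E ≤ 0.50  | E ≤ 0.35         |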

In modelling of flows to the Great Barrier Reef (GBR) the following criteria were adopted (Waters, 2014):

  • Daily Nash Sutcliffe Coefficient of Efficiency, E > 0.5
  • Monthly E > 0.8
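Checking both criteria means computing E twice: once on the daily values and once on monthly totals. The sketch below uses an invented 90-day record in which the modelled flows are simply the observed flows lagged by one day, so timing errors penalise the daily statistic far more than the monthly one. Function names and data are hypothetical, and this is not code from Waters (2014).

```python
from collections import defaultdict
from datetime import date, timedelta

def nse(observed, modelled):
    """Nash-Sutcliffe coefficient of efficiency."""
    mean_obs = sum(observed) / len(observed)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    sse = sum((m - o) ** 2 for o, m in zip(observed, modelled))
    return 1.0 - sse / sst

def monthly_totals(dates, values):
    """Sum daily values into (year, month) totals, returned in date order."""
    totals = defaultdict(float)
    for d, v in zip(dates, values):
        totals[(d.year, d.month)] += v
    return [totals[key] for key in sorted(totals)]

def check_flow_criteria(dates, observed, modelled):
    """Return daily E, monthly E, and whether daily E > 0.5 and monthly E > 0.8."""
    e_daily = nse(observed, modelled)
    e_monthly = nse(monthly_totals(dates, observed), monthly_totals(dates, modelled))
    return e_daily, e_monthly, (e_daily > 0.5 and e_monthly > 0.8)

# Hypothetical record: a flow spike every seventh day, scaled by month
start = date(2021, 1, 1)
dates = [start + timedelta(days=i) for i in range(90)]
scale = {1: 1.0, 2: 3.0, 3: 2.0}
observed = [(10.0 if i % 7 == 0 else 1.0) * scale[d.month] for i, d in enumerate(dates)]
modelled = [observed[0]] + observed[:-1]  # observed flows lagged by one day

e_daily, e_monthly, passes = check_flow_criteria(dates, observed, modelled)
print(round(e_daily, 2), round(e_monthly, 2), passes)
```

In this example the monthly criterion is met but the daily one is not, so the model would fail the combined test.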

For GBR constituent modelling, the following ranges were used for the coefficient of efficiency: very good E > 0.75; good 0.65 < E ≤ 0.75; satisfactory 0.50 < E ≤ 0.65; unsatisfactory E ≤ 0.50.

Also note that the Nash-Sutcliffe coefficient has been criticised and there are alternative proposals but it remains frequently used in hydrologic modelling.  See the references below for more information.

There is also an interesting discussion on Cross Validated (Stack Exchange): https://stats.stackexchange.com/questions/414349/is-my-model-any-good-based-on-the-diagnostic-metric-r2-auc-accuracy-e/


Allen, J., P. Somerfield, and F. Gilbert (2007), Quantifying uncertainty in high-resolution coupled hydrodynamic-ecosystem models, J. Mar. Syst., 64(1–4), 3–14, doi:10.1016/j.jmarsys.2006.02.010.

Bardsley, W.E. (2013) A goodness of fit measure related to r2 for model performance assessment.  Hydrological Processes. 27(19):2851-2856. DOI: 10.1002/hyp.9914

Criss RE, Winston WE. 2008. Do Nash values have value? Discussion and alternate proposals. Hydrological Processes 22: 2723–2725.

Gupta HV, Kling H. 2011. On typical range, sensitivity, and normalization of mean squared error and Nash-Sutcliffe efficiency type metrics. Water Resources Research 47: W10601, doi:10.1029/2011WR010962.

Gupta HV, Kling H, Yilmaz KK, Martinez GF. 2009. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology 377:80–91.

Ladson, A. R. (2008) Hydrology: an Australian Introduction.  Oxford University Press.

McCuen RH, Knight Z, Cutter AG. 2006. Evaluation of the Nash–Sutcliffe efficiency index. Journal of Hydrologic Engineering 11:597–602.

Moriasi, D., Gitau, M., Pai, N. and Daggupati, P. (2015) Hydrologic and Water Quality Models: Performance Measures and Evaluation Criteria.  Transactions of the ASABE (American Society of Agricultural and Biological Engineers) 58(6):1763-1785.

Murphy, A. H. (1988) Skill scores based on the mean square error and their relationship to the correlation coefficient.  Monthly Weather Review 116: 2417-2424.

Waters, D. (2014) Modelling reductions of pollutant loads due to improved management practices in the Great Barrier Reef catchments. Whole of GBR, Technical Report. Volume 1. Department of Natural Resources and Mines, Queensland Government.

Willmott, C. J. (1981) On the validation of models.  Physical Geography 2: 184-194.