Model performance based on coefficient of efficiency

The Nash-Sutcliffe coefficient of efficiency, E, is commonly used to assess the performance of rainfall-runoff models.

E = \frac{\sum \left(O_i - \bar{O} \right)^2 - \sum \left( M_i - O_i \right)^2 }{\sum \left(O_i - \bar{O} \right)^2 }

Where O_i are the observed values, M_i are the modelled values and \bar{O} is the mean of the observed values.

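The maximum value of E is 1, which indicates a perfect match between modelled and observed values. A value of zero indicates that the model is only as good as using the mean of the observations. E can be less than zero, in which case the mean of the observations is a better predictor than the model.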
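As a minimal sketch of the formula above (the function name `nash_sutcliffe` and the sample data are made up for illustration), E can be computed in a few lines of Python. The three test cases show the behaviour just described: a perfect model, a model that always returns the mean of the observations, and a model that does worse than the mean.

```python
from statistics import fmean


def nash_sutcliffe(observed, modelled):
    """Return the Nash-Sutcliffe coefficient of efficiency E."""
    if len(observed) != len(modelled):
        raise ValueError("observed and modelled must have the same length")
    mean_obs = fmean(observed)
    # Sum of squared deviations of the observations from their mean
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    if ss_obs == 0:
        raise ValueError("E is undefined when all observations are equal")
    # Sum of squared differences between modelled and observed values
    ss_err = sum((m - o) ** 2 for o, m in zip(observed, modelled))
    return (ss_obs - ss_err) / ss_obs


# Illustrative data only
obs = [1.0, 3.0, 2.0, 5.0, 4.0]
perfect = list(obs)                   # model reproduces every observation
mean_only = [fmean(obs)] * len(obs)   # model always predicts the mean
reversed_model = obs[::-1]            # a deliberately poor model

print(nash_sutcliffe(obs, perfect))         # 1.0
print(nash_sutcliffe(obs, mean_only))       # 0.0
print(nash_sutcliffe(obs, reversed_model))  # -1.6
```

Note that E is undefined if every observation is identical, because the denominator is then zero; the sketch raises an error in that case.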

So what value of E indicates a model is reasonable? Chiew and McMahon (1993) surveyed 93 hydrologists (63 responded) to find out which diagnostic plots and goodness-of-fit statistics they used, which were the most important, and how they were used to classify the quality of a model fit. The most important diagnostic graphs were time series plots and scatter plots of simulated and recorded flows from the data used for calibration. R-squared and the Nash-Sutcliffe model efficiency coefficient (E) were the favoured goodness-of-fit statistics. Results were considered acceptable if E ≥ 0.8.

I adapted the values provided by Chiew and McMahon (1993) to create the following table (Ladson, 2008).

Table 1: Model performance based on coefficient of efficiency

| Classification | Coefficient of efficiency (Calibration) | Coefficient of efficiency (Validation) |
| --- | --- | --- |
| Excellent | E ≥ 0.93 | E ≥ 0.93 |
| Good | 0.8 ≤ E < 0.93 | 0.8 ≤ E < 0.93 |
| Satisfactory | 0.7 ≤ E < 0.8 | 0.6 ≤ E < 0.8 |
| Passable | 0.6 ≤ E < 0.7 | 0.3 ≤ E < 0.6 |
| Poor | E < 0.6 | E < 0.3 |
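The Table 1 ranges can be applied as a simple lookup. The sketch below is illustrative only: the names `TABLE_1` and `classify_table_1` are hypothetical, and the thresholds are copied directly from the table.

```python
# Lower bound of each class from Table 1, listed from best to worst.
# Anything below the last bound is classed as "Poor".
TABLE_1 = {
    "calibration": [(0.93, "Excellent"), (0.8, "Good"),
                    (0.7, "Satisfactory"), (0.6, "Passable")],
    "validation": [(0.93, "Excellent"), (0.8, "Good"),
                   (0.6, "Satisfactory"), (0.3, "Passable")],
}


def classify_table_1(e, period="calibration"):
    """Classify a coefficient of efficiency using the Table 1 ranges."""
    for lower_bound, label in TABLE_1[period]:
        if e >= lower_bound:
            return label
    return "Poor"


print(classify_table_1(0.65, "calibration"))  # Passable
print(classify_table_1(0.65, "validation"))   # Satisfactory
```

The same value of E is rated more generously for validation than for calibration, which reflects the wider validation ranges in the lower classes of the table.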

Others have suggested different ranges. For example, when assessing an ecosystem model of the North Sea, Allen et al. (2007) used the following categorisation: E > 0.65 excellent; 0.5 to 0.65 very good; 0.2 to 0.5 good; and < 0.2 poor. Moriasi et al. (2015) provide the performance evaluation criteria shown in Table 2.

Table 2: Criteria for Nash-Sutcliffe coefficient of efficiency (Moriasi et al., 2015)

| Component | Temporal scale | Very good | Good | Satisfactory | Not satisfactory |
| --- | --- | --- | --- | --- | --- |
| Flow | Daily, monthly, annual | E > 0.80 | 0.70 < E ≤ 0.80 | 0.50 < E ≤ 0.70 | E ≤ 0.50 |
| Sediment | Monthly | E > 0.80 | 0.70 < E ≤ 0.80 | 0.45 < E ≤ 0.70 | E ≤ 0.45 |
| Nitrogen, phosphorus | Monthly | E > 0.65 | 0.50 < E ≤ 0.65 | 0.35 < E ≤ 0.50 | E ≤ 0.35 |
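The Table 2 criteria differ from Table 1 in two ways: the thresholds depend on the component being modelled, and the lower bound of each class is exclusive. A rough sketch of the lookup follows. The names `TABLE_2` and `classify_table_2` are hypothetical, and the function does not check that E was calculated at the temporal scale listed in the table.

```python
# Exclusive lower bounds for "Very good", "Good" and "Satisfactory",
# copied from Table 2. Values at or below the last bound are
# "Not satisfactory".
TABLE_2 = {
    "flow": (0.80, 0.70, 0.50),
    "sediment": (0.80, 0.70, 0.45),
    "nitrogen": (0.65, 0.50, 0.35),
    "phosphorus": (0.65, 0.50, 0.35),
}
LABELS = ("Very good", "Good", "Satisfactory")


def classify_table_2(e, component):
    """Classify E for a given component using the Table 2 ranges."""
    for threshold, label in zip(TABLE_2[component], LABELS):
        if e > threshold:
            return label
    return "Not satisfactory"


print(classify_table_2(0.6, "flow"))      # Satisfactory
print(classify_table_2(0.6, "nitrogen"))  # Good
```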

In modelling of flows to the Great Barrier Reef (GBR) the following criteria were adopted (Waters, 2014):

  • Daily Nash-Sutcliffe coefficient of efficiency, E > 0.5
  • Monthly E > 0.8
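Checking the two flow criteria listed above means calculating E at both the daily and the monthly time step. The sketch below shows one way this might be done. It assumes that monthly values are formed by summing daily flows within each calendar month, which is an assumption for illustration rather than something taken from Waters (2014). The function names and the synthetic data are also hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import fmean


def nash_sutcliffe(observed, modelled):
    """Return the Nash-Sutcliffe coefficient of efficiency E."""
    mean_obs = fmean(observed)
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_err = sum((m - o) ** 2 for o, m in zip(observed, modelled))
    return (ss_obs - ss_err) / ss_obs


def monthly_totals(dates, values):
    """Sum daily values within each calendar month (assumed aggregation)."""
    totals = defaultdict(float)
    for d, v in zip(dates, values):
        totals[(d.year, d.month)] += v
    return [totals[key] for key in sorted(totals)]


def check_flow_criteria(dates, observed, modelled):
    """Return daily E, monthly E, and whether both criteria are met."""
    e_daily = nash_sutcliffe(observed, modelled)
    e_monthly = nash_sutcliffe(monthly_totals(dates, observed),
                               monthly_totals(dates, modelled))
    return e_daily, e_monthly, (e_daily > 0.5 and e_monthly > 0.8)


# Synthetic data: 90 days from 1 January 2021, with observed flow constant
# within each month and a model error that alternates between +8 and -8.
start = date(2021, 1, 1)
dates = [start + timedelta(days=i) for i in range(90)]
obs = [{1: 10.0, 2: 20.0, 3: 30.0}[d.month] for d in dates]
mod = [o + (8.0 if i % 2 == 0 else -8.0) for i, o in enumerate(obs)]

e_daily, e_monthly, passed = check_flow_criteria(dates, obs, mod)
print(round(e_daily, 3), round(e_monthly, 3), passed)
```

In this synthetic example the alternating daily errors largely cancel within each month, so the monthly E is close to 1 while the daily E is below 0.5 and the combined check fails.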

For GBR constituent modelling, the following ranges were used for the coefficient of efficiency: Very good E > 0.75; Good 0.65 < E ≤ 0.75; Satisfactory 0.50 < E ≤ 0.65; Unsatisfactory E ≤ 0.50.

Also note that the Nash-Sutcliffe coefficient has been criticised and alternatives have been proposed, but it remains frequently used in hydrologic modelling. See the references below for more information.

There is also an interesting discussion on Cross Validated (Stack Exchange): https://stats.stackexchange.com/questions/414349/is-my-model-any-good-based-on-the-diagnostic-metric-r2-auc-accuracy-e/

References

Allen, J., P. Somerfield, and F. Gilbert (2007), Quantifying uncertainty in high‐resolution coupled hydrodynamic‐ecosystem models, J. Mar. Syst., 64(1–4), 3–14, doi:10.1016/j.jmarsys.2006.02.010.

Bardsley, W.E. (2013) A goodness of fit measure related to r2 for model performance assessment.  Hydrological Processes. 27(19):2851-2856. DOI: 10.1002/hyp.9914

Criss RE, Winston WE. 2008. Do Nash values have value? Discussion and alternate proposals. Hydrological Processes 22: 2723–2725.

Gupta HV, Kling H. 2011. On typical range, sensitivity, and normalization of mean squared error and Nash-Sutcliffe efficiency type metrics. Water Resources Research 47: W10601, doi:10.1029/2011WR010962.

Gupta HV, Kling H, Yilmaz KK, Martinez GF. 2009. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology 377:80–91.

Ladson, A. R. (2008) Hydrology: an Australian Introduction. Oxford University Press.

McCuen RH, Knight Z, Cutter AG. 2006. Evaluation of the Nash–Sutcliffe efficiency index. Journal of Hydrologic Engineering 11:597–602.

Moriasi, D., Gitau, M., Pai, N. and Daggupati, P. (2015) Hydrologic and Water Quality Models: Performance Measures and Evaluation Criteria. Transactions of the ASABE (American Society of Agricultural and Biological Engineers) 58(6):1763-1785.

Murphy, A. H. (1988) Skill scores based on the mean square error and their relationship to the correlation coefficient.  Monthly Weather Review 116: 2417-2424.

Waters, D. (2014) Modelling reductions of pollutant loads due to improved management practices in the Great Barrier Reef catchments. Whole of GBR, Technical Report. Volume 1. Department of Natural Resources and Mines, Queensland Government.

Willmott, C. J. (1981) On the validation of models.  Physical Geography 2: 184-194.

 
