Converting between EY, AEP and ARI

The latest version of Australian Rainfall and Runoff (ARR2016) proposes new terminology for flood risk (see Book 1, Chapter 2.2.5).  Preferred terminology is provided in Figure 1.2.1 which is reproduced below.

Definitions:

• EY – Number of exceedances per year
• AEP – Annual exceedance probability
• AEP (1 in x) – 1/AEP
• ARI – Average Recurrence Interval (years)

Australian Rainfall and Runoff preferred terminology

For floods rarer than 5% AEP, the relationship between the various frequency descriptors can be approximated by the following straightforward equations.

$\mathrm{EY} = \frac{1}{\mathrm{ARI}}$
$\mathrm{EY} = \mathrm{AEP}$
$\mathrm{AEP(1\; in\; x \;Years)} = \frac{1}{\mathrm{AEP}}$
$\mathrm{ARI} = \mathrm{AEP(1\; in \; x \; Years)}$
$\mathrm{AEP} = \frac{1}{\mathrm{ARI}}$

For common events, more complex equations are required (these will also work for any frequency):

$\mathrm{EY} = \frac{1}{\mathrm{ARI}}$
$\mathrm{AEP(1\; in\; x \;Years)} = \frac{1}{\mathrm{AEP}}$
$\mathrm{AEP(1\; in\; x \;Years)} = \frac{\exp(\mathrm{EY})}{\left( \exp(\mathrm{EY}) - 1 \right)}$
$\mathrm{ARI} = \frac{1}{-\log_e(1-\mathrm{AEP})}$
$\mathrm{AEP} = \frac{\exp(\frac{1}{\mathrm{ARI}}) - 1}{\exp(\frac{1}{\mathrm{ARI}})}$

A key result is that we can’t use the simple relationship ARI = 1/AEP for frequent events.  So, for example, the 50% AEP event is not the same as the 2-year ARI event.
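To see how quickly the simple relationship breaks down, the short R sketch below compares ARI = 1/AEP with the exact expression for a handful of AEPs. The AEP values and variable names are chosen only for illustration.

```r
# Compare the simple approximation ARI = 1/AEP with the exact relationship
aep <- c(0.5, 0.2, 0.1, 0.05, 0.02, 0.01)   # illustrative AEPs (as fractions)

ari.simple <- 1 / aep                 # simple approximation
ari.exact  <- 1 / -log(1 - aep)       # exact: ARI = 1 / -ln(1 - AEP)

# Relative error of the approximation, as a fraction of the exact value
rel.error <- (ari.simple - ari.exact) / ari.exact

data.frame(aep,
           ari.simple,
           ari.exact = round(ari.exact, 2),
           pct.error = round(100 * rel.error, 1))
```

The approximation always overstates the ARI, and the error shrinks as events become rarer, which is consistent with restricting the simple equations to floods rarer than about 5% AEP.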

Example calculations

For an ARI of 5 years, what is the AEP?

$\mathrm{AEP} = \frac{\exp(\frac{1}{\mathrm{5}}) - 1}{\exp(\frac{1}{\mathrm{5}})} = 0.1813$

For an AEP of 50%, what is the ARI?

$\mathrm{ARI} =\frac{1}{-\log_e(1-0.5)} = 1.443$
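The general equations can be wrapped in a few small R functions. This is a minimal sketch written from the equations above, not the code in the gist; the function names are my own.

```r
# Conversions between EY, AEP and ARI that hold for any frequency.
# AEP is expressed as a fraction (0.5, not 50).

AriToEy  <- function(ari) 1 / ari
EyToAep  <- function(ey) 1 - exp(-ey)     # same as (exp(EY) - 1) / exp(EY)
AepToEy  <- function(aep) -log(1 - aep)   # natural log

AriToAep <- function(ari) EyToAep(AriToEy(ari))
AepToAri <- function(aep) 1 / AepToEy(aep)

# The two worked examples
AriToAep(5)     # approximately 0.1813
AepToAri(0.5)   # approximately 1.443
```

Writing everything in terms of EY keeps the functions consistent with each other, so converting ARI to AEP and back returns the starting value.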

R functions and an example calculation are available as a gist.

Flood frequency plots using ggplot

This post provides a recipe for making plots like the one below using ggplot2 in R.  Although it looks simple, there are a few tricky aspects:

• Superscripts in y-axis labels
• Probability scale on x-axis
• Labelling points on the x-axis that are different to the plotted values i.e. we are plotting the normal quantile values but labelling them as percentages
• Adding a title to the legend
• Adding labels to the legend
• Positioning the legend on the plot
• Choosing colours for the lines
• Using commas as a thousand separator.

Code is available as a gist, which also shows how to:

• Enter data using the tribble function, which is convenient for small data sets
• Change the format of data to one observation per row using the tidyr::gather function.
• Use a log scale on the y-axis
• Plot a secondary axis showing the AEP as 1 in X years
• Use the Probit transformation for the AEP values
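The sketch below shows one way several of these pieces can fit together. It is not the code in the gist: the flow values are invented purely for illustration, and the object names (`ffa`, `aep.breaks`, `CommaLabel`) are my own. The plot is only built if ggplot2 is installed.

```r
# AEPs (%) to mark on the x-axis and their positions on a probit scale
aep.breaks <- c(50, 20, 10, 5, 2, 1)
x.breaks <- qnorm(1 - aep.breaks / 100)

# Hypothetical quantile estimates (made-up numbers, illustration only)
ffa <- data.frame(
  aep    = rep(aep.breaks, times = 2),
  flow   = c(120, 300, 480, 700, 1050, 1400,
             100, 280, 500, 780, 1250, 1700),
  method = rep(c("A", "B"), each = length(aep.breaks))
)
ffa$z <- qnorm(1 - ffa$aep / 100)   # plot the normal quantile, label as %

# Thousands separator for axis labels
CommaLabel <- function(v) format(v, big.mark = ",", scientific = FALSE, trim = TRUE)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ffa, aes(x = z, y = flow, colour = method)) +
    geom_line() +
    scale_x_continuous(name = "AEP (%)",
                       breaks = x.breaks, labels = aep.breaks,
                       # secondary axis: the same positions labelled as 1 in X
                       sec.axis = sec_axis(~ ., name = "AEP (1 in X years)",
                                           breaks = x.breaks,
                                           labels = as.character(100 / aep.breaks))) +
    scale_y_log10(name = expression(Discharge ~ (m^3/s)),  # superscript in label
                  labels = CommaLabel) +
    scale_colour_manual(name = "Method",                   # legend title
                        values = c("steelblue", "firebrick"),
                        labels = c("Method A", "Method B")) +
    # legend inside the plot; newer ggplot2 versions prefer legend.position.inside
    theme(legend.position = c(0.2, 0.8))
  # print(p) draws the plot
}
```

The key trick is that the data are plotted at `qnorm(1 - AEP)` while the axis breaks are labelled with the AEP percentages themselves.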

Flood frequency and the rule of 3

There is a ‘rule of three’ in statistics that provides a rapid method for working out the confidence interval for flood occurrence.

From Wikipedia:

If a certain event did not occur in a sample with n subjects, the interval from 0 to 3/n is a 95% confidence interval for the rate of occurrences in the population.

For example, if a levee hasn’t been overtopped since it was built 100 years ago, then it can be concluded with 95% confidence that overtopping will occur in fewer than 1 year in 33 (3/100).  Alternatively the 95% confidence interval for the Annual Exceedance Probability of the flood that would cause overtopping is between 0 and 3/100 (3%).  Of course you may be able to get a better estimate of the confidence interval if you have other data such as a flow record, information on water levels and the height of the levee.

The rule of 3 provides a reasonable estimate for n greater than 30.
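The rule can be checked against the exact bound. If no events occur in n independent years, the 95% upper limit p satisfies (1 - p)^n = 0.05, so p = 1 - 0.05^(1/n). The sketch below compares this with 3/n; the function names are my own and the values of n are arbitrary.

```r
# Rule of three versus the exact 95% upper bound when zero events are observed
RuleOfThree <- function(n) 3 / n
ExactUpper  <- function(n, conf = 0.95) 1 - (1 - conf)^(1 / n)

n <- c(10, 30, 100, 300)   # illustrative record lengths (years)

data.frame(n,
           rule.of.three = RuleOfThree(n),
           exact         = round(ExactUpper(n), 4))
```

The rule works because -ln(0.05) is very close to 3. It is always slightly conservative, and the gap narrows as n increases.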

Protecting assets from flooding: what size flood do I need to consider?

Often assets, such as houses or mines, have some design life. If they are to be built on flood prone land then we need to decide the appropriate level of flood protection so there is an acceptable flood risk during their design life.

Consider an example. This is taken from the wonderful book Statistical methods in Hydrology by C. T. Haan (see p 87 of the second edition).

In order to be 90% sure that a design flood is not exceeded in a 10-year period, what should be the return period of the design flood?

The example is analogous to tossing a biased coin. We are going to toss a coin 10 times, once for each of the 10 years of the design life, and we want to be 90% sure that there will be no heads, assuming heads means floods. What probability, p, should we set for getting a head on a single toss? The return period we are interested in will be 1/p.

The probability of getting no floods during the design period is

(1-p)^10

This needs to equal 90%

0.9 = (1 – p)^10

Therefore p = 1 - 0.9^(1/10) = 0.01048

The return period = 1/p = 95.413

So to be 90% sure of avoiding failure, we need to ensure our asset, with a 10-year design life, is protected against a 96-year flood event.
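The same steps can be written for any design life and confidence level. Using my own notation (not Haan's), let C be the required confidence as a fraction, n the design life in years and T the return period:

$C = (1 - p)^{n}$
$p = 1 - C^{1/n}$
$T = \frac{1}{p} = \frac{1}{1 - C^{1/n}}$

With C = 0.9 and n = 10 this gives T of about 95.4 years, matching the result above.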

What is the probability of getting at least one flood in the 10-year design life if the asset is protected against floods with a 10-year recurrence interval?

1 – (1 – (1/10))^10 = 0.651 = 65%

Setting the design recurrence interval equal to the design life means that it is likely the asset will experience flooding.
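A small R function makes it easy to explore this. It is a sketch of the calculation above; the function name and the return periods tried are my own choices.

```r
# Probability of at least one exceedance during the design life,
# assuming independent years with annual exceedance probability 1/return.period
ProbAtLeastOne <- function(return.period, design.life) {
  1 - (1 - 1 / return.period)^design.life
}

ProbAtLeastOne(10, 10)    # approximately 0.651, as above

# Effect of raising the protection standard for a 10-year design life
ProbAtLeastOne(c(10, 20, 50, 100), 10)
```

The function is vectorised, so a range of return periods or design lives can be passed in a single call.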

Haan provides a graph that shows design return period required as a function of design life to be a given percent confident that the design condition is not exceeded. I’ve reproduced this below.

Design return period required as function of design life to be a given percent confident that the design condition is not exceeded

Using the graph, if a house has a design life of 50 years [1], then to be 95% confident of not being flooded, the house needs to be protected against floods with a return period of about 1000 years (the actual answer is 975 years).  This is a much higher standard than is commonly used.  Current practice is to adopt the 100-year flood as a design event in most situations, which means many assets that are ostensibly protected against floods will nevertheless be flooded during their design life.

Probability of at least one 1% flood in 50 years:

$Pr = 1 - (1 - 0.01)^{50} = 0.395$

So 39.5% of flood prone houses, protected to a 1% standard, will flood during their design life.

R code is below.

# Function to calculate the required return period given:
# design.life in years
# confidence (% probability of not being flooded)

CalcReturnPeriod <- function(design.life, confidence) {
  1 / (1 - (confidence / 100)^(1 / design.life))
}

# 50 year design life
# 95% confident won't be flooded

CalcReturnPeriod(50, 95)
# [1] 975.2864   (years)

# Make the graph

library(RColorBrewer)
library(dplyr)
library(ggplot2)

confidence.seq <- c(30, 40, 50, 60, 70, 75, 80, 90, 95)
design.life.seq <- 10^seq(0, 2, 0.1)

# define line colours (must come after confidence.seq is defined)
my.pal <- brewer.pal(length(confidence.seq), 'Paired')

y.label <- CalcReturnPeriod(1, confidence.seq) # location of labels

# df was not defined in the listing; it is assumed here to hold every
# combination of design life and confidence level
df <- expand.grid(design.life = design.life.seq, confidence = confidence.seq)

df %>%
  mutate(return.period = CalcReturnPeriod(design.life, confidence)) %>%
  ggplot(aes(x = design.life, y = return.period, color = factor(confidence))) +
  geom_line() +
  annotate('text', x = 0.9, y = y.label, label = confidence.seq) +
  scale_x_log10(name = 'Design life', breaks = c(1, 2, 5, 10, 20, 50, 100)) +
  scale_y_log10(name = 'Return period', breaks = c(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000)) +
  theme_bw() +  # stand-in for the pre-defined BwTheme, which is not shown here
  scale_colour_manual(values = my.pal) +
  theme(legend.position = "none")   # remove legend



[1] Australian Rainfall and Runoff (Book 1, Table 1.5.2) lists the effective service life of residential buildings as 40 to 95 years.