# 1% flood: binomial distribution, conditional probabilities

I previously wrote about modelling the occurrence of 1% floods with a binomial distribution. This post extends that analysis to look at conditional probabilities.  Some of the results are counterintuitive, at least to me, in that the risk of multiple 1% floods is larger than I would have guessed.

The probability of a 1% (1 in 100) annual exceedance probability (AEP) flood occurring in any year is 1%.  This can be treated as the probability of a “success”  in the binomial distribution, with the number of trials being the number of years. So the probability of having exactly one 1% flood in 100 years is

${100\choose 1}0.01^{1}\left( 1-0.01\right) ^{99} = 0.37$

In R this can be calculated as dbinom(x = 1, size = 100, prob = 0.01) or in Excel as =BINOM.DIST(1, 100, 0.01, FALSE).

The cumulative distribution function of the binomial distribution is also useful for flood calculations.

What is the probability of two or more 1% floods in 100 years?

R: pbinom(q = 1, size = 100, prob = 0.01, lower.tail = FALSE) = 0.264

Excel: =1 - BINOM.DIST(1,100, 0.01, TRUE) = 0.264

We can check this by calculating the probability of zero or one flood in 100 years and subtracting that value from 1.

1 - (dbinom(x = 1, size = 100, prob = 0.01) + dbinom(x = 0, size = 100, prob = 0.01)) = 0.264

We can also do conditional probability calculations which could be useful for risk assessment scenarios.

What is the probability that exactly two 1% floods occur in 100 years given that at least one occurs?

$\Pr{(X = 2\mid X \ge 1)}$ =
dbinom(x = 2, size = 100, prob = 0.01)/pbinom(q = 0, size = 100, prob = 0.01, lower.tail = FALSE) = 0.292

What is the probability that at least two 1% floods occur in 100 years given that at least one occurs?

$\Pr{(X \ge 2\mid X \ge 1)}$ =
pbinom(q = 1, size = 100, prob = 0.01, lower.tail = FALSE)/pbinom(q = 0, size = 100, prob = 0.01, lower.tail = FALSE) = 0.417
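
Written out, the calculation relies on the fact that two or more floods implies at least one, so the joint event in the numerator reduces to $X \ge 2$ (values shown to four decimal places):

$\Pr(X \ge 2 \mid X \ge 1) = \frac{\Pr(X \ge 2 \cap X \ge 1)}{\Pr(X \ge 1)} = \frac{\Pr(X \ge 2)}{\Pr(X \ge 1)} = \frac{0.2642}{0.6340} \approx 0.417$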

We can also check this by simulation.  The code below generates the number of 1% floods in each of 100,000 100-year sequences.  We can then count the sequences of interest.

set.seed(1969) # use a random number seed so the analysis can be repeated if necessary
floods = rbinom(100000,100, 0.01) # generate the number of 1% floods in each of 100,000, 100-year sequences

floods_subset = floods[floods >= 1] # Subset of sequences that have 1 or more floods

# Proportion of the subset (1 or more floods) that has two or more floods
sum(floods_subset >= 2) / length(floods_subset)
# [1] 0.4167966

# or
sum(floods >= 2) / sum(floods >= 1)
# [1] 0.4167966



A slightly trickier situation is a question like: what is the probability of three or fewer floods in 100 years given that there is more than one?

$\Pr{(X \le 3\mid X > 1)} = \Pr(X \le 3 \cap X > 1 )/\Pr( X > 1)$

floods_subset = floods[floods > 1] # Subset of sequences that have more than one flood

# Proportion of the subset (more than one flood) that has three or fewer floods
sum(floods_subset <= 3) / length(floods_subset)
# [1] 0.9310957

# Or, for the exact value

# (Probability that X = 3 + Probability that X = 2)/(Probability that X > 1)
(dbinom(x = 3, size = 100, prob = 0.01) + dbinom(x = 2, size = 100, prob = 0.01))/ pbinom(q = 1, size = 100, prob = 0.01, lower.tail = FALSE)
#[1] 0.9304641
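
The same exact value can also be obtained from cumulative probabilities alone, since $\Pr(X \le 3 \cap X > 1) = \Pr(X \le 3) - \Pr(X \le 1)$. A minimal sketch of that route (the variable names are my own):

```r
# Pr(X <= 3 | X > 1) using only the binomial CDF
p_le3 <- pbinom(q = 3, size = 100, prob = 0.01) # Pr(X <= 3)
p_le1 <- pbinom(q = 1, size = 100, prob = 0.01) # Pr(X <= 1)

# Numerator: Pr(1 < X <= 3); denominator: Pr(X > 1)
cond_prob <- (p_le3 - p_le1) / (1 - p_le1)
cond_prob
```

This should agree with the dbinom-based calculation above, and it scales better when the range of interest covers many values of X.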



The probability of experiencing at least one 1% flood in 100 years is $1 - (1-0.01)^{100}$ = 0.634.  How many years would we need to wait to have a 99% chance of experiencing a 1% flood?

$0.99 = 1-(1-0.01)^n$

$n=\frac{\log(0.01)}{\log(0.99)} = 458.2$.  Rounding up to the next whole year gives 459.
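
The intermediate steps, written out (any base of logarithm gives the same ratio):

$0.99 = 1-(1-0.01)^n \;\Rightarrow\; 0.99^n = 0.01 \;\Rightarrow\; n\log(0.99) = \log(0.01) \;\Rightarrow\; n = \frac{\log(0.01)}{\log(0.99)} \approx 458.2$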

We can also solve this numerically.  In R the equation is 0.99 = pbinom(q = 0, size = n, prob = 0.01, lower.tail = FALSE), which we solve for n. Using the uniroot function gives n = 459 years (see below).

So all these areas subject to a 1% flood risk will flood eventually, but it may take a while.

f = function(n) {
  n = as.integer(n) # n must be an integer
  0.99 - pbinom(q = 0, size = n, prob = 0.01, lower.tail = FALSE)
}

uniroot(f, lower = 100, upper = 1000)
# \$root
# [1] 458.4999

pbinom(q = 0, size = 459, prob = 0.01, lower.tail = FALSE)
# [1] 0.990079



How many years before there is a >99% chance of experiencing more than one flood? This is one minus (the probability of zero floods + the probability of one flood).

Let the number of years equal n.

$1-\left((1-0.01)^n + n(0.01)(1-0.01)^{n-1}\right) = 0.99$. Solving for n gives 662 years.
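
This equation has no simple closed-form solution for n, so one option is a direct search over whole numbers of years. The sketch below is one way to do that; the function name p_more_than_one and the search range of 1 to 2000 years are arbitrary choices of mine.

```r
# Probability of more than one 1% flood in n years, from the binomial CDF
p_more_than_one <- function(n) {
  pbinom(q = 1, size = n, prob = 0.01, lower.tail = FALSE)
}

# Smallest whole number of years for which that probability reaches 0.99
years <- 1:2000
n_required <- min(years[p_more_than_one(years) >= 0.99])
n_required
```

The result should match the 662 years quoted above. The margin is thin: at 661 years the probability falls just short of 0.99.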

# Assessing the impact of blockage as part of flood modelling

Australian Rainfall and Runoff 2016 provides guidance on assessing the impact of blockage of culverts and bridges as part of flood modelling. Details are in Book 6 Chapter 6.

The need to assess blockage represents a change to hydraulic modelling practice.  In the past, industry-wide guidance was lacking and the effect of blockage was often not considered.    The new guidelines provide a standard procedure but, as yet, there is limited experience in their application.  Part of the process is to complete a blockage assessment form which is linked from ARR.  Unfortunately the link no longer works but the form is on the internet here.

The guidelines are being incorporated into hydraulic modelling software. A recent paper outlines the procedures that have been included in TUFLOW and assesses how they performed in three case studies of recent flood modelling projects (Ollett et al., 2017).

It is likely that future flood modelling briefs will require assessment of blockage, so flood consultants will need to learn about, and be able to apply, the procedures and explain the significance of results to clients.  Some resources are listed below.

Large floating debris collection. Chalmers St Wollongong after the Aug 1998 flood (Forbes, Rigby 1999) (Source: ARR Project 11 Stage 1 report, p. 14)

### Form

Blockage Assessment Form

### References – articles

Ollett, P., Syme, B. and Ryan, P. (2017) Australian Rainfall and Runoff guidance on blockage of hydraulic structures: numerical implementation and three case studies.  Journal of Hydrology (NZ) 56(2) 109-122. (link)

Ollett, P. and Syme, B. (2016) ARR blockage: numerical implementation and three case studies.  37th Hydrology and Water Resources Symposium 2016: Water, Infrastructure and the Environment. Queenstown, NZ. pp. 346-359 (link)

Rigby, E. and Weeks, W. (2015) Evolving an Australian design procedure for structure blockages.  36th Hydrology and Water Resources Symposium: The art and science of water. Hobart, Tas. pp. 154-161. (link)

Suitability of ARR guidelines as an alternative blockage policy for Wollongong. 36th Hydrology and Water Resources Symposium: The art and science of water. Hobart, Tas. pp. 370-377. (link)

### References – ARR2016 project reports

Project 11 (Blockage of Hydraulic Structures) Stage 2 Report (2013) (link at arr-software.org)

# Better flood frequency plots from Flike – II

I’ve previously written about improving the flood frequency plots from Flike.   This is an update to that earlier post.

Using Flike version 5.0.300.0 I’ve fitted a Log Pearson III distribution to the annual series for the Tyers River at Browns (226007) using data from 1963 to 2007.  The graph produced by Flike is shown in Figure 1.  This graph is fine to show the quality of the fit but it would be nice to polish it for incorporation into a report.  A csv file of the flow data input to Flike is available here; the Flike .fld file is here.

Figure 1: Flike plot – flood frequency curve for the Tyers River at Browns (226007) (1963-2007).  Log Pearson III probability model

The data used to create the plot can be downloaded into a csv file by clicking the ‘Save’ button below the plot (the file associated with this graph is available here).  There are three parts to the resulting file:

1. the data points – deviates and gauged values
2. points specifying the expected parameter quantiles and confidence limits
3. points specifying the expected probability quantiles.

These can be read into Excel, or a graphics program, and plotted.   An example is shown in Figure 2.

A key enhancement in this figure, compared to the standard Flike plot, is that the y-axis tick marks are labelled with the flow values rather than logs. The log transformation has been retained, just the labelling has been changed. The x-axis tick mark labels are similar. The deviate values are plotted but are labelled using the ‘1 in Y’ format.

Although it takes some time to construct and label a plot, much of the work can be re-used in future reports. If you are doing a lot of flood frequency analysis, it’s worth setting up a template.

I’ve used the ggplot2 package in R to produce this plot.  Details are available via this gist.

# Flood frequency plots using ggplot

This post provides a recipe for making plots like the one below using ggplot2 in R.  Although it looks simple, there are a few tricky aspects:

• Superscripts in y-axis labels
• Probability scale on x-axis
• Labelling points on the x-axis that are different to the plotted values i.e. we are plotting the normal quantile values but labelling them as percentages
• Adding a title to the legend
• Adding labels to the legend
• Positioning the legend on the plot
• Choosing colours for the lines
• Using commas as a thousand separator.

Code is available as a gist, which also shows how to:

• Enter data using the tribble function, which is convenient for small data sets
• Change the format of data to one observation per row using the tidyr::gather function.
• Use a log scale on the y-axis
• Plot a secondary axis showing the AEP as 1 in X years
• Use the Probit transformation for the AEP values
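
As a rough illustration of several of the items in the lists above, here is a minimal sketch; it is not the code from the gist. The flow values are made up purely for illustration, and the column names (aep, expected, lower, upper), legend labels and colours are my own assumptions. Note that tidyr::gather still works but has been superseded by tidyr::pivot_longer.

```r
library(ggplot2)
library(tibble)
library(tidyr)

# Hypothetical quantiles (m^3/s) for illustration only; not from any real gauge
ffa <- tribble(
  ~aep, ~expected, ~lower, ~upper,
  0.5,    50,   40,    65,
  0.2,   120,   90,   170,
  0.1,   190,  140,   290,
  0.05,  280,  190,   480,
  0.02,  420,  260,   900,
  0.01,  550,  320,  1400
)

# Reshape to one observation per row
ffa_long <- gather(ffa, key = "series", value = "flow", expected, lower, upper)

# Plot standard normal quantiles of the AEP, but label the ticks as percentages
aep_breaks <- c(0.5, 0.2, 0.1, 0.05, 0.02, 0.01)

p <- ggplot(ffa_long, aes(x = qnorm(1 - aep), y = flow, colour = series)) +
  geom_line() +
  scale_x_continuous(
    name = "AEP (%)",
    breaks = qnorm(1 - aep_breaks),
    labels = as.character(aep_breaks * 100),
    # Secondary axis: same tick positions, labelled as 1 in X years
    sec.axis = dup_axis(
      name = "AEP (1 in X years)",
      labels = as.character(round(1 / aep_breaks))
    )
  ) +
  scale_y_log10(
    name = expression(Discharge ~ (m^3 ~ s^-1)), # superscripts via plotmath
    labels = scales::comma                       # commas as thousands separator
  ) +
  scale_colour_manual(
    name = "Quantile",
    values = c(expected = "black", lower = "blue", upper = "red"),
    labels = c(expected = "Expected", lower = "Lower limit", upper = "Upper limit")
  ) +
  theme_bw() +
  theme(legend.position = "bottom")

p
```

The key idea is that the x aesthetic holds the normal quantile, while breaks and labels are supplied separately so the axis reads as a probability scale.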


# Climate change and flood investigations

One surprising finding from the review of the state of hydrologic practice in Victoria is that climate change impacts on flooding are not being widely considered. Only half the studies reviewed (10 of 20) mention climate change.  Similar findings are reported in other work that shows some Victorian flood managers are not keeping up with their national and international colleagues in considering the additional flood risk predicted with a change in climate.

There is already evidence that rainfall intensity for short duration storms is increasing, which could lead to more frequent and larger flash floods.  This is a particular issue in towns and cities because small urban catchments are especially vulnerable.

In the corporate world, consideration of climate change is being taken seriously.   The recent Hutley opinion found that many climate change risks “would be regarded by a Court as being foreseeable at the present time” and that Australian company directors “who fail to consider ‘climate change risks’ now, could be found liable for breaching their duty of care and diligence in the future”.

The Task Force on Climate Related Financial Disclosures (TCFD), chaired by Michael Bloomberg, has recently released recommendations on how companies should report on climate change risks.  This includes the need to report on risks of “Increased severity of extreme weather events such as cyclones and floods” and “Changes in precipitation patterns and extreme weather variability”.

In the Australian flood scene, the latest Handbook 7, Managing the floodplain: a guide to best practice in flood risk management in Australia, provides advice on assessing and reporting on climate change risk.  But the accompanying project brief template and guide describe climate change aspects of a flood investigation as optional.  The latest version of Australian Rainfall and Runoff provides recommended approaches to assessing climate change impacts on flooding, but recent research argues these methods are too conservative.

On a positive note for Victoria, the Floodplain Management Strategy does encourage consideration of climate change (Policy 9A):

Flood studies prepared with government financial assistance will consider a range of floods of different probabilities, and the rarer flood events will be used to help determine the location’s sensitivity to climate change. Further climate change scenarios may be considered where this sensitivity is significant.

Figure 1: Flooding in Creswick 4 Aug 2010 (link to source)

Flood investigations lead on to decisions about land use zoning and design of mitigation works.  Are climate change risks to these measures foreseeable at the present time?  If so, then they should be considered and reported on.

Clearly this is an area where knowledge and ideas are changing rapidly. Practising hydrologists need to keep up with latest methods, and managers and boards of floodplain management authorities need to be aware of the latest thinking on governance, risk management, and disclosure.

# ARR update from the FMA conference

There were several papers related to Australian Rainfall and Runoff at the FMA conference last week.  Once the papers become available on the FMA website, it would be worth checking at least these three:

• What Do Floodplain Managers Do Now That Australian Rainfall and Runoff Has Been Released? – Monique Retallick, WMAwater.
• Australian Rainfall and Runoff: Case Study on Applying the New Guidelines – Isabelle Testoni, WMAwater.
• Impact of Ensemble and Joint Probability Techniques on Design Flood Levels – David Stephens, Hydrology and Risk Consulting.

There was also a workshop session where software vendors and maintainers discussed how they were updating their products to become compliant with the new ARR.

A few highlights:

1. The ARR team are working on a single temporal pattern that can be used with hydrologic models to get a preliminary and rapid assessment of flood magnitudes for a given frequency. This means an ensemble or Monte Carlo approach won’t be necessary in all cases but is recommended for all but very approximate flood estimates.

2. The main software vendors presented on their efforts to incorporate ARR2016 data and procedures into models. This included RORB, URBS, WBNM and RAFTS. Drains has also included functionality. All the models use similar approaches but speakers acknowledged further changes were likely as we learn more about the implications of ARR2016. The modelling of spatial rainfall patterns did not seem well advanced, as most programs only accept a single pattern and so don’t allow for the influence of AEP and duration.

3. WMAwater have developed a guide on how to use ARR2016 for flood studies. This has been done for the NSW Office of Environment and Heritage (OEH) and looks to be very useful as it includes several case studies. The guide is not yet publicly available but will be provided to the NFRAG committee so may be released.

4. Hydrologists need to take care when selecting the hydrograph, from the ensemble of hydrographs, to use for hydraulic modelling. A peaked, low-volume hydrograph may end up being attenuated by hydraulic routing. We need to look at the peaks of the ensemble of hydrographs as well as their volumes. The selection of a single design hydrograph from an ensemble of hydrographs was seen as an area requiring further research.

5. Critical duration – The identification of a single critical duration is often much less obvious now we are using ensemble rainfall patterns. It seems that many durations produce similar flood magnitudes. The implications of this are not yet clear. Perhaps if the peaks are similar, we should consider hydrographs with more volume as they will be subject to less attenuation from further routing.

6. There was lots of discussion around whether we should use the mean or median of an ensemble of events.  The take-away message was that, in general, we should be using the median of inputs and the mean of outputs.

7. When determining the flood risk at many points in a large catchment, different points will have different critical durations. There was talk of “enveloping” the results. This is likely to be an envelope of means rather than extremes.

8. The probabilistic rational method, previously used for rural flood estimates in ungauged catchments, is no longer supported. The RFFE is now recommended.

9. The urban rational method will only be recommended for small catchments such as a “two lot subdivision”.

10. There was no update on when a complete draft of ARR Book 9 would be released.

11. Losses should be based on local data if there is any available. This includes estimating losses by calibration to a flood frequency curve. Only use data hub losses if there is no better information. In one case study that was presented, the initial loss was taken from the data hub and the continuing loss was determined by calibration to a flood frequency curve.

12. NSW will not be adopting the ARR2016 approach to the interaction of coastal and riverine flooding. Apparently their current approaches are better and have an allowance for entrance conditions that are not embedded in the ARR approach.

13. NSW will not be using ARR approaches to estimate the impacts of climate change on flooding. Instead they will use NARCLIM.

14. NSW have mapped the difference between the 1987 IFD and the 2016 IFD rainfalls and use this to assist in setting priorities for undertaking flood studies.

15. A case study was presented for a highly urbanized catchment in Woolloomooloo. There was quite an involved procedure to determine the critical duration for all points in the catchment and the temporal patterns that led to the critical cases. Results using all 10 patterns were mapped, gridded and averaged. I didn’t fully understand the approach as presented but there may be more information in the published version of Isabelle Testoni’s paper once it becomes available.

There is still much to learn about the new Australian Rainfall and Runoff and much to be decided.  The papers at the FMA conference were a big help in understanding how people are interpreting and responding to the new guideline.

# Flood frequency and the rule of 3

There is a ‘rule of three’ in statistics that provides a rapid method for working out the confidence interval for flood occurrence.

From Wikipedia:

If a certain event did not occur in a sample with n subjects, the interval from 0 to 3/n is a 95% confidence interval for the rate of occurrences in the population.

For example, if a levee hasn’t been overtopped since it was built 100 years ago, then it can be concluded with 95% confidence that overtopping will occur in fewer than 1 year in 33 (3/100).  Alternatively the 95% confidence interval for the Annual Exceedance Probability of the flood that would cause overtopping is between 0 and 3/100 (3%).  Of course you may be able to get a better estimate of the confidence interval if you have other data such as a flow record, information on water levels and the height of the levee.

The rule of 3 provides a reasonable estimate for n greater than 30.
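
The rule comes from asking which annual probability p would make n event-free years just a 5% outcome: setting $(1-p)^n = 0.05$ gives $p = 1 - 0.05^{1/n} \approx -\ln(0.05)/n \approx 3/n$, since $-\ln(0.05) \approx 2.996$. A small sketch comparing the exact upper limit with the 3/n approximation (the sample sizes and variable names are arbitrary choices of mine):

```r
# Years of record with no occurrence of the event
n <- c(10, 30, 100, 1000)

# Exact 95% upper limit on the annual probability: solve (1 - p)^n = 0.05
exact_upper <- 1 - 0.05^(1 / n)

# Rule-of-three approximation
rule_of_3 <- 3 / n

data.frame(n, exact_upper, rule_of_3)
```

For the 100-year levee example, the exact limit is a little under 3%, and the approximation always sits slightly above the exact value.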