Author Archives: tonyladson

About tonyladson

I work as a hydrologist. I'm a director of a specialist consulting company, Moroka Pty Ltd and have teaching and research appointments at Monash University, the University of Melbourne and Victorian University

Red Books

I wrote a previous post about the on-line availability of the ‘Blue Books’, a key resource for Victorian Hydrologists which contain stream gauge information to 1987.   Now the ‘Red Books’ are also available, which include stream gauge information to 1982.  There are some additional gauges in the Red Books so its good to check both sources if you are looking for stream data.

The four volumes can be accessed via these links:

Red Books

Australian Rainfall and Runoff Workshops

A brief post to highlight the three ARR2016 workshops coming up in Melbourne:

As I learn things about ARR2016 I’m adding pages here.

R Packages for Hydrology

The Environmetrics Task View contains information about using R to analyse ecological and environmental data.  Importantly it includes a list of packages useful for hydrology.

I’ve extracted part of the task view here but please also check the original.  Acknowledgements to Gavin Simpson.

  • Package HydroMe estimates the parameters in infiltration and water retention models by curve-fitting method.
  • hydroTSM is a package for management, analysis, interpolation and plotting of time series used in hydrology and related environmental sciences.
  • hydroGOF is a package implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological/environmental models. Related packages are tiger, which allows temporally resolved groups of typical differences (errors) between two time series to be determined and visualized, and qualV which provides quantitative and qualitative criteria to compare models with data and to measure similarity of patterns
  • hydroPSO is a model-independent global optimization tool for calibration of environmental and other real-world models that need to be executed from the system console.hydroPSO implements a state-of-the-art PSO (SPSO-2011 and SPSO-2007 capable), with several fine-tuning options. The package is parallel-capable, to alleviate the computational burden of complex models.
  • EcoHydRology provides a flexible foundation for scientists, engineers, and policy makers to base teaching exercises as well as for more applied use to model complex eco-hydrological interactions.
  • topmodel is a set of hydrological functions including an R implementation of the hydrological model TOPMODEL, which is based on the 1995 FORTRAN version by Keith Beven. New functionality is being developed as part of the RHydro package on R-Forge.
  • dynatopmodel is a native R implementation and enhancement of the Dynamic TOPMODEL, Beven and Freers’ (2001) extension to the semi-distributed hydrological model TOPMODEL (Beven and Kirkby, 1979).
  • wasim provides tools for data processing and visualisation of results of the hydrological model WASIM-ETH
  • The nsRFA package provides collection of statistical tools for objective (non-supervised) applications of the Regional Frequency Analysis methods in hydrology.
  • The boussinesq package is a collection of functions implementing the one-dimensional Boussinesq Equation (ground-water).
  • rtop is a package for geostatistical interpolation of data with irregular spatial support such as runoff related data or data from administrative units.


Seasonal gradient in flow across Melbourne

There is a seasonal gradient in flow across Melbourne. Out in the rural east, the rivers flow as you would expect; high flows in winter (maximum in August) with dry conditions in January, February and March.  As we track west, the seasonal signal is lost and we get similar flows in all months.

Lets look at monthly discharges in seven streams from the Bass River, near Phillip Island, around to the Werribee River.


Flow in the Bass River at Glen Forbes South is strongly seasonal.  The graph below shows the flow volume in each month. The horizontal lines are the monthly averages with vertical lines representing volumes for each month as recorded in the years 2000 to 2015.


As we track west, winter flows become less pronounced; see the discharges for the Bunyip River below.


There is a small seasonal signal in Eumemmerring Creek (near Dandenong), but in the streams in urbanised catchments (from east to west – Gardiners, Merri and Kororoit) there is no evidence of seasonality.   For these streams, the high flow months are November and February.  This is partly driven by some unusually large events, for example, the flood of Feb 2005 stands out on these figures.

For the highly regulated Werribee River, average monthly volumes are higher in summer and spring.  The small number of large vertical spikes show this pattern is influenced by a few rare events.






Its difficult to tease out the cause of the differences in seasonality as these streams are affected by gradients in urbanisation, climate, and regulation.  Certainly, urbanisation has a strong influence. The impervious surfaces in urban areas will produce runoff all year round; they aren’t influenced by the seasonal wetting and drying that occurs in the rural catchments.

And the consequences?  We know that fish, for example, respond to seasonal flow changes to migrate and spawn.  But changes in seasonality are probably only a small influence in comparison with the large number of factors affecting the ecology of these streams.

Log-normal flood frequency analysis

Although the log-normal distribution is not usually used for flood frequency estimation in Australia it can be useful to compare results with more common approaches.

The general equation for flood quantile estimation (from Chow, 1951) is:

x_t = \mu+K_t\sigma      (1)

Where K_t is the frequency factor that depends on the distribution of the flood data and the probability of interest, \mu and \sigma are the mean and standard deviation.

Frequency factors

The classical frequency factor for the log-normal distribution is:

K_t = \left( \frac{1}{C_v} \right) \left[ \exp \left\{ Z\left( \log \left(1 + C_v^2 \right) \right)^\frac{1}{2} - 0.5\left( \log \left( 1 + C_v^2 \right) \right) \right\} -1\right]      (2)

Where, Z is a standard normal deviate and C_v is the coefficient of variation = \frac{\mu}{\sigma}.  The standard normal deviate can be calculated, for example, using the qnorm function in R.  So for the 100-year flood, exceedance probability = 1/100, z = 2.326.

An R function to calculate this log-normal frequency factor is:

# Frequency factor for the log-normal distribution
# m - mean
# s - standard deviation
# p - exceedance probability

FF_LogNormal <- function(m, s, p) {
cv <- s/m
z <- qnorm(1 - p)
(1/cv) * (exp(sqrt(log(1 + cv^2)) * z - 0.5 * log(1 + cv^2)) - 1)


Example 1

Kite (2004) provides an example we can use for testing.

Annual maximum discharges of the Saint John River at Fort Kent, New Brunswick for the 37 years from 1927 to 1963 have a mean of 81,000 cfs and a standard deviation of 22,800. Estimate the 100-year flood.

As, calculated below, Q100 = 148,000 cfs

m <- 81000
s <- 22800
p <- 0.01

Kt <- FF_LogNormal(81000, 22800, p) #2.943

(Q100 <- m + Kt * s)
#[1] 148221.3

If flood data are log-normally distributed, that means the logarithms of the flood data are normally distributed.  This suggests there are two ways of calculating the flood quantiles.  We can use the data as-is along with the log-normal frequency factor (as in the above example), or take the logs of the data and use the frequency factor from the normal distribution.

Example 2

Continuing with the example from Kite.  The mean and standard deviation of the logarithms of the 37 events from the Saint John River are 11.263 and 0.284 (in cfs units).

The frequency factor for the 100-year flood, based on the normal distribution, will be 2.326 so the 100-year flood estimate is 150,800 cfs.  Similar to the previous example but not exactly the same.

m.log <- 11.263
s.log <- 0.284

Kt <- qnorm(1 - 0.01) #2.326348

(Q100 <- exp(m.log + Kt*s.log))
#[1] 150795.9

Example 3

Lets repeat this analysis using some Australian data.  An annual flood series for the Hunter River at Singleton is available here.

The two estimates for the 100-year flood are 10,500 cumec using data as-is and 13,800 cumec when calculating in the log domain.

# Hunter River at Singleton
my.url <- ''

singleton <- source_data(my.url)

m <- mean(singleton$`Peak (cumec)`) # 1401.7
s <- sd(singleton$`Peak (cumec)`)   # 2312.9
Kt <- FF_LogNormal(m, s, 0.01) # 3.917
(Q100 <- m + Kt * s)
# 10460.5

m.log <- mean(log(singleton$`Peak (cumec)`)) # 6.425
s.log <- sd(log(singleton$`Peak (cumec)`))   # 1.336
Kt <- qnorm(1 - 0.01) # 2.326
(Q100 <- exp(m.log + Kt*s.log))
# 13818.9

Bayesian frequency factor

Kuczera (1999) derives a frequency factor for the log-normal distribution which takes account of the uncertainty in the parameters – the mean and standard deviation.  These parameters must be estimated from the flood data.  Assuming nothing is known about these parameters other that what we learn from the flood measurements (i.e. the Bayesian prior is noninformative), the frequency factor is:

K_t = t_{n-1} \times \sqrt{1 + \frac{1}{n}}

Using this frequency factor will result in a more conservative estimate of flood quantiles. As the sample size increases the the Bayesian frequency factor will approach the frequency factor based on the normal distribution.

An R function to calculate this Bayesian log-normal frequency factor is:

# Frequency factor for the log-normal distribution
# n - number of data points
# p - exceedance probability

FF_LogNormal_Bayes <- function(n, p) {

qt( 1-p, n-1) * sqrt(1 + 1/n)


Example 4

Calculate the Bayesian estimate for the 100-year flood quantile using the data for the Hunter River at Singleton.

Calculations are below.  The estimate is 17,350 cumecs.

my.url <- ''

singleton <- source_data(my.url)

m.log <- mean(log(singleton$`Peak (cumec)`)) # 6.425
s.log <- sd(log(singleton$`Peak (cumec)`)) # 1.336
Kt <- FF_LogNormal_Bayes(nrow(singleton), 0.01) # 2.4966

(Q100 <- exp(m.log + Kt*s.log))
# 17348


So we have three flood quantile estimators for log-normally distributed data:

  1. Take logs and use the frequency factor from the normal distribution
  2. Use data as-as and calculate the frequency factor using equation 2.
  3. Take logs and use the Bayesian frequency factor.

If the data set is large, all the results are the same.

my.flood <- rlnorm(1e6, 6, 1)  # generate 1 flood values from a log-normal distribution.

# True value

exp(6 + 1*qnorm(1 - 0.01)) # 4131.302

# Direct calculation of quantile
quantile(my.flood, probs = 0.99, type = 8 ) # 4114.213

# Taking logs and using normal distribution frequency factor
Q100.log <- mean(log(my.flood)) + sd(log(my.flood))*qnorm(1 - 0.01)
exp(Q100.log) # 4130.316

# Use frequency factor from equation 2
Q100 <- mean(my.flood) + sd(my.flood) * FF_LogNormal(mean(my.flood), sd(my.flood), 0.01)
Q100 # 4127.146

# Bayes frequency factor
Q100.Bayes <- mean(log(my.flood)) + sd(log(my.flood))*FF_LogNormal_Bayes(length(my.flood), 0.01)
exp(Q100.Bayes) #4130.336

For small datasets, the estimates vary substantially.

First, let’s make a function that simulations 30 years of log-normal flood peaks and calculates the 100-year quantiles.

Test_f <- function(i, p) {

# i = dummy variable so the function can be included in a loop
# p = exceedance probability

my.flood <- rlnorm(30, 6, 1) # Generate 30 years of flood data
Q100.normal <- mean(log(my.flood)) + sd(log(my.flood))*qnorm(1 - p)
Q100.eq2 <- mean(my.flood) + sd(my.flood) * FF_LogNormal(mean(my.flood), sd(my.flood), p)
Q100.Bayes <- mean(log(my.flood)) + sd(log(my.flood))*FF_LogNormal_Bayes(length(my.flood), p)
data.frame(Q100.normal = exp(Q100.normal), Q100.eq2 = Q100.eq2, Q100.Bayes = exp(Q100.Bayes))


Now, we’ll call this multiple times and compare the average quantiles for the three methods with the true quantile.

out <- lapply(1:10000, Test_f, p = 0.01)
out.df <-, out)

Q100.normal Q100.eq2 Q100.Bayes
4334.727 3678.353 5204.641

# True quantile 4131

So, for for this data set, on average, the quantile based on taking the logs and using the frequency factor from the normal distribution is about right.  The quantile based on equation 2 is too low, and the quantile using the Bayesian frequency factor is too high.

In another post, we’ll look at quantifying the uncertainty in quantile estimates.


Chow, V. T. (1951) A general formula for hydrologic frequency analysis.  Transactions, American Geophysical Union 32:231-237.

Kite, G. W. (2004) Frequency and risk analysis in hydrology.  Water Resources Publications, LLC.

Kuczera (1999) Comprehensive at-site flood frequency analysis using Monte Carlo Bayesian inference.  Water Resources Research 35(5): 1551-1557.

Graphing a water balance

The water balance for a urban catchment equates the change in storage during a certain period, with the difference between water inputs (precipitation and mains water) and water outputs (evaporation, stormwater runoff and wastewater discharge).

\Delta s = (P+I) - (E_a + R_s + R_w)

\Delta s change in catchment storage
P precipitation
I  imported water
E_a actual evaportranspiration
R_s stormwater runoff
R_w wastewater discharge

Mitchell et al., (2003) provides data on the water balance for Curtin, ACT for 1979 to 1996.  The water balance for the average, wettest and driest years are shown in the table below.


When presenting financial statements, a common approach is to use a waterfall chart which shows how the components of a financial balance contribute to an overall result.  Here I’ve used a waterfall chart to show the water balance for Curtin for the driest and wettest year as reported by Mitchell et al., (2003).



Figure 1: Water balance for Curtin, ACT in (A) and driest and (B) the wettest years as estimated by Mitchell et al., (2003).

Does this approach to visualising a water balance help understanding?  A few things stand out:

  • In the driest year, more water was input from the mains than from rainfall
  • In the driest year, actual evapotranspiration was larger than rainfall and mains inputs.
  • Evapotranspiration and stormwater change with climate, with large variation between the wet and dry years.  Wastewater doesn’t change all that much.
  • Precipitation is highly variable, ranging from 247 mm to 914 mm.

There is a guide to making a waterfall chart in excel here.  The R code to produce the graphs shown in this blog is available as a gist, which draws on this blog.


Mitchell, V. G., T. A. McMahon and R. G. Mein (2003) Components of the Total Water Balance of an Urban Catchment. Environmental Management 32(6): 735-746. (link)

Munging rating tables

The Victorian water monitoring site includes rating tables for stream gauges but they are in a format that is not easy to work with.   An example is shown in Figure 1 below.

Rating table

Figure 1: Extract of rating table

The following steps can be used to extract and convert the data into a useable format.

1. Download and save rating table.  Click the button shown to get the rating table as a text file.


Figure 2: Save the rating table

2. Re-format the data to create columns of levels and flows.  You’ll need to use your favourite tool for this munging step.  An example using R is available as a gist.

3. Plot and compare with the online version


Figure 2: Rating plot (source:


Figure 3: Rating plot using data from rating table

4. Save as a csv file for further use.

R code is available here.

Related posts: