Scraping the RFFE

[Edit 23 Nov 2018: The RFFE site now has some Cloudflare protection, which means the simple postForm call in this code will no longer work. Scraping is still possible but will require a bit more code. (Thanks Alistair for the tip)]

[Edit 17 Oct 2016: The RFFE is back]

[Edit 10 Oct 2016: The RFFE web app has disappeared. The domain is being changed, which has caused the temporary loss of some pages and links]

The Australian Regional Flood Frequency Estimation (RFFE) model is available as a web app. It is a wonderful tool for practising hydrologists and a major output of the Australian Rainfall and Runoff revision project.

It is possible to specify a catchment anywhere in Australia and the RFFE will provide a flood estimate. Details of the method are in ARR (Book 2).

The RFFE has a nice point-and-click interface, but it’s also easy to scrape results. Using RCurl, results can be returned with the ‘postForm’ function by specifying six parameters for a catchment of interest.

  • Catchment name (catchment_name)
  • Latitude of the catchment outlet (lato)
  • Longitude of the catchment outlet (lono)
  • Latitude of the catchment centroid (latc)
  • Longitude of the catchment centroid (lonc)
  • Catchment area (area)
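When running many catchments it can help to check the six parameters before they are sent. The following is only a sketch in base R: `rffe_params` is a hypothetical helper (it is not part of RCurl or the RFFE), and the range checks are my own assumptions about what counts as sensible input.

```r
# Hypothetical helper: validate the six RFFE parameters and return them
# as a named list of strings, the form that a form post expects.
rffe_params <- function(catchment_name, lato, lono, latc, lonc, area) {
  stopifnot(is.character(catchment_name), length(catchment_name) == 1)
  coords <- c(lato = lato, lono = lono, latc = latc, lonc = lonc)
  stopifnot(is.numeric(coords), !anyNA(coords))
  stopifnot(abs(coords[c("lato", "latc")]) <= 90)    # latitudes
  stopifnot(abs(coords[c("lono", "lonc")]) <= 180)   # longitudes
  stopifnot(is.numeric(area), length(area) == 1, area > 0)
  c(list(catchment_name = catchment_name),
    lapply(coords, as.character),
    list(area = as.character(area)))
}

params <- rffe_params("test1", -37, 148, -37.2, 148.2, 100)
str(params)
```

RCurl's postForm accepts a list like this through its `.params` argument, so the checked values can be passed in one piece instead of being typed out in each call.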

The basic code is as follows.  I’ve used the latitude and longitude of an arbitrary location in Gippsland, Victoria (link to site at Google Maps).  All code is available as a gist.

library(RCurl)    # postForm
library(stringr)  # string functions used below
library(dplyr)    # %>% and select
library(reshape2)

# The RFFE address is missing from the call below; supply it as the first argument
rffe.raw <- postForm("",
  catchment_name = "test1",
  lato = "-37",
  lono = "148",
  latc = "-37.2",
  lonc = "148.2",
  area = "100")

This returns a rather unwieldy file, but there are two parts tacked on the end that have the useful data. Each sits between the characters ‘[{’ and ‘}]’. We can separate out the required parts as follows.

rffe.txt <- as.character(rffe.raw) # convert to text

# separate out parts with useful information, everything between [{ }]
# There are two separate parts
# 1. Results from gauges in the region of influence
# 2. Results at the chosen location
# grab the parts using regex to get all the text between
# [{ }]

x <- stringr::str_match_all(rffe.txt, '\\[\\{.*\\}\\]')
gauges.ffa.JSON <- x[[1]][1, ] # information from surrounding gauges
RFFE.res <- x[[1]][2, ] # RFFE results
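To see what the pattern does without calling the RFFE, here is a small sketch on a made-up string. It uses base R (regmatches and gregexpr) rather than stringr so that it runs without extra packages. The text in `mock.txt` is invented for illustration and is not real RFFE output. The mock puts each part on its own line: the greedy `.*` does not cross a newline here, but on a single line it would run from the first `[{` to the last `}]`.

```r
# Invented text standing in for the returned page (not real RFFE output)
mock.txt <- paste0(
  'header text [{"station_id": "A1", "area": 10}] more text\n',
  "more text [{'aep': 50, 'flow': 1}] footer"
)

# Same pattern as in the post; perl = TRUE so that '.' stops at a newline
parts <- regmatches(mock.txt,
                    gregexpr("\\[\\{.*\\}\\]", mock.txt, perl = TRUE))[[1]]

gauges.part <- parts[1] # first match: the gauge information
result.part <- parts[2] # second match: the results at the chosen location
print(parts)
```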

The first of these (with the information on surrounding gauges) is a JSON file and we can extract data and build a data frame using the fromJSON function in the jsonlite package.

# Convert to data frame
gauges.ffa.df <- jsonlite::fromJSON(gauges.ffa.JSON, flatten = TRUE)

# Build a data frame of the pieces needed with information on neighbouring
# gauges
gauges.ffa <- gauges.ffa.df %>%
  select(station_id, area,
         record.length = sflength,
         latc, lonc, lato, lono,
         bcf, i2_6h, i50_6h,
         shape.factor = sf,
         flow_1pc = q1,
         flow_2pc = q2,
         flow_5pc = q5,
         flow_10pc = q10,
         flow_20pc = q20,
         flow_50pc = q50,
         flow_1pc_LCL = lower1,
         flow_2pc_LCL = lower2,
         flow_5pc_LCL = lower5,
         flow_10pc_LCL = lower10,
         flow_20pc_LCL = lower20,
         flow_50pc_LCL = lower50,
         flow_1pc_UCL = upper1,
         flow_2pc_UCL = upper2,
         flow_5pc_UCL = upper5,
         flow_10pc_UCL = upper10,
         flow_20pc_UCL = upper20,
         flow_50pc_UCL = upper50,
         statistics.mean = statistics.mean)

# add the correlations, need to extract from nested data frame
stat.correlation <- t(sapply(gauges.ffa.df$statistics.correlations, unlist))
stat.correlation <- as.data.frame(stat.correlation)
stat.correlation <- stat.correlation[, c(2, 4, 5)] # don't need to have the columns with all '1'
names(stat.correlation) <- c('cor_mean_sd', 'cor_mean_skew', 'cor_sd_skew')

# final data frame
gauges.ffa <- cbind(gauges.ffa, stat.correlation) # add to gauges.ffa

For our example location (-37, 148), a few columns of the gauges.ffa data frame are as shown in Table 1.  These are the details of the neighbouring gauges to the location of interest.  For further information on neighbouring gauges used in the RFFE see this post.


Table 1:  Details of neighbouring gauges

Now we need to deal with the second part, the results for the location of interest.  This was saved as RFFE.res in the script above.  Some string processing can turn this into a data frame.

RFFE.res <- str_replace_all(RFFE.res, "[{\\[\\]}]", "") # remove braces and square brackets
RFFE.res <- str_replace_all(RFFE.res, c(":" = "", # remove characters that are not required
"'"= "",
"aep" = "",
"lower_limit"= "",
"upper_limit" = "",
"flow" = "",
"\\s+" = "")) # remove spaces
RFFE.res <- unlist(str_split(RFFE.res, ',')) # split at commas
RFFE.res <- as.data.frame(matrix(as.numeric(RFFE.res), ncol = 4, byrow = TRUE)) # change type to numeric and convert to a data frame
names(RFFE.res) <- c('ARI', 'upper_limit', 'lower_limit', 'flow') # name columns
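The same clean-up can be tried on a made-up string. The sketch below is a base R alternative, not the method used above: instead of removing each unwanted label, it keeps only digits, decimal points, minus signs and commas. The layout of `mock.res` (key names and their order) is an assumption for illustration; the numbers are the first two rows of the results shown in this post.

```r
# Assumed layout of the results string; the key order is a guess
mock.res <- paste0(
  "[{'aep': 2, 'upper_limit': 27.9, 'lower_limit': 5.25, 'flow': 12.1}, ",
  "{'aep': 5, 'upper_limit': 56.8, 'lower_limit': 11.5, 'flow': 25.5}]"
)

# Drop everything except digits, '.', ',' and '-'
nums <- gsub("[^0-9.,-]", "", mock.res)

# Split at commas, convert to numeric and fill a 4-column table row by row
vals <- as.numeric(strsplit(nums, ",", fixed = TRUE)[[1]])
res <- as.data.frame(matrix(vals, ncol = 4, byrow = TRUE))
names(res) <- c("ARI", "upper_limit", "lower_limit", "flow")
print(res)
```

This shortcut would break if a label ever contained a digit or a minus sign, so removing the labels by name, as in the code above, is the safer choice when the format is not fully known.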

We end up with the results that are returned by the RFFE model.

ARI   upper_limit   lower_limit   flow
  2          27.9          5.25   12.1
  5          56.8          11.5   25.5
 10          87.1          16.6   38.0
 20           128          22.2   53.0
 50           201          30.4   77.4
100           275          37.3   99.8

Now plot:


Figure 1: Plot of flood frequency results from RFFE
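A plot along these lines can be sketched with base R graphics. This is not the code behind Figure 1; it simply retypes the values from the results table above and draws the estimate with its upper and lower limits on log axes. The flow units are not stated in the output, so the axis label leaves them out.

```r
# Values copied from the results table above
rffe <- data.frame(
  ARI         = c(2, 5, 10, 20, 50, 100),
  upper_limit = c(27.9, 56.8, 87.1, 128, 201, 275),
  lower_limit = c(5.25, 11.5, 16.6, 22.2, 30.4, 37.3),
  flow        = c(12.1, 25.5, 38.0, 53.0, 77.4, 99.8)
)

# Estimate as points and a solid line, limits as dashed lines, log-log axes
plot(rffe$ARI, rffe$flow, log = "xy", type = "b", pch = 19,
     ylim = range(rffe$lower_limit, rffe$upper_limit),
     xlab = "ARI (years)", ylab = "Flow",
     main = "RFFE flood frequency estimate")
lines(rffe$ARI, rffe$upper_limit, lty = 2)
lines(rffe$ARI, rffe$lower_limit, lty = 2)
legend("topleft", legend = c("Estimate", "Upper and lower limits"),
       lty = c(1, 2), pch = c(19, NA), bty = "n")
```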

Code to reproduce the analysis and figures is available as a gist.

One thought on “Scraping the RFFE”

  1. Hang Wang

    Hi Tony,
    Thanks for sharing your R code for grabbing the RFFE results from the online tool. Just wondering if you have ever tried to do the same from the ARR Data Hub? Would that be similar to what you posted here?

