This is a continuation from the previous post that looks at the differences between the 4 sources of frequency factors that I discussed:

- Harter’s (1969; 1971) tables
- Wilson-Hilferty approximation
- Pearson III quantiles from the lmomco package
- Gamma formula.

First, check if there is any difference between the last 2. It turns out they are producing the same results.

# load in the functions from the previous post library(dplyr) library(lmomco) Ky_gamma <- function(g, aep){ if(abs(g) < 1e-8) { # Use the Wilson-Hilferty approximation to avoid numerical # issues near zero z <- qnorm(1-aep) b <- g/6 return(z + (z^2 - 1)*b + (1/3)*(z^3 - 6*z)*b^2 - (z^2 - 1)*b^3 + z*b^4 - (1/3)*(b)^5 ) } if(g < 0) aep <- 1 - aep a <- 4/(g^2) b <- 1/sqrt(a) sign(g)*(qgamma(1 - aep, shape = a, scale = b) - a*b) } Ky_Qp3 <- function(g, aep){ # Return the quantile of the Pearson III distribution given # skewness of g, and annual exceedance probability, aep param <- vec2par(c(0,1,g), type='pe3') quape3(1 - aep, param) } # set up some test values of skew and probability skew.seq <- seq(0, 3.9, 0.1) prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002) calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq)) # Calculate and compare calc.check %>% mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>% mutate(ky.gamma = mapply(Ky_gamma, skew, prob)) %>% summarise(all.equal(ky.qp3, ky.gamma)) # all.equal(ky.qp3, ky.gamma) # 1 TRUE # Producing the same results

The next thing is to compare results with those from Harter’s tables. These have been considered the gold standard.

library(repmis) library(tidyr) # download data from dropbox my.url <- "https://dl.dropboxusercontent.com/u/10963448/Harters-tables.csv" harter <- tbl_df(repmis::source_data(my.url,sep = ",",header = TRUE)) # reshape the data harter <- harter %>% gather(skew, ky.harter, -P) # reshape the data # ky is the frequency factor # P is non-exceedance probability # extract and rename harter <- harter %>% mutate(skew = extract_numeric(skew)) %>% rename(prob = P) %>% mutate(prob = 1 - prob) # change to exceedance probability harter %>% mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>% mutate(ky.diff = ky.qp3 - ky.harter) %>% arrange(desc(abs(ky.diff))) # prob skew ky.harter ky.qp3 ky.diff # 1 0.960000 2.9 -0.68336 -0.68836211 -5.002114e-03 # 2 0.570376 0.6 -0.27047 -0.27053891 -6.891316e-05 # 3 0.570376 0.7 -0.28516 -0.28522360 -6.360473e-05 # 4 0.500000 0.2 -0.03325 -0.03331351 -6.350807e-05 # 5 0.600000 0.8 -0.36889 -0.36894676 -5.675508e-05

The differences are all small. The first one in the list above, seems like the one genuine typo in all the pages of values in Harter’s papers. An exceedance probability of 0.96 (non-exceedance probability of 0.04) with a skew of 2.9 should give a ‘percentage point’ of -0.68836 not -0.68336.

This shows the `quape`

function in the lmomco package is producing the expected values and is fine to use in flood frequency analysis.

Next, lets look at a comparison with the Wilson-Hilferty approximation.

For large skew and small exceedance probability there are substantial errors. If we restrict the skew to an absolute values less than 2 then its not too bad.

skew.seq <- seq(-3.9, 3.9, 0.1) prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002) calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq)) # Calculate and compare calc.check %>% mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>% mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>% mutate(ky.diff = abs(ky.WH - ky.qp3)) %>% arrange(desc(ky.diff)) # skew prob ky.qp3 ky.WH ky.diff # 1 -3.9 0.002 0.5128205 1.622170 1.1093495 # 2 -3.8 0.002 0.5263158 1.491331 0.9650151 # 3 -3.7 0.002 0.5405405 1.373717 0.8331762 # 4 -3.6 0.002 0.5555556 1.268889 0.7133337 # 5 -3.9 0.005 0.5128205 1.189425 0.6766042 # restrict abs(skew) to less than 2 calc.check %>% mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>% mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>% mutate(ky.diff = abs(ky.WH - ky.qp3)) %>% filter(abs(skew) < 2) %>% arrange(desc(ky.diff)) # skew prob ky.qp3 ky.WH ky.diff # 1 1.9 0.002 5.107675 5.201004 0.09332828 # 2 1.8 0.002 4.999367 5.086342 0.08697489 # 3 1.7 0.002 4.889712 4.970070 0.08035717 # 4 1.6 0.002 4.778745 4.852303 0.07355803 # 5 1.5 0.002 4.666505 4.733165 0.06665929

The upshot is, as we would have expected, we can use an R function to determine frequency factors and the results are more accurate than the Wilson-Hilferty approximation and Harter’s tables.

.

Pingback: Estimating parameters from quantiles, Pearson III distribution | tonyladson