Frequency factors – 2

This is a continuation from the previous post that looks at the differences between the 4 sources of frequency factors that I discussed:

  1. Harter’s (1969; 1971) tables
  2. Wilson-Hilferty approximation
  3. Pearson III quantiles from the lmomco package
  4. Gamma formula.

First, check if there is any difference between the last 2.  It turns out they are producing the same results.

# load in the functions from the previous post

library(dplyr)
library(lmomco)

Ky_gamma <- function(g, aep){
  
  if(abs(g) < 1e-8) { # Use the Wilson-Hilferty approximation to avoid numerical
                         # issues near zero
    z <- qnorm(1-aep)
    b <- g/6
    return(z + (z^2 - 1)*b + (1/3)*(z^3 - 6*z)*b^2 - 
                 (z^2 - 1)*b^3 + z*b^4 - (1/3)*(b)^5 )
  }
        
  if(g < 0) aep  <- 1 - aep
  a  <- 4/(g^2)
  b  <- 1/sqrt(a)
  sign(g)*(qgamma(1 - aep, shape = a, scale = b) - a*b)
}

Ky_Qp3 <- function(g, aep){ 
  # Return the quantile of the Pearson III distribution given 
  # skewness of g, and annual exceedance probability, aep
  
  param <- vec2par(c(0,1,g), type='pe3')
  quape3(1 - aep, param)  
}

# set up some test values of skew and probability

skew.seq <- seq(0, 3.9, 0.1)
prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5,
              0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002)

calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq))

# Calculate and compare
calc.check %>%
  mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
  mutate(ky.gamma = mapply(Ky_gamma, skew, prob)) %>%
  summarise(all.equal(ky.qp3, ky.gamma))
            
# all.equal(ky.qp3, ky.gamma)
# 1                        TRUE

# Producing the same results

The next thing is to compare results with those from Harter’s tables.  These have been considered the gold standard.

library(repmis)
library(tidyr)

# download data from dropbox
my.url <- "https://dl.dropboxusercontent.com/u/10963448/Harters-tables.csv"

harter <- tbl_df(repmis::source_data(my.url,sep = ",",header = TRUE))

# reshape the data
harter <- harter %>% gather(skew, ky.harter, -P)  # reshape the data
# ky is the frequency factor
# P is non-exceedance probability
# extract and rename
harter <- harter %>%
  mutate(skew = extract_numeric(skew)) %>%
  rename(prob = P) %>%
  mutate(prob = 1 - prob)  # change to exceedance probability

harter %>%
  mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
  mutate(ky.diff = ky.qp3 - ky.harter) %>%
  arrange(desc(abs(ky.diff))) 

#         prob skew ky.harter      ky.qp3       ky.diff
# 1  0.960000  2.9  -0.68336 -0.68836211 -5.002114e-03
# 2  0.570376  0.6  -0.27047 -0.27053891 -6.891316e-05
# 3  0.570376  0.7  -0.28516 -0.28522360 -6.360473e-05
# 4  0.500000  0.2  -0.03325 -0.03331351 -6.350807e-05
# 5  0.600000  0.8  -0.36889 -0.36894676 -5.675508e-05




The differences are all small. The first one in the list above, seems like the one genuine typo in all the pages of values in Harter’s papers.  An exceedance probability of 0.96 (non-exceedance probability of 0.04) with a skew of 2.9 should give a ‘percentage point’ of -0.68836 not -0.68336.

This shows the quape function in the lmomco package is producing the expected values and is fine to use in flood frequency analysis.

Next, lets look at a comparison with the Wilson-Hilferty approximation.

For large skew and small exceedance probability there are substantial errors.  If we restrict the skew to an absolute values less than 2 then its not too bad.

skew.seq <- seq(-3.9, 3.9, 0.1)
prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5,
              0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002)


calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq))

# Calculate and compare
calc.check %>%
  mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
  mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>%
  mutate(ky.diff = abs(ky.WH - ky.qp3)) %>%
  arrange(desc(ky.diff)) 

#    skew  prob    ky.qp3    ky.WH   ky.diff
# 1  -3.9 0.002 0.5128205 1.622170 1.1093495
# 2  -3.8 0.002 0.5263158 1.491331 0.9650151
# 3  -3.7 0.002 0.5405405 1.373717 0.8331762
# 4  -3.6 0.002 0.5555556 1.268889 0.7133337
# 5  -3.9 0.005 0.5128205 1.189425 0.6766042

# restrict abs(skew) to less than 2
calc.check %>%
  mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
  mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>%
  mutate(ky.diff = abs(ky.WH - ky.qp3)) %>%
  filter(abs(skew) < 2) %>%
  arrange(desc(ky.diff)) 

#     skew  prob   ky.qp3    ky.WH    ky.diff
# 1   1.9 0.002 5.107675 5.201004 0.09332828
# 2   1.8 0.002 4.999367 5.086342 0.08697489
# 3   1.7 0.002 4.889712 4.970070 0.08035717
# 4   1.6 0.002 4.778745 4.852303 0.07355803
# 5   1.5 0.002 4.666505 4.733165 0.06665929

The upshot is, as we would have expected, we can use an R function to determine frequency factors and the results are more accurate than the Wilson-Hilferty approximation and Harter’s tables.

.

One thought on “Frequency factors – 2

  1. Pingback: Estimating parameters from quantiles, Pearson III distribution | tonyladson

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s