# Frequency factors – 2

This is a continuation from the previous post that looks at the differences between the 4 sources of frequency factors that I discussed:

1. Harter’s (1969; 1971) tables
2. Wilson-Hilferty approximation
3. Pearson III quantiles from the lmomco package
4. Gamma formula.

First, check if there is any difference between the last 2.  It turns out they are producing the same results.

```# load in the functions from the previous post

library(dplyr)
library(lmomco)

Ky_gamma <- function(g, aep){

if(abs(g) < 1e-8) { # Use the Wilson-Hilferty approximation to avoid numerical
# issues near zero
z <- qnorm(1-aep)
b <- g/6
return(z + (z^2 - 1)*b + (1/3)*(z^3 - 6*z)*b^2 -
(z^2 - 1)*b^3 + z*b^4 - (1/3)*(b)^5 )
}

if(g < 0) aep  <- 1 - aep
a  <- 4/(g^2)
b  <- 1/sqrt(a)
sign(g)*(qgamma(1 - aep, shape = a, scale = b) - a*b)
}

Ky_Qp3 <- function(g, aep){
# Return the quantile of the Pearson III distribution given
# skewness of g, and annual exceedance probability, aep

param <- vec2par(c(0,1,g), type='pe3')
quape3(1 - aep, param)
}

# set up some test values of skew and probability

skew.seq <- seq(0, 3.9, 0.1)
prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5,
0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002)

calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq))

# Calculate and compare
calc.check %>%
mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
mutate(ky.gamma = mapply(Ky_gamma, skew, prob)) %>%
summarise(all.equal(ky.qp3, ky.gamma))

# all.equal(ky.qp3, ky.gamma)
# 1                        TRUE

# Producing the same results
```

The next thing is to compare results with those from Harter’s tables.  These have been considered the gold standard.

```library(repmis)
library(tidyr)

my.url <- "https://dl.dropboxusercontent.com/u/10963448/Harters-tables.csv"

harter <- tbl_df(repmis::source_data(my.url,sep = ",",header = TRUE))

# reshape the data
harter <- harter %>% gather(skew, ky.harter, -P)  # reshape the data
# ky is the frequency factor
# P is non-exceedance probability
# extract and rename
harter <- harter %>%
mutate(skew = extract_numeric(skew)) %>%
rename(prob = P) %>%
mutate(prob = 1 - prob)  # change to exceedance probability

harter %>%
mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
mutate(ky.diff = ky.qp3 - ky.harter) %>%
arrange(desc(abs(ky.diff)))

#         prob skew ky.harter      ky.qp3       ky.diff
# 1  0.960000  2.9  -0.68336 -0.68836211 -5.002114e-03
# 2  0.570376  0.6  -0.27047 -0.27053891 -6.891316e-05
# 3  0.570376  0.7  -0.28516 -0.28522360 -6.360473e-05
# 4  0.500000  0.2  -0.03325 -0.03331351 -6.350807e-05
# 5  0.600000  0.8  -0.36889 -0.36894676 -5.675508e-05

```

The differences are all small. The first one in the list above, seems like the one genuine typo in all the pages of values in Harter’s papers.  An exceedance probability of 0.96 (non-exceedance probability of 0.04) with a skew of 2.9 should give a ‘percentage point’ of -0.68836 not -0.68336.

This shows the `quape` function in the lmomco package is producing the expected values and is fine to use in flood frequency analysis.

Next, lets look at a comparison with the Wilson-Hilferty approximation.

For large skew and small exceedance probability there are substantial errors.  If we restrict the skew to an absolute values less than 2 then its not too bad.

```skew.seq <- seq(-3.9, 3.9, 0.1)
prob.seq <- c(0.99, 0.95, 0.90, 0.80, 0.5,
0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002)

calc.check <- tbl_df(expand.grid(skew = skew.seq, prob = prob.seq))

# Calculate and compare
calc.check %>%
mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>%
mutate(ky.diff = abs(ky.WH - ky.qp3)) %>%
arrange(desc(ky.diff))

#    skew  prob    ky.qp3    ky.WH   ky.diff
# 1  -3.9 0.002 0.5128205 1.622170 1.1093495
# 2  -3.8 0.002 0.5263158 1.491331 0.9650151
# 3  -3.7 0.002 0.5405405 1.373717 0.8331762
# 4  -3.6 0.002 0.5555556 1.268889 0.7133337
# 5  -3.9 0.005 0.5128205 1.189425 0.6766042

# restrict abs(skew) to less than 2
calc.check %>%
mutate(ky.qp3 = mapply(Ky_Qp3, skew, prob)) %>%
mutate(ky.WH = mapply(Ky_WH, skew, prob)) %>%
mutate(ky.diff = abs(ky.WH - ky.qp3)) %>%
filter(abs(skew) < 2) %>%
arrange(desc(ky.diff))

#     skew  prob   ky.qp3    ky.WH    ky.diff
# 1   1.9 0.002 5.107675 5.201004 0.09332828
# 2   1.8 0.002 4.999367 5.086342 0.08697489
# 3   1.7 0.002 4.889712 4.970070 0.08035717
# 4   1.6 0.002 4.778745 4.852303 0.07355803
# 5   1.5 0.002 4.666505 4.733165 0.06665929

```

The upshot is, as we would have expected, we can use an R function to determine frequency factors and the results are more accurate than the Wilson-Hilferty approximation and Harter’s tables.

.