# Plotting percentiles

Water quality objectives are often expressed as percentiles; for example, the 75th percentile of the measured concentration of some parameter is not to exceed some value.  To check compliance, it is nice to be able to plot the percentile on a time series of samples (see figure below).  The `rq` function in the quantreg package can be used to fit percentiles.  Here I’ve used splines to provide smooth curves by following the example in the package vignette (see Figure 7 in the vignette).

Percentile plot – Total Nitrogen (mg/L)

The following code calculates and plots the percentiles.  B-splines are used to achieve the smooth curves with the smoothness controlled by the df parameter.  Code is also available as a gist.

```
library(repmis)
library(quantreg)
library(dplyr)
library(lubridate)
library(stringr)
library(ggplot2)
library(devtools)
library(splines)

# Source ggplot Theme see

my.url <- 'https://dl.dropboxusercontent.com/u/10963448/QuantileEg.csv'
wq <- dplyr::tbl_df(repmis::source_data(my.url))

# parse and sort
wq <- wq %>%
mutate(date.time = lubridate::parse_date_time(stringr::str_c(Date,Time), orders = 'dmy HM')) %>% # set up dates
arrange(date.time) # sort by date

# Check for missing values
colSums(is.na(wq))

# Date Time Total.N date.time
# 0 0 6 0

# remove them
wq <- wq[complete.cases(wq), ]

# create 75th and 25th quantile models
fit.75 <- quantreg::rq(Total.N ~ splines::bs(date.time, df=15), tau=0.75, data=wq)
fit.50 <- quantreg::rq(Total.N ~ splines::bs(date.time, df=15), tau=0.50, data=wq)
fit.25 <- quantreg::rq(Total.N ~ splines::bs(date.time, df=15), tau=0.25, data=wq)

# Add quantiles to data frame

wq <- wq %>%
mutate(pc.75 = predict(fit.75)) %>%
mutate(pc.50 = predict(fit.50)) %>%
mutate(pc.25 = predict(fit.25))

# plot
wq %>%
ggplot(aes(x = date.time)) +
geom_point(aes(y = Total.N)) +
geom_line(aes(y = pc.75, colour = '75th percentile')) +
geom_line(aes(y = pc.50, colour = 'Median')) +
geom_line(aes(y = pc.25, colour = '25th percentile')) +
scale_color_manual('', values = c('75th percentile' = 'red', '25th percentile' = 'green', 'Median' = 'blue'),
breaks = c('75th percentile', 'Median', '25th percentile')) +
BwTheme +
xlab('Date') +
ylab('Total Nitrogen mg/L')```