Flood Frequency Analysis
How large will the 100-year flood be? Flood frequency analysis fits probability distributions to annual peak flows, enabling estimation of design floods and floodplain mapping via Log-Pearson Type III (LP3) and regional regression methods. This model derives frequency factors, implements LP3 fitting, demonstrates flood quantile calculations, and extends to ungauged basins via regional equations.
Prerequisites: Log-Pearson Type III, flood quantiles, confidence intervals, regional frequency
1. The Question
What is the 100-year flood discharge for this river?
Design flood problem:
Every structure near water needs design flood estimate.
Bridge: How high to build?
Dam: What spillway capacity?
Floodplain map: Where is 100-year boundary?
Insurance: What’s the premium?
Flood frequency analysis provides answer:
Statistical method estimating flood magnitude for given return period.
Based on: Historical streamflow records
Annual peak series:
Each year, record maximum instantaneous discharge.
Example - Allegheny River:
- 1950: 4,200 cms
- 1951: 2,800 cms
- 1952: 5,600 cms (major flood)
- …
- 2025: 3,100 cms
Fit probability distribution to these peaks.
Extrapolate to estimate rare events (100-year, 500-year).
Applications:
- Floodplain mapping (FEMA flood insurance rate maps)
- Bridge design (scour protection)
- Dam spillway sizing
- Levee height determination
- Building elevation requirements (freeboard)
- Emergency management planning
Return period interpretation:
100-year flood: 1% annual exceedance probability (AEP)
Not: “Occurs once per 100 years”
Rather: “1% chance each year, regardless of past floods”
Misconception: “We just had 100-year flood, safe for 100 years”
Wrong! Each year independent.
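The independence assumption can be turned into a number: the chance of seeing at least one T-year flood over a planning horizon. A minimal sketch, assuming independent years with annual exceedance probability 1/T; the function name `prob_at_least_one` is illustrative, not from any library.

```python
def prob_at_least_one(T: float, years: int) -> float:
    """Probability of at least one exceedance of the T-year flood
    in `years` years, assuming each year is independent with AEP = 1/T."""
    aep = 1.0 / T
    return 1.0 - (1.0 - aep) ** years


# Risk of at least one 100-year flood over different horizons.
for horizon in (1, 30, 100):
    print(horizon, round(prob_at_least_one(100, horizon), 3))
```

Under these assumptions the 100-year flood has roughly a 26% chance of occurring at least once in 30 years and roughly a 63% chance in 100 years, which is why "once per 100 years" is a misleading reading.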
2. The Conceptual Model
Log-Pearson Type III Distribution
USGS standard method (Bulletin 17C, 2018)
Why logarithms?
Flood peaks positively skewed (long right tail).
Logarithmic transformation reduces the skew; any skew remaining in log space is handled by the Pearson Type III shape parameter.
Transform to log-space:
\[Y = \log_{10}(Q)\]
Pearson Type III distribution:
Three-parameter distribution flexible for skewed data.
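\[f(y) = \frac{\lambda^\alpha}{\Gamma(\alpha)}(y-\beta)^{\alpha-1}e^{-\lambda(y-\beta)}\]
Where:
- $\alpha$ = shape parameter (set by skewness: $\alpha = 4/G^2$)
- $\beta$ = location parameter (related to mean: $\beta = \bar{y} - \alpha/\lambda$)
- $\lambda$ = rate (inverse scale) parameter (set by std dev: $\lambda = \sqrt{\alpha}/s_y$)
- $\Gamma(\alpha)$ = gamma function
This form applies for positive skew ($y > \beta$).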
Method of moments:
Parameters estimated from sample statistics:
Mean: $\bar{y} = \frac{1}{n}\sum y_i$
Standard deviation: $s_y = \sqrt{\frac{1}{n-1}\sum(y_i - \bar{y})^2}$
Skew coefficient: $G = \frac{n\sum(y_i - \bar{y})^3}{(n-1)(n-2)s_y^3}$
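The three sample statistics translate directly into code. A minimal sketch using only the standard library; the function name `log_moments` and the example peaks are hypothetical, chosen so the result is easy to check by eye.

```python
import math


def log_moments(peaks):
    """Return (mean, std dev, skew) of the log10 peaks, using the sample
    formulas above: n-1 in the variance, n/((n-1)(n-2)) in the skew."""
    y = [math.log10(q) for q in peaks]
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 peaks to estimate skew")
    mean = sum(y) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1))
    if s == 0.0:
        raise ValueError("all peaks identical: skew undefined")
    g = n * sum((v - mean) ** 3 for v in y) / ((n - 1) * (n - 2) * s ** 3)
    return mean, s, g


# Hypothetical peaks spaced evenly in log space: symmetric, so skew is zero.
print(log_moments([100, 1000, 10000]))
```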
Flood quantile formula:
\[y_T = \bar{y} + K_T \times s_y\]
Where $K_T$ = frequency factor (function of skew $G$ and return period $T$)
Back-transform:
\[Q_T = 10^{y_T}\]
Frequency Factor Tables
$K_T$ values tabulated by USGS (Bulletin 17B, Appendix 3)
Example values for selected skew and return periods:
| G | T=10 | T=25 | T=50 | T=100 | T=500 |
|---|---|---|---|---|---|
| -0.4 | 1.23 | 1.61 | 1.83 | 2.03 | 2.40 |
| 0.0 | 1.28 | 1.75 | 2.05 | 2.33 | 2.88 |
| +0.4 | 1.32 | 1.88 | 2.26 | 2.62 | 3.37 |
| +1.0 | 1.34 | 2.04 | 2.54 | 3.02 | 4.09 |
Pattern: Higher positive skew → higher $K_T$ for same return period, with the gap widening as $T$ grows
Interpretation: Positive skew in log space stretches the upper tail, so rare floods sit more standard deviations above the mean; negative skew compresses the upper tail
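When a table lookup is inconvenient, $K_T$ can be approximated in closed form. The sketch below uses the Wilson-Hilferty approximation, $K_T \approx \frac{2}{G}\left[\left(1 + \frac{Gz}{6} - \frac{G^2}{36}\right)^3 - 1\right]$ with $z$ the standard normal quantile. This is a substitute for the published tables, not the tables themselves; it is typically within a few hundredths of them for $|G| \le 1$. The function name `frequency_factor` is illustrative.

```python
from statistics import NormalDist


def frequency_factor(skew: float, T: float) -> float:
    """Approximate Pearson Type III frequency factor K_T (Wilson-Hilferty).

    skew: skew coefficient G of the log10 peaks
    T:    return period in years (annual exceedance probability = 1/T)
    """
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)  # standard normal quantile
    if abs(skew) < 1e-6:
        return z  # zero skew: LP3 reduces to the log-normal case
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)


for g in (-0.4, 0.0, 0.4):
    print(g, round(frequency_factor(g, 100), 2))
```

For a fixed return period the approximation returns larger $K_T$ as skew increases, the effect being strongest at long return periods.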
Regional Skew
Problem: Sample skew unstable (requires long records)
Solution: Generalized skew from regional analysis
Weighted skew:
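\[G_w = \frac{MSE_r \times G + MSE_G \times G_r}{MSE_G + MSE_r}\]
Where:
- $G$ = station skew
- $G_r$ = regional skew (from USGS maps)
- $MSE_G$, $MSE_r$ = mean square errors of the station and regional skew; each skew is weighted by the other's MSE, so the more precise estimate counts more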
Typical: Regional skew map provides $G_r \approx 0$ to $+0.5$ depending on location
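A short sketch of the weighting, assuming the usual inverse-MSE form in which the station skew is multiplied by the regional MSE and the regional skew by the station MSE. The function name and the input values are hypothetical.

```python
def weighted_skew(g_station: float, mse_station: float,
                  g_regional: float, mse_regional: float) -> float:
    """Inverse-MSE weighted skew: the estimate with the smaller MSE
    receives the larger weight."""
    return ((mse_regional * g_station + mse_station * g_regional)
            / (mse_station + mse_regional))


# Hypothetical inputs: a noisy station skew and a more precise regional skew.
print(weighted_skew(g_station=0.0, mse_station=0.3,
                    g_regional=0.2, mse_regional=0.1))
```

With these inputs the result lies three quarters of the way toward the regional value, because the regional MSE is one third of the station MSE.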
3. Building the Mathematical Model
Step-by-Step LP3 Analysis
Step 1: Assemble annual peak data
Minimum 10 years, prefer 30+
Step 2: Check for outliers
High outliers: Peaks unusually large (different flood mechanism?)
Low outliers: Peaks unusually small (dam operation, drought?)
Grubbs-Beck test: Statistical outlier detection
Retain outliers if physically plausible.
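The outlier screen can be sketched as follows. This assumes the single Grubbs-Beck test at the 10% significance level with the commonly used polynomial approximation for the critical value, $K_N \approx -0.9043 + 3.345\sqrt{\log_{10} N} - 0.4046\log_{10} N$. Bulletin 17C replaces this with the Multiple Grubbs-Beck test for low outliers, which is not implemented here. The function name and the peak series are hypothetical.

```python
import math


def grubbs_beck_thresholds(peaks):
    """Return (low, high) discharge thresholds; peaks outside them are
    flagged as potential outliers (single Grubbs-Beck test, 10% level)."""
    n = len(peaks)
    y = [math.log10(q) for q in peaks]
    mean = sum(y) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1))
    log_n = math.log10(n)
    # Approximate critical value, intended for roughly 10 <= n <= 150.
    k_n = -0.9043 + 3.345 * math.sqrt(log_n) - 0.4046 * log_n
    return 10 ** (mean - k_n * s), 10 ** (mean + k_n * s)


# Hypothetical series with one suspiciously small peak.
peaks = [500, 520, 480, 510, 495, 505, 515, 490, 500, 5]
low, high = grubbs_beck_thresholds(peaks)
flagged = [q for q in peaks if q < low or q > high]
print(flagged)
```

A flagged value is only a candidate: as noted above, it should be retained if it is physically plausible.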
Step 3: Transform to logarithms
\[y_i = \log_{10}(Q_i)\]
Step 4: Calculate sample statistics
\[\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i\]
\[s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (y_i - \bar{y})^2}\]
\[G = \frac{n}{(n-1)(n-2)s_y^3}\sum_{i=1}^n (y_i - \bar{y})^3\]
Step 5: Apply weighted skew (combine station and regional)
Step 6: Select frequency factors from tables
Step 7: Calculate flood quantiles
\[y_T = \bar{y} + K_T(G_w, T) \times s_y\]
\[Q_T = 10^{y_T}\]
Step 8: Estimate confidence intervals
\[SE(y_T) = s_y \sqrt{\frac{1 + K_T^2/2}{n}}\]
(normal-theory approximation; it neglects uncertainty in the skew estimate)
95% CI: $y_T \pm 1.96 \times SE(y_T)$
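The interval is symmetric in log space but not in discharge. A minimal sketch of the calculation, using the approximate standard error $s_y\sqrt{(1 + K_T^2/2)/n}$; the function name and all input values are hypothetical.

```python
import math


def quantile_ci(mean_log: float, s_log: float, k_t: float, n: int,
                z: float = 1.96):
    """Return (lower, estimate, upper) discharge for one return period.

    mean_log, s_log: mean and std dev of the log10 peaks
    k_t:             frequency factor for the return period
    n:               record length in years
    z:               normal multiplier (1.96 for a 95% interval)
    """
    y_t = mean_log + k_t * s_log
    se = s_log * math.sqrt((1.0 + k_t ** 2 / 2.0) / n)
    return 10 ** (y_t - z * se), 10 ** y_t, 10 ** (y_t + z * se)


# Hypothetical station: 30 years of record, K_100 = 2.33 (zero skew).
lower, estimate, upper = quantile_ci(3.0, 0.2, 2.33, 30)
print(round(lower), round(estimate), round(upper))
```

Because the back-transform is exponential, the upper limit lies further from the estimate than the lower limit does.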
Plotting Position
Empirical probability for observed peaks:
\[P_i = \frac{i}{n+1}\]
Where $i$ = rank (1 = largest)
Return period:
\[T_i = \frac{1}{P_i}\]
Plot observed $(T_i, Q_i)$ vs fitted curve $Q_T$
Visual check of distribution fit
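The plotting positions are simple to compute before plotting. A sketch with a hypothetical helper name and hypothetical peaks:

```python
def plotting_positions(peaks):
    """Return (Q, exceedance probability, return period) per peak,
    ranked from largest to smallest, using P_i = i / (n + 1)."""
    ranked = sorted(peaks, reverse=True)
    n = len(ranked)
    return [(q, i / (n + 1), (n + 1) / i)
            for i, q in enumerate(ranked, start=1)]


for q, p, t in plotting_positions([10, 30, 20]):
    print(q, round(p, 3), round(t, 2))
```

The largest of $n$ peaks is assigned a return period of $n+1$ years, so a short record cannot by itself place points near the 100-year end of the fitted curve.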
4. Worked Example by Hand
Problem: Estimate design floods for bridge design.
River: Tributary with 15 years of annual peak data
Annual peaks (m³/s):
| Year | Peak Q |
|---|---|
| 2011 | 450 |
| 2012 | 580 |
| 2013 | 720 |
| 2014 | 390 |
| 2015 | 510 |
| 2016 | 680 |
| 2017 | 820 |
| 2018 | 470 |
| 2019 | 540 |
| 2020 | 610 |
| 2021 | 380 |
| 2022 | 650 |
| 2023 | 490 |
| 2024 | 560 |
| 2025 | 530 |
Regional skew: $G_r = +0.2$ (from USGS map)
Calculate 10-year, 50-year, and 100-year floods.
Solution
Step 1: Transform to logarithms
\[y_i = \log_{10}(Q_i)\]
| Year | Q | log₁₀(Q) |
|---|---|---|
| 2011 | 450 | 2.653 |
| 2012 | 580 | 2.763 |
| 2013 | 720 | 2.857 |
| … | … | … |
Step 2: Calculate mean
\[\bar{y} = \frac{1}{15}(2.653 + 2.763 + ... + 2.724) = \frac{41.064}{15} = 2.7376\]
Step 3: Calculate standard deviation
\[s_y = \sqrt{\frac{1}{14}\sum(y_i - 2.7376)^2}\]
Example deviation: $(2.6532 - 2.7376)^2 = 0.00712$
Sum of squared deviations = 0.1243
\[s_y = \sqrt{\frac{0.1243}{14}} = \sqrt{0.00888} = 0.0942\]
Step 4: Calculate station skew
\[G = \frac{15}{14 \times 13 \times 0.0942^3}\sum(y_i - 2.7376)^3\]
Example cubed deviation: $(2.6532 - 2.7376)^3 = -0.000601$
Sum of cubed deviations = +0.000500
\[G = \frac{15 \times 0.000500}{182 \times 0.000836} = \frac{0.00750}{0.1522} = +0.049\]
Low skew (nearly symmetric)
Step 5: Weighted skew
Assume $MSE_G = 0.30$ (typical for $n=15$), $MSE_r = 0.25$
The station skew is weighted by the regional MSE and the regional skew by the station MSE:
\[G_w = \frac{0.25 \times 0.049 + 0.30 \times 0.2}{0.30 + 0.25} = \frac{0.01225 + 0.060}{0.55} = 0.131\]
Use $G_w = 0.13$ for frequency factors
Step 6: Frequency factors (interpolated from tables)
For $G = 0.13$:
- $K_{10} = 1.295$
- $K_{50} = 2.123$
- $K_{100} = 2.421$
Step 7: Calculate flood quantiles
10-year flood:
\[y_{10} = 2.7376 + 1.295 \times 0.0942 = 2.7376 + 0.1220 = 2.8596\]
\[Q_{10} = 10^{2.8596} \approx 724 \text{ m}^3\text{/s}\]
50-year flood:
\[y_{50} = 2.7376 + 2.123 \times 0.0942 = 2.7376 + 0.2000 = 2.9376\]
\[Q_{50} = 10^{2.9376} \approx 866 \text{ m}^3\text{/s}\]
100-year flood:
\[y_{100} = 2.7376 + 2.421 \times 0.0942 = 2.7376 + 0.2281 = 2.9657\]
\[Q_{100} = 10^{2.9657} \approx 924 \text{ m}^3\text{/s}\]
Step 8: Confidence intervals (95%)
\[SE(y_{100}) = 0.0942 \sqrt{\frac{1 + 2.421^2/2}{15}} = 0.0942 \sqrt{\frac{3.93}{15}} = 0.0942 \times 0.512 = 0.0482\]
\[y_{100} \pm 1.96 \times 0.0482 = 2.9657 \pm 0.0945\]
Range: $y = 2.8712$ to $3.0602$
\[Q_{100} = 10^{2.8712} \text{ to } 10^{3.0602} = 743 \text{ to } 1149 \text{ m}^3\text{/s}\]
Wide uncertainty: 743 to 1149 m³/s (about -20% to +24%)
Step 9: Design recommendation
Bridge design: Use upper confidence limit for safety
Design Q₁₀₀ ≈ 1150 m³/s
5. Computational Implementation
[Interactive flood frequency analyzer, not reproduced here. Outputs: Q₁₀, Q₅₀, Q₁₀₀ (m³/s) and CI width for Q₁₀₀ (±%).]
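As a static stand-in, the full analysis can be sketched in a few lines of Python. It runs the Section 4 annual peaks through the method-of-moments fit, inverse-MSE skew weighting, and the approximate confidence interval. Two assumptions are worth flagging: frequency factors come from the Wilson-Hilferty approximation rather than a table lookup, and the MSE values are the assumed ones from the worked example. The function name `lp3_analysis` is illustrative.

```python
import math
from statistics import NormalDist

# Annual peaks (m^3/s) from the worked example in Section 4.
PEAKS = [450, 580, 720, 390, 510, 680, 820, 470, 540, 610,
         380, 650, 490, 560, 530]


def lp3_analysis(peaks, g_regional, mse_station, mse_regional,
                 return_periods=(10, 50, 100), z_ci=1.96):
    """LP3 fit by method of moments with weighted skew.

    Returns {T: (lower, estimate, upper)} in the units of `peaks`.
    Frequency factors use the Wilson-Hilferty approximation
    (an assumption of this sketch, standing in for table lookup).
    """
    y = [math.log10(q) for q in peaks]
    n = len(y)
    mean = sum(y) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1))
    g = n * sum((v - mean) ** 3 for v in y) / ((n - 1) * (n - 2) * s ** 3)

    # Inverse-MSE weighting of station and regional skew.
    g_w = ((mse_regional * g + mse_station * g_regional)
           / (mse_station + mse_regional))

    results = {}
    for T in return_periods:
        z = NormalDist().inv_cdf(1.0 - 1.0 / T)
        if abs(g_w) < 1e-6:
            k_t = z
        else:
            k = g_w / 6.0
            k_t = (2.0 / g_w) * ((1.0 + k * z - k * k) ** 3 - 1.0)
        y_t = mean + k_t * s
        # Normal-theory standard error of the log-space quantile.
        se = s * math.sqrt((1.0 + k_t ** 2 / 2.0) / n)
        results[T] = (10 ** (y_t - z_ci * se),
                      10 ** y_t,
                      10 ** (y_t + z_ci * se))
    return results


RESULTS = lp3_analysis(PEAKS, g_regional=0.2,
                       mse_station=0.30, mse_regional=0.25)
for T, (lo, q, hi) in RESULTS.items():
    print(f"Q{T}: {q:.0f} m^3/s (95% CI {lo:.0f} to {hi:.0f})")
```

Swapping in a different peak series or regional skew only requires changing the arguments, which makes the sketch a quick cross-check on hand calculations.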
6. Summary
Flood frequency analysis estimates design floods by fitting the Log-Pearson Type III distribution to annual peak streamflow data, working in logarithmic space to handle skewed distributions. Frequency factors K_T, determined by the skew coefficient and return period, give quantiles via y_T = mean + K_T × std_dev. The USGS Bulletin 17C standard method adds regional skew weighting and outlier detection to improve estimates. Confidence intervals widen significantly for rare events, with ±20-40% typical for 100-year estimates from 30-year records. Applications span bridge design, dam spillways, floodplain mapping, and flood insurance, requiring return periods from 10 to 10,000 years depending on the consequence of failure.