ExtremesPlots {fExtremes} | R Documentation |
A collection and description of functions for exploratory
data analysis of extreme values. The tools include plot
functions for empirical distributions, quantile plots, graphs
exploring the properties of exceedances over a threshold,
plots for the maximum/sum ratio and for the development of records.
The functions are:
emdPlot | Plot of empirical distribution function, |
qqPlot | Normal quantile-quantile plot, |
qqbayesPlot | Normal QQ-Plot with 95 percent intervals, |
qPlot | Exponential/Pareto quantile plot, |
mePlot | Plot of mean excesses over a threshold, |
mrlPlot | another variant, mean residual life plot, |
mxfPlot | another variant, with confidence intervals, |
msratioPlot | Plot of the ratio of maximum and sum, |
recordsPlot | Record development compared with iid data, |
ssrecordsPlot | another variant, investigates subsamples, |
xacfPlot | ACF of exceedances over a threshold, |
interactivePlot | a framework for interactive plot displays, |
gridVector | creates all grid points from two vectors x and y. |
emdPlot(x, doplot = TRUE, plottype = c("", "x", "y", "xy"),
    labels = TRUE, ...)
qqPlot(x, doplot = TRUE, labels = TRUE, ...)
qqbayesPlot(x, doplot = TRUE, labels = TRUE, ...)
qPlot(x, xi = 0, trim = NA, threshold = NA, doplot = TRUE,
    labels = TRUE, ...)
mePlot(x, doplot = TRUE, labels = TRUE, ...)
mrlPlot(x, conf = 0.95, umin = NA, umax = NA, nint = 100,
    doplot = TRUE, plottype = c("autoscale", ""), labels = TRUE, ...)
mxfPlot(x, tail = 0.05, doplot = TRUE, labels = TRUE, ...)
msratioPlot(x, p = 1:4, doplot = TRUE,
    plottype = c("autoscale", ""), labels = TRUE, ...)
recordsPlot(x, conf = 0.95, doplot = TRUE, labels = TRUE, ...)
ssrecordsPlot(x, subsamples = 10, doplot = TRUE,
    plottype = c("lin", "log"), labels = TRUE, ...)
xacfPlot(x, threshold = 0.95, lag.max = 15, doplot = TRUE, ...)
interactivePlot(x, choices = paste("Plot", 1:9),
    plotFUN = paste("plot.", 1:9, sep = ""), which = "all", ...)
gridVector(x, y)
choices |
[interactivePlot] - a vector of character strings for the choice menu. By default "Plot 1" ... "Plot 9",
allowing for at most 9 plots.
|
conf |
[mrlPlot][recordsPlot] - a confidence level. By default 0.95, i.e. 95%. |
doplot |
a logical. Should the results be plotted? By default TRUE .
|
labels |
a logical. Whether or not x- and y-axes should be automatically
labelled and a default main title should be added to the plot.
By default TRUE .
|
lag.max |
[xacfPlot] - maximum number of lags at which to calculate the autocorrelation functions. The default value is 15. |
nint |
[mrlPlot] - the number of intervals, see umin and umax . The
default value is 100.
|
p |
[msratioPlot] - the power exponents, a numeric vector. By default a sequence from 1 to 4 in unit integer steps. |
plotFUN |
[interactivePlot] - a vector of character strings naming the plot functions. By default "plot.1" ... "plot.9",
allowing for at most 9 plots.
|
plottype |
[emdPlot] - which axes should be on a log scale: "x" x-axis only;
"y" y-axis only; "xy" both axes; ""
neither axis.
[mrlPlot][msratioPlot] - a character string; if set to "autoscale" , the scale of the
plot is determined automatically; any other string allows user
specified scale information through the ... argument.
[ssrecordsPlot] - one of two options can be selected, either "lin"
or "log" . The default creates a linear plot.
|
subsamples |
[ssrecordsPlot] - the number of subsamples, by default 10, an integer value. |
tail |
[mxfPlot] - the threshold determined from the relative number of data points defining the tail, a numeric value; by default 0.05 which says that 5% of the data make the tail. |
threshold, trim |
[qPlot] - threshold is a numeric value at which data are to be left-truncated, trim a numeric value at which data are to be right-truncated; both default to NA. [xacfPlot] - threshold is the threshold value, by default 0.95, i.e. 95%. |
umin, umax |
[mrlPlot] - range of threshold values. If umin and/or umax are
not available, then by default they are set to the following
values: umin=mean(x) and umax=max(x) .
|
which |
[interactivePlot] - plot selection, which graph should be displayed? If which
is the character string "ask" , the user is asked interactively
which plot to show; if it is a logical vector of length N , those plots
which are set TRUE are displayed; if it is the character string
"all" , all plots are displayed.
|
x, y |
a numeric data vector or an object to be plotted.
[gridVector] - two numeric vectors which span the two dimensional grid. |
xi |
the shape parameter of the generalized Pareto distribution. |
... |
additional arguments passed to the plot function. |
Empirical Distribution Function:
The function emdPlot is a simple exploratory function. A
straight line on the double log scale indicates Pareto tail behaviour.
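The double-log reading can be reproduced with base R alone. The following is a minimal sketch, not the emdPlot implementation: the helper name emp_survival, the simulated Pareto sample and its tail index alpha = 2 are assumptions made for illustration. For a Pareto tail, 1-F(x) is proportional to x^(-alpha), so the points fall close to a line with slope -alpha on the double log scale.

```r
# Empirical survival function, evaluated at the ordered data points.
# emp_survival is a hypothetical helper, not part of fExtremes.
emp_survival <- function(x) {
  x <- sort(x)
  n <- length(x)
  # Fraction of observations >= x[i]; never zero, so logs are finite.
  list(x = x, surv = (n - seq_len(n) + 1) / n)
}

set.seed(1)
alpha <- 2                              # assumed tail index
pareto <- runif(5000)^(-1 / alpha)      # Pareto sample with P(X > x) = x^(-alpha), x >= 1

es <- emp_survival(pareto)
plot(es$x, es$surv, log = "xy", xlab = "x", ylab = "1-F(x)",
     main = "Simulated Pareto")

# Slope of the fitted line on the double log scale, roughly -alpha.
slope <- unname(coef(lm(log(es$surv) ~ log(es$x)))[2])
```

Here slope should come out close to -2, while a sample with an exponential tail would bend downwards instead of following a line.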
Quantile–Quantile Plot:
The function qqPlot produces a normal QQ-plot. Note that qqPlot
is not a synonym for the R base function qqplot, which produces
a quantile-quantile plot of two datasets.
To help with assessing the relevance of sampling variability on just
"how close" to the normal the data appear, qqbayesPlot adds
approximate posterior 95 percent intervals for the quantile
function at each point.
qPlot creates a QQ-plot for threshold data. If xi is
zero the reference distribution is the exponential; if xi is
non-zero the reference distribution is the generalized Pareto with
that value of xi. In the case of the exponential, the plot is
interpreted as follows: Concave departures from a straight line are a
sign of heavy-tailed behaviour, convex departures show thin-tailed
behaviour.
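The exponential case can be sketched in base R. The helper exp_qq below is hypothetical and not the qPlot implementation. It assumes the ordered data are on the x-axis and standard exponential quantiles on the y-axis, which is the orientation under which a concave curve indicates a heavy tail. The Pareto tail index 1.5 is also an arbitrary choice.

```r
# Coordinates of an exponential QQ-plot: ordered data against
# standard exponential quantiles at the usual plotting positions.
# exp_qq is a hypothetical helper, not part of fExtremes.
exp_qq <- function(x) {
  x <- sort(x)
  list(data = x, quantiles = qexp(ppoints(length(x))))
}

set.seed(42)
u <- runif(2000)
thin  <- qexp(u)                 # exponential sample via inverse transform
heavy <- (1 - u)^(-1 / 1.5)      # Pareto sample, tail index 1.5 (assumed)

q_thin  <- exp_qq(thin)
q_heavy <- exp_qq(heavy)

par(mfrow = c(1, 2))
plot(q_thin$data, q_thin$quantiles, xlab = "Ordered data",
     ylab = "Exponential quantiles", main = "Exponential sample")
plot(q_heavy$data, q_heavy$quantiles, xlab = "Ordered data",
     ylab = "Exponential quantiles", main = "Pareto sample")

# Linearity of each plot, summarised by the correlation of its coordinates.
cor_thin  <- cor(q_thin$data, q_thin$quantiles)
cor_heavy <- cor(q_heavy$data, q_heavy$quantiles)
```

The exponential sample gives a nearly straight line, while the Pareto sample flattens out to the right, which is the concave departure described above.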
Mean Excess Function Plot:
Three variants to plot the mean excess function are available:
A sample mean excess plot over increasing thresholds, and two mean
excess function plots with confidence intervals for discrimination
in the tails of a distribution.
In general, an upward trend in a mean excess function plot shows
heavy-tailed behaviour. In particular, a straight line with positive
gradient above some threshold is a sign of Pareto behaviour in the tail.
A downward trend shows thin-tailed behaviour whereas a line with
zero gradient shows an exponential tail. Here are some hints:
Because upper plotting points are the average of a handful of extreme
excesses, these may be omitted for a prettier plot.
For mrlPlot and mxfPlot the upper tail is investigated;
for the lower tail reverse the sign of the data vector.
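As a rough illustration of what these plots show, the sample mean excess function e(u) is the average of x - u over all observations with x > u. The sketch below uses base R only; the helper mean_excess, the threshold grid and the cut-off at the 99 percent quantile are assumptions for illustration, not the method used by mePlot, mrlPlot or mxfPlot.

```r
# Sample mean excess function: for each threshold u, the mean of
# (x - u) over the observations exceeding u.
# mean_excess is a hypothetical helper, not part of fExtremes.
mean_excess <- function(x, u) {
  vapply(u, function(ui) mean(x[x > ui] - ui), numeric(1))
}

set.seed(7138)
x <- rexp(10000, rate = 2)       # exponential tail, true mean excess 1/2

# Stop at the 99 percent quantile so that the noisy upper points,
# based on only a handful of excesses, are left out.
u <- seq(0, unname(quantile(x, 0.99)), length.out = 100)
e <- mean_excess(x, u)

plot(u, e, type = "l", xlab = "Threshold: u", ylab = "Mean Excess: e")
abline(h = 1 / 2, lty = 2)       # reference line for the exponential case
```

For the exponential sample the curve stays near the horizontal line at 1/2. Replacing x by a heavy-tailed sample would give an upward trend. For the lower tail, pass -x instead of x.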
Plot of the Maximum/Sum Ratio:
The ratio of maximum and sum is a simple tool for detecting heavy
tails of a distribution and for giving a rough estimate of
the order of its finite moments. Sharp increases in the curves
of a msratioPlot are a sign of heavy tail behaviour.
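The quantity behind this plot can be written as R_n(p) = max(|x_1|^p, ..., |x_n|^p) / (|x_1|^p + ... + |x_n|^p), computed for growing n. If the moment of order p is finite, the ratio tends to zero as n grows. The sketch below uses base R only; the helper ms_ratio is hypothetical and not the msratioPlot implementation.

```r
# Running ratio of maximum and sum of |x|^p for n = 1, ..., length(x).
# ms_ratio is a hypothetical helper, not part of fExtremes.
ms_ratio <- function(x, p = 1) {
  y <- abs(x)^p
  cummax(y) / cumsum(y)
}

set.seed(4711)
r_norm   <- ms_ratio(rnorm(8000), p = 2)     # finite second moment
r_cauchy <- ms_ratio(rcauchy(8000), p = 2)   # infinite second moment

par(mfrow = c(1, 2))
plot(r_norm, type = "l", ylim = c(0, 1), xlab = "n", ylab = "R(n)",
     main = "Standard Normal, p = 2")
plot(r_cauchy, type = "l", ylim = c(0, 1), xlab = "n", ylab = "R(n)",
     main = "Cauchy, p = 2")
```

The normal curve decays towards zero, while the Cauchy curve keeps jumping upwards whenever a new extreme observation dominates the sum.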
Plot of the Development of Records:
These are functions that investigate the development of records in
a dataset and calculate the expected behaviour for iid data.
recordsPlot counts records and reports the observations
at which they occur. In addition subsamples can be investigated
with the help of the function ssrecordsPlot.
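The record count and its iid benchmark are simple to compute by hand. An observation is a record if it exceeds all earlier ones. For iid data from a continuous distribution the n-th observation is a record with probability 1/n, so the expected number of records after n observations is 1 + 1/2 + ... + 1/n. The sketch below uses base R only; record_index and expected_records are hypothetical helpers, not the recordsPlot implementation.

```r
# Positions at which a new record (strict maximum so far) occurs.
# record_index is a hypothetical helper, not part of fExtremes.
record_index <- function(x) {
  n <- length(x)
  if (n == 0) return(integer(0))
  # The first observation is always a record; later ones are compared
  # with the running maximum of everything before them.
  which(c(TRUE, x[-1] > cummax(x)[-n]))
}

# Expected number of records among the first 1, ..., n iid observations.
expected_records <- function(n) cumsum(1 / seq_len(n))

set.seed(4711)
x <- rnorm(1000)
rec <- record_index(x)
counts <- cumsum(seq_along(x) %in% rec)   # records observed up to each n

plot(seq_along(x), counts, type = "s", log = "x",
     xlab = "n", ylab = "Records")
lines(seq_along(x), expected_records(length(x)), lty = 2)
```

A record count that runs well above the dashed curve suggests a departure from the iid assumption, for example a trend in the data.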
ACF Plot of Exceedances over a Threshold:
This function plots the autocorrelation functions of heights and
distances of exceedances over a threshold.
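The two series can be built directly from the data. The sketch below uses base R only and is not the xacfPlot implementation. The helper exceedance_series is hypothetical, and taking the threshold as the empirical 95 percent quantile is an assumption about how a value such as threshold = 0.95 might be interpreted. Heights are measured from the threshold; subtracting a constant does not change the autocorrelations.

```r
# Heights of exceedances over u and distances (in observations)
# between consecutive exceedances.
# exceedance_series is a hypothetical helper, not part of fExtremes.
exceedance_series <- function(x, u) {
  idx <- which(x > u)
  list(heights = x[idx] - u, distances = diff(idx))
}

set.seed(1)
x <- rt(5000, df = 4)                  # iid heavy-tailed sample
u <- unname(quantile(x, 0.95))         # assumed: threshold as a quantile
ex <- exceedance_series(x, u)

par(mfrow = c(1, 2))
acf(ex$heights, lag.max = 15, main = "Heights")
acf(ex$distances, lag.max = 15, main = "Distances")
```

For iid data both autocorrelation functions should stay within the confidence bands. Clustering of extremes, as often seen in financial returns, shows up as significant positive autocorrelation.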
The plots are labeled by default with an x-label, a y-label and
a main title. If the argument labels is set to FALSE,
neither an x-label, a y-label nor a main title will be added to
the graph. To add user defined label strings "...", just use the
function title(xlab="...", ylab="...", main="...").
Some of the functions were implemented from Alec Stephenson's
R-package evir, ported from Alexander McNeil's S library
EVIS, Extreme Values in S, some from Alec Stephenson's
R-package ismev, based on Stuart Coles' code from his book
Introduction to Statistical Modeling of Extreme Values, and
some were written by Diethelm Wuertz.
Coles, S. (2001); An Introduction to Statistical Modeling of Extreme Values, Springer.
Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); Modelling Extremal Events, Springer.
## emdPlot -
xmpExtremes("\nStart: Empirical Distribution Function >")
# Danish fire insurance data show Pareto tail behaviour:
par(mfrow = c(2, 2))
data(danish)
emdPlot(danish, plottype = "xy", labels = FALSE)
title(xlab = "x", ylab = "1-F(x)", main = "Danish Fire")
# BMW Stocks:
data(bmw)
emdPlot(bmw, plottype = "xy", labels = FALSE)
title(xlab = "x", ylab = "1-F(x)", main = "BMW Stocks")
# Simulated Student-t:
emdPlot(rt(5000, 4), plottype = "xy")

## qqPlot -
xmpExtremes("\nNext: Quantile-Quantile Plot >")
# QQ-Plot of Simulated Normal rvs:
par(mfrow = c(2, 2))
set.seed(4711)
qqPlot(rnorm(5000))
text(-3.5, 3, pos = 4, "Simulated Normal rvs")
# QQ-Plot of simulated Student-t rvs:
qqPlot(rt(5000, 4))
text(-3.5, 11.0, pos = 4, "Simulated Student-t rvs")
# QQ-Plot of BMW share residuals:
data(bmw)
qqPlot(bmw)
text(-3.5, 0.09, pos = 4, "BMW log returns")

## qPlot -
xmpExtremes("\nNext: QQ-Plot of Heavy Tails >")
# QQ-Plot of heavy-tailed Danish fire insurance data:
data(danish)
qPlot(danish)

## mePlot -
xmpExtremes("\nNext: Mean Excess Plot >")
# Sample mean excess plot of heavy-tailed Danish fire
# insurance data
par(mfrow = c(3, 2))
data(danish)
mePlot(danish, labels = FALSE)
title(xlab = "u", ylab = "e", main = "mePlot - Danish Fire Data")

## mrlPlot -
xmpExtremes("\nNext: Mean Residual Life Plot >")
# Sample mean residual life plot of heavy-tailed Danish fire
# insurance data
mrlPlot(danish, labels = FALSE)
title(xlab = "u", ylab = "e", main = "mrlPlot - Danish Fire Data")

## mxfPlot -
xmpExtremes("\nNext: Mean Excess Function Plot >")
# Plot the mean excess functions for randomly distributed
# residuals
par(mfrow = c(2, 2))
n = 10000
set.seed(4711)
xlab = "Threshold: u"; ylab = "Mean Excess: e"
mxfPlot(rnorm(n), tail = 0.5, labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "mxfPlot - Normal DF")
set.seed(7138)
mxfPlot(rexp(n, 2), tail = 0.5, labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "mxfPlot - Exponential DF")
abline(1/2, 0)
set.seed(6952)
mxfPlot(rlnorm(n, 0, 2), tail = 0.5, xlim = c(0, 90),
  ylim = c(0, 120), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "mxfPlot - Lognormal DF")
set.seed(8835)
mxfPlot(rgpd(n, 1/2), tail = 0.10, xlim = c(0, 200),
  ylim = c(0, 200), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "mxfPlot - Pareto")
abline(0, 1)

## msratioPlot -
xmpExtremes("\nNext: Maximum/Sum Ratio Plot >")
# Examples for Ratio of Maximum and Sum Plots:
par(mfrow = c(3, 2))
data(bmw)
xlab = "n"; ylab = "R(n)"
msratioPlot(rnorm(8000), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "Standard Normal")
msratioPlot(rexp(8000), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "Exponential")
msratioPlot(rt(8000, 4), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "Student-t")
msratioPlot(rcauchy(8000), labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "Cauchy")
msratioPlot(bmw, labels = FALSE)
title(xlab = xlab, ylab = ylab, main = "BMW Returns")

## recordsPlot -
xmpExtremes("\nNext: Records Plot >")
# Record fire insurance losses in Denmark
par(mfrow = c(2, 2))
data(danish)
recordsPlot(danish)
text(1, 7.9, pos = 4, "Danish Fire")
# BMW Stocks
data(bmw)
recordsPlot(bmw)
text(1, 12.8, pos = 4, "BMW Shares")

## ssrecordsPlot -
xmpExtremes("\nNext: Subsample Record Plot >")
# Record fire insurance losses in Denmark
ssrecordsPlot(danish)
text(1, 9.2, pos = 4, "Danish Fire")
# BMW Stocks
ssrecordsPlot(bmw)
text(1, 10.5, pos = 4, "BMW Shares")

## xacfPlot -
xmpExtremes("\nNext: ACF Plot of Exceedances >")
# Plot ACF of Heights/Distances of Exceedances over threshold:
par(mfrow = c(2, 2))
data(bmw)
xacfPlot(bmw)