Title: | Fitting Tails by the Empirical Residual Coefficient of Variation |
---|---|
Description: | Provides a simple and trustworthy methodology for the analysis of extreme values, with multiple threshold tests for a generalized Pareto distribution and an automatic threshold selection algorithm. See del Castillo, J, Daoudi, J and Lockhart, R (2014) <doi:10.1111/sjos.12037>. |
Authors: | Joan del Castillo, David Moriña Soler and Isabel Serra |
Maintainer: | Isabel Serra <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2025-02-13 03:41:10 UTC |
Source: | https://github.com/cran/ercv |
Fitting tails by the empirical residual coefficient of variation.
Package: | ercv |
Type: | Package |
Version: | 1.0.1 |
Date: | 2019-09-19 |
License: | GPL version 2 or newer |
LazyLoad: | yes |
The package provides a simple and trustworthy methodology for the analysis of extreme values. It contains functions for visualizing, fitting and validating the distribution of tails, and provides multiple threshold tests for a generalized Pareto distribution together with an automatic threshold selection algorithm.
Joan del Castillo (Universitat Autònoma de Barcelona), David Moriña Soler (Catalan Institute of Oncology (ICO)-IDIBELL) and Isabel Serra (Centre de Recerca Matemàtica)
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, thrselect, Tm
This data set contains 1000 observations sampled from the third benchmark of the well-known EEMBC AutoBench suite for real-time systems (Poovey, 2007), which includes a number of programs used in automotive embedded systems. It corresponds to the basic integer and floating point (BIFP) algorithm.
BIFP
A numeric vector.
Abella, J., Padilla, M., del Castillo, J. and Cazorla, F. (2017). Measurement-Based Worst-Case Execution Time Estimation Using the Coefficient of Variation. ACM Transactions on Design Automation of Electronic Systems (TODAES), 22(4).
Poovey, J. (2007). Characterization of the EEMBC Benchmark Suite. North Carolina State University.
This data set corresponds to the Bilbao waves data, first analysed by Castillo and Hadi (1997) and, from the MLE point of view, in del Castillo and Serra (2015).
bilbao
A numeric vector.
Castillo, E. and Hadi, A. S. (1997). Fitting the Generalized Pareto Distribution to Data. Journal of the American Statistical Association, 92, 1609-1620.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
Plot of the complementary empirical distribution function of a sample together with the complementary distribution function of the peaks-over-threshold model.
ccdfplot(data, pars=NA, log="y", from=NA, ci=FALSE, main="Complementary cdf", xlab="data", ylab="ccdf", ...)
data | a numeric vector. |
pars | a list with the set of parameters of the peaks-over-threshold model. |
log | a character string which contains "x" if the x-axis is to be logarithmic, "y" if the y-axis is to be logarithmic, and "xy" or "yx" if both axes are to be logarithmic. Defaults to "y". |
from | the origin of the x-axis in the plot. |
ci | should confidence bands be plotted? Defaults to FALSE. |
main | an overall title for the plot. |
xlab | horizontal axis label. Defaults to "data". |
ylab | vertical axis label. Defaults to "ccdf". |
... | usual graphic parameters. |
Plot of complementary empirical distribution function and the complementary distribution function.
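As a rough illustration of the empirical curve in this plot, the complementary empirical distribution function can be computed in base R as sketched below. This is not the package's implementation: the helper name `empirical_ccdf` and the toy sample are assumptions made for the example, and ties are ignored.

```r
# Hypothetical helper (not part of ercv): complementary empirical
# distribution function evaluated at the sorted sample points.
empirical_ccdf <- function(x) {
  x <- sort(x)
  n <- length(x)
  # Proportion of observations strictly greater than each sorted value
  # (assumes there are no ties).
  data.frame(x = x, ccdf = (n - seq_len(n)) / n)
}

toy <- c(2.5, 0.3, 1.2, 4.8, 0.9)   # toy data, for illustration only
cc <- empirical_ccdf(toy)
cc
```

Plotting `cc$ccdf` against `cc$x` on a logarithmic y-axis, after dropping the last point (where the value is 0), gives a step curve of the kind the function draws.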
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, thrselect, Tm
data(iFFT)
ccdfplot(iFFT)
Confidence interval for extreme value index estimation by the Tm method.
cievi(nextremes, evi=0, conf.level=0.90, m=10, nsim=100)
nextremes | the number of upper extremes to be used. |
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
conf.level | confidence level of the interval. |
m | number of thresholds used in the multiple threshold test. |
nsim | number of simulations. |
A numerical vector with two elements, containing the limits of the interval.
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, thrselect, Tm
cievi(70, evi=0)
The coefficient of variation for a given extreme value index in the generalized Pareto distribution.
cvevi(evi)
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. It has to satisfy |
A numerical value containing the coefficient of variation for the given extreme value index.
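For a generalized Pareto distribution with shape evi < 1/2 and scale psi, the mean is psi/(1 - evi) and the variance is psi^2/((1 - evi)^2 (1 - 2 evi)), so the coefficient of variation is 1/sqrt(1 - 2 evi) and does not depend on the scale. The sketch below evaluates that relation in base R. It assumes the function follows this standard formula; the helper name `gpd_cv` is hypothetical.

```r
# Hypothetical helper (not the package's cvevi): coefficient of variation
# of a generalized Pareto distribution with shape `evi`.
# Assumes evi < 1/2, so that the variance is finite.
gpd_cv <- function(evi) {
  stopifnot(all(evi < 0.5))
  1 / sqrt(1 - 2 * evi)
}

gpd_cv(0)    # exponential tail: CV equals 1
gpd_cv(-1)   # uniform distribution: CV equals 1/sqrt(3)
```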
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, thrselect, Tm
cvevi(-1)
Exploratory empirical residual coefficient of variation for extreme value analysis.
cvplot(data, threshold = NA, nextremes = NA, omit=4, evi=0, main="CVplot", conf.level=0.90, xlab="Excluded sample size", ylab="Coefficient of variation", col="blue", ...)
data | a numeric vector. |
threshold | a threshold value (either this or nextremes). |
nextremes | the number of upper extremes to be used (either this or threshold). |
omit | the minimum required number of upper extremes for computing residual statistics. |
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
main | an overall title for the plot. |
conf.level | confidence level of the interval (defaults to 0.90). |
xlab | horizontal axis label. Defaults to "Excluded sample size". |
ylab | vertical axis label. Defaults to "Coefficient of variation". |
col | plot color. Defaults to "blue". |
... | usual graphic parameters. |
Plot of the empirical residual CV and confidence intervals.
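The quantity on the vertical axis can be sketched in base R as follows: for each number of excluded observations, take the excesses of the remaining sample over the corresponding order statistic and compute their coefficient of variation. This is a minimal sketch under stated assumptions, not the package's code: the helper name `residual_cv` is hypothetical, the data are assumed positive and free of ties, and no confidence bands are computed.

```r
# Hypothetical helper (not the package's cvplot): empirical residual
# coefficient of variation at successive thresholds.
residual_cv <- function(x, omit = 4) {
  x <- sort(x)
  n <- length(x)
  # k = number of smallest observations excluded; the threshold is the
  # k-th order statistic (0 when k = 0, assuming positive data).
  ks <- 0:(n - omit)
  cv <- vapply(ks, function(k) {
    thr <- if (k == 0) 0 else x[k]
    exc <- x[(k + 1):n] - thr      # excesses over the threshold
    sd(exc) / mean(exc)
  }, numeric(1))
  data.frame(excluded = ks, cv = cv)
}

set.seed(1)
rcv <- residual_cv(rexp(500))   # simulated exponential sample
head(rcv)
```

For an exponential sample the residual coefficient of variation stays close to 1 at every threshold, which is the reference level for evi = 0.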
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, evicv, fitpot, ppot, qpot, tdata, thrselect, Tm
data("moby", package = "poweRlaw")
cvplot(moby, main="MobyDick")
data(iFFT)
cvplot(iFFT, threshold=median(iFFT), main="iFFT")
This data set contains the euro/dollar daily exchange rates between 1999 and 2016, a period that includes the financial crisis of 2007-2008. It was obtained with the package quantmod (Ryan, 2016).
EURUSD
A data frame with 6575 rows and 1 column.
Ryan, J. A. (2016). quantmod: Quantitative Financial Modelling Framework. R package version 0.4-7. https://CRAN.R-project.org/package=quantmod
The extreme value index for a given coefficient of variation in the generalized Pareto distribution.
evicv(cv)
cv | coefficient of variation. It has to satisfy |
The extreme value index for a given coefficient of variation in the generalized Pareto distribution as a numerical value.
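For a generalized Pareto distribution the coefficient of variation is 1/sqrt(1 - 2 evi), so the extreme value index can be recovered as evi = (1 - 1/cv^2)/2. The sketch below evaluates this inverse in base R. It assumes the function follows this standard relation; the helper name `gpd_evi` is hypothetical.

```r
# Hypothetical helper (not the package's evicv): shape parameter of the
# generalized Pareto distribution whose coefficient of variation is `cv`.
# Assumes cv > 0; inverts cv = 1 / sqrt(1 - 2 * evi).
gpd_evi <- function(cv) {
  stopifnot(all(cv > 0))
  (1 - 1 / cv^2) / 2
}

gpd_evi(1)   # 0: exponential tail
gpd_evi(2)   # 0.375
```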
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, fitpot, ppot, qpot, tdata, thrselect, Tm
evicv(2)
This data set contains 1000 observations sampled from the second benchmark of the well-known EEMBC AutoBench suite for real-time systems (Poovey, 2007), which includes a number of programs used in automotive embedded systems. It corresponds to the fast Fourier transform (FFT) algorithm.
FFT
A numeric vector.
Abella, J., Padilla, M., del Castillo, J. and Cazorla, F. (2017). Measurement-Based Worst-Case Execution Time Estimation Using the Coefficient of Variation. ACM Transactions on Design Automation of Electronic Systems (TODAES), 22(4).
Poovey, J. (2007). Characterization of the EEMBC Benchmark Suite. North Carolina State University.
Fits a peaks-over-threshold model to a sample.
fitpot(data, threshold=NA, nextremes=NA, evi=NA)
data | a numeric vector. |
threshold | a threshold value (either this or nextremes). |
nextremes | the number of upper extremes to be used (either this or threshold). |
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
A data.frame with the following columns:
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
psi | the scale parameter of a generalized Pareto distribution. |
threshold | a threshold value where peaks-over-threshold is applied. |
prob | proportion of the data corresponding to the upper extremes modelled with the generalized Pareto distribution. |
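To make the four quantities above concrete, the sketch below fits a peaks-over-threshold model by the method of moments, using the residual coefficient of variation of the excesses. This is a swapped-in estimator chosen for illustration and is not claimed to be the one the function uses. The helper name `pot_moments` and the simulated data are assumptions, and the approach requires the excesses to have finite variance (evi < 1/2).

```r
# Hypothetical helper (not the package's fitpot): method-of-moments fit of
# a peaks-over-threshold model.
pot_moments <- function(x, threshold) {
  exc <- x[x > threshold] - threshold     # excesses over the threshold
  cv  <- sd(exc) / mean(exc)              # empirical residual CV
  evi <- (1 - 1 / cv^2) / 2               # from cv = 1 / sqrt(1 - 2 * evi)
  psi <- mean(exc) * (1 - evi)            # from mean = psi / (1 - evi)
  c(evi = evi, psi = psi, threshold = threshold,
    prob = length(exc) / length(x))
}

set.seed(42)
x <- rexp(2000, rate = 0.5)   # simulated data with an exponential tail
pot_moments(x, threshold = 1)
```

With exponential data of rate 0.5, the estimates should be near evi = 0, psi = 2 and prob = exp(-0.5).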
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, ppot, qpot, tdata, thrselect, Tm
data("nidd.thresh", package = "evir")
fitpot(nidd.thresh)
This data set contains 1000 observations sampled from the first benchmark of the well-known EEMBC AutoBench suite for real-time systems (Poovey, 2007), which includes a number of programs used in automotive embedded systems. It corresponds to the inverse fast Fourier transform (iFFT) algorithm.
iFFT
A numeric vector.
Abella, J., Padilla, M., del Castillo, J. and Cazorla, F. (2017). Measurement-Based Worst-Case Execution Time Estimation Using the Coefficient of Variation. ACM Transactions on Design Automation of Electronic Systems (TODAES), 22(4).
Poovey, J. (2007). Characterization of the EEMBC Benchmark Suite. North Carolina State University.
This data set contains 1000 observations sampled from the fourth benchmark of the well-known EEMBC AutoBench suite for real-time systems (Poovey, 2007), which includes a number of programs used in automotive embedded systems. It corresponds to the matrix arithmetic (MA) algorithm.
MA
A numeric vector.
Abella, J., Padilla, M., del Castillo, J. and Cazorla, F. (2017). Measurement-Based Worst-Case Execution Time Estimation Using the Coefficient of Variation. ACM Transactions on Design Automation of Electronic Systems (TODAES), 22(4).
Poovey, J. (2007). Characterization of the EEMBC Benchmark Suite. North Carolina State University.
Cumulative distribution function from the peaks-over-threshold model.
ppot(q, pars, lower.tail=TRUE, log.p=FALSE)
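q | vector of quantiles. |
pars | a numeric vector with the set of parameters of the peaks-over-threshold model. The names of the elements have to be evi, psi, threshold and prob. |
lower.tail | logical; if TRUE (default), probabilities are P[X <= x], otherwise P[X > x]. |
log.p | logical; if TRUE, probabilities p are given as log(p). Defaults to FALSE. |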
Cumulative probability as a numerical value.
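Above the threshold, a peaks-over-threshold model gives the upper-tail probability as prob times the generalized Pareto survival function of the excess. The sketch below writes that formula out in base R under the usual parameterization. It is an illustration rather than the package's code: the helper name `pot_upper_tail` is hypothetical, and only the case q above the threshold with evi different from 0 is covered.

```r
# Hypothetical helper (not the package's ppot): upper-tail probability of a
# peaks-over-threshold model, for q above the threshold and evi != 0.
pot_upper_tail <- function(q, pars) {
  stopifnot(all(q > pars[["threshold"]]), pars[["evi"]] != 0)
  z <- 1 + pars[["evi"]] * (q - pars[["threshold"]]) / pars[["psi"]]
  pars[["prob"]] * z^(-1 / pars[["evi"]])
}

# Exact parameters of a half-and-half mixture of 1/U and U, with U uniform
# on (0, 1): above 1 the tail is Pareto and half of the sample exceeds 1.
pars <- c(evi = 1, psi = 1, threshold = 1, prob = 0.5)
pot_upper_tail(10, pars)   # 0.05, that is 0.5/10
```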
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, qpot, tdata, thrselect, Tm
ppot(1.9, c(evi=0.1, psi=0.2, threshold=0.3, prob=0.4), lower.tail=FALSE)
x <- runif(10000)
x <- c(x^-1, x)
pars <- fitpot(x, 1)
ppot(10, pars$coeff, lower.tail=FALSE) # the true value is 0.5/10
Quantile function from the peaks-over-threshold model.
qpot(p, pars, lower.tail=TRUE, log.p=FALSE)
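p | vector of probabilities. |
pars | a numeric vector with the set of parameters of the peaks-over-threshold model. The names of the elements have to be evi, psi, threshold and prob. |
lower.tail | logical; if TRUE (default), probabilities are P[X <= x], otherwise P[X > x]. |
log.p | logical; if TRUE, probabilities p are given as log(p). Defaults to FALSE. |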
Quantile function as a numerical value.
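Inverting the upper-tail probability of a peaks-over-threshold model gives the quantile in closed form: threshold + (psi/evi) ((p/prob)^(-evi) - 1) for an upper-tail probability p not larger than prob. The base R sketch below evaluates it. It is an illustration under the usual parameterization, not the package's code: the helper name `pot_upper_quantile` is hypothetical and the case evi = 0 is left out.

```r
# Hypothetical helper (not the package's qpot): quantile with upper-tail
# probability p under a peaks-over-threshold model.
# Assumes 0 < p <= prob (quantile at or above the threshold) and evi != 0.
pot_upper_quantile <- function(p, pars) {
  stopifnot(all(p > 0), all(p <= pars[["prob"]]), pars[["evi"]] != 0)
  pars[["threshold"]] +
    pars[["psi"]] / pars[["evi"]] * ((p / pars[["prob"]])^(-pars[["evi"]]) - 1)
}

# Exact parameters of a half-and-half mixture of 1/U and U, with U uniform
# on (0, 1): above 1 the tail is Pareto and half of the sample exceeds 1.
pars <- c(evi = 1, psi = 1, threshold = 1, prob = 0.5)
pot_upper_quantile(0.5 / 10, pars)   # 10
```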
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, tdata, thrselect, Tm
qpot(0.1, c(evi=0.1, psi=0.2, threshold=0.3, prob=0.4), lower.tail=FALSE)
x <- runif(10000)
x <- c(x^-1, x)
pars <- fitpot(x, 1)
qpot(0.5/10, pars$coeff, lower.tail=FALSE) # the true value is 10
Transformation of a sample assumed to be heavy-tailed into a sample with a non-heavy tail.
tdata(data, threshold = NA, nextremes = NA, sigma=NA)
data | a numeric vector. |
threshold | a threshold value (either this or nextremes). |
nextremes | the number of upper extremes to be used (either this or threshold). |
sigma | the scale parameter divided by the shape parameter of the generalized Pareto distribution. |
The transformed data as a numerical vector.
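One transformation with this property is the logarithmic map log(1 + excess/sigma): if the excesses follow a generalized Pareto distribution with shape evi > 0 and scale psi, and sigma = psi/evi, the transformed values are exponential with mean evi. The sketch below checks this on simulated data. Whether the function applies exactly this map, or a rescaled version of it, is an assumption; the helper name `to_light_tail` is hypothetical.

```r
# Hypothetical helper (not the package's tdata): log-transform of excesses.
# If the excesses are generalized Pareto with shape evi > 0 and scale psi,
# and sigma = psi / evi, then log(1 + excess / sigma) is exponential
# with mean evi.
to_light_tail <- function(excess, sigma) {
  stopifnot(all(excess >= 0), sigma > 0)
  log1p(excess / sigma)
}

set.seed(7)
evi <- 0.5; psi <- 2
u <- runif(5000)
excess <- psi / evi * (u^(-evi) - 1)       # simulated GPD excesses (inversion)
z <- to_light_tail(excess, sigma = psi / evi)
c(mean = mean(z), cv = sd(z) / mean(z))    # close to evi and to 1
```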
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, thrselect, Tm
data("danish", package = "evir")
tdata(danish)
Threshold selection algorithm.
thrselect(data, threshold=NA, nextremes=NA, omit=16, evi=NA, m=10, nsim=100, conf.level=0.90, oprint=TRUE)
data | a numeric vector. |
threshold | a threshold value (either this or nextremes). |
nextremes | the number of upper extremes to be used (either this or threshold). |
omit | the minimum required number of upper extremes for computing residual statistics. |
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
m | number of thresholds used in the multiple threshold test. |
nsim | number of simulations. |
conf.level | confidence level of the interval. |
oprint | logical. If |
A list including two data.frame objects (solution and options). Each data.frame contains the following columns:
m | number of thresholds for testing tail index. |
nextremes | the number of upper extremes to be used. |
threshold | the threshold value. |
rcv | residual coefficient of variation for the selected threshold. |
cvopt | optimal coefficient of variation for the tail. |
evi | the corresponding tail index for the optimal coefficient of variation if the evi parameter is NA. |
tms | the statistic of the tail index test. |
pvalue | p-value associated with tms. |
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, Tm
data("nidd.thresh", package = "evir")
thrselect(nidd.thresh, nsim=500)
Multiple threshold test for a GPD.
Tm(data, threshold = NA, nextremes = NA, omit = 16, evi = NA, m = 10, nsim = 100)
data | a numeric vector. |
threshold | a threshold value (either this or nextremes). |
nextremes | the number of upper extremes to be used (either this or threshold). |
omit | the minimum required number of upper extremes for computing residual statistics. |
evi | extreme value index. In particular, the shape parameter of a generalized Pareto distribution. |
m | number of thresholds used in the multiple threshold test. |
nsim | number of simulations. |
A data.frame containing the following columns:
nextremes | the number of upper extremes to be used. |
cvopt | optimal coefficient of variation for the tail. |
evi | the corresponding tail index for the optimal coefficient of variation if the evi parameter is NA. |
tms | the statistic of the tail index test. |
pvalue | p-value associated with tms. |
Joan del Castillo, David Moriña Soler and Isabel Serra
del Castillo, J. and Padilla, M. (2016). Modeling extreme values by the residual coefficient of variation. SORT Statist. Oper. Res. Trans. 40(2), 303-320.
del Castillo, J. and Serra, I. (2015). Likelihood inference for Generalized Pareto Distribution. Computational Statistics and Data Analysis, 83, 116-128.
del Castillo, J., Daoudi, J. and Lockhart, R. (2014). Methods to Distinguish Between Polynomial and Exponential Tails. Scandinavian Journal of Statistics, 41, 382-393.
ercv-package, cievi, ccdfplot, cvevi, cvplot, evicv, fitpot, ppot, qpot, tdata, thrselect
data("nidd.thresh", package = "evir")
Tm(nidd.thresh, evi=0, nextremes = 75)