In the statistical physics literature, heavy-tailed inter-arrival times are said to be the main signature of b burstyb dynamics. Such dynamics have been observed for financial time series, earthquakes, solar flares and neuron voltage spikes. This talk is concerned with the modelling of extremes of events when events follow such bursty dynamics. We assign i.i.d. magnitudes (marks) to the events in a heavy-tailed renewal process and apply the b Peaks Over Thresholdb method from Extreme Value Theory. Leveraging geometric stability, it can be shown that the threshold exceedance times asymptotically form a b fractional Poisson processb with Mittag-Leffler inter-arrival times. Finally, we discuss methods for inference on the tail and scale parameters of the bursty dynamics.
Vajna, S., TC3th, B., & KertC)sz, J. (2013). Modelling bursty time series. New Journal of Physics, 15(10), 103023. http://doi.org/10.1088/1367-2630/15/10/103023
Karsai, M., Kaski, K., BarabC!si, A. L., & KertC)sz, J. (2012). Universal features of correlated bursty behaviour. Scientific Reports, 2. http://doi.org/10.1038/srep00397
But: for renewal processes with \(1 < \beta < 2\) (finite mean, infinite variance) the ACF decays with exponent \(2-\beta\).
Quick fix: log-transform the waiting times.
a sequence of consecutive events where all inter-arrival times are \(< \Delta t\).
If the two plots are visually different, renewal assumption is false.
(Karsai et al 2012, Universal features of correlated bursty behaviour, Scientific Reports)
Clustering coefficients for a growing network, where nodes are preferentially attached to nodes with large clustering coefficients.
Bagrow, J. P., & Brockmann, D. (2013). Natural emergence of clusters and bursts in network evolution. Physical Review X, 3(2), 1b 6. http://doi.org/10.1103/PhysRevX.3.021016
Waiting time | Magnitude |
---|---|
Server requests | filesize |
Earthquakes | magnitudes |
Email sent | email replied |
Solar flares | magnitudes |
Time between rest for paper currency | distance travelled |
Time between raindrops | raindrop size |
Time between price jumps | price jumps |
How does one model magnitude and timing of extreme events, if the underlying dynamics are a heavy-tailed renewal process?
Consider the uncoupled case: \(\{W_k\}\) and \(\{J_k\}\) are independent.
What is the distribution of
Renewal Process
\[N(t) = \{ n\ge 0: W_1 + \ldots + W_n \le t \}\]
counts events up to time \(t\).
CTRM Process:
\[M(t)=\bigvee_{k=1}^{N(t)}J_k=\max\{J_1, \ldots,J_{N(t)}\}\]
(s. dashed line)
First magnitude above a threshold \(\ell\) is:
\[\tau(\ell) = \min\{k: J_k > \ell\}\]
Exceedance Process:
\[T(\ell) = \sum_{k=1}^{\tau(\ell)} W_k\]
We have \(T(\ell) > t \Leftrightarrow M(t) \le \ell\).
(CTRM and Exceedance Time are inverse to each other.)
(s. dashed line)
\[p_\ell T(\ell) \stackrel{d}{\to} {\rm Exp}(\mu_W)\]
\[p_\ell^2 {\rm Var} T(\ell) \to \mu_W^2\]
where \(\mu_W = \mathbf E[W_k]\) (e.g.B Gut b 99).
Let \(a_n\) and \(b_n\) be such that \(F_J(a_n x + b_n)^n \to \Lambda(x)\) for all \(x\)
Let \(g(t) = t / [\Gamma(2-\alpha) \int_0^t (1-F_W(u))\,du]\).
Then the CTRM process \(M(t)\) satisfies
\[\lim_{t\to \infty} \mathbf P\left[M(t) \le a_{[g(t)]} x + b_{[g(t)]}\right] = E_\beta(\log \Lambda(x))\]
where \(\beta \in (0,1]\) is the tail index of \(W_k\) and \(E_\beta(\cdot)\) is the Mittag-Leffler function (more later).
As \(n \to \infty\), assume there are increasing norming functions \(a\), \(d\) and \(b\) such that
Then as \(c \to \infty\),
\[\frac{M(ct) - d(\tilde b(c))}{a(\tilde b(c))} \Rightarrow A(E(t))\]
Here
\[b(\tilde b(c)) \sim c \sim \tilde b(b(c))\]
(Meerschaert & Stoev b 08).
Other approaches:
These models have been studied thoroughlyb & but how do you fit them to data?
Define the Exceedance as
\[X(\ell) = M(T_\ell) - \ell.\]
In the uncoupled case, it follows easily that Exceedance \(X(\ell)\) and Exceedance Time \(T(\ell)\) are independent.
Asymptotically for large \(\ell\), \(X_\ell\) follows a generalized Pareto distribution:
\[X_\ell \sim GP(\xi, \bar \sigma)\]
Extract the red lines and fit GEV distributions to them!
For threshold \(\ell\) in \(\{J_{(1)} > J_{(2)} > \ldots > J_{(n)}\}\):
Then plot parameters estimates \(\hat \xi, \hat \sigma\) against \(k\).
Example: earthquake magnitudes shape parameterb &
b & and scale parameter:
By inspection, pick \(\hat \xi = -0.1\), \(\hat \sigma = 0.6\).
Small \(k\): few data (high variance) but asymptotically close to GP distribution (low bias)
High \(k\): many data (low variance) but asymptotically far away from GP distribution (high bias)
For threshold \(\ell\) in \(\{J_{(1)} > J_{(2)} > \ldots > J_{(n)}\}\):
Then plot the parameters of this distribution against \(k\).
But what is the distribution of \(T(\ell)\)??
\(T(\ell) =\) distances between blue circles.
\[E_\beta(z) = \sum_{k=0}^\infty \frac{z^k}{\Gamma(1+\beta k)}\]
where \(0 < \beta \le 1\).
\(E_\beta(z)\) nests the exponential function: \(E_1(z) = e^z\).
For \(\beta < 1\), \(E_\beta(-z^\beta)\) is \(\rm{RV}_\infty(-\beta)\).
A r.v. \(T_\beta\) is Mittag-Leffler-1 distributed with tail parameter \(\beta \in (0,1]\) if
\[\mathbf P(T > t) = E_\beta(-t^\beta), \quad t > 0.\]
A r.v. \(T_\beta\) is Mittag-Leffler-2 distributed with parameter \(\beta \in (0,1]\) if
\[\int_0^\infty e^{-st} \mathbf P(T \in dt) = E_\beta(-s).\]
We write \(\sigma T_\beta \sim {\rm ML2}(\beta, \sigma)\) for \(\sigma > 0\).
Let \(D(u)\) be a stable subordinator with Laplace Transform \(\exp(-us^\beta)\).
Let \(E(t) = \inf\{u: D(u) > t\}\) (inverse stable subordinator).
Then
\[E(t) \sim {\rm ML2}(\beta, t^\beta).\]
A random variable \(Y\) is geometric stable if and only if its characteristic function is
\[\psi(t) = \frac{1}{1 + \sigma^\alpha |t|^\alpha \omega_{\alpha, \beta}(t) - i\mu t}\]
where for \(\alpha \neq 1\)
\[\omega_{\alpha, \beta}(t)= \begin{cases} 1 - i\beta {\rm sign}(t) \tan(\pi \alpha / 2), & \alpha \neq 1 \\ 1 + i\beta(2/\pi) {\rm sign}(t) \log|t|, & \alpha = 1 \end{cases}\]
Write \(Y \sim {\rm GS}(\alpha, \beta, \sigma, \mu)\); (stability, skewness, scale, location).
(Parametrization by Samorodnitsky & Taqqu, see Kozubowski & Rachev, 1999)
A r.v. \(Y\) on \(\mathbb R\) is GS (geometric stable) if for \(N_p \sim {\rm Geom}(p)\) there exist i.i.d. r.v.b s \(X_1, X_2, \ldots\), independent of \(N_p\), a norming sequence \(A_p\) and centering constants \(b_p\) such that
\[A_p \sum_{k=1}^{N_p} (X_k + b_p) \stackrel{d}{\to} Y \text{ as } p \downarrow 0.\]
Then \(X_1\) is weakly geometrically attracted to \(Y\), and the collection of such \(X_k\) is called the generalized domain of geometric attraction (GDOGA) of \(Y\).
Let \(\beta \in (0,1]\). Then \(X \in \text{GDOGA}({\rm ML(\beta,1)})\) if and only if
\[1 - F_X(t) \in {\rm RV}_\infty(-\beta)\]
Recall:
Moreover, \(N_{p_\ell} := \min\{k: J_k > \ell\}\) is geometric, independent of \(W_k\), and
\[T(\ell) = \sum_{k=1}^{N_{p_\ell}} W_k.\]
Suppose the magnitudes \(J_k\) are supported on the interval \([x_L, x_R]\).
Suppose that \(\mathbf P(W_k > t) = L(t) t^{-\beta}\) for some slowly varying function \(L(t)\) and \(\beta \in (0,1)\).
Let \(\ell \in [x_L, x_R]\) be a threshold, and write \(p_\ell = 1 - F_J(\ell)\) for the probability that a magnitude exceeds threshold \(\ell\).
Then for some \(\sigma > 0\),
\[p_\ell^{1/\beta} T(\ell) \stackrel{d}{\to} {\rm ML}(\beta, \sigma), \quad \ell \uparrow x_R\]
Gnedenko (1972) predates the Shock-process literature (1988+); Yet geometric stability does not appear in the Shock process literature.
It took the end of the Cold War and the invention of the internet to put the two things together!
For large thresholds \(\ell\), we may approximate
\[T(\ell) \approx {\rm ML}(\beta, \sigma p_\ell^{-1/\beta}).\]
Small k: few data (high variance) but many geometric summands, hence asymptotically close to M-L distribution (low bias)
High k: many data (low variance) but few geometric summands, hence far away from GP distribution (high bias)
Let \(T \sim ML(\beta, \sigma)\).
(Cahoy et al b 10)
Recent algorithm (Garrappa, 2015) calculates Mittag-Leffler Function by inversion of Laplace transform.
The MittagLeffleR package (Gill & Straka, 2017) implements this algorithm and calculates pdf, cdf and random variates for both types of Mittag-Leffler distributions.
But this is more costly than method of log-moments.
Suppose now that \(\mathbf P(W_k > t) = L(t) t^{-\beta}\), where \(1 < \beta < 2\) and \(L(t)\) is a slowly varying function.
Convergence to b homogeneousb dynamics takes very long:
\[\mathbf E[N(t)] \approx \frac{t}{\mathbf E[W_1]} + \frac{C}{\mathbf E[W_1]^3} t^{2-\beta}, \quad \text{large } t\]
If the tail parameter satisfies \(1<\beta<2\), then \(\mathbf E[W_1] < \infty\), and in the limit as \(\ell \uparrow x_R\),
\[p_\ell T(\ell) = p_\ell \sum_{k=1}^{N_{p_\ell}} W_k \to {\rm Exp}(\mu_W).\]
Hence we b couldb approximate
\[T(\ell) \sim {\rm Exp}(\mu_W/p_\ell)\]
But we expect this convergence to be very slow, as seen in previous slide!
\[ \begin{multline} \sum_{k=1}^{N_{p_\ell}} W_k = \sum_{k=1}^{N_{p_\ell}} (W_k - \mu_W) + N_{p_\ell} \mu_W \\ = p_\ell^{-1/\beta} p_\ell^{1/\beta} \sum_{k=1}^{N_{p_\ell}} (W_k - \mu_W) + p_\ell^{-1}p_\ell N_{p_\ell} \mu_W \approx p_\ell^{-1/\beta} \sigma Z^{1/\beta} X + p_\ell^{-1} \mu_W Z \end{multline} \]
where \(Z \sim {\rm Exp}(1)\) and \(X \sim S(\beta, 1, 1, 0)\). The latter r.v. is geometric stable (Kozubowski & Rachev, 1999), and so:
\[T(\ell) = \sum_{k=1}^{N_{p_\ell}} W_k \approx {\rm GS}(\beta, 1, \sigma p_\ell^{-1/\beta}, p_\ell^{-1} \mu_W) \text{ for large } \ell. \]
Available estimators for \({\rm GS}\) distributions:
tempered heavy-tails
coupled CTRMs
Many datasets from finance (CGMY), geology and hydrology display tempered heavy tails:
This is modelled with one extra parameter by the tempered stable distribution
\[f_{\beta, \lambda}(x) = e^{-\lambda x + \lambda^\beta} f_\beta(x)\]
(where \(f_\beta(x)\) denotes the stable distribution.)
How to do inference for CTRMs with tempered heavy-tailed waiting times?
How to model coupled CTRMs?
Theoretical results are available on
Thank you for your attention!
Shiny-app: https://strakaps.shinyapps.io/ctrmapp/
homepage: https://strakaps.github.io/
Twitter: @strakaps
Gill, G., & Straka, P. (2017). MittagLeffleR: Using the Mittag-Leffler distributions in R. Retrieved from https://github.com/strakaps/MittagLeffleR
Meerschaert, M. M., & Stoev, S. A. (2008). Extremal limit theorems for observations separated by random power law waiting times. Journal of Statistical Planning and Inference, 139(7), 2175b 2188. http://doi.org/10.1016/j.jspi.2008.10.005
Karsai, M., Kaski, K., BarabC!si, A. L., & KertC)sz, J. (2012). Universal features of correlated bursty behaviour. Scientific Reports, 2. http://doi.org/10.1038/srep00397
Meerschaert, M. M., & Straka, P. (2013). Inverse Stable Subordinators. Mathematical Modelling of Natural Phenomena, 8(2), 1b 16. http://doi.org/10.1051/mmnp/20138201
Cahoy, D. O. (2013). Estimation of Mittag-Leffler Parameters. Communications in Statistics - Simulation and Computation, 42(2), 303b 315. http://doi.org/10.1080/03610918.2011.640094
\(\newcommand{\R}{\mathbb R}\)
Assume the i.i.d. pairs \((J_k, W_k)\) are drawn from a bivariate distribution \(\mu\) on \(\R \times (0,\infty)\);
but \(J_k\) and \(W_k\) may be dependent.
How does one model and estimate Exceedance Times \(T_\ell\) and Exceedance Sizes \(X_\ell\)?
The bivariate random vector \((J_k, W_k)\) lies in the Max-Sum-Domain of Attraction of the bivariate random vector \((A,D)\) if
\[ \left(\frac{\bigvee_{k=1}^n J_k - d_n}{a_n} , \frac{\sum_{k=1}^n W_k}{b_n} \right) \Longrightarrow \left( A , D \right) \quad n \to \infty \]
for some norming functions \(a(n), d(n)\) and \(b(n)\).
Theorem(Hees & Scheffler b 16):
Let \((A,D)\) be a bivariate random vector. The following are equivalent:
For i.i.d. copies of \((A,D)\), one has \[(A_1 \vee \ldots \vee A_n, D_1 + \ldots + D_n) \stackrel{d}{=} (a_n A, b_n D)\]
\((A,D)\) is the weak limit of a sequence as above.
If the above are satisfied, \((A,D)\) is called Max-Sum-Stable.
\(\newcommand{\plx}{\stackrel{+}{\vee}}\) Let \(S := \R \times [0,\infty]\). Consider the semigroup \((S, \plx)\) with neutral element \(e = (-\infty,0)\), defined via: \((a,b) \plx (c,d) := (a \vee c, b + d)\).
A semicharacter is a function \(\rho: S \to \R\) with the properties i) \(\rho(e) = 1\) and ii) \(\rho(s \plx t) = \rho(s) \rho(t)\). Write \(\hat{\mathbf S}\) for the set of semicharacters on \((S, \plx)\); \(\hat{\mathbf S}\) is a topological semigroup.
The generalized Laplace transform of a measure \(\mu\) on \(\hat{\mathbf S}\) is given by
\[\varphi(s) = \int_{\hat{\mathbf S}} \rho(s)\,d\mu(\rho), \quad s \in S.\]
Here, the semicharacters are \(\rho_{x,t} = e^{-t \cdot} \mathbf 1_{[-\infty, \cdot]}(x)\), and \(\hat{\mathbf S}\) is isomorphic to \(S\). This defines the CDF-Laplace Transform
\[\varphi(y,s) = \int_{\R \times [0,\infty)} e^{-st} \mathbf 1_{[-\infty, y]}(x)\,\mu(dt,dx)\]
Uniqueness and continuity theorems hold in the same fashion as for the usual Laplace transform (Hees & Scheffler b 16).
The semigroup \((S, \plx)\) defines a convolution:
\[P_{(J_1,W_1)} \star P_{(J_2,W_2)} := P_{(J_1,W_1) \plx (J_2, W_2)}\]
A probability measure \(\mu\) on \(S\) is Max-Sum-Infinitely Divisible iff for every \(n\) there is a measure \(\mu_n\) such that \[\mu = \mu_n \star \ldots \star \mu_n\] (\(n\) times).
Theorem:
\(\mu\) is Max-Sum-Infinitely Divisible iff its CDF-Laplace Transform has the representation
\[ -\log(\phi(y,s)) = as + \iint\limits_{\R \times [0,\infty)} \left(1 - e^{-st} \mathbf 1_{[-\infty, y]}(x)\right) \eta(dx, dt) \]
for some Radon measure \(\eta\) on \(\R \times [0,\infty)\) such that \(\eta(\{(0,-\infty)\}) = 0\) and \(a \ge 0\) (Hees & Scheffler b 16).
Theorem:
The following are equivalent:
Theorem:
In the above situation, the Levy measure has the representation
\[\eta(dx,dt) = \epsilon_0(dx) C \beta x^{-1-\alpha} dx \\ + \mathbf 1_{\R \times [0,\infty)}(x,t) (t^{\beta/\alpha} \omega) (dx) K \beta t^{-1-\beta}\]
(Hees & Scheffler b 16)
Coupled case: Max-Sum-Stable distributions have GEV resp. stable marginals.
Events b happenb at the points of increase of \(E(t)\): \(\mathcal R = \{D(r): r \ge 0\}\) of \(E(t)\).
This means intermittent periods of high activity, alternating with periods of rest, i.e.B bursts.
Further properties: - \(\mathcal R\) is a fractal of dimension \(\beta\) - Self similarity: \(E(c \cdot) \stackrel{d}{=} c^\beta E(\cdot)\)
\(\newcommand{\R}{\mathbb R} \newcommand{\ms}{\mathsf}\) Let \(J_k\) be i.i.d. with continuous distribution in \(\R\). Then there exist functions \(a(n)\) and \(d(n)\) such that for every \(x \in \R\)
\[\frac{\bigvee_{k=1}^n J_k - d(n)}{a(n)} \Rightarrow \ms A, \quad (n \to \infty)\]
(weak convergence) where \(\ms A\) is a r.v. with GEV (generalized extreme value) distribution with CDF \(\Lambda_\xi(x)\) given by
\[-\log \Lambda_\xi(x) = \begin{cases} (1+\xi x)^{-1/\xi}, & \xi \neq 0 \\ e^{-x}, & \xi = 0 \end{cases}\]
with support \(\{x \in \R: 1 + \xi x > 0\}\).
(Beirlant et al, 2006, Statistics of extremes)
Let \(a(n)\) and \(b(n)\) be as above. Then
\[\frac{\bigvee_{k=1}^{[ct]} J_k - d(c)}{a(c)} \Rightarrow A(t), \quad (c \to \infty)\]
holds (weakly, \(J_1\) topology), where \(A(t)\) is an extreme value process given via
\[\mathbf P[A(t_i) \le x_i, i=1\ldots, d] = \Lambda_\xi(x_1)^{t_1} \Lambda_\xi(x_2)^{t_2 - t_1} \cdots \Lambda_\xi(x_d)^{t_d - t_{d-1}}\]
for ordered \(x_1 < x_2 < \ldots < x_d\).
(Lamperti b 64; Resnick b 75)
Recall the definition of the CTRM process \(M(t) = \bigvee_{k=1}^{N(t)} J_k\).
Hence we need a functional scaling limit of \(N(t)\).
Donsker-type functional limit theorem (e.g.B Whitt 2010, Meerschaert & Sikorskii 2011):
Suppose that \(W_i\) are i.i.d. and \(\mathbf P(W_n>t) \sim ct^{-\beta}\) for \(t \to \infty\), where \(c>0\) and \(0<\beta<1\).
Then
\[(W_1 + \ldots + W_{[ct]}) / b(c) \Rightarrow D(t), \quad c \to \infty\]
for a regularly varying norming sequence \(b(n) \in RV(1/\beta)\). The limit process \(D(t)\) is a stable subordinator (one-sided, stable Levy process).
Let \(W_i\) be i.i.d. and heavy-tailed as above, and
\[N(t) = \min\{n: W_1 + \ldots + W_n > t\} - 1\]
Then
\[N(ct) / \tilde b(c) \Rightarrow E(t)\]
where \(E(t) = \inf\{r: D(r) > t\}\) is the inverse stable subordinator (Meerschaert & Straka, b 13).
Here the norming sequence \(\tilde b(c) \in RV(\beta)\) is asymptotically inverse to \(b(c)\) (Seneta, b 76), i.e.
\[\tilde b(b(c)) \sim c \sim b(\tilde b(c))\]