Multivariate Point Process Package
0.1
|
PlainHawkes implements the standard multivariate Hawkes process. More...
#include "include/PlainHawkes.h"
Classes | |
struct | OPTION |
Options used to configure the fitting of the standard Hawkes Process. More... | |
Public Types | |
enum | OptMethod { SGD, PLBFGS } |
Optimization algorithms used to fit the standard Hawkes Process. More... | |
enum | Regularizer { L1, L22, NUCLEAR, NONE } |
Supported regularizations used to fit the standard Hawkes Process. More... | |
enum | RegCoef { LAMBDA, BETA } |
Regularization coefficients. More... | |
Public Member Functions | |
PlainHawkes (const unsigned &n, const unsigned &num_dims, const Eigen::MatrixXd &Beta) | |
The constructor. More... | |
void | fit (const std::vector< Sequence > &data, const OPTION &options) |
Maximum likelihood estimation for the model parameters. More... | |
void | fit (const std::vector< Sequence > &data, const OPTION &options, const Eigen::VectorXd &trueparameters) |
Maximum likelihood estimation for the model parameters. More... | |
virtual void | NegLoglikelihood (double &objvalue, Eigen::VectorXd &gradient) |
Negative loglikelihood of Hawkes process. More... | |
virtual void | Gradient (const unsigned &k, Eigen::VectorXd &gradient) |
Gradient on the \(k\)-th data sample. More... | |
virtual double | Intensity (const double &t, const Sequence &data, Eigen::VectorXd &intensity_dim) |
Intensity function of the standard multi-dimensional Hawkes process. More... | |
virtual double | IntensityUpperBound (const double &t, const double &L, const Sequence &data, Eigen::VectorXd &intensity_upper_dim) |
The upper bound of the intensity function between \([t, t + L]\). More... | |
virtual double | IntensityIntegral (const double &lower, const double &upper, const Sequence &data) |
The integral of the intensity function within the range \([\text{lower}, \text{upper}]\). More... | |
virtual double | PredictNextEventTime (const Sequence &data, const unsigned &num_simulations) |
Predict the next event timing by the expectation \(\int_{t_n}^\infty tf^*(t)dt\). Currently, we use the sample average by simulations to approximate the expectation since the conditional density \(f^*(t)\) normally does not have an analytic form. More... | |
void | Simulate (const std::vector< double > &vec_T, std::vector< Sequence > &sequences) |
Fast simulation method based on the property of the exponential triggering kernel using dynamic programming. More... | |
void | Simulate (const unsigned &n, const unsigned &num_sequences, std::vector< Sequence > &sequences) |
Fast simulation method based on the property of the exponential triggering kernel using dynamic programming. More... | |
Public Member Functions inherited from IProcess | |
IProcess (const unsigned &n, const unsigned &num_dims) | |
The constructor. More... | |
const Eigen::VectorXd & | GetParameters () |
Return the column vector of model parameters. More... | |
unsigned | GetNumDims () |
Return the number of dimensions in the process. More... | |
void | SetParameters (const Eigen::VectorXd &v) |
Set the model parameters. More... | |
void | PlotIntensityFunction (const Sequence &data) |
Plots the intensity functions based on the given sequence. It plots the intensity function and the associated event points up of each dimension in the same figure. More... | |
void | PlotIntensityFunction (const Sequence &data, const unsigned &dim_id) |
Plots the intensity function and the associated event points of the dimension dim_id. More... | |
Protected Member Functions | |
void | Initialize (const std::vector< Sequence > &data) |
Initialize the internal feature varables all_exp_kernel_recursive_sum_ and intensity_itegral_features_. More... | |
void | RestoreOptionToDefault () |
Restores the optimization methods and regualrization into their default values. More... | |
unsigned | AssignDim (const Eigen::VectorXd &lambda) |
Auxiliary function of sampling dimension for fast simulation of Hawkes process. More... | |
void | UpdateExpSum (double t, const Eigen::VectorXd &last_event_per_dim, Eigen::MatrixXd &expsum) |
Auxiliary function of updating the intensity function of each dimension for fast simulation of Hawkes process. More... | |
Protected Member Functions inherited from IProcess | |
void | InitializeDimension (const std::vector< Sequence > &data) |
Protected Attributes | |
Eigen::MatrixXd | Beta_ |
Beta_ is a D-by-D matrix, which stores the \(\{\beta_{mn}\geq 0\}\) decaying rates of the exponential kernels. More... | |
std::vector< std::vector< std::vector< Eigen::VectorXd > > > | all_exp_kernel_recursive_sum_ |
all_exp_kernel_recursive_sum_ stores the cumulative influence between dimensions. More... | |
std::vector< Eigen::MatrixXd > | intensity_itegral_features_ |
intensity_itegral_features_ stores the integral of each exponential kernel. More... | |
Eigen::VectorXd | observation_window_T_ |
observation_window_T_ stores the observation window for each sequence. More... | |
SimpleRNG | RNG_ |
RNG_ implements a simple random number generator. More... | |
unsigned | num_sequences_ |
num_sequences_ is the total number of sequences in the given data. More... | |
OPTION | options_ |
A configuration object which saves the optimization options. More... | |
Protected Attributes inherited from IProcess | |
Eigen::VectorXd | parameters_ |
A column vector represents all model parameters of the process. More... | |
unsigned | num_dims_ |
The total number of dimensions of the process. More... | |
std::vector< std::vector< std::vector< double > > > | all_timestamp_per_dimension_ |
all_timestamp_per_dimension_ is a 3-d array where all_timestamp_per_dimension_[c][n][i] records the i-th event on the n-th dimension in the c-th sequence. More... | |
PlainHawkes implements the standard multivariate Hawkes process.
A multivariate Hawkes Process is a process where the occurrence of an event to a dimension will trigger more events on this dimension and other related dimensions in the near future. The intensity function of each dimension of the Hawkes process is defined as the following :
\begin{align} \lambda^n(t) = \lambda_0^n + \sum_{m=1}^D\alpha_{mn}\sum_{t^m_j < t}\exp(-\beta_{mn}(t - t^m_j)), \end{align}
where \(\lambda_0^n\geq 0\) is the base intensity, \(D\) is the number of dimensions, and \(\alpha_{mn}\geq 0\) captures the extent to which an event on dimension m can trigger an event on dimension n in the near future. The collection of \(\{\alpha_{mn}\}\) can be represented as a matrix \(\mathbf{A}(m,n) = \alpha_{mn}\), and the collection of \(\{\lambda_0^n\}\) can be represented as a column vector \(\boldsymbol{\lambda}_0\).
\(\boldsymbol{\lambda}_0\) and \(\mathbf{A}\) are the model parameters of a multivariate Hawkes process. The collection of \(\{\beta_{mn}\geq 0\}\) is often given in advance.
Definition at line 25 of file PlainHawkes.h.
Optimization algorithms used to fit the standard Hawkes Process.
Enumerator | |
---|---|
SGD |
stochastic gradient descend. |
PLBFGS |
Definition at line 105 of file PlainHawkes.h.
enum PlainHawkes::RegCoef |
Regularization coefficients.
Enumerator | |
---|---|
LAMBDA |
Regularization coefficient for \(\|\boldsymbol{\lambda}_0\|\) |
BETA |
Regularization coefficient for \(\|\mathbf{A}\|\) |
Definition at line 141 of file PlainHawkes.h.
Supported regularizations used to fit the standard Hawkes Process.
Definition at line 119 of file PlainHawkes.h.
|
inline |
The constructor.
[in] | n | the number of parameters in total. |
[in] | num_dims | the number of dimensions in the process. |
[in] | Beta | a D-by-D matrix, which stores the \(\{\beta_{mn}\geq 0\}\) decaying rates of the exponential kernels. |
Definition at line 204 of file PlainHawkes.h.
|
protected |
Auxiliary function of sampling dimension for fast simulation of Hawkes process.
lambda | A column vector where each component is the intensity of the respective dimension. |
Definition at line 494 of file PlainHawkes.cc.
Maximum likelihood estimation for the model parameters.
[in] | data | vectors of observed sequences. |
[in] | options | data structure sotring different configuration for the optimization algorithm and the respective regularizations. |
Definition at line 355 of file PlainHawkes.cc.
void PlainHawkes::fit | ( | const std::vector< Sequence > & | data, |
const OPTION & | options, | ||
const Eigen::VectorXd & | trueparameters | ||
) |
Maximum likelihood estimation for the model parameters.
[in] | data | vectors of observed sequences. |
[in] | options | data structure sotring different configuration for the optimization algorithm and the respective regularizations. |
[in] | trueparameters | a column vector storing the true parameters of the Hawkes process used to generate the set of observed sequences in data to compare. |
Definition at line 395 of file PlainHawkes.cc.
|
virtual |
Gradient on the \(k\)-th data sample.
[in] | data | sample index. |
[out] | gradient | of the objective function w.r.t. the current data sample |
Implements IProcess.
Definition at line 302 of file PlainHawkes.cc.
|
protected |
Initialize the internal feature varables all_exp_kernel_recursive_sum_ and intensity_itegral_features_.
[in] | data | A collection of input sequences. |
Definition at line 19 of file PlainHawkes.cc.
|
virtual |
Intensity function of the standard multi-dimensional Hawkes process.
[in] | t | the given time t. |
[in] | data | the sequence of past events. |
[out] | intensity_dim | a column vector where the n-th component is the intensity function of dimension n at the current given time t, that is, \begin{align} \text{intensity_dim[n]} = \lambda^n(t) = \lambda^n_0 + \sum_{m=1}^D\alpha_{mn}\sum_{t^m_j < t}\exp(-\beta_{mn}(t - t^m_j)). \end{align} |
Implements IProcess.
Definition at line 111 of file PlainHawkes.cc.
|
virtual |
The integral of the intensity function within the range \([\text{lower}, \text{upper}]\).
[in] | lower | the inegeral lower bound. |
[in] | upper | the integral upper bound. |
[in] | data | the sequence of past events. |
Implements IProcess.
Definition at line 447 of file PlainHawkes.cc.
|
virtual |
The upper bound of the intensity function between \([t, t + L]\).
[in] | t | the given time t. |
[in] | L | the length we look into the future starting from the time t. |
[in] | data | the sequence of past events. |
[out] | intensity_upper_dim | a column vector where the n-th component is the upper bound for the n-th dimension between \([t, t + L]\). |
Implements IProcess.
Definition at line 146 of file PlainHawkes.cc.
|
virtual |
Negative loglikelihood of Hawkes process.
[out] | objvalue | negative loglikelihood. |
[out] | gradient | gradient of the parameters. |
Implements IProcess.
Definition at line 180 of file PlainHawkes.cc.
|
virtual |
Predict the next event timing by the expectation \(\int_{t_n}^\infty tf^*(t)dt\). Currently, we use the sample average by simulations to approximate the expectation since the conditional density \(f^*(t)\) normally does not have an analytic form.
[in] | data | the sequence of past events. |
[in] | num_simulations | number of simulations we use to calculate the sample average. |
Implements IProcess.
Definition at line 435 of file PlainHawkes.cc.
|
protected |
Restores the optimization methods and regualrization into their default values.
Definition at line 485 of file PlainHawkes.cc.
void PlainHawkes::Simulate | ( | const std::vector< double > & | vec_T, |
std::vector< Sequence > & | sequences | ||
) |
Fast simulation method based on the property of the exponential triggering kernel using dynamic programming.
The intensity function on the n-th dimension is defined as \(\lambda^n(t) = \lambda^n_0 + \sum_{m=1}^D\alpha_{mn}\sum_{t^m_j < t}\exp(-\beta_{mn}(t - t^m_j))\). Let \(A_{mn}(t) = \sum_{t^m_j < t}exp(-\beta_{mn}(t - t^m_j))\). Then, \(A_{mn}(t) = \exp(-\beta_{mn}(t - t^n_i))A_{mn}(t^n_i) + \sum_{t^n_i\leq t^m_j < t}\exp(-\beta_{mn}(t - t^m_j))\).
[in] | vec_T | a columne vector where each component is the observation window for each simulated sequence. We can simulate multiple sequences. Each component of the column vector vec_T is the given observation window of the respective sequence, so the dimension of vec_T is equal to the number of simulated sequences in total. |
[out] | sequences | a vector of simulated sequences. |
Definition at line 531 of file PlainHawkes.cc.
void PlainHawkes::Simulate | ( | const unsigned & | n, |
const unsigned & | num_sequences, | ||
std::vector< Sequence > & | sequences | ||
) |
Fast simulation method based on the property of the exponential triggering kernel using dynamic programming.
[in] | n | number of events per simulated sequence. |
[in] | num_sequences | number of simulated sequences. |
[out] | sequences | a vector of simulated sequences. |
Definition at line 634 of file PlainHawkes.cc.
|
protected |
Auxiliary function of updating the intensity function of each dimension for fast simulation of Hawkes process.
[in] | t | current given time. |
[in] | last_event_per_dim | a column vector where each component is the last event of the respective dimension. |
[out] | expsum | cumulative influence between dimensions at time t. \(\text{expsum(m,n)(t)} = \sum_{t^m_j < t}\exp(-\beta_{mn}(t - t^m_j)) = \exp(-\beta_{mn}(t - t^m_{n_m}))(1 + \text{expsum(m,n)}(t^m_{n_m}))\) where \(n_m\) is the index of the last event on dimension m. |
Definition at line 519 of file PlainHawkes.cc.
|
protected |
all_exp_kernel_recursive_sum_ stores the cumulative influence between dimensions.
For each sequence c, given a pair of dimension m and n, and a point \(t^{c,n}_{i}\) on the dimension n,
\begin{align} \text{all_exp_kernel_recursive_sum_[c][m][n][i]} & = \sum_{t^{c,m}_{j} < t^{c,n}_{i}}\exp(-\beta_{mn}(t^{c,n}_i - t^{c,m}_j))\\ & = \exp(-\beta_{mn}(t^{c,n}_i - t^{c,n}_{i-1}))\cdot\text{all_exp_kernel_recursive_sum_[c][m][n][i - 1]}\quad + \sum_{t^{c,n}_{i-1}\leq t^{c,m}_j < t^{c,n}_i} \exp(-\beta_{mn}(t^{c,n}_i - t^{c,m}_j)) \end{align}
which is the sum of the influence from all past events \(t^{c,m}_{j} < t^{c,n}_{i}\) on the dimension m in the sequence c.
Definition at line 45 of file PlainHawkes.h.
|
protected |
Beta_ is a D-by-D matrix, which stores the \(\{\beta_{mn}\geq 0\}\) decaying rates of the exponential kernels.
Definition at line 33 of file PlainHawkes.h.
|
protected |
intensity_itegral_features_ stores the integral of each exponential kernel.
For each sequence c, given a pair of dimension m and n,
\begin{align} \text{intensity_itegral_features_[c](m,n)} = \sum_{t^{c,m}_j < T^c}(1 - \exp(-\beta_{mn}(T_c - t^{c,m}_j))) \end{align}
Definition at line 55 of file PlainHawkes.h.
|
protected |
num_sequences_ is the total number of sequences in the given data.
Definition at line 72 of file PlainHawkes.h.
|
protected |
observation_window_T_ stores the observation window for each sequence.
For each sequence c, \(\text{observation_window_T_}(c) = T_c\).
Definition at line 62 of file PlainHawkes.h.
|
protected |
A configuration object which saves the optimization options.
Definition at line 194 of file PlainHawkes.h.
|
protected |
RNG_ implements a simple random number generator.
Definition at line 67 of file PlainHawkes.h.