funcnet.coupling_analysis

Provides classes for analyzing spatially embedded complex networks, handling multivariate data. Written by Jakob Runge.

class pyunicorn.funcnet.coupling_analysis.CouplingAnalysis(data, silence_level=0)[source]

Bases: object

Contains methods to calculate coupling matrices from large arrays of scalar time series. Comprises linear and information-theoretic measures, lagged and directed couplings.

__init__(data, silence_level=0)[source]

Initialize an instance of CouplingAnalysis from data array.

Parameters:
  • data (multidimensional numpy array) – The time series array, with time along the first dimension.

  • silence_level (int >= 0) – Higher values produce less progress output.

__str__()[source]

Return a string representation of the CouplingAnalysis object.

__weakref__

list of weak references to the object

static _par_corr_to_cmi(par_corr)[source]

Transformation of partial correlation to conditional mutual information scale using the (multivariate) Gaussian assumption.

Parameters:

par_corr (float or array) – partial correlation

Return type:

float

Returns:

transformed partial correlation.
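
For jointly Gaussian variables, conditional mutual information is a monotone function of the partial correlation, \(I = -\tfrac{1}{2}\ln(1 - \rho^2)\) in nats. The sketch below only illustrates that relation for a scalar input. The name par_corr_to_cmi_sketch is hypothetical and not part of pyunicorn, and the actual method may treat arrays and edge cases differently.

```python
import math


def par_corr_to_cmi_sketch(par_corr):
    """Map a partial correlation in (-1, 1) to Gaussian CMI in nats.

    Illustrative helper (assumed name), not pyunicorn's implementation.
    """
    return -0.5 * math.log(1.0 - par_corr ** 2)
```

The transformation is symmetric in the sign of the partial correlation and grows without bound as the absolute value approaches 1.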

static _quantile_bin_array(array, bins=6)[source]

Returns a symbolized array using equal-quantile binning.

This partition results in a uniform distribution of the marginals.

Parameters:
  • array (array) – data

  • bins (int) – number of bins

Return type:

array

Returns:

converted data
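
As a rough illustration of equal-quantile binning, the sketch below assigns each value of a one-dimensional series a symbol according to its rank, so that every bin receives (nearly) the same number of samples. It is a pure-Python simplification under stated assumptions: quantile_bin_sketch is a hypothetical name, ties are broken by position, and pyunicorn's handling of ties and multidimensional arrays may differ.

```python
def quantile_bin_sketch(values, bins=6):
    """Return one integer symbol in 0..bins-1 per value, by rank.

    Illustrative helper (assumed name), not pyunicorn's implementation.
    """
    n = len(values)
    # Indices of the samples, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    symbols = [0] * n
    for rank, idx in enumerate(order):
        # Ranks 0..n-1 are mapped onto bins of (nearly) equal population.
        symbols[idx] = rank * bins // n
    return symbols


symbols = quantile_bin_sketch([0.3, 2.1, -1.0, 0.9, 5.5, 1.7], bins=3)
```

Because the symbols depend only on ranks, the marginal distribution of the symbols is uniform regardless of the distribution of the input values.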

static bincount_hist(symb_array)[source]

Computes histogram from symbolic array.

Parameters:

symb_array (array of integers) – symbolic data

Return type:

array

Returns:

(unnormalized) histogram

static create_plogp(T)[source]

Precalculation of p*log(p) needed for entropies.

Parameters:

T (int) – sample length

Return type:

array

Returns:

p*log(p) array from p=1 to p=T
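
One way such a table can be used: for a histogram with integer counts \(n_i\) summing to \(T\), the entropy is \(H = \ln T - \frac{1}{T}\sum_i n_i \ln n_i\), so every term \(n_i \ln n_i\) becomes a table lookup. The sketch below is an assumption-laden illustration, not pyunicorn's code: both function names are hypothetical, and storing 0.0 at index 0 (for empty bins) is a convention chosen here.

```python
import math


def create_plogp_sketch(T):
    """Table whose entry n holds n*log(n) for n = 1..T (entry 0 is 0.0)."""
    return [0.0] + [n * math.log(n) for n in range(1, T + 1)]


def entropy_from_counts(counts, plogp):
    """Shannon entropy in nats of an unnormalized integer histogram."""
    total = sum(counts)
    return math.log(total) - sum(plogp[c] for c in counts) / total


plogp = create_plogp_sketch(4)
h_uniform = entropy_from_counts([2, 2], plogp)  # two equally filled bins
```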

cross_correlation(tau_max=0, lag_mode='max')[source]

Return cross correlation between all pairs of nodes.

Two lag modes are available (default: lag_mode='max'):

lag_mode = 'all': Return a 3-dimensional array of lagged cross correlations between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau \ne 0\).

lag_mode = 'max': Return the absolute maxima and corresponding lags of the lagged cross correlation (CC) between all pairs of nodes, as two usually asymmetric matrices of CC values and lags. In each matrix, an entry \((i, j)\) corresponds to the (positive or negative) value and lag, respectively, at the absolute maximum of \(\rho(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=5, lag_mode='max')
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.   ,  0.757 ,  0.779 ,  0.7536],
       [ 0.4847,  1.    ,  0.4502,  0.5197],
       [ 0.6219,  0.5844,  1.    ,  0.5992],
       [ 0.4827,  0.5509,  0.4996,  1.    ]]),
 array([[0, 4, 1, 2], [0, 0, 0, 0], [0, 3, 0, 1], [0, 2, 0, 0]]))

Parameters:
  • tau_max (int [int>=0]) – maximum lag of cross correlation lag function.

  • lag_mode (str [('max'|'all')]) – lag-mode of cross correlations to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
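
To make the lag convention concrete, the following pure-Python sketch computes \(\rho(X_{t-\tau}, Y_t)\) for one pair of series and picks the lag with the largest absolute value, which is what one entry of the 'max' output represents. The helper names are hypothetical, and details such as normalization and the number of overlapping samples used per lag are assumptions that may differ from pyunicorn.

```python
import math


def lagged_corr_sketch(x, y, tau):
    """Pearson correlation between x[t - tau] and y[t] for tau >= 0."""
    n = len(x) - tau
    a, b = x[:n], y[tau:]  # pairs (x[t - tau], y[t])
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def absmax_over_lags_sketch(x, y, tau_max):
    """Return (value, lag) at the absolute maximum over lags 0..tau_max."""
    values = [lagged_corr_sketch(x, y, tau) for tau in range(tau_max + 1)]
    best = max(range(tau_max + 1), key=lambda tau: abs(values[tau]))
    return values[best], best


# y repeats x with a delay of two steps, so the peak should sit at lag 2.
x = [1, 4, 2, 8, 5, 7, 3, 9, 6, 0, 2, 5]
y = [0, 0] + x[:-2]
value, lag = absmax_over_lags_sketch(x, y, tau_max=3)
```

Swapping the roles of x and y probes the opposite direction, which is why the resulting matrices are generally asymmetric.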

static get_nearest_neighbors(array, xyz, k, standardize=True)[source]

Returns nearest-neighbors for conditional mutual information estimator.

Reference: [Kraskov2004]

Parameters:
  • array (array (float)) – data array.

  • xyz (array [int(0|1|2)]) – identifier of X, Y, Z in CMI

  • k (int [int>=1]) – nearest-neighbor MI estimation parameter.

  • standardize (bool) – standardize array before estimation. (default: True)

Return type:

tuple of arrays

Returns:

nearest neighbors for each sample point.

information_transfer(tau_max=0, estimator='knn', knn=10, past=1, cond_mode='ity', lag_mode='max')[source]

Return bivariate information transfer between all pairs of nodes.

Two condition modes of information transfer are available, as described in [Runge2012b].

Information transfer to Y (ITY):
\[I(X^i_{t-\tau}, X^j_t | X^j_{t-1}, \ldots, X^j_{t-\text{past}})\]
Momentary information transfer (MIT):
\[I(X^i_{t-\tau}, X^j_t | X^j_{t-1}, \ldots, X^j_{t-\text{past}}, X^i_{t-\tau-1}, \ldots, X^i_{t-\tau-\text{past}})\]

Two estimators are available:

estimator = 'knn' (recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. Larger k gives smaller variance but larger (typically negative) bias, and vice versa.

estimator = 'gauss': Captures only the linear part of the association. Essentially estimates a transformed partial correlation.

Two lag modes are available (default: lag_mode='max'):

lag_mode = 'all': Return a 3-dimensional array of lag functions between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t | ...)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau \ne 0\).

lag_mode = 'max': Return the absolute maxima and corresponding lags of the lag functions between all pairs of nodes, as two usually asymmetric matrices of values and lags. In each matrix, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t | ...)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.information_transfer(
...     tau_max=5, estimator='knn', knn=10)
>>> r((similarity_matrix, lag_matrix))
(array([[ 0.    ,  0.1544,  0.3261,  0.3047],
       [ 0.0218,  0.    ,  0.0394,  0.0976],
       [ 0.0134,  0.0663,  0.    ,  0.1502],
       [ 0.0066,  0.0694,  0.0401,  0.    ]]),
 array([[0, 2, 1, 2], [5, 0, 0, 0], [5, 1, 0, 1], [5, 0, 0, 0]]))

Parameters:
  • tau_max (int [int>=0]) – maximum lag of ITY lag function.

  • past (int [int>=1]) – maximum lag of past history.

  • knn (int [int>=1]) – nearest-neighbor ITY estimation parameter. (default: 10)

  • estimator (str [('knn'|'gauss')]) – ITY estimator. (default: ‘knn’)

  • cond_mode (str [('ity'|'mit')]) – condition mode. (default: ‘ity’)

  • lag_mode (str [('max'|'all')]) – lag-mode of ITY to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
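
The ITY definition can be illustrated with the Gaussian estimator for past=1, where \(I(X_{t-\tau}, Y_t | Y_{t-1})\) reduces to a transformed partial correlation. The sketch below is a pure-Python illustration under stated assumptions, not pyunicorn's implementation: the helper names, the single conditioning variable, and the synthetic test process (X driving Y at lag 1) are all choices made here.

```python
import math
import random


def corr_sketch(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def gauss_ity_sketch(x, y, tau):
    """Gaussian estimate of I(X_{t-tau}, Y_t | Y_{t-1}) in nats (past=1)."""
    start = max(tau, 1)
    xs = [x[t - tau] for t in range(start, len(y))]
    ys = [y[t] for t in range(start, len(y))]
    zs = [y[t - 1] for t in range(start, len(y))]  # conditioning variable
    r_xy, r_xz, r_yz = corr_sketch(xs, ys), corr_sketch(xs, zs), corr_sketch(ys, zs)
    # First-order partial correlation of X and Y given Z.
    par_corr = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1.0 - par_corr ** 2)


# Synthetic process (an assumption for illustration): X drives Y at lag 1.
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.gauss(0, 1))

forward = gauss_ity_sketch(x, y, tau=1)   # direction X --> Y
backward = gauss_ity_sketch(y, x, tau=1)  # direction Y --> X
```

Conditioning on the target's own past removes the part of the lagged dependence that is explained by autocorrelation in Y, so only the forward direction should yield a value clearly above zero in this setup.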

mutual_information(tau_max=0, estimator='knn', knn=10, bins=6, lag_mode='max')[source]

Return mutual information (MI) between all pairs of nodes.

Three estimators are available:

estimator = 'knn' (recommended): Based on k-nearest-neighbors [Kraskov2004], version 1 in their paper. Larger k gives smaller variance but larger (typically negative) bias, and vice versa.

estimator = 'binning': Binning estimator based on equal-quantile binning.

estimator = 'gauss': Captures only the linear part of the association. Essentially estimates a transformed partial correlation.

Two lag modes are available (default: lag_mode='max'):

lag_mode = 'all': Return a 3-dimensional array of lagged MI between all pairs of nodes. An entry \((i, j, \tau)\) corresponds to \(I(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau \ne 0\).

lag_mode = 'max': Return the absolute maxima and corresponding lags of the lagged MI between all pairs of nodes, as two usually asymmetric matrices of MI values and lags. In each matrix, an entry \((i, j)\) corresponds to the value and lag, respectively, at the absolute maximum of \(I(X^i_{t-\tau}, X^j_t)\) for positive lags \(\tau\), i.e., the direction i --> j for \(\tau > 0\). The function symmetrize_by_absmax() can be used to obtain a symmetric matrix.

Reference: [Kraskov2004]

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.mutual_information(
...     tau_max=5, knn=10, estimator='knn')
>>> r(similarity_matrix)
array([[ 4.6505,  0.4387,  0.4652,  0.4126],
       [ 0.147 ,  4.6505,  0.1065,  0.1639],
       [ 0.2483,  0.2126,  4.6505,  0.2204],
       [ 0.1209,  0.199 ,  0.1453,  4.6505]])
>>> lag_matrix
array([[0, 4, 1, 2],
       [0, 0, 0, 0],
       [0, 2, 0, 1],
       [0, 2, 0, 0]], dtype=int8)

Parameters:
  • tau_max (int [int>=0]) – maximum lag of MI lag function.

  • knn (int [int>=1]) – nearest-neighbor MI estimation parameter. (default: 10)

  • bins (int [int>=2]) – binning MI estimation parameter. (default: 6)

  • estimator (str [('knn'|'binning'|'gauss')]) – MI estimator. (default: ‘knn’)

  • lag_mode (str [('max'|'all')]) – lag-mode of MI to return.

Return type:

3D-array or tuple of matrices

Returns:

all-lag array or matrices of value and lag at the absolute maximum.
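
For readers unfamiliar with the nearest-neighbor estimator, the sketch below implements the first estimator of [Kraskov2004] for two scalar series in pure Python: \(I = \psi(k) + \psi(N) - \langle \psi(n_x + 1) + \psi(n_y + 1) \rangle\), with neighbor counts taken inside the max-norm distance to the k-th joint neighbor. It is a brute-force illustration only; the helper names and the synthetic data are assumptions, and pyunicorn's implementation (standardization, lags, neighbor search) may differ.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n, via harmonic numbers."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def knn_mi_sketch(x, y, k=5):
    """Nearest-neighbor estimate of I(X, Y) in nats for scalar series."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Max-norm distances from sample i to all other samples (joint space).
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Marginal neighbors strictly closer than eps.
        n_x = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


# Synthetic data (an assumption for illustration).
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y_dep = [v + 0.1 * rng.gauss(0, 1) for v in x]   # strongly dependent on x
y_ind = [rng.gauss(0, 1) for _ in range(200)]    # independent of x

mi_dep = knn_mi_sketch(x, y_dep)
mi_ind = knn_mi_sketch(x, y_ind)
```

The estimate for independent series fluctuates around zero and can be slightly negative, which is a known property of this estimator rather than an error.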

silence_level

(int >= 0) Higher values produce less progress output.

symmetrize_by_absmax(similarity_matrix, lag_matrix)[source]

Returns symmetrized similarity matrix.

For each pair of entries (i,j) and (j,i), takes the one with the larger absolute value and writes it to both positions, then returns the matrices of measures and lags, which are modified in place. A negative lag for an entry (i,j) in the lag_matrix then indicates a 'direction' j --> i regarding the peak of the lag function, and vice versa for a positive lag.

Example:

>>> coup_ana = CouplingAnalysis(CouplingAnalysis.test_data())
>>> similarity_matrix, lag_matrix = coup_ana.cross_correlation(
...     tau_max=2)
>>> r((similarity_matrix, lag_matrix))
(array([[ 1.    , 0.698 , 0.7788, 0.7535],
        [ 0.4848, 1.    , 0.4507, 0.52  ],
        [ 0.6219, 0.5704, 1.    , 0.5996],
        [ 0.4833, 0.5503, 0.5002, 1.    ]]),
 array([[0, 2, 1, 2], [0, 0, 0, 0],
        [0, 2, 0, 1], [0, 2, 0, 0]]))
>>> r(coup_ana.symmetrize_by_absmax(similarity_matrix, lag_matrix))
(array([[ 1.    , 0.698 , 0.7788, 0.7535],
        [ 0.698 , 1.    , 0.5704, 0.5503],
        [ 0.7788, 0.5704, 1.    , 0.5996],
        [ 0.7535, 0.5503, 0.5996, 1.    ]]),
 array([[ 0, 2, 1, 2], [-2, 0, -2, -2],
        [-1, 2, 0, 1], [-2, 2, -1, 0]]))

Parameters:
  • similarity_matrix (array-like [float]) – array-like [node, node] matrix of similarity estimates

  • lag_matrix (array-like [int>=0]) – array-like [node, node] matrix of lags

Return type:

tuple of arrays

Returns:

the value at the absolute maximum and the (positive or negative) lag.
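
The symmetrization rule can be sketched in a few lines of pure Python operating on nested lists. This is an illustration of the behavior described above, not pyunicorn's code: the function name is hypothetical, and keeping the (i,j) entry when both absolute values are equal is an assumption. The input is the tau_max=2 cross-correlation result from the example above.

```python
def symmetrize_by_absmax_sketch(similarity, lags):
    """Symmetrize both matrices in place and return them."""
    n = len(similarity)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(similarity[i][j]) >= abs(similarity[j][i]):
                # Peak belongs to direction i --> j; mirror it with negated lag.
                similarity[j][i] = similarity[i][j]
                lags[j][i] = -lags[i][j]
            else:
                # Peak belongs to direction j --> i.
                similarity[i][j] = similarity[j][i]
                lags[i][j] = -lags[j][i]
    return similarity, lags


sim = [[1.0, 0.698, 0.7788, 0.7535],
       [0.4848, 1.0, 0.4507, 0.52],
       [0.6219, 0.5704, 1.0, 0.5996],
       [0.4833, 0.5503, 0.5002, 1.0]]
lag = [[0, 2, 1, 2], [0, 0, 0, 0], [0, 2, 0, 1], [0, 2, 0, 0]]
sym_sim, sym_lag = symmetrize_by_absmax_sketch(sim, lag)
```

Applied to these inputs, the sketch reproduces the symmetrized similarity and lag matrices shown in the example output above.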

static test_data()[source]

Return example test data as discussed in the pyunicorn description paper.