climate.climate_data

Provides classes for generating and analyzing complex climate networks.

class pyunicorn.climate.climate_data.ClimateData(observable, grid, time_cycle, anomalies=False, observable_name='', observable_long_name=None, window=None, silence_level=0)[source]

Bases: Data, Cached

Encapsulates spatio-temporal climate data.

Provides methods to manipulate these data, e.g., to calculate daily (or monthly) mean values and anomaly values.

data_source (str) – The name of the data source (model, reanalysis, station).

classmethod Load(file_name, observable_name, file_type='NetCDF', dimension_names=None, window=None, vertical_level=None, silence_level=0, time_cycle=None, data_source=None)[source]

Initialize an instance of ClimateData.

Supported values of file_type are:
  • “NetCDF” for regular (rectangular) grids

  • “iNetCDF” for irregular (e.g. geodesic) grids or station data.

The spatio-temporal window is described by the following dictionary:

window = {"time_min": 0., "time_max": 0., "lat_min": 0.,
          "lat_max": 0., "lon_min": 0., "lon_max": 0.}
Parameters:
  • file_name (str) – The name of the data file.

  • observable_name (str) – The short name of the observable within data file (particularly relevant for NetCDF).

  • file_type (str) – The format of the data file.

  • dimension_names (dict) – The names of the dimensions as used in the NetCDF file. Default: {"lat": "lat", "lon": "lon", "time": "time"}

  • window (dict) – Spatio-temporal window to select a view on the data.

  • vertical_level (int) – The vertical level to be extracted from the data file. Ignored for horizontal data sets. If None, the first level in the data file is chosen.

  • silence_level (int) – The inverse level of verbosity of the object.

  • time_cycle (int) – The annual cycle length of the data (units of samples). NOTE: Although the signature defaults to None, this argument is required.

  • data_source (str) – The name of the data source (model, reanalysis, station).
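
The window dictionary and the keyword arguments for Load can be assembled ahead of the call. The following is a minimal sketch: make_window is a hypothetical helper (not part of pyunicorn), the file and variable names are placeholders, and time_cycle=12 assumes monthly data.

```python
def make_window(time=(0.0, 0.0), lat=(0.0, 0.0), lon=(0.0, 0.0)):
    """Build the six-key window dictionary (hypothetical helper)."""
    window = {}
    for name, (lower, upper) in (("time", time), ("lat", lat), ("lon", lon)):
        # This check belongs to the sketch, not to the library.
        if lower > upper:
            raise ValueError(f"{name}_min must not exceed {name}_max")
        window[f"{name}_min"] = float(lower)
        window[f"{name}_max"] = float(upper)
    return window


window = make_window(lat=(10, 20), lon=(5, 10))

# Keyword arguments following the Load signature documented above.
# "example.nc" and "air" are placeholder names.
load_kwargs = dict(file_name="example.nc", observable_name="air",
                   file_type="NetCDF", window=window, time_cycle=12)
# data = ClimateData.Load(**load_kwargs)  # needs pyunicorn and a real file
```

Leaving the time boundaries at their equal defaults keeps the full time range, in line with the window semantics described for set_window().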

static SmallTestData()[source]

Return test data set of 6 time series with 10 sampling points each.

Example:

>>> r(ClimateData.SmallTestData().observable())
array([[ 0.    ,  1.    ,  0.    , -1.    , -0.    ,  1.    ],
       [ 0.309 ,  0.9511, -0.309 , -0.9511,  0.309 ,  0.9511],
       [ 0.5878,  0.809 , -0.5878, -0.809 ,  0.5878,  0.809 ],
       [ 0.809 ,  0.5878, -0.809 , -0.5878,  0.809 ,  0.5878],
       [ 0.9511,  0.309 , -0.9511, -0.309 ,  0.9511,  0.309 ],
       [ 1.    ,  0.    , -1.    , -0.    ,  1.    ,  0.    ],
       [ 0.9511, -0.309 , -0.9511,  0.309 ,  0.9511, -0.309 ],
       [ 0.809 , -0.5878, -0.809 ,  0.5878,  0.809 , -0.5878],
       [ 0.5878, -0.809 , -0.5878,  0.809 ,  0.5878, -0.809 ],
       [ 0.309 , -0.9511, -0.309 ,  0.9511,  0.309 , -0.9511]])
Return type:

ClimateData instance

Returns:

a ClimateData instance for testing purposes.

__cache_state__() Tuple[Hashable, ...][source]

Hashable tuple of mutable object attributes, which will determine the instance identity for ALL cached method lookups in this class, in addition to the built-in object id(). Returning an empty tuple amounts to declaring the object immutable in general. Mutable dependencies that are specific to a method should instead be declared via @Cached.method(attrs=(…)).

NOTE: A subclass is responsible for the consistency and cost of this state descriptor. For example, hashing a large array attribute may be circumvented by declaring it as a property, with a custom setter method that increments a dedicated mutation counter.
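
The mutation-counter idea from the note can be sketched in plain Python. The class below is hypothetical and does not derive from Cached; it only shows how a property setter can keep a cheap, hashable state descriptor in step with reassignments of a large attribute.

```python
from typing import Hashable, Tuple


class WindowedArray:
    """Hypothetical class illustrating the mutation-counter pattern."""

    def __init__(self, values):
        self._mut_values = 0    # dedicated mutation counter
        self._values = values   # set directly, so the counter stays at 0

    @property
    def values(self):
        return self._values

    @values.setter
    def values(self, new_values):
        # Every reassignment bumps the counter, so results cached for the
        # previous value can be recognised as stale.
        self._values = new_values
        self._mut_values += 1

    def __cache_state__(self) -> Tuple[Hashable, ...]:
        # The counter stands in for the (possibly large) array itself.
        return (self._mut_values,)


obj = WindowedArray([1.0, 2.0, 3.0])
state_before = obj.__cache_state__()
obj.values = [4.0, 5.0, 6.0]
state_after = obj.__cache_state__()
```

In this sketch only reassignment through the setter is tracked; in-place changes to the stored object would go unnoticed.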

__init__(observable, grid, time_cycle, anomalies=False, observable_name='', observable_long_name=None, window=None, silence_level=0)[source]

Initialize an instance of ClimateData.

The spatio-temporal window is described by the following dictionary:

window = {"time_min": 0., "time_max": 0., "lat_min": 0.,
          "lat_max": 0., "lon_min": 0., "lon_max": 0.}
Parameters:
  • observable (2D array [time, index]) – The array of time series to be represented by the Data instance.

  • grid (Grid2D instance) – The Grid representing the spatial coordinates associated to the time series and their temporal sampling.

  • time_cycle (int) – The annual cycle length of the data (units of samples).

  • anomalies (bool) – Indicates whether the data are climatological anomaly values.

  • observable_name (str) – A short name for the observable.

  • observable_long_name (str) – A long name for the observable.

  • window (dict) – Spatio-temporal window to select a view on the data.

  • silence_level (int) – The inverse level of verbosity of the object.

__str__()[source]

Returns a string representation.

_mut_window

mutation count

anomaly()[source]

Calculate anomaly time series from observable.

To obtain climatological anomaly time series, the climatological means are subtracted from each sample in the original time series. This procedure is also known as phase averaging.

Note

Only the currently selected spatio-temporal window is considered.

Return type:

2D Numpy array [time, node index]

Returns:

the anomalized time series.

Example:

>>> r(ClimateData.SmallTestData().anomaly()[:,0])
array([-0.5 , -0.321 , -0.1106,  0.1106,  0.321 ,
        0.5 ,  0.321 ,  0.1106, -0.1106, -0.321 ])
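
The subtraction of climatological means can be illustrated with plain NumPy. This is a sketch of the idea, not the library's implementation: the function name is hypothetical, trailing incomplete cycles are dropped by assumption, and the input reproduces the first column listed for SmallTestData() with time_cycle=5.

```python
import numpy as np


def anomaly_sketch(observable, time_cycle):
    """Subtract the per-phase mean from each sample (illustrative only)."""
    observable = np.asarray(observable, dtype=float)
    n_years = observable.shape[0] // time_cycle
    # Keep complete cycles only, then view the data as [year, phase, node].
    complete = observable[:n_years * time_cycle]
    cycles = complete.reshape(n_years, time_cycle, -1)
    phase_mean = cycles.mean(axis=0)            # shape [phase, node]
    return (cycles - phase_mean).reshape(complete.shape)


# sin(pi * t / 10) for t = 0..9 matches the first test-data column.
series = np.sin(np.pi * np.arange(10) / 10)[:, np.newaxis]
result = anomaly_sketch(series, time_cycle=5)[:, 0]
```

Rounded to four decimals, result agrees with the values in the example above.
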
anomaly_selected_months(selected_months)[source]

Return anomaly time series from observable for selected months.

For further comments, see anomaly().

Note

Only the currently selected spatio-temporal window is considered.

Parameters:

selected_months ([number]) – The selected months.

Return type:

2D array [time, node index]

Returns:

the anomalized time series for selected months.

indices_selected_months(selected_months)[source]

Return sorted time indices associated to certain months.

Currently, only cycle lengths of 12 (monthly data) and 360 (standardized daily data) are supported.

Note

Only the currently selected spatio-temporal window is considered.

Parameters:

selected_months ([number]) – The selected months.

Return type:

1D array (int)

Returns:

the sorted time indices corresponding to chosen months.
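
How months map to time indices for the two supported cycle lengths can be sketched as follows. The function is hypothetical, and two assumptions are made that the text above does not confirm: months are numbered from 0, and for time_cycle=360 each month spans exactly 30 samples.

```python
import numpy as np


def months_to_indices(n_time, time_cycle, selected_months):
    """Sorted time indices for the given months (illustrative sketch).

    Assumes 0-based month numbers and 30-sample months for time_cycle=360.
    """
    if time_cycle not in (12, 360):
        raise ValueError("only cycle lengths 12 and 360 are handled")
    samples_per_month = time_cycle // 12
    # Month of each sample within its annual cycle.
    month_of_sample = (np.arange(n_time) % time_cycle) // samples_per_month
    return np.flatnonzero(np.isin(month_of_sample, selected_months))


# Two years of monthly data, selecting months 0 and 11.
monthly_idx = months_to_indices(24, 12, [0, 11])

# One year of standardized daily data, selecting month 1.
daily_idx = months_to_indices(360, 360, [1])
```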

indices_selected_phases(selected_phases)[source]

Return sorted time indices associated to certain phase indices.

Note

Only the currently selected spatio-temporal window is considered.

Example:

>>> ClimateData.SmallTestData().indices_selected_phases([0,1,4])
array([0, 1, 4, 5, 6, 9])
Parameters:

selected_phases ([int]) – The selected phase indices.

Return type:

1D array (int)

Returns:

the sorted time indices corresponding to chosen phase indices.
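
The selection can be pictured as a modulo operation on the time index. The sketch below uses a hypothetical function name and assumes that only complete cycles are considered; with 10 samples and time_cycle=5 it reproduces the example output above.

```python
import numpy as np


def phases_to_indices(n_time, time_cycle, selected_phases):
    """Sorted time indices whose phase is in selected_phases (sketch)."""
    n_complete = (n_time // time_cycle) * time_cycle   # complete cycles only
    phase_of_sample = np.arange(n_complete) % time_cycle
    return np.flatnonzero(np.isin(phase_of_sample, selected_phases))


idx = phases_to_indices(n_time=10, time_cycle=5, selected_phases=[0, 1, 4])
```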

phase_indices()[source]

Return time indices associated to all phases in the annual cycle.

In other words, provides all time indices falling into a particular day, month etc. of the year.

Only includes measurements from years for which complete data exist.

Note

Only the currently selected spatio-temporal window is considered.

Example:

>>> ClimateData.SmallTestData().phase_indices()
array([[0, 5], [1, 6], [2, 7], [3, 8], [4, 9]])
Return type:

2D Numpy array (int) [phase index, year]

Returns:

the time indices associated to all phases of the annual cycle.
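
The [phase index, year] layout follows from reshaping a running time index. The following is a sketch with a hypothetical function name, not the library's code; it drops incomplete years, as described above.

```python
import numpy as np


def phase_indices_sketch(n_time, time_cycle):
    """Time indices per phase, shape [phase index, year] (illustrative)."""
    n_years = n_time // time_cycle              # incomplete years are dropped
    by_year = np.arange(n_years * time_cycle).reshape(n_years, time_cycle)
    return by_year.T


indices = phase_indices_sketch(n_time=10, time_cycle=5)

# Two extra samples do not form a complete cycle and are ignored.
partial = phase_indices_sketch(n_time=12, time_cycle=5)
```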

phase_mean()[source]

Calculate mean values of observable for each phase of the annual cycle.

This is also commonly referred to as climatological mean, e.g., the mean temperature for all Januaries in the data set for monthly time resolution (time_cycle=12).

Note

Only the currently selected spatio-temporal window is considered.

Return type:

2D Numpy array [cycle index, node index]

Returns:

the mean values of observable for each phase of the annual cycle.

Example:

>>> r(ClimateData.SmallTestData().phase_mean())
array([[ 0.5   ,  0.5   , -0.5   , -0.5   ,  0.5   ,  0.5   ],
       [ 0.63  ,  0.321 , -0.63  , -0.321 ,  0.63  ,  0.321 ],
       [ 0.6984,  0.1106, -0.6984, -0.1106,  0.6984,  0.1106],
       [ 0.6984, -0.1106, -0.6984,  0.1106,  0.6984, -0.1106],
       [ 0.63  , -0.321 , -0.63  ,  0.321 ,  0.63  , -0.321 ]])
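
The climatological mean amounts to averaging over the year axis once the data are arranged as [year, phase, node]. A minimal NumPy sketch, under the assumption that incomplete trailing cycles are dropped; the function name is hypothetical, and the two input columns reproduce the first two columns of SmallTestData():

```python
import numpy as np


def phase_mean_sketch(observable, time_cycle):
    """Mean over all years for each phase of the cycle (illustrative)."""
    observable = np.asarray(observable, dtype=float)
    n_years = observable.shape[0] // time_cycle
    complete = observable[:n_years * time_cycle]
    cycles = complete.reshape(n_years, time_cycle, -1)   # [year, phase, node]
    return cycles.mean(axis=0)                           # [phase, node]


t = np.arange(10)
observable = np.column_stack([np.sin(np.pi * t / 10), np.cos(np.pi * t / 10)])
means = phase_mean_sketch(observable, time_cycle=5)
```

Rounded to four decimals, means agrees with the first two columns of the example output above.
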
set_global_window()[source]

Set the view on the whole data set.

Selects the full data set and creates a data array as well as a corresponding Grid2D object to access this window from outside.

Example (Set smaller window and subsequently restore global window):

>>> data = ClimateData.SmallTestData()
>>> data.set_window(window={"time_min": 0., "time_max": 4.,
...                 "lat_min": 10., "lat_max": 20.,
...                 "lon_min": 5.,  "lon_max": 10.})
>>> data.grid.grid()["lat"]
array([ 10.,  15.], dtype=float32)
>>> data.set_global_window()
>>> data.grid.grid()["lat"]
array([  0.,   5.,  10.,  15.,  20.,  25.], dtype=float32)
set_window(window)[source]

Set spatio-temporal window.

Calls the set_window method of the parent class Data and additionally sets flags so that measures derived from the data (mean, anomaly) are recalculated for the new window.

The spatio-temporal window is described by the following dictionary:

window = {"time_min": 0., "time_max": 0., "lat_min": 0.,
          "lat_max": 0., "lon_min": 0., "lon_max": 0.}

If the temporal boundaries are equal, the data’s full time range is selected. If either pair of corresponding spatial boundaries (lat_min/lat_max or lon_min/lon_max) is equal, the data’s full spatial extent is included.

For more information see pyunicorn.Data.set_window().

Example:

>>> data = ClimateData.SmallTestData()
>>> data.set_window(window={"time_min": 0., "time_max": 0.,
...                 "lat_min": 10., "lat_max": 20.,
...                 "lon_min": 5.,  "lon_max": 10.})
>>> r(data.anomaly())
array([[ 0.5   , -0.5   ], [ 0.321 , -0.63  ], [ 0.1106, -0.6984],
       [-0.1106, -0.6984], [-0.321 , -0.63  ], [-0.5   ,  0.5   ],
       [-0.321 ,  0.63  ], [-0.1106,  0.6984], [ 0.1106,  0.6984],
       [ 0.321 ,  0.63  ]])
Parameters:

window (dictionary) – The spatio-temporal window to select a view on the data.
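
The rule for equal spatial boundaries can be sketched as a node mask. This is an illustration, not the library's code: node_mask is a hypothetical function, inclusive bounds are an assumption, the latitudes are those shown in the set_global_window() example, and the longitudes are assumed values.

```python
import numpy as np


def node_mask(lat, lon, window):
    """Select nodes inside the spatial part of a window (illustrative)."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    if (window["lat_min"] == window["lat_max"]
            or window["lon_min"] == window["lon_max"]):
        # Equal boundaries in either pair: keep the full spatial extent.
        return np.ones(lat.shape, dtype=bool)
    # Inclusive bounds are an assumption of this sketch.
    return ((lat >= window["lat_min"]) & (lat <= window["lat_max"])
            & (lon >= window["lon_min"]) & (lon <= window["lon_max"]))


lat = np.array([0., 5., 10., 15., 20., 25.])
lon = np.array([2.5, 5., 7.5, 10., 12.5, 15.])   # assumed coordinates

window = {"time_min": 0., "time_max": 0., "lat_min": 10., "lat_max": 20.,
          "lon_min": 5., "lon_max": 10.}
selected = node_mask(lat, lon, window)

# Making the latitude boundaries equal switches off the spatial selection.
everything = node_mask(lat, lon, dict(window, lat_max=10.))
```

With these assumed coordinates, selected picks the two nodes at latitudes 10 and 15.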

shuffled_anomaly()[source]

Return the randomly shuffled anomaly time series.

Each anomaly time series is shuffled individually.

Note

Only the currently selected spatio-temporal window is considered.

Example (Anomaly with and without temporal shuffling should have the same standard deviation along time axis):

>>> r(ClimateData.SmallTestData().anomaly().std(axis=0))
array([ 0.31 , 0.6355, 0.31 , 0.6355, 0.31 , 0.6355])
>>> r(ClimateData.SmallTestData().shuffled_anomaly().std(axis=0))
array([ 0.31 , 0.6355, 0.31 , 0.6355, 0.31 , 0.6355])
Return type:

2D Numpy array [time, node index]

Returns:

the anomalized and shuffled time series.
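
Shuffling each time series individually means permuting every column along the time axis with its own random order. The sketch below uses NumPy's Generator API; the function name and the seed argument are hypothetical, and the input is a stand-in array rather than real anomaly data.

```python
import numpy as np


def shuffle_columns(series, seed=None):
    """Shuffle each column independently along the time axis (sketch)."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    shuffled = np.empty_like(series)
    for node in range(series.shape[1]):
        # A separate permutation per node destroys temporal ordering and
        # cross-node alignment, but keeps each column's set of values.
        shuffled[:, node] = rng.permutation(series[:, node])
    return shuffled


t = np.arange(10)
series = np.column_stack([np.sin(np.pi * t / 10), np.cos(np.pi * t / 10)])
shuffled = shuffle_columns(series, seed=0)
```

Because each column keeps the same set of values, its mean and standard deviation along the time axis are unchanged, which is the property checked in the example above.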

time_cycle

(int) – The annual cycle length of the data (units of samples).