API Reference

TimeSeries

In traces, a TimeSeries is similar to a dictionary that maps times to measurements. One difference is that you can ask for the value at any time, not only at a measurement time. Suppose you’re recording the contents of a grocery cart, keyed by the number of minutes into a shopping trip.

>>> cart = traces.TimeSeries()
>>> cart[1.2] = {'broccoli'}
>>> cart[1.7] = {'broccoli', 'apple'}
>>> cart[2.2] = {'apple'}
>>> cart[3.5] = {'apple', 'beets'}

If you want to know what’s in the cart at 2 minutes, you can simply get the value using cart[2] and you’ll see {'broccoli', 'apple'}, which is the most recent measurement at or before that time (the one at 1.7). By default, if you ask for a time before the first measurement, you’ll get None.

>>> cart = traces.TimeSeries()
>>> print(cart[-1])
None

If, however, you set the default when creating the TimeSeries, you’ll get that instead:

>>> cart = traces.TimeSeries(default=set())
>>> cart[-1]
set()

In this case, it might also make sense to add the t=0 point as a measurement with cart[0] = set().
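The lookup rule described above (return the most recent measurement at or before the requested time, or the default if there is none) can be sketched in plain Python. This only illustrates the semantics and is not how traces is implemented; value_at is a hypothetical helper, not part of the library.

```python
from bisect import bisect_right

def value_at(times, values, t, default=None):
    """Return the last value measured at or before t (hypothetical helper)."""
    # times must be sorted; bisect_right counts the measurements with time <= t
    i = bisect_right(times, t)
    return values[i - 1] if i > 0 else default

times = [1.2, 1.7, 2.2, 3.5]
values = [{'broccoli'}, {'broccoli', 'apple'}, {'apple'}, {'apple', 'beets'}]

print(value_at(times, values, 2))                  # the cart at 2 minutes
print(value_at(times, values, -1, default=set()))  # before the first measurement
```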

Performance note

Traces is not designed for maximal performance, but it’s no slouch since it uses the excellent sortedcontainers.SortedDict under the hood to store sparse time series.

class traces.TimeSeries(data=None, default=None)[source]

A class to help manipulate and analyze time series that are the result of taking measurements at irregular points in time. For example, here is a simple time series that starts at 8am and ends at 9:59am:

>>> ts = TimeSeries()
>>> ts['8:00am'] = 0
>>> ts['8:47am'] = 1
>>> ts['8:51am'] = 0
>>> ts['9:15am'] = 1
>>> ts['9:59am'] = 0

The value of the time series at any time is the most recent measurement at or before that time: for example, at 8:05am the value is 0 and at 8:48am the value is 1. So:

>>> ts['8:05am']
0
>>> ts['8:48am']
1

There are also methods for combining this time series with another one: sum, difference, logical operators, and so on.

compact()[source]

Convert this instance to a compact version: the value will be the same at all times, but repeated measurements are discarded.
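As a rough illustration of what "repeated measurements are discarded" means, the sketch below drops every point whose value equals the value of the point before it. It works on a plain list of (time, value) pairs rather than a TimeSeries, and compact_points is a hypothetical name, so treat it as a model of the behaviour described here rather than the library's code.

```python
def compact_points(points):
    """Drop measurements that repeat the previous value (hypothetical helper)."""
    result = []
    for t, value in points:
        # keep a point only if it changes the value of the series
        if not result or result[-1][1] != value:
            result.append((t, value))
    return result

points = [(0, 1), (1, 1), (2, 0), (3, 0), (4, 1)]
print(compact_points(points))  # [(0, 1), (2, 0), (4, 1)]
```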

default

Return the default value of the time series.

difference(other)[source]

difference(t) = self(t) - other(t).

distribution(start=None, end=None, normalized=True, mask=None, interpolate='previous')[source]

Calculate the distribution of values over the given time range from start to end.

Parameters:
  • start (orderable, optional) – The lower time bound of when to calculate the distribution. By default, the first time point will be used.
  • end (orderable, optional) – The upper time bound of when to calculate the distribution. By default, the last time point will be used.
  • normalized (bool) – If True, distribution will sum to one. If False and the time values of the TimeSeries are datetimes, the units will be seconds.
  • mask (TimeSeries, optional) – A domain on which to calculate the distribution.
  • interpolate (str, optional) – Method for interpolating between measurement points: either “previous” (default) or “linear”. Note: if “previous” is used, then the resulting histogram is exact. If “linear” is given, then the values used for the histogram are the average value for each segment – the mean of this histogram will be exact, but higher moments (variance) will be approximate.
Returns:

Histogram with the results.
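To make the "previous" case concrete, the following sketch computes how much time a series spends at each value between start and end. It is a plain-Python model under stated assumptions: points is a sorted list of (time, value) pairs whose first time is at or before start, the result is an ordinary dict instead of a Histogram, and the mask and "linear" options are left out.

```python
from collections import defaultdict

def distribution(points, start, end, normalized=True):
    """Time spent at each value between start and end (hypothetical helper)."""
    durations = defaultdict(float)
    # each measurement holds until the next one; the last holds until `end`
    next_times = [t for t, _ in points[1:]] + [end]
    for (t0, value), t1 in zip(points, next_times):
        lo, hi = max(t0, start), min(t1, end)
        if hi > lo:
            durations[value] += hi - lo
    if normalized:
        total = sum(durations.values())
        return {value: d / total for value, d in durations.items()}
    return dict(durations)

# the 8:00am to 9:59am example from the class description, in minutes after 8:00am
points = [(0, 0), (47, 1), (51, 0), (75, 1), (119, 0)]
print(distribution(points, 0, 119, normalized=False))  # {0: 71.0, 1: 48.0}
print(distribution(points, 0, 119))
```

With normalized=False the weights are durations (71 minutes at 0 and 48 minutes at 1 here); with normalized=True they are fractions of the total time.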

exists()[source]

Returns False when the time series has a None value, True otherwise.

first_item()[source]

Returns the first (time, value) pair of the time series.

first_key()[source]

Returns the first time recorded in the time series.

first_value()[source]

Returns the first recorded value in the time series.

get(time, interpolate='previous')[source]

Get the value of the time series, even in-between measured values.
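The difference between the two interpolate options named in the signature can be sketched as follows. This is a stand-alone model on a list of (time, value) pairs, not the library's code; in particular, holding the last value after the final measurement under "linear" is an assumption of this sketch.

```python
from bisect import bisect_right

def get(points, t, interpolate='previous'):
    """Value at time t from sorted (time, value) pairs (hypothetical helper)."""
    times = [time for time, _ in points]
    i = bisect_right(times, t)  # number of measurements with time <= t
    if i == 0:
        return None  # before the first measurement
    t0, v0 = points[i - 1]
    if interpolate == 'previous' or i == len(points):
        return v0  # hold the most recent measurement
    if interpolate == 'linear':
        t1, v1 = points[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("interpolate must be 'previous' or 'linear'")

points = [(0, 0.0), (10, 5.0)]
print(get(points, 4))            # 0.0
print(get(points, 4, 'linear'))  # 2.0
```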

get_item_by_index(index)[source]

Get the (t, value) pair of the time series by index.

items()[source]

Returns a list of the (key, value) pairs in the time series, as 2-tuples.

classmethod iter_merge(timeseries_list)[source]

Iterate through several time series in order, yielding (time, list) tuples where list is the values of each individual TimeSeries in the list at time t.
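A minimal sketch of this merge order, assuming each series is a sorted list of (time, value) pairs and that a series has the value None before its first measurement. The helper names are hypothetical and the real method may be implemented quite differently.

```python
from bisect import bisect_right

def value_at(points, t, default=None):
    """Last value measured at or before t (hypothetical helper)."""
    times = [time for time, _ in points]
    i = bisect_right(times, t)
    return points[i - 1][1] if i > 0 else default

def iter_merge(series_list):
    """Yield (time, [value of each series at that time]) in time order."""
    # visit every time at which any of the series has a measurement
    all_times = sorted({t for points in series_list for t, _ in points})
    for t in all_times:
        yield t, [value_at(points, t) for points in series_list]

a = [(0, 'x'), (2, 'y')]
b = [(1, 10), (2, 20)]
for t, values in iter_merge([a, b]):
    print(t, values)
```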

iterintervals(n=2)[source]

Iterate over groups of n consecutive measurement points in the time series.
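One way to picture this is as a sliding window over the measurement points, which is the reading assumed in the sketch below. It operates on a plain list of (time, value) pairs and is only an illustration.

```python
def iterintervals(points, n=2):
    """Yield overlapping groups of n consecutive points (hypothetical helper)."""
    # zip n copies of the list, each shifted one position further along
    return zip(*(points[i:] for i in range(n)))

points = [(0, 'a'), (1, 'b'), (2, 'c')]
print(list(iterintervals(points)))       # pairs of neighbouring points
print(list(iterintervals(points, n=3)))  # a single group of three
```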

iterperiods(start=None, end=None, value=None)[source]

This iterates over the periods (optionally, within a given time span) and yields (interval start, interval end, value) tuples.

TODO: add mask argument here.
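The shape of the yielded tuples can be sketched like this, assuming sorted (time, value) pairs whose first time is at or before start. The value argument is omitted, and the function is a hypothetical stand-in rather than the library method.

```python
def iterperiods(points, start, end):
    """Yield (period start, period end, value) within [start, end]."""
    # each measurement holds until the next one; the last holds until `end`
    next_times = [t for t, _ in points[1:]] + [end]
    for (t0, value), t1 in zip(points, next_times):
        lo, hi = max(t0, start), min(t1, end)
        if hi > lo:  # skip periods that fall outside the requested span
            yield lo, hi, value

# minutes after 8:00am, as in the class description
points = [(0, 0), (47, 1), (51, 0), (75, 1), (119, 0)]
print(list(iterperiods(points, 40, 60)))
# [(40, 47, 0), (47, 51, 1), (51, 60, 0)]
```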

last_item()[source]

Returns the last (time, value) pair of the time series.

last_key()[source]

Returns the last time recorded in the time series.

last_value()[source]

Returns the last recorded value in the time series.

logical_and(other)[source]

logical_and(t) = self(t) and other(t).

logical_or(other)[source]

logical_or(t) = self(t) or other(t).

logical_xor(other)[source]

logical_xor(t) = self(t) ^ other(t).

mean(start=None, end=None, mask=None, interpolate='previous')[source]

Calculate the average value of the time series over the given time range from start to end, restricted to the times when mask is truthy.

classmethod merge(ts_list, compact=True, operation=None)[source]

Iterate through several time series in order, yielding (time, value) where value is either the list of the values of each individual TimeSeries in the list at time t (in the same order as in ts_list) or the result of the optional operation on that list of values.

moving_average(sampling_period, window_size=None, start=None, end=None, placement='center', pandas=False)[source]

Average the time series over regular intervals.

multiply(other)[source]

multiply(t) = self(t) * other(t).

n_measurements()[source]

Return the number of measurements in the time series.

n_points(start=-inf, end=inf, mask=None, include_start=True, include_end=False, normalized=False)[source]

Calculate the number of points over the given time range from start to end.

Parameters:
  • start (orderable, optional) – The lower time bound of the range in which to count points. By default, start is -infinity.
  • end (orderable, optional) – The upper time bound of the range in which to count points. By default, end is +infinity.
  • mask (TimeSeries, optional) – A domain on which to count points.
Returns:

int with the result

operation(other, function, **kwargs)[source]

Calculate “elementwise” operation either between this TimeSeries and another one, i.e.

operation(t) = function(self(t), other(t))

or between this timeseries and a constant:

operation(t) = function(self(t), other)

If it’s another time series, the measurement times in the resulting TimeSeries will be the union of the sets of measurement times of the input time series. If it’s a constant, the measurement times will not change.
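The rule for the resulting measurement times can be sketched in plain Python. The model below treats a series as a sorted list of (time, value) pairs and anything else as a constant; the helper names and the default of 0 before a series' first measurement are assumptions made for this illustration.

```python
from bisect import bisect_right
import operator

def value_at(points, t, default=0):
    """Last value measured at or before t (hypothetical helper)."""
    times = [time for time, _ in points]
    i = bisect_right(times, t)
    return points[i - 1][1] if i > 0 else default

def operation(points, other, function):
    """Apply function pointwise against another series or a constant."""
    if isinstance(other, list):
        # two series: evaluate at the union of their measurement times
        times = sorted({t for t, _ in points} | {t for t, _ in other})
        return [(t, function(value_at(points, t), value_at(other, t)))
                for t in times]
    # constant: the measurement times do not change
    return [(t, function(v, other)) for t, v in points]

a = [(0, 1), (2, 3)]
b = [(1, 10), (2, 20)]
print(operation(a, b, operator.add))  # [(0, 1), (1, 11), (2, 23)]
print(operation(a, 2, operator.mul))  # [(0, 2), (2, 6)]
```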

remove(time)[source]

Remove the measurement at the given time from the time series. This raises an error if the given time is not actually a measurement point.

remove_points_from_interval(start, end)[source]

Remove all points from the time series within an interval [start:end].

sample(sampling_period, start=None, end=None, interpolate='previous')[source]

Sample the time series at regular time periods.
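Regular sampling turns an irregular series into evenly spaced (time, value) pairs. The sketch below shows the idea with "previous" interpolation on a plain list of pairs; whether the end time itself is included is an assumption here, and the helper names are hypothetical.

```python
from bisect import bisect_right

def value_at(points, t, default=None):
    """Last value measured at or before t (hypothetical helper)."""
    times = [time for time, _ in points]
    i = bisect_right(times, t)
    return points[i - 1][1] if i > 0 else default

def sample(points, sampling_period, start, end):
    """Evaluate the series every sampling_period from start to end inclusive."""
    n_samples = int((end - start) // sampling_period) + 1
    sample_times = [start + k * sampling_period for k in range(n_samples)]
    return [(t, value_at(points, t)) for t in sample_times]

points = [(0, 'a'), (3, 'b'), (7, 'c')]
print(sample(points, 2, 0, 8))
# [(0, 'a'), (2, 'a'), (4, 'b'), (6, 'b'), (8, 'c')]
```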

set(time, value, compact=False)[source]

Set the value for the time series. If compact is True, only set the value if it’s different from what it would be anyway.

set_interval(start, end, value, compact=False)[source]

Set the value for the time series on an interval. If compact is True, only set the value if it’s different from what it would be anyway.

slice(start, end)[source]

Return an equivalent TimeSeries that only has points between start and end (always starting at start).

sum(other)[source]

sum(t) = self(t) + other(t).

threshold(value, inclusive=False)[source]

Return True where the value is greater than the threshold value (or greater than or equal to the threshold value if inclusive=True).
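The effect of inclusive can be illustrated on a plain list of (time, value) pairs. This sketch assumes the comparison is applied at each measurement point and that the result is a boolean series with the same times; it is not the library's implementation.

```python
import operator

def threshold(points, value, inclusive=False):
    """Map each measurement to True/False against a threshold (hypothetical)."""
    compare = operator.ge if inclusive else operator.gt
    return [(t, compare(v, value)) for t, v in points]

points = [(0, 1), (1, 5), (2, 3)]
print(threshold(points, 3))                  # [(0, False), (1, True), (2, False)]
print(threshold(points, 3, inclusive=True))  # [(0, False), (1, True), (2, True)]
```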

to_bool(invert=False)[source]

Return the truth value of each element.

Histogram

class traces.Histogram(data=(), **kwargs)[source]

max(include_zero=False)[source]

Maximum observed value with non-zero count.

mean()[source]

Mean of the distribution.

min(include_zero=False)[source]

Minimum observed value with non-zero count.

normalized()[source]

Return a normalized version of the histogram where the values sum to one.

standard_deviation()[source]

Standard deviation of the distribution.

total()[source]

Sum of values.

variance()[source]

Variance of the distribution.
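For readers who want to see how these summary statistics relate to each other, here is a sketch that treats a histogram as a mapping from observed value to weight (a count or a duration). It assumes the population form of the variance and uses a plain dict, so it models the quantities listed above rather than reproducing the Histogram class.

```python
import math

def histogram_stats(weights):
    """Return (total, mean, variance, standard deviation) of a weighted histogram."""
    total = sum(weights.values())
    mean = sum(value * w for value, w in weights.items()) / total
    # weighted average of the squared deviations from the mean
    variance = sum(w * (value - mean) ** 2 for value, w in weights.items()) / total
    return total, mean, variance, math.sqrt(variance)

weights = {1: 2, 3: 2}  # the value 1 observed twice, the value 3 observed twice
print(histogram_stats(weights))  # (4, 2.0, 1.0, 1.0)
```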