Time Series – A Primer

Time series, sequentially ordered numerical values, are omnipresent in nearly every field. They occur in the financial sector, in e-commerce, in medicine, in astronomy, and in many other domains. Time series naturally arise from observations made at successive, usually equispaced, points in time, e.g. sensor or log readings. Being able to forecast time series is a valuable asset in nearly every operational field, as it allows adapting to upcoming developments, e.g. timely scaling of web server resources, buying or selling assets, or restocking items in a shop.
However, handling, analyzing, and forecasting time series pose unique challenges that need to be addressed for tasks like the ones mentioned above.

This article defines fundamental properties of equispaced time series and shows how to detect these properties with Python. It is accompanied by a Jupyter notebook on GitHub.

Hyperlinks and useful references, e.g. the Jupyter notebooks for this article, can be found at the end of this text.

Time Series

Univariate equispaced time series belong to the class of contextual data representations and consist of two attributes.

  1. The so-called contextual attribute, the time dimension.
  2. The behavioral attribute, the measured value.

Mathematically, a time series is usually defined as $X = \{X_1, X_2, \dots\}$ or $Y = \{Y_t \mid t \in T\}$, where $T$ is the so-called index set. Time series can thus be modeled as a stochastic process, where a distinct time series is one realization of that process.
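
The distinction between a stochastic process and its realizations can be made concrete with a small simulation. The sketch below assumes a Gaussian random walk as the process (chosen purely for illustration, it is not part of the definition above), and the helper name `random_walk` is hypothetical. Each call with a different seed draws one realization, i.e. one time series.

```python
import numpy as np


def random_walk(n_steps: int, seed: int) -> np.ndarray:
    """Draw one realization of a Gaussian random walk of length n_steps."""
    rng = np.random.default_rng(seed)
    # X_t = X_{t-1} + e_t with e_t ~ N(0, 1), i.e. the cumulative sum of the noise terms.
    return np.cumsum(rng.normal(loc=0.0, scale=1.0, size=n_steps))


# Two realizations of the same process: same generating rule, different values.
x_a = random_walk(n_steps=100, seed=1)
x_b = random_walk(n_steps=100, seed=2)

print(x_a.shape, x_b.shape)
print(np.allclose(x_a, x_b))
```

Both arrays follow the same generating rule, yet their values differ; an observed time series corresponds to exactly one such draw.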

We usually speak of a time series (TS) when the behavioral attribute is a numerical value ($x_t \in \mathbb{R}$), and of a temporal sequence when the behavioral attribute is a member of a symbolic set.

Equispaced means that the time spans between neighboring points are of equal length: $|t_1 - t_2| = |t_2 - t_3| = \dots = |t_{n-1} - t_n|$.
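
Whether a time index is actually equispaced can be checked by comparing the differences between consecutive timestamps. The following is a minimal sketch, assuming the timestamps are held in a pandas `DatetimeIndex`; the helper name `is_equispaced` and the example dates are hypothetical.

```python
import pandas as pd


def is_equispaced(index: pd.DatetimeIndex) -> bool:
    """Return True if all consecutive timestamps are the same distance apart."""
    # Differences between neighboring timestamps, t_{i+1} - t_i.
    deltas = index.to_series().diff().dropna()
    # Equispaced means there is at most one distinct step size
    # (an index with fewer than two points is treated as equispaced here).
    return bool(deltas.nunique() <= 1)


# Hypothetical example data: a regular daily index and one with a missing day.
regular = pd.date_range("1981-01-01", periods=5, freq="D")
with_gap = regular.delete(2)

print(is_equispaced(regular))
print(is_equispaced(with_gap))
```

A check like this is useful before decomposition or forecasting, since a missing observation silently breaks the equal-spacing assumption.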

From the above definition we can infer that time series are usually of very high dimensionality, since every time step contributes one dimension, and are therefore subject to the so-called curse of dimensionality.

Visualizing a Time Series

To get a first feeling for a TS, it is best to visualize it in a meaningful way. Luckily, visualizing a TS is very easy with the help of Jupyter and pandas. The following Python code plots a TS representing the minimum temperature for each day in Melbourne, Australia, over the course of 10 years.

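import pandas as pd

# Load the daily minimum temperatures and parse the "Date" column into a DatetimeIndex.
df = pd.read_csv("data/daily_min_temp_melbourne.csv", sep=",", encoding="utf8", index_col="Date", parse_dates=True)
# Plot the series over time.
df.plot(figsize=(20, 10))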

[Figure: Daily minimum temperature in Melbourne, Australia, 1981 to 1990]

A quick glance at the image reveals some kind of cyclic seasonality. Identifying foundational TS properties is helpful for building predictors and simulators. The next subsection shows how to identify three foundational time series properties, namely seasonality, trend, and irregularity.
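
A visual impression of seasonality can be backed by a simple numeric check. The sketch below assumes a synthetic daily series with a yearly cycle (not the Melbourne data; all values are made up) and uses the lag autocorrelation from pandas. A strong positive correlation at a lag of one year together with a negative one at half a year hints at a yearly season.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
days = pd.date_range("1981-01-01", periods=10 * 365, freq="D")
t = np.arange(len(days))

# Hypothetical temperature-like series: yearly sine cycle plus Gaussian noise.
values = 11.0 + 5.0 * np.sin(2 * np.pi * t / 365) + rng.normal(0.0, 2.0, size=len(t))
series = pd.Series(values, index=days)

# Correlation of the series with a copy of itself shifted by `lag` days.
r_year = series.autocorr(lag=365)
r_half_year = series.autocorr(lag=182)

print(f"lag 365: {r_year:.2f}, lag 182: {r_half_year:.2f}")
```

On real data the same two calls can be applied to the loaded column; the lags to inspect depend on the sampling interval and the suspected cycle length.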

Time Series Decomposition

For analysis it is useful to isolate patterns in time series. A TS can be thought of as an additive or multiplicative composition of a trend part $T_t$, a seasonal part $S_t$, and an irregular part $I_t$, where the original TS is given by one of the following models.

  • $Y_t = T_t + S_t + I_t$
  • $Y_t = T_t \cdot S_t \cdot I_t$
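
To make the two models concrete, the following sketch builds a synthetic daily series from hand-picked components. All component shapes (linear trend, yearly sine wave, Gaussian noise) and all constants are assumptions for illustration only and are not derived from the Melbourne data.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(3 * 365)  # three years of daily time steps

# Hypothetical components, chosen only for illustration.
trend = 10.0 + 0.005 * t                        # slowly rising level
seasonal = 3.0 * np.sin(2 * np.pi * t / 365)    # yearly cycle
irregular = rng.normal(0.0, 0.5, size=t.size)   # random noise

# Additive model: the components are summed.
y_additive = trend + seasonal + irregular

# Multiplicative model: seasonal and irregular parts act as factors around 1,
# so their effect scales with the level of the trend.
seasonal_factor = 1.0 + seasonal / 10.0
irregular_factor = 1.0 + irregular / 10.0
y_multiplicative = trend * seasonal_factor * irregular_factor

# For strictly positive data, taking logarithms turns the
# multiplicative model into an additive one.
log_sum = np.log(trend) + np.log(seasonal_factor) + np.log(irregular_factor)
print(np.allclose(np.log(y_multiplicative), log_sum))
```

In the additive series the seasonal swing has a constant size, while in the multiplicative series it grows with the trend level. This difference is usually what decides which model fits a given dataset.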

A useful tool to capture those properties is the seasonal_decompose function from statsmodels, which decomposes a time series into its trend, seasonal, and irregular (residual) components using moving averages. (A Loess-based decomposition is available in statsmodels separately as STL.) The following Python code creates a seasonal decomposition plot of the Melbourne minimum temperature dataset.

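import statsmodels.api as sm

# period=365: one seasonal cycle spans roughly one year of daily observations.
# Older statsmodels releases called this argument `freq`.
res = sm.tsa.seasonal_decompose(df['Melbourne Min. Temp.'], model='additive', period=365)
resplot = res.plot()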

[Figure: Seasonal decomposition plot of the Melbourne temperature data]

The plot contains the observed time series, the trend component, the seasonal component, and the irregular component of the analysed time series. In this case we used the additive model, because the multiplicative model does not support zero or negative values, which temperature readings can take.
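
To illustrate what the component panels contain, the following is a simplified, hand-rolled sketch of a classical additive decomposition with pandas. It is not the statsmodels implementation: the synthetic input series, the odd period of 7, and the helper name `decompose_additive` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def decompose_additive(series: pd.Series, period: int):
    """Split a series into trend, seasonal and residual parts (odd period only)."""
    # Trend: centered moving average over one full period.
    trend = series.rolling(window=period, center=True).mean()
    detrended = series - trend
    # Seasonal: average detrended value at each position within the period,
    # shifted so that the seasonal effects average to zero over one period.
    position = np.arange(len(series)) % period
    pattern = detrended.groupby(position).mean()
    pattern = pattern - pattern.mean()
    seasonal = pd.Series(pattern.to_numpy()[position], index=series.index)
    # Residual: whatever is left after removing trend and seasonal parts.
    resid = series - trend - seasonal
    return trend, seasonal, resid


# Hypothetical data: linear trend + weekly pattern + noise over 20 weeks.
rng = np.random.default_rng(42)
days = pd.date_range("2020-01-01", periods=20 * 7, freq="D")
t = np.arange(len(days))
weekly = np.array([2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 1.0])  # sums to zero
values = 0.1 * t + weekly[t % 7] + rng.normal(0.0, 0.2, size=len(t))
series = pd.Series(values, index=days)

trend, seasonal, resid = decompose_additive(series, period=7)

# In the additive model the three parts sum back to the observed series
# wherever the moving average is defined (it is NaN at both ends).
valid = trend.notna()
reconstructed = (trend + seasonal + resid)[valid]
print(np.allclose(reconstructed, series[valid]))
```

The moving average leaves the first and last few points without a trend estimate. The same effect shows up as gaps at both ends of the trend and residual panels in a moving-average based decomposition plot.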

What’s next

This was a very quick primer on time series and seasonal decomposition in Python. In the following articles we examine time series, autocorrelation, and seasonal decomposition more thoroughly, before we start using machine learning and deep learning for time series analysis, prediction, and forecasting.

Hyperlinks & References

Best regards,
Henrik Hain

This article was first published at: https://henrikhain.io/time-series-a-primer.html#time-series-a-primer
