Cisco Tech Blog CiscoTech Blog Close
Home Contact

Drowning in Data and Starving for Insights - Episode #3: Application of Topological Data Analysis to Telemetry

Data networks are complex systems with plenty of counters and other data collection mechanisms built into them. Topological Data Analysis (TDA) techniques can be used to provide system breakdowns information or early warning signals in real-time. TDA can also be used for exploratory analysis and can help data center administrators and customers solve the big data dilemma, the tradeoff between data abundance and data scarcity. Administrators do not need all information from all components all the time. However, they cannot operate complex networks with just a small percentage of data being available as failure modes or service outages.

So, how is it possible to provide the correct information and data insights at the right time? The first episode of our series “Drowning with Data and Starving for Insights” was about the challenges of the abundance of data for network operators. The second episode introduced the concepts of Topological Data Analysis (TDA). In this third episode, we will look at some applications of TDA and show how it can help resolve the network operators’ challenges.

Change Point Detection

Topological methods, such as persistent homology and Mapper, provide a succinct description of the dynamics of multilevel multi-scale non-linear complex systems, using techniques derived from algebraic topology. Windowed persistent homology and sliding windows embedding are valuable approaches for exploring high-dimensional datasets, for change points detection in multivariate time series.

Telemetry data are usually provided in the form of multivariate time series, but time series do not have point clouds representations. For change point detection using TDA, we follow the pipeline presented in the following figure.

“Figure 1. TDA pipeline for multivariate time series”

The approach is the following:

  • We consider the multivariate time series as a sequence of measurements sampled at discrete times. The point cloud is simply the set of all points in the Euclidean multidimensional space.
  • Instead of examining the topology using all points at once, we examine the point cloud, exploring the topological properties through the persistence diagrams of consecutive sliding windows, which describe how the topology of the point cloud changes given two adjacent windows.
  • Given two corresponding persistence diagrams we use the p-Wasserstein distance [4] (a topological similarity metric) for comparing the two persistence diagrams and get how persistence diagrams from the two sliding windows are similar. The rationale is that if the point clouds in the two windows are the same, the Wasserstein distance is zero, otherwise if there a change in the two windows the Wasserstein distance will increase. The value of the Wasserstein distance taken from the sliding windows is a proxy of the evolution of the multivariate time series over time and as such can be used for change point detection
“Figure 2. Windowed Persistent Homology”

In the example of Figure 2:

  • Equally sized rectangular windows corresponding to 24 hours of data have been used,
  • persistence diagram has been calculated for each window,
  • similarity between pair of adjacent windows have been evaluated using the Wasserstein distance.

In this example, two global major events have been detected: the two temporal gaps with no measurement (Jun 26-Jun 27 and Jul 9-Jul 10) and the divergence point (Jul 3-4) representing a system break point. Other spikes are related to state change in the system dynamic of the individual time series (change points): for example, we can examine the series of spikes between Jun29 and Jul01 in (a) and verify that several time series have transitioned exactly in this same period, as it is possible to compare with (b) highlighted in the light blue rectangle.

Timeseries Seasonality

The state of a router shows cyclical, periodic, seasonal behavior, or other typical recurrent behavior, due to the combination of all counters offered by the router. Recurring network usage patterns (e.g., a traffic spike on Monday morning, low traffic during seasonal holidays), as well as the fact that a lot of network control protocols implement state machines are sources of seasonality at different timescales. To infer recurring patterns, and/or trend changes in multivariate time series, and to identify state changes in an unsupervised and automated way, we can apply persistent homology [1][2][3] on the delay embedding of the point cloud and discover periodic or recurrent structures.

We define the sliding window embedding as the vector SWn obtained stacking by rows the sequence of frames yn and collapsing in row-major order the matrix obtained. Actually, we apply Takens’s embedding theorem to a sequence of 2D frames because the geometry of the sliding window embedding carries fundamental information about the original multivariate time series. It is then possible to use persistence diagrams and quantify the presence of periodicity, and quasi-periodicity in the multivariate time series, measuring the geometry of its associated sliding window embedding, to discover [3]:

  • Periodic systems: show perfect repetition in the phase space, which corresponds to a topological loop S1 with Betti number signature H0 = 1, H1 = 1, H≥2 = 0.
  • Quasi-periodic systems: characterized by the superposition of periodic modes with non-commensurate periods, which corresponds to the k-dimensional torus Tk = (S1)k or to the k-holed torus Tk = Tk# . . . #Tk. A typical signature is for the 2-Torus, i.e. for systems that show two periodic modes on T2 with Betti numbers H0 = 1, H1 = 2, H2 = 1,H≥3 =0.
“Figure 3. Persistence Diagram and PCA embedding of the multivariate time series”

The Principal Component Analysis (PCA) embedding presented in Figure 3 already carries interesting information, such as the existence of two different dynamic regimes separated by a structural breakdown around Jul 3, but more detailed information is captured by the corresponding persistence diagram. For example, it is possible to see that we have H1 = 2 and H2 = 1, the Betti numbers for a torus, together with other persistence H1 = 3. Until Jul 3 the system undergoes a regular periodic behavior represented by 3 nested loops. After Jul 3 the system shows up quasi-periodicity, meaning that the system has not recovered the structural breakdown on Jul 3 and that a systemic failure is likely to be expected in the future. This is an example of an early warning signal that should warn network administrators and at the same time is quite difficult to discover this kind of alert with traditional statistical methodologies.

Extension to AI 2.0

TDA works quite well as a stand-alone system. Now how can TDA be seamlessly integrated into existing frameworks? One of the most mature frameworks in Cisco is AI 2.0 [5]. This framework already provides advanced methods for describing the dynamics of multilevel complex datasets where knowledge is organized into Levels of Abstraction (LoA). These methods can be complemented by a computational topology approach, to extract local and global information from the collection of data observations. TDA and AI 2.0 will work particularly well together since they will operate on the same metamodel, which builds knowledge via processes that connect and extend the different LoAs.

For example, a massive multivariate/multimodal time series embedded into millions of data points in L0 can be meaningfully mapped into AI 2.0 higher levels of abstraction up to L∗. It will be a compressed and distributed representation in lower dimensions that does not sacrifice the most relevant data structure properties and include all the qualitative data information. This boils down to measure the topological features and causal relationships of data discovered by TDA that persist across multiple scales and are mapped to DFRE multiple levels of abstraction and causal models.

Since TDA contrasts with dimensional reduction methods, it will lead to the discovery of the underlying data shape and invariant topological properties by extracting semantic attributes from data and eventually showing casual connections in the Reeb graph generated, which complements the DFRE knowledge graph. In this case, the synergy between AI 2.0 and TDA will be ideal.

Moreover, application of TDA to AI 2.0 may constitute a preliminary investigation for a topological reasoner integration. We can expect the following synergies:

  • General: applicability of TDA in large multi-variate time series to represent data structure and relationships as a knowledge graph adopting the DFRE metamodel.
  • On the back-end side: learning of the sub-symbolic representation in a data-driven way to extract robust topological features from input and generate a topological summary. The topological summary is coordinate-free, deformation invariant, and represents a highly compressed semantic description of the original multidimensional data space perfectly adapted for symbolic processing.
  • On the front-end side: design and integration of a topological reasoner for the AI 2.0 middleware, in charge of inferring topological relations among topological objects and the range of the input query using ontologies.

To summarize, TDA offers the promise of solving network operators’ challenges enabling closed-loop autonomous networks, significantly reducing operational costs, and enhancing the reliability and robustness of networks. Topological techniques can be used to provide system breakdowns information or early warning signals in real-time, they can also be used for exploratory analysis, and can help data center administrators and customers to solve the big data dilemma. TDA can be successfully used to discover data insights from large datasets of multivariate time series in completely unsupervised or minimally supervised learning in conjunction with other ML/DL methods and provide the right piece of information at the right time.

[1]. Perea, J. & Harer, J. Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis 2013. arXiv: 1307.6188 [math.AT].

[2]. Perea, J. A. Persistent homology of toroidal sliding window embeddings in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016), 6435{6439.

[3]. Skraba, P., Silva, V. D. & Vejdemo-Johansson, M. Topological Analysis of Recurrent Systems in NIPS 2012 (2012).

[4]. Chazal, F. & Michel, B. An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists 2017. arXiv: 1710.04019 [math.ST]

[5]. Latapie, H., Kilic, O., Liu, G., Kompella, R., Lawrence, A., Sun, Y., … & R. Thórisson, K. (2021). A Metamodel and Framework for Artificial General Intelligence From Theory to Practice. Journal of Artificial Intelligence and Consciousness, 1-23.