A Java High Performance Tool For Topological Data Analysis

For more details please refer to:

[1] MATTEO RUCCO, FILIPPO CASTIGLIONE, EMANUELA MERELLI, AND MARCO PETTINI. Characterisation of the idiotypic immune network through persistent entropy. Accepted by Springer Proceedings in Complexity, 2015

[2] EMANUELA MERELLI, MATTEO RUCCO, MARCO PIANGERELLI, AND DANIELE TOLLER. A topological approach for multivariate time series characterization: the epilepsy case study. Proc. 9th EAI Conference on Bio-inspired Information and Communications Technologies (BICT 2015), 2015.

[3] EMANUELA MERELLI, MATTEO RUCCO, PETER SLOOT, AND LUCA TESEI. Topological Characterization of Complex Systems: Using Persistent Entropy. Entropy, 17(10):6872–6892, 2015.

[4] MATTEO RUCCO, ROCIO GONZALEZ-DIAZ, MARIA-JOSE JIMENEZ, NIEVES ATIENZA, ENRICO CONCETTONI, CRISTINA CRISTALLI, ANDREA FERRANTE, AND EMANUELA MERELLI. A new topological entropy-based approach for measuring similarities among piecewise linear functions. Submitted - http://arxiv.org/abs/1512.07613, 2016.

The figure above shows the temporal evolution of the persistent entropy of the immune system for each homology group. Top: persistent entropy for the homology group H0. Middle: persistent entropy for H1. Bottom: persistent entropy for H2. Note that the gap in the last plot marks the process of affinity maturation, that is, the process of generating antibodies with increased binding affinities. Affinity maturation occurs in mature B cells after V(D)J recombination and depends on assistance from helper T cells. From a topological point of view, this means that, in order to reach the memory state, the links formed by antibodies with lower affinity are completely eliminated from the Idiotypic Network [1,2,3].

To compare the persistent entropy of two simplicial complexes, the stability theorem stated below is needed [4].

Note that persistent entropy is computed over all the homology groups. We conjectured that analysing each homology group separately can pinpoint meaningful information. Because the analysis of a single homology group can be more informative than the analysis of the whole system, we define the j-Weighted Persistent Entropy WHj, an extension of the persistent entropy H defined below:
$$WH_j = -c_j \sum_{i=1}^{N_j} P(l_i)\cdot \log(P(l_i)).$$
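
The formula above can be sketched in Java as follows. This is a minimal illustration, not the tool's implementation: the class and method names are hypothetical, P(l_i) is assumed to be the frequential probability l_i / L_j from the definition of weighted persistent entropy, and the weight c_j, which the text does not define, is simply passed in by the caller.

```java
/** Hypothetical sketch of the j-weighted persistent entropy WH_j (not part of the tool's API). */
final class WeightedPersistentEntropySketch {

    /**
     * Computes WH_j = -c_j * sum_i P(l_i) * log(P(l_i)) for one homology dimension j.
     *
     * @param lengths interval lengths l_i of the j-th barcode (assumed non-negative)
     * @param cj      weight c_j; the source text does not define it, so it is supplied by the caller
     */
    static double weightedEntropy(double[] lengths, double cj) {
        // L_j: total length of the j-th barcode
        double total = 0.0;
        for (double l : lengths) {
            total += l;
        }

        // Accumulate sum_i P(l_i) * log(P(l_i)); zero-length intervals contribute nothing
        double sum = 0.0;
        for (double l : lengths) {
            if (l > 0.0) {
                double p = l / total; // assumption: P(l_i) = l_i / L_j
                sum += p * Math.log(p);
            }
        }
        return -cj * sum;
    }
}
```

For instance, `WeightedPersistentEntropySketch.weightedEntropy(new double[] {2.0, 2.0}, 0.5)` evaluates the formula for two intervals of equal length with an illustrative weight of 0.5.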

Example of the temporal evolution of persistent entropy for the Idiotypic Network. The peaks correspond to immune responses against antigens, while the plateau represents the immune memory.

Given a persistent barcode whose intervals have endpoints aj ≤ bj, indexed by j ∈ J, the persistent entropy H is defined as
$$H=-\sum_{j \in J} p_j \log(p_j)$$
where pj = lj/L, lj = bj - aj, and L = ∑ lj. Note that the maximum persistent entropy corresponds to the situation in which all the intervals in the barcode are of equal length. In that case, H = log n, where n is the number of elements of J. Conversely, the value of the persistent entropy decreases as more intervals of different length are present. This entropy measures how ordered the construction of a filtered simplicial complex is.

Given two filter functions on simplicial complexes embedded in ℝn, ƒ : K → ℝ and g : K' → ℝ, for every ε>0 there exists δ>0 such that
$$\|f-g\|_{\infty}\leq \delta \Rightarrow |H(f)-H(g)| \leq \epsilon.$$

Definition. Weighted Persistent Entropy. Let d be the dimension of the simplicial complex. For every j ∈ ℕ with j ≤ d, we consider the j-th homology space Hj, and let Nj be the number of lines (both noise and persistent topological features) belonging to Hj. We set li = bi - ai to be the length of the i-th topological feature belonging to Hj. We let Lj = ∑ li be the total length of the j-th barcode, and pi = li/Lj be the frequential probability, written P(li) in the formula for WHj above.

Persistent Entropy

From Computational Topology to Information Theory

Simplicial complexes are useful and accurate models of complex networks and of complex systems in general. Complex systems science has become an important area in both the natural and the social sciences. However, there is no concise definition of a complex system, although there have been various attempts to characterize one. Defining complex systems requires the concept of complexity, and many scientists have tried to find proper, mathematically rigorous measures of complexity. In this section, we discuss one effective measure of complexity based on information theory, the so-called persistent entropy. Like Shannon entropy, persistent entropy can be used to study the dynamics of complex systems. From an information-theoretic viewpoint, the number of intervals in a persistence diagram can be interpreted as the coding length of a simplicial complex. The coding length is intimately related to the notion of entropy, and for this reason it is possible to define an entropy starting from the persistent barcode. Diaz et al. defined an entropy based on the persistent barcode (Def. 3 of [Diaz]). The aim of their paper is an entropy-driven algorithm for finding the best filtration of a set of simplices. We argue that, when the filtration is given, their entropy can easily be extended without losing its Shannon-like interpretation. Here we propose to use the maximum filtration value of a persistent barcode plus one as an upper bound; we call this quantity m.
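
To make the construction concrete, here is a minimal Java sketch of the persistent entropy H = -∑ pj log(pj) of a barcode. It is an illustration under stated assumptions, not the tool's implementation: the class and method names are hypothetical, the barcode is assumed to be an array of {birth, death} pairs, and the upper bound m is assumed to replace the endpoint of intervals that never die, with "maximum filtration value" read as the largest finite endpoint in the barcode.

```java
/** Hypothetical sketch of persistent entropy for a barcode (not part of the tool's API). */
final class PersistentEntropySketch {

    /**
     * Computes H = -sum_j p_j * log(p_j), with p_j = l_j / L and l_j = death_j - birth_j.
     *
     * @param barcode array of {birth, death} pairs; death may be Double.POSITIVE_INFINITY
     */
    static double persistentEntropy(double[][] barcode) {
        // Assumption: m = (largest finite endpoint in the barcode) + 1
        double maxFinite = Double.NEGATIVE_INFINITY;
        for (double[] bar : barcode) {
            maxFinite = Math.max(maxFinite, bar[0]);
            if (!Double.isInfinite(bar[1])) {
                maxFinite = Math.max(maxFinite, bar[1]);
            }
        }
        double m = maxFinite + 1.0;

        // Interval lengths l_j, with infinite intervals capped at m, and their total L
        double[] lengths = new double[barcode.length];
        double total = 0.0;
        for (int j = 0; j < barcode.length; j++) {
            double death = Double.isInfinite(barcode[j][1]) ? m : barcode[j][1];
            lengths[j] = death - barcode[j][0];
            total += lengths[j];
        }

        // Shannon-style sum over the normalized lengths; zero-length bars are skipped
        double h = 0.0;
        for (double l : lengths) {
            if (l > 0.0) {
                double p = l / total;
                h -= p * Math.log(p);
            }
        }
        return h;
    }
}
```

With three intervals of equal length the sketch returns log 3, the maximum value for a barcode of three intervals, in line with the remark that equal-length intervals maximize persistent entropy.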