| Publication Type | honors thesis |
| School or College | School of Computing |
| Department | Computer Science |
| Faculty Mentor | Bei Wang Philips |
| Creator | Shi, Yiliang |
| Title | Visualization and topological analysis of brain networks with applications in Autism |
| Date | 2018 |
| Description | Network analysis is an increasingly prevalent tool in neuroscience for the study of brain imaging data from functional magnetic resonance imaging (fMRI). Numerous tasks are involved in such an analysis, including the inference of networks from raw data, graph theoretic analysis, hierarchical clustering, and visualization. In this work, we integrate methods from topological data analysis (TDA) with interactive visualization to provide novel insights for the study of brain networks. First, we provide a lightweight and interactive visualization tool for brain network exploration and comparison. The tool provides exploratory visualization of networks with linked views guided by topological measures for both single network exploration and multiple network comparisons. The tool enables the exploration of network structure across multiple thresholds via the notion of persistent homology and highlights visual differences between pairs of networks. Next, we study the impact of hierarchical clustering on network structures. We examine the changes in a number of graph-theoretical measures using various hierarchical agglomerative clustering techniques. We also study how topological features arising from persistent homology evolve during such a process. |
| Type | Text |
| Publisher | University of Utah |
| Subject | brain network analysis; topological data analysis; hierarchical clustering |
| Language | eng |
| Rights Management | (c) Yiliang Shi |
| Format Medium | application/pdf |
| ARK | ark:/87278/s69708ez |
| Setname | ir_htoa |
| ID | 2973526 |
| OCR Text | VISUALIZATION AND TOPOLOGICAL ANALYSIS OF BRAIN NETWORKS WITH APPLICATIONS IN AUTISM by Yiliang Shi. A Senior Honors Project submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the Honors Degree of Bachelor of Science in The School of Computing. APPROVED: Bei Wang Philips, Supervisor; Ross Whitaker, Chairperson, The School of Computing; Erin Parker, Departmental Honors Advisor; Sylvia D. Torti, Director, Honors Program. May 2018

ABSTRACT
Network analysis is an increasingly prevalent tool in neuroscience for the study of brain imaging data from functional magnetic resonance imaging (fMRI). Numerous tasks are involved in such an analysis, including the inference of networks from raw data, graph theoretic analysis, hierarchical clustering, and visualization. In this work, we integrate methods from topological data analysis (TDA) with interactive visualization to provide novel insights for the study of brain networks. First, we provide a lightweight and interactive visualization tool for brain network exploration and comparison. The tool provides exploratory visualization of networks with linked views guided by topological measures for both single network exploration and multiple network comparisons. The tool enables the exploration of network structure across multiple thresholds via the notion of persistent homology and highlights visual differences between pairs of networks. Next, we study the impact of hierarchical clustering on network structures. We examine the changes in a number of graph-theoretical measures using various hierarchical agglomerative clustering techniques. We also study how topological features arising from persistent homology evolve during such a process.

CONTENTS
ABSTRACT
CHAPTERS
1. INTRODUCTION
2. RELATED WORKS
2.1 Network Visualization
2.2 Brain Network Visualization
2.3 Brain Network Construction
2.4 Topological Analysis of Networks
2.5 Topological Data Analysis of Brain Networks
2.6 Brain Network Graph Theoretical Analysis
2.7 Network Clustering
3. TECHNICAL BACKGROUND
3.1 Graph-theoretic Measures
3.2 Persistent Homology
3.3 Topological Data Analysis of Brain Networks
3.4 Clustering techniques
4. LIGHTWEIGHT VISUALIZATION TOOL
4.1 Design
4.2 Topological Data Analysis
4.3 Implementation
4.4 Results
5. HIERARCHICAL CLUSTERING AND TOPOLOGICAL PROFILES OF NETWORKS
5.1 Functionality and Design
5.2 Implementation
5.3 Results
6. DISCUSSION
REFERENCES

CHAPTER 1
INTRODUCTION
Humans have been fascinated with the anatomy and functionality of the brain since antiquity. From the Ancient Greek philosophers and Renaissance artists to modern day neurologists, humans have attempted to understand the structure and function of the brain. A deeper understanding of the brain can lead to better disease detection and treatment. For instance, modern neuroimaging has allowed researchers to detect a variety of illnesses, such as the early onset of Alzheimer's disease [1] and brain cancer. Together with advances in imaging techniques that accurately portray the brain, analysis techniques are vital to helping researchers identify brain characteristics that differentiate a healthy individual from a patient. Given the complexity of the brain, visualization is a helpful method to present information and highlight patterns that are otherwise difficult to observe. Humans in general find it difficult to process large amounts of data. Visualization allows for the summarization of complicated data in a manner suitable for human analysis, making it a core technique in brain analysis.
In recent years, there has been an increase in attempts to map the relations between different areas of the brain, a field known as brain connectomics. A natural way to represent connections between regions is a network, which allows for analysis through traditional techniques from graph theory and statistics. However, since the network of all relations between regions can be abstracted as a complete, weighted graph, researchers often have to threshold the network to obtain one with fewer edges for the purpose of analysis and visualization. Thresholding results in a loss of information, which can create biases in the analysis. One proposed technique to avoid thresholding altogether is the application of persistent homology from topological data analysis to brain networks [2]. Such techniques summarize topological changes in a network across all threshold levels using a persistence barcode. Persistent homology, being an emerging technique, is currently not easily accessible to neuroscientists. Scalability is another significant challenge for the analysis and visualization of brain networks. Large networks with thousands or tens of thousands of nodes can easily arise from imaging techniques such as fMRI scans. This complicates the visualization of such networks, as naive visualization of the raw data can easily create "hairballs" of criss-crossing edges with low information content. One solution is to group clusters of similar nodes together via hierarchical clustering to reduce the size of the network. However, despite the many statistics that have been proposed as quality measures for clustering, finding and evaluating the best clusters is still an open question. In this thesis, we present a lightweight web visualization tool that integrates topological data analysis with network visualization. First, we perform visual comparisons of brain networks guided by their topological profiles derived from persistent homology.
Second, we evaluate the impact of hierarchical clustering by studying the change in a network's topological profile. Our main contributions are two-fold. First, we provide neuroscientists with a lightweight browser-based visualization tool that improves the accessibility of emerging topological techniques such as persistent homology to the neuroscience field. Second, we investigate the effect of hierarchical clustering on network structure based on the changes to its topological profile, as well as graph-theoretic measures. We then demonstrate the utility of our tool using an autism dataset.

CHAPTER 2
RELATED WORKS
2.1 Network Visualization
Networks are structures that encode relationships between objects, where the objects are represented by a set of nodes and the relations between them are represented by a set of edges. There can be a variety of data associated with a network, and effective visualization of that data has long been a topic of interest. Network visualization should enhance understanding, accurately represent the underlying data, and be intuitive to use. Some of the earliest attempts at network visualization were made at AT&T. Becker et al. [3] discuss early visualizations from the perspective of AT&T's communication network, categorizing static network data visualization into matrix-based views and spatial maps of nodes and links, referred to today as node-link diagrams. If nodes are arranged to reflect physical geography, it is easy to end up with visual distortions in edge value, since longer edges are perceived as more significant. Becker also considers matrix visualization, which drops explicit node positions and encodes edge weight with color saturation in a matrix layout. This technique removes edge distortions at the expense of spatial information.
Apart from static visualization, the authors also discuss dynamically changing the parameters that affect the network, such as the data being visualized, the size of symbols, and the colors, to provide flexibility. In addition, the use of zooming and animation to provide interactivity is also briefly discussed [3]. There are multiple layouts available for node-link diagrams, each with its advantages and disadvantages. A recent survey of network visualization layouts is found in [4]. Force-directed layouts [5], circular layouts [6], and physical-geography-based layouts [3] are some of the most common layouts in network visualization. Apart from layouts, interactive techniques such as zooming and fisheye distortion [7] can also decrease visual clutter and improve the insight-to-noise ratio of network visualizations.

2.2 Brain Network Visualization
Brain network visualizations in the form of node-link diagrams and matrices face the same challenges as general networks in terms of accuracy, size, and visual clutter. In terms of accuracy, brain networks must balance the additional dimensions of anatomical accuracy and connectomic complexity [8]. Accuracy is further complicated by uncertainty in the data, which can be difficult to visualize. In [9], the authors evaluated the effectiveness of matrices and node-link diagrams for brain network visualization. First, they identified seven common visual analysis tasks carried out by neuroscientists, based on a literature review and hour-long interviews with seven neuroscientists.
The tasks include: identifying the network structures responsible for a specific cognitive function; identifying the effects of anatomical structure on functional connectivity; identifying alterations in brain connectivity; identifying the existence or loss of patterns in brain connectivity associated with anomalous conditions; identifying the deviation of an individual's connectivity from a population mean; performing effective brain parcellation and multimodal connectivity analysis; and, finally, identifying the effects of local injury. Based on these criteria, they conducted an experiment that required participants to assess the weight change of a node's connections, to assess the connectivity of common neighbors, and to identify the region with the most changes, at different scales of networks. Based on the experiment, they concluded that matrix visualization performs better than node-link visualization for larger and denser graphs. However, the results are specific to the non-augmented visualizations used in their experiment, and there are visualization techniques that could address the issues they identified with node-link diagrams. For instance, node aggregation can be used to decrease the size of a graph, while thresholding can be used to reduce density, as long as the impact of thresholding is accounted for. Margulies et al. [8] provide a full survey of existing brain network visualization techniques, issues, and tools. These include the use of glyphs to display multidimensional tensor data, tractography, the importance of spatial data, and functional connectivity versus structural and anatomical connectivity. While traditional visualizations have focused on the anatomy and structure of the brain, anatomical information is often lost as visualizations place increased importance on relational connectivity. This can be observed in the shift from physical brain models to abstract relational visualizations.
One of the best-known relational visualizations is the connectogram [10], a node-link diagram laid out in a circular layout (also known as a wheel chart or donut chart). It encodes multimodal relationships between regions of interest (ROIs), with each mode placed on a concentric circle. Even though a variety of tools can visualize brain network data, most of them are meant for the processing and analysis of a single brain image and do not perform comparisons. Others focus specifically on visualization and not on analysis. For instance, BrainNet Viewer [11] is capable of generating very detailed and customized visualizations based on nodes, edges, and surfaces, but does not contain any analysis component. In contrast, the Connectome Viewer Toolkit [12] is a comprehensive solution for creating and visualizing general connectomes, with an emphasis on the creation of networks rather than their properties. Other similar tools include BrainNetVis [13], which is similar to BrainNet Viewer but with simpler visualization and more analysis functionality. The tool closest to the one we introduce is the USC Multimodal Connectivity Database [14], which allows users to upload connectivity matrices and perform a comparison that generates static images of node-link diagrams, histograms, and other analysis measurements. However, this tool does not allow interactivity, and it requires a static threshold to create binary graphs. Our tool, in contrast, combines persistent homology and interactivity to visualize the brain across all thresholds.

2.3 Brain Network Construction
Filtman [15] provides a survey of common brain imaging techniques. These techniques can be broadly summarized as methods to represent the brain by measuring certain physical aspects. For instance, computed tomography (CT) and magnetic resonance imaging (MRI) scans produce images of physical brain structures, such as tissue types.
Functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), on the other hand, measure indicators of brain activity in an area at a given resolution. For example, a common indicator in fMRI scans is the blood oxygenation level dependent (BOLD) signal, which corresponds to the blood flow in an area. MRI and fMRI scans typically partition the brain into equally sized boxes called voxels. Depending on the resolution of the scan, there can be over a hundred thousand voxels in a single image. Domain-specific information can be used to group the voxels into a smaller number of regions of interest (ROIs). It is often useful to abstract the brain as a network for analysis and comparison purposes [16]. With fMRI, images are collected over a period of time. It is then possible to infer the connectivity between either voxels or ROIs based on the temporal dependence of the measured indicator, such as the BOLD signal. For structural scans, it is possible to count the number of physical connections between regions. Analysis of brain networks requires an intersection of statistical, topological, and graph-theoretic analysis, as well as an understanding of the psychological relevance and practical reliability of the analysis techniques mentioned above. The pipeline for processing brain images often involves statistical techniques to discover significant patterns. Covariance and correlation, for instance, are used to represent functional links between regions of interest in the brain, which can be encoded as a matrix of the pairwise relationships between ROIs. Once this matrix has been computed, a decision threshold can be chosen to differentiate spurious from significant relationships between regions [17]. The matrix can then be transformed into a network graph for network analysis. The simplest representation is an unweighted graph where relationships are binary.
Anything above the threshold is an edge, and anything below is not. It is also possible to include a weight: the weight can be the correlation (or a derivative of it) for functional scans, or the number of physical connections for structural scans. The matrix can thus be represented as an undirected graph, weighted or unweighted, or as a directed graph if the relation encodes cause and effect. Once a graph is created, measures from graph theory can be used to obtain information about the characteristics of the graph in terms of features at the local scale of individual nodes, the regional scale of sets of nodes, and the global scale of the entire network [16]. Neuroscientists must then apply their domain-specific knowledge to determine the relevance of the graph-theoretic measures.

2.4 Topological Analysis of Networks
Comparison and difference identification between brain networks is an important task in neurology research and is supported by the tasks identified by researchers in [9]. An important question that arises from this activity is: what actually constitutes a difference between networks? Brain networks are characterized by a set of regions and a set of relations between regions. Two networks can differ in both the set of regions and the set of relations, making it difficult to quantify what amounts to a significant difference when doing comparisons. One method of differentiating graphs is through their topology. Topology defines shapes and spaces in terms of closeness and connectivity instead of distance, allowing for comparisons of features that persist through distortions in shape. Persistent homology and topological data analysis emerged from algebraic topology through the independent work of Frosini [18], Robins [19], and Edelsbrunner [20] in the early 2000s. It has since been applied to the analysis of sensor networks [21], metabolic networks [22], and collaboration networks [23], to name a few.
Topological data analysis (TDA) extends concepts from topology to discrete point cloud data. It studies point clouds in terms of topological concepts such as homology, where differences in shape are defined by the number and types of holes they contain. One way to create topological constructs from point cloud data is by connecting points within a neighborhood graph. Persistent homology summarizes changes in homology as we vary the resolution of the criterion for connecting points. By modeling the nodes of a network as points in a point cloud, it is possible to apply persistent homology to networks as well.

2.5 Topological Data Analysis of Brain Networks
In Lee et al. [2], the authors applied persistent homology to brain networks. They examined the topological profiles of networks belonging to patients falling under the categories of Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorder (ASD), and controls. TDA-inspired measures such as the bottleneck distance are then compared to the distance between single-linkage matrices, average assortativity, average node betweenness centrality, average clustering coefficient, characteristic path length, small-worldness, modularity, and global network homogeneity. Other similar work includes that of Chung et al. [24], who used networks derived from the cortical surface; Cassidy et al. [25], who defined the dissimilarity via sparse conditional coherence (DSC) and applied it to fMRI data; and Dabaghian et al. [26], who applied persistent homology to the hippocampal spatial map.

2.6 Brain Network Graph Theoretical Analysis
There is a well-known connection between the features of the structural and functional networks of a brain and neuropsychiatric disorders [27]. Bullmore and Sporns demonstrate that both anatomical and functional networks show properties of small-world topology, where there are distinct hubs with high edge density within hubs and low edge density between hubs.
Measures that reflect small-world architecture include modularity, global efficiency, and clustering coefficients. Graphs with small-world architecture tend to have high values of all of the above, since these are closely related measures. This suggests that there is high potential for relating the brain's graph-theoretic network properties to its functionality.

2.7 Network Clustering
Clustering refers to the grouping of similar data points, and is a technique that was first utilized in the 1930s in the areas of anthropology and psychology. Hierarchical clustering is a method that begins with each data point in its own cluster and successively merges clusters that are close to each other, resulting in a hierarchy. Lance and Williams [28] define some of the oldest and most common distance update strategies for hierarchical clustering, namely nearest neighbor, furthest neighbor, centroid, median, and group average. The update strategies specify how to compute the distance between clusters and are necessary because clusters can comprise multiple nodes. Numerous quality measures have been proposed for measuring clustering quality. Notably, maximizing modularity is the goal of a number of clustering techniques proposed in recent years [29]. Other clustering algorithms cluster based on the density and distribution of the network. Apart from specific measures, there have also been efforts to create a set of general axioms describing the quality of a cluster [30]; these include scale invariance, consistency, and richness. In this work, we investigate the behavior of the bottleneck and Wasserstein distances as cluster quality measures.

CHAPTER 3
TECHNICAL BACKGROUND
3.1 Graph-theoretic Measures
In this section, we look at some graph-theoretic measures that summarize network structure. For the rest of this section, a network is modeled as a graph and we refer to it as such.
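The graph-theoretic measures defined below can be made concrete with a small worked example. The following is a minimal sketch, not code from the thesis: the function names and the toy graph (two triangles joined by a single edge) are our own choices, and `modularity` implements the two-community form given in Definition 3.1.1 below.

```python
def weighted_degree(A, i):
    """Degree of node i: sum of the weights of edges incident to i."""
    return sum(A[i])

def modularity(A, s):
    """Two-community modularity:
    Q = (1/4m) * sum_ij (A_ij - k_i*k_j/2m) * (s_i*s_j + 1),
    where s_i is +1 or -1 depending on community membership."""
    n = len(A)
    k = [weighted_degree(A, i) for i in range(n)]
    two_m = sum(k)                      # 2m = total degree of the graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            q += (A[i][j] - k[i] * k[j] / two_m) * (s[i] * s[j] + 1)
    return q / (2 * two_m)              # 1/(4m) = 1/(2 * 2m)

# Two triangles joined by one edge: a clear two-community structure.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
s = [1, 1, 1, -1, -1, -1]   # assign each triangle to its own community
```

On this graph the two triangles are the natural communities, and assigning them opposite signs in `s` yields Q = 5/14 ≈ 0.357, while splitting a triangle across communities yields a lower value.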
First, we define the degree of a node to be the sum of the weights of all edges connected to the node. The community structure [31] of a graph refers to the existence of densely connected subgraphs within the graph, with sparse connections between subgraphs. Modularity is one measurement of community structure; it reflects how far the distribution of the number or weights of edges in a graph differs from the expected distribution in a random graph. The formal definition, for a partition of the nodes into two communities, is given below [32]:

Definition 3.1.1. modularity Q = (1/4m) ∑_ij (A_ij − k_i k_j / 2m)(s_i s_j + 1)

A_ij represents the graph in the form of a connectivity matrix detailing the number of connections between nodes i and j; k_i and k_j are the degrees of nodes i and j, so that k_i k_j / 2m is the expected number of edges between them in a random graph; m is the number of edges; and s_i = ±1 indicates which of the two communities node i belongs to.

Another property of a network is assortativity [33], which is defined as the Pearson correlation ρ between the degrees of linked nodes. This reflects the tendency of nodes with similar degree to connect to each other.

Definition 3.1.2. assortativity = ρ(X, Y) = (E[XY] − µ_X µ_Y) / (σ_X σ_Y)

X and Y represent the excess degrees of the nodes at the two ends of an edge; that is, the degree of each node minus the weight of the edge joining them. E refers to the expected value, µ to the mean over all edges, and σ to the standard deviation over all edges.

3.2 Persistent Homology
We now introduce relevant notions from algebraic topology, topological data analysis, and, in particular, persistent homology [34]. A topological space is an abstraction of Euclidean space [34] where distance is replaced with concepts of connectivity.

Definition 3.2.1. If X is a set of points and U is a collection of subsets of X, a topological space is defined as the pair (X, U) where
1. ∅, X ∈ U
2. u1 ∪ u2 ∈ U, ∀ u1, u2 ∈ U
3. u1 ∩ u2 ∈ U, ∀ u1, u2 ∈ U

One representation of a topological space is in the form of a simplicial complex, as defined below [35].

Definition 3.2.2.
Let σ = {u0, u1, . . . , uk} be a set of affinely independent points. A k-simplex is the convex hull of σ, defined as the set of all points x = ∑_{i=0}^{k} t_i u_i, where ∑ t_i = 1 and t_i ≥ 0 for all i. A node is a 0-simplex, an edge is a 1-simplex, and a triangle is a 2-simplex.

Definition 3.2.3. [34] Let τ be the convex hull of a non-empty subset of the u_i; τ is called a face of σ. The boundary of σ is the union of all proper faces τ < σ. A simplicial complex is a finite collection of simplices K such that σ ∈ K and τ ≤ σ implies τ ∈ K, and σ, σ′ ∈ K implies σ ∩ σ′ is either empty or a face of both.

An example of a common simplicial complex is the Vietoris-Rips complex.

Definition 3.2.4. [34] Let S be a set of points. The Vietoris-Rips complex is defined as the simplicial complex formed by connecting points in S whose pairwise distance is less than 2r, where r is a given radius.

Figure 3.1. A cup and a donut

Homology is a mathematical way of describing how a space is connected. We are concerned with differentiating topological spaces based on the holes in the space. As such, we consider objects from two topological spaces to be the same, or homeomorphic, if we can create a continuous bijection between the two that deforms one into the other. Holes in an object can thus be used to differentiate objects, since it is not possible to create continuous bijections between objects with different numbers of holes. For instance, the cup and the donut in Figure 3.1 are the same from a homology perspective since they both contain a single hole. Holes are formally defined in terms of what is around them. Next, we introduce the relevant terms to define homology [34]. First, we define a p-chain c as a formal sum of p-simplices in K, where c = ∑ a_i σ_i.
Here a_i refers to the coefficients of the p-chain; in topological data analysis we are concerned only with modulo-2 coefficients, so a p-chain is simply a set of simplices, where a_i = 1 means the simplex σ_i is in the set and a_i = 0 means it is not. Next, we define the boundary of a p-simplex to be the sum of its (p − 1)-dimensional faces, and the boundary of a p-chain to be the sum of the boundaries of the simplices in the chain. A p-cycle is a p-chain with empty boundary; the p-cycles form a group. A p-boundary is a p-chain that is the boundary of a (p + 1)-chain; these again form a group. A concrete example of modulo-2 simplex addition is that of two triangles. A triangle contains three vertices and three edges, and the edges form its boundary. When we put two triangles together, the edge joining the two simplices is no longer part of the boundary.

Definition 3.2.5. The p-th homology group is defined as the p-th cycle group modulo the p-th boundary group. The p-th Betti number is the rank of the homology group, where rank refers to the number of linearly independent components.

Betti numbers loosely refer to the number of k-dimensional holes, or features, in a topological space, where β_0 refers to the number of connected components in a space and β_1 to the number of tunnels. For the most part, we will use β_0 in our analysis of the topological spaces generated from brain networks. Finally, we define a homomorphism as a map between homology groups that preserves group operations. Given a simplicial complex K and a function f : K → R, it is possible to construct a filtration on K. A filtration is a sequence of simplicial complexes connected by inclusion maps, ∅ = K_0 ⊆ K_1 ⊆ . . . ⊆ K_n = K, where K_i = K(a_i) and a_1 < a_2 < . . . < a_n are the function values of f. Each inclusion induces a homomorphism in each dimension p, so the filtration corresponds to a sequence of homology groups connected by homomorphisms for each p.
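In dimension 0, this filtration machinery has a very concrete algorithmic reading: features are connected components, and each inclusion step can only merge components. A minimal union-find sketch follows; it is an illustration under the convention that every node is born at filtration value 0 (as in the distance-based filtrations used later), and the function name and output format are our own, not the thesis tool's implementation.

```python
def betti0_barcode(n, edges):
    """0-dimensional persistence of a graph filtration.

    n: number of nodes; edges: list of (weight, u, v) tuples.
    Every node starts as its own component, born at 0. Processing
    edges in increasing weight order, each edge that joins two
    distinct components kills one bar at that edge's value.
    Components that never merge get death = infinity.
    Returns a list of (birth, death) pairs."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    bars = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # two components merge:
            bars.append((0.0, w))           # one bar dies at value w
            parent[ru] = rv
    survivors = n - len(bars)               # components never merged
    bars.extend([(0.0, float("inf"))] * survivors)
    return bars
```

Sorting edges by weight and recording each merge value as a death time is exactly the bookkeeping behind a Betti-0 barcode; for a path 0–1–2 with edge weights 0.3 and 0.7, the sketch produces bars dying at 0.3, at 0.7, and one surviving forever.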
As we move from one simplicial complex to the next, new homological features can be born, or existing features can die through merges. Persistence describes the time between the birth and death of a feature, where time refers to the filtration value a_i. Figure 3.3 shows the number of Betti-0 features, the connected components, decreasing as points merge into larger simplicial complexes.

Definition 3.2.6. [34] The p-th persistent homology groups are the images of the homomorphisms induced by inclusion, H_p^{i,j} = im f_p^{i,j} for 0 ≤ i ≤ j ≤ n. The p-th persistent Betti numbers are the ranks of these persistent homology groups.

A persistence diagram is used to explain the changes in persistent Betti numbers by encoding the persistence of homology features of that dimension. Let µ_p^{i,j} be the number of p-dimensional classes born at K_i and dying entering K_j. A persistence diagram is obtained by drawing each point (a_i, a_j) with multiplicity µ_p^{i,j} [34]. The horizontal coordinate of a point is a_i, the value of f when the feature is born, and the vertical coordinate is a_j, the value of f when the feature dies. A persistence barcode is similar, where each feature is a horizontal bar that starts at the time the class is born and ends at the time it dies. Figure 3.2 shows a simple persistence diagram and corresponding persistence barcode with 6 features.

Figure 3.2. Persistence diagram and barcodes

Differences in the topological profiles of objects, as defined by their persistence diagrams, can be measured with the bottleneck and Wasserstein distances [34].

Definition 3.2.7. Let X and Y be two persistence diagrams. To define the distance between them, we consider bijections η : X → Y and record the supremum of the distances between corresponding points for each.
Measuring the distance between points x = (x1, x2) and y = (y1, y2) as ||x − y||∞ = max{|x1 − y1|, |x2 − y2|} and taking the infimum over all bijections, we obtain the bottleneck distance between the diagrams,

W∞(X, Y) = inf_{η:X→Y} sup_{x∈X} ||x − η(x)||∞.

Definition 3.2.8. The degree-q Wasserstein distance between X and Y, for any positive real number q, is the sum of the q-th powers of the L∞-distances between corresponding points, minimized over all bijections:

Wq(X, Y) = [ inf_{η:X→Y} ∑_{x∈X} ||x − η(x)||∞^q ]^{1/q}

3.3 Topological Data Analysis of Brain Networks

Lee et al. [2] laid the foundations relating persistent homology to brain networks; the relationship is summarized below. The distance d between two points a and b is formally defined as an operator that maps to the non-negative real numbers. It is a metric if it is non-negative, equals 0 if and only if its two inputs are identical, is symmetric, and satisfies the triangle inequality. Euclidean distance is an example of a metric, given by d(a, b) = √(∑_{i=1}^{n} (ai − bi)²), where n is the dimension of the space.

With brain networks, the connection between regions is typically given by a Pearson correlation, which for fMRI data is the cosine similarity between time series [36]. The Pearson correlation, r = corr(x, y), is a form of cosine similarity between two datasets, normalized between −1 and 1, and can be transformed into a Euclidean distance, dist = √(2(1 − r)), through the law of cosines [37]. For our clustering tool, we transform edge weights from r to this Euclidean distance. For our visualization and comparison tool, we model edge weights as |r| to remain consistent with previous research done on the test dataset.

Figure 3.3. Filtrations
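For very small diagrams, both distances can be computed by brute force over all bijections. The sketch below is illustrative only (the function names and data are my own, and it assumes equal-size diagrams, ignoring the matchings to the diagonal that the full definitions allow); the thesis tools use the Hera library [41] for the real computation.

```python
from itertools import permutations

def linf(x, y):
    """L-infinity distance between two diagram points (birth, death)."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

def bottleneck(X, Y):
    """W_inf: minimize over bijections the largest matched distance."""
    return min(max(linf(x, y) for x, y in zip(X, perm))
               for perm in permutations(Y))

def wasserstein(X, Y, q=2):
    """W_q: minimize over bijections the sum of q-th powers of the
    matched L-infinity distances, then take the 1/q-th root."""
    return min(sum(linf(x, y) ** q for x, y in zip(X, perm))
               for perm in permutations(Y)) ** (1.0 / q)

# Two toy diagrams with two features each.
X = [(0.0, 2.0), (0.0, 5.0)]
Y = [(0.0, 3.0), (1.0, 5.0)]
print(bottleneck(X, Y))  # → 1.0
```

The factorial blow-up in `permutations` is why practical tools reduce the matching to a geometric assignment problem instead.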
For brain networks whose edge weights reflect Euclidean distance, a filtration of the network can be obtained by modeling the network as a subset of the Vietoris-Rips complex, where the highest dimension of the simplices is 1 since there are no faces. We then obtain filtrations by increasing the radius r, which corresponds to a correlation threshold, giving a series of simplicial complexes as shown in Figure 3.3. Given a filtration of Vietoris-Rips complexes, we can compute its persistent homology. We are specifically interested in dimension-0 persistent homology, which reflects the number of connected components. When the threshold is 0, there are no edges, so the Betti-0 number equals the number of points. As we increase the threshold, points are joined by edges, decreasing the number of independent connected components. The births and deaths of the Betti-0 features can thus be encoded in a persistence diagram or barcode. Lee et al. [2] also noted that the order in which nodes join corresponds to the order of single-linkage clustering.

For networks where the edge weight is given by |r|, we create a filtration from nested binary graphs based on an edge threshold λ. This is known as a graph filtration. The adjacency matrix of the binary graph is given by

aij(λ) = 1 if wij > λ, and 0 otherwise.

By creating a sequence of increasing thresholds, 0 ≤ λ0 ≤ λ1 ≤ . . . ≤ λn ≤ 1, we obtain a corresponding series of binary graphs, each with fewer edges and a larger number of dimension-0 features.

3.4 Clustering Techniques

In general, clustering refers to a family of techniques that group similar data together. Hierarchical clustering is a type of clustering algorithm in which each data point starts in its own cluster, and close clusters are iteratively merged according to a distance update formula. Nearest neighbor clustering defines the distance between two clusters as the distance between their two closest nodes.
Furthest neighbor clustering defines it as the distance between the two furthest nodes in the two clusters. With centroid clustering, there are multiple ways to define a centroid. One method is to average the coordinates of all nodes within a cluster across each dimension, which essentially works with squared Euclidean distances. The median clustering strategy is similar to centroid clustering, but instead of calculating a new point, it picks the most central node as the center of the cluster. Finally, with group average clustering, we sum the distances between all pairs of nodes across the two clusters and divide the sum by the product of the two cluster sizes.

A hierarchical clustering strategy repeatedly merges the closest clusters in a space. The distance between clusters depends on both the distance definition and the clustering method. The tree of merges is called a dendrogram, in which each node on a level of the tree represents a unique cluster. The number of clusters at each level shrinks as we move up the tree, until all nodes merge into a single root cluster. An example of the application of hierarchical clustering to brain networks is found in [38], where the authors used single-linkage clustering on fMRI data to divide a brain into regions.

CHAPTER 4
LIGHTWEIGHT VISUALIZATION TOOL

4.1 Design

The primary purpose of our tool is to incorporate topological data analysis (TDA) into brain network analysis, complementing traditional visualization and analysis techniques. Using TDA helps remove the impact of thresholding when building brain networks. Our visualization design thus needs to fulfill the following criteria: visually encoding the important dimensions of data intrinsic to brain networks, providing access to specific information about individual networks, highlighting cross-network features, and facilitating interactivity. Our visual design also needs to support both single-network visualization and network comparison.
The visual interface is broadly divided into three sections, as seen in Figure 4.1. On the left is an input panel, where users can select the number of networks they wish to display, upload files, or display an existing preloaded network from an autism study. Additional information from the visualization is also displayed in this panel. On the upper right is the brain network visualization itself, and on the bottom right is the persistence barcode of the network, where filtering is performed based on the edge weights of the connectivity matrix. Between the visualization and the barcode is a slider with which the user can adjust the threshold for the binary network, which corresponds to the filtration level. The main reason we chose to stack the network visualization and chart display vertically is to facilitate network comparison: the split creates space to display two comparable visualizations side by side.

Given this interface, we examine the characteristics of the brain network data and suitable visualization layouts. First, we look at the typical features of a brain network to determine the visualization layouts to include in our tool. The two primary features are the ROI in the brain and the connectivity between them. ROI have spatial coordinates as well as categorical labels. To best encode the physical locations of the ROI, we use a node-link diagram embedded in a 3D physical rendering of the brain as one visualization, as shown in Figure 4.2. In brain imaging, anatomical accuracy is one of the most important attributes, since physical proximity can contribute to the correlation between regions. Node-link diagrams provide an easy way to visually connect the ROI.

Figure 4.1. Visual interface
Figure 4.2. 3D node-link diagram of SN network comparison

An often-cited disadvantage of 3D diagrams is that the overlap of edges and nodes can reduce the readability of a visualization. We argue that brain imaging is an exception, since the presence of the 3D brain rendering gives context to a node's anatomical location. To deal with possible edge clutter, we allow users to zoom and rotate the models, which provides fine control over the granularity of the visualization and improves the visibility of relevant structure. To aid comparisons between networks, we link the zooming and rotating functionality of both networks when two networks are displayed. By maintaining a consistent orientation and scale, we make it easier for users to focus on the differences between edges. For comparison purposes, our tool also highlights similarities and differences between the edges of two networks: edges present only in the left network are shown in red, those present only in the right network in green, and those present in both in grey. The red and green edges are higher in saturation than the other colors in the visualization, immediately drawing the user's attention.

In addition to the 3D node-link diagram, we include a connectogram, also known as a wheel chart. Nodes are arranged in a circle instead of at arbitrary positions, which visually highlights the degree of each node. The positions at which nodes appear in the circle can be used to encode clusters of similar points; in our tool, we group nodes by the brain regions they belong to, as shown in Figure 4.3.

Figure 4.3. Wheel chart of SN network comparison

4.2 Topological Data Analysis

As mentioned in [2], persistent homology is an efficient way to summarize the edge weights of a graph. To allow users to more easily draw the connection between filtrations, connectivity, and the actual network, we link our network visualization to a dimension-0 persistence barcode through a slider representing threshold values.
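The thresholding behind this linked view is the graph filtration of Section 3.3. As a minimal sketch (toy edge data and function names of my own, not the tool's code), a union-find pass counts the connected components, i.e. β0, of the binary graph at a given threshold λ:

```python
def betti0(n, weighted_edges, lam):
    """Number of connected components (Betti-0) of the binary graph
    that keeps an edge (i, j, w) whenever w > lam, as in the graph
    filtration described in the text.  Nodes are labeled 0..n-1."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    comps = n
    for i, j, w in weighted_edges:
        if w > lam:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                comps -= 1
    return comps

# Toy network: a triangle of strong correlations plus one weakly
# attached node.  Raising the threshold detaches the weak node first.
edges = [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.7), (2, 3, 0.2)]
print([betti0(4, edges, lam) for lam in (0.0, 0.5, 0.85)])  # → [1, 2, 3]
```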
Changing the value of the slider simultaneously thresholds the network into a binary graph and highlights the corresponding filtration level on the persistence barcode with an orange bar. This allows users to visually keep track of the number of connected components present in the graph. On the persistence barcode, each bar represents the birth and death of a connected component. When the threshold is 0, we have the original complete weighted graph, which is fully connected, with β0 = 1. Note that the stronger the correlation, the smaller the distance, so the orange column moves in the direction opposite to the slider. Since we are concerned with the merging of connected components as the threshold decreases, our tool can also highlight the edge responsible for the most recent merge, as seen in Figure 4.4. This functionality is available in both single-network visualization and network comparison. For each node, we also compute the number of times it is involved in a merge, which equals its degree in the minimum spanning tree. This information is shown upon clicking the node in question in the 3D diagram, as shown in Figure 4.4.

4.3 Implementation

As a web tool, we primarily use Python and Flask as the backend server to handle network computations and requests from the browser. Internally, networks are stored as graph objects in the Networkx library, which we also use to compute the minimum spanning trees of a graph. To compute the persistence barcode, we simply examine the minimum spanning tree of the graph: after sorting the weights, the edges are listed in the order in which they cause merges. The number of times a node is involved in a merge is also the degree of the node in the minimum spanning tree, where degree refers to the number of edges incident to the node.

Figure 4.4. SN control network with edge highlight
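The MST-based barcode computation just described can be sketched with Kruskal's algorithm (an illustrative toy with made-up edge weights; the tool itself obtains the minimum spanning tree from the Networkx library). Each MST edge, taken in order of increasing weight, merges two components and therefore ends one bar; the last surviving component's infinite bar is omitted here.

```python
def barcode_from_mst(n, weighted_edges):
    """Dimension-0 persistence barcode of a weighted graph via
    Kruskal's algorithm: processing edges by increasing weight, every
    edge that joins two components (i.e. an MST edge) kills one bar.
    Returns the finite (birth, death) bars and per-node merge counts,
    which equal node degrees in the minimum spanning tree."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars, merge_count = [], [0] * n
    for i, j, w in sorted(weighted_edges, key=lambda e: e[2]):
        ri, rj = find(i), find(j)
        if ri != rj:                 # an MST edge: two components merge
            parent[ri] = rj
            bars.append((0.0, w))    # one component dies at weight w
            merge_count[i] += 1
            merge_count[j] += 1
    return bars, merge_count

edges = [(0, 1, 0.1), (1, 2, 0.4), (0, 2, 0.5), (2, 3, 0.9)]
bars, merges = barcode_from_mst(4, edges)
print(bars)    # → [(0.0, 0.1), (0.0, 0.4), (0.0, 0.9)]
print(merges)  # → [1, 2, 2, 1]
```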
For the visualization of the 3D node-link diagram, we use the three.js library, which has 3D WebGL rendering capabilities. For the wheel chart and the persistence barcode, we use the d3.js library. We make extensive use of existing scripts to ensure that the two node-link diagrams stay fully linked when rotating and zooming. Clicking on a node reveals vertex information from the individual networks, since the topological information about the node differs between the two networks. Users are required to upload two files for each network: one containing the connectivity matrix, and the other containing node information such as the physical coordinates of each node and the region it belongs to. Users can also choose between viewing a single network and comparing two networks.

4.4 Results

Our tool is used to load and explore the autism-related data used in [39] and [40]. The data are derived from 49 male subjects with autism and 49 male control subjects. After preprocessing, the salience (SN), executive control (ECN), and default mode (DMN) intrinsic connectivity networks are derived from the right frontoinsular cortex, the right dorsolateral prefrontal cortex, and the right posterior cingulate cortex, and comprise 39, 32, and 39 nodes, respectively. Each network represents population-level correlation: the correlation between regions is computed from the covariance of gray matter densities across all subjects in a group for each region, a defining feature of structural covariance MRI. To remain consistent with [39], the edge weight is taken to be the absolute value of the correlation from the original correlation matrix. Figure 4.2 shows the visualization of the salience network. From the barcodes, we see that at a threshold of 0.67 the control network is fully connected, whereas there are still three separate connected components in the autistic network.

Figure 4.5. DMN network comparison
The difference makes sense: when we look at the node-link diagram, there are many more links in the control network (as indicated by the green lines) than in the autistic brain, although there are also four areas with stronger correlation in the autistic brain. Figure 4.5 shows the visualization of the default mode network. Surprisingly, the connectivity between regions forms faster in the autistic brain than in the control brain, as seen from the slimmer columns in the persistence barcode. From the node-link diagram, we can see that although there are indeed more connections present in the autistic brain at a threshold of 0.74, there are also a fair number of edges present only in the control brain, as indicated by the green edges. Thus, the contrast with the salience network suggests that factors other than the strength of the connections are involved in autism. Lastly, Figure 4.6 shows the executive control network. It is similar to the salience network, with many more edges in the control network than in the autistic brain. In summary, in all three networks there are distinct differences in topological profile between the brains of autistic patients and control subjects.

Figure 4.6. ECN network comparison

CHAPTER 5
HIERARCHICAL CLUSTERING AND TOPOLOGICAL PROFILES OF NETWORKS

Networks generated from connectivity matrices are by definition complete graphs, since the matrix contains the pairwise relationships between all ROI. Even with thresholding, such networks can be difficult to visualize without producing a hairball effect, creating difficulties in graph interpretation [9]. The visual clutter is especially prominent with large unprocessed data generated from raw image volumes, which can contain tens of thousands of voxels. It is thus often desirable to reduce a network to a more manageable size. Clustering is a common technique used to group similar nodes together into a super node, which reduces the size of the network.
However, clustering can change the topology of a graph. To better understand the impact of clustering on the topological profile of a network, we created a web tool that shows the changes in graph-theoretic measures and topological profile of a graph at different levels of hierarchical clustering. In particular, we quantify the change in the topological profile of a graph across clustering resolutions by measuring the bottleneck and Wasserstein distances between the original network and the clustered network. We then evaluate the utility of the bottleneck and Wasserstein distances as clustering quality indicators against traditional clustering quality statistics.

In our tool, we select nearest neighbor, furthest neighbor, and group average as the clustering strategies to explore. For nearest neighbor, the distance between two groups is defined as the closest distance between their elements. For furthest neighbor, it is the furthest distance between members of the two groups. Group average distance is simply the average of all links between the two groups. Given the correlation between two points, we can translate it into a distance metric using d = √(2(1 − r)), as shown in [36].

Similar to the visualization tool, our clustering tool looks at dimension-0 persistent homology, which reflects the number of connected components in the network. The tool then calculates the dimension-0 Wasserstein and bottleneck distances.

5.1 Functionality and Design

We use a layout similar to that of our visualization tool, with a control panel on the left, a network visualization panel on the upper right, and an analytics chart on the bottom right, as seen in Figure 5.1. For convenience, we use a force-directed layout to visualize the graph.
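The two pieces of this pipeline can be sketched in plain Python (a toy illustration with made-up correlations and my own function names; the actual tool uses Scipy's linkage routines): the correlation-to-distance transform d = √(2(1 − r)), followed by naive agglomerative clustering under the three linkage rules.

```python
from math import sqrt

def corr_to_dist(r):
    """Map a Pearson correlation r in [-1, 1] to a distance via
    d = sqrt(2 * (1 - r)): r = 1 gives 0, r = -1 gives the maximum 2."""
    return sqrt(2.0 * (1.0 - r))

def agglomerate(dist, linkage="single"):
    """Naive hierarchical agglomerative clustering on a symmetric
    distance matrix.  Returns the merge sequence as
    (cluster_a, cluster_b, merge_distance) triples, i.e. the dendrogram
    read bottom-up.  single = nearest neighbor, complete = furthest
    neighbor, average = group average."""
    rule = {"single": min, "complete": max,
            "average": lambda ds: sum(ds) / len(ds)}[linkage]
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = rule([dist[i][j] for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a] = clusters[a] | clusters[b]
        del clusters[b]
    return merges

# Four nodes with correlations; convert to distances, then cluster.
r = [[1.0, 0.9, 0.1, 0.0],
     [0.9, 1.0, 0.2, 0.1],
     [0.1, 0.2, 1.0, 0.8],
     [0.0, 0.1, 0.8, 1.0]]
dist = [[corr_to_dist(x) for x in row] for row in r]
merges = agglomerate(dist, "single")
print([sorted(a | b) for a, b, _ in merges])  # → [[0, 1], [2, 3], [0, 1, 2, 3]]
```

The quadratic pair scan per step is fine for a toy; Scipy's implementation uses the Lance-Williams update formulas instead.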
Users are required to upload a connectivity matrix, specify the type of hierarchical clustering they want to analyze, and choose the number of slices to take from the dendrogram of the clustering. Each slice corresponds to a clustering of the original graph. For each clustering, we compute the modularity, assortativity, bottleneck distance, and Wasserstein distance. Our analytics results are displayed in a simple combined line and bar chart, with the x-axis showing the size of the displayed network and the y-axis the value of the selected measure. We chose to combine a bar chart and a line chart because the line highlights trends, while the bars show the cluster sizes. The tool allows users to scroll through the different cluster sizes, highlighting the corresponding bar in the analytics chart in orange.

5.2 Implementation

Like our visualization tool, the clustering tool uses a combination of Flask, Python, and Networkx as the backend server. For the clustering of networks, we use the Scipy library to perform single-linkage, complete-linkage, and average-linkage clustering on a connectivity matrix. For the calculation of modularity and assortativity, we use the Networkx library. We use the Hera tool [41] to compute the dimension-0 bottleneck and Wasserstein distances. Finally, we use sigma.js to visualize the network, and d3.js to create the line and bar chart.

5.3 Results

To show the tool in action, we visualize the connectivity matrix of a single control brain from the Autism Brain Imaging Data Exchange (ABIDE). The network comprises 264 nodes from the Power 264 regions-of-interest atlas and is derived from rs-fMRI data. For the edges, we used d = √(2(1 − r)) as the distance between nodes, where r is the Pearson correlation coefficient between the BOLD time series of two regions.

Figure 5.1. Clustering tool layout
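As a sketch of one of these measures, Newman modularity for an unweighted, partitioned graph can be computed directly from its definition (an illustrative toy with a made-up graph, standing in for the Networkx call the tool actually makes; assortativity and the diagram distances are likewise delegated to their respective libraries):

```python
def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted
    graph: Q = sum over communities c of (e_c / m) - (d_c / (2m))^2,
    where e_c is the number of intra-community edges, d_c the total
    degree of the community's nodes, and m the total edge count."""
    m = len(edges)
    label = {v: c for c, group in enumerate(communities) for v in group}
    q = 0.0
    for c, group in enumerate(communities):
        e_c = sum(1 for u, v in edges if label[u] == c and label[v] == c)
        d_c = sum(1 for u, v in edges for w in (u, v) if label[w] == c)
        q += e_c / m - (d_c / (2 * m)) ** 2
    return q

# Two triangles joined by a single bridge edge: the natural
# two-community split scores well above zero.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(round(modularity(edges, [{0, 1, 2}, {3, 4, 5}]), 4))  # → 0.3571
```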
We use our tool with the above connectivity matrix to explore the changes in these measures for the three types of clustering we perform. For convenience, we set the number of slices to 20. The charts our tool produced are shown below, grouped by measure type. The y-axis of each chart shows the value of the measure, ranging from its smallest to its largest value, and the x-axis shows the size of the clustered network.

For all assortativity graphs in Figure 5.3, values increase logarithmically, with the increase leveling off around 34 nodes. From the graphs of the bottleneck distances, it is difficult to identify a specific cutoff point. The distance between identical graphs is 0, as expected at 264 nodes, and the largest distance is 0.6 for a graph with 4 nodes. The changes between the remaining values appear fairly linear, with a mild increase in change between 4 and 20 nodes and between 255 and 264 nodes. The graphs are identical for all three clustering types. Modularity values in Figure 5.2 follow the same pattern, except for nearest neighbor clustering, where the values fluctuate for graphs below size 95. For average and furthest neighbor clustering, we would likely pick 40 as the clustering cutoff, since we want to maximize modularity.

Figure 5.2. Modularity (nearest neighbor, furthest neighbor, average)
Figure 5.3. Assortativity (nearest neighbor, furthest neighbor, average)
Figure 5.4. Bottleneck distance (nearest neighbor, furthest neighbor, average)
Figure 5.5. Wasserstein distance (nearest neighbor, furthest neighbor, average)

The β0 bottleneck distances for all three clustering methods form a mild S-shaped curve that is fairly close to linear in Figure 5.4. From the shape of the curve alone, it is difficult to determine whether there is an optimal cutoff point for the clustering size. For the Wasserstein distance, the graphs for all three clusterings in Figure 5.5 look fairly linear.
However, the line for nearest neighbor clustering is convex, whereas the line for furthest neighbor clustering is concave. As with the bottleneck charts, it is difficult to determine a precise cutoff point for the clustering from the chart. Overall, it is clear that the topological profile captured by the bottleneck and Wasserstein distances behaves differently from modularity and assortativity, both of which capture the small-world characteristics of the network. Further investigation is necessary to relate the changes in the topology-based distance measures to changes in network structure during clustering.

CHAPTER 6
DISCUSSION

Both the visualization and clustering tools are currently prototypes that could be polished further. The main goal of our web visualization tool is to provide an easy way to compare the connectivity matrices of brain networks. From a visualization perspective, our tool is lightweight and easy to use. This simplicity, however, comes at the cost of flexibility in the visual encoding of data. Currently, we can only display predetermined attributes with fixed visual encodings, and the uploaded files must follow a specified format with location and region information. Flexibility could be improved by moving to a more modular design, so that users can specify the data features and corresponding encodings they would like to visualize. For instance, users could upload a list of ROI volumes, and our graph could dynamically adjust node saturation or size to reflect that attribute. Our tool could also be improved in its visualization of analytic measures derived from the network. Currently, users can choose between using edge color to compare and contrast the edges of two networks, or highlighting the edge responsible for merging two connected components. It would be better to also have the option of highlighting the members of a connected component in the network.
Another way to expand the topological analysis aspect of our tool is to look at persistent homology beyond dimension 0. Currently, our visualization tool considers only dimension-0 features, namely connected components. In the future, we could also look at the behavior of dimension-1 features, the tunnels in the network. Given the autism data, our tool clearly illustrates differences between control and autistic networks. Although there is a clear difference in the persistence barcodes of autistic and control brains, the difference is not consistent across the three networks we examine. Considering all three networks together, autistic brains likely have weaker correlation between regions than controls. However, the clear difference between the topological profiles of the DMN subnetworks suggests that the pattern is not universal.

For the clustering analysis tool, we are able to accurately show the changes in several graph measures across different levels of hierarchical clustering. However, the distance between the persistence profiles of a network does not seem to be a suitable measure for determining the optimal clustering resolution, based on the charts from our test network. The slope of the change is fairly linear, which provides no information on the suitability of one cluster size over another. It is possible that the Wasserstein and bottleneck distances would reveal an optimal clustering point for a larger or different network; investigating different networks is a plausible direction for future work. Further work is also needed to analyze the behavior of the Wasserstein and bottleneck distances in order to generalize these measures to other networks.

It is clear from both our work and previous work [2] that a network's topological profile captures significant network characteristics that can be used for network differentiation. Thus far, it has been applied to the study of autism and ADHD.
There is much potential in further exploring the causes of the differences in topological profiles between a control brain and one with a neuropsychiatric disorder. In future studies, it would be interesting to examine the relationship between a network's topological profile and other graph-theoretic measures. For the clustering portion of the project, a possible future direction is to examine whether the size of the original network has an impact on the topological profile of the clustering. In both our experiment with autism data and previous studies involving ADHD, the bottleneck distance did not perform well as a quality measure. This is unexpected, since there were distinct differences in the topological profiles of the control and experimental groups. A plausible direction is to investigate whether changing the clustering technique to spectral or k-means clustering would make a difference in the topological profile of a network.

REFERENCES

[1] W. E. Klunk, H. Engler, A. Nordberg, Y. Wang, G. Blomqvist, D. P. Holt, M. Bergström, I. Savitcheva, G.-F. Huang, S. Estrada, et al., "Imaging amyloid in Alzheimer's disease with Pittsburgh compound-B," Annals of Neurology, vol. 55, no. 3, pp. 306–319, 2004.
[2] H. Lee, H. Kang, M. K. Chung, B.-N. Kim, and D. S. Lee, "Persistent brain network homology from the perspective of dendrogram," IEEE Transactions on Medical Imaging, vol. 31, no. 12, pp. 2267–2277, 2012.
[3] R. A. Becker, S. G. Eick, and A. R. Wilks, "Visualizing network data," IEEE Transactions on Visualization and Computer Graphics, vol. 1, pp. 16–28, Mar. 1995.
[4] I. Herman, G. Melancon, and M. S. Marshall, "Graph visualization and navigation in information visualization: A survey," IEEE Transactions on Visualization and Computer Graphics, vol. 6, pp. 24–43, Jan. 2000.
[5] P. Eades, "A heuristic for graph drawing," Congressus Numerantium, vol. 42, pp. 149–160, 1984.
[6] B.-J. Breitkreutz, C. Stark, and M. Tyers, "Osprey: a network visualization system," Genome Biology, vol. 4, no. 3, p. R22, 2003.
[7] M. Sarkar and M. H. Brown, "Graphical fisheye views of graphs," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 83–91, ACM, 1992.
[8] D. S. Margulies, J. Böttger, A. Watanabe, and K. J. Gorgolewski, "Visualizing the human connectome," NeuroImage, vol. 80, pp. 445–461, 2013.
[9] B. Alper, B. Bach, N. Henry Riche, T. Isenberg, and J.-D. Fekete, "Weighted graph comparison techniques for brain connectivity analysis," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 483–492, ACM, 2013.
[10] A. Irimia, M. C. Chambers, C. M. Torgerson, and J. D. Van Horn, "Circular representation of human cortical networks for subject and population-level connectomic visualization," NeuroImage, vol. 60, no. 2, pp. 1340–1351, 2012.
[11] M. Xia, J. Wang, and Y. He, "BrainNet Viewer: a network visualization tool for human brain connectomics," PLoS ONE, vol. 8, no. 7, p. e68910, 2013.
[12] S. Gerhard, A. Daducci, A. Lemkaddem, R. Meuli, J.-P. Thiran, and P. Hagmann, "The Connectome Viewer Toolkit: an open source framework to manage, analyze, and visualize connectomes," Frontiers in Neuroinformatics, vol. 5, p. 3, 2011.
[13] E. G. Christodoulou, V. Sakkalis, V. Tsiaras, and I. G. Tollis, "BrainNetVis: an open-access tool to effectively quantify and visualize brain networks," Computational Intelligence and Neuroscience, vol. 2011, 2011.
[14] J. A. Brown and J. D. Van Horn, "Connected brains and minds: the UMCD repository for brain connectivity matrices," NeuroImage, vol. 124, pp. 1238–1241, 2016.
[15] S. S. Flitman, "Survey of brain imaging techniques with implications for nanomedicine," Foresight Nanotech Institute (online), 2000.
[16] M. Kaiser, "A tutorial in connectome analysis: topological and spatial features of brain networks," NeuroImage, vol. 57, no. 3, pp. 892–907, 2011.
[17] F. D. V. Fallani, J. Richiardi, M. Chavez, and S. Achard, "Graph analysis of functional brain networks: practical issues in translational neuroscience," Phil. Trans. R. Soc. B, vol. 369, no. 1653, p. 20130521, 2014.
[18] P. Frosini, "Measuring shapes by size functions," in Intelligent Robots and Computer Vision X: Algorithms and Techniques, vol. 1607, pp. 122–134, International Society for Optics and Photonics, 1992.
[19] V. Robins, "Towards computing homology from finite approximations," in Topology Proceedings, vol. 24, pp. 503–532, 1999.
[20] H. Edelsbrunner and E. P. Mücke, "Three-dimensional alpha shapes," ACM Transactions on Graphics (TOG), vol. 13, no. 1, pp. 43–72, 1994.
[21] V. De Silva, R. Ghrist, et al., "Coverage in sensor networks via persistent homology," Algebraic & Geometric Topology, vol. 7, no. 1, pp. 339–358, 2007.
[22] H. Choi, Y. K. Kim, H. Kang, H. Lee, H.-J. Im, E. E. Kim, J.-K. Chung, D. S. Lee, et al., "Abnormal metabolic connectivity in the pilocarpine-induced epilepsy rat model: a multiscale network analysis based on persistent homology," NeuroImage, vol. 99, pp. 226–236, 2014.
[23] C. J. Carstens and K. J. Horadam, "Persistent homology of collaboration networks," Mathematical Problems in Engineering, vol. 2013, 2013.
[24] M. K. Chung, P. Bubenik, P. T. Kim, K. M. Dalton, and R. J. Davidson, "Persistence diagrams of cortical surface data," Information Processing in Medical Imaging, vol. 5636, pp. 386–397, 2009.
[25] B. Cassidy, C. Rae, and V. Solo, "Brain activity: Conditional dissimilarity and persistent homology," IEEE 12th International Symposium on Biomedical Imaging (ISBI), pp. 1356–1359, 2015.
[26] Y. Dabaghian, F. Mémoli, L. Frank, and G. Carlsson, "A topological paradigm for hippocampal spatial map formation using persistent homology," PLoS Computational Biology, vol. 8, no. 8, p. e1002581, 2012.
[27] E. Bullmore and O. Sporns, "Complex brain networks: graph theoretical analysis of structural and functional systems," Nature Reviews Neuroscience, vol. 10, no. 3, p. 186, 2009.
[28] G. N. Lance and W. T. Williams, "A general theory of classificatory sorting strategies: 1. Hierarchical systems," The Computer Journal, vol. 9, no. 4, pp. 373–380, 1967.
[29] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski, and D. Wagner, "On modularity clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 2, pp. 172–188, 2008.
[30] S. Ben-David and M. Ackerman, "Measures of clustering quality: A working set of axioms for clustering," in Advances in Neural Information Processing Systems, pp. 121–128, 2009.
[31] M. E. Newman, "Detecting community structure in networks," The European Physical Journal B, vol. 38, no. 2, pp. 321–330, 2004.
[32] M. E. Newman, "Modularity and community structure in networks," Proceedings of the National Academy of Sciences, vol. 103, no. 23, pp. 8577–8582, 2006.
[33] J. Badham and R. Stocker, "The impact of network clustering and assortativity on epidemic behaviour," Theoretical Population Biology, vol. 77, no. 1, pp. 71–75, 2010.
[34] H. Edelsbrunner and J. Harer, Computational Topology: An Introduction. American Mathematical Society, 2010.
[35] J. R. Munkres, Elements of Algebraic Topology. CRC Press, 2018.
[36] Y. Zhu and D. Shasha, "StatStream: Statistical monitoring of thousands of data streams in real time," in VLDB'02: Proceedings of the 28th International Conference on Very Large Databases, pp. 358–369, Elsevier, 2002.
[37] L. Egghe and L. Leydesdorff, "The relation between Pearson's correlation coefficient r and Salton's cosine measure," Journal of the Association for Information Science and Technology, vol. 60, no. 5, pp. 1027–1036, 2009.
[38] D. Cordes, V. Haughton, J. D. Carew, K. Arfanakis, and K. Maravilla, "Hierarchical clustering to measure connectivity in fMRI resting-state data," Magnetic Resonance Imaging, vol. 20, no. 4, pp. 305–317, 2002.
[39] S. Palande, V. Jose, B. Zielinski, J. Anderson, P. T. Fletcher, and B. Wang, "Revisiting abnormalities in brain network architecture underlying autism using topology-inspired statistical inference," in International Workshop on Connectomics in Neuroimaging, pp. 98–107, Springer, 2017.
[40] B. A. Zielinski, J. S. Anderson, A. L. Froehlich, M. B. Prigge, J. A. Nielsen, J. R. Cooperrider, A. N. Cariello, P. T. Fletcher, A. L. Alexander, N. Lange, et al., "scMRI reveals large-scale brain network abnormalities in autism," PLoS ONE, vol. 7, no. 11, p. e49172, 2012.
[41] M. Kerber, D. Morozov, and A. Nigmetov, "Geometry helps to compare persistence diagrams," Journal of Experimental Algorithmics (JEA), vol. 22, pp. 1–4, 2017.



