Structure- and Physics-Preserving Reductions of Power Grid Models

The large size of multiscale, distribution and transmission, power grids hinders fast system-wide estimation and real-time control and optimization of operations. This paper studies graph reduction methods for power grids that are favourable for fast simulations and follow-up applications. We present systematic techniques that reduce the grid size while preserving basic functionality in the reduced grid. The reduction is achieved through iterative aggregation of sub-graphs that include general tree structures, lines, and triangles. An important feature of our reduction algorithms is the efficient use of hash tables to store the sequential reduction of the original grid in the reduced network. This allows for easy introduction of detailed models into the reduced conceptual network and makes the model backward compatible with varying resolution. The performance of our graph reduction algorithms, and features of the reduced grids, are discussed on a real-world transmission and distribution grid. We produce visualizations of the reduced models through open source libraries and release our reduction algorithms with example code and toy data.

1. Introduction. Power grids consist of the network of transmission and distribution lines connecting generators with end-users, enabling the transfer of electricity. The power grid of North America, in particular, is recognized as the most complicated machine built on earth [1,30]. Topologically the grid is represented by a large, connected graph with nodes denoting buses (loads and generation) and edges representing lines. These nodes and edges are constructed in distinct formations across the physical scales in the problem of power delivery [17]. The transmission and distribution sub-networks exist in a hierarchical configuration, where the transmission sub-network consists of high voltage lines connecting generators to substations and the distribution sub-network connects substations to end users [10]. System-wide monitoring and control of the grid involves simulation studies carried out by network authorities like independent system operators (ISO) [19,24]. Simulating grid operations relies on accurate state estimation and optimization with respect to power flow laws, describing interactions across layers of temporal and spatial resolution [27].
Over time, the grid and its dynamical characteristics undergo changes with the introduction of new loads, generators and network components. The increased penetration of renewable energy, e.g., solar and wind power, has expanded the frontiers of the grid and also made issues regarding grid stability and control of paramount importance [3,31,13]. Dynamic forcing from the distribution grid has historically been much smaller than that from the transmission components. For example, the amount of inertia and damping in the distribution grid is limited [18]. However, rooftop solar, the internet of things [22], and other resources have cultivated the demand for decentralized resource generation and control in the distribution sub-grid [14,25,12,9]. With this demand comes the need for multiscale, dynamical models of the grid.
Owing to the large size and dense interconnections, control, optimization and dynamical simulation of the detailed grid face implementation issues [5,4]. Operational demands require reduced order and approximate schemes to improve the efficiency of computations and simulations of grid operations. However, one must ensure that the model reduction schemes are true to the original grid and have comparable dynamic behavior, or approximate the same. This paper analyses system-aware graph reductions of large power grids, to construct conceptual networks amenable for follow-up action and reanalysis. System-awareness here refers to the use of parameters such as nodal voltage, line impedance and bulk nodal inertia; we additionally aim to preserve topological features like nodal degree, presence of graph paths etc. in the graph reduction process. This is necessary for our underlying goal which is to develop graph reduction schemes that preserve qualitative features of the original grid's dynamic behavior for accurate state estimation and disturbance prediction.

Contribution.
There is extensive research into reduction algorithms for improving the analysis of large networks and reducing the computational burden therein. Community detection approaches use graph based methods to collapse subnetworks into smaller, representative components, e.g. Kannan et al. analyze criteria for effective clustering approaches in relation to the spectrum of the graph Laplacian [21]; Newman develops reduction methodology in terms of the node-group connectivity measure of modularity [26]. While community detection methods have applications in power systems, these approaches are not appropriate for constructing a dynamically consistent reduced order model. Other works, in circuit design, have focused on network reductions which preserve static power flow computations, e.g. Zhou et al. study block based hierarchical graph reduction schemes for fast solution to power flow for in-chip circuits [35]; Wang provides a deterministic random walk based preprocessing and graph reduction algorithm also aimed at solving the DC power flow problem [32]; Chen & Chen present Krylov-subspace iterative methods for preconditioning [7].
Dorfler & Bullo, in particular, provide a detailed mathematical analysis of the classical Kron reduction [11] for power grid networks. Given a choice of reference nodes, the Kron reduction uses Gaussian elimination to pare down the full network to a reduced model that is electrically equivalent from the perspective of the references. The Kron reduction has been applied extensively in power systems analysis with success in control and optimization problems [11, and references therein]. The selection of reference nodes is unambiguous for transmission networks under a classical, hierarchical distribution design. However, the deployment of decentralized generation and storage in the distribution sub-grid makes the selection of the references problematic: while individual distribution nodes do not provide significant generation, the aggregation of these can strongly impact the optimal power flow and control problem.
Our work is distinguished by graph reductions designed to respect the dynamics and topology of multiscale, distribution and transmission, electric grid networks.
Our key contribution is developing a series of invertible graph reduction algorithms and demonstrating the viability of these techniques on a real electric grid in a US Midwest utility. Our methodology emphasizes two design features: (i) the reductions are system aware, respecting the network topology and grid parameters, and (ii) the reductions may be partially inverted. System-awareness enforces that the reduced network respects the power flow of the full network, but complex dynamical features can be given increased resolution by inverting the nodal clustering post-facto. Unstructured data, describing the placement of clustered nodes in the reduced model, can be used to parameterize net power flow. However, this unstructured data is insufficient to increase the resolution on a specific cluster. Effective implementation of data structures, tracking sequential reductions, has been an integral component of our work: our design enforces backwards compatibility, with intermediate, partially reduced representations of network features. Implementing these techniques on a real, multiscale electric grid, we present the results of our analysis comparing the graph characteristics of the reduced network to those of the full network. We interactively visualize the reduced network, utilizing graph topology rather than geographical location, to qualitatively analyze the results.

d1      degree one reduced
d2      degree two reduced
tri     triangle reduced
vThr    voltage threshold
dThr    degree threshold
deg(b)  degree of b
Fig. 1. Algorithm notations, e.g. the degree one reduced node and edge sets are denoted d1N, d1E respectively. The degree one reduction data is stored in d1H. Degree and voltage thresholds are criteria set for greedy triangular reductions in Algorithm 4.
In section 2, we present our main results, including our reduction algorithms and the analysis of their performance on the real, multiscale network. In section 3 we demonstrate conceptual visualizations of the reduced, case study network, and the nodal clustering produced by the algorithms. We detail our use of data structures in Appendix A, explaining how to invert the algorithms to increase the resolution post-facto. Utilizing the reduced network for online, dynamic modelling of multiscale electric grids is discussed in section 4 where we introduce future directions of research. Finally, example code and test data are available as supplementary material online [16] with interactive visualizations available in web browsers [15].
2. Reduction algorithms. This section develops our algorithms for the sequential reduction of a multiscale, distribution and transmission, power grid network. A reference for our notations can be found in Figure 1. Our methodology is designed to achieve the following objectives:
1. the reduced network faithfully represents the original network's power flow;
2. the full network can be (partially) reconstructed from the reduced model;
3. the reduced network is of a scale that is amenable to interactive visualization.
We focus on off-line methods meeting the above objectives. As an off-line reduction, we use static network characteristics to produce a model which is robust to changes in the production, consumption, and dynamic characteristics of loads and generators. In particular, the same reduced network will be used to represent different load and parameter states including the inertia, damping and frequency control settings. Any such reduction implicitly assumes infrequent updates. These updates occur when topology and parameter estimations (line impedances, transformer settings, and related) undergo a significant change, e.g. in seasonal transitions or following major network modifications. With such schemes, operators infrequently need to reproduce reduced order models (and their associated parameterizations). However, the scheme can be improved with state estimation available through phasor measurement unit (PMU) technology in real time [29], i.e. on-line. The construction of a robust reduction scheme involving on-line state estimation will be the subject of future research. In the remainder of this section we discuss four consecutive sub-steps of our off-line scheme and analyze the performance of the algorithms on a real multiscale network.
2.1. Degree zero reductions. As a preliminary step, we remove nodes of zero nominal voltage and restrict the test network to a single connected component, representing a full regional power grid, as described in the following definition.
Definition 1. The set N is defined to be the set of nodes in the pre-processed network, consisting of a single connected component with all nodes of nominal voltage greater than zero. The edge set for this network is denoted E and defines a symmetric connectivity matrix. Every node b ∈ N may be identified with its row/column number in the connectivity matrix so that we associate b equivalently with this index. We will denote the undirected line connecting the two nodes b_0, b_1 by the set {b_0, b_1} ∈ E.
The connected sub-network contains 53,155 nodes, 63,832 edges, 268 PMU devices and 4,332 generators. The distribution of node degrees is given in Figure 4. The mean node degree is 2.40 with standard deviation of 1.61, and a max degree of 40. Our analysis is performed by manipulating the connectivity matrix, but we describe the reduction algorithms at a high level with the node and edge sets, N and E.
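As a minimal sketch of this pre-processing step, the following Python fragment drops zero-voltage nodes, restricts the graph to one connected component, and computes the mean degree. The input format (a dict of nominal voltages and a list of node pairs), the function names, and the choice of keeping the largest component are assumptions of this illustration and are not taken from the released code.

```python
from collections import deque


def preprocess(voltage, lines):
    """Return (N, E) for the retained connected component.

    voltage: dict mapping node -> nominal voltage (hypothetical input format).
    lines:   iterable of (u, v) node pairs.
    Assumption: when several components exist, the largest one is kept.
    """
    # Keep only nodes with positive nominal voltage.
    alive = {b for b, v in voltage.items() if v > 0}
    adj = {b: set() for b in alive}
    for u, v in lines:
        if u in alive and v in alive and u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Breadth-first search over every component, remembering the largest.
    seen, largest = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            b = queue.popleft()
            for c in adj[b] - comp:
                comp.add(c)
                queue.append(c)
        seen |= comp
        if len(comp) > len(largest):
            largest = comp

    N = largest
    E = {frozenset((u, v)) for u in N for v in adj[u]}
    return N, E


def mean_degree(N, E):
    """Mean node degree of an undirected graph: 2|E| / |N|."""
    return 2 * len(E) / len(N)
```

For the reported network sizes, the identity 2|E|/|N| gives 2 × 63,832 / 53,155 ≈ 2.40, which agrees with the mean degree stated above.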

2.2. Degree one reductions.
While the physical lines constituting the distribution sub-network form meshed, loopy graphs, the operational topology for load balancing consists of radial tree structures [9]. Operational switching disconnects lines and the meshed topology so that the substations, connected to the transmission network, form roots of disjoint trees in the distribution sub-networks. This structure differs significantly from the transmission network which typically has multiple loops energized at all times to guarantee continuous delivery to the substations [8,9]. In an operational window where the radial structure is unchanged, the distribution graph structure lends itself to an intuitive representation of the network's multiscale coupling. We map disjoint distribution trees to their respective roots at substations: in our conceptual network, these terminal roots form super nodes which are used to represent the entire behaviour of the tree. Dynamically this is realized by modelling the net power flow of the entire tree through the terminal super node.
Confining our analysis to the connected network N, we collapse all trees in the network to their root nodes. This reduction is performed unambiguously by recursively mapping each node of degree one into the node with which it shares a line. The recursive step is performed until all nodes in N are of degree two or greater. Our method is described in Algorithm 1. To post process, and refine the graph structure, our design allows one to invert the collapse of any subset of a tree in the network; we use hashable maps [33] for ease of implementation in this reconstruction. For each terminal super node in which we cluster a tree, we associate a sequence of lists and arrays representing the recursive reduction procedure. This implementation is described comprehensively in Appendix A. We define the following notation.
Definition 2. The degree of the node b_0 is denoted deg(b_0). The data structure d1H is a hashable map, {"field" : "data"}, where "data" is an ordered list. Any subset of nodes t ⊂ N is defined as a tree if it is collapsed to a node under Algorithm 1. The mapping which collapses a tree, or a collection of trees, to the root node b_0 is associated to the field t_{b_0} in d1H.
Algorithm 1 Degree one reduction
Define: d1N := N, d1E := E, d1H := empty hashable map.

In each loop of Algorithm 1 we collapse the degree one node b_0 into the connected node b_1. The if statement requires that whenever a list of collapsed trees is associated to t_{b_0} ∈ d1H, we append all associated arrays to the list t_{b_1}, and b_1 is appended to each array denoting the root node. Algorithm 1 reduces the test network to 32,891 nodes and 43,568 edges. The histogram of tree lengths and the distribution of the degrees of the nodes in d1N are given in Figure 4. The mean degree of nodes in d1N is 2.65, with a standard deviation of 1.42 and a maximal degree of 38. Tree lengths are calculated as the number of nodes aggregated into the super node, including the root node itself. The total number of trees collapsed in d1H is 9,528, with a mean tree length of 3.12 nodes, a standard deviation of 2.41 and a max tree length of 36 nodes.
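The recursive collapse and its bookkeeping can be sketched as follows. This is an illustrative Python reading of the description above (with the array handling mirroring the pseudocode of Algorithm 3), not the released implementation: the function names, the use of a plain dict keyed by the root node in place of the field t_{b_0}, and the helper `expand_tree` for the inversion are assumptions of this sketch.

```python
def degree_one_reduction(nodes, lines):
    """Collapse all trees into their roots, recording the collapse in d1H.

    d1H maps a root node b to a list of arrays; each array is a path
    [leaf, ..., b] of collapsed nodes that ends at the current root.
    """
    adj = {b: set() for b in nodes}
    for u, v in lines:
        adj[u].add(v)
        adj[v].add(u)
    d1H = {}

    leaves = [b for b in adj if len(adj[b]) == 1]
    while leaves:
        b0 = leaves.pop()
        if b0 not in adj or len(adj[b0]) != 1:
            continue  # already removed, or no longer of degree one
        (b1,) = adj.pop(b0)  # the unique neighbour of b0
        adj[b1].discard(b0)
        # Trees already rooted at b0 move to b1; otherwise start a new array.
        arrays = d1H.pop(b0, None) or [[b0]]
        for array in arrays:
            array.append(b1)  # b1 becomes the root of every array
        d1H.setdefault(b1, []).extend(arrays)
        if len(adj[b1]) == 1:
            leaves.append(b1)

    d1N = set(adj)
    d1E = {frozenset((u, v)) for u in adj for v in adj[u]}
    return d1N, d1E, d1H


def expand_tree(d1H, root):
    """Recover the lines of the tree(s) collapsed into `root`."""
    return {frozenset(pair)
            for array in d1H.get(root, [])
            for pair in zip(array, array[1:])}
```

Because each stored array is a path ending at the root, consecutive entries of an array are exactly the removed lines, which is what makes a partial inversion of the collapse possible in this sketch.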
The reductions to the test network via Algorithm 1 are significant, yet in an on-line reduction we may expect a further collapse. In our study, the degrees of nodes in the distribution network are defined by the physical lines connecting nodes, irrespective of operational switching. In practice, however, the operational switching for real power delivery further sparsifies the network and forms additional tree structures that would be collapsed under Algorithm 1. The operational structure typically changes in response to system faults and outages which may occur a few times a day [9]. Therefore, on-line graph reduction faces the additional challenge of efficiently learning the operational topology, based on incomplete information, and constructing a reduced model within the window of the current configuration.

2.3. Degree two reductions.
Geographically distant sub-networks that have significant generation or load resources are linked for robustness of power delivery. In case of line failures within one area, the interconnected sub-networks can be configured to balance loads and generation around the failure. In a setting where the intermediate area between these sub-networks has low generation or load, the connection between them is comprised of long range transmission lines, as seen in grids in the USA, China and others [28]. These transmission lines are topologically modelled as string-like line sub-graphs of degree two nodes. Often, the intermediate nodes in the string lack significant generation or load and have negligible impact on network dynamics. The simple dynamical transmission structure of these degree two nodes motivates an intuitive model of the power flow: we replace all nodes in the interior of the string with a "meta-edge" and parameterize the net power flow with line characteristics. Recursively removing degree one nodes from N as described earlier produces the network d1N, d1E comprised of nodes of degree two or greater. Our subsequent reduction removes all nodes of degree two by recursively replacing degree two nodes with edges, if the edge does not already appear in d1E. Because we prohibit double edges between nodes, our reduction has the additional effect of reducing other tree-like configurations. These structures are discovered when the procedure results in a degree one node in d1N.
Let {b_0, b_1, b_2} be a sparsely connected triangle as in Figure 3. Removing b_0, and the edges {b_0, b_1} and {b_0, b_2}, lowers the degree of b_1 to one. In Lemma 1 we demonstrate that a degree one node is produced by Algorithm 2 if and only if a sparsely connected triangle is reduced this way. The network d1N has a single connected component so we conclude that deg(b_2) ≥ 3. Indeed, this node must connect the triangle to the rest of the network. The converse statement is obvious from the above discussion. By subsequently performing a recursive collapse of degree one nodes, we may redefine d1N to consist of nodes of degree at least two.
Many structures reduce to a sparsely connected triangle by recursively replacing nodes with edges; this includes, but is not limited to, any simple polygon of nodes P ⊂ d1N for which every node but one in P is of degree two. Any configuration of nodes that can be reduced to a sparsely connected triangle can thus be collapsed entirely: the sparsely connected triangle is broken by our routine as in Lemma 1, and the remaining nodes are mapped into a terminal root node by recursive degree one reduction. Our analysis leads to Algorithm 2 and Algorithm 3. We describe the data structures used in these routines in Appendix A and define the following notation.
Definition 4. The data structure d2H is a hashable map {"field" : "data"}, where "data" is an ordered list. The mapping which takes the node b_0 to the edge {b_1, b_2} is associated to the field e_{b_1 b_2}, where we assume b_1 < b_2. We define any subset gt ⊂ d1N to be a generalized tree if it is collapsed to a root node under Algorithm 2 and Algorithm 3. The mapping which collapses a generalized tree to the terminal node b_0 is associated to the field t_{b_0} ∈ d2H.
Remove t_{b_0} from d2H.
end if
Pass d2N, d2E and d2H to Algorithm 3.
end while
return d2N, d2E, d2H

Algorithm 2 maps nodes to edges and tracks these reductions sequentially in the hashable map d2H. Whenever b_0 is mapped to the edge {b_1, b_2}, we write all preceding mappings to the list e_{b_1 b_2} when {b_0, b_1} and {b_0, b_2} are removed. We enforce a similar condition whenever a generalized tree t_{b_0} ∈ d2H: these maps are appended, as a hashable map, to the list e_{b_1 b_2}. The subroutine, Algorithm 3, is a modification of Algorithm 1 which tracks the collapse of sparsely connected triangles. Knowing that a degree one node is produced under Algorithm 2 if and only if the routine breaks a sparsely connected triangle, Algorithm 3 stores the list e_{b_1 b_2} under the root of the generalized tree subsequently collapsed.

Algorithm 3 Reduce sparsely connected triangle
if ∃ a_0 ∈ d2N with deg(a_0) < 2, then
  while ∃ a_0 ∈ d2N with deg(a_0) < 2, do
    Remove a_0 from d2N and line {a_0, a_1} from d2E.
    if t_{a_0} ∈ d2H, then
      Append a_1 to the end of each array in t_{a_0} ∈ d2H.
      Append all arrays in t_{a_0} ∈ d2H to the list t_{a_1} ∈ d2H.
      Remove t_{a_0} from d2H.
    else
      Write array [a_0, a_1] to a list t_{a_1} ∈ d2H.
    end if
  end while
  Prepend the hash table

The root is defined by the final iteration of the degree one reduction. The design and inversion of these data structures is described in detail in Appendix A.
Algorithm 2 and Algorithm 3 reduce the sets d1N, d1E to the sets d2N, d2E with 9,716 nodes and 18,700 edges. Figure 4 summarizes this reduction with the histogram of the number of nodes per reduction in d2H and the distribution of the degrees of nodes in d2N. The mean degree of nodes in d2N is 3.85, with a standard deviation of 1.76 and a maximal node degree of 38. The total number of collapsed edges in d2H is 9,696, with a mean of 3.88 nodes per edge, a standard deviation of 4.24 and a maximum of 94 nodes per edge. The total number of generalized trees collapsed in d2H is 2,579, with a mean of 4.38 nodes per generalized tree, a standard deviation of 3.65 and a maximum of 56 nodes per generalized tree. We note that generalized trees which have been mapped to edges are considered only as nodes within the meta-edge of their final reduction. Likewise, we do not distinguish meta-edges which have been collapsed into generalized trees from the root super node where their reduction terminates.
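The meta-edge bookkeeping of this subsection can be illustrated with a deliberately simplified Python sketch. It contracts strings of degree two nodes and records the interior nodes of each meta-edge under the sorted key (b_1, b_2), in the spirit of the field e_{b_1 b_2}. As a stated simplification, a degree two node whose replacement edge already exists (a sparsely connected triangle) is left in place, so the generalized tree collapse of Algorithm 3 is not reproduced here. The function name, the tuple keys and the requirement that node identifiers be comparable (e.g. integers) are assumptions of this sketch.

```python
def contract_degree_two(nodes, lines):
    """Replace strings of degree two nodes by meta-edges.

    Returns (d2N, d2E, d2H), where d2H maps a sorted pair (b1, b2) to the
    ordered list of interior nodes met when walking from b1 to b2.
    """
    adj = {b: set() for b in nodes}
    for u, v in lines:
        adj[u].add(v)
        adj[v].add(u)
    d2H = {}

    def interior(u, v):
        """Pop the interior nodes of (meta-)edge {u, v}, ordered from u to v."""
        key = (min(u, v), max(u, v))
        path = d2H.pop(key, [])
        return path if u == key[0] else path[::-1]

    candidates = [b for b in adj if len(adj[b]) == 2]
    while candidates:
        b0 = candidates.pop()
        if b0 not in adj or len(adj[b0]) != 2:
            continue
        b1, b2 = sorted(adj[b0])
        if b2 in adj[b1]:
            # Double edges are prohibited; the triangle branch is omitted here.
            continue
        # Preceding mappings on both sides are merged into the new meta-edge.
        path = interior(b1, b0) + [b0] + interior(b0, b2)
        del adj[b0]
        adj[b1].discard(b0)
        adj[b2].discard(b0)
        adj[b1].add(b2)
        adj[b2].add(b1)
        d2H[(b1, b2)] = path

    d2N = set(adj)
    d2E = {frozenset((u, v)) for u in adj for v in adj[u]}
    return d2N, d2E, d2H
```

Keeping the interior nodes in walking order means a meta-edge can be expanded back into its original string of lines by pairing consecutive entries between the two endpoints.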

2.4. Triangular reductions.
The network d2N, d2E is of a scale which permits qualitative analysis. However, further reductions may be necessary to run on-line parameter estimation in an operational window. It is possible to collapse higher degree coherent structures, such as non-sparse triangular configurations, but there is greater subtlety. The degree one and degree two node reductions produce an unambiguous model for net power flow. This is likewise the case for "pure" triangular reductions pictured in Figure 5 where there are three nodes, each of degree three and similar nominal voltage, forming a link between three large connected groups of nodes. In this case we may collapse the three nodes to a single super node of degree three that accounts for the net power flow through the triangle into the other subnetworks. "Pure" triangular configurations are easy to model, but are rare and many other triangular configurations exist throughout the network.
Recursively collapsing generic triangular configurations to super nodes may produce multiple lines between nodes, degree two nodes, degree one nodes and non-unique final reductions, which may strongly bias the distribution of node degrees. To prevent this bias, we permit the collapse of a triangle only if each node in the configuration does not exceed a specified degree. As triangles are recursively mapped to nodes, the degrees of nodes will increase and decrease, so we always refer to the degree of each node in the current iteration of the algorithm. Without this degree threshold, the test network collapses to a single densely connected node with an orbit of small components. Therefore, this threshold is a necessary criterion to produce a reduced model, via recursively collapsing triangles, that faithfully represents the dynamics.
The degree threshold introduces a tuning factor into our algorithm with which we balance the scale of the collapse with preserving the degree distribution for nodes in d2N. For any degree threshold dThr, the maximal degree of a super node produced by collapsing a triangular configuration is given by 3(dThr − 1). This corresponds to a configuration where all nodes only have common lines within the triangle. Assume each node has the maximum of dThr lines and one node gains all lines of the other two nodes. Mapping one node at a time, the first node adds only dThr − 1 distinct lines to the super node and the second contributes dThr − 2 distinct lines. We choose dThr = 6, 7 and 8, which produce a node of at most degree 15, 18 and 21 respectively.
A solely graph based reduction of triangles may, however, combine transmission and distribution nodes in a way which distorts the dynamics of net power flow. For instance, if the "pure" triangle in Figure 5 is formed by two nodes of high and one node of low nominal voltage, a super node produced from this triangle may confer stronger coupling between the three separate sub-networks (groups one, two, and three) than actually exists. To prevent non-coherent mixing of transmission and distribution subnetworks, we restrict our reductions only to the nodes in d2N which fall below an additional voltage threshold. We permit the reduction of a triangle if every node in the configuration additionally falls below a specified nominal voltage. We choose voltage thresholds of 110, 138, 230 and 345 kV nominal (standard low, medium and high voltages for different transmission grid lines), and for reference, compare results without a voltage threshold. The nodes in d2N, and edges in d2E, may represent multiple nodes due to reductions performed in Algorithm 1, Algorithm 2 and Algorithm 3. Our analysis leads to Algorithm 4; we introduce the following notation.
Definition 5. Let vThr be a specified voltage threshold. Define nL to be a list of nodes in d2N excluding any node(s)
• b_0 such that t_{b_0} ∈ d2H contains a node of nominal voltage above vThr,
• [...] contains a node of nominal voltage above vThr,
• or b_0 ∈ d2N which has a nominal voltage above vThr.
The data structure triH is a hashable map {"field" : "data"} where "data" is an ordered list. Entries of these lists are hashable maps of the form {"b_0" : lines(b_0)} where lines(b_0) is a list of lines associated to b_0 in d2E.
Append all entries in tri_{b_i} to tri_{b_0} ∈ triH.
Remove tri_{b_i} from triH.
for each b_j such that {b_i, b_j} ∈ triE, do
  Write {b_0, b_j} to triE excluding double and self lines.
  Remove {b_i, b_j} from triE.
end for
Remove b_i from triN and from nL.
end for
K := 0, nL := random permutation of nL, STOP := length(nL).
end while
end if
end while
for tri_{b_0} ∈ triH, do
  Append {"b_0" : lines(b_0)} to tri_{b_0} ∈ triH.
end for
return triN, triE, triH

In each iteration of Algorithm 4, we perform a greedy search for permissible triangles connected to a base node b_0, i.e. all triangles for which the nodes fall below the specified voltage and degree thresholds. We recursively collapse all such triangles into b_0 by removing the two associated nodes from triN and connecting all their lines to b_0, avoiding double and self lines. We perform this search until there are no permitted triangles which include b_0 and start the search again from a new base node. The base node from which we search for triangles is randomized upon each iteration. Thus, for each combination of voltage and degree threshold, we run an ensemble of experiments to find a distribution for our results. We plot the distribution of the degrees of nodes in triN over 1,000 experiments in Figure 6; for reference we include the degree distribution of nodes in d2N. Note that, while the triangular reduction produces nodes of degree at most 21, the reduction may lower the degree of any node if it is connected to at least two nodes in a permissible reduction. In Figure 6, the newly apparent nodes of degree greater than 21 correspond to this phenomenon. We likewise plot the distribution of the size of the network triN in Figure 7 with respect to the various threshold settings, demonstrating both the variability of recursive triangular reduction and its sensitivity to the voltage and degree thresholds.
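A compact way to read the greedy search described above is the following Python sketch. It is an illustration under stated assumptions rather than Algorithm 4 itself: the adjacency-dict input, the function and parameter names, the use of "degree at most d_thr" and "voltage strictly below v_thr" as the permission test, and the rule that a super node inherits the highest voltage of its members are all choices made for this example. Node identifiers are assumed to be comparable (e.g. integers).

```python
import random


def triangle_reduction(adj, voltage, d_thr, v_thr=None, seed=0):
    """Greedily collapse permitted triangles into randomly ordered base nodes.

    adj:     dict node -> set of neighbours (copied, not modified in place).
    voltage: dict node -> nominal voltage.
    Returns (adjacency of the reduced graph, triH), where triH maps each
    super node to the list of original nodes it now represents.
    """
    adj = {b: set(nbrs) for b, nbrs in adj.items()}
    rng = random.Random(seed)
    triH = {}

    def permitted(b):
        # Assumption: a cluster is judged by the highest voltage it contains.
        top = max(voltage[m] for m in triH.get(b, [b]))
        return len(adj[b]) <= d_thr and (v_thr is None or top < v_thr)

    def find_triangle(b0):
        if not permitted(b0):
            return None
        for b1 in sorted(adj[b0]):
            if not permitted(b1):
                continue
            for b2 in sorted(adj[b0] & adj[b1]):
                if permitted(b2):
                    return b1, b2
        return None

    def merge(b0, bi):
        # Re-attach the lines of bi to b0, excluding double and self lines.
        for bj in adj.pop(bi):
            adj[bj].discard(bi)
            if bj != b0:
                adj[bj].add(b0)
                adj[b0].add(bj)
        triH.setdefault(b0, [b0]).extend(triH.pop(bi, [bi]))

    collapsed = True
    while collapsed:
        collapsed = False
        base_nodes = list(adj)
        rng.shuffle(base_nodes)  # randomized choice of base node
        for b0 in base_nodes:
            while b0 in adj:
                triangle = find_triangle(b0)
                if triangle is None:
                    break
                for bi in triangle:
                    merge(b0, bi)
                collapsed = True
    return adj, triH
```

Varying `seed` in this sketch plays the role of the ensemble of experiments: different base node orderings can lead to different final reductions for the same thresholds.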
The smallest network produced by Algorithm 4 has 5,560 nodes and 11,079 edges; this is used as a reference for the limit of the reduction, performed without a voltage threshold. The degree distributions are similar across the degree thresholds in each voltage setting. However, we notice sensitivity in the size of the reduced network to the voltage threshold as we pass both from 110 kV to 138 kV, and from 138 kV to 230 kV thresholds respectively. Distributions of the network size are all close and strongly peaked for the 110 kV threshold, indicating that few nodes in the distribution sub-network remain un-clustered after the degree one and degree two steps. The dramatic reduction in network size passing to the 138 kV threshold indicates that the nodes of the distribution sub-network, and the substations connecting these to the transmission network (including super nodes which combine the two), possess a loopy configuration that can be clustered by the triangular reduction for a significant gain. This is likewise the case passing to the 230 kV threshold, where the loopy structure below the high voltage transmission network can be reduced significantly. The distributions of network size for voltage thresholds above 230 kV are more closely aligned, and are instead distinguished along their degree thresholds.
3. Visualization of the reduced network. The sequential steps in section 2 reduce our test network to a size that is amenable to qualitative analysis. In this section, we discuss methods of graph visualization for the models produced by the degree two reductions and the triangular reductions. Following Wong et al. [34], we choose to visualize the network based on its graph characteristics as in the Green-Grid visualization package. Rather than visualizing our network by the geographic information, a graph theoretic layout can better represent electrical distance and grid vulnerabilities.
The classical force directed layout technique uses a spring and repulsion model where each node is a repelling body and the edges are represented by springs [6,23]. Initial node positions are chosen randomly and the n-body problem is solved until the positions of nodes stabilize. This pseudo physical model can be utilized to represent electric grid physics by parameterizing spring lengths and node repulsion with the electrical distance of lines and the nominal voltages of nodes [34].
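The spring and repulsion model can be made concrete with a small, self-contained Python sketch. It is a generic overdamped relaxation with Hookean springs and inverse-square repulsion, not the solver used by any particular visualization library; all parameter names and default values (`rest_len`, `k_spring`, `k_rep`, `step`, `max_move`, `iters`) are hypothetical choices for illustration.

```python
import math
import random


def force_layout(nodes, lines, rest_len=1.0, k_spring=1.0, k_rep=1.0,
                 step=0.05, max_move=0.1, iters=3000, seed=0):
    """Return dict node -> [x, y] after relaxing a spring/repulsion model."""
    rng = random.Random(seed)
    # Initial node positions are chosen randomly.
    pos = {b: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for b in nodes}
    order = list(pos)
    for _ in range(iters):
        force = {b: [0.0, 0.0] for b in order}
        # Pairwise repulsion of magnitude k_rep / d^2 between all nodes.
        for i, u in enumerate(order):
            for v in order[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k_rep / (d * d)
                force[u][0] += f * dx / d
                force[u][1] += f * dy / d
                force[v][0] -= f * dx / d
                force[v][1] -= f * dy / d
        # Springs along lines pull connected nodes towards rest_len.
        for u, v in lines:
            dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9
            f = k_spring * (d - rest_len)
            force[u][0] += f * dx / d
            force[u][1] += f * dy / d
            force[v][0] -= f * dx / d
            force[v][1] -= f * dy / d
        # Overdamped update with a cap on the displacement per iteration.
        for b in order:
            fx, fy = force[b]
            mag = math.hypot(fx, fy)
            scale = step if step * mag <= max_move else max_move / mag
            pos[b][0] += scale * fx
            pos[b][1] += scale * fy
    return pos
```

In the parameterization discussed above, `rest_len` would vary per line with its electrical distance and `k_rep` per node with its nominal voltage; the uniform values here correspond to an unparameterized layout.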
We utilize the JavaScript library vis.js [2] to perform interactive visualizations. The default graph layout uses uniform spring lengths and repulsion parameters, and implementing a parametrization scheme that reflects the electrical distance in the reduced network models is the subject of future work. The underlying ForceAtlas2 model [20] in vis.js is used to resolve the spring repulsion evolution. We produce a conceptual visualization of the clustering performed via Algorithm 4 as follows: (i) first generate node positions for triN, using no voltage threshold and setting dThr = 8, and resolve the ForceAtlas2 model until node positions stabilize; (ii) fix these node locations and assign the initial position for every node in d2N as its clustered position in triN; resolve the ForceAtlas2 model until node positions of d2N stabilize. Figure 8 demonstrates a realization of this de-clustering: the left hand plot shows the initial positions for the nodes in d2N; the middle plot describes an intermediate point in their evolution as node positions are released; the right hand plot visualizes the stabilized d2N positions. This de-clustering visualization demonstrates how the degree threshold maintains qualitative graph features during the reduction.

4. Conclusions.
Analysis of our test network demonstrates that our reductions meet the goals stated in section 2. Firstly, our graph based approach to network reduction preserves network topological features such as degree distributions and graph paths, allowing a physically meaningful interpretation of the nodal clustering in terms of net power flow. Moreover, by a sequential, recursive design, we allow a partial reconstruction of the full network from the reduced model with varying levels of resolution: efficient use of data structures allows the user to reconstruct sequential reductions and reintroduce complex network features. Finally, we demonstrate the potential for interactive visualization of the reduced model for qualitative study of network sensitivities. As an additional step, one may use the graph based visualization to represent the electrical distance in the reduced network, using the (clustered) nodal voltage to represent repulsion and (meta-)edge characteristics to represent spring parameters [34]. Visualizing the reduced network this way preserves and even distinguishes major qualitative features of the original model, using the comprehensible reduced network.
These graph-based reductions can also be used as a preliminary step for the classical Kron reduction on multiscale transmission and distribution networks. Collapsing nodes and edges as above provides a nodal clustering with physically meaningful coupling between the distribution and transmission sub-networks, while reducing the large size of the distribution sub-network. Classical approaches to solving the optimal power flow and control problems can thereby be applied to a multiscale power grid model that includes significant generation in the distribution sub-network: the algorithms in section 2 reduce the large multiscale network to one that can be understood in terms of its dynamically relevant super-nodes, which serve as the reference nodes in the subsequent Kron reduction and thus address the issue of node selection.
While the reduced network is of a scale that allows additional qualitative analysis, e.g. visualizations and Kron reduction, the graph reduction itself provides insight. Computing the number of generalized trees and meta-edges, along with their associated sizes, helps to identify coherent structures within the full network. Likewise, collapsing triangles within the various voltage and degree thresholds exposes the connectivity of the network across its layers of transmission and distribution. The dramatic reductions obtained when raising the voltage threshold from 110 kV to 138 kV, and likewise from 138 kV to 230 kV, quantitatively demonstrate the highly meshed structure of the sub-network falling below these nominal voltages. The ultimate goal of this project is to produce a dynamically faithful reduced-order model, but this still requires significant steps that go beyond the scope of the current work. One component is the parametrization of the reduced-order model from both static and time-varying network data. Additional opportunities also arise from post-processing the reduced model with the Kron reduction. The interactive, graph-based visualizations may also be refined to represent electrical distance with spring and repulsion parameters. These steps will be explored in future research.
Appendix A. Data structures and inverting reductions. Allowing users to refine the reduced network structure is basic to our algorithm design. We expand in detail the data storage of generalized trees, edges and collapsed triangles. The recursion in Algorithm 2 and Algorithm 3 implies that edge and generalized tree data structures can be multilayered, containing multiple levels of sub-edges or subtrees. Proceeding from the bottom layer to the top, and from right to left within lists, one can recover the reverse sequence of mappings to reconstruct a node. An example interactive visualization is available in web browsers [15], demonstrating the de-clustering performed in Figure 8. We likewise release our reduction scripts and toy data describing the full and reduced network node and edge sets, with voltage information in an arbitrary, per unit representation [16].
A.1. Tree data. Tree reductions are called by a field $t_{b_0}$, where $b_0$ is the terminal node of the collapse in Algorithm 1. Each field returns a list of arrays, each array corresponding to a branch collapsed to the root node $b_0$. The first position of each array describes the end leaf of the branch and each subsequent position describes the shortest path in the network to the terminal node. Figure 10 corresponds to the list where leaves are reintroduced by following the path described in the array.
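As a minimal sketch of how such a field can be inverted, the Python fragment below rebuilds the edges of a collapsed tree from its list of branch arrays. It assumes, for illustration only, that each array runs from the leaf to the terminal node inclusive and that consecutive entries are adjacent in the original network; the function name and the toy field are hypothetical and are not taken from the released scripts.

```python
def expand_tree_field(branches):
    """Return the undirected edges encoded by one tree field.

    Each branch is assumed to be a list [leaf, ..., terminal] in which
    consecutive entries are neighbours in the original network.
    """
    edges = set()
    for branch in branches:
        for u, v in zip(branch, branch[1:]):
            # Store each edge with sorted endpoints so shared segments
            # of different branches are only recorded once.
            edges.add((min(u, v), max(u, v)))
    return edges


# Hypothetical field for terminal node 0: leaves 5 and 6 hang off node 2,
# and leaf 7 is attached directly to the terminal node.
tree_hash = {0: [[5, 2, 0], [6, 2, 0], [7, 0]]}
restored = expand_tree_field(tree_hash[0])
```

Because branches that merge before reaching the terminal node repeat the shared part of the path, a set is used so that the shared edge (here between nodes 2 and 0) is restored once.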
A.2. Edge data. Let $b_1$ be a node of degree two, and suppose it is connected to $b_0$ and $b_2$. The basic mapping produced by Algorithm 2 takes $b_1$ to the edge $\{b_0, b_2\}$. We represent this map by the array $[b_0, b_1, b_2]$ where, without loss of generality, we assume that $b_0 < b_2$. Given such a sequence of mappings Generalized trees embedded in an edge are reconstructed by reintroducing the terminal node from the edge data and reconstructing the generalized tree as described in Appendix A.2.1. An edge in d2H may contain an arbitrary length sequence of edge and tree reductions, possibly multilayered. Each meta-edge may therefore be represented in multiple ways by different orders of mappings, but each map can be inverted sequentially to reconstruct the original network, regardless of the order.
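The basic degree-two map and its inverse can be illustrated with a short Python sketch. It is not the released implementation: the helper names are hypothetical, edges are stored as a plain set of unordered pairs, the two neighbours are assumed not to be adjacent already, and the records are inverted in reverse order of application as the simplest choice.

```python
def edge(u, v):
    """Unordered pair used as a hashable edge key."""
    return frozenset((u, v))


def collapse_degree_two(edges, b1):
    """Map the degree-two node b1 to an edge and return the record [b0, b1, b2]."""
    neighbours = sorted(v for e in edges if b1 in e for v in e if v != b1)
    if len(neighbours) != 2:
        raise ValueError("node is not of degree two")
    b0, b2 = neighbours  # sorted, so b0 < b2
    edges.difference_update({edge(b0, b1), edge(b1, b2)})
    edges.add(edge(b0, b2))  # assumes b0 and b2 were not already adjacent
    return [b0, b1, b2]


def invert_record(edges, record):
    """Reintroduce b1 between b0 and b2."""
    b0, b1, b2 = record
    edges.discard(edge(b0, b2))
    edges.update({edge(b0, b1), edge(b1, b2)})


# Toy path 1 - 2 - 3 - 4: collapse nodes 2 and 3, then undo both maps.
edges = {edge(1, 2), edge(2, 3), edge(3, 4)}
original = set(edges)
records = [collapse_degree_two(edges, 2), collapse_degree_two(edges, 3)]
reduced = set(edges)
for record in reversed(records):
    invert_record(edges, record)
```

After both collapses the path is a single meta-edge between nodes 1 and 4, and replaying the stored records recovers the three original lines.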
A.2.1. Generalized tree maps. Lemma 1 demonstrates that a sparsely connected triangle is collapsed if and only if Algorithm 2 produces a degree one node. Generalized tree data, therefore, includes the sequence of nodes mapped to edges which precipitate collapse of the triangle. The field $t_{b_n}$ corresponds to the node, $b_n$, that the generalized tree has been collapsed to. Suppose, as in Figure 3, that mapping $b_0$ to the edge $\{b_1, b_2\}$ produces a degree one node in d2N. Algorithm 3 collapses degree one nodes recursively until every node is again at least degree two. Let $b_n$ be the terminal node of this collapse; then Algorithm 3 stores a hashable map as the first entry of $t_{b_n} \in$ d2H, followed by the array with the path from $b_1$ to the terminal node. In general, the edge data precipitating the collapse of the sparsely connected triangle can be of arbitrary length and contain multiple layers.
A.3. Triangular reductions. Due to the more arbitrary nature of Algorithm 4, we take a simple approach to track the reductions. The field $\mathrm{tri}_{b_0} \in$ triH corresponds to a list where each entry is a hashable map of the form {"$b_j$" : lines($b_j$)}. The value lines($b_j$) is the list of lines associated to $b_j$ in d2E. In this way, one can reintroduce a node from a collapsed triangular configuration by writing the node $b_j$ into triN and reconnecting this node with the appropriate edges from d2E, while removing these edges from $b_0$ if the lines were formed uniquely by joining $b_j$ to the cluster.
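One possible reading of this reintroduction step is sketched below in Python. Everything in it is an assumption made for illustration: integer node labels are used as keys, the pre-collapse edge set is passed in explicitly to decide whether a line of the cluster node was created only by the released node, and the toy network is hypothetical.

```python
def edge(u, v):
    """Unordered pair used as a hashable edge key."""
    return frozenset((u, v))


def reintroduce(nodes, edges, d2_edges, field, b0, bj):
    """Release node bj from the cluster collapsed onto b0 (toy version)."""
    entry = next(m for m in field if bj in m)
    field.remove(entry)
    # Nodes that are still merged into b0 once bj has been released.
    remaining = {b0} | {k for m in field for k in m}
    nodes.add(bj)
    for u, v in entry[bj]:
        other = v if u == bj else u
        if other in remaining:
            # The line ran into the cluster, so it now attaches to b0.
            edges.add(edge(bj, b0))
            continue
        edges.add(edge(bj, other))
        # Drop the cluster's copy of the line only if bj alone created it.
        if not any(edge(m, other) in d2_edges for m in remaining):
            edges.discard(edge(b0, other))


# Hypothetical toy data: the triangle {0, 1, 2} was collapsed onto node 0.
d2_edges = {edge(0, 1), edge(0, 2), edge(1, 2),
            edge(1, 3), edge(2, 4), edge(0, 5)}
tri_nodes = {0, 3, 4, 5}
tri_edges = {edge(0, 3), edge(0, 4), edge(0, 5)}
tri_hash = {0: [{1: [(0, 1), (1, 2), (1, 3)]},
                {2: [(0, 2), (1, 2), (2, 4)]}]}

reintroduce(tri_nodes, tri_edges, d2_edges, tri_hash[0], 0, 1)
```

In this toy case, releasing node 1 moves the line to node 3 from the cluster back to node 1, while node 2 stays merged into node 0; releasing node 2 afterwards recovers the full pre-collapse edge set.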