===affil2:
University of Illinois at Chicago
===firstname:
Ilya
===firstname4:
Lauren
===firstname3:
Sven
===lastname2:
Gutfraind
===lastname:
Safro
===firstname5:

===affil6:

===lastname3:
Leyffer
===email:
isafro@clemson.edu
===lastname6:

===affil5:

===otherauths:

===lastname4:
Meyers
===affil4:
University of Texas at Austin
===lastname7:

===affil7:

===firstname7:

===postal:
228 McAdams Hall
School of Computing
Clemson University
Clemson, SC 29634
===firstname6:

===ABSTRACT:
\par Networks are a widely used type of abstraction for complex data. Optimization of different quantitative objectives on networks often plays a crucial role in network science, not only when a practical solution is needed, but also for a general  understanding of structural and statistical features of networks. We present multiscale approaches for two problems: optimal response to epidemics, and network generation. Both approaches are inspired by AMG scheme reinforced by the algebraic distance connectivity strength. 

\paragraph{Heuristic for optimal response to epidemics and cyber attacks.} 
We consider a traditional infection-spread SIS model in which network nodes can be in one of two possible states, namely, \emph{infected} and \emph{susceptible}, and each node $i$ is associated with probability of being infected at time $t$, $\phi_{i,t}$. This model has been extensively analyzed in epidemiology and adapted in the cyber security area for analysis of computer viruses propagation. 
We now formulate the optimization problem whose goal is to keep the level of infection at each node low while maximizing the (weighted) number of ``alive'' connections. This is motivated by the infection spread response policies that are often driven by the number of resources available for the realization. We denote the graph of a network by $G=(V, E, w)$, where $V$ and $E$ are sets of nodes and edgfes, respectively, and $w:V\rightarrow \mathbb{R}_{\geq 0}$ represents the strength of connectivity (such as the number of shared users in a cyber system). Assuming that the probabilities of infection transition from $\Gamma_i$ (neighbors of $i$) to $i$ are independent, the problem is formulated as
\begin{equation}\label{PROBFORM}
\begin{aligned}
& \max_{x} && \sum_{ij\in E} w_{ij} x_i x_j && \\
& \text{subject to} && x_i-\prod_{j\in \Gamma_i}\left(1-p_{ij} \phi_{j,t-1} x_j\right)\leq b_i && \forall i\in V,\\
& && x_i \in\{0,1\} && \forall i\in V,
\end{aligned}
\end{equation}
where $w_{ij}$ is the link weight between nodes $i$ and $j$; $b_i$ is a threshold for the level of allowed probability of infection at $i$; and $x_i$ are binary variables (1 - if we decide to leave the node $i$ functioning, 0 - closed, requiring special attention). In general, (\ref{PROBFORM}) is a nonconvex integer nonlinear program and known to be $NP$-complete. We demonstrate a strategy for designing fast multiscale methods for this class of problems. 
The refinement represents the collective improvement for sufficiently small subsets of variables. This phase can easily be  performed in parallel by using the red-black order. Single subset refinement solves problem for subset of nodes  by choosing a connected subgraph and fixing the \emph{boundary conditions} for the rest of the nodes.
 
We evaluate our method on a set of small networks with known optimal solutions, two case studies (HIV spread  and cyber infrastructure networks), and one massive data set. The two case study networks are typical complex network instances on which solving this particular optimization is of great practical importance. The massive dataset evaluation contains networks of different structures and sources that can potentially represent hard structures for the method. In all experiments our method improved best known results (quality and/or running time) significantly.
 
\paragraph{Multiscale Entropic Network Generation.}
High-quality, large-scale network data is
often not available for scientists, because of economic, legal, technological,
or other obstacles. For example, the human contact
networks along with infectious diseases spread are notoriously
difficult to estimate, and thus our understanding of
the dynamics and control of epidemics stems from models that
make highly simplifying assumptions or simulate contact networks
from incomplete or proxy data. In another
domain, the development of cybersecurity systems requires
testing across diverse threat scenarios and validation across
diverse network structures that are not yet known. In both
examples, the systems of interest cannot be represented by a
single exemplar network, but must instead be modeled as collections
of networks in which the variation among them may
be just as important as their common features. Such cases
point to the importance of data-driven methods for synthesizing
networks that capture both the essential features of a
system and realistic variability in order to use them in such
tasks as simulations and verification.

Because existing methods only reproduce a limited set of specified network properties, we introduce a novel strategy for synthesizing artificial networks, namely, the multiscale entropic network generation (MUSKETEER). We create a hierarchy of coarse networks;
but, in contrast to the multiscale methods for computational
and optimization problems, we do not optimize anything but
\emph{edit} the network at all scales of coarseness. During the editing
process we allow only local changes in the network. The editing problem
 (or generation) is formulated and solved at all scales
where primitives at the coarse scale (such as coarse nodes
and edges) represent aggregates of primitives and their fractions
at previous finer scale. Analogous to multiscale methods
for computational problems, by using appropriate
coarsening we are able to detect and use the “geometry” behind
the original network at multiple scales, which can be
interpreted as an additional property that is not captured by
other network generation methods. It is known
that the topology of many complex networks is hierarchical
and thus might be produced through iterations of generative
laws at multiple scales. In general, such generative laws often
can be different at different scales, as evidenced by the
finding that complex networks are self-dissimilar across scales. For example, the number of triangles
in the original network that one can be interested in can be
completely different from the number of triangles at coarse
scales. These differences can naturally be reflected in the proposed
multiscale framework.

To evaluate the performance of MUSKETEER (http://www.cs.clemson.edu/$\tilde{~~}$isafro/musketeer), we generated a large number of networks and compared them to the real-world networks (from epidemiology, social science, etc.) for a variety of local and global structural properties (such as clustering, modularity, and degree distribution). For most properties, the generated ensemble yields a median value close to the original value and range of values that is fairly symmetric about the median.
===affil3:
Argonne National Laboratory
===lastname5:

===affilother:

===title:
Multiscale methods for large networks: response to epidemics and network generation problems
===firstname2:
Alexander