===affil2: Department of Mathematics, Virginia Tech ===firstname: Eric ===firstname4: Misha E. ===firstname3: Serkan ===lastname2: Beattie ===lastname: de Sturler ===firstname5: ===affil6: ===lastname3: Gugercin ===email: sturler@vt.edu ===lastname6: ===affil5: ===otherauths: ===lastname4: Kilmer ===affil4: Department of Mathematics, Tufts University ===lastname7: ===affil7: ===firstname7: ===postal: Department of Mathematics 460 McBryde Hall Virginia Tech, Blacksburg, VA 24061 ===firstname6: ===ABSTRACT: We will show how reduced order models can significantly reduce the cost of general inverse problems approached through parametric level set methods. Our method drastically reduces the cost of solving forward problems in diffuse optical tomography (DOT) by using interpolatory parametric model reduction. In the DOT setting, these surrogate models can approximate both the cost functional and the associated Jacobian with little loss of accuracy and at greatly reduced cost. \vspace{11pt} We recover diffusion and absorption coefficients, $D( \mathbf{x})$ and $\mu( \mathbf{x})$, respectively, using observations, $\boldsymbol{m}(t)$, from illumination by source signals, $\mathbf{u}(t)$. We assume that the unknown fields can be characterized by a finite set of parameters, $ \mathbf{p}=[p_1,\ldots,\,p_{\ell}]^T$.
Discretization of the underlying PDE gives the following order-$n$ differential algebraic system, \begin{equation} \label{processModelDynSys} \frac{1}{\nu} \mathbf{E}\, \dot{ \mathbf{y}}(t; \mathbf{p}) =- \mathbf{A}( \mathbf{p}) \mathbf{y}(t; \mathbf{p}) + \mathbf{B} \mathbf{u}(t)\quad\mbox{with}\quad \boldsymbol{m}(t; \mathbf{p}) = \mathbf{C} \mathbf{y}(t; \mathbf{p}) , \end{equation} where $ \mathbf{y} $ denotes the discretized photon flux; $ \boldsymbol{m}=[m_1,\,\ldots,\, m_{n_{det}}]^T$ is the vector of detector outputs; the columns of $ \mathbf{B}$ are discretizations of the source ``footprints''; and $\mathbf{A}( \mathbf{p})$ is the discretization of the diffusion and absorption terms, inheriting the parameterizations of these fields. Let $ \mathbf{y}(\omega; \mathbf{p})$, $ \mathbf{u}(\omega)$, and $ \mathbf{M}(\omega; \mathbf{p})$ denote the Fourier transforms of $\mathbf{y}(t; \mathbf{p})$, $ \mathbf{u}(t)$, and $ \boldsymbol{m}(t; \mathbf{p})$, respectively. Taking the Fourier transform of (\ref{processModelDynSys}) and rearranging, we find \begin{equation} \label{FullOrdTransFnc} \mathbf{M}(\omega; \mathbf{p}) = \boldsymbol{\Psi}\!\left(\imath\omega; \mathbf{p}\right)\, \mathbf{u}(\omega)\quad \mbox{where}\quad \boldsymbol{\Psi}(s; \mathbf{p})= \mathbf{C}\left(\frac{s}{\nu}\, \mathbf{E}\, + \mathbf{A}( \mathbf{p})\right)^{-1} \mathbf{B}. \end{equation} $ \boldsymbol{\Psi}(s; \mathbf{p})$ maps sources (inputs) to measurements (outputs) in the frequency domain; indeed, it is the \emph{transfer function} of the dynamical system defined in (\ref{processModelDynSys}). Given a parameter vector $ \mathbf{p}$ (which determines the diffusion and absorption fields), an input source $ \mathbf{U}_i$, and a frequency $\omega_j$, we let $ \mathbf{M}_i(\omega_j; \mathbf{p})\in\mathbb{C}^{n_{det}}$ denote the vector of observations predicted by the forward model.
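As a concrete (hypothetical) illustration of evaluating the frequency-domain forward map, the following sketch assembles a toy affine parameterization $\mathbf{A}(\mathbf{p}) = \mathbf{A}_0 + p_1\mathbf{A}_1 + p_2\mathbf{A}_2$ and evaluates $\boldsymbol{\Psi}(\imath\omega;\mathbf{p})\,\mathbf{u}(\omega)$; all matrices below are random stand-ins, not an actual DOT discretization:

```python
import numpy as np

# Toy example: evaluate Psi(s; p) = C (s/nu * E + A(p))^{-1} B.
# The affine parameterization A(p) = A0 + p1*A1 + p2*A2 is an
# illustrative assumption, not the actual DOT discretization.
rng = np.random.default_rng(0)
n, n_src, n_det = 50, 2, 3
nu = 1.0

E  = np.eye(n)
A0 = np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1) \
                              + np.diag(np.full(n - 1, -1.0), -1)
A1 = np.diag(rng.uniform(0.1, 0.5, n))   # "diffusion" perturbation
A2 = np.diag(rng.uniform(0.1, 0.5, n))   # "absorption" perturbation
B  = rng.standard_normal((n, n_src))     # source footprints
C  = rng.standard_normal((n_det, n))     # detector rows

def A_of_p(p):
    return A0 + p[0] * A1 + p[1] * A2

def Psi(s, p):
    """Transfer function C (s/nu E + A(p))^{-1} B."""
    return C @ np.linalg.solve((s / nu) * E + A_of_p(p), B)

# Predicted detector outputs for one frequency and parameter vector
p = np.array([0.3, 0.7])
M = Psi(1j * 2.0, p) @ np.ones(n_src)   # u(omega) = vector of ones
print(M.shape)   # (n_det,)
```

Each evaluation requires one sparse (here dense, for brevity) solve of dimension $n$ per frequency, which is exactly the cost the reduced-order model targets.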
For $n_{src}$ sources and $n_{\omega}$ frequencies, we obtain the (stacked) observation vector \[ {\cal M}( \mathbf{p}) = [ \mathbf{M}_1(\omega_1; \mathbf{p})^T,\, \mathbf{M}_1(\omega_2; \mathbf{p})^T,\, \ldots,\, \mathbf{M}_1(\omega_{n_\omega}; \mathbf{p})^T,\, \mathbf{M}_2(\omega_1; \mathbf{p})^T, \ldots,\, \mathbf{M}_{n_{src}} (\omega_{n_\omega}; \mathbf{p})^T ]^T ,\] which is a (complex) vector of dimension $n_{det}\cdot n_{src} \cdot n_{\omega} $. We construct the corresponding empirical data vector, $\mathbb{D}$, from actual observations, and solve the optimization problem \[ \min_{ \mathbf{p} \in \mathbb{R}^\ell} \| {\cal M}( \mathbf{p}) - \mathbb{D} \|_2 . \] The computational cost of evaluating ${\cal M}( \mathbf{p}) - \mathbb{D}$ is dominated by the solution of the large, sparse block linear systems in (\ref{FullOrdTransFnc}) for all frequencies $\omega_j$. To reduce costs while maintaining accuracy, we seek a much smaller dynamical system of order $r \ll n$ that replicates the input-output map of the ``original'' system (\ref{processModelDynSys}): \begin{equation} \label{paramSysROM} \frac{1}{\nu}\,\widehat{ \mathbf{E}}\, \dot{\widehat{ \mathbf{Y}}}(t; \mathbf{p}) = - \widehat{ \mathbf{A}}( \mathbf{p})\, \widehat{ \mathbf{Y}}(t; \mathbf{p}) + \widehat{ \mathbf{B}}\, \mathbf{u}(t) \quad\mbox{with}\quad \widehat{\boldsymbol{m}}(t; \mathbf{p}) = \widehat{ \mathbf{C}}\, \widehat{ \mathbf{Y}}(t; \mathbf{p}) , \end{equation} where $\widehat{ \mathbf{Y}}(t; \mathbf{p})\in \mathbb{R}^r$ is the new state vector, and $\widehat{ \mathbf{E}},\widehat{ \mathbf{A}}( \mathbf{p}) \in \mathbb{R}^{r \times r}$, $\widehat{ \mathbf{B}} \in \mathbb{R}^{r \times n_{src}}$, and $ \widehat{ \mathbf{C}}\in \mathbb{R}^{n_{det}\times r}$ are chosen such that ${\boldsymbol{m}}(t; \mathbf{p}) \approx \widehat{\boldsymbol{m}}(t; \mathbf{p}) $.
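The stacking and the nonlinear least-squares fit can be sketched as follows for a toy one-parameter transfer function; the diagonal parameterization, the frequency set, and the synthetic data are illustrative assumptions (the residual is split into real and imaginary parts because standard least-squares solvers expect real residuals):

```python
import numpy as np
from scipy.optimize import least_squares

# Toy setup: Psi(s; p) with a single parameter, illustrative only.
rng = np.random.default_rng(1)
n, n_det, n_src, nu = 20, 2, 2, 1.0
E  = np.eye(n)
A0 = np.diag(np.linspace(1.0, 2.0, n))
A1 = np.diag(rng.uniform(0.1, 0.3, n))
B  = rng.standard_normal((n, n_src))
C  = rng.standard_normal((n_det, n))
omegas = [1.0, 5.0, 10.0]

def Psi(s, p):
    return C @ np.linalg.solve((s / nu) * E + A0 + p[0] * A1, B)

def stacked_M(p):
    # Stack M_i(omega_j; p): source index outer, frequency index inner
    cols = [Psi(1j * w, p)[:, i] for i in range(n_src) for w in omegas]
    return np.concatenate(cols)

p_true = np.array([0.4])
D = stacked_M(p_true)                        # synthetic "data"

def residual(p):
    r = stacked_M(p) - D
    return np.concatenate([r.real, r.imag])  # real residual vector

sol = least_squares(residual, x0=np.array([0.1]))
print(sol.x)                                 # recovers p_true approximately
```

Every residual evaluation triggers $n_{src}\cdot n_\omega$ solves with the full-order matrices, which motivates replacing $\boldsymbol{\Psi}$ by a cheap surrogate.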
The surrogate transfer function is $$ \widehat{ \boldsymbol{\Psi}}(s; \mathbf{p}) = \widehat{ \mathbf{C}} \left(\frac{s}{\nu} \widehat{ \mathbf{E}} + \widehat{ \mathbf{A}}( \mathbf{p})\right)^{-1} \widehat{ \mathbf{B}}, $$ which requires only the solution of linear systems of dimension $r \ll n$, drastically reducing the cost. For a given parameter value $\boldsymbol{\pi}$ arising in an optimization step, the necessary function evaluations at the frequencies $\omega_j$ require the construction of a reduced parametric model of the form (\ref{paramSysROM}) with transfer function $\widehat{ \boldsymbol{\Psi}}(s; \mathbf{p})$ that satisfies \begin{equation} \label{eq:hermite} \widehat{ \boldsymbol{\Psi}}\left(\imath \omega_j;\boldsymbol{\pi}\right) = \boldsymbol{\Psi}\left(\imath \omega_j;\boldsymbol{\pi}\right). \end{equation} {\it This is exactly what interpolatory model reduction achieves.} Moreover, the transfer function $\widehat{ \boldsymbol{\Psi}}(s; \mathbf{p})$ of the reduced model also satisfies \begin{eqnarray} \nabla_{ \mathbf{p}} \boldsymbol{\Psi}\left(\imath \omega_j;\boldsymbol{\pi}\right) = \nabla_{ \mathbf{p}}\widehat{ \boldsymbol{\Psi}}\left(\imath \omega_j;\boldsymbol{\pi}\right) ~~~~ \mbox{\textit{and}} ~~~~ \boldsymbol{\Psi}'\left(\imath \omega_j;\boldsymbol{\pi}\right) = \widehat{ \boldsymbol{\Psi}}'\left(\imath \omega_j;\boldsymbol{\pi}\right). \nonumber \end{eqnarray} The use of interpolatory projections allows us to match both function and gradient values exactly \emph{without computing them}, requiring instead only that the computed projection spaces defining the reduced model contain particular (stably computable) vectors.
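The single-point interpolation property can be verified numerically with a small sketch. Assuming square input/output dimensions ($n_{src} = n_{det} = m$) and a toy system (all matrices below are illustrative stand-ins), the Petrov-Galerkin bases $\mathbf{V} = \mathbf{K}^{-1}\mathbf{B}$ and $\mathbf{W} = \mathbf{K}^{-H}\mathbf{C}^H$ with $\mathbf{K} = (s/\nu)\mathbf{E} + \mathbf{A}(\boldsymbol{\pi})$ yield a reduced model whose transfer function matches the full one exactly at $s = \imath\omega_j$:

```python
import numpy as np

# One-point interpolatory projection at s = i*omega_j for a toy
# system with n_src = n_det = m; complex bases kept for brevity.
rng = np.random.default_rng(2)
n, m, nu = 40, 2, 1.0
E = np.eye(n)
A = np.diag(np.linspace(1.0, 3.0, n)) \
    + 0.1 * np.diag(rng.uniform(size=n - 1), 1)
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
s = 1j * 4.0                                   # interpolation point

K = (s / nu) * E + A
V = np.linalg.solve(K, B)                      # right basis: K^{-1} B
W = np.linalg.solve(K.conj().T, C.conj().T)    # left basis: K^{-H} C^H

# Petrov-Galerkin reduced quantities (order r = m)
Er, Ar = W.conj().T @ E @ V, W.conj().T @ A @ V
Br, Cr = W.conj().T @ B, C @ V

Psi_full = C @ np.linalg.solve(K, B)
Psi_red  = Cr @ np.linalg.solve((s / nu) * Er + Ar, Br)
print(np.linalg.norm(Psi_full - Psi_red))      # ~ 0: exact interpolation
```

Two-sided projection of this kind also matches the derivative $\boldsymbol{\Psi}'$ at the interpolation point; only the solves defining $\mathbf{V}$ and $\mathbf{W}$ touch the full-order matrices.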
If we were to compute a reduced model for every parameter point $\boldsymbol{\pi}$ and frequency $\imath \omega_j$ for $j=1,\ldots,n_\omega$ and use these reduced models in the forward model, the solution of the inverse problem would proceed in {\em exactly} the same way as with the full-order forward model -- the nonlinear optimization algorithm would not see the difference between the full and reduced forward problems. Of course, computing a new surrogate model for every parameter value is infeasible. Hence, we focus on methods for constructing surrogate models that have high fidelity over a wide range of parameter values, and we consider effective approaches for updating the surrogate models. We will present numerical examples that illustrate the effectiveness of these methods. ===affil3: Department of Mathematics, Virginia Tech ===lastname5: ===affilother: ===title: Integrated Model Reduction Strategies for Nonlinear Parametric Inversion ===firstname2: Chris