Top 20 Most Read Articles
April 2012
The 20 articles with the most full-text downloads during the month, in descending order.
Fast Two-scale Methods for Eikonal Equations SIAM J. Sci. Comput. 34, pp. A547-A578 (32 pages) Online Publication Date: March 08, 2012
Fast Marching and Fast Sweeping are the two most commonly used methods for solving the eikonal equation. Each of these methods performs best on a different set of problems. Fast Sweeping, for example, will outperform Fast Marching on problems where the characteristics are largely straight lines. Fast Marching, on the other hand, is usually more efficient than Fast Sweeping on problems where characteristics frequently change their directions and on domains with complicated geometry. In this paper we explore the possibility of combining the best features of both approaches by using Marching on a coarser scale and sweeping on a finer scale. We present three new hybrid methods based on this idea and illustrate their properties in several numerical examples with continuous and piecewise-constant speed functions in $R^2$.
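As a rough illustration of the sweeping half of that comparison, here is a minimal sketch of a first-order Fast Sweeping iteration for $|\nabla u| = 1/f$ on a uniform grid in $R^2$. It is not one of the paper's hybrid methods; the function name `fast_sweep`, the list-of-lists grid layout, and the stopping tolerance are assumptions made for this example.

```python
import math

def fast_sweep(speed, h, sources, max_rounds=50, tol=1e-12):
    """First-order Fast Sweeping for |grad u| = 1/speed on a uniform grid.

    speed   -- 2D list of positive speeds (hypothetical layout: speed[i][j])
    h       -- grid spacing
    sources -- list of (i, j) grid points where u is fixed to 0
    """
    n, m = len(speed), len(speed[0])
    u = [[math.inf] * m for _ in range(n)]
    for i, j in sources:
        u[i][j] = 0.0
    fixed = set(sources)
    # The four alternating sweep orderings of a 2D grid.
    orders = [
        (range(n), range(m)),
        (range(n - 1, -1, -1), range(m)),
        (range(n), range(m - 1, -1, -1)),
        (range(n - 1, -1, -1), range(m - 1, -1, -1)),
    ]
    for _ in range(max_rounds):
        change = 0.0
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if (i, j) in fixed:
                        continue
                    # Smallest neighbor value along each grid direction.
                    a = min(u[i - 1][j] if i > 0 else math.inf,
                            u[i + 1][j] if i < n - 1 else math.inf)
                    b = min(u[i][j - 1] if j > 0 else math.inf,
                            u[i][j + 1] if j < m - 1 else math.inf)
                    lo, hi = min(a, b), max(a, b)
                    if math.isinf(lo):
                        continue  # no information has reached this point yet
                    s = h / speed[i][j]
                    if hi - lo >= s:
                        new = lo + s  # one-sided update
                    else:
                        new = 0.5 * (lo + hi + math.sqrt(2 * s * s - (hi - lo) ** 2))
                    if new < u[i][j]:
                        change = max(change, u[i][j] - new)
                        u[i][j] = new
        if change <= tol:
            break
    return u
```

With constant speed and a single source, the characteristics are straight lines, which is the regime where the abstract says sweeping does well: one or two rounds of the four sweeps are typically enough for the loop above to stop.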
Analysis of Measurements Based on the Singular Value Decomposition SIAM J. Sci. and Stat. Comput. 2, pp. 363-373 (11 pages)
The problem of maintaining quality control of manufactured parts is considered. This involves matching points on the parts with corresponding points on a drawing. The difficulty in this process is that the measurements are often in different coordinate systems. Using the assumption that the relation between the two sets of coordinates is a certain rigid transformation, an explicit least squares solution is obtained. This solution requires the singular value decomposition of a related matrix. Other topics in the paper include an appropriate angular representation of the resulting orthogonal transformation matrix, and a computational algorithm for the various required quantities.
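To make the least squares fitting idea concrete, the sketch below handles only the planar case, where the best rotation has a closed form and no singular value decomposition is needed. That closed form is swapped in here for brevity and is not the paper's SVD-based algorithm; the name `fit_rigid_2d` and the point-list format are assumptions for the example.

```python
import math

def fit_rigid_2d(measured, drawing):
    """Least-squares rotation angle and translation mapping measured -> drawing.

    Both arguments are equal-length lists of (x, y) pairs in corresponding order.
    Planar special case only: the angle comes from a closed form, not an SVD.
    """
    n = len(measured)
    cpx = sum(p[0] for p in measured) / n
    cpy = sum(p[1] for p in measured) / n
    cqx = sum(q[0] for q in drawing) / n
    cqy = sum(q[1] for q in drawing) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(measured, drawing):
        px, py, qx, qy = px - cpx, py - cpy, qx - cqx, qy - cqy
        dot += px * qx + py * qy      # multiplies cos(theta) in the objective
        cross += px * qy - py * qx    # multiplies sin(theta) in the objective
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation sends the rotated centroid of the measurements to the drawing centroid.
    tx = cqx - (c * cpx - s * cpy)
    ty = cqy - (s * cpx + c * cpy)
    return theta, (tx, ty)
```

In three dimensions the same centering step is followed by an SVD of the cross-covariance matrix of the two point sets, which is where the decomposition mentioned in the abstract enters.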
Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems SIAM J. Sci. Comput. 34, pp. C70-C82 (13 pages) Online Publication Date: April 12, 2012
With the raw computing power of graphics processing units (GPUs) being more widely available in commodity multicore systems, there is an imminent need to harness their power for important numerical libraries such as LAPACK. In this paper, we consider the solution of dense symmetric and Hermitian eigenproblems by the LAPACK divide and conquer algorithm on such modern heterogeneous systems. We focus on how to make the best use of the individual strengths of the massively parallel manycore GPUs and multicore CPUs. The resulting algorithm overcomes performance bottlenecks faced by current implementations that are optimized for a homogeneous multicore. On a dual socket quad-core Intel Xeon 2.33 GHz with an NVIDIA GTX 280 GPU, we typically obtain up to about a tenfold improvement in performance for the complete dense problem. The techniques described here thus represent an example of how to develop numerical software to efficiently use heterogeneous architectures. As heterogeneity becomes more common in the architecture design, the significance of and need for this work are expected to grow.
SIAM J. Sci. and Stat. Comput. 13, pp. 631-644 (14 pages) Online Publication Date: July 13, 2006
Recently the Conjugate Gradients-Squared (CG-S) method has been proposed as an attractive variant of the Bi-Conjugate Gradients (Bi-CG) method. However, it has been observed that CG-S may lead to a rather irregular convergence behaviour, so that in some cases rounding errors can even result in severe cancellation effects in the solution. In this paper, another variant of Bi-CG is proposed which does not seem to suffer from these negative effects. Numerical experiments indicate also that the new variant, named Bi-CGSTAB, is often much more efficient than CG-S.
Expression Templates Revisited: A Performance Analysis of Current Methodologies SIAM J. Sci. Comput. 34, pp. C42-C69 (28 pages) Online Publication Date: March 08, 2012
In the last decade, expression templates (ETs) have gained a reputation as an efficient performance optimization tool for C++ codes. This reputation builds on several ET-based linear algebra frameworks focused on combining both elegant and high-performance C++ code. However, on closer examination the assumption that ETs are a performance optimization technique cannot be maintained. In this paper we compare the performance of several generations of ET-based frameworks. We analyze different ET methodologies and explain the inability of some ET implementations to deliver high performance for dense and sparse linear algebra operations. Additionally, we introduce the notion of “smart” ETs, which truly allow for a combination of high performance code with the elegance and maintainability of a domain-specific language.
A Weak Formulation of the Immersed Boundary Method SIAM J. Sci. Comput. 34, pp. A1010-A1026 (17 pages) Online Publication Date: April 03, 2012
A new method of spatial discretization for immersed boundary computations is introduced. Fluid velocity and pressure are obtained as weak solutions of the discretized fluid equations with respect to a wavelet basis of functions. The scaling function of the fluid velocity basis may be chosen to be identical to Peskin's discrete approximation to the Dirac delta function. On a regular rectangular grid the discretized equations are solved using the fast Fourier transform, retaining the efficiency of the immersed boundary method. We show experimental numerical evidence that the rate of volume loss of our method is better than that of the finite difference immersed boundary method. Our formulation offers new insights into the immersed boundary method and leads to new extensions and applications.
Asymptotic-preserving Projective Integration Schemes for Kinetic Equations in the Diffusion Limit SIAM J. Sci. Comput. 34, pp. A579-A602 (24 pages) Online Publication Date: March 08, 2012
We investigate a projective integration scheme for a kinetic equation in the limit of vanishing mean free path in which the kinetic description approaches a diffusion phenomenon. The scheme first takes a few small steps with a simple, explicit method, such as a spatial centered flux/forward Euler time integration, and subsequently projects the results forward in time over a large time step on the diffusion time scale. We show that with an appropriate choice of the inner step size, the time-step restriction on the outer time step is similar to the stability condition for the diffusion equation, whereas the required number of inner steps does not depend on the mean free path. We also provide a consistency result. The presented method is asymptotic-preserving in the sense that the method converges to a standard finite volume scheme for the diffusion equation in the limit of vanishing mean free path. The analysis is illustrated with numerical results, and we present an application to the Su–Olson test.
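The "few small steps, then project forward" structure can be sketched on a toy stiff ordinary differential equation rather than on a kinetic equation. The code below is a generic projective forward Euler loop under that assumption; `projective_forward_euler`, `stiff_rhs`, and all step sizes are hypothetical choices for the illustration, not the scheme or parameters analyzed in the paper.

```python
def projective_forward_euler(f, u, dt_inner, k_inner, dt_outer, n_outer):
    """Projective forward Euler for du/dt = f(u), with u a list of floats.

    Each outer step takes k_inner + 1 small forward Euler steps, then
    extrapolates linearly over the rest of the outer step using the slope
    estimated from the last two inner iterates.
    """
    for _ in range(n_outer):
        prev = u
        for _ in range(k_inner + 1):
            prev, u = u, [ui + dt_inner * fi for ui, fi in zip(u, f(u))]
        remaining = dt_outer - (k_inner + 1) * dt_inner
        u = [ui + remaining * (ui - pi) / dt_inner for ui, pi in zip(u, prev)]
    return u

def stiff_rhs(u):
    # Toy system (an assumption for the demo): u[0] relaxes quickly
    # toward u[1], while u[1] decays slowly at rate 1.
    return [-1000.0 * (u[0] - u[1]), -u[1]]
```

Calling `projective_forward_euler(stiff_rhs, [0.0, 1.0], 1e-3, 2, 0.05, 20)` integrates to time 1 with an outer step fifty times larger than the inner step: the inner steps damp the fast relaxation, and the extrapolation then follows the slow decay.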
SIAM J. Sci. Comput. 18, pp. 1-22 (22 pages) Online Publication Date: July 25, 2006
This paper describes mathematical and software developments for a suite of programs for solving ordinary differential equations in MATLAB.
A Fast Solver for a Nonlocal Dielectric Continuum Model SIAM J. Sci. Comput. 34, pp. B107-B126 (20 pages) Online Publication Date: April 03, 2012
The nonlocal continuum dielectric model is an important extension of the classical Poisson dielectric model, but it is very expensive to solve in general. In this paper, we prove that the solution of one commonly used nonlocal continuum dielectric model of water can be split as a sum of two functions, and these two functions are simply the solutions of one Poisson equation and one Poisson-like equation. With this new solution splitting formula, we develop a fast finite element algorithm and a program package in Python based on the DOLFIN program library such that a nonlocal dielectric model can be solved numerically in an amount of computation that merely doubles that of solving a classic Poisson dielectric model. Using the new solution splitting formula, we also derive the analytical solutions of two nonlocal model problems. We then solve these two nonlocal model problems numerically by our program package and validate the numerical solutions through a comparison with the analytical solutions. Finally, our study of free energy calculation by a nonlocal Born ion model demonstrates that the nonlocal dielectric model is a much better predictor of the solvation free energy of ions than the local Poisson dielectric model.
Partitioning Hypergraphs in Scientific Computing Applications through Vertex Separators on Graphs SIAM J. Sci. Comput. 34, pp. A970-A992 (23 pages) Online Publication Date: March 29, 2012
The modeling flexibility provided by hypergraphs has drawn a lot of interest from the combinatorial scientific community, leading to novel models and algorithms, their applications, and development of associated tools. Hypergraphs are now a standard tool in combinatorial scientific computing. The modeling flexibility of hypergraphs, however, comes at a cost: algorithms on hypergraphs are inherently more complicated than those on graphs, which sometimes translates to nontrivial increases in processing times. Neither the modeling flexibility of hypergraphs nor the runtime efficiency of graph algorithms can be overlooked. Therefore, the new research thrust should be how to cleverly trade off between the two. This work addresses one method for this trade-off by solving the hypergraph partitioning problem by finding vertex separators on graphs. Specifically, we investigate how to solve the hypergraph partitioning problem by seeking a vertex separator on its net intersection graph (NIG), where each net of the hypergraph is represented by a vertex, and two vertices share an edge if their nets have a common vertex. We propose a vertex-weighting scheme to attain good node-balanced hypergraphs, since the NIG model cannot preserve node-balancing information. Vertex-removal and vertex-splitting techniques are described to optimize cut-net and connectivity metrics, respectively, under the recursive bipartitioning paradigm. We also developed implementations of our proposed hypergraph partitioning formulations by adopting and modifying a state-of-the-art graph partitioning by vertex separator tool onmetis. Experiments conducted on a large collection of sparse matrices demonstrate the effectiveness of our proposed techniques.
GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems SIAM J. Sci. and Stat. Comput. 7, pp. 856-869 (14 pages) Online Publication Date: July 14, 2006
We present an iterative method for solving linear systems, which has the property of minimizing at every step the norm of the residual vector over a Krylov subspace. The algorithm is derived from the Arnoldi process for constructing an $l_2$-orthogonal basis of Krylov subspaces. It can be considered as a generalization of Paige and Saunders’ MINRES algorithm and is theoretically equivalent to the Generalized Conjugate Residual (GCR) method and to ORTHODIR. The new algorithm presents several advantages over GCR and ORTHODIR.
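The Arnoldi process named in the abstract can be sketched in a few lines. The version below uses modified Gram-Schmidt on plain Python lists and stops at the basis construction; it is not a full GMRES solver, which would additionally solve a small least squares problem with the Hessenberg coefficients to minimize the residual norm. The name `arnoldi`, the `matvec` callback, and the breakdown threshold are assumptions for the example.

```python
import math

def arnoldi(matvec, r0, k):
    """Orthonormal basis of the Krylov subspace span{r0, A r0, ..., A^(k-1) r0}.

    matvec -- function returning A times a vector (vectors are lists of floats)
    Returns (V, H): V is the list of basis vectors, and H[j] holds the
    coefficients h[0..j+1] satisfying A V[j] = sum_i H[j][i] * V[i].
    """
    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))

    beta = math.sqrt(dot(r0, r0))
    V = [[x / beta for x in r0]]
    H = []
    for j in range(k):
        w = matvec(V[j])
        col = []
        # Modified Gram-Schmidt: remove the component along each basis vector in turn.
        for v in V:
            h = dot(w, v)
            col.append(h)
            w = [wi - h * vi for wi, vi in zip(w, v)]
        norm_w = math.sqrt(dot(w, w))
        col.append(norm_w)
        H.append(col)
        if norm_w < 1e-14:
            break  # breakdown: the Krylov subspace is invariant under A
        V.append([wi / norm_w for wi in w])
    return V, H
```

Only matrix-vector products with $A$ are needed, which is why the approach suits large sparse nonsymmetric systems.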
An Algebraic Multigrid Method with Guaranteed Convergence Rate SIAM J. Sci. Comput. 34, pp. A1079-A1109 (31 pages) Online Publication Date: April 10, 2012
We consider the iterative solution of large sparse symmetric positive definite linear systems. We present an algebraic multigrid method which has a guaranteed convergence rate for the class of nonsingular symmetric M-matrices with nonnegative row sum. The coarsening is based on the aggregation of the unknowns. A key ingredient is an algorithm that builds the aggregates while ensuring that the corresponding two-grid convergence rate is bounded by a user-defined parameter. For a sensible choice of this parameter, it is shown that the recursive use of the two-grid procedure yields a convergence independent of the number of levels, provided that one uses a proper AMLI-cycle. On the other hand, the computational cost per iteration step is of optimal order if the mean aggregate size is large enough. This cannot be guaranteed in all cases but is analytically shown to hold for the model Poisson problem. For more general problems, a wide range of experiments suggests that there are no complexity issues and further demonstrates the robustness of the method. The experiments are performed on systems obtained from low order finite difference or finite element discretizations of second order elliptic partial differential equations (PDEs). The set includes two- and three-dimensional problems, with both structured and unstructured grids, some of them with local refinement and/or reentering corner, and possible jumps or anisotropies in the PDE coefficients.
Flexible Variants of Block Restarted GMRES Methods with Application to Geophysics SIAM J. Sci. Comput. 34, pp. A714-A736 (23 pages) Online Publication Date: March 22, 2012
In a wide number of applications in computational science and engineering the solution of large linear systems of equations with several right-hand sides given at once is required. Direct methods based on Gaussian elimination are known to be especially appealing in that setting. Nevertheless, when the dimension of the problem is very large, preconditioned block Krylov space solvers are often considered as the method of choice. The purpose of this paper is thus to present iterative methods based on block restarted GMRES that allow variable preconditioning for the solution of linear systems with multiple right-hand sides. The use of flexible methods is especially of interest when approximate possibly iterative solvers are considered in the preconditioning phase. First we introduce a new variant of block flexible restarted GMRES that includes a strategy for detecting when a linear combination of the systems has approximately converged. This explicit block size reduction is often called deflation. We analyze the main properties of this flexible method based on deflation and notably prove that the Frobenius norm of the block residual is always nonincreasing. We also present a flexible variant based on both deflation and truncation to especially be used in case of limited memory. Finally we illustrate the numerical behavior of these flexible block methods for large industrial simulations arising in geophysics, where indefinite linear systems of size up to 1 billion unknowns with multiple right-hand sides have been successfully solved in a parallel distributed memory environment.
SIAM J. Sci. Comput. 34, pp. A937-A969 (33 pages) Online Publication Date: March 29, 2012
We present a new approach to treating nonlinear operators in reduced basis approximations of parametrized evolution equations. Our approach is based on empirical interpolation of nonlinear differential operators and their Fréchet derivatives. Efficient offline/online decomposition is obtained for discrete operators that allow an efficient evaluation for a certain set of interpolation functionals. An a posteriori error estimate for the resulting reduced basis method is derived and analyzed numerically. We introduce a new algorithm, the PODEI-greedy algorithm, which constructs the reduced basis spaces for the empirical interpolation and for the numerical scheme in a synchronized way. The approach is applied to nonlinear parabolic and hyperbolic equations based on explicit or implicit finite volume discretizations. We show that the resulting reduced scheme is able to capture the evolution of both smooth and discontinuous solutions. In case of symmetries of the problem, the approach realizes an automatic and intuitive space-compression or even space-dimensionality reduction. We perform empirical investigations of the error convergence and run-times. In all cases we obtain a good run-time acceleration.
A Quasi-algebraic Multigrid Approach to Fracture Problems Based on Extended Finite Elements SIAM J. Sci. Comput. 34, pp. A603-A626 (24 pages) Online Publication Date: March 13, 2012
The modeling of discontinuities arising from fracture of materials poses a number of significant computational challenges. The extended finite element method provides an attractive alternative to standard finite elements in that they do not require fine spatial resolution in the vicinity of discontinuities nor do they require repeated remeshing to properly address propagation of cracks. They do, however, give rise to linear systems requiring special care within an iterative solver method. An algebraic multigrid method is proposed that is suitable for the linear systems associated with modeling fracture via extended finite elements. The new method follows naturally from an energy minimizing algebraic multigrid framework. The key idea is the modification of the prolongator sparsity pattern to prevent interpolation across cracks. This is accomplished by accessing the standard levelset functions used during the discretization process. Numerical experiments illustrate that the resulting method converges in a fashion that is relatively insensitive to mesh resolution and to the number of cracks or their location.
SIAM J. Sci. Comput. 34, pp. A627-A658 (32 pages) Online Publication Date: March 13, 2012
To easily generalize the maximum-principle-satisfying schemes for scalar conservation laws in [X. Zhang and C.-W. Shu, J. Comput. Phys., 229 (2010), pp. 3091–3120] to convection diffusion equations, we propose a nonconventional high order finite volume weighted essentially nonoscillatory (WENO) scheme which can be proved maximum-principle-satisfying. Two-dimensional extensions are straightforward. We also show that the same idea can be used to construct high order schemes preserving the maximum principle for two-dimensional incompressible Navier–Stokes equations in the vorticity stream-function formulation. Numerical tests for the fifth order WENO schemes are reported.
Adaptive Spectral Viscosity for Hyperbolic Conservation Laws SIAM J. Sci. Comput. 34, pp. A993-A1009 (17 pages) Online Publication Date: March 29, 2012
Spectral approximations to nonlinear hyperbolic conservation laws require dissipative regularization for stability. The dissipative mechanism must, on the other hand, be small enough in order to retain the spectral accuracy in regions where the solution is smooth. We introduce a new form of viscous regularization which is activated only in the local neighborhood of shock discontinuities. The basic idea is to employ a spectral edge detection algorithm as a dynamical indicator of where in physical space to apply numerical viscosity. The resulting spatially local viscosity is successfully combined with spectral viscosity, where a much higher than usual cut-off frequency can be used. Numerical results show that the new adaptive spectral viscosity scheme significantly improves the accuracy of the standard spectral viscosity scheme. In particular, results are improved near the shocks and at low resolutions. Examples include numerical simulations of Burgers' equation, shallow water with bottom topography, and the isothermal Euler equations. We also test the schemes on a nonconvex scalar problem, finding that the new scheme approximates the entropy solution more reliably than the standard spectral viscosity scheme.
A New Formulation of the Fast Fractional Fourier Transform SIAM J. Sci. Comput. 34, pp. A1110-A1125 (16 pages) Online Publication Date: April 10, 2012
By using a spectral approach, we derive a Gaussian-like quadrature of the continuous fractional Fourier transform. The quadrature is obtained from a bilinear form of eigenvectors of the matrix associated to the recurrence equation of the Hermite polynomials. These eigenvectors are discrete approximations of the Hermite functions, which are eigenfunctions of the fractional Fourier transform operator. This new discrete transform is unitary and has a group structure. By using some asymptotic formulas, we rewrite the quadrature in terms of the fast Fourier transform (FFT), yielding a fast discretization of the fractional Fourier transform and its inverse in closed form. We extend the range of the fractional Fourier transform by considering arbitrary complex values inside the unit circle and not only at the boundary. We find that this fast quadrature evaluated at $z=i$ becomes a more accurate version of the FFT and can be used for nonperiodic functions.
A New Truncation Strategy for the Higher-Order Singular Value Decomposition SIAM J. Sci. Comput. 34, pp. A1027-A1052 (26 pages) Online Publication Date: April 03, 2012
We present an alternative strategy for truncating the higher-order singular value decomposition (T-HOSVD). An error expression for an approximate Tucker decomposition with orthogonal factor matrices is presented, leading us to propose a novel truncation strategy for the HOSVD, which we refer to as the sequentially truncated higher-order singular value decomposition (ST-HOSVD). This decomposition retains several favorable properties of the T-HOSVD, while reducing the number of operations required to compute the decomposition and practically always improving the approximation error. Three applications are presented, demonstrating the effectiveness of ST-HOSVD. In the first application, ST-HOSVD, T-HOSVD, and higher-order orthogonal iteration (HOOI) are employed to compress a database of images of faces. On average, the ST-HOSVD approximation was only $0.1\%$ worse than the optimum computed by HOOI, while cutting the execution time by a factor of $20$. In the second application, classification of handwritten digits, ST-HOSVD achieved a speedup factor of $50$ over T-HOSVD during the training phase, and reduced the classification time and storage costs, while not significantly affecting the classification error. The third application demonstrates the effectiveness of ST-HOSVD in compressing results from a numerical simulation of a partial differential equation. In such problems, ST-HOSVD inevitably can greatly improve the running time. We present an example wherein the $2$ hour $45$ minute calculation of T-HOSVD was reduced to just over one minute by ST-HOSVD, representing a speedup factor of $133$, while even improving the memory consumption.
A Semidiscrete Finite Volume Constrained Transport Method on Orthogonal Curvilinear Grids SIAM J. Sci. Comput. 34, pp. A763-A791 (29 pages) Online Publication Date: March 27, 2012
A new semidiscrete finite volume scheme for systems of hyperbolic conservation laws using the constrained transport method to evolve divergence-free vector fields on orthogonal curvilinear grids is presented. Our results are an extension of a semidiscrete central-upwind scheme for hyperbolic conservation laws to the framework of constrained transport methods. In particular, we show that by employing the mathematical framework used to derive the hyperbolic base scheme, a constrained transport method sharing the desired upwind and nonoscillatory characteristics can be obtained. The derivation of the basic framework is performed independent of the intended spatial order of the scheme, opening the possibility for high-order schemes. Thus, the derivation is also independent of the piecewise polynomial reconstruction from the cell-averages. Furthermore, the geometric factors arising due to the orthogonal curvilinear grid are obtained in a consistent way. The accuracy of the scheme is demonstrated by applying the method to the equations of magnetohydrodynamics.