Phase retrieval with sparse phase constraint

For the first time, this paper investigates the phase retrieval problem under the assumption that the phase of the complex signal is sparse, in contrast to the sparsity assumption on the signal itself considered in the sparse signal processing literature. The intended application of this new problem model, to be developed in a follow-up paper, is to practical phase retrieval problems where the aberration phase is sparse with respect to the orthogonal basis of Zernike polynomials. Such a problem is called a sparse phase retrieval (SPR) problem in this paper. When the amplitude modulation at the exit pupil is uniform, a new scheme of sparsity regularization on the phase is proposed to capture the sparsity property of the SPR problem. Based on this regularization scheme, we design and analyze an efficient solution method, named the SOPR algorithm, for solving SPR given only a single intensity point-spread-function image. The algorithm is a combination of the Gerchberg-Saxton algorithm with the newly proposed sparsity regularization on the phase. The latter regularization step is mathematically a rotation, but with direction varying across iterations. Surprisingly, this rotation is shown to be a metric projection onto an auxiliary set which is independent of the iterations. As a consequence, the SOPR algorithm is proved to be the cyclic projections algorithm for solving a feasibility problem involving three auxiliary sets. Analyzing regularity properties of these auxiliary sets, we obtain convergence results for the SOPR algorithm based on recent convergence theory for the cyclic projections algorithm. Numerical results show clear effectiveness of the new regularization scheme for solving the SPR problem.


Introduction
Phase retrieval is an inverse problem of recovering a complex signal from one or several measured intensity patterns. It appears in many scientific and engineering fields, including astronomy, crystallography, microscopy, optical manufacturing, and adaptive optics [10,11,24,32]. Historically, there are three sorts of problems with three different names [4]. Phase retrieval refers to a basic form of image-based wavefront sensing in which a single image of a point source is used to estimate the unknown phase parameters of an optical system. Phase-diverse phase retrieval is similar in concept to phase retrieval except that multiple images of a point source, each differing by a known phase term, are used to estimate the unknown phase parameters of an optical system. Phase diversity refers to a method of image-based wavefront sensing where multiple images of an unknown extended object or scene are used to estimate both the unknown phase parameters and the unknown object; the multiple images differ by some known phase terms. An important application of phase retrieval is to quantify the properties of an imaging system via its generalized pupil function (GPF). The fundamental advantage of this approach compared to those using intensity point spread functions (PSFs) or intensity optical transfer functions (OTFs) is that it is modifiable and automatically includes specific characterizations of the imaging system under investigation. The generic formulation of the phase retrieval problem is [32]: find x ∈ C^n such that |Mx|^2 = b − ε, (1) where b is the measurement data, M is a known complex matrix of compatible size and ε is unknown measurement noise. Depending on the parameterization, the phase retrieval problem can be formulated in either the modal or the zonal form. In the modal approach, one can either parameterize the phase aberration only [33], or represent the GPF as a linear combination of some basis functions, e.g.
the extended Nijboer-Zernike basis functions [6] or the radial basis functions, see for example [8,29]. In this formulation the dimension of the parameter variable x does not depend on the size of the problem (the size of the sample grid) but rather on the number of basis functions used to approximate the GPF. This makes the modal approach a feasible choice for very large-scale problems in practice. The matrix M in the modal formulation in general has no desirable properties such as unitarity or isometry, and hence convergence analysis of numerical methods for solving (1) is often challenging even in the noise-free case. It is an obvious prerequisite for the modal formulation that the GPF must be well approximated by a finite number of functions chosen from an orthogonal basis of an infinite dimensional space. This condition is not satisfied, for example, for the sparse phase retrieval problem considered in this paper. Complementing the modal formulation, the zonal approach does not suffer from the two potential drawbacks described above, while it requires the validity of the paraxial approximation in the imaging formulation. This formulation is theoretically based on the relationship that intensity PSF images are the squared magnitudes of the Fourier transforms of the GPF multiplied by well controlled phase diversity patterns [13]. The matrix M in the zonal formulation can hence be normalized to be unitary for problems given a single intensity image and isometric for those given multiple images. More importantly, available fast algorithms for evaluating the Fourier transform can be utilized to numerically compute solutions to phase retrieval problems in this Fourier-transform-based model.
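As a small illustration of the zonal, Fourier-transform-based model (the function names and grid size below are our own choices, not from the paper): with the unitarily normalized 2-D FFT, the intensity PSF is the squared magnitude of the transformed GPF, and the total energy of the pupil is preserved.

```python
import numpy as np

def psf_intensity(gpf):
    """Intensity PSF of a generalized pupil function under the zonal model.

    norm="ortho" makes the 2-D DFT unitary, matching the normalization
    of the matrix M discussed in the text.
    """
    return np.abs(np.fft.fft2(gpf, norm="ortho")) ** 2

# Example: a random complex GPF on a 32x32 grid.
rng = np.random.default_rng(0)
gpf = rng.standard_normal((32, 32)) + 1j * rng.standard_normal((32, 32))
b = psf_intensity(gpf)

# Parseval: with a unitary transform, the measured intensity sums to the pupil energy.
assert np.isclose(b.sum(), np.linalg.norm(gpf) ** 2)
```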
Dealing with sparse phase retrieval, we consider in this paper only the zonal formulation. Since a PSF image captures the complex-valued GPF only via the squared amplitude of its Fourier transform, multiple PSF images are, on the one hand, usually needed for retrieving the phase of the GPF up to a total piston term. On the other hand, simultaneously measuring several PSF images is not always a simple task in practice, especially for applications involving real-time imaging systems. A fundamental reason is that the use of double exposure procedures for such a measurement process inevitably introduces unwanted noise into the measured data. It is also known that under certain circumstances the phase can be estimated from a single intensity PSF image, see for example [7,12]. In this paper, we primarily focus on locally solving the sparse phase retrieval problem (SPR) given a single intensity PSF image, see Section 3. Sparse phase retrieval with multiple intensity images is of course much easier to solve and will only be briefly discussed using the analysis at our disposal for the main research task.
For an overview of numerical methods for phase retrieval problems in general, we refer the reader to the papers [11,24,32]. For the purpose of this paper, we restrict ourselves to a discussion of algorithms which are applicable to the sparse phase retrieval problem (SPR) under investigation. By ignoring the sparsity information, the Gerchberg-Saxton (GS) algorithm [12] technically fits the setting of SPR formulated in Section 3 and will be compared to our proposed algorithm in Section 7. To continue the discussion it is essential to identify the key feature of SPR. The sparse phase retrieval problem is generally not a sparse signal recovery problem (SSR) given the amplitude of its Fourier transform. In the former problem, the sparsity is imposed on the phase of the signal, whereas the signal itself may vanish nowhere (with respect to the canonical basis of C^{n×n}). Mathematically, a signal χ ⊙ exp(jΦ) can vanish nowhere no matter how sparse its phase Φ is. Hence in this paper we make a distinction between the two problems SPR and SSR. A modification of the Fienup algorithm [10] for solving SSR was suggested in [26]. We demonstrate that to some extent this algorithm can be adapted for solving SPR; it will be compared to our algorithm in the discussion of solvability and sensitivity. In the signal recovery literature, when the signal is real and sparse, the authors of [28,31] proposed and analyzed various iterative algorithms. Their rigorous treatment provides some mathematical hints for the study of SPR. However, it is emphasized again that there has been no concrete reformulation of SPR as an SSR problem. This fundamental fact is the barrier to adapting the sparse signal retrieval algorithms of [28,31] to SPR, and hence a comparison of our algorithm to these methods does not seem feasible.
In this paper, we propose an iterative algorithm, named SOPR, for solving SPR and analyze its convergence. SOPR is technically a combination of the Gerchberg-Saxton method and a ‖·‖_0-regularization which is applied to the phase instead of the elements of the underlying space. This regularization scheme is newly developed in this paper. Convergence results for SOPR are rigorously established by showing that SOPR is equivalent to the cyclic projections algorithm for solving an auxiliary, possibly inconsistent, feasibility problem involving three (locally) prox-regular [30] sets; hence the recent convergence theory developed in [25, Section 3.1] for the cyclic projections method can be applied to SOPR. Our numerical experiments for SPR comparing SOPR with the Gerchberg-Saxton method and the sparse Fienup algorithm [26] demonstrate that our new regularization scheme has clear advantages.
The remainder of the paper is organized as follows. Basic mathematical notation is introduced in Section 2. In Section 3 the sparse phase retrieval problem is formally formulated. In Section 4 we discuss the sparse feasibility approach to sparse signal retrieval and investigate a special setting of SPR where the sparse signal recovery algorithm of [26] can be applied. In Section 5 we present the SOPR algorithm and discuss its significant features. In Section 6 convergence analysis for SOPR is rigorously established. Section 7 is devoted to demonstrating the better numerical performance of SOPR compared to the relevant algorithms in particular, and the advantages of our regularization scheme in general. Concluding remarks and a discussion of future research are presented in Section 8.

Mathematical notation
The underlying space in this paper is the complex Hilbert space H = C^{n×n}. The Frobenius norm is denoted ‖·‖_F. The zero norm ‖·‖_0 of an object is the number of its nonzero entries. The element-wise multiplication is denoted ⊙. The element-wise division, the element-wise absolute value |·| and the element-wise square root √· are also frequently used in this paper, without extra notation.
Re(·) and Im(·) denote the real and imaginary parts, respectively, of a complex number. The distance to a set Ω ⊂ H is defined by dist(x, Ω) := inf_{w ∈ Ω} ‖x − w‖_F, and the set-valued mapping P_Ω(x) := {w ∈ Ω : ‖x − w‖_F = dist(x, Ω)} is the corresponding projection operator. A selection w ∈ P_Ω(x) is called a projection of x on Ω. An iterative sequence x_{k+1} ∈ T(x_k) generated by an operator/algorithm T : H ⇒ H is said to converge R-linearly to x* with rate c ∈ [0, 1) if there is a constant γ > 0 such that ‖x_k − x*‖_F ≤ γ c^k for all k.

Sparse phase retrieval
We consider the sparse phase retrieval problem (SPR) of finding Φ ∈ R^{n×n} such that |F(χ ⊙ exp(j(Φ_d + Φ)))|^2 = b − ε, (2) where χ ∈ R_+^{n×n} is a known amplitude distribution, Φ_d ∈ R^{n×n} is a known phase diversity, Φ ∈ R^{n×n} is the unknown variable which is known a priori to be sparse, F is the (discrete) 2-dimensional Fourier transform, b ∈ R_+^{n×n} is the measured intensity image of x := χ ⊙ exp(j(Φ_d + Φ)), and ε ∈ R^{n×n} is some unknown noise. In this paper, χ is assumed to be a uniform distribution. For simplicity of terminology, when x = χ ⊙ exp(j(Φ_d + Φ)) is a solution to (2), Φ will in the sequel be called a phase solution to that problem. Note that the phase diversity Φ_d plays an important role in applications even though it makes no mathematical difference to (2).
Since Φ is the only unknown of the above problem, the model of sparse phase retrieval given multiple intensity images is simply a combination of several models (2) each of which corresponds to a different phase diversity Φ d .
Despite the sparsity of Φ, problem (2) is not a sparse signal retrieval (SSR) problem [26,28,31] because the signal x vanishes nowhere. Hence, there is a profound difference between SPR and SSR. The link connecting the two problems established in Section 4, as well as the ‖·‖_0-regularization for SPR presented in Section 5, are novel to the best of our awareness.
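The point that the signal vanishes nowhere even when its phase is sparse is easy to see in a simulation. The following sketch generates one SPR instance of model (2); the grid size, sparsity level and zero phase diversity are illustrative choices of ours, not from the paper, and the unitary FFT plays the role of F.

```python
import numpy as np

def make_spr_instance(n=32, s=50, seed=1):
    """Simulate a noise-free SPR measurement b = |F(chi * exp(j(Phi_d + Phi)))|^2."""
    rng = np.random.default_rng(seed)
    chi = np.ones((n, n))                      # uniform amplitude distribution
    phi_d = np.zeros((n, n))                   # zero phase diversity, for simplicity
    phi = np.zeros((n, n))                     # s-sparse phase on a random support
    support = rng.choice(n * n, size=s, replace=False)
    phi.flat[support] = rng.uniform(-np.pi, np.pi, size=s)
    x = chi * np.exp(1j * (phi_d + phi))
    b = np.abs(np.fft.fft2(x, norm="ortho")) ** 2
    return chi, phi, b

chi, phi, b = make_spr_instance()
x = chi * np.exp(1j * phi)
# The signal x vanishes nowhere even though its phase is sparse.
assert np.all(np.abs(x) > 0)
assert np.count_nonzero(phi) <= 50
```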

‖·‖_0-regularization for sparse signal retrieval

As discussed in Section 3, in general there is no concrete interpretation of SPR as a sparse signal retrieval (SSR) problem in the way that algorithms for the latter problem can be applied to SPR. In this section we establish a new link between SPR and SSR. As a by-product, we can to some extent compare our proposed algorithm to a sparse signal recovery method, see Section 7. Let us consider the optimization problem with sparsity and complex-amplitude constraints (SOCA): minimize ‖x′‖_0 subject to x′ ∈ B, (3) where, for a given data b′ ∈ R_+^{n×n}, B := {z ∈ C^{n×n} : |F(z)|^2 = b′}, (4) and ŝ is some sparsity parameter (a guess at the sparsity of solutions to (3)). We note that SPR is not explicitly related to SOCA since the variable x in (2) is nonzero everywhere. However, the following observation, which follows from the properties of the Fourier transform, gives rise to a link between SPR and SOCA in a special setting.
Lemma 4.1. Let x = χ ⊙ exp(jΦ) with χ a uniform amplitude distribution and Φ sparse. Then |F(x − χ)(ξ)| = |F(x)(ξ)| holds true for all entries except at the one corresponding to the zero frequency ξ_0. If in addition we suppose further that Im(mean(x)) = 0, (6) then it also holds that |F(x − χ)(ξ_0)| = ‖χ‖_F − |F(x)(ξ_0)|. (5)

Proof. We have by the linearity of F that F(x − χ) = F(x) − F(χ). Since χ is uniform, F(χ) is zero everywhere except at the zero frequency ξ_0. As a result, the first part of Lemma 4.1 follows. For the latter part regarding the zero frequency, we first note that F(x)(ξ_0) = mean(x), the average of the signal x, and F(χ)(ξ_0) = ‖χ‖_F by the basic properties of the Fourier transform. Condition (6) then implies that F(x)(ξ_0) = Re(F(x)(ξ_0)). Since the (element-wise) argument of x is sparse and χ is a uniform distribution, Re(F(x)(ξ_0)) is positive. As a consequence of the Rayleigh-Parseval theorem we get |F(x)(ξ_0)| ≤ ‖χ‖_F. Putting these facts together, we obtain |F(x − χ)(ξ_0)| = |F(x)(ξ_0) − ‖χ‖_F| = ‖χ‖_F − |F(x)(ξ_0)|, which is (5) as claimed.
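Both parts of Lemma 4.1 can be checked numerically. In the sketch below the grid size, sparsity and the pairing of phase values into ±θ couples (which enforces the hypothesis Im(mean(x)) = 0) are our own illustrative choices; the unitary FFT plays the role of F, for which ‖χ‖_F equals the zero-frequency coefficient of the uniform amplitude.

```python
import numpy as np

n, s = 32, 40
rng = np.random.default_rng(2)
chi = np.ones((n, n))
phi = np.zeros((n, n))
idx = rng.choice(n * n, size=s, replace=False)
theta = rng.uniform(0.1, np.pi - 0.1, size=s // 2)
phi.flat[idx] = np.concatenate([theta, -theta])  # +/- pairing => Im(mean(x)) = 0
x = chi * np.exp(1j * phi)

F = lambda z: np.fft.fft2(z, norm="ortho")
lhs, rhs = np.abs(F(x - chi)), np.abs(F(x))

# |F(x - chi)| = |F(x)| at every frequency except the zero frequency (0, 0).
mask = np.ones((n, n), dtype=bool)
mask[0, 0] = False
assert np.allclose(lhs[mask], rhs[mask])
# At the zero frequency: |F(x - chi)(xi_0)| = ||chi||_F - |F(x)(xi_0)|.
assert np.isclose(lhs[0, 0], np.linalg.norm(chi) - rhs[0, 0])
```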
The implication of Lemma 4.1 is that, under condition (6), solutions to SOCA (3) can be used as approximate solutions to SPR (2), as explained next.
Lemma 4.2. In the setting of Lemma 4.1, let x′ be a solution to SOCA (3) with data b′ satisfying |F(x′)|^2 = b′ + ε′ for a sufficiently small noise term ε′, and set x := x′ + χ. Then, up to a noise term, x is a solution to SPR.
Proof. By the assumption, x′ is a solution to SOCA and it holds that |F(x′)|^2 = b′ + ε′. Then, using Lemma 4.1, we have that |F(x)|^2 equals b up to a small noise term, where the last equality follows from (7) and (8). Hence, up to a noise term, we can write |F(x)|^2 = b. That is, x is a solution to SPR, and the proof is complete.
In the setting of Lemmas 4.1 and 4.2, one can approximately calculate the measurement b′ := |F(x′)|^2 of the sparse signal x′ from the given data b = |F(x)|^2 of SPR. This allows us to approach SPR using analysis techniques for SOCA. Note that the zero norm ‖x′‖_0 in (3) is a regularization term encoding the sparsity of x′, equivalently of Φ, which is not captured in (2), and hence has no effect on the relationship between SPR and SOCA.
We consider SOCA as an extension of the sparse affine optimization problem and adapt the feasibility approach of [15] for solving SOCA. To begin, let us assume that we have a good guess ŝ of the sparsity of solutions to (3), and consider the sparse feasibility problem (SFP) of finding x ∈ A ∩ B, (9) where A := {x ∈ C^{n×n} : ‖x‖_0 ≤ ŝ} with the sparsity parameter ŝ, and the set B is defined by (4). When the measurement data b′ is corrupted with noise, SFP may become an inconsistent feasibility problem and its solutions are accordingly understood as best approximation points, in a reasonable sense, for the two sets, see [23].
In the noise-free case with ŝ = s, the sparsity of the sparsest solutions to SOCA, the two problems SOCA and SFP are equivalent. In practice the sparsity s is hardly known in advance. However, as for sparse affine feasibility [15], various fundamental feasibility algorithms are not numerically sensitive to the sparsity parameter ŝ. That is, for a relatively wide range of values of ŝ, these algorithms perform essentially the same (see also Section 7.2 for the SOPR algorithm). Prominent advantages of the feasibility approach to SOCA are its low computational complexity compared to convex relaxation schemes for (3), and the fact that convergence of solution methods can be understood and anticipated from the geometry of the feasibility problem.
The fundamental step of solving (9) by projection algorithms is to calculate the two projection operators P_A and P_B. Let us define on C^{n×n} the constraint set C := {z ∈ C^{n×n} : |z| = √b′}. (10) Since F is a unitary transformation, the projector P_B takes the form P_B = F^{−1} ∘ P_C ∘ F, (11) where F^{−1} is the inverse 2-dimensional Fourier transform. As the set C is geometrically the Cartesian product of a number of circles in the complex plane, the explicit form of the projector P_C, which is in general a set-valued mapping, is available [2,24]. From the viewpoint of numerical calculation, we only need a selection of the projector P_C. From now on, we work only with the following selection of P_C which is, with a slight abuse of notation, also denoted by P_C.
P_C(z) := √b′ ⊙ z/|z|, with use of the extended arithmetic convention 0/0 = 1. The corresponding selection of P_B is P_B(x) = F^{−1}(√b′ ⊙ F(x)/|F(x)|). There is also a closed form for P_A. For each x ∈ C^{n×n}, let us denote by I_ŝ(x) the set of all ŝ-tuples of indices of the ŝ largest in absolute value entries of x. The set I_ŝ(x) can contain multiple such ŝ-tuples. The projector P_A can be explicitly written [3] as follows: for any x ∈ C^{n×n}, P_A(x) = {z ∈ C^{n×n} : z equals x on some I ∈ I_ŝ(x) and vanishes elsewhere}.
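The two projectors are straightforward to implement. The sketch below uses NumPy's unitary FFT for F; the function names and the random data in the sanity checks are our own illustrative choices.

```python
import numpy as np

def proj_C(z, b_prime):
    """Selection of P_C: move each entry of z to the circle of radius
    sqrt(b'), using the convention 0/0 = 1 for zero entries."""
    mag = np.abs(z)
    safe = np.where(mag > 0, mag, 1.0)
    unit = np.where(mag > 0, z / safe, 1.0)
    return np.sqrt(b_prime) * unit

def proj_B(x, b_prime):
    """P_B = F^{-1} o P_C o F with the unitary 2-D FFT."""
    y = np.fft.fft2(x, norm="ortho")
    return np.fft.ifft2(proj_C(y, b_prime), norm="ortho")

def proj_A(x, s_hat):
    """A selection of P_A: keep the s_hat largest-in-magnitude entries,
    zero out the rest (one selection in case of ties)."""
    z = np.zeros_like(x)
    keep = np.argsort(np.abs(x), axis=None)[-s_hat:]
    z.flat[keep] = x.flat[keep]
    return z

# Sanity checks on random data.
rng = np.random.default_rng(4)
x = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
b_prime = np.abs(np.fft.fft2(rng.standard_normal((16, 16)), norm="ortho")) ** 2
y = proj_B(x, b_prime)
assert np.allclose(np.abs(np.fft.fft2(y, norm="ortho")) ** 2, b_prime)
assert np.count_nonzero(proj_A(x, 10)) <= 10
```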
In summary, the well-known sparsity regularization for sparse signal retrieval described in this section can be applied to SPR in the special setting specified in Lemma 4.1. For the general setting of SPR (2), we propose a new strategy of ‖·‖_0-regularization which is implicitly incorporated in the SOPR algorithm described in the next section.

SOPR algorithm
This section presents our new idea for SPR. Under the assumption that the amplitude distribution χ is uniform, a sparsity constraint can be applied to the phase to encode the sparsity property of the phase solution to SPR (2). The rigorous analysis of this technique will be provided in Section 6.
In the sequel, by saying "largest/smallest entries" of a vector or a matrix we always mean "largest/smallest in absolute value entries", to avoid repetition. Stopping criterion: ‖Φ_k − Φ_{k+1}‖_F < τ or k exceeds a maximum number of iterations. Output: Φ̂ = Φ_end, the estimate.
Algorithm 5.1 is fundamentally a combination of the Gerchberg-Saxton (GS) algorithm (steps (1)-(5)) and a sparsity regularization applied to the phase (steps (6)-(7)). When applied to SPR, this extra regularization step gives SOPR significant advantages over the original GS method, see Section 7.
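The combination can be sketched as follows. This is an illustrative reconstruction under our own naming, not the verbatim Algorithm 5.1: a Gerchberg-Saxton phase update (magnitude replacement in the Fourier domain) followed by hard thresholding that keeps the ŝ largest phase entries, with the unitary FFT playing the role of F.

```python
import numpy as np

def sopr_step(phi, chi, b, phi_d, s_hat):
    """One SOPR-style iteration (a sketch, not the verbatim Algorithm 5.1):
    a Gerchberg-Saxton update of the phase followed by hard thresholding
    that keeps the s_hat largest phase entries."""
    F = lambda z: np.fft.fft2(z, norm="ortho")
    Fi = lambda z: np.fft.ifft2(z, norm="ortho")
    x = chi * np.exp(1j * (phi_d + phi))           # current iterate
    y = F(x)
    y = np.sqrt(b) * np.exp(1j * np.angle(y))      # impose the measured magnitudes
    psi = np.angle(Fi(y) * np.exp(-1j * phi_d))    # GS phase estimate
    new_phi = np.zeros_like(psi)
    keep = np.argsort(np.abs(psi), axis=None)[-s_hat:]
    new_phi.flat[keep] = psi.flat[keep]            # keep the s_hat largest entries
    return new_phi

# Sanity check: an exact noise-free phase solution is a fixed point of the step.
rng = np.random.default_rng(3)
n, s = 32, 40
chi, phi_d = np.ones((n, n)), np.zeros((n, n))
phi = np.zeros((n, n))
idx = rng.choice(n * n, size=s, replace=False)
phi.flat[idx] = rng.uniform(-2.5, 2.5, size=s)    # phases away from the +/- pi wrap
b = np.abs(np.fft.fft2(chi * np.exp(1j * phi), norm="ortho")) ** 2
assert np.allclose(sopr_step(phi, chi, b, phi_d, s), phi)
```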
SOPR can be naturally adapted for solving SPR with multiple intensity images in the same way that the original GS method has been extended to various phase retrieval algorithms using phase diversity patterns. More specifically, steps (2)-(5) of SOPR can be enhanced by any algorithmic schemes for solving feasibility problems such as the alternating projections, the Douglas-Rachford method, the RAAR algorithm [22], the T λ algorithm [27] and cyclic Douglas-Rachford scheme [5]. In order to demonstrate the effectiveness of our regularization strategy, in Section 7.5 we briefly compare the cyclic projections algorithm with the corresponding cyclic version of SOPR in terms of numerical performance for SPR with multiple intensity images.

Convergence analysis
In this section we establish local convergence results for SOPR by showing its equivalence to the cyclic projections algorithm involving three sets. As a result, the comprehensive convergence theory for the cyclic projections method recently developed in [25] can be applied to SOPR. It is worth mentioning that this theory also encompasses inconsistent feasibility problems, which in the setting of this paper correspond to SOPR applied to SPR with noise. To avoid the rather formidable technical details inherent to inconsistent feasibility problems, we present the convergence results in the noise-free setting.
To begin, let us define on C^{n×n} the three sets Ω_1 := {x ∈ C^{n×n} : |F(x ⊙ exp(jΦ_d))|^2 = b}, Ω_2 := {x ∈ C^{n×n} : |x| = χ}, and Ω_3 := {x ∈ C^{n×n} : |x| = χ, ‖arg(x)‖_0 ≤ ŝ}, where arg(x) denotes the element-wise argument of x taking values in (−π, π]. The next lemma shows that SOPR is nothing else but the cyclic projections method P_Ω3 P_Ω2 P_Ω1.
Lemma 6.1. The following statements regarding Algorithm 5.1 hold true.

(i) Steps (2)-(4) of SOPR amount to taking a projection on Ω_1.
(ii) Step (5) of SOPR amounts to taking a projection on Ω_2.
(iii) Steps (6)-(7) together with step (1) of SOPR amount to taking a projection on Ω_3.

As a consequence, SOPR coincides with the cyclic projections method P_Ω3 P_Ω2 P_Ω1.

Proof. (i) We first observe that Ω_1 = {x ∈ C^{n×n} : F(x ⊙ exp(jΦ_d)) ∈ C}, (12) where the constraint set C is given by (10) with b in place of b′. Let us denote by D : C^{n×n} → C^{n×n} the linear operator D(x) := x ⊙ exp(jΦ_d). (13) It is clear that D is a unitary transform since the energy of any signal in C^{n×n} is invariant under D. Its inverse clearly takes the form D^{−1}(x) = x ⊙ exp(−jΦ_d). Now using (12) and the formula (11) with the unitary transform F ∘ D in place of F, we obtain P_Ω1 = D^{−1} ∘ F^{−1} ∘ P_C ∘ F ∘ D. (14) Comparing steps (2), (3) and (4) of SOPR with (14), we see that these steps amount to applying P_Ω1.
(ii) This part is clear from step (5) of SOPR and the definition of Ω 2 .
(iii) This statement is rather unexpected since the ‖·‖_0-regularization applied to the phase object on (−π, π]^{n×n} turns out to be equivalent to a metric projection on the underlying space C^{n×n}. Let Ψ_k, I_k and Φ_{k+1} be as in steps (6), (7) and (1) of SOPR. We need to show that x_{k+1} = χ ⊙ exp(jΦ_{k+1}) is a projection of z_k = χ ⊙ exp(jΨ_k) on Ω_3. By the definition of Ω_3, it suffices to show (15). We first note that χ(ξ) = χ(ξ_c) since χ is a uniform distribution. Condition (15) then reduces to (16). By the definition of I_k, it holds that 0 ≤ |Ψ_k(ξ_c)| ≤ |Ψ_k(ξ)| ≤ π for all ξ ∈ I_k and all ξ_c ∉ I_k.
Then, for all ξ ∈ I_k and ξ_c ∉ I_k, the required inequality holds, where the only inequality step follows from (16). Hence we have proved that x_{k+1} ∈ P_Ω3(z_k), and the proof is complete.
Lemma 6.1 allows us to apply the convergence theory for the cyclic projections algorithm developed in [25, Section 3.1] in full to the SOPR algorithm. We next present the mathematical material sufficient for formulating a local linear convergence criterion in a concise and memorable statement. Since phase retrieval is a typical nonconvex problem and our approach requires no convex relaxations, one should not expect global convergence for SOPR.
Recall that a set Ω is called prox-regular at a point x* ∈ Ω if the projector P_Ω is single-valued around x* [30]. The analysis of prox-regularity in the context of the phase retrieval problem was first given in [23]. The following lemma describes the geometry of the auxiliary feasibility problem under consideration. Lemma 6.2. The following statements regarding the sets Ω_i (i = 1, 2, 3) hold true.
(i) The set Ω 1 is prox-regular at every point of it.
(ii) The set Ω 2 is prox-regular at every point of it.
(iii) Suppose that ŝ is the sparsity of the solutions with sparsest phase to the feasibility problem of finding x ∈ Ω_1 ∩ Ω_2 ∩ Ω_3. (18) Then the set Ω_3 is prox-regular at every solution to (18). As a consequence, in the setting of item (iii), the sets Ω_i (i = 1, 2, 3) are prox-regular at every solution to problem (18).
Proof. (i) The proof follows the idea of [23, Section 3.1]. We first note that the set C defined at (10) is prox-regular at every point of it, since it is the Cartesian product of a number of circles, which are typical examples of prox-regular sets. It is clear from the definition of Ω_1 that Ω_1 = (F ∘ D)^{−1}(C) with D given by (13). But F ∘ D, being a unitary transform, does not affect the geometric properties of C, and hence the statement is proved.
(ii) The first argument for item (i) also encompasses this statement.
(iii) We first note that Ω_3 is the union of (n² choose ŝ) sets, each of which, called a component of Ω_3, is the Cartesian product of ŝ copies of the circle {c ∈ C : |c| = χ} and n² − ŝ singletons {χ}. Each of the components of Ω_3 is a prox-regular set, by the same argument as for Lemma 6.2 (i). Let x* ∈ Ω_1 ∩ Ω_2 ∩ Ω_3. By the definition of the sets Ω_i (i = 1, 2, 3), we have x* = χ ⊙ exp(jΦ*) and ‖Φ*‖_0 = ŝ. Let us denote by I_{x*} the set of indices corresponding to the nonzero entries of Φ*. Since ŝ is the sparsity of the solutions with sparsest phase to problem (18), there is a unique component of Ω_3 which contains x*, denoted L_{x*}. Since L_{x*} is prox-regular as mentioned above, all we need is to show the existence of a neighborhood U of x* such that (19) holds. Let us define U by (20)-(22). We will show that this neighborhood of x* satisfies the desired property (19). Since the inclusion (Ω_3 ∩ U) ⊃ (L_{x*} ∩ U) is trivial, we prove only the reverse one, (Ω_3 ∩ U) ⊂ (L_{x*} ∩ U). Indeed, take any x ∈ Ω_3 ∩ U. Note that |x| = χ as x ∈ Ω_3. Let us write x = χ ⊙ exp(jΦ). Then for every index ξ ∈ I_{x*}, we have by the triangle inequality, (17), (22), (20) and (21) successively that Φ(ξ) ≠ 0. This in particular implies that Φ(ξ) = 0 for all indices ξ ∉ I_{x*} and ‖Φ‖_0 = ŝ. That is, x ∈ L_{x*}, and hence the proof is complete.
It is worth mentioning that the condition on the uniform distribution of the amplitude χ is essential to both Lemmas 6.1 and 6.2.
We are now ready to state and prove a local linear convergence criterion for SOPR, which by Lemma 6.1 coincides with the cyclic projections algorithm T_CP := P_Ω3 P_Ω2 P_Ω1. Theorem 6.3 (R-linear convergence of SOPR). Let Φ* be a sparsest phase solution to the SPR problem (2) with ‖Φ*‖_0 = ŝ. Suppose that the set-valued mapping F := Id − T_CP is metrically subregular at x* := χ ⊙ exp(jΦ*) for 0, i.e., there is a constant κ > 0 such that the inequality dist(x, F^{−1}(0)) ≤ κ dist(0, F(x)) holds for all x in a neighborhood of x*. Then every iterative sequence generated by SOPR converges R-linearly to a solution to SPR provided that the initial point is sufficiently close to x*.
Proof. Since ŝ is the sparsity of the sparsest phase solutions to the SPR problem (2), Φ* is a sparsest phase solution to (2) if and only if x* = χ ⊙ exp(jΦ*) is a solution to (18). By Lemma 6.1 the two algorithms SOPR and T_CP coincide. Therefore, it suffices to show the conclusion of Theorem 6.3 for sequences generated by T_CP. By Lemma 6.2, the sets Ω_i (i = 1, 2, 3) are prox-regular at x*. By [14, Theorem 2.14 (ii)], the projectors P_Ωi (i = 1, 2, 3) are almost firmly nonexpansive with violation that can be made arbitrarily small by shrinking the effective neighborhood of x*. Then by [25, Proposition 2.4 (iii)], there is a neighborhood U of x* such that the operator T_CP is almost averaged on U with averaging constant 3/4 and violation ε that can be made arbitrarily small. Hence, taking also the metric subregularity of Id − T_CP at x* for 0 into account, we can directly apply [25, Corollary 2.3] to obtain local linear convergence of T_CP. The proof is complete.
The notion of metric subregularity appearing in Theorem 6.3 is among the cornerstones of variational analysis and optimization theory, with many important applications [9,16]. In the context of feasibility problems, metric subregularity is closely related to the mutual arrangement of the sets at the reference point [14,17-21,25]. A deeper look at this type of regularity for the collection of sets {Ω_1, Ω_2, Ω_3} at x* involved in Theorem 6.3 is certainly of interest but beyond the scope of this paper.

Numerical experiments
Several numerical aspects of SOPR, including solvability, sensitivity to the sparsity parameter, sensitivity to erroneous input, and robustness against noise, are examined in relevant experiments. Since algorithms for retrieving sparse signals such as those suggested in [28,31] are not applicable to SPR, as discussed in Section 3, we cannot compare SOPR to them. Indeed, we have not found in the literature any numerical experiments on (locally) solving sparse phase retrieval given only one PSF intensity image. For comparison, we primarily compare SOPR to the well-known Gerchberg-Saxton (GS) algorithm, which mathematically corresponds to SOPR with ŝ = n² (i.e., steps (6) and (7) of SOPR are dispensed with). The ultimate conclusion is that SOPR, which can be thought of as a ‖·‖_0-regularized variant of GS, works significantly better than that algorithm for SPR. This advantageous feature of SOPR is also demonstrated for the easier problem, SPR with multiple images, in Section 7.5.
In the discussion of the sparsity level s of the signal and the sparsity parameter ŝ, we also compare SOPR to the sparse Fienup (SF) algorithm [26] applied to retrieve the sparse signal x − χ from the data |F(x − χ)|, which is approximately calculated from |F(x)| via Lemma 4.1. For convenience, we also specify SF for our SPR problem in Algorithm 7.1. The inexactness of the calculation of b′ from b and χ described in Algorithm 7.1 is an obvious factor in the ineffectiveness of this algorithm for solving SPR, as will be observed in this section. It is worth mentioning that the main goal of these experiments is not the outperformance of SOPR over SF but rather the performance of SOPR itself.
Input: b - intensity PSF image; ŝ - sparsity parameter; Φ_0 - initial guess for Φ; τ - tolerance threshold. Data processing: calculate b′ from b and χ using Lemma 4.1; set u_0 = χ ⊙ (exp(jΦ_0) − 1), the initial guess of the sparse signal x′ = x − χ. Iteration process: given u_k, (4) find I_k, a set of indices corresponding to the n² − ŝ smallest entries of v_k; (5) set u_{k+1} = 0 on I_k and u_{k+1} = v_k elsewhere.
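The core update of the sparse Fienup scheme can be sketched as follows. This is an illustrative reconstruction of the idea behind Algorithm 7.1 (a Fourier-magnitude projection followed by hard thresholding that keeps the ŝ largest entries); the intermediate steps producing v_k from u_k may differ from [26], and the function names are ours.

```python
import numpy as np

def sf_step(u, b_prime, s_hat):
    """One sparse-Fienup-style iteration (a sketch, not the verbatim
    Algorithm 7.1): impose the measured Fourier magnitudes, then keep
    the s_hat largest entries of the result."""
    F = lambda z: np.fft.fft2(z, norm="ortho")
    Fi = lambda z: np.fft.ifft2(z, norm="ortho")
    y = F(u)
    mag = np.abs(y)
    safe = np.where(mag > 0, mag, 1.0)
    y = np.sqrt(b_prime) * np.where(mag > 0, y / safe, 1.0)
    v = Fi(y)                                    # magnitude-corrected estimate
    u_next = np.zeros_like(v)
    keep = np.argsort(np.abs(v), axis=None)[-s_hat:]
    u_next.flat[keep] = v.flat[keep]             # keep the s_hat largest entries
    return u_next

# Sanity check: a sparse signal consistent with the data is a fixed point.
rng = np.random.default_rng(5)
n, s = 16, 12
u_true = np.zeros((n, n), dtype=complex)
idx = rng.choice(n * n, size=s, replace=False)
u_true.flat[idx] = rng.standard_normal(s) + 1j * rng.standard_normal(s)
b_prime = np.abs(np.fft.fft2(u_true, norm="ortho")) ** 2
assert np.allclose(sf_step(u_true, b_prime, s), u_true)
```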

Solvability Versus Sparsity
In this section we examine the recovery probability of SOPR relative to the sparsity of the phase. A comparison to SF is also presented. For each sparsity level s, an s-sparse phase Φ of size 32 × 32 pixels was randomly generated from the uniform distribution on [−π, π]. For this experiment a small amount of white Gaussian noise, generated using the Matlab function awgn(·, 40dB, 'measured'), was added to the squared amplitude of F(χ ⊙ exp(jΦ)); the negative entries of the obtained measurement were then reset to zero. In the remainder of this paper, the amplitude χ is uniform with value 1. One hundred initial points were generated according to the formula x_0 = χ ⊙ exp(j(Φ + φ)), where the entries of φ were randomly generated from the uniform distribution on [−2π/5, 2π/5]. SOPR and SF were each called 100 times for each sparsity level s = 4 + 10k with k = 1, 2, ..., 58. The sparsity parameter ŝ was taken to be around 105% of s. In this experiment and in the sequel, a run is counted as successful if the criterion (23) is reached in fewer than 200 iterations, where Φ_s is the truncation of the current estimated phase at its s largest entries. Fig. 1 shows that the recovery probability of SOPR is close to one for sparsity levels between 6% and 25%. The algorithm fails to retrieve the phase in both extreme cases of sparsity: below 4% or above 40%. It is also clear that SF is far outperformed by SOPR.

Sensitivity to Sparsity Parameter
Like the stability of projection methods for solving sparse affine feasibility with respect to the sparsity parameter [15], in this section we demonstrate that SOPR for locally solving sparse phase retrieval problems enjoys the same kind of stability. A comparison to SF is presented. We consider a simulated sparse phase Φ of size 32 × 32, an intensity image corrupted by white Gaussian noise at 40dB, 100 initial points and the success criterion (23) as in Section 7.1. In this experiment the sparsity of Φ is known to be s = 144, and we examine the recovery probability of SOPR for a wide range of sparsity parameters ŝ, both lower and higher than s. SOPR and SF were each called 100 times for each sparsity parameter ŝ = 44 + 10k with k = 1, 2, ..., 50. It is known from Fig. 1 that the recovery probability of SOPR with ŝ slightly above s is close to one. Fig. 2 shows that this probability indeed remains stable for a wide range of ŝ, between 80% and 180% of s. It is also clear from Fig. 2 that SF is far outperformed by SOPR.

Sensitivity to Erroneous Input

In this section we examine the sensitivity of SOPR with respect to erroneous input, that is, the distance from the initial point to the solution. A comparison to the SF and Gerchberg-Saxton (GS) methods is presented. We consider a simulated phase Φ of size 32 × 32 with sparsity s = 144, white Gaussian noise and the success criterion (23) as in Section 7.2. The sparsity parameter was taken to be around 105% of s. Note that for an initial phase guess Φ_0 = Φ + φ, due to the sparsity of Φ, a norm of φ may not be very informative about the intrinsic difference between Φ and Φ_0. For example, any φ which, when added to Φ, has no effect on the set of indices of the s largest entries of Φ could be considered as zero in this experiment. Here, the erroneous input is measured as the difference (in pixels) between the supports of Φ and Φ_0^s, the truncation of Φ_0 at its s largest entries.
The minimum of such a support difference is zero pixels, corresponding to the ignorable case of φ described above, and its maximum value is 2s, corresponding to the opposite scenario. In this experiment, 100 initial points were generated for each support difference value d = 5k with k = 0, 1, . . . , 40. The three algorithms SROP, SF and GS were each called 100 times for those initial points. Fig. 3 shows that SROP performs consistently for d up to 85% of s. SF performs well for d up to 50% of s, while GS cannot retrieve the phase for d above 15% of s. It is clear that SROP outperforms SF and GS in this experiment. The advantage of SROP over GS in particular demonstrates the effectiveness of the regularization scheme newly proposed in this paper.
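The truncation and the support-difference metric used above can be implemented as in the following sketch (function names are illustrative):

```python
import numpy as np

def truncate(phase, s):
    """Keep the s entries of largest magnitude, zero out the rest."""
    flat = phase.ravel().copy()
    keep = np.argpartition(np.abs(flat), -s)[-s:]
    out = np.zeros_like(flat)
    out[keep] = flat[keep]
    return out.reshape(phase.shape)

def support_difference(phi, phi0, s):
    """Number of pixels in the symmetric difference of the supports of
    the truncations of phi and phi0 at their s largest entries; it
    ranges from 0 (ignorable perturbation) to 2s (disjoint supports)."""
    supp = lambda p: set(np.flatnonzero(truncate(p, s)))
    return len(supp(phi) ^ supp(phi0))
```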

Robustness Against Noise
In this section we examine the robustness of SROP against noise. We consider a simulated phase Φ of size 32 × 32 with sparsity s = 144 and the success criterion (23) as in Section 7.2. The sparsity parameter was taken to be around 105% of s. One hundred initial points were generated according to the formula x_0 = χ ⊙ exp(j(Φ + φ)), where φ was randomly generated from the uniform distribution on [−2π/5, 2π/5] as in Section 7.1. The support difference is on average about 117 pixels (see Section 7.3). Since the Gerchberg-Saxton method does not work in this experiment, as can be observed from Fig. 3, we report only the performance of SROP and SF. For each level of white Gaussian noise ranging from 20dB to 50dB, SROP and SF were called 100 times with the initial points described above. Fig. 4 shows that for signal-to-noise ratios below 26dB neither algorithm can retrieve the phase, and that for signal-to-noise ratios above 30dB SROP again far outperforms SF.

SPR with Multiple Images
In this section we briefly compare the cyclic projections (CP) algorithm with the corresponding cyclic version of SROP for SPR given multiple images. We consider a simulated phase Φ of size 128 × 128 with sparsity level 25%, randomly generated from the uniform distribution on [−π, π]. A circular aperture with radius 64 pixels was used in this experiment. Three intensity images corresponding to the three phase diversity patterns exp(jπ(d/4)Z_2^0(ρ)) with d = −2, 0, 3, respectively, were generated, where Z_2^0(ρ) is the radial polynomial in the polar coordinates (ρ, ϕ). For the experiment with noise, white Gaussian noise at 30dB was added to these images and the negative entries of the resulting images were then reset to zero. The sparsity parameter for (the cyclic version of) SROP was taken to be about 105% of the true sparsity. The amplitude distribution (zero phase everywhere) was taken as the starting point for both algorithms. Note that a good initial point is not technically required for phase retrieval given multiple images. For this simulation we can monitor the residual between the estimated phase Φ̂ and the exact phase Φ, and hence the residual quantity ∥Φ̂ − Φ∥_F will be used to evaluate the quality of retrieval in this section. In the noise-free case shown in Fig. 5 (left), SROP appears to recover the exact phase, while the cyclic projections algorithm seemingly fails to exactly retrieve the zero entries of the true solution. In the experiment with noise shown in Fig. 5 (right), SROP also appears to yield an estimated phase with smaller residual than that of CP.
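The diversity patterns can be generated as in the following sketch, assuming the standard defocus radial polynomial Z_2^0(ρ) = 2ρ² − 1 and a centered circular aperture; the function name is illustrative:

```python
import numpy as np

def defocus_diversity(n, radius, d):
    """Phase-diversity pattern exp(j*pi*(d/4)*Z_2^0(rho)) on a circular
    aperture of the given pixel radius, where Z_2^0(rho) = 2*rho**2 - 1
    is the radial polynomial in normalized polar coordinates."""
    y, x = np.mgrid[:n, :n]
    rho = np.hypot(x - n / 2, y - n / 2) / radius
    aperture = (rho <= 1.0).astype(float)
    z20 = 2.0 * rho ** 2 - 1.0
    return aperture * np.exp(1j * np.pi * (d / 4.0) * z20)

# the three diversity patterns used in this experiment
patterns = [defocus_diversity(128, 64, d) for d in (-2, 0, 3)]
```

The retrieval quality is then evaluated by the Frobenius residual, e.g. np.linalg.norm(Phi_hat - Phi, 'fro') for an estimate Phi_hat.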

Conclusion
We have developed a new ‖·‖_0-regularization scheme which is implicitly incorporated in the SROP algorithm for solving the sparse phase retrieval (SPR) problem. In each iteration of SROP, apart from the well-known steps of Fourier-transform-based phase retrieval algorithms, an additional sparsity constraint is applied to the phase to capture its sparsity property. This additional step is mathematically a rotation on the underlying space C^{n×n}, but with direction varying over the iterations. In the setting of SPR with a uniform amplitude distribution, we have proved that this step indeed corresponds to a projection onto an auxiliary set independent of the iterations. This result is rather unexpected since the ‖·‖_0-regularization applied to the phase on (−π, π]^{n×n} turns out to be equivalent to a metric projection on the complex Hilbert space C^{n×n}. As a result, the SROP algorithm is shown to be equivalent to the cyclic projections algorithm for solving a feasibility problem involving several auxiliary locally prox-regular sets. The convergence analysis of SROP is then rigorously obtained based on recent results for the cyclic projections algorithm. Our numerical results have demonstrated clear empirical advantages of the ‖·‖_0-regularization scheme. The overall message is that if the phase is known a priori to be sparse, then a sparsity constraint appropriately applied to the phase will improve the recovery quality. As a concrete example, we have illustrated these advantages on the SPR problem given only one PSF intensity image. To this end the primary research goal of this paper has been achieved.
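As a schematic illustration only (not the exact SROP update, whose rotation/projection formulation is derived in the paper), one GS-type sweep with hard thresholding of the phase, for a uniform amplitude χ, might look like:

```python
import numpy as np

def sparsify_phase(x, s):
    """Hard-threshold the phase of x at its s largest-magnitude entries,
    keeping the amplitude of x. A naive stand-in for the paper's sparsity
    regularization step, valid here only as an illustration."""
    phase = np.angle(x).ravel()
    keep = np.argpartition(np.abs(phase), -s)[-s:]
    sparse = np.zeros_like(phase)
    sparse[keep] = phase[keep]
    return np.abs(x) * np.exp(1j * sparse.reshape(x.shape))

def srop_iteration(x, b, chi, s):
    """One schematic sweep: impose the measured Fourier magnitude sqrt(b),
    restore the pupil amplitude chi, then sparsify the phase."""
    X = np.fft.fft2(x)
    X = np.sqrt(b) * np.exp(1j * np.angle(X))   # magnitude substitution
    x = np.fft.ifft2(X)
    x = chi * np.exp(1j * np.angle(x))          # amplitude constraint
    return sparsify_phase(x, s)
```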
To further demonstrate the potential of our approach in other application settings, we have also preliminarily shown the apparent advantage of our strategy for the sparse phase retrieval problem with multiple intensity images. An extensive numerical comparison to existing algorithms in this setting is beyond the scope of this paper and is left for future research.