
Inverse Problems 25 (2009) 105004 (26pp)    doi:10.1088/0266-5611/25/10/105004

Iterative total variation schemes for nonlinear inverse problems
Markus Bachmayr1 and Martin Burger2
1 RWTH Aachen, Aachen Institute for Advanced Study in Computational Engineering Science, Schinkelstr. 2, D-52056 Aachen, Germany
2 Westfälische Wilhelms-Universität (WWU) Münster, Institut für Numerische und Angewandte Mathematik, Einsteinstr. 62, D-48149 Münster, Germany

E-mail: bachmayr@aices.rwth-aachen.de and martin.burger@wwu.de

Received 27 April 2009, in final form 13 July 2009
Published 16 September 2009
Online at stacks.iop.org/IP/25/105004

Abstract
In this paper we discuss the construction, analysis and implementation of iterative schemes for the solution of inverse problems based on total variation regularization. Via different approximations of the nonlinearity we derive three different schemes resembling three well-known methods for nonlinear inverse problems in Hilbert spaces, namely iterated Tikhonov, Levenberg–Marquardt and Landweber. These methods can be set up such that all arising subproblems are convex optimization problems, analogous to those appearing in image denoising or deblurring. We provide a detailed convergence analysis and appropriate stopping rules in the presence of data noise. Moreover, we discuss the implementation of the schemes and the application to distributed parameter estimation in elliptic partial differential equations.

1. Introduction

Variational methods based on penalization by total variation have become a popular and almost standard approach for the computation of discontinuous solutions of inverse problems (cf [ROF92, AV94, V02, LP99]). Due to the properties of the total variation functional, the reconstructions exhibit a spatially sparse gradient, i.e. they consist of large constant regions and sharp edges. These properties are very desirable for many inverse problems, where the unknowns describe densities or material functions changing in different regions or objects. Total variation reconstructions allow, in particular, a clear separation of objects.

Besides their advantages, total variation penalization methods suffer from several shortcomings. One of them is the difficulty of constructing efficient computational schemes for the minimization, due to the nonsmoothness of the total variation. Another is a loss of contrast in reconstructions that can be significant for ill-posed problems. Recently, a novel class of reconstruction schemes with a multi-scale nature has been proposed for total variation
0266-5611/09/105004+26$30.00 © 2009 IOP Publishing Ltd  Printed in the UK


approaches in imaging (cf [OBG+05, BGO+06, HBO06, HMO05]), which can overcome the aforementioned shortcomings. Instead of a single variational problem, an iterative scheme (or, in the limit, a continuous flow in pseudo-time) is used with an appropriate stopping criterion dependent on the data noise.

In this paper we investigate possible generalizations of this iterative approach to nonlinear inverse problems. For such nonlinear problems, iterative schemes are very natural, since some iterative approximation is usually needed in any case in order to deal with the nonlinearity. In the schemes we propose, the iterative approach to total variation reconstruction is directly combined with the approximation of the nonlinearity. The type of approximation then distinguishes three different methods, similar to three well-known schemes for nonlinear inverse problems (iterated Tikhonov, Levenberg–Marquardt, Landweber).

We mention that none of the schemes discussed here is restricted in its construction to total variation functionals. Indeed, the schemes can be formulated for all common convex regularization functionals, including quadratic functionals, where the standard iterations are recovered, and other nonsmooth functionals such as the ones used in wavelet shrinkage or other sparsity approaches (cf [DDD04, CDL+98]). The convergence analysis is formulated here for the case of total variation schemes. However, the basic strategy of the proofs is not restricted to this case and can also be adapted to other convex functionals with suitable properties.

In [KSS09], a slightly different construction was used independently to obtain iterative regularization methods for nonlinear ill-posed problems in Banach spaces that are closely related to those considered in this work. However, the methods and convergence results in [KSS09] are formulated under the assumption of a smooth and uniformly convex regularization functional, and therefore do not apply to total variation regularization.
We also mention that since the thesis [B07] was finished and this paper was drafted, the Landweber-type method proposed there has found applications in compressed sensing (cf [COS09a]) and frame-based deblurring (cf [COS09c]) under the name linearized Bregman iteration, and the convergence analysis has also been carried over to the case of ℓ¹-type penalization in finite dimensions (cf [YOG+08, COS09b]). Another interesting development in the spirit of Bregman iterations is the Split Bregman method, proposed as an efficient scheme for total variation regularization in the case of linear inverse problems (cf [GO08]). The Split Bregman method is just an application of the Bregman iteration to an augmented version of the variational problem. The schemes and results in this paper can provide a basis for generalizing Split Bregman iterations to nonlinear inverse problems.

2. Iteration schemes

Our basic setup in this paper is to consider ill-posed nonlinear operator equations of the form

    F(x) = y,    (1)

where F : D(F) ⊂ X → H, y ∈ H, for a Banach space X and a Hilbert space H. In practice, only noisy data y^δ ∈ H that are corrupted by numerical and measurement errors are available, where δ > 0 denotes the noise level. We will assume the existence of ŷ ∈ H with ‖y^δ − ŷ‖_H ≤ δ and F(x̂) = ŷ for some x̂ ∈ D(F).

The iterative algorithms that will be introduced below are motivated by variational regularization methods, where the regularized solution is obtained as a global minimizer of

    ½‖F(x) − y^δ‖²_H + αJ(x)    (2)


with a suitable convex regularization functional J : X → ℝ ∪ {+∞}. We are especially interested in the case of total variation regularization, where J is the seminorm

    J(x) = |x|_{BV(Ω)} = sup { ∫_Ω x div g : g ∈ C₀^∞(Ω, ℝⁿ), ‖g‖_∞ ≤ 1 }    (3)

on the space X = BV(Ω) of the functions of bounded variation on a domain Ω. Note that for functions in the Sobolev space W^{1,1}(Ω) the identity

    |x|_{BV(Ω)} = ∫_Ω |∇x|

holds. In a similar spirit are sparse reconstruction techniques with respect to some orthonormal basis {b_k}_{k=1}^∞ of X, which use an ℓ¹-norm for penalization, i.e.

    J(x) = Σ_{k=1}^∞ |⟨x, b_k⟩|.    (4)

A result of this choice is that almost all coefficients ⟨x, b_k⟩ will vanish. A typical example is wavelet coefficients, where the ℓ¹-norm is equivalent to the norm in an appropriate Besov space.

Note that the regularization functionals for total variation regularization and sparse reconstructions are nondifferentiable, and due to the nonlinearity of F the least-squares fitting term in (2) need not be convex, in particular for small α. Thus, the numerical solution of the corresponding nonconvex and nondifferentiable minimization problem can be quite expensive for nonlinear inverse problems. This issue is addressed by the methods studied in the present work.

A key ingredient for those iterative methods is the Bregman distance, which was introduced in [Bre67] and can be interpreted as a generalization of the mean-square distance to more general functionals J. A generalized Bregman distance for J of x, x̃ ∈ X can be defined as

    D^J_ξ(x, x̃) = J(x) − J(x̃) − ⟨ξ, x − x̃⟩

for a subgradient ξ ∈ ∂J(x̃). Note that for nonsmooth and not strictly convex functionals the Bregman distance is not a strict distance (i.e. it can be zero for x ≠ x̃), and it can be multivalued (i.e. for each choice of a subgradient a different distance will be obtained). In our work this issue will however be of less importance, since we only use the Bregman distance for penalizations and all the methods will choose a particular subgradient.

Our starting point is the following iterative regularization method for linear inverse problems recently introduced in [OBG+05]:

    x_{k+1} = arg min_{x ∈ BV(Ω)} { ½‖Kx − y^δ‖²_H + α_k D^J_{ξ_k}(x, x_k) },    (5a)
    ξ_{k+1} = ξ_k − α_k^{-1} K*(Kx_{k+1} − y^δ),    (5b)

where, in addition to our above assumptions, F(x) = Kx with a linear operator K ∈ L(X, H). Here α_k > 0 can be chosen a priori and large; it is not the regularization parameter. The role of the actual regularization parameter is played by the stopping index k*, determined by a modified discrepancy principle, at which the iteration is stopped. When the subdifferential of J is multivalued, which is the case for total variation regularization or sparse reconstructions, equation (5b) selects a specific subgradient ξ_{k+1} ∈ ∂J(x_{k+1}), which also lies in the range of the (smoothing) adjoint operator K*.
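In the quadratic case J(x) = ½‖x‖², the Bregman distance reduces to ½‖x − x_k‖², and each step of (5) becomes a nonstationary iterated Tikhonov update with the closed form x_{k+1} = (K*K + α_k I)^{-1}(K*y^δ + α_k x_k). The following small numerical sketch illustrates this special case together with a discrepancy-type stopping rule; the toy operator, data and parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Iteration (5) with J(x) = 0.5*||x||^2: the Bregman distance is
# 0.5*||x - x_k||^2, so each step is a nonstationary iterated Tikhonov update.
rng = np.random.default_rng(0)
K = rng.standard_normal((10, 20)) / np.sqrt(10)   # toy underdetermined operator
x_true = rng.standard_normal(20)
delta = 1e-2                                      # assumed noise level
y_delta = K @ x_true + delta * rng.standard_normal(10)

def iterated_tikhonov(K, y, alpha, tau, delta, max_iter=500):
    """Run (5) for J = 0.5*||x||^2, stopped by a discrepancy principle."""
    n = K.shape[1]
    x = np.zeros(n)
    res = [np.linalg.norm(K @ x - y)]
    for _ in range(max_iter):
        if res[-1] <= tau * delta:                # stop once residual <= tau*delta
            break
        # closed-form minimizer of 0.5*||K x - y||^2 + (alpha/2)*||x - x_k||^2
        x = np.linalg.solve(K.T @ K + alpha * np.eye(n), K.T @ y + alpha * x)
        res.append(np.linalg.norm(K @ x - y))
    return x, res

x_rec, res = iterated_tikhonov(K, y_delta, alpha=1.0, tau=2.0, delta=delta)
# residuals decrease monotonically until the stopping rule fires
assert all(r2 <= r1 + 1e-12 for r1, r2 in zip(res, res[1:]))
```

For nonsmooth J such as the TV seminorm the subproblem (5a) no longer has a closed form, but the same outer loop applies with the linear solve replaced by a TV-regularized minimization.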


In [OBG+05], special attention was paid to the case J = |·|_{BV(Ω)}, K = I, which leads to an iterative method for total variation denoising. For this particular case, several different motivations have been suggested, for instance as matching both grey level values and normal fields [OBG+05], and as a combined denoising and enhancing method [RS06]. The iteration turns out to cure a major shortcoming of standard total variation denoising by considerably reducing its systematic error, i.e. the reduction of contrast in the image. The method was also applied to image deblurring [HMO05] and extended to non-quadratic fitting terms [HBO06], wavelet-based denoising [XO06] and MR imaging [HCO+06].

For arbitrary K and J, the iteration (5) can also be regarded as a generalization of nonstationary iterated Tikhonov regularization. The latter is obtained by choosing J as the square of a Hilbert space norm, in which case (5a) and (5b) coincide (up to the Riesz isomorphism). This interpretation will be our starting point for the present work. We give three possible extensions of the idea to nonlinear operator equations, which can also be regarded as generalizations of certain well-known iterative regularization methods in a Hilbert space context.

In the case of linear operators, iterative methods such as the ones considered in this paper have recently become popular in compressed sensing (cf [COS09a]). Based on results on recovery of the sparsest solution (cf [CT06]), it has become classical in this context to minimize the ℓ¹-norm of coefficients in a basis, or the total variation, subject to a constraint Ax = y, where y ∈ ℝ^M with M typically rather small. Our approach yields a possibility of realizing such an approach for nonlinear constraints, i.e. the minimization of the regularization functional subject to a low number of nonlinear equations.

2.1. Iterated variational method

The first method we consider for the nonlinear case can be regarded as a generalization of nonlinear iterated Tikhonov regularization. The iterates are defined analogously to (5) by

    x_{k+1} ∈ arg min_{x ∈ D(F)} { ½‖F(x) − y^δ‖² + α_k D^J_{ξ_k}(x, x_k) },    (6a)
    ξ_{k+1} = ξ_k − α_k^{-1} F′(x_{k+1})*(F(x_{k+1}) − y^δ).    (6b)

Note that (6b) is an equation for a subgradient ξ_{k+1} ∈ ∂J(x_{k+1}) resulting from the first-order optimality condition corresponding to (6a). Under standard assumptions (see also section 3), well-definedness of the iterates, i.e. existence, uniqueness and stability of the minimization problems to be solved in each step, can be verified using the same arguments as for (2), cf [RS06].

At first glance it is not obvious how this scheme provides any computational advantage compared to standard total variation regularization; on the contrary, it seems that a single nonlinear variational problem is replaced by the solution of a sequence of problems of the same type. However, with the choice of an appropriate regularization functional and a sufficiently large α_k, the variational problem to be solved in each iteration can be made locally convex around x_k, so that the global minimum can indeed be computed by local descent methods. This property cannot be guaranteed by using the total variation functional only for penalization, but e.g. by adding a multiple of the squared L²-norm, which however should not change the smoothing properties of the scheme. In our numerical experiments below we shall verify that this scheme also leads to improved results compared to the standard variational method.

If J(x) = (κ/2)‖x‖² + J₁(x) for some Hilbert space norm, κ > 0 and some convex regularization functional J₁ (e.g. total variation), then (6a) can be rewritten equivalently as

    x_{k+1} ∈ arg min_{x ∈ D(F)} { ½‖F(x) − y^δ‖² + (α_k κ/2)‖x − κ^{-1}ξ_k‖² + α_k J₁(x) }.
Hence, the problem in each iteration step can be written as a minimization with a standard regularization functional.

2.2. Levenberg–Marquardt-type method

In each step of the iterated variational scheme above, some approximation of F will be necessary in order to solve (6a). Hence, one could also consider variations of the scheme by approximating F directly in each iteration step. A first possibility is to approximate the operator by its linearization at the last iterate in each step, which leads to the familiar Levenberg–Marquardt method in a Hilbert space context. In our case we obtain the scheme

    x_{k+1} = arg min_{x ∈ D(F)} { ½‖F(x_k) + F′(x_k)(x − x_k) − y^δ‖² + α_k D^J_{ξ_k}(x, x_k) },    (7a)
    ξ_{k+1} = ξ_k − α_k^{-1} F′(x_k)*(F(x_k) + F′(x_k)(x_{k+1} − x_k) − y^δ).    (7b)
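In the special case J₁ = 0 and J(x) = (κ/2)‖x‖² with κ = 1, one has ξ_k = x_k, and a step of (7) reduces to a classical damped Gauss–Newton update obtained from a single linear solve. A minimal numerical sketch of this special case; the operator F, its Jacobian and all parameter values are illustrative assumptions, not from the paper:

```python
import numpy as np

# One step of (7) with J(x) = 0.5*||x||^2 (kappa = 1, J1 = 0): then xi_k = x_k
# and the subproblem minimizer solves (Kk^T Kk + alpha*I) x = Kk^T d_k + alpha*x_k.
rng = np.random.default_rng(1)
n = 5
A = np.eye(n) + 0.3 * rng.standard_normal((n, n))

def F(x):                        # illustrative smooth nonlinear forward operator
    return A @ x + 0.1 * x**3

def Fprime(x):                   # its Jacobian
    return A + 0.3 * np.diag(x**2)

x_true = 0.5 * rng.standard_normal(n)
y = F(x_true)                    # exact data (delta = 0)

x, alpha = np.zeros(n), 0.1
res = [np.linalg.norm(F(x) - y)]
for _ in range(50):
    Kk = Fprime(x)
    dk = y - F(x) + Kk @ x       # data of the linearized problem
    x = np.linalg.solve(Kk.T @ Kk + alpha * np.eye(n), Kk.T @ dk + alpha * x)
    res.append(np.linalg.norm(F(x) - y))

assert res[-1] < res[0]          # the residual decreases overall
```

With a nonsmooth J₁ the linear solve is replaced by a convex subproblem of the type (5a), but the outer linearization loop is unchanged.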

For this Levenberg–Marquardt-type method, a convex problem has to be solved in each step, where the only nonlinearity comes from the regularization functional. The convex problem in each step of the iteration is of the same type as in (5a) and (5b), and therefore well-known and efficient numerical methods for these subproblems are available, e.g. methods based on duality in the case of total variation. Moreover, the well-posedness of the variational problem in (7a) follows with the same arguments as for (5a), cf [OBG+05].

If again J(x) = (κ/2)‖x‖² + J₁(x), we can equivalently rewrite (7a) as

    x_{k+1} = arg min_{x ∈ D(F)} { ½‖F′(x_k)x − d_k‖² + (α_k κ/2)‖x − κ^{-1}ξ_k‖² + α_k J₁(x) }

with d_k = y^δ − F(x_k) + F′(x_k)x_k.

2.3. Landweber-type method

A further simplification of each step can be achieved by linearization of the least-squares functional, which leads to
    x_{k+1} = arg min_{x ∈ D(F)} { ⟨F′(x_k)*(F(x_k) − y^δ), x − x_k⟩ + α_k D^J_{ξ_k}(x, x_k) },    (8a)
    ξ_{k+1} = ξ_k − α_k^{-1} F′(x_k)*(F(x_k) − y^δ).    (8b)

This method reduces to Landweber iteration in the Hilbert space case. If ∂J is single-valued, it is essentially the same as the algorithm described and analysed in [SLS06] for linear inverse problems and in [KSS09] for nonlinear problems, in both cases under the assumption that J is a norm on a smooth and uniformly convex Banach space. Note that the well-posedness of the variational problem in (8a) follows by the same considerations as for image denoising with iterated total variation methods, cf [OBG+05].

Concerning implementation, the Landweber-type method is the most straightforward of the three schemes discussed. It can be realized in two subsequent steps: first, the update of the subgradient (8b) can be performed, which requires the same effort as the Landweber iteration in Hilbert spaces; only F and the adjoint of F′ have to be evaluated. Subsequently, (8a) can be solved, which is a problem similar to image denoising, independent of the operator F. In particular, if again J(x) = (κ/2)‖x‖² + J₁(x), we can equivalently rewrite (8a) as

    x_{k+1} = arg min_{x ∈ D(F)} { (κ/2)‖x − κ^{-1}g_k‖² + J₁(x) }

with g_k = α_k^{-1} F′(x_k)*(y^δ − F(x_k)) + ξ_k.
2.4. Stopping rule

For noisy data, the methods have to be supplied with a suitable stopping rule. It turns out that, similarly to the corresponding methods in Hilbert spaces or to the case of linear operators (5), this can be achieved using modified versions of Morozov's discrepancy principle, i.e. we assume the iterative methods to be stopped at the index k*(δ, y^δ) defined by

    k* = min{ k : ‖F(x_k) − y^δ‖_H ≤ τδ }    (9)

with a constant τ > 1. We will formulate further conditions on τ for each method as part of our convergence results below. The regularized solution is given by the iterate x_{k*}. We finally mention that for each of these methods, given that dom J = {x ∈ X : J(x) < ∞} ⊂ D(F), the sequence ξ_{k+1} generated by (6b), (7b) and (8b), respectively, satisfies ξ_{k+1} ∈ ∂J(x_{k+1}).

3. Convergence analysis

To obtain a first convergence analysis, we restrict ourselves to the particular case X = L²(Ω), Ω ⊂ ℝ² a Lipschitz domain, and

    J(x) = (κ/2)‖x‖²_{L²(Ω)} + |x|_{BV(Ω)} + χ_{D(F)}(x),    (10)

with some κ > 0, where we set

    J₂(x) = (κ/2)‖x‖²_{L²(Ω)},    J₁(x) = |x|_{BV(Ω)} + χ_{D(F)}(x).

We assume D(F) to be convex, which ensures that J₁ and J are convex. By [ET99], any ξ ∈ ∂J(x) can be decomposed as ξ = κx + p, p ∈ ∂J₁(x). For simplicity, we set κ = 1; for any different choice of κ > 0, the arguments remain valid with appropriate changes to constants. Without further notice, we shall make the following assumptions on the operator F:

Assumptions 1. Let F : D(F) ⊂ L²(Ω) → H be continuous, weakly sequentially closed, i.e. for any sequence {x_n} ⊂ D(F), x_n ⇀* x in BV(Ω) and F(x_n) ⇀ y imply x ∈ D(F) and F(x) = y, and Fréchet differentiable with F′(·) locally bounded on the closed and convex set D(F).

To simplify notation, the norm on the image space H will be denoted by ‖·‖.

3.1. Monotonicity

In the following we first verify the monotonicity of the residuals in the three iteration schemes, which can be shown under very general conditions. A particularly straightforward case is the iterated Tikhonov-type method (6):


Proposition 1. Let x_k ∈ D(F) and α_k > 0. Then x_{k+1} defined by the iterated Tikhonov method (6) satisfies

    ‖F(x_{k+1}) − y^δ‖² − ‖F(x_k) − y^δ‖² ≤ −α_k ‖x_{k+1} − x_k‖²_{L²(Ω)}.

Proof. The definition of x_{k+1} as a minimizer of the functional in (6a), in comparison with the functional value at x_k, directly implies the assertion. □

In the case of the Levenberg–Marquardt method (7), monotonicity of the residual cannot be shown for small α_k, but it can under a lower bound on α_k.

Proposition 2. Let x_k ∈ D(F) and let F and F′ be Lipschitz continuous with respect to the L²-norm in B_R(x_k) ∩ D(F) for some R > 0. Then there exists α_k > 0 such that x_{k+1} defined by the Levenberg–Marquardt method (7) satisfies

    ‖F(x_{k+1}) − y^δ‖² − ‖F(x_k) − y^δ‖² ≤ −(α_k/2) ‖x_{k+1} − x_k‖²_{L²(Ω)}.

Proof. From the local Lipschitz continuity of F and F′ we find that for some C₀, C₁ > 0, the estimates

    ‖F(x_{k+1}) − F(x_k)‖ ≤ C₀ ‖x_{k+1} − x_k‖_{L²(Ω)},    (11a)
    ‖F(x_{k+1}) − F(x_k) − F′(x_k)(x_{k+1} − x_k)‖ ≤ C₁ ‖x_{k+1} − x_k‖²_{L²(Ω)}    (11b)

hold. Let α_k ≥ 4(2C₀² + C₁‖F(x_k) − y^δ‖). Comparing the value of the minimizer x_{k+1} in (7a) with the functional value at x_k we find

    ½‖F(x_k) + F′(x_k)(x_{k+1} − x_k) − y^δ‖² + α_k D^J_{ξ_k}(x_{k+1}, x_k) ≤ ½‖F(x_k) − y^δ‖².

The first term can be estimated as

    ‖F(x_k) + F′(x_k)(x_{k+1} − x_k) − y^δ‖²
      = ‖F(x_{k+1}) − y^δ‖² + 2⟨F(x_{k+1}) − y^δ, F(x_k) + F′(x_k)(x_{k+1} − x_k) − F(x_{k+1})⟩ + ‖F(x_k) + F′(x_k)(x_{k+1} − x_k) − F(x_{k+1})‖²
      ≥ ‖F(x_{k+1}) − y^δ‖² − 2‖F(x_{k+1}) − F(x_k)‖ ‖F(x_k) + F′(x_k)(x_{k+1} − x_k) − F(x_{k+1})‖ − 2‖F(x_k) − y^δ‖ ‖F(x_k) + F′(x_k)(x_{k+1} − x_k) − F(x_{k+1})‖
      ≥ ‖F(x_{k+1}) − y^δ‖² − 4C₀² ‖x_{k+1} − x_k‖²_{L²(Ω)} − 2C₁‖F(x_k) − y^δ‖ ‖x_{k+1} − x_k‖²_{L²(Ω)}.

Finally, the Bregman distance is estimated from below by

    α_k D^J_{ξ_k}(x_{k+1}, x_k) ≥ (α_k/2) ‖x_{k+1} − x_k‖²_{L²(Ω)},

and with the condition on α_k this implies the assertion. □

Finally, we verify the descent of the residual in the case of the Landweber-type method (8).

Proposition 3. Let x_k ∈ D(F) and let F and F′ be Lipschitz continuous with respect to the L²-norm in B_R(x_k) ∩ D(F) for some R > 0. Then there exists α_k > 0 such that x_{k+1} defined by the Landweber method (8) satisfies

    ‖F(x_{k+1}) − y^δ‖² − ‖F(x_k) − y^δ‖² ≤ −α_k ‖x_{k+1} − x_k‖²_{L²(Ω)}.


Proof. As above, we have the estimates (11). Let α_k ≥ C₀² + 2C₁‖F(x_k) − y^δ‖ and ξ_k = x_k + p_k with p_k ∈ ∂J₁(x_k). By convexity of J₁, ⟨x_{k+1} − x_k, p_{k+1} − p_k⟩ ≥ 0, and combining this with the first-order optimality condition for (8a) we obtain

    ⟨x_{k+1} − x_k, F′(x_k)*(F(x_k) − y^δ)⟩ ≤ −α_k ‖x_{k+1} − x_k‖².

Furthermore, we have

    ‖F(x_{k+1}) − y^δ‖² − ‖F(x_k) − y^δ‖²
      = ‖F(x_{k+1}) − F(x_k)‖² + 2⟨F(x_k) − y^δ, F(x_{k+1}) − F(x_k)⟩
      ≤ C₀² ‖x_{k+1} − x_k‖² + 2⟨F′(x_k)*(F(x_k) − y^δ), x_{k+1} − x_k⟩ + 2‖F(x_k) − y^δ‖ ‖F(x_{k+1}) − F(x_k) − F′(x_k)(x_{k+1} − x_k)‖,

and hence

    ‖F(x_{k+1}) − y^δ‖² − ‖F(x_k) − y^δ‖² ≤ (−2α_k + C₀² + 2C₁‖F(x_k) − y^δ‖) ‖x_{k+1} − x_k‖²_{L²(Ω)},

which implies the assertion. □

We will use the descent of the residual as a motivation for a heuristic selection criterion for α_k in our numerical examples for the method (8).

3.2. Basic properties

In the following we discuss some preliminary results needed in the further convergence analysis of the three iterative schemes. We will use the following identity for Bregman distances, which was also employed in the convergence analysis of iterative methods e.g. in [CT93] and [OBG+05]: let x, x̃, x̂ ∈ X, ξ̃ ∈ ∂J(x̃), ξ̂ ∈ ∂J(x̂); then

    D_ξ̃(x, x̃) − D_ξ̂(x, x̂) + D_ξ̂(x̃, x̂) = ⟨ξ̂ − ξ̃, x − x̃⟩.    (12)
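The identity (12) follows directly by expanding the three Bregman distances from the definition. The following small numeric check, for a finite-dimensional instance of (10) with the BV seminorm replaced by an ℓ¹-norm and arbitrarily chosen points (all values are illustrative), confirms it:

```python
import numpy as np

# Numeric check of the three-point identity (12) for J(x) = 0.5*||x||^2 + ||x||_1.
rng = np.random.default_rng(3)

def J(x):
    return 0.5 * np.dot(x, x) + np.sum(np.abs(x))

def subgrad(x):
    # x + sign(x) is one valid choice of subgradient of J at x
    return x + np.sign(x)

def bregman(x, z, xi_z):
    return J(x) - J(z) - np.dot(xi_z, x - z)

x = rng.standard_normal(4)
z = rng.standard_normal(4)       # plays the role of x-tilde in (12)
w = rng.standard_normal(4)       # plays the role of x-hat in (12)
xi_z, xi_w = subgrad(z), subgrad(w)

lhs = bregman(x, z, xi_z) - bregman(x, w, xi_w) + bregman(z, w, xi_w)
rhs = np.dot(xi_w - xi_z, x - z)
assert abs(lhs - rhs) < 1e-12
```

The identity is purely algebraic, so it holds for any choice of subgradients, which is what the multivalued case of the convergence analysis relies on.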

We denote the Bregman distance corresponding to J by D_ξ(x, x̃) in what follows. For any y ∈ H, let S(y) = {x ∈ D(F) : F(x) = y}. We make assumptions on the nonlinear operator F that are rather common in the convergence analysis of iterative regularization methods.

Assumptions 2. We assume that F satisfies a nonlinearity condition of the form

    ‖F(x̃) − F(x) − F′(x)(x̃ − x)‖ ≤ η ‖x̃ − x‖_{L²(Ω)} ‖F(x̃) − F(x)‖,    x, x̃ ∈ B_ρ(x̂) ∩ D(F),    (13)

for some η, ρ > 0, where x̂ ∈ S(ŷ) ∩ dom J and B_ρ(x̂) denotes the open ball around x̂ of radius ρ in L²(Ω).

It has to be mentioned that condition (13), restricting the nonlinearity of F, is a rather severe one; see [EHN96] for further details. Although there are a number of examples for which it can be verified, such as distributed parameter identification problems, it remains open for many problems of practical interest, e.g. parameter identification from boundary measurements.

As usual for nonlinear problems, we can only expect local convergence of the above algorithms; hence the starting values x₀ (also in relation with ξ₀) will need to be close enough to x̂ in an appropriate sense. In our convergence analysis, it will turn out that the Bregman distance D_{ξ₀}(x̂, x₀) has to be small. In the following lemma we make sure that starting values with an arbitrarily small Bregman distance to x̂ indeed exist.

Lemma 4. Let x̂ ∈ BV(Ω) ∩ D(F). For α > 0, let x^α ∈ BV(Ω) ∩ D(F) be defined by

    x^α = arg min_{x ∈ D(F)} { α|x|_{BV(Ω)} + ‖x − x̂‖²_{L²(Ω)} }.


Then x^α → x̂ in L²(Ω) as α → 0, and for any γ > 0 there are an α > 0 and a ξ^α ∈ ∂J(x^α) such that D^J_{ξ^α}(x̂, x^α) < γ.

Proof. By definition of x^α, for any α > 0 we have

    α|x^α|_{BV(Ω)} + ‖x^α − x̂‖²_{L²(Ω)} ≤ α|x̂|_{BV(Ω)}.    (14)

This implies x^α → x̂ in L²(Ω) and lim sup_{α→0} |x^α|_{BV(Ω)} ≤ |x̂|_{BV(Ω)}. Let α_k → 0 and x_k := x^{α_k}; then by lower semicontinuity of the BV seminorm we have

    |x̂|_{BV(Ω)} ≤ lim inf_k |x_k|_{BV(Ω)} ≤ lim sup_k |x_k|_{BV(Ω)} ≤ |x̂|_{BV(Ω)},

i.e. |x_k|_{BV(Ω)} → |x̂|_{BV(Ω)} as k → ∞. Together with (14), this implies α_k^{-1}‖x_k − x̂‖²_{L²(Ω)} → 0.

For any k we have a p_k ∈ ∂J₁(x_k) such that α_k p_k + 2(x_k − x̂) = 0. Hence we obtain a subgradient

    ξ_k = p_k + x_k = −2α_k^{-1}(x_k − x̂) + x_k ∈ ∂J(x_k).

Combining this with the decay of ‖x_k − x̂‖ in L²(Ω), we obtain

    D^J_{ξ_k}(x̂, x_k) = |x̂|_{BV(Ω)} − |x_k|_{BV(Ω)} + ½(‖x̂‖²_{L²(Ω)} − ‖x_k‖²_{L²(Ω)}) − ⟨x_k, x̂ − x_k⟩ − 2α_k^{-1}‖x_k − x̂‖²_{L²(Ω)} → 0,

which proves the assertion. □

Note that the sequence of regularization parameters {α_k}, as well as the sequence of iterates {x_k}, depends on δ. We shall use the abbreviations y_k := F(x_k), K_k := F′(x_k), r^δ_k := F(x_k) − y^δ, r̂_k := F(x_k) − ŷ where appropriate to simplify notation.

Under the assumptions stated above, we show weak* convergence in BV(Ω) as δ → 0 of the methods (6), (7) and (8). In all three cases, the basic strategy is similar to the one in [OBG+05]. We restrict ourselves to results on semiconvergence for δ > 0 under the above stopping rule; for 'exact data' with δ = 0 one can show convergence of the full sequence of iterates by basically the same techniques. These results should rather be regarded as a first step, because we have to make use of the Hilbert space structure of L²(Ω) in dealing with the nonlinearity of F, the methods themselves being applicable in a more general Banach space setting. On the other hand, we obtain a much stronger type of convergence than convergence in L²(Ω).

3.3. Convergence of the iterated variational method

We begin with (6); in this case our assumptions are rather restrictive in comparison to the ones necessary for the stationary case (2) (if x₀ = 0, the first step of the method actually coincides with (2)). To the best of our knowledge, no analogous result for nonlinear iterated Tikhonov regularization in a Hilbert space setting is available in the literature, where this method is usually considered for a fixed number of steps and variable regularization parameters, which allows for weaker assumptions in the convergence analysis.

We start with a fundamental monotonicity result for the error.

Lemma 5. If for given iterates x_k, ξ_k a minimizer x_{k+1} of (6a) satisfies

    ‖y^δ − F(x_{k+1}) − F′(x_{k+1})(x̂ − x_{k+1})‖ ≤ β ‖y^δ − F(x_{k+1})‖,    0 < β < 1,    (15)

we have ‖y^δ − F(x_{k+1})‖ ≤ ‖y^δ − F(x_k)‖ as well as

    D_{ξ_{k+1}}(x̂, x_{k+1}) − D_{ξ_k}(x̂, x_k) ≤ −((1 − β)/α_k) ‖y^δ − F(x_{k+1})‖².

Proof. Monotonicity of the residuals follows directly from the definition of the method. By (12),

    D_{ξ_{k+1}}(x̂, x_{k+1}) − D_{ξ_k}(x̂, x_k) + D_{ξ_k}(x_{k+1}, x_k) = ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩.

Using (6b) we obtain

    ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩ = α_k^{-1} ⟨r^δ_{k+1}, K_{k+1}(x̂ − x_{k+1})⟩
      = −α_k^{-1} ‖r^δ_{k+1}‖² + α_k^{-1} ⟨r^δ_{k+1}, r^δ_{k+1} + K_{k+1}(x̂ − x_{k+1})⟩
      ≤ −α_k^{-1} ‖r^δ_{k+1}‖² + α_k^{-1} ‖r^δ_{k+1}‖ ‖r^δ_{k+1} + K_{k+1}(x̂ − x_{k+1})‖.

By assumption (15), ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩ ≤ −α_k^{-1}(1 − β) ‖r^δ_{k+1}‖². □

The main result of this section is the (semi-)convergence of iterated variational methods.

Theorem 6. Let γ < min{1/η, ρ/2} and 0 < α(δ) ≤ α_k ≤ ᾱ, where δ²/α(δ) < 3γ²/4, and let the starting values x₀ ∈ D(F) ∩ BV(Ω), ξ₀ ∈ L²(Ω) satisfy D_{ξ₀}(x̂, x₀) < γ²/8. Let δ_m > 0, {δ_m} → 0 with corresponding stopping indices {k*_m}, where

    τ > (1 + ηγ)/(1 − ηγ).    (16)

Then for every δ_m the stopping index is finite, and every subsequence of {x_{k*_m}} has a subsequence converging to some x̃ ∈ S(ŷ) in the weak* topology of BV(Ω). If furthermore S(ŷ) ∩ B_ρ(x̂) = {x̂}, then x_{k*_m} ⇀* x̂ in BV(Ω).

Proof. Take δ > 0 arbitrary but fixed and let k* be the corresponding stopping index, which at this point can possibly be infinite.

Assume k < k* − 1 and D_{ξ_k}(x̂, x_k) < γ²/8. By definition of the iterates,

    ½τ²δ² + α_k D_{ξ_k}(x_{k+1}, x_k) < ½‖r^δ_{k+1}‖² + α_k D_{ξ_k}(x_{k+1}, x_k) ≤ ½δ² + α_k D_{ξ_k}(x̂, x_k).

Hence D_{ξ_k}(x_{k+1}, x_k) ≤ D_{ξ_k}(x̂, x_k), and in particular ‖x_{k+1} − x_k‖_{L²(Ω)} ≤ √(2D_{ξ_k}(x̂, x_k)), which combined with the same estimate for ‖x̂ − x_k‖_{L²(Ω)} yields

    ‖x̂ − x_{k+1}‖_{L²(Ω)} ≤ 2√(2D_{ξ_k}(x̂, x_k)) < γ.

Thus, by (13) we can apply lemma 5 to obtain D_{ξ_{k+1}}(x̂, x_{k+1}) ≤ D_{ξ_k}(x̂, x_k), which by induction implies ‖x̂ − x_{k+1}‖_{L²(Ω)} < γ for any k < k* − 1.

Using (13) and ‖r^δ_{k+1}‖ ≥ τδ, we can verify the required nonlinearity condition (15) for noisy data for all k < k* − 1:

    ‖r^δ_{k+1} + K_{k+1}(x̂ − x_{k+1})‖ ≤ δ + ‖r̂_{k+1} + K_{k+1}(x̂ − x_{k+1})‖ ≤ δ + ηγ‖r̂_{k+1}‖ ≤ (1 + ηγ)δ + ηγ‖r^δ_{k+1}‖ ≤ (τ^{-1}(1 + ηγ) + ηγ)‖r^δ_{k+1}‖,

where β := τ^{-1}(1 + ηγ) + ηγ < 1 by our choice of τ. Hence for τ as in (16), the assumption (15) of lemma 5 is satisfied for k < k* − 1. By lemma 5 we obtain

    D_{ξ_{k*−1}}(x̂, x_{k*−1}) + Σ_{k=0}^{k*−2} ((1 − β)/α_k) ‖r^δ_{k+1}‖² ≤ D_{ξ₀}(x̂, x₀).

Now for given δ, k* is necessarily finite, because

    (k* − 1)τ²δ² / max_{k ≤ k*−2} α_k ≤ Σ_{k=0}^{k*−2} α_k^{-1} ‖r^δ_{k+1}‖² ≤ D_{ξ₀}(x̂, x₀)/(1 − β).    (17)


Again by definition of the iterates,

    α_{k*−1} D_{ξ_{k*−1}}(x_{k*}, x_{k*−1}) ≤ ½δ² + α_{k*−1} D_{ξ_{k*−1}}(x̂, x_{k*−1}),

and hence

    ‖x_{k*} − x_{k*−1}‖²_{L²(Ω)} ≤ δ²/α_{k*−1} + γ²/4.

Since δ²/α(δ) < 3γ²/4, this implies ‖x_{k*} − x̂‖_{L²(Ω)} ≤ 2γ < ρ. Using convexity of J and expanding the definition of ξ_{k*},

    J(x_{k*}) ≤ J(x̂) + Σ_{k=1}^{k*} α_{k−1}^{-1} ⟨r^δ_k, K_k(x̂ − x_{k*})⟩ + ρ‖ξ₀‖_{L²(Ω)}.

For k < k*, by (15), (13) and monotonicity of ‖r^δ_k‖ we have

    ⟨r^δ_k, K_k(x̂ − x_{k*})⟩ ≤ ‖r^δ_k‖ (‖K_k(x̂ − x_k)‖ + ‖K_k(x_{k*} − x_k)‖)
      ≤ ‖r^δ_k‖ ((1 + β)‖r^δ_k‖ + (1 + 3ηγ)‖y_k − y_{k*}‖) ≤ (3 + β + 3ηρ)‖r^δ_k‖².

For the remaining term in the sum (k = k*), we have

    ⟨r^δ_{k*}, K_{k*}(x̂ − x_{k*})⟩ ≤ τδ(1 + ηρ)‖r̂_{k*}‖ ≤ τδ²(1 + τ)(1 + ηρ).

Combining the above, we obtain

    J(x_{k*}) ≤ J(x̂) + ((3 + β + 3ηρ)/(1 − β)) D_{ξ₀}(x̂, x₀) + τδ²(1 + τ)(1 + ηρ) + ρ‖ξ₀‖_{L²(Ω)},

and thus J(x_{k*}) is uniformly bounded for small δ.

We choose a sequence {δ_m} with corresponding stopping indices {k*_m} as in our assumption. We have ‖F(x_{k*_m}) − y^{δ_m}‖ ≤ τδ_m → 0, and hence ‖F(x_{k*_m}) − ŷ‖ → 0 as δ_m → 0 by definition of the stopping index.

As J(x_{k*_m}) is uniformly bounded and F is weakly sequentially closed, we obtain weak* convergence in BV(Ω) and weak convergence in L²(Ω) of a subsequence of any subsequence of x_{k*_m} to some x̃ ∈ S(ŷ). If the solution is unique in B_ρ(x̂), a subsequence-of-subsequence argument gives the convergence of x_{k*_m} to x̂ in the same sense. □

3.4. Convergence of the Levenberg–Marquardt-type method

The following analysis for (7) uses ideas from [Han97], where the Levenberg–Marquardt method in a Hilbert space setting was analysed as a regularization method. Again we start with a monotonicity result on the Bregman distance.

Lemma 7. Let the parameter α_k in (7) be chosen such that for some 0 < μ < 1,

    μ‖y^δ − F(x_k)‖ ≤ ‖y^δ − F(x_k) − F′(x_k)(x_{k+1} − x_k)‖ ≤ ‖y^δ − F(x_k)‖.    (18)

Additionally we assume that, for a ν > 1,

    ‖y^δ − F(x_k) − F′(x_k)(x̂ − x_k)‖ ≤ (μ/ν)‖y^δ − F(x_k)‖.    (19)

Then the iterates of the scheme (7) satisfy

    D_{ξ_{k+1}}(x̂, x_{k+1}) − D_{ξ_k}(x̂, x_k) ≤ −(μ²(ν − 1)/(α_k ν)) ‖y^δ − F(x_k)‖².    (20)

Proof. By (12),

    D_{ξ_{k+1}}(x̂, x_{k+1}) − D_{ξ_k}(x̂, x_k) + D_{ξ_k}(x_{k+1}, x_k) = ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩.

Using (7b) we obtain

    ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩ = −α_k^{-1} ⟨r^δ_k + K_k(x_{k+1} − x_k), K_k(x_{k+1} − x̂)⟩
      = −α_k^{-1} ⟨r^δ_k + K_k(x_{k+1} − x_k), [r^δ_k + K_k(x_{k+1} − x_k)] − [r^δ_k + K_k(x̂ − x_k)]⟩
      ≤ −α_k^{-1} ‖r^δ_k + K_k(x_{k+1} − x_k)‖ (‖r^δ_k + K_k(x_{k+1} − x_k)‖ − ‖r^δ_k + K_k(x̂ − x_k)‖).

Combined with (18) and (19), this yields

    ⟨ξ_{k+1} − ξ_k, x_{k+1} − x̂⟩ ≤ −(μ²(ν − 1)/(α_k ν)) ‖r^δ_k‖². □
In order to obtain a consistent convergence analysis, we make sure that for an appropriate parameter choice, condition (18) can indeed be fulfilled.

Lemma 8. Let x_k ∈ D(F) ∩ BV(Ω) and ξ_k ∈ L²(Ω), where ξ_k = x_k + p_k with p_k ∈ ∂J₁(x_k). For given 0 < μ < 1, condition (18) is satisfied if α_k > 0 is chosen such that

    α_k ≥ (‖F′(x_k)‖/(1 − μ)) ( q^δ_k + √( q^δ_k ((1 − μ)‖F′(x_k)‖ + q^δ_k) ) ),    (21)

where q^δ_k = ‖F′(x_k)*(F(x_k) − y^δ)‖_{L²(Ω)} / ‖F(x_k) − y^δ‖.

Proof. By convexity of J1 , xk+1 ? xk , pk+1 ? pk for (7a ), which reads we obtain

? δ αk (pk+1 ? pk ) + αk (xk+1 ? xk ) + Kk rk + Kk (xk+1 ? xk ) = 0,

? ? δ (Kk Kk + αk I)?1 ?Kk rk ? αk (pk+1 ? pk ) , pk+1 ? pk

0.

Using continuity of

? (Kk Kk

+ αk I)
L2 ( )

?1

we may conclude

αk + Kk 2 ? δ Kk rk L2 ( ) . αk Again using the optimality condition for (7a ) to solve for xk+1 ? xk , this yields the estimate αk (pk+1 ? pk ) αk + Kk αk Finally, by the second triangle inequality, xk+1 ? xk
L2 ( ) ?1 1+ αk δ rk + Kk (xk+1 ? xk ) δ rk ? Kk δ rk δ 1 ? qk 2 ? δ Kk rk L2 ( )

.

xk+1 ? xk Kk αk 1+

L2 ( )

αk + Kk αk

2

,

and with Kk δ δ q δ + qk (1 ? μ) Kk + qk , 1?μ k this yields the ?rst inequality in (18). The second one follows directly by comparing xk+1 to xk in the objective functional for (7a ). αk
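Condition (21) is explicit enough to be evaluated in an implementation once $\|F'(x_k)\|$ (or an estimate of it) and $\|q_k^\delta\|$ are available. A minimal numerical sketch; the closed form follows our reading of (21), and the function name is hypothetical:

```python
import numpy as np

def lm_alpha_lower_bound(K_norm, q_norm, mu):
    """Lower bound on the Levenberg-Marquardt parameter alpha_k from (21):
    K_norm ~ ||F'(x_k)||, q_norm ~ ||q_k^delta||_{L2}, 0 < mu < 1.
    For such alpha_k the linearized residual cannot drop below
    mu * ||F(x_k) - y^delta||, i.e. the first inequality in (18) holds."""
    return K_norm / (1.0 - mu) * (
        q_norm + np.sqrt(((1.0 - mu) * K_norm + q_norm) * q_norm))
```

In our reconstruction of the proof, the returned value is the positive root of the quadratic $(1-\mu)\alpha^2 - 2\|F'(x_k)\|\,\|q_k^\delta\|\,\alpha - \|F'(x_k)\|^3\|q_k^\delta\| = 0$ arising from the reverse triangle inequality estimate.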


With the above ingredients we can also prove the (semi-)convergence of the Levenberg–Marquardt-type method.

Theorem 9. Let $0 < \gamma < \min\{\mu/\eta, \rho\}$, $x_0 \in D(F) \cap BV(\Omega)$, $\xi_0 \in L^2(\Omega)$ such that $D_{\xi_0}(\bar x, x_0) < \gamma^2/2$, let $\alpha_k$ satisfy (21) and $\alpha_k \le \bar\alpha$, and let the stopping index be chosen with a $\tau$ such that
$$\tau > \frac{1 + \eta\gamma}{\mu - \eta\gamma}. \quad (22)$$
Then for given $\delta > 0$, the iterates of (7) are well defined for $k \le k^*$, where $k^*$ is finite. If $\delta_m > 0$, $\{\delta_m\} \to 0$ with corresponding stopping indices $k^*_m$, then every subsequence of $\{x_{k^*_m}\}$ has a subsequence converging to an $x \in S(y)$ in the weak* topology of $BV(\Omega)$. If $S(y) \cap B_\rho(\bar x) = \{\bar x\}$, then $x_{k^*_m} \rightharpoonup^* \bar x$ in $BV(\Omega)$.

Proof. If $\|\bar x - x_k\| < \gamma$ and $k < k^*$, i.e. $\tau\delta < \|r_k^\delta\|$, we have
$$\|r_k^\delta + K_k(\bar x - x_k)\| \le \delta + \eta\gamma\,\|\bar r_k\| \le (1 + \eta\gamma)\delta + \eta\gamma\,\|r_k^\delta\| \le \left(\frac{1 + \eta\gamma}{\tau} + \eta\gamma\right) \|r_k^\delta\|,$$
where $\bar r_k$ denotes the noise-free residual. As a consequence, (19) holds with $\nu = \mu\tau/(1 + (1+\tau)\eta\gamma) > 1$.

Again by induction, lemma 7 applies for any $k < k^*$, which gives $\|\bar x - x_k\|_{L^2(\Omega)} < \gamma$ for any $k \le k^*$. Summing the inequalities (20),
$$D_{\xi_{k^*}}(\bar x, x_{k^*}) + \sum_{k=0}^{k^*-1} \frac{\mu^2(\nu-1)}{\alpha_k \nu}\,\|r_k^\delta\|^2 \le D_{\xi_0}(\bar x, x_0),$$
and hence, for some $S$ independent of $\delta$,
$$\sum_{k=0}^{k^*-1} \frac{1}{\alpha_k}\,\|r_k^\delta\|^2 \le S.$$
It follows analogously to (17) that $k^*$ is finite for given $\delta > 0$.

To use a compactness argument, an estimate for $J(x_{k^*})$ independent of $\delta$ is required. Proceeding similarly to theorem 6 by expanding the definition of $\xi_{k^*}$,
$$|\langle \xi_{k^*},\, x_{k^*} - \bar x \rangle| \le \sum_{l=0}^{k^*-1} \frac{1}{\alpha_l}\,\bigl|\bigl\langle r_l^\delta + K_l(x_{l+1} - x_l),\, K_l(x_{k^*} - \bar x) \bigr\rangle\bigr| + \rho\,\|\xi_0\|_{L^2(\Omega)}.$$
For each $0 \le l \le k^*-1$, using that $\|r_{k^*}^\delta\| < \|r_l^\delta\|$ by definition of the stopping index,
$$\begin{aligned}
\bigl|\bigl\langle r_l^\delta + K_l(x_{l+1} - x_l),\, K_l(x_{k^*} - \bar x) \bigr\rangle\bigr|
&\le \|r_l^\delta\| \bigl( \|K_l(x_{k^*} - x_l)\| + \|K_l(\bar x - x_l)\| \bigr) \\
&\le \|r_l^\delta\| \bigl( (1 + 2\eta\gamma)\,\|y_{k^*} - y_l\| + (1 + \mu/\nu)\,\|r_l^\delta\| \bigr) \\
&\le (3 + 4\eta\gamma + \mu/\nu)\,\|r_l^\delta\|^2.
\end{aligned}$$
As a consequence,
$$J(x_{k^*}) \le J(\bar x) + (3 + 4\eta\gamma + \mu/\nu)\,S + \rho\,\|\xi_0\|_{L^2(\Omega)}.$$
Due to the stopping rule we have $\|\bar r_{k^*_m}\| \to 0$; thus the statement follows as in the proof of theorem 6. $\square$

3.5. Convergence of the Landweber-type method

We finally turn our attention to the Landweber-type method (8). The following results rely on estimates that are quite similar to the analysis of Landweber iteration in a Hilbert space context given in [HNS95].

Lemma 10. Let $x_k \in D(F) \cap BV(\Omega)$, $\xi_k \in L^2(\Omega)$, where $\xi_k = x_k + p_k$ with $p_k \in \partial J_1(x_k)$. Then (8a) has a unique minimizer $x_{k+1}$, and if $\|\bar x - x_k\|_{L^2(\Omega)} < \gamma < \rho$ and $\alpha_k$ is chosen such that $\alpha_k \ge (2\|F'(x_k)\|)^2$, then
$$D_{\xi_{k+1}}(\bar x, x_{k+1}) - D_{\xi_k}(\bar x, x_k) \le -(2\alpha_k)^{-1}\,\|F(x_k) - y^\delta\|\,\bigl((1 - 2\eta\gamma)\,\|F(x_k) - y^\delta\| - 2(1 + \eta\gamma)\delta\bigr).$$

Proof. Similar to lemma 8 we use the optimality condition for (8a), which reads
$$\alpha_k(p_{k+1} - p_k) + \alpha_k(x_{k+1} - x_k) + K_k^* r_k^\delta = 0,$$
as well as $\langle x_{k+1} - x_k,\, p_{k+1} - p_k \rangle \ge 0$, to obtain the estimate
$$\|p_{k+1} - p_k\|_{L^2(\Omega)} \le \alpha_k^{-1}\,\|K_k^* r_k^\delta\|_{L^2(\Omega)}. \quad (23)$$
By (12) we have
$$D_{\xi_{k+1}}(\bar x, x_{k+1}) - D_{\xi_k}(\bar x, x_k) \le \langle \xi_{k+1} - \xi_k,\, x_{k+1} - \bar x \rangle = \langle \xi_{k+1} - \xi_k,\, x_{k+1} - x_k \rangle + \langle \xi_{k+1} - \xi_k,\, x_k - \bar x \rangle.$$
Using (23), for the first term we obtain the bound
$$\langle \xi_{k+1} - \xi_k,\, x_{k+1} - x_k \rangle = \bigl\langle \alpha_k^{-1} K_k^* r_k^\delta,\, (p_{k+1} - p_k) + \alpha_k^{-1} K_k^* r_k^\delta \bigr\rangle \le \frac{2\|K_k\|^2}{\alpha_k^2}\,\|r_k^\delta\|^2.$$
Employing the nonlinearity condition (13), the second term can be estimated by
$$\begin{aligned}
\langle \xi_{k+1} - \xi_k,\, x_k - \bar x \rangle
&= -\alpha_k^{-1}\bigl\langle r_k^\delta,\, r_k^\delta - \bigl(r_k^\delta + K_k(\bar x - x_k)\bigr) \bigr\rangle \\
&\le -\alpha_k^{-1}\,\|r_k^\delta\|^2 + \alpha_k^{-1}\,\|r_k^\delta\|\,(\delta + \eta\gamma\,\|\bar r_k\|) \\
&\le -\alpha_k^{-1}\,\|r_k^\delta\|\,\bigl((1 - \eta\gamma)\,\|r_k^\delta\| - (1 + \eta\gamma)\delta\bigr).
\end{aligned}$$
Combining the two estimates, we arrive at the assertion. $\square$

Theorem 11. Let $0 < \gamma < \min\{1/(2\eta), \rho\}$, $x_0 \in D(F) \cap BV(\Omega)$, $\xi_0 \in L^2(\Omega)$ such that $D_{\xi_0}(\bar x, x_0) < \gamma^2/2$, let $\alpha_k$ satisfy $0 < \underline\alpha \le \alpha_k \le \bar\alpha$ and $\alpha_k \ge (2\|F'(x_k)\|)^2$, and let the stopping index be chosen with $\tau$ such that
$$\tau > \frac{2(1 + \eta\gamma)}{1 - 2\eta\gamma}. \quad (24)$$
Then for given $\delta > 0$, the iterates of (8) are well defined for $k \le k^*$, where $k^*$ is finite. If $\delta_m > 0$, $\{\delta_m\} \to 0$ with corresponding stopping indices $k^*_m$, then every subsequence of $\{x_{k^*_m}\}$ has a subsequence converging to an $x \in S(y)$ in the weak* topology of $BV(\Omega)$. If $S(y) \cap B_\rho(\bar x) = \{\bar x\}$, then $x_{k^*_m} \rightharpoonup^* \bar x$ in $BV(\Omega)$.
Proof. Using lemma 10, we again inductively obtain $\|\bar x - x_k\|_{L^2(\Omega)} < \gamma$ for any $k \le k^*$ and
$$D_{\xi_{k^*}}(\bar x, x_{k^*}) + \sum_{k=0}^{k^*-1} \frac{(1 - 2\eta\gamma)\tau - 2(1 + \eta\gamma)}{2\alpha_k \tau}\,\|r_k^\delta\|^2 \le D_{\xi_0}(\bar x, x_0),$$
where $(1 - 2\eta\gamma)\tau - 2(1 + \eta\gamma) > 0$ by (24). Now the statement follows analogously to the proof of theorem 9. $\square$

4. Application to parameter identification

In the following we will discuss the application of the total variation methods to parameter identification problems. In these problems one often seeks parameters that are close to piecewise constant functions (with unknown numbers of constants on unknown numbers of subdomains), where the constants model e.g. material parameters in regions of different composition. Here we will investigate two particular identification problems in elliptic partial differential equations with distributed measurements, in one case additionally with boundary measurements.

4.1. Identification of a reaction coefficient

Our first test problem can be shown to satisfy the assumptions of our convergence analysis. The problem consists in recovering $q$ from an observation $u^\delta$ of a true solution $u \in H^1(\Omega)$ of
$$-\Delta u + q u = f \quad \text{in } \Omega, \qquad u = g \quad \text{on } \partial\Omega, \quad (25)$$
where $\Omega \subset \mathbb{R}^2$ is a bounded domain with smooth boundary or a rectangle, $f \in L^2(\Omega)$ and $g \in H^{3/2}(\partial\Omega)$. The nonlinear operator $F\colon D(F) \subset L^2(\Omega) \to L^2(\Omega)$ is defined as $F(q) = u(q)$, where $u(q)$ is the solution of (25) for the parameter $q$.

This example is taken from [HNS95]. It can be shown that for some $\omega > 0$, $F$ is Fréchet differentiable with locally bounded derivative and weakly sequentially closed on
$$D(F) = \bigl\{q \in L^2(\Omega)\colon \|q - \hat q\|_{L^2(\Omega)} \le \omega \text{ for a } \hat q \in L^2(\Omega),\ \hat q \ge 0 \text{ a.e.}\bigr\},$$
and that $u(q) \in H^2(\Omega)$ for $q \in D(F)$.
Lemma 12. Let $q \in D(F)$ with $\|q\|_{L^2(\Omega)} \le M$ for some $M > 0$. Then there exists an $\eta > 0$ depending only on $\Omega$ and $M$ such that
$$\|F(\tilde q) - F(q) - F'(q)(\tilde q - q)\|_{L^2(\Omega)} \le \eta\,\|\tilde q - q\|_{L^2(\Omega)}\,\|F(\tilde q) - F(q)\|_{L^2(\Omega)}$$
for any $\tilde q \in D(F)$.

Proof. Let $q \in D(F)$ with $\|q\|_{L^2(\Omega)} \le M$. Let $A(q)\colon H^2(\Omega) \cap H^1_0(\Omega) \to L^2(\Omega)$, $u \mapsto -\Delta u + qu$. For $\psi \in L^2(\Omega)$, let $\nu := A(q)^{-1}\psi$. Since $q\nu \in L^{3/2}(\Omega)$, the imbeddings $H^1_0(\Omega) \hookrightarrow L^6(\Omega)$ and $W^{2,3/2}_0(\Omega) \hookrightarrow L^\infty(\Omega)$ in addition to regularity for the Poisson equation give
$$\|\nu\|_{L^\infty(\Omega)} \le C\,\|\nu\|_{W^{2,3/2}_0(\Omega)} \le C\,\|\psi - q\nu\|_{L^{3/2}(\Omega)} \le C\bigl(\|\psi\|_{L^2(\Omega)} + \|q\|_{L^2(\Omega)}\,\|\nu\|_{L^6(\Omega)}\bigr) \le C_M\,\|\psi\|_{L^2(\Omega)}.$$
Let $\tilde q \in D(F)$ and $h := \tilde q - q$. With $w \in H^2(\Omega) \cap H^1_0(\Omega)$ defined by
$$w := -A(q)^{-1}\bigl(h\,u(q+h) - h\,u(q)\bigr),$$
we have $w = F(q+h) - F(q) - F'(q)h$. For any $\varphi \in L^2(\Omega)$ with $\|\varphi\|_{L^2(\Omega)} = 1$,
$$|\langle w, \varphi \rangle_{L^2(\Omega)}| \le \bigl|\bigl\langle h\,u(q+h) - h\,u(q),\, A(q)^{-1}\varphi \bigr\rangle_{L^2(\Omega)}\bigr| \le \|h\,u(q+h) - h\,u(q)\|_{L^1(\Omega)}\,\|A(q)^{-1}\varphi\|_{L^\infty(\Omega)} \le C_M\,\|h\|_{L^2(\Omega)}\,\|u(q+h) - u(q)\|_{L^2(\Omega)}.$$
Consequently,
$$\|w\|_{L^2(\Omega)} \le \eta\,\|\tilde q - q\|_{L^2(\Omega)}\,\|F(\tilde q) - F(q)\|_{L^2(\Omega)}$$
with $\eta = C_M$. $\square$

This implies condition (13). In summary, the convergence results of section 3 apply to problem (25). The results also hold for the simpler case of the analogue of (25) on an interval $[a, b] \subset \mathbb{R}$ with Dirichlet boundary conditions.

We finally mention that convergence rates for the standard variational method (2) in the Bregman distance corresponding to the regularization functional have been obtained in [RS06]. In particular, the authors demonstrated the results to be applicable to (25) with $J$ as in section 3.

As a second test problem, we consider a one-dimensional version of (25) with boundary measurements, i.e. identification of a parameter $q\colon [0,1] \to \mathbb{R}$ from boundary values $u_i(1)$ of $u_i \in H^1([0,1])$, $i = 1, \ldots, n_s$, solving
$$-u_i'' + q u_i = f_i \quad \text{in } (0,1), \qquad u_i(0) = u_0 \in \mathbb{R}, \qquad u_i'(1) = 0, \quad (26)$$
where $f_i \in L^2([0,1])$ are the given sources. Though this is an instance of a problem where condition (13) cannot be verified, it is nonetheless of interest how the schemes behave. In particular, we shall also take care of the case of a small number of measurements $n_s$ (corresponding to few sources) to test a nonlinear version of compressed sensing.

4.2. Identification of a diffusion coefficient

We now turn to a problem of identification of a coefficient of a higher order term, which leads to additional complications because the regularity of the coefficient has more impact on the regularity of the solution. Here we want to reconstruct $q$ from a solution $u \in H^1_0(\Omega)$ of
$$-\mathrm{div}(q\,\nabla u) = f, \quad (27)$$
where $\Omega \subset \mathbb{R}^2$ is convex and $f \in L^2(\Omega)$. It can be shown that the parameter-to-solution map $F(q) = u(q)$ is continuous and Fréchet differentiable with locally bounded derivative on $\{q \in L^\infty(\Omega)\colon q \ge \underline q \text{ a.e.}\}$ for any $\underline q > 0$. Furthermore, with the additional constraint
$$D(F) = \{q \in L^\infty(\Omega)\colon \underline q \le q \le \bar q \text{ a.e.}\}$$
with fixed $\bar q > \underline q > 0$, it can be shown that with $J$ as in section 3 (here including the possibility $\kappa = 0$) the minimization problem (2), and similarly the subproblems arising in (6), (7) and (8), are well-posed. For further details and proofs, we refer to [CKP98] and [B07].

The convergence results of section 3 do not carry over to (27) directly, because unless additional smoothness assumptions such as $q \in H^1(\Omega)$ are made, which of course is not of interest in our context, the problem cannot be formulated in a Hilbert space framework and suitable nonlinearity conditions are not available. However, our numerical experiments show that the methods still give good results for this problem.

4.3. Results

As outlined above, the numerical implementation of the iterative methods (6), (7) and (8) reduces to a sequence of standard regularization problems in each case, the main computational challenge being the non-differentiability of the BV seminorm. A standard way of dealing with this is to replace $|q|_{BV}$ by a differentiable approximation, e.g.
$$|q|_{BV} \approx \int_\Omega \sqrt{|Dq|^2 + \varepsilon^2} \quad (28)$$

with small ε > 0, cf [AV94]. For methods (7) and (8), the minimization problems are convex, and hence methods based on duality, which do not necessarily involve a smoothing of the BV seminorm, become applicable. Examples are the projected gradient descent method of [C04] or the semismooth Newton method given in [HK04]; we have used the latter in our numerical tests.
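On a one-dimensional grid, the smoothed seminorm (28) and its gradient take a particularly simple form; the weights appearing in the gradient are precisely the diffusion coefficients frozen in each lagged diffusivity step. A minimal sketch (our own helper functions, for illustration):

```python
import numpy as np

def smoothed_tv(q, h, eps):
    """Differentiable approximation of the BV seminorm on a 1D grid of
    spacing h, cf. (28): sum_i sqrt(|dq_i|^2 + eps^2) * h."""
    dq = np.diff(q) / h
    return np.sum(np.sqrt(dq**2 + eps**2)) * h

def smoothed_tv_grad(q, h, eps):
    """Gradient of the smoothed seminorm. The weights dq/sqrt(dq^2+eps^2)
    (~ sign(dq) away from flat regions) are the coefficients frozen in
    each fixed-point step of lagged diffusivity iteration."""
    dq = np.diff(q) / h
    w = dq / np.sqrt(dq**2 + eps**2)
    g = np.zeros_like(q)
    g[:-1] -= w
    g[1:] += w
    return g
```

For a step function the value tends to the jump height as $\varepsilon \to 0$, while for $\varepsilon > 0$ the functional is differentiable everywhere, which is what makes gradient-based inner solvers applicable.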

Figure 1. (a) Exact parameter $\bar q$, and (b) distributed input data $u^\delta$ with 1% noise for one-dimensional examples.

In the case of the method (6) and for comparison to standard BV regularization, we have to resort to the approximation (28), where we use locally linearly convergent lagged diffusivity iteration [V098] as a minimization method. Using a slightly different differentiable approximation, this method was also used for total variation regularization in the form (2) of nonlinear problems, including the test problem (27), in [AHH06]. To have a direct comparison of the methods, we also use the differentiable approximation (28) within methods (7) and (8) for the example (27), where the minimization problems can be solved by a more efficient primal-dual Newton method [CGM99]. As the regularization functional for the iterative methods, we always use (10). In general, it will be necessary to take care of the constraints defining $D(F)$; for simplicity, our examples are chosen such that these constraints remain inactive.
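Across all three schemes the outer loop has the same shape: solve a convex subproblem, update, and stop by the discrepancy principle. The sketch below shows this loop for the Levenberg–Marquardt-type structure in its $L^2$-penalized variant, whose subproblem reduces to a linear solve; in the TV case the linear solve is replaced by the convex TV subproblem. All function names are ours, for illustration only:

```python
import numpy as np

def levenberg_marquardt(F, Fprime, x0, y_delta, delta, alpha, tau=1.2, kmax=100):
    """Outer Levenberg-Marquardt loop with discrepancy-principle stopping:
    stop at the first k with ||F(x_k) - y^delta|| <= tau * delta.
    L2-penalized variant; the paper's scheme (7) uses a TV Bregman
    distance in place of the quadratic penalty."""
    x = x0.copy()
    for k in range(kmax):
        r = F(x) - y_delta
        if np.linalg.norm(r) <= tau * delta:
            break
        K = Fprime(x)
        # linearized Tikhonov step: minimize ||r + K d||^2 + alpha ||d||^2
        d = np.linalg.solve(K.T @ K + alpha * np.eye(len(x)), -K.T @ r)
        x = x + d
    return x, k
```

With $\tau$ close to 1 this corresponds to taking the first iterate with residual below the noise level, as done for figure 2.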

4.3.1. One-dimensional example with distributed data. As a first example, we consider the one-dimensional case of (25) on the unit interval with boundary conditions $u(0) = u(1) = 0$. We give results for the Levenberg–Marquardt-type method (7), with starting value $q_0 \equiv 1.5$, $\kappa = 0.1$ and $\alpha_k$ some appropriate constant. To better illustrate the convergence behaviour, we choose $\alpha_k$ larger than necessary for this example. The exact parameter $\bar q$ and the corresponding input data $u^\delta$ with 1% noise are shown in figure 1. We remark that the Landweber-type method (8) gives very similar results, but requires a much higher number of iterations if $\kappa$ is chosen small.

The subproblems were solved using the semismooth Newton method from [HK04], which in the one-dimensional case can easily be implemented without any artificial smoothing of the BV seminorm. For the discretization, we use piecewise constant elements for $q$ and piecewise linear elements for $u$ and the dual variable arising in the optimization method. The given results were obtained using 2000 elements.

Figure 2 shows the convergence history of the Levenberg–Marquardt-type scheme with 1% and 5% noise and the reconstructions obtained using the discrepancy principle (9); since the parameter $\tau$ it involves is not explicitly known, we use the first iterate with residual below the noise level (for larger $\tau$ the discrepancy principle tends to oversmooth anyway, both in general experience and in our tests). In fact, the monotonicity of the Bregman distance to

Figure 2. Iteration for 1% (black) and 5% (grey) noise, both with $\alpha_k = 5 \times 10^{-7}$: (a) reconstructions (discrepancy principle with $\tau = 1$, iterates 287 resp. 25), (b) residuals relative to noise level, (c) relative Bregman distance $D_{\xi_k}(\bar q, q_k)/D_{\xi_0}(\bar q, q_0)$, (d) relative $L^2$-error $\|\bar q - q_k\|_{L^2}/\|\bar q - q_0\|_{L^2}$.

the exact solution predicted by the theoretical results is observed to hold also for later iterates, whereas the $L^2$-error does not show any monotonicity. The contour plots of figure 3 show the evolution of $q_k$ and $\xi_k$ during the iteration at 1% noise in more detail.

In figure 4 we compare the reconstruction to the one obtained by the Levenberg–Marquardt method in $L^2$, corresponding to $J(q) = \frac{\kappa}{2}\|q\|_{L^2}^2$, where for both methods the iterate with minimal $L^2$-error is shown. As a further illustration of the differences between the two types of regularization, figure 5 gives results for data without noise. Whereas BV regularization reproduces piecewise constants extremely well, in smoothly varying regions one observes the characteristic 'staircasing'. In contrast, $L^2$ regularization leads to persistent Gibbs oscillations at jumps and to problems at the boundary.

4.3.2. One-dimensional example with boundary data. In a similar setting, we consider the problem (26) with boundary measurements. We use the boundary conditions $u_i(0) = 1$, $u_i'(1) = 0$ and measurements of $u_i(1)$ for the sources
$$f_i(x) = 10 \exp\bigl[-10\bigl(x - i/(1 + n_s)\bigr)^2\bigr], \qquad i = 1, \ldots, n_s.$$


Figure 3. Contour plot of iterates $k = 0, \ldots, 1000$ of (a) $q_k$ and (b) $\xi_k$ with 1% noise as in figure 2.


Figure 4. Reconstructions of minimal $L^2$-error with 1% noise (black) as in figure 2, compared to that obtained by $L^2$ regularization (grey).

3 2.5 2 1.5 1 0.5 0

3 2.5 2 1.5 1 0.5 0

0.2

0.4

0.6

0.8

1

0.2

0.4

0.6

0.8

1

(a)

(b)

Figure 5. Reconstructions of the Levenberg–Marquardt-type method for noise-free data, (a) with BV regularization and (b) $L^2$ regularization only, both with $\kappa = 0.1$, $\alpha_k = 10^{-9}$, iterates 10 (grey), 1000 (black).


Figure 6. $n_s = 50$ with 1% noise (black, $\alpha_k = 5 \times 10^{-3}$), and with 0.1% noise (grey, $\alpha_k = 10^{-4}$): (a) reconstructions (iterates 200 resp. 100), (b) residuals relative to noise level, (c) relative Bregman distance $D_{\xi_k}(\bar q, q_k)/D_{\xi_0}(\bar q, q_0)$, (d) relative $L^2$-error $\|\bar q - q_k\|_{L^2}/\|\bar q - q_0\|_{L^2}$.


Figure 7. $n_s = 5$ without noise, reconstructions of a piecewise constant parameter (dashed) with BV regularization (black) and $L^2$ regularization only (grey).

The implementation of the Levenberg–Marquardt-type method is done as above, in this example using 500 elements.

Figure 8. (a) Exact parameter $\bar q$, (b) input data $u^\delta$ with 5% noise for two-dimensional example.

Figure 9. Iterations of the iterated Tikhonov-type method (+), Levenberg–Marquardt-type method (×), compared to stationary BV regularization (◦): (a) residual, (b) Bregman distance $D_{\xi_k}(\bar q, q_k)$, (c) $L^2$-error $\|\bar q - q_k\|$, (d) regularization parameters $\alpha_k$.

Figure 6 shows the results of the Levenberg–Marquardt-type method with ns = 50. Again, the Landweber-type method leads to similar results, but requires a high number of iterations in this case due to rather severe restrictions on the regularization parameters.

Figure 10. Landweber-type method with $\kappa = 1$ (black) and $\kappa = 0.1$ (grey): (a) residual, (b) Bregman distance $D_{\xi_k}(\bar q, q_k)$, (c) $L^2$-error $\|\bar q - q_k\|$; for $\kappa = 1$, $\alpha_k \in [5.36 \times 10^{-3}, 1.70 \times 10^{-2}]$, for $\kappa = 0.1$, $\alpha_k \in [4.28 \times 10^{-2}, 1.62 \times 10^{-1}]$.

As noted above, we can also regard our schemes as methods for solving nonlinear compressed sensing problems, the standard approach to which consists in minimizing the regularization functional subject to an underdetermined equation constraint (a nonlinear one in our case). Hence, we tested our approach for a very low number of measurements; as an example, in figure 7 we compare the reconstruction of a piecewise constant parameter obtained for $n_s = 5$ without noise to that of the Levenberg–Marquardt method in $L^2$. The position of the peaks and the edges can be reconstructed reasonably well already with these measurements, but there remains some error in the amplitude. Similar experience has been gained in other computational tests with few measurements.

4.3.3. Two-dimensional example with distributed data. Finally, we consider (27) with $\Omega = B_1(0) \subset \mathbb{R}^2$ and $f \equiv 1$. To obtain a direct comparison of all methods, we use a differentiable approximation of the BV seminorm as described above. For the discretization, we use an unstructured mesh with 8648 triangles and 4421 nodes, where piecewise linear nodal elements are used for both $q$ and $u$. In all cases, we take $q_0 \equiv 1.5$ and, unless stated otherwise, $\kappa = 0.1$. Here we prescribe for $\alpha_k$ a fixed geometrically decreasing sequence for iterated Tikhonov and Levenberg–Marquardt, whereas for Landweber $\alpha_k$ is chosen by an ad hoc backtracking scheme that ensures nonincreasing residuals.
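The ad hoc backtracking for the Landweber parameter can be sketched as follows: a tentative $\alpha_k$ is increased (i.e. the step is damped further) until the residual does not increase. This is our own illustrative reading of that strategy, with hypothetical helper names, not the authors' implementation:

```python
import numpy as np

def landweber_alpha_backtracking(F, x, y_delta, step, alpha0, grow=2.0, max_tries=20):
    """Choose alpha_k by backtracking so that the residual is nonincreasing.
    step(x, alpha) returns the tentative next Landweber-type iterate;
    larger alpha means a more strongly damped step."""
    r0 = np.linalg.norm(F(x) - y_delta)
    alpha = alpha0
    for _ in range(max_tries):
        x_new = step(x, alpha)
        if np.linalg.norm(F(x_new) - y_delta) <= r0:
            return x_new, alpha
        alpha *= grow
    # fall back to the last (most damped) candidate
    return x_new, alpha
```

For an ill-conditioned problem an overly optimistic initial $\alpha_0$ is automatically increased until the damped step no longer overshoots.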

Figure 11. Reconstructions of (a) iterated Tikhonov- (with $\kappa = 0.1$, iterate 8), (b) Levenberg–Marquardt- (with $\kappa = 0.1$, iterate 8), (c) Landweber-type ($\kappa = 1$, iterate 2040) methods and (d) stationary BV regularization (at the same residual, compare figure 9).

Figure 8 shows the exact parameter $\bar q$ and the corresponding input data with 5% noise. In figure 9, results for the iterated Tikhonov- and Levenberg–Marquardt-type methods are compared to those of standard BV regularization, i.e. the stationary method (2) with $J(q) = \frac{\kappa}{2}\|q - q_0\|_{L^2(\Omega)}^2 + |q|_{BV(\Omega)}$. In this example, relatively small regularization parameters are prescribed for the two iterative schemes. For comparison, the regularization parameters in the non-iterative BV method (2) are chosen such that the residuals of the iterated Tikhonov method in the range of interest are reproduced; note that this requires much smaller regularization parameters in the case of the stationary method. As can be seen from figure 10, the regularization parameters $\{\alpha_k\}$ that can be used for the Landweber-type method depend strongly on $\kappa$. In figure 11, the reconstructions of all four methods are compared, all corresponding to the same residual slightly below the noise level (compare figures 9 and 10 for the corresponding regularization parameters). For further numerical results for this problem and for the two-dimensional case of (25), we refer to [B07].

5. Conclusions

We have described the construction of three iterative methods for total variation regularization of ill-posed nonlinear operator equations and have analysed their convergence under a

standard condition on the nonlinearity of the operator. Illustrative applications to parameter identification in elliptic partial differential equations have been given, as well as numerical results that demonstrate the usefulness of the schemes and the improvement compared to standard variational schemes.

An important problem for future research is the construction of efficient methods for the minimization subproblems to be solved in each step of the total variation schemes, in particular for Levenberg–Marquardt and iterated Tikhonov. Indeed, the availability or non-availability of efficient schemes for the subproblems in specific applications might be the decisive factor in choosing one of the three schemes.

The schemes we presented in this paper can actually be applied to more general regularization functionals than just total variation, the main necessary ingredient being convexity of the regularization. Possible examples are several kinds of regularizations enforcing sparsity ($\ell^1$-penalization in some basis), entropy functionals or higher order total variation functionals recently investigated in imaging applications. Most of the convergence analysis carries over if one has a Banach space with suitable embedding into a Hilbert space, on which the operator satisfies appropriate conditions (such as the one used in this paper). Some details in the convergence proofs still rely on the specific properties of the spaces and regularizations in connection with the properties of the operators $F$. Hence, we suggest that the right conditions should be tuned to the specific problem, keeping our analysis as a guideline for different applications. From a practical point of view, the specific statements of the conditions for convergence might have less impact for the iterated Tikhonov and Levenberg–Marquardt methods, whereas they can be crucial for the success of the Landweber iteration.
In any case we think that some strict convexity of the regularization functional will be needed to obtain convergence for nonlinear inverse problems; hence it does not seem feasible to use total variation or $\ell^1$-norms alone. In the Landweber-type method it is obvious that it cannot work in this form without a strictly convex part, since otherwise the functional to be minimized can be unbounded from below. In the iterated Tikhonov and Levenberg–Marquardt methods the need for strict convexity is more hidden. However, it is not possible to prove even the well-definedness of iterates in general, since the fitting part can behave arbitrarily and the Bregman distance does not yield coercivity. The techniques from the linear case, where the linear part in the Bregman distance is rewritten into the quadratic fitting term, do not apply, since the subgradients do not necessarily satisfy the needed range conditions. At least well-definedness can be shown for these two methods even in the case of small parameters in front of the $L^2$-norm regularization; thus it makes sense to consider the case of small perturbations of the total variation.

Acknowledgments

This research was performed when the second author was with the Industrial Mathematics Institute at JKU Linz and the Johann Radon Institute for Computational and Applied Mathematics (Austrian Academy of Sciences), and was financially supported by the Austrian National Science Foundation FWF through project SFB F 013/08. Part of this work was completed while the first author was receiving support from Deutsche Forschungsgemeinschaft (German Research Foundation) through grant GSC 111 Aachen Institute for Advanced Study in Computational Engineering Science (AICES), which is gratefully acknowledged, and the second author was supported by the DFG through the project Regularization with singular energies.
The authors would like to thank Stanley Osher for stimulating discussions on the methods and Benjamin Hackl and Andreas Neubauer for useful hints concerning the example problems.

References
[AHH06] Ascher U, Haber E and Huang H 2006 On effective methods for implicit piecewise smooth surface recovery SIAM J. Sci. Comput. 28 339–58
[AV94] Acar R and Vogel C 1994 Analysis of bounded variation penalty methods for ill-posed problems Inverse Problems 10 1217–29
[B07] Bachmayr M 2007 Iterative total variation methods for nonlinear inverse problems Master thesis Johannes Kepler Universität, Linz
[Bre67] Bregman L 1967 The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming USSR Comput. Math. Math. Phys. 7 200–17
[BGO+06] Burger M, Gilboa G, Osher S J and Xu J 2006 Nonlinear inverse scale space methods Commun. Math. Sci. 4 179–212
[C04] Chambolle A 2004 An algorithm for total variation minimization and applications J. Math. Imaging Vis. 20 89–97
[CDL+98] Chambolle A, DeVore R, Lee N-Y and Lucier B 1998 Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage IEEE Trans. Image Process. 7 319–35
[CGM99] Chan T F, Golub G H and Mulet P 1999 A nonlinear primal-dual method for total variation-based image restoration SIAM J. Sci. Comput. 20 1964–77
[CKP98] Casas E, Kunisch K and Pola C 1998 Some applications of BV functions in optimal control and calculus of variations ESAIM: Proc. 4 83–96
[COS09a] Cai J F, Osher S J and Shen Z 2009 Linearized Bregman iterations for compressed sensing Math. Comput. 78 1515–36
[COS09b] Cai J F, Osher S J and Shen Z 2009 Convergence of the linearized Bregman iteration for l1-norm minimization Math. Comput. 78 2127–36
[COS09c] Cai J F, Osher S J and Shen Z 2009 Linearized Bregman iteration for frame-based image deblurring SIAM J. Imaging Sci. 2 226–52
[CT93] Chen G and Teboulle M 1993 Convergence analysis of a proximal-like minimization algorithm using Bregman functions SIAM J. Optim. 3 538–43
[CT06] Candes E and Tao T 2006 Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52 5406–25
[DDD04] Daubechies I, Defrise M and DeMol C 2004 An iterative thresholding algorithm for linear inverse problems with a sparsity constraint Commun. Pure Appl. Math. 57 1413–57
[EHN96] Engl H W, Hanke M and Neubauer A 1996 Regularization of Inverse Problems (Mathematics and Its Applications vol 375) (Dordrecht: Kluwer)
[ET99] Ekeland I and Temam R 1999 Convex Analysis and Variational Problems (Classics in Applied Mathematics vol 28) (Philadelphia: SIAM)
[GO08] Goldstein T and Osher S J 2008 The split Bregman algorithm for l1-regularized problems CAM Report 08-29 UCLA
[Han97] Hanke M 1997 A regularizing Levenberg–Marquardt scheme with applications to inverse groundwater filtration problems Inverse Problems 13 79–95
[HNS95] Hanke M, Neubauer A and Scherzer O 1995 A convergence analysis of the Landweber iteration for nonlinear ill-posed problems Numer. Math. 72 21–37
[HBO06] He L, Burger M and Osher S J 2006 Iterative total variation regularization with non-quadratic fidelity J. Math. Imaging Vis. 26 167–84
[HCO+06] He L, Chang T-Ch, Osher S J, Fang T and Speier P 2006 MR image reconstruction by using the iterative refinement method and nonlinear inverse scale space methods CAM Report 06-35 UCLA
[HMO05] He L, Marquina A and Osher S J 2005 Blind deconvolution using TV regularization and Bregman iteration Int. J. Imag. Syst. Technol. 15 74–83
[HK04] Hintermüller M and Kunisch K 2004 Total bounded variation regularization as a bilaterally constrained optimization problem SIAM J. Appl. Math. 64 1311–33
[KSS09] Kaltenbacher B, Schöpfer F and Schuster T 2009 Iterative methods for nonlinear ill-posed problems in Banach spaces: convergence and applications to parameter identification problems Inverse Problems 25 065003
[LP99] Luce R and Perez S 1999 Parameter identification for an elliptic partial differential equation with distributed noisy data Inverse Problems 15 291–307
[OBG+05] Osher S J, Burger M, Goldfarb D, Xu J and Yin W 2005 An iterative regularization method for total variation-based image restoration Multiscale Model. Simul. 4 460–89
[ROF92] Rudin L I, Osher S J and Fatemi E 1992 Nonlinear total variation based noise removal algorithms Physica D 60 259–68
[RS06] Resmerita E and Scherzer O 2006 Error estimates for non-quadratic regularization and the relation to enhancing Inverse Problems 22 801–14
[SLS06] Schöpfer F, Louis A K and Schuster T 2006 Nonlinear iterative methods for linear ill-posed problems in Banach spaces Inverse Problems 22 311–29
[V02] Vogel C R 2002 Computational Methods for Inverse Problems (Philadelphia: SIAM)
[V098] Vogel C R and Oman M E 1996 Iterative methods for total variation denoising SIAM J. Sci. Comput. 17 227–38
[XO06] Xu J and Osher S J 2006 Iterative regularization and nonlinear inverse scale space applied to wavelet based denoising CAM Report 06-11 UCLA
[YOG+08] Yin W, Osher S J, Goldfarb D and Darbon J 2008 Bregman iterative algorithms for l1-minimization with applications to compressed sensing SIAM J. Imaging Sci. 1 143–68
