Fourier reconstruction with sparse inversions

Pełen tekst

(1)Fourier reconstruction with sparse inversion.

(2)

(3) Fourier reconstruction with sparse inversion. PROEFSCHRIFT. ter verkrijging van de graad van doctor aan de Technische Universiteit Delft, op gezag van de Rector Magnificus prof. dr. ir. J.T. Fokkema, voorzitter van het College voor Promoties, in het openbaar te verdedigen op dinsdag 6 december 2005 om 13:00 uur. door. Paul Maarten ZWARTJES. doctorandus in de geofysica geboren te ’s-Gravenhage.

(4) Dit proefschrift is goedgekeurd door de promotor: Prof. dr. ir. A. Gisolf Toegevoegd promotor: Dr. ir. D.J. Verschuur. Samenstelling promotiecommissie: Rector Magnificus, voorzitter Prof. dr. ir. A. Gisolf, Technische Universiteit Delft, promotor Dr. ir. D.J. Verschuur, Technische Universiteit Delft, toegevoegd promotor Prof. dr. ir. A.J. Berkhout, Technische Universiteit Delft Prof. dr. W. A. Mulder, Technische Universiteit Delft Prof. dr. L. J. van Vliet, Technische Universiteit Delft Dr. M. D. Sacchi University of Alberta, Edmonton, Canada Prof. dr. M. Sen, University of Texas, Austin, USA. ISBN 90-8559-113-9. c Copyright 2005, by P.M.Zwartjes, Laboratory of Acoustical Imaging and Sound Control, Faculty of Applied Sciences, Delft University of Technology, Delft, The Netherlands. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the author, P.M. Zwartjes, Faculty of Applied Sciences, Delft University of Technology, P.O. Box 5046, 2600 GA, Delft, The Netherlands. SUPPORT The research for this thesis was financially supported by the Delphi consortium. Typesetting system: LATEX. Printed in The Netherlands by Optima Grafische Communicatie, Rotterdam..

(5) Contents. 1 Introduction 1.1 Seismic exploration . . . . . . . . . . . 1.1.1 Sampling the seismic wavefield 1.1.2 Data reconstruction methods . 1.1.3 Fourier reconstruction . . . . . 1.1.4 Sparse inversion . . . . . . . . 1.2 Objectives . . . . . . . . . . . . . . . . 1.2.1 Thesis outline . . . . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. 1 1 2 3 5 6 6 6. 2 Fourier reconstruction with minimum norm inversion 2.1 The discrete Fourier Transform . . . . . . . . . . . . . . . 2.1.1 1D Fourier transform . . . . . . . . . . . . . . . . . 2.1.2 Reconstruction with the DFT . . . . . . . . . . . . 2.2 Non-uniform Discrete Fourier Transform . . . . . . . . . . 2.2.1 NDFT . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.2 NFFT . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.3 Reconstruction with the NDFT . . . . . . . . . . . 2.3 Least-squares Fourier transform . . . . . . . . . . . . . . . 2.3.1 Inverse problem . . . . . . . . . . . . . . . . . . . . 2.3.2 Reconstruction with FRMN in the spatial domain 2.3.3 Relation to other algorithms . . . . . . . . . . . . 2.4 2D Fourier reconstruction . . . . . . . . . . . . . . . . . . 2.5 Fourier reconstruction of seismic data . . . . . . . . . . . 2.5.1 Advantages of FRMN . . . . . . . . . . . . . . . . 2.5.2 Disadvantages of FRMN . . . . . . . . . . . . . . . 2.5.3 Objectives . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. 9 10 10 12 12 12 13 13 14 14 18 19 20 22 23 23 24. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . .. . . . . . . ..

(6) 2. CONTENTS. 3 Fourier reconstruction with sparse inversion 3.1 Sparse inversion . . . . . . . . . . . . . . . . . . . . . . . 3.1.1 Penalized likelihood optimization . . . . . . . . . 3.1.2 Non-quadratic penalty functions . . . . . . . . . 3.1.3 Choosing a non-quadratic function . . . . . . . . 3.1.4 Reconstruction with FRSI in the spatial domain 3.2 FRSI on 1D time series . . . . . . . . . . . . . . . . . . 3.2.1 Reconstruction of a sparse spectrum . . . . . . . 3.2.2 Reconstruction of a smooth spectrum . . . . . . 3.2.3 Reconstruction of data with one large gap . . . . 3.2.4 Quality control and resolution analysis . . . . . . 3.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Inversion with IRLS and CG . . . . . . . . . . . 3.3.2 Data space preconditioning . . . . . . . . . . . . 3.3.3 Model space preconditioning . . . . . . . . . . . 3.3.4 IRLS with preconditioned CGNE . . . . . . . . . 3.4 Image reconstruction example . . . . . . . . . . . . . . . 3.5 Related algorithms . . . . . . . . . . . . . . . . . . . . . 3.5.1 Variable selection . . . . . . . . . . . . . . . . . . 3.5.2 Garrote, lasso and adaptive ridge regression . . . 3.5.3 Wavelet de-noising and shrinkage rules . . . . . .. . . . . . . . . . . . . . . . . . . . .. 4 Fourier reconstruction of seismic data in one dinates 4.1 FRSI in one spatial dimensions . . . . . . . . 4.1.1 Synthetic data with linear events . . . 4.1.2 Seismic shot-record . . . . . . . . . . . 4.2 FRSI in two spatial dimensions . . . . . . . . 4.2.1 Synthetic shot-record . . . . . . . . . 4.2.2 3D VSP data . . . . . . . . . . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . .. 65 66 66 70 75 75 80. 5 Fourier reconstruction of spatially aliased seismic data 5.1 Sampling related problems: aliasing and sampling noise . 5.2 Interpolation beyond aliasing of uniformly sampled data . 5.2.1 Gulunay’s f -k interpolation . . . . . . . . . . . . . 5.3 Fourier reconstruction of aliased seismic data . . . . . . . 5.3.1 Cascade of FRSI and Gulunay’s method . . . . . . 5.3.2 Double FRSI method . . . . . . . . . . . . . . . . 5.4 Reconstruction of data aliased in one spatial dimension . 5.4.1 Non-uniformly sampled, aliased linear events . . . 5.4.2 Uniformly sampled, aliased seismic shot-record . . 5.4.3 Non-uniformly sampled, aliased seismic shot-record 5.5 Reconstruction of data aliased in two spatial dimensions . 5.5.1 Uniform and non-uniformly sampled linear events .. . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . . . . . . . .. 85 85 86 91 93 93 95 97 97 101 101 105 105. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . .. 29 30 31 32 36 38 40 40 42 44 45 48 48 48 50 51 55 59 59 60 62. and two spatial coor. . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . .. . . . . . ..

(7) CONTENTS. 3. 5.5.2 5.5.3. Randomly sampled linear events . . . . . . . . . . . . . . . . 108 Non-uniformly sampled synthetic seismic shot-records . . . . 110. 6 Fourier reconstruction in three spatial coordinates 6.1 Seismic acquisition and processing: a multi-dimensional problem . . 6.2 Fourier reconstruction of 3D seismic surveys . . . . . . . . . . . . . . 6.2.1 Fourier reconstruction in four spatial coordinates using a 1D and 2D algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 6.2.2 A modification to include the fourth coordinate . . . . . . . . 6.3 Fourier reconstruction in four spatial coordinates using a 1D and 3D algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3.1 Reconstruction in the common in-line midpoint domain . . . 6.3.2 Reconstruction in the common offset domain . . . . . . . . . 6.3.3 Sampling requirements for 3D FRSI . . . . . . . . . . . . . . 6.4 Azimuthal variation in seismic data . . . . . . . . . . . . . . . . . . . 6.4.1 Amplitude variation with azimuth . . . . . . . . . . . . . . . 6.4.2 Traveltime variation with azimuth . . . . . . . . . . . . . . . 6.5 3D Fourier reconstruction . . . . . . . . . . . . . . . . . . . . . . . . 6.5.1 Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.2 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.3 Automatic determination of sparseness parameters . . . . . . 6.6 Synthetic example with P-wave anisotropy . . . . . . . . . . . . . . . 6.7 Synthetic time-lapse example with a model with dipping layers . . .. 117 117 119. 7 Conclusions and Recommendations 7.1 Fourier reconstruction with sparse inversion . . . . 7.2 Fourier reconstruction of aliased data . . . . . . . . 7.3 Fourier reconstruction in three spatial coordinates 7.4 Recommendations for seismic data reconstruction. . . . .. 153 154 156 158 161. Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 163 163 164 165. A Notation, Symbols and A.1 Notation . . . . . . . A.2 Symbols . . . . . . . A.3 Glossary . . . . . . . B The B.1 B.2 B.3 B.4. Non-uniform Fast Fourier Transform The NFFT algorithm . . . . . . . . . . . . Kaiser-Bessel window . . . . . . . . . . . 3D and 4D NFFT . . . . . . . . . . . . . Pipe-Menon weighting scheme . . . . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. . . . .. 120 121 121 122 122 122 125 125 126 128 129 130 130 133 142. 169 169 171 172 172. C 2D sampling in seismic acquisition. 175. Bibliography. 179.

(8) 4. CONTENTS. Abstract. 189. Samenvatting. 193. Dankwoord. 197. Curriculum vitae. 199.

(9) 1 Introduction. 1.1 Seismic exploration Seismic surveying is an exploration technique based on the physics of wave propagation that allows geoscientists to look into the earth. A source, such as dynamite on land or a air gun at sea, send energy into the subsurface which reflects where there is a change in physical properties, such as, for instance, at the interface between sandstone and shale layers. These reflections are then recorded at the surface and stored in digital form. After the seismic data acquisition has finished the data are loaded into a computer to generate images of the subsurface using seismic data processing. Before such images can be generated, problems in acquisition must be corrected for and unwanted events in the data must be filtered out. When the data has gone through this pre-processing stage, it is ready to be imaged. This means that the reflections measured at the surface are ’propagated’ back down using the physical theory of wave propagation, and are focused at all depth levels, such that a representation of the subsurface layers is obtained. The problems in the pre-processing stage can be grouped into two categories: acquisition based and subsurface based. Subsurface based problems are caused by, for instance, multiple reflections from a single layer which would appear as phantom layers on the final image. Other causes of problems are complications due to near-surface structures that distort the signal, salt-domes that cause very complicated wave-propagation or basalt layers that absorb the seismic source vibrations and thereby obscure the layers that lie underneath. Acquisition based problems are caused by, for instance, missing data due to faulty equipment, inaccessibility, or feathering of cables in marine acquisition (non-uniform sampling), errors.

(10) Introduction. 2. in the positioning of equipment, aliasing due to coarse sampling, noise from passing ships/cars/people, etc. This thesis deals with one of the acquisition related problems, namely that of non-uniform and sparse spatial sampling of the seismic wavefield. 1.1.1. Sampling the seismic wavefield. The seismic source generates a wavefield. During data acquisition this wavefield is sampled at discrete locations and at discrete times over a very large area. As a rule of thumb, the maximum distance between source and receiver should exceed the depth of the layer that needs to be imaged and these can be up to five or six thousand meters deep. In the ideal case the wavefield is uniformly and densely sampled. Sampling the incoming wavefield in time is not a problem, because the frequency range used in seismic processing (up to 100Hz) can be sampled uniformly and unaliased with modern equipment. However, spatial sampling of the wavefield with dense arrays of geophones on land or hydrophones at sea is technologically and logistically possible, but prohibitively expensive. In seismic surveys the wavefield is sampled in four spatial coordinates, namely two for the source and two for the receivers1 . The sampling is rarely dense in all four coordinates, because of a trade-off between geophysical and economical constraints. Sparse sampling is cheaper, because it means acquiring less data, but it also leads to aliasing artifacts in the data. Another sampling problem is that of non-uniform sampling, which is basically any sampling geometry that is not uniformly spaced on a orthogonal grid. Non-uniform sampling can be caused by (a combination of): • bad luck (misfired shots, hardware failure, etc.), • man-made obstacles (cities, hydrocarbon production installations, regulations, etc.), • natural causes (currents, cliffs, rivers, etc.). Aliasing and non-uniform sampling both cause artifacts during seismic data processing. Other sources of error are listed in Ongkiehong and Huizer [1987]. They state that the most important error in seismic acquisition is the error due to binning (or nearest neighbor interpolation), which is applied to correct for non-uniform sampling. Correct handling of the positions is important as it can reduce artifacts in noise removal, multiple attenuation and prestack imaging and especially in time lapse processing, in which two 3D surveys are compared to spot differences at reservoir level due to hydrocarbon production. Positioning differences can mask small 1 The reason it is called a 3D survey is that such the final product obtained from such a survey is a 3D data volume representing the subsurface.

(11) 1.1 Seismic exploration. 3. amplitude and timing differences expected from changes in the reservoir [Eiken et al., 2003; Wever and Spetzler, 2004]. This is the processing step where data reconstruction can have the most impact. 1.1.2. Data reconstruction methods. The effects of aliasing and non-uniform sampling can be minimized by reconstructing the data on an uniformly spaced grid, early on in the seismic processing flow. There are various reconstruction methods, which can be placed into three groups. Transformation based reconstruction. Non-uniformly sampled data can be regularized in a two step approach using a variety of transformations (Fourier, Radon, wavelet, etc). In the first step of these methods the transform coefficients are estimated, usually via least-squares inversion. When sampling is non-uniform the direct forward transformation gives a distorted estimate of the coefficients and, therefore, an inversion gives a better estimate (provided the inverse problem is well posed). In the second step the data on the desired uniform grid is found by the inverse transform. An advantage of transformation based data reconstruction is that it is fast (if the transform can be computed efficiently), requires little user-input and can handle both non-uniform and uniform grids. Best results are obtained for band-limited data. In fact, non-bandlimited or aliased data may even cause meaningless results. The damage caused by aliasing can be reduced by using appropriate prior information in the inversion, as has been shown by Herrmann et al. [2000] for the parabolic Radon and by Zwartjes and Sacchi [2004] for the Fourier transform. In seismic data processing this approach has been taken using the Fourier transform [Duijndam et al., 1999b], the parabolic Radon transform [Hugonnet and Canadas, 1997] and the hyperbolic Radon transform [Trad, 2000]. The Fourier transform has also been used for regularization in image reconstruction [Strohmer, 1993] and potential field geophysics [Rauth, 1998]. The wavelet transform, which differs from the Fourier/Radon transforms by its localized basis functions, has been used in reconstruction of 1D time series [Choi and Baraniuk, 1993]. Filter based reconstruction. Filter based methods interpolate by convolution with an interpolating filter, such as the sinc function, in the spatial domain. These methods have been used in virtually all fields of science, since the Shannon-Kotelnikov-Whitaker signal reconstruction theorem. Regularization of non-uniformly sampled data can be done by convolving the non-uniformly sampled data with, for instance, a Gaussian window and placing.

(12) 4. Introduction. the output on a regular grid, as, for instance, in the method of generalized convolution [Knutsson and Westin, 1993]. A common method in seismic data processing for interpolation of uniformly sampled, aliased data is via the method of predictionerror filters [Spitz, 1991; Claerbout and Nichols, 1991; Claerbout and Fomel, 2004]. Fomel [2000] discusses how to use prediction-error filters to fill in gaps in uniformly sampled data sets. These methods are not suited for random sampling, but by binning the data to a fine grid the random sampling geometry is replaced by a uniform grid with missing samples. Wavefield operator based reconstruction. Wavefield operator based reconstruction methods are based on the Kirchhoff integral operator and offer a theoretically attractive framework to reconstruction of seismic data. In numerical implementations the integral equations are replaced by summations which rely on constructive and destructive interference along the summation path. The Kirchhoff integral operators used in imaging of seismic data are flexible, because they can handle any sampling geometry. However, the price to pay for this flexibility is the artifacts caused by non-uniform sampling. When the sampling is too coarse aliasing artifacts appear due to incomplete destructive interference. If the sampling is non-uniform such artifacts will also appear. These artifacts can be reduced by adopting an inversion approach, as is done in imaging with Kirchhoff operators via least-squares migration [Nemeth et al., 1999; Kuehl, 2002]. Examples of data reconstruction with wavefield continuation methods are offset continuation [Bagaini and Spagnolini, 1996], shot continuation [Spagnolini and Opreni, 1996] and DMO which transform prestack data to equivalent zero offset data [Deregowski, 1986]. The combination of DMO followed by inverse DMO is often applied to reconstruct data to a uniform grid. Integral continuation operators are best applied to uniformly sampled data, but they can be used for interpolation and regularization. However, the non-uniform sampling causes artifacts, as it does for all Kirchhoff integrals. Weighting schemes can be used to correct for artifacts that occur when multi-trace convolutional operators are applied to non-uniformly sampled data. Canning and Gardner [1998] suggested trace weighting to reduce artifacts in 3D DMO, in much the same way as trace weighting is used to reduce artifacts in the non-uniform discrete Fourier transform (see Section 2.2.1). A variety of DMO implementations have been used to regularize seismic data, such as explicitly applying first DMO followed by inverse DMO [Canning and Gardner, 1996] or by setting up an inverse problem with the regularized data as the desired model and the inverse DMO operator as the forward operator [Ronen, 1987]. Biondi et al. [1998] have constructed a single operator by chaining the DMO and inverse DMO operators and called it azimuth moveout or AMO. [Chemingui, 1999] used this operator in an inversion approach to perform a least-squares AMO which he called ”inversion to common offset” (ICO)..

(13) 1.1 Seismic exploration. 5. A positive feature of wavefield operator based reconstruction methods is that they allow maximum use of available subsurface information. However, this strength can also be a weakness if the subsurface information is unknown or wrong. The wavefield continuation methods are typically computationally expensive. 1.1.3. Fourier reconstruction. This thesis is about Fourier reconstruction, one of the transformation based reconstruction methods. Fourier reconstruction uses a least-squares inversion to estimate the Fourier coefficients that describe the non-uniformly sampled data. A good estimate of the Fourier coefficients, therefore, lies at the heart of the method. The main requirement of the method is that enough data is available and that the signal is band-limited. Once the Fourier coefficients have been estimated, data can be reconstructed on any uniformly or non-uniformly sampled grid. The Fourier reconstruction algorithm has been applied successfully to 1D and 2D non-uniformly sampled data in for example image reconstruction [Strohmer, 1993], magnetic resonance imaging Wajer et al. [1996]; Wajer [2001] and geophysics [Rauth, 1998; Duijndam et al., 1999b; Hindriks and Duijndam, 2000]. The strengths of Fourier reconstruction are that • it is efficient in terms of computational and memory requirements, • it requires little user input and no subsurface information to reconstruct seismic data, • it can handle both uniform grids with missing data as well as random sampling. However, it also has some limitations and overcoming these it the purpose of this thesis. The main limitations are • the maximum gap that can be filled in is approximately three times the average Nyquist interval (Schonewille [2000], Section 2.4.6) • the reconstruction result is strongly dependent on spatial bandwidth and, therefore, the best reconstruction usually occurs at low (temporal) frequencies where the spatial bandwidth is narrowest, • it cannot handle spatially aliased data, • it is currently limited to two spatial dimensions in seismic data reconstruction, because of the sparse sampling in seismic data acquisition..

(14) Introduction. 6. 1.1.4. Sparse inversion. In Fourier reconstruction the Fourier coefficients are obtained through damped leastsquares inversion. The damping of the inversion is required when there is not enough data to reliably estimate the coefficients. The damped least-squares inversion is derived using a minimum energy constraint on the model, which is fulfilled by filling the reconstructed signal in the gaps with zeroes. The effect of this damping term is to suppress the reconstruction in the gaps. In this thesis the minimum energy criterion is replaced by a criterion of minimum structure, or rather, a desire for sparseness in the model space. Sparse models can be found through sparse inversion, which uses non-quadratic regularization terms to penalize smooth and favor sparse models. In geophysics sparse inversion has been used for high resolution Radon transforms [Sacchi and Ulrych, 1995], which benefited the multiple removal process because of the better discrimination between primary and multiple reflections, as well as a high resolution Fourier transform [Sacchi and Ulrych, 1996] that benefit spectral analysis. Trad [2000] has shown that data reconstruction using the high-resolution Radon transform yields much better results than reconstruction via the (damped) least-squares Radon transform. Similar result are presented in this thesis and in [Zwartjes and Duijndam, 2000; Zwartjes and Hindriks, 2001; Zwartjes and Sacchi, 2004] for data reconstruction from nonuniformly sampled data using the Fourier transform .. 1.2 Objectives The objectives of this thesis are to modify Fourier reconstruction such that the previously mentioned limitations are overcome. More specifically, the goal is to modify the existing Fourier reconstruction method such that it • yields better reconstruction in large gaps, • is less sensitive to the spatial bandwidth used in the inversion, • can handle spatially aliased data, • can handle non-uniform sampling in more than two spatial dimensions. 1.2.1. Thesis outline. Chapter 2 deals briefly with the discrete Fourier transforms for uniform and nonuniform sampling and the least-squares Fourier transform as used in Fourier reconstruction with minimum norm inversion (FRMN). An analysis of the reconstruction algorithm in the spatial domain shows the limitations of this reconstruction method..

(15) 1.2 Objectives. 7. Chapter 3 introduces sparse inversion and the various ways a sparse inversion estimator can be obtained. The theory is used to construct the Fourier reconstruction with sparse inversion (FRSI) algorithm. An analysis of the reconstruction algorithm in the spatial domain shows how and why it works. The chapter is concluded with an overview of related algorithms in other fields. Chapter 4 shows the results of the FRSI applied to synthetic and real 2D and 3D non-uniformly sampled seismic data. These examples show how two of the goals are met, namely those of improving reconstruction in large gaps and reducing the sensitivity to the spatial bandwidth (or number of parameters). Chapter 5 discusses how the FRSI algorithm can be used to reconstruct and/or interpolate aliased seismic data by applying the algorithm in two steps. In the first step the unaliased part of the spectrum is estimated and is used as prior information in the inversion in the second step to suppress aliased energy. Examples are shown for synthetic and real 2D and synthetic 3D seismic data. Chapter 6 deals with reconstruction in three spatial dimensions. The third dimension is azimuth and some synthetic examples of reconstruction of 3D marine streamer data are discussed. Additionally, an automatic procedure for determining the inversion tuning parameters is discussed. Chapter 7 contains a summary and discussion of the advantages and limitations of each of the modifications as well as recommendations for future research. Appendix A.3 gives an explanation of some seismic processing jargon and Appendix A.1 a brief overview of the notation. Appendix B discusses a fast implementation of the non-uniform discrete Fourier transform as presented by Duijndam and Schonewille [1999a], which we extended to three and four spatial dimensions..

(16) Introduction. 8. Begin Chapter 1 : Introduction. Theory Chapter 2 : Fourier reconstruction with minimum norm inversion (FRMN) Chapter 3 : Fourier reconstruction with sparse inversion (FRSI). Shortcomings of FRMN Large gaps and. Objectives for FRSI. dependence on spatial bandwidth. Chapter 4: FRSI of seismic data in one and two spatial dimensions. Aliasing. Chapter 5 : FRSI of spatially aliased seismic data. Reconstruction in three spatial dimensions. Chapter 6 : FRSI in three spatial dimensions. Conclusions Chapter 7 : Conclusions and recommendations. End Appendices A−C. Figure 1.1: Schematic outline of the thesis.

(17) 2 Fourier reconstruction with minimum norm inversion. Fourier reconstruction is a method to recover a signal from non-uniformly spaced samples. It is based on the sampling theorem [Jerry, 1977; Benedetto and Ferreira, 2000], which says that a bandlimited continuous function can be recovered from a uniformly spaced samples. Non-uniformly sampled signals can be reconstructed if the average sampling rate exceeds the Nyquist sampling rate [Beutler, 1966, 1974; Bilinskis and Mikelsons, 1992]. In case of both uniform and non-uniform sampling the bandlimited continuous signal can be recovered by convolution with a sinc filter or from knowledge of the Fourier coefficients that define the signal. Fourier coefficients of sampled data are obtained via the discrete Fourier transform (DFT). In case of uniform sampling the DFT is an orthonormal transform, but when the sampling is non-uniform the basis functions are no longer orthogonal. This means that direct evaluation of the DFT yields a smeared estimate of the Fourier coefficients. Several remedies have been proposed, such as correcting for the non-orthogonality of the basis functions via Gram-Schmidt orthogonalization [Ferraz-Mello, 1981], estimating and subtracting the significant Fourier coefficients from the signal one at a time until no significant signal is left [Högbom, 1974] or by least-squares estimation of Fourier coefficients from the non-uniformly sampled data [Papoulis, 1975; Lomb, 1976; Feichtinger et al., 1995]. Fast algorithms for computing the discrete Fourier transform of non-uniformly spaced samples have been developed which make these operations numerically efficient [Dutt and Rokhlin, 1993; Duijndam and Schonewille, 1999a]..

(18) Fourier reconstruction with minimum norm inversion. 10. Reconstruction by least-squares Fourier transform. The Fourier reconstruction method discussed in this thesis uses the least-squares estimation approach. A good estimate of the Fourier coefficients lies at the heart of the least-squares Fourier transform. To ensure this the signal should be bandlimited to avoid aliasing. Additionally, enough samples should be recorded with respect to the bandwidth estimated so that the inversion for the Fourier coefficients is well posed. Once the Fourier coefficients have been estimated, the data can be reconstructed to any uniform or non-uniform grid within the boundaries set by the input data, as Fourier reconstruction handles interpolation well, but extrapolation poorly. The Fourier reconstruction method has been applied successfully to 1D and 2D non-uniformly sampled data in image reconstruction [Strohmer, 1993], magnetic resonance imaging [Wajer et al., 1996; Wajer, 2001] and geophysics [Rauth, 1998; Duijndam et al., 1999b; Hindriks and Duijndam, 2000; Schonewille, 2000; Zwartjes and Duijndam, 2000]. This chapter discusses the details of the Fourier reconstruction method, because it is central to this thesis. For additional information see the references given above. In Section 2.1.1 the discrete Fourier transform (DFT) for uniformly sampled data is presented, after which the DFT for non-uniformly sampled data (NDFT) is discussed in Section 2.2. The NDFT does not yield the correct Fourier coefficients, and the least-squares Fourier transform corrects for this, as discussed in Section 2.3. In that section also the effect of Fourier reconstruction in the spatial domain is explained. The strength and limitations of the method are discussed in Section 2.5. This leads to the objectives of this thesis, which consists of several aspects of the method that need to be improved.. 2.1 The discrete Fourier Transform 2.1.1. 1D Fourier transform. The discrete signal p[n∆x] is obtained by uniform sampling of a continuous signal p(x) where x is the spatial coordinate and the sampling interval is ∆x. The individual samples are denoted by indices 0 ≤ n ≤ N − 1, where N is the total number of samples. Round brackets will be used for continuous signals and square brackets for discrete signals. The sampled data is written as the vector p in which the nth element is p[n∆x]. Although spatial sampling does not require the Fourier domain to be discretized, the use of digital computers does. So, let k denote the wavenumber and p˜[m∆k] the discrete version of Fourier coefficients p˜(k), sampled at spatial frequencies m∆k where ∆k = 2π/(N ∆x). The total number of Fourier coefficients is M = N , with.

(19) 2.1 The discrete Fourier Transform. 11. the spatial bandwidth determined by −M/2 ≤ m ≤ M/2 − 1 for an even number of input samples, which means the Nyquist frequency is the first sample in the array and the zero frequency is at position M/2 + 1. For an odd number of samples N and M = N the spatial bandwidth is −(M − 1)/2 ≤ m ≤ (M − 1)/2, and the zero frequency is at position (M + 1)/2. For the remainder of this chapter we assume an even number of samples. The collection of Fourier coefficients is denoted by the ˜ . Throughout this thesis the sampling in the frequency domain is always vector p uniform. With these definitions the discrete Fourier transform pair is defined as p˜DFT [m∆k] = ∆x. N −1 X. p[n∆x]ejm∆k n∆x ,. (2.1.1). p˜[m∆k]e−jm∆k n∆x ,. (2.1.2). n=0. and p[n∆x] =. ∆k 2π. M/2−1. X. m=−M/2. where the subscript DFT indicates discrete Fourier transform. When, in case of uniform sampling, the constant ∆x = 1, Eqs. (2.1.1) and (2.1.2) can be written as p˜DFT [m] =. N −1 X. p[n]ej2πmn/N ,. (2.1.3). p˜[m]e−j2πmn/N ,. (2.1.4). n=0. and p[n] = where M = N and ∆k =. 2π N ∆x .. 1 N. M/2−1. X. m=−M/2. With the definition of the N × M Fourier matrix 1 Fnm = e−jm∆k n∆x ,. (2.1.5). Eqs. (2.1.3) and (2.1.4) can be expressed in vector notation as p = F˜ p, 1 H ˜ = p F p. N. (2.1.6) (2.1.7). with FH F = N I. Throughout this thesis Eq. (2.1.1), and Eq. (2.1.2) will be used, since they explicitly show the dependence of the Fourier transform kernel on the sampling position x. 1 The choice of F for the inverse Fourier transform complies with the notation in Section 2.3, where the forward model in the inversion is the inverse NDFT matrix..

(20) Fourier reconstruction with minimum norm inversion. 12. 2.1.2. Reconstruction with the DFT. A continuous, band-limited signal p(x) can be reconstructed from its sampled version p[n∆x], if the sampling rate is at least twice the highest frequency in the data. The bandlimited signal has p˜(k) = 0 for at least |k| > π/∆x. The reconstruction formula is obtained by substituting the DFT from the sampled spatial domain to the continuous Fourier domain in the inverse continuous Fourier transform. The result is the well known sinc interpolation formula, p(x) =. N −1 X. p[n∆x]. n=0. π (n∆x − x)] sin[ ∆x . π (n∆x − x) ∆x. (2.1.8). and is the main result of the sampling theorem. The derivation can be found in many textbooks on sampling and the Fourier transform. Eq. (2.1.8) says that the continuous signal can be recovered from its samples by convolution with a sinc-function. The sinc function has infinite extent so in practice it is tapered. Alternatively, it can be approximated by a short filter. Interpolation to a finer spatial sampling interval also can be performed in the frequency domain by padding zeros to the DFT of the original signal before applying an inverse DFT.. 2.2 Non-uniform Discrete Fourier Transform 2.2.1. NDFT. In the case of non-uniform sampling the data p(x) is sampled non-equidistantly, i.e. at locations [x0 , . . . , xn , . . . , xN −1 ] to yield p[xn ]. The discrete sum in Eq. (2.1.1) is approximated using the actual sample locations and a midpoint rule for the sampling interval. This yields the non-uniform discrete Fourier transform: pÑDFT [m∆k] =. N −1 X. p[xn ]ej m∆k xn ∆xn .. (2.2.9). n=0. ˜ DFT , to the non-uniformly The inverse transform of uniformly sampled k−space data, p spaced grid is M/2−1 X 1 p[xn ] = p˜DFT [m∆k]e−jm∆kxn ∆k . (2.2.10) 2π m=−M/2. Point-spread function (PSF). The coefficients of pÑDFT [m∆k] equal those of p˜DFT [m∆k] convolved with the pointspread function or PSF, which is the Fourier transform of the non-uniform sampling.

(21) 2.2 Non-uniform Discrete Fourier Transform. 13. grid. This follows from substituting (2.2.10) in (2.2.9) pÑDFT [m∆k]. = =. ∆k X X p˜DFT [q∆k]e−jq∆kxn ejm∆kxn ∆xn 2π n q ∆k X X ∆xn ej(m−q)∆kxn p˜DFT [q∆k] 2π q n. = {PSF ∗ p˜DFT }[m∆k] ,. in which the PSF is defined as the NDFT of the sample weights, PSF[m∆k] =. N −1 ∆k X ∆xn ejm∆kxn . 2π n=0. (2.2.11). The amount of distortion in the NDFT is determined by the PSF. As the PSF approaches a delta spike, the NDFT coefficients approach those of the DFT. The PSF itself is determined by the sampling geometry, but it can be influenced by the choice of data weights, ∆xn . Usually the weighting scheme is chosen based on the sample positions, such as ∆xn =. 1 (xn+1 − xn−1 ) . 2. (2.2.12). A similar analysis of the discrete Fourier transformation on a non-uniform grid is presented in Wajer [2001], who studied the problem of non-uniform sampling in the Fourier domain in the field of magnetic resonance imaging (MRI). 2.2.2. NFFT. The NDFT is an important part of the Fourier reconstruction algorithm. In case of uniform sampling with missing samples, the NDFT yields the same coefficients as the DFT with zeros inserted at the locations of the missing samples. This means the FFT can be used to compute the NDFT, instead of the NFFT. In case of nonuniform sampling computing the NDFT through matrix-vector multiplication can be rather expensive, especially in higher dimensions. Therefore, we use a fast algorithm called the NFFT. The details of the NFFT are given by Duijndam and Schonewille [1999a]. We have extended the algorithm to three and four spatial dimensions for use in Chapter 6. Results on accuracy and efficiency are given in Appendix B. 2.2.3. Reconstruction with the NDFT. Small gaps in uniform sampling can be filled in using a band-limited inverse DFT, i.e. by zeroing coefficients −π/∆x < m∆k < −M/2∆k and (M/2 − 1)∆k < m∆k <.

(22) Fourier reconstruction with minimum norm inversion. 14. π/∆x prior to the inverse transform. This is equivalent to convolution with a bandlimited sinc filter in the spatial domain. p [i∆x] =. N X. p [xj ]. j=1. sin[(M + 1)∆k(i∆x − xj )/2] . sin [∆k(i∆x − xj )/2]. (2.2.13). Eq. (2.2.13) is the discrete analogue of Eq. (2.1.8). For random sampling Beutler [1974] showed that a signal can be recovered from its non-uniformly spaced samples if the number of samples per unit space on average exceeds the maximum frequency in the signal, i.e. if the average sampling rate exceeds the Nyquist rate.. 2.3 Least-squares Fourier transform As was mentioned in Section 2.2 the NDFT coefficients equal the DFT coefficients convolved with the PSF of the non-uniform grid. A reconstruction procedure that contains a deconvolution for the PSF is, therefore, a logical approach to the problem. The deconvolution is performed by least-squares inversion [Duijndam et al., 1999b]. The same least-squares Fourier transform has been derived by Feichtinger and Gröchenig [1993] via the theory of frames. 2.3.1. Inverse problem. In the inversion we want to solve a linear system of equations, written in vector ˜ the Fourier coefficients and A the notation as p = A˜ p + n, where p is the data, p inverse Fourier transform matrix that takes us back to the non-uniformly spaced grid in the spatial domain. Energy outside the spatial bandwidth used in the inversion and actual random noise are put together in a noise term n. The inversion solves ˜ . If the data is non-uniformly sampled in one spatial for the Fourier coefficients p coordinate, Fourier reconstruction inverts for the 1D spatial inverse NDFT : p[xn ] =. ∆k X p˜[m∆k]e−jm∆kxn . 2π m. (2.2.10). The linear system p = A˜ p contains the following elements pn Anm p˜m. = p[xn ] , −jm∆kxn , = ∆k 2π e = p˜[m∆k] .. ˜ are a Fourier transform pair and a change in one automatically The vectors p and p involves a change in the other. However, in the remainder of this thesis we will refer ˜ as the parameter vector of Fourier coefficients that describes the data and is to p.

(23) 2.3 Least-squares Fourier transform. 15. constrained by a prior information in the inversion. This constitutes an abuse of notation we will accept for the sake of convenience. The objective in Fourier reconstruction is to minimize the quadratic function −1. J = kCn 2 (p − A˜ p)k22 =. 1 1 kW 2 (p − A˜ p)k22 . 2 c. (2.3.14). PM/2−1 where k˜ pk22 = m=−M/2 |˜ pm |2 . An accurate representation of the noise covariance matrix is expensive and unrealistic in the sense the such information is not available (Schonewille [2000], p.12). Duijndam et al. [1999b] gives a justification for writing the noise covariance matrix as Cn = c2 W−1 , where c is a constant. The diagonal matrix W is the data weighting matrix P in which the weights are defined as Wnn ± ∆xn and are normalized such that ∆xn = 2π/∆k. The minimum of Eq. (2.3.14) is obtained by the least-squares estimator ˆ˜ = AH WA p. −1. AH Wp .. (2.3.15). When the inverse problem is well-posed and well-conditioned the Fourier coefficients can be estimated with Eq. (2.3.15) [Feichtinger et al., 1995; Rhebergen et al., 1997; Duijndam et al., 1999b]. When the inverse problem is ill-conditioned the inversion must be regularized, for example by including a restriction on the Euclidean model norm: 1 1 1 ˜ 0 k22 . J = 2 kW 2 (p − A˜ p)k22 + 2 k˜ p−p (2.3.16) c σp˜ ˜ 0 the a priori model estimate. The where σp2˜ is the a priori model variance and p minimum of this objective function is given by ˆ˜ = AH WA + λI p. −1. AH Wp − λ˜ p0 .. (2.3.17). with the damping term λ = c2 /σp2˜. Eq. (2.3.17) is called the minimum norm solution2 . See Tarantola [1987] for a detailed discussion of inverse theory. In the ˜ 0 is taken as zero. We reabsence of prior knowledge on the Fourier coefficients, p fer to the Fourier coefficients obtained with Eq. (2.3.17) as the FRMN coefficients, where FRMN stand for Fourier reconstruction with minimum norm inversion. Matrix AH A has elements AH Apq. =. ∆k∆x 2π. PN −1 n=0. e−j(p−q)∆k xn ,. and is a Toeplitz matrix, because (p − q)∆k is constant along diagonals given by a constant value of p − q. Matrix-vector multiplications with a Toeplitz matrix can 2 Minimum norm inversion is also called damped least-squares, stabilized inversion or ridge regression..

(24) Fourier reconstruction with minimum norm inversion. 16. be efficiently performed using FFT’s. However, in case of uniform sampling with missing elements, it is more efficient to use the uniform grid (n∆x) instead of the actual sampling positions (xn ). In that case AH A = I and one can use the FFT instead of the NDFT (see for instance Liu [2004]). After the Fourier coefficients have been estimated, data can be generated on any desired grid with the inverse Fourier transform. When the output grid is uniform the inverse Fourier transformation can be performed by FFT. When it is non-uniformly sampled, the inverse NDFT must be used, or preferably the NFFT for sake of efficiency. Note that in the special case of uniform sampling and with λ = 0, the DFT, NDFT and FRMN Fourier coefficients are equal. Underdetermined vs. overdetermined inverse problem. Eq. (2.3.17) is the damped least-squares solution for the overdetermined inverse problem (N > M ). The following identity [Snieder and Trampert, 1999] (AH B−1 A + D−1 )−1 AH B−1 = DAH (B + ADAH )−1 , can be used to find the solution for the underdetermined inverse problem (N < M ). Using B−1 = W and D = λIN this yields ˆ˜ = AH λW + AAH p. −1. p.. (2.3.18). In case N < M the matrix AAH is smaller (N × N ) than AH A (M × M ). The matrix AAH has elements PM/2−1 −jp∆k (xm −xn ) ∆k∆x . AAH mn = p=−M/2 e 2π When sampling is uniform with gaps, it is once again more efficient to use the uniform grid instead of the actual sampling positions, for the same reason mentioned in the previous paragraph. Unfortunately, when sampling is non-uniform the matrix AAH is not a Toeplitz matrix, because xm − xn is not constant along diagonals defined by a constant value of m − n. In that case the inversion of Eq. (2.3.18) cannot be performed with the same computational efficiency as inverting Eq. (2.3.17). Weighting and spectral Leakage. The NDFT, factor AH Wp in Eq. (2.3.15) smears out energy over the entire model space, because it is not an orthogonal transformation. The factor in brackets in Eq. (2.3.15), the inverted Gram matrix, can be seen as a deconvolution that corrects for this smearing. The filter used in the deconvolution is the first column of the Toeplitz matrix AH WA, which equals the PSF in Eq. (2.2.11), but the deconvolution.

(25) 2.3 Least-squares Fourier transform. 17. is only carried out for the basis functions contained in A. Any non-zero model parameters belonging to basis functions not in A leak into the estimated parameters. This effect is called spectral leakage and may cause artifacts in inverse problems with non-uniformly sampled data [Snieder, 1993; Spetzler and Trampert, 2003]. The spectral leakage problem can be solved either by extending the bandwidth in the inversion such that all relevant basis functions are accounted for, or by taking the basis functions for −M/2 < m < M/2 − 1 into account by incorporating them in an appropriate covariance matrix Cn . However, such a matrix destroys the Toeplitz structure. A simpler solution is to use weights that compensate for variable sampling density by giving more weight to sparsely sampled regions and less weight to densely sampled regions [Feichtinger and Gröchenig, 1993; Snieder, 1993; Duijndam et al., 1999b]. This reduces spectral leakage effects to a large extent. A interesting side effect of data weighting is that it acts as a preconditioner by clustering the eigenvalues of matrix AH WA [Feichtinger et al., 1995; Strohmer, 1996; Schonewille, 2000]. Numerical implementation. For small problems direct inversion methods can be very efficient, but for large scale problems they are too expensive. We use the conjugate gradient (CG) algorithm, which takes advantage of the Toeplitz structure of AH WA. As the Toeplitz matrix only occurs in the normal equations, we apply CG to the normal equations (CGNE). The matrix AH WA is symmetric and positive-definite and its condition number is the square of the condition number of A. Therefore, convergence may be slower then when the CG scheme is used with matrix A. However, the Toeplitz matrix AH WA allows matrix-vector multiplication to be carried out efficiently by FFT’s. The fast execution far outweighs the potentially slower convergence by using CG on the normal equations. Schonewille [2000] discusses several efficient alternatives for the inversion of Toeplitz matrices such as Levinson’s algorithm and the GohbergSemencul procedure. In this research only the FFT based method has been used. The CGNE algorithm is shown in Fig. 2.1. The initial values At iteration k = 0 ˜ 0 = AH p, for the residual r0 = AH p − (AH WA + λI)˜ are for the model p p0 and for the search direction d0 = r0 . The CGNE algorithm iterates until the residual reaches a predefined threshold or the maximum number of iterations has been reached. The threshold for the residual ratio has been taken as 10−6 . The tuning parameter λ can be determined with the L-curve or generalized cross-validation method [Hansen, 1998]. A practical rule given by Duijndam is to take ∆k M , (2.3.19) 2π N where f represent the expected signal-to-noise ratio of the data. Because of the ratio M/N the damping grows as the inversion becomes less well determined. See λ=f.

(26) Fourier reconstruction with minimum norm inversion. 18. CG: while. krk k kr0 k. α=. > limit and k ≤ kmax. krk k H dH k (A WA+λI)dk. ˜ k+1 = p ˜ k + αdk p rk+1 = rk − α(AH WA + λI)˜ pk krk+1 k β = krk k dk+1 = −rk+1 + βdk Figure 2.1: The conjugate gradient algorithm for the normal equations (CGNE).. Feichtinger et al. [1995], Strohmer [1996], Duijndam et al. [1999b] or Schonewille [2000] for a more elaborate discussion on weighting, stability and efficiency in Fourier reconstruction. 2.3.2. Reconstruction with FRMN in the spatial domain. In this section the Fourier reconstruction problem is defined in the spatial domain instead of the Fourier domain. In case of uniform sampling with gaps, the bandlimited inverse NDFT matrix can be written as A = PFQ. The N × N matrix F is the inverse DFT matrix, as introduced in Eq. (2.1.5). The diagonal matrix Q is a band-limiting operator that puts to zero all Fourier coefficients outside the chosen band-width. The matrix P is a diagonal matrix with ones and zeros along the diagonal that acts as an sampling operator such that p = Ppr , where p is the coarsely sampled data and pr the finely sampled data. Both Q and P are idempotent, which means PH P = P. Furthermore, with no trace weighting for simplicity (W = I), the objective function for reconstruction of non-uniformly sampled data can be written as 1 1 J = 2 kP(pr − FQ˜ pk22 , (2.3.20) p)k22 + 2 kQ˜ c σp˜ which is minimized when ˆ˜ = FH Ppr , FH PF + λI Qp. (2.3.21). ˆ˜ are the estimated coefficients. Eq. (2.3.21) can be transformed to the spatial where p domain using the inverse Fourier transform (P + λI) Bˆ p = Ppr .. (2.3.22). ˆ represents the reconstructed signal. The Toeplitz matrix B can be The vector p thought of as a band-limited sinc filter: sin[M ∆k(i∆x − xj )/2] Bij = FQFH ij = . sin [∆k(i∆x − xj )/2]. (2.3.23).

(27) 2.3 Least-squares Fourier transform. 19. where M ∆k is the bandwidth. Eq. (2.3.22) can be rewritten using I = P + I − P, as (1 + λ)PBˆ p + λ(I − P)Bˆ p = Ppr . (2.3.24) This says that the sum of the band-limited reconstruction at original positions, PBˆ p, and the reconstruction in the gaps, (I − P)Bˆ p, must equal the input data, Ppr . This can only happen if λ = 0 or if the reconstruction tends to zero in the gaps for λ 6= 0. Gaps in the sampling influence the condition number of the matrix AH WA [Feichtinger and Gröchenig, 1993] and when the gaps become too large, or when the data contains noise, damping is required. Eq. (2.3.24) shows that the effect of the damping term λ is to suppress the band-limited reconstruction in the gaps. Schonewille [2000] showed that the FRMN reconstruction can be successful for data with one large gap, as long as the gap does not exceed three times the Nyquist interval. Eq. (2.3.23) shows that if the reconstruction is performed for full bandwidth (M = N ), then Q = I and there cannot be any band-limited reconstruction, because B = FQFH = I due to the orthogonality of the Fourier matrices. The reconstruction then fully depends on the initial guess (which has been taken as zero). In case of random sampling the matrix F is replaced by matrix A as defined in Section 2.3. Because for non-uniform sampling AH A 6= I, even for full spatial bandwidth there will be some reconstruction, mainly near the edges of gaps. 2.3.3. Relation to other algorithms. If the signal has sufficiently small bandwidth M with respect to the number of samples N , or if the sampling is such that the matrix AH WA is well conditioned, the damping term λ can be set to zero. Then Eq. (2.3.24) can be written (using P = I + P − I) as Bˆ p = Ppr + (I − P)Bˆ p, (2.3.25) which says that the band-limited reconstruction (Bˆ p) must equal the original data on the measured positions (Ppr ) and in the gaps just equals the band-limited reˆ by construction itself ((I − P)Bˆ p). This can be solved iteratively for p ˆ (k) = Ppr + (I − P)Bˆ p p(k−1) .. (2.3.26). This is equivalent to the Papoulis-Gerchberg (PG) algorithm for extrapolation or interpolation of missing samples in uniformly sampled data [Papoulis, 1975; Ferreira, 1994]. Rhebergen et al. [1997] discuss the link between least-squares Fourier reconstruction (without damping) via a conjugate gradient or Neumann iterative scheme and PG. Strohmer [1993] compared the Fourier reconstruction method with PG and found that Fourier reconstruction converges much faster and yields lower.

(28) Fourier reconstruction with minimum norm inversion. 20. residual error. Rhebergen et al. [1997] also indicate the relation between Fourier reconstruction and reconstruction by alternating projections and projection on convex sets (POCS), which is based on the fact that the operators P and B are projection operators (projection on the sampled data space and projection on band-limited signals). This link is also discussed in Ferreira [1995] and that paper also contains a list of references that further explore the relationships between different reconstruction methods. In geophysics, a scheme similar to PG has been used by Kabir and Verschuur [1995] in the restoration of missing near-offsets of marine seismic surveys, by parabolic Radon transform. Improvements on Kabir’s PG method based on the least-squares Radon transform are discussed by Trad et al. [2003] and references mentioned here.. 2.4 2D Fourier reconstruction Fourier reconstruction generalizes to the multi-dimensional case straightforwardly. Here the same notation as in the 1D case will be used. Sampling in both the spatial ˜ . These and the frequency domain with a Cartesian grid produces the arrays p and p arrays are related by the 2D discrete forward and inverse Fourier transforms: XX p(nx ∆x, ny ∆y)ej(mkx ∆kx nx ∆x+mky ∆ky ny ∆y) , p˜(mkx ∆kx , mky ∆ky ) = ∆x∆y ny. nx. (2.4.27). p(nx ∆x, ny ∆y) =. ∆kx ∆ky X X p˜(mkx ∆kx , mky ∆ky )e−j(mkx ∆kx nx ∆x+mky ∆ky ny ∆y) . 4π 2 m m kx. ky. (2.4.28) The scalars (nx , ny ) denote the nth sample along the x and y directions and similarly (mkx , mky ) denote the mth coefficient along the kx and ky wavenumber axes. The sampling intervals in the spatial and Fourier domain are denoted by (∆x, ∆y) and (∆kx , ∆ky ), respectively. 2D NDFT. When the sampling is non-uniform the coordinates cannot be indexed as (nx ∆x, ny ∆y). Instead, the indexing is simply n = 0, . . . , N − 1 where N is the total number of samples. As in the 1D case, the DFT is approximated by the NDFT which uses the actual sample locations (xn , yn ). The double summation over the two coordinates x and y is replaced by a single summation over sample index n pñdft (mkx ∆kx , mky ∆ky ) =. N −1 X n=0. p(xn , yn )ej(mkx ∆kx xn +mky ∆ky yn ) ∆Sn ,. (2.4.29).

(29) 2.4 2D Fourier reconstruction. 21. where ∆Sn represent the weights assigned to the nth sample. The inverse transform to the non-uniform grid defined by coordinates (xn , yn ) is a double summation over the uniformly sampled kx and ky axes p(xn , yn ) =. ∆kx ∆ky X X p˜(mkx ∆kx , mky ∆ky )e−j(mkx ∆kx xn +mky ∆ky yn ) . 4π 2 m m kx. ky. (2.4.30) The NDFT coefficients pñdft equal the true coefficients p˜ convolved with the pointspread function (PSF), which is defined as the NDFT of the sample weights: N −1 ∆kx ∆ky X PSF(mkx ∆kx , mky ∆ky ) = ∆Sn ej(mkx ∆kx xn +mky ∆ky yn ) . 4π 2 n=0. (2.4.31). The 2D NDFT is a distorted estimate of the true Fourier coefficients. The amount of distortion is determined by the PSF and as the PSF approaches a delta function, the NDFT approaches the true Fourier coefficients. The PSF can be influenced by the data weights ∆Sn . 2D Fourier reconstruction. Now the 2D arrays are ordered in lexicographic order, i.e. with columns stacked on top each other to produce a single column vector. For the Fourier coefficients this implies that the indices (mkx , mky ) are replaced by a single index m. Eq. (2.4.30) can then be written in matrix notation as p = A˜ p, where pn Anm p˜m. = p[xn , yn ] , ∆kx ∆ky −j(kx,m xn +ky,m yn ) = , 4π 2 e = p˜[kx,m , ky,m ] .. The least-squares inversion is as described in Section 2.3. The matrix AH A is a block-Toeplitz matrix with Toeplitz blocks, which means that matrix-vector multiplications can be performed via 2D FFT’s. This is used in a fast CG algorithm similar to the one-dimensional case. Once the Fourier coefficients have been estimated, the data can be reconstructed on the desired spatial grid using the inverse (N)FFT. There are some differences that complicate multi-dimensional Fourier reconstruction. First, the relationship between sampling and periodicity is not as straightforward as in the 1D case [Peterson and Middleton, 1962]. When one reduces the spatial bandwidth to limit the number of parameters in the inversion, all kinds of periodic regions can be defined to prevent overlapping spectra. This is explained well in the book on multi-dimensional discrete Fourier transforms and its applications by Dudgeon and Mersereau [1984]. The ambiguity in periodicity has been used.

(30) 22. Fourier reconstruction with minimum norm inversion. by Schonewille et al. [2003] to restrict the model to a T-shaped region in the 2D wavenumber plane. Another difference with 1D Fourier reconstruction is that the choice of sample weights is not obvious. A common approach is to design a weight as a certain area per sample, for instance with Delauny triangles or Voronoi cells [Hindriks and Duijndam, 2000]. An alternative procedure is to compute weights such that the PSF is as close as possible to a delta-function [Pipe and Menon, 1999]. Examples of Fourier reconstruction in two spatial dimensions are given by Strohmer [1996], Rauth [1998], Hindriks and Duijndam [2000], Schonewille [2000], Zwartjes and Hindriks [2001], Wajer [2001] and Schonewille et al. [2003].. 2.5 Fourier reconstruction of seismic data In the previous section we have explained that the result of Fourier reconstruction depends on the spatial bandwidth. Seismic data, p[t, xn ], is sampled in both time and space. Applying Fourier reconstruction in the time domain implies estimating the parameters (Fourier coefficients) in the full t-x domain in one large inversion, or for each time sample independently. Alternatively, we can Fourier transform the data along the temporal axis and perform the inversion in the f -k domain. Again, we can invert for all Fourier coefficients at once or perform the inversion per frequency slice. The frequency bandwidth of seismic data is typically limited to 0-100Hz and the spatial bandwidth is frequency dependent (as can be seen in Fig. 2.3c). Therefore, it is more efficient to perform Fourier reconstruction in the f -k domain and estimate the Fourier coefficients per temporal frequency. Sometimes the spatial bandwidth can be compressed further by NMO correction, which moves reflection energy towards the k = 0 axis by compensating for the approximate hyperbolic moveout of reflection events. This is beneficial for the inversion in Fourier reconstruction, because it reduces the number of parameters. Fig. 2.3 shows a shot-record of a seismic survey from the North-Sea in different domains. In Fig. 2.4 shows the same data, but non-uniformly sampled by omitting traces (maximum gap of two traces). Figs. 2.4b,c show the reconstruction with the NDFT and the FRMN algorithm, respectively. The NDFT reconstruction does not fill in the gaps well and misses information in the higher frequencies. FRMN does correctly reconstruct the high frequencies as can be seen in the f -x domain plots (Fig. 2.4e,f). A more difficult example is shown in Fig. 2.5a, which has a maximum gap of seven traces. The NDFT result fails, while the FRMN only yields reconstruction at the lowest frequencies. Reconstruction with FRMN works very well when sampling is non-uniform and gaps are not too large. However, this deterioration in reconstruction is typical at high temporal frequencies and when gaps become too large. Schonewille [2000] showed that the maximum gap that can be successfully reconstructed is about three times the Nyquist sampling interval..

(31) 2.5 Fourier reconstruction of seismic data. tmin. time. tmax −kmax. . 0 wavenumber. +kmax. (a). 23. fmax. frequency. 0. −kmax . . 0 . . +kmax wavenumber (b). Figure 2.2: (a) Model space parameterization in reconstruction of seismic data in the t-x domain. Because of the symmetry in the real-to-complex Fourier transform x → k, the reconstruction can be performed for all time and positive wave numbers only. (b) Model space parameterization in f -k domain. Because of the symmetry in the real-to-complex Fourier transform t → f , the reconstruction is for positive and negative wavenumbers for positive frequencies f only.. 2.5.1. Advantages of FRMN. • The main application of FRMN reconstruction is to fill in gaps in uniformly sampled data or regularize non-uniform sampling. It uses the actual trace coordinates and does not require data sampled on a uniform grid. Correct handling of the trace positions is important as it reduces artifacts in signal processing applications for noise removal, multiple attenuation, prestack imaging and time-lapse processing. • Fourier reconstruction is implemented via an efficient numerical scheme. • No subsurface model is required in the reconstruction of seismic data. In fact, the method can be applied to any kind of non-uniformly sampled data, such as seismic or potential field data, but also to signal reconstruction in the field of MRI and image processing. 2.5.2. Disadvantages of FRMN. • In gaps larger than three times the average Nyquist interval the reconstruction goes to zero because of the damping [Schonewille, 2000], as can be seen in the example in Fig. 2.5. Additionally, when the spatial bandwidth of the data is relatively large (in seismic data at high temporal frequencies), the reconstruction deteriorates..

(32) Fourier reconstruction with minimum norm inversion. 24. • The number of Fourier coefficients or parameters in the inversion must be smaller than the number of samples to guarantee a good reconstruction and, therefore, reduction of the spatial bandwidth is often required. Unfortunately, this carries the risk that significant energy outside the chosen bandwidth is discarded. • The advantage of FRMN is that it can handle any sampling grid. However, it cannot handle spatially aliased data. Since seismic data is frequently spatially aliased due to economic constraints on the acquisition geometry, this can be considered a disadvantage. • 3D marine streamer seismic survey sample in four coordinates (two for the source and two for the receivers). Currently, for such surveys Fourier reconstruction is applied as a cascade of 1D and 2D Fourier reconstruction. The fourth coordinate (azimuth) is neglected. However, the information in the azimuthal coordinate is becoming increasingly important and proper handling is therefore required. 2.5.3. Objectives. The objective of the research presented in this thesis is to modify Fourier reconstruction so that • reconstruction in large gaps and at higher temporal frequencies is improved, • larger spatial bandwidth than dictated by the minimum norm inversion can be handled, • spatially aliased data can be reconstructed, and • reconstruction of 3D marine streamer seismic data can be performed for all four spatial coordinates..

(33) 2.5 Fourier reconstruction of seismic data offset (m) 1000. 1500. 0.9. 0.5. 1.0. 1.0. 1.1. time (s). time (s). 0. 500. 25. 1.5. 2.0. 200. 300. offset (m). 400. 500. 1.2. 1.3. 2.5. 1.4. 3.0. (a) 0. (b) 1000. 1500. 20. 40. frequency (Hz). frequency (Hz). 20. 500. offset (m). 60. 0. 200. 300. offset (m). 400. 500. 20. frequency (Hz). -0.04 0. wavenumber (1/m) -0.02 0 0.02. 40. 40. 60. 60. 80. 80. 80. (c). (d). (e). Figure 2.3: Marine shot-record in (a) t-x and (b) a close-up. Same shot-record in the (c) f -k. and (d) f -x domains. (e) Close up in f -x..

(34) Fourier reconstruction with minimum norm inversion. 26 200. 300. 400. 500. 0.9. 400. 500. 0.9. 1.1. 1.1. 1.1. 1.2. time (s). 1.0. 1.2. 1.3. 1.3. 1.4. 1.4. 1.4. (a). 300. offset (m). 400. 500. 0. 200. (b). 300. offset (m). 400. 500. 0. frequency (Hz). 20. 40. 40. 60. 60. 80. 80. 80. (d) -0.02. (e). wavenumber (1/m) 0 0.02. -0.04 0. -0.02. -0.04 0. 40. 40. 40. frequency (Hz). 20. frequency (Hz). 20. 80. 60. 80. (g) Figure 2.4:. 200. (c). (f). wavenumber (1/m) 0 0.02. 20. 60. offfset (m). 400. 500. 300. offset (m). 400. 500. 40. 60. -0.04 0. 300. 20. frequency (Hz). 200. 200. 1.2. 1.3. 20. frequency (Hz). 300. 1.0. 0. frequency (Hz). 200. offset (m). 1.0. time (s). time (s). 0.9. offset (m). -0.02. wavenumber (1/m) 0 0.02. 60. 80. (h). (i). (a) Non-uniformly sampled shot-record with maximum gap of two traces. (b) Band-limited NDFT reconstruction. This reconstruction works only for low frequencies and for small gaps. (c) Band-limited FRMN reconstruction. (d-f ) f -x representation of (a-c), (g-i) f -k representation of the complete shot-record corresponding to (a-c). The NDFT in (g) is a distorted estimate of the true coefficients in Fig. 2.3(c). The (h) NDFT and (i) FRMN estimate contains less artifacts, which explains the better reconstruction..

(35) 2.5 Fourier reconstruction of seismic data 200. 300. 400. 500. 0.9. 300. 400. 500. 0.9. 1.1. 1.1. 1.1. 1.2. time (s). 1.0. 1.2. 1.3. 1.3. 1.4. 1.4. 1.4. (a). 300. offset (m). 400. 500. 0. 200. (b). 300. offset (m). 400. 500. 0. frequency (Hz). 20. 40. 40. 60. 60. 80. 80. 80. (d) -0.02. (e). wavenumber (1/m) 0 0.02. -0.04 0. -0.02. -0.04 0. 40. 40. 40. frequency (Hz). 20. frequency (Hz). 20. 80. 60. 80. (g). 200. (c). (f). wavenumber (1/m) 0 0.02. 20. 60. 400. 500. 300. offset (m). 400. 500. 40. 60. -0.04 0. 300. 20. frequency (Hz). 200. 200. offset (m). 1.2. 1.3. 20. frequency (Hz). 200. 1.0. 0. frequency (Hz). 27 offset (m). 1.0. time (s). time (s). 0.9. offset (m). -0.02. wavenumber (1/m) 0 0.02. 60. 80. (h). (i). Figure 2.5: As Fig. 2.4 but with maximum gap of seven traces. (a) Non-uniformly sampled shot-record, (b) band-limited NDFT reconstruction. (c) band-limited FRMN reconstruction. The FRMN reconstruction is good for the small gaps and for low frequencies in the large gap. (d-f ) f -x representation of (a-c), (g-i) f -k representation of (a-c)..

(36) 28. Fourier reconstruction with minimum norm inversion.

(37) 3 Fourier reconstruction with sparse inversion. The previous chapter introduced the method of Fourier reconstruction with minimum norm inversion (FRMN) to reconstruct non-uniformly sampled seismic data. In this chapter Fourier reconstruction with sparse inversion (FRSI) is introduced as a new method to achieve the first two goals set in Section 2.5, which were (1) to improve reconstruction in large gaps and at higher temporal frequencies, and (2) to make the result less dependent on the spatial bandwidth used in the inversion. In Section 3.1 it will be discussed how penalized likelihood estimation with nonquadratic model penalty functions yields estimators that outperform the minimum norm estimators when the model contains only a few relevant parameters. There are many non-quadratic functions that can be used to construct such sparse inversion estimators and in Section 3.1.2 it is discussed how and why they work. The type of non-quadratic function we use is discussed in Section 3.1.3. By defining the FRSI method in the spatial domain in Section 3.1.4, it becomes apparent that Fourier reconstruction with sparse inversion consists of a bandlimited sinc interpolation combined with an interpolation filter that ensures reconstruction where the sinc interpolation fails. Section 3.2 shows FRSI applied to a non-uniformly sampled 1D time series with a sparse and smooth spectrum. Also the issue of the maximum gap size that can be reconstructed is addressed. The section is concluded with a resolution analysis of a 1D reconstruction problem. The algorithmic implementation is discussed in Section 3.3, followed by a small 2D image reconstruction example in Section 3.4. Sparse inversion estimators have been developed in other fields and some of these are discussed in Section 3.5..

(38) 30. Fourier reconstruction with sparse inversion. 3.1 Sparse inversion The least-squares estimator in FRMN uses minimum norm or damped least-squares inversion to handle ill-conditioned inversions. As the damping coefficient increases, the Euclidean model norm decreases and the resulting model will be smooth in the sense that energy tends to be spread out over the model in order to reduce the model norm. This is acceptable when the number of parameters is small or when the model space is large and smooth, but sometimes sparse or parsimonious models, i.e. models with minimum structure are more desirable. For instance when the data consists of a single sinusoid it would make no sense to expect smoothness in the Fourier domain, because there is only one relevant Fourier coefficient. Non-quadratic norms. Sparse models can be found with non-quadratic regularization terms that penalize smooth and favor sparse models. In the nineties researchers in various fields experimented with a wide variety of non-quadratic regularization terms. This was made possible by the increase in computing power as well as by new and efficient algorithms for non-linear inversion. Examples can be found for a variety of applications, such as statistical regression [Breiman, 1995; Tibshirani, 1995; George and Foster, 1997], signal processing [Sacchi et al., 1998], neural networks and machine learning [Grandvalet, 1998; Roth, 2001] and wavelet thresholding [Gao, 1998; Fan and Li, 2001]. Minimum entropy deconvolution. One of the earliest applications of non-quadratic penalty functions in geophysics was in minimum entropy deconvolution [Wiggins, 1978], in which the objective is to obtain a reflectivity series consisting of the smallest number of large spikes (maximum order) consistent with the data1 . Instead of minimizing entropy, other non-quadratic functions of the model parameters can be used to obtain a sparse reflectivity series in the model domain, such as the `1 norm [Taylor et al., 1979; Levy and Fullager, 1981]. Non-quadratic norms and high-resolution Radon. Non-quadratic norms are also frequently used in geophysics in combination with the hyperbolic or parabolic Radon transform. A sparse Radon space helps to discriminate moveout of seismic reflections, which benefits velocity analysis and multiple 1 This is in contrast to maximum entropy methods that yield a reflectivity series with a white spectrum or minimum order..

(39) 3.1 Sparse inversion. 31. removal methods. De Vries and Berkhout [1984] applied the minimum entropy norm to quantify resolving power in velocity analysis, and Thorson and Claerbout [1985] were one of the first to use of a sparse inversion approach: they used a mix of quadratic penalty functions to obtain a sparse hyperbolic Radon transform. Later Sacchi and Ulrych [1995] showed that such a composite model weight can be approximated by a single non-quadratic penalty function, which they used with the faster parabolic Radon transform to obtain coefficients that are much less contaminated by limited aperture artifacts. For this reason the method was called the high-resolution Radon transform. The non-quadratic penalty function was used by van Dedem and Verschuur [2000] in 3D multiple elimination and by Trad [2000] to construct a fast high-resolution hyperbolic Radon transform based on Stolt migration. Trad et al. [2003] give an overview of the sparse Radon transform methods. Other interesting examples of sparse inversion in seismic data processing include cross-hole tomography to detect hidden tunnels [Clippard et al., 1995] and least-squares migration to improve angle gathers [Liu et al., 2003]. Fourier reconstruction with sparse inversion. The fact that the high-resolution Radon transform can be used to reconstruct missing data and that the Radon and Fourier transforms are closely related, suggests that these non-quadratic model penalty functions can also be used to improve Fourier reconstruction. This is indeed the case as has been shown by Wajer et al. [1998]; Zwartjes and Duijndam [2000]; Zwartjes and Hindriks [2001]; Liu and Sacchi [2001]. This chapter explains how. 3.1.1. Penalized likelihood optimization. An ill-posed or ill-conditioned linear system p = A˜ p can be solved by means of regularization. This means that a model penalty term is included in the objective function, Eq. (2.3.16) −1. ˜ 0 )k22 , p−p J = kCn−1/2 (p − A˜ p)k22 + kCp˜ 2 (˜. (3.1.1). −1/2. where Cn and Cp˜ are the a priori noise and model covariance matrices. The objective function is minimized by −1 −1 −1 ˆ˜ = AH C−1 ˜0 . (3.1.2) AH C−1 p n p − Cp˜ p n A + Cp˜. 1 The noise covariance matrix is simplified to C−1 n = c2 W as discussed in Chapter 2. 2 In case of minimum norm inversion, Cp˜ = σp˜I, where σp2˜ is the a priori model vari˜ 0 = 0 and the estimator is written as ance. The a priori model is chosen as p !−1 2 c ˆ˜ = AH WA + I AH Wp . (3.1.3) p σp2˜.

(40) Fourier reconstruction with sparse inversion. 32. 3.1.2. Non-quadratic penalty functions. In sparse inversion a non-quadratic model penalty function is used in the objective function. The difference between the various sparse inversion estimators presented in literature is mainly in the model penalty term. In this section we will discuss how functions that are used in robust regression can be used to construct a sparse inversion estimator. This approach also yields some additional insight into how sparse inversion actually works. Robust regression. Robust regression is used for inverse problems where the data contains outliers. The basic idea of robust regression is to minimize a function ρ(x) of the residuals that is increases not as dramatically as a quadratic. The first derivative ψ(x) = ∂ρ(x) ∂x called the influence function, and it defines how much influence a single variable x has on the estimate. A robust estimator should have a bounded influence function and hence less weight for large values of x. A few suitable functions are listed in Table 3.1 and graphically shown in Fig. 3.1. The quadratic function has an unbounded influence function ψ(x) = x and large values have influence that is proportional to x. The absolute value function (`1 norm model penalty) has an influence function ψ(x) = sgn(x), which is a constant and hence it is much more robust. The functions `1−2 and Huber’s function (modified to have a continuous second derivative) are alternatives to the discontinuous absolute value function that behave like `2 for small values and like `1 for large values of x. A generalization of the `1 and `2 norms is the `p norm and for 1 < p < 2 the function is convex and more robust than `2 and according to Zhang [1997] p = 1.2 is a good choice. The Cauchy and Geman-McClure functions are part of a family of non-convex functions that are even more robust, because outliers have less influence than the ‘inliers’. The last function in Table 3.1 is a generalization of the weight functions obtained from the `1−2 , Cauchy and Geman function and is discussed in Section 3.1.3. Sparse inversion. In robust regression non-quadratic penalties are imposed on the residual. In sparse inversion they are imposed on the model parameters. The a priori model variance is determined by the non-quadratic model penalty function and is inversely proportional to the value of the weight Cx function in Table 3.1. Because non-quadratic penalty functions give less weight to large values (the basic idea behind robust regression), large parameters are assigned large variance and the small parameters small variance. Small variance corresponds to a large degree of belief that the a priori model is correct. Combined with an priori model that is zero these parameters are suppressed. Large variance means the parameters remain unknown and will be.