Hostname: page-component-cd9895bd7-mkpzs Total loading time: 0 Render date: 2025-01-03T14:26:45.552Z Has data issue: false hasContentIssue false

Modelling and peeling extended sources with shapelets: A Fornax A case study

Published online by Cambridge University Press:  16 July 2020

J. L. B. Line*
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Perth, WA6845, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D)
D. A. Mitchell
Affiliation:
CSIRO Astronomy and Space Science (CASS), PO Box 76, Epping, NSW1710, Australia
B. Pindor
Affiliation:
ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D) The University of Melbourne, School of Physics, Parkville, VIC3010, Australia
J. L. Riding
Affiliation:
ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D) The University of Melbourne, School of Physics, Parkville, VIC3010, Australia
B. McKinley
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Perth, WA6845, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D)
R. L. Webster
Affiliation:
ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D) The University of Melbourne, School of Physics, Parkville, VIC3010, Australia
C. M. Trott
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Perth, WA6845, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D)
N. Hurley-Walker
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Perth, WA6845, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO-3D)
A. R. Offringa
Affiliation:
Netherlands Institute for Radio Astronomy (ASTRON), 7991 PDDwingeloo, The Netherlands
*
Author for correspondence: Jack Laurence Bramble Line, E-mail: [email protected]
Rights & Permissions [Opens in a new window]

Abstract

To make a power spectrum (PS) detection of the 21-cm signal from the Epoch of Reionisation (EoR), one must avoid/subtract bright foreground sources. Sources such as Fornax A present a modelling challenge due to spatial structures spanning from arc seconds up to a degree. We compare modelling with multi-scale (MS) CLEAN components to ‘shapelets’, an alternative set of basis functions. We introduce a new image-based shapelet modelling package, SHAMFI. We also introduce a new CUDA simulation code (WODEN) to generate point source, Gaussian, and shapelet components into visibilities. We test performance by modelling a simulation of Fornax A, peeling the model from simulated visibilities, and producing a residual PS. We find the shapelet method consistently subtracts large-angular-scale emission well, even when the angular resolution of the data is changed. We find that when increasing the angular resolution of the data, the MS CLEAN model worsens at large angular scales. When testing on real Murchison Widefield Array data, the expected improvement is not seen in real data because of the other dominating systematics still present. Through further simulation, we find the expected differences to be lower than obtainable through current processing pipelines. We conclude shapelets are worthwhile for subtracting extended galaxies, and may prove essential for an EoR detection in the future, once other systematics have been addressed.

Type
Research Article
Copyright
Copyright © Astronomical Society of Australia 2020; published by Cambridge University Press

1. Introduction

Detecting primordial hydrogen during the Epoch of Reionisation (EoR) via the 21-cm line power spectrum (PS) has been a hotly pursued goal in astrophysics over the last decade, with projects on established instruments such as LOFAR (van Haarlem et al. Reference van Haarlem2013), Murchison Widefield Array (MWA) (Tingay et al. Reference Tingay2013), and PAPER (Parsons et al. Reference Parsons2010), and future projects on HERA (DeBoer et al. Reference DeBoer2017) and SKA_LOW (Dewdney et al. Reference Dewdney, Turner, Millenaar, Mccool, Lazio and Cornwell2013) all pushing for a detection. The signal promises to constrain cosmological models and fundamental physics (see Morales & Wyithe Reference Morales and Wyithe2010; Mellema et al. Reference Mellema2013; Pritchard & Loeb Reference Pritchard and Loeb2012, for reviews), but is swamped by intervening emission from other astrophysical objects, including radio-loud galaxies and diffuse synchrotron emission. These foregrounds can be up to $\sim\!5$ orders of magnitude brighter than the 21-cm signal. Recent work and PS limits from LOFAR (Mertens et al. Reference Mertens2020; Patil et al. Reference Patil2017), MWA (Barry et al. Reference Barry2019; Li et al. Reference Li2019; Trott et al. Reference Trott2020), and PAPER (Kerrigan et al. Reference Kerrigan2018) all point towards painstakingly precise foreground subtraction being needed to facilitate a detection, with a strong focus on spectral smoothness. As these instruments are interferometers, which make measurements in the Fourier transform of image space called ‘visibilities’, any foreground models benefit from easily being calculated in both image and visibility space.

Some radio-loud galaxies are nearby and cover a large angular extent on the sky (up to degrees in angle), and display rich morphologies, e.g. Centaurus A (McKinley et al. Reference McKinley2013). These sources present a modelling challenge, and are bright enough to necessitate careful subtraction when present in EoR data. The most straightforward prescriptions build extended models out of Dirac delta functions, which also lends synergy with CLEAN-based deconvolution imaging techniques. For the most extended objects, this requires a large number of components, which can become computationally demanding. Furthermore, as discussed in Yatawatta (Reference Yatawatta2010), the choice of pixel size in image deconvolution for extended emission affects the modelled emission.

An alternative modelling approach is to use so-called ‘shapelets’, a set of orthonormal basis functions consisting of weighed Hermite polynomials (Refregier Reference Refregier2003). These basis functions are attractive due to having analytically defined Fourier transforms, and so can be fitted in image space and then generated directly in visibility space. They have been used in a number of novel astronomical applications including modelling three-dimensional (3D) distribution of dust (Schechtman-Rook, Bershady, & Wood Reference Schechtman-Rook, Bershady and Wood2012), weak lensing measurements in simulations of radio images (Bacon et al. Reference Bacon, Rowe, Patel, Beswick, Abdalla and Smirnov2014), gravitationally lensed images (Tagore & Jackson Reference Tagore and Jackson2016), and classifying bent radio galaxies (Bastien, Oozeer, & Somanah Reference Bastien, Oozeer and Somanah2017). Furthermore, they can be scaled to be extended on the sky, and lend themselves to compression/truncation. Yatawatta (Reference Yatawatta2010) replaced CLEAN components with two-dimensional (2D) shapelet basis functions around Cygnus A during image-based deconvolution, and was able to increase the dynamic range in an image of Westerbork Synthesis Radio Telescope data by a factor of $\,\sim$ 50. Calibration packages such as SAGECAL Footnote a (Kazemi et al. Reference Kazemi, Yatawatta, Zaroubi, Lampropoulos, de Bruyn, Koopmans and Noordam2011; Yatawatta et al. Reference Yatawatta, Zaroubi, de Bruyn, Koopmans and Noordam2009) and the RTS (Mitchell et al. Reference Mitchell, Greenhill, Wayth, Sault, Lonsdale, Cappallo, Morales and Ord2008; Riding, Mitchell, & Webster Reference Riding, Mitchell and Webster2017) have the capability to use shapelet models, with SAGECAL shapelet models being used predominantly with LOFAR data, as described in the LOFAR cookbook.Footnote b

Recent advances in CLEAN algorithms and packages now allow multi-scale (MS) CLEANing (see Cornwell Reference Cornwell2008, and references therein) with the option to use Gaussians during the CLEAN, as well as point sources.Footnote c WSClean Footnote d (Offringa et al. Reference Offringa2014; Offringa & Smirnov Reference Offringa and Smirnov2017) is a cutting-edge imaging package capable of all the above, and has the ability to produce point source and Gaussian models. The main goal of this paper is to compare an MS CLEAN component model to a shapelet model, testing how effective and computationally efficient they are during calibration and peeling, with the primary metric being the 2D 21-cm PS as used by EoR experiments. We chose to use Fornax A as a case study, as it sits in the primary beam side lobes of one of the MWA EoR observational fields (see Jacobs et al. Reference Jacobs2016, for more details on MWA EoR processing pipelines), and as such must be peeled from the data.

In this paper, we introduce a new image-based shapelet fitting package, SHApelet Modelling For Interferometers (SHAMFI Footnote e, Section 2.1), which we use to test against MS CLEAN outputs from WSClean. To test the efficiency of generating each type of model, as well as their ability to accurately subtract EoR foreground emission, we have also developed the visibility simulator WODEN Footnote f (see Section 3). We first simulate data using WODEN to test each method in the absence of instrumental, atmospheric, and astrophysical contaminants. We then test each method on real MWA data using the RTS.

The paper is organised as follows. In Section 2, we introduce shapelets as basis functions, and introduce SHAMFI. In Section 3, we introduce the Graphics processing unit (GPU)-accelerated visibility simulator WODEN. In Section 4, we introduce a model from which to simulate Fornax A data, outline simulations of this model with WODEN, and detail fitting the simulations using SHAMFI. In Section 5, we introduce a method to peel with the simulated data, and present simulated peeling results, including testing truncation on both the shapelet and MS CLEAN models. In Section 6, we fit and peel using real data. We discuss our results in Section 7, and summarise in Section 8.

2. Shapelet fitting and SHAMFI

Shapelets as a basis function are introduced in Refregier (Reference Refregier2003), to which we refer the reader for a detailed overview; we briefly define them here for clarity within this paper. Shapelets are orthonormal basis functions based around hermite polynomials, and can be based in a polar or Cartesian co-ordinate system, the latter of which we use in this work. In one dimension, they are defined as

(1) \begin{equation} B_p(x;\, \beta) \equiv \beta^{-\frac{1}{2}} \left[2^p \pi^2 p\,! \right]^{-\frac{1}{2}} H_p({\beta^{-1}}x)\exp\left(\frac{-x^2}{2}{\beta^{2}}\right),\end{equation}

where $\beta$ is a scaling factor, p is a positive integer, and $H_p(x)$ is a hermite polynomial of order p. For our purposes, x is the abscissa in a Cartesian co-ordinate system. These basis functions can be converted to two dimensions by multiplication:

(2) \begin{equation} B_{p_1,p_2}(x,y ;\, \beta_1, \beta_2) = B_{p_1}(x;\, \beta_1)B_{p_2}(y;\, \beta_2),\end{equation}

where y is the ordinate in a Cartesian co-ordinate system. The Fourier transform $\Tilde{B}_{p_1,p_2}(k_x,k_y)$ of $B_{p_1,p_2}(x,y;\, \beta_1, \beta_2)$ is given by

(3) \begin{equation} \Tilde{B}_{p_1,p_2}(k_x,k_y) = i^{p_1 + p_2}B_{p_1,p_2}(k_x,k_y ;\, \beta_1^{-1}, \beta_2^{-1}),\end{equation}

meaning basis functions fitted in image space can be analytically calculated directly in u, v space, as $k_x,k_y \propto u,v$ . One further parameter of interest is the highest order basis function to fit for, $p_{\text{max}} = p_1 + p_2$ , which sets the maximum resolution that can be fitted for. Given a maximum (object size) and minimum angular extent (point spread function (PSF) resolution) to be fitted, $\vartheta_{\text{max}},\vartheta_{\text{min}}$ , Refregier (Reference Refregier2003) shows that good starting points for $\beta$ and $p_{\text{max}}$ are

(4) \begin{align}\beta \approx (\vartheta_{\text{min}}\vartheta_{\text{max}})^{\frac{1}{2}} \quad \textrm{and} \end{align}
(5) \begin{align}p_{\text{max}} \approx \frac{\vartheta_{\text{max}}}{\vartheta_{\text{min}}} - 1. \end{align}

The consequence of setting $p_{\text{max}}$ is that the total number of basis functions to fit, $p_{\text{tot}}$ , grows as

(6) \begin{equation}p_{\text{tot}} = \frac{(p_{\text{max}}+2)(p_{\text{max}}+1)}{2},\end{equation}

meaning that the number of basis functions required grows quickly with increasing model resolution and size.

2.1. Fitting with SHAMFI

The simplest method to fit the basis functions in image space is to set up a number of linear equations, where for an image with $s \times t$ pixels, each pixel value P at location x, y can be described by

(7) \begin{equation} \begin{bmatrix} B_{0,0}(x_0,y_0) & \quad \cdots & \quad B_{p_{\text{max}},0}(x_0,y_0) \\ \vdots & \quad \ddots & \quad \vdots \\ B_{0,0}(x_s,y_t) & \quad \dots & \quad B_{p_{\text{max}},0}(x_s,y_t) \end{bmatrix} \begin{bmatrix} C_{0,0} \\ \vdots \\ C_{p_{\text{max}},0} \end{bmatrix} = \begin{bmatrix} P_{0,0} \\ \vdots \\ P_{s,t} \end{bmatrix},\end{equation}

where $C_{p_1,p_2}$ is a scalar coefficient for the basis function $B_{p_1,p_2}$ , for all basis functions where $p_1 + p_2 \leq p_{\text{max}}$ . This reduces the problem to the well-studied linear matrix $\mathbf{A}\mathbf{x} = \mathbf{b}$ . For convenience, we use the python function numpy.linalg.lstsq,Footnote g which solves for $\mathbf{x}$ by minimising the Euclidean two-norm $|| \mathbf{b} - \mathbf{A} \mathbf{x} ||^2$ .

We choose to fit the basis functions in image space, as it is far easier to isolate emission from a single object. Each visibility is the sum of the emission of all sources within the field of view, which is of particular concern for large field-of-view instruments such as the MWA. One can simply crop or edit in image space to remove difficult/unwanted pixels. CLEANed restored images offer an easy data set to fit, as the restoring beam tends to smooth out any sharp model pixelisation effects from the CLEAN fitting procedure. By design, these images are ‘true’ sky emission convolved with a Gaussian kernel known as the ‘restoring beam’, which is an approximation of the synthesised PSF. The synthesised PSF is the image-based manifestation of the incomplete sampling of an interferometer in the Fourier transform space in which it samples data, and is the reason images must be deconvolved. For a more accurate and instrument-independent shapelet fit, one must account for this. Rather than deconvolve the CLEANed image however, we achieve the equivalent by convolving the basis functions themselves by the restoring beam kernel, as outlined in the LOFAR cookbook mentioned in Section 1.

Right-ascension (RA) and declination ( $\delta$ ) are related to x, y via a rotation by the position angle $\phi_{\text{PA}}$ , defined as the increasing angle to East from North. Aside from fitting $C_{p_1,p_2}$ , $\beta_1$ , $\beta_2$ , and $\phi_{\text{PA}}$ are all variables. Minimising $|| \mathbf{b} - \mathbf{A} \mathbf{x} ||^2$ for many values of $\beta_1,\beta_2$ , and $\phi_{\text{PA}}$ is computationally expensive. We instead settle for an initial 2D-Gaussian fit [defined in Equation (10)] to the entire object to determine $\phi_{\text{PA}}$ , and then fit a range of $\beta_1,\beta_2$ in a grid, using the residuals of each fit to determine the optimal parameters. Examples of these steps are given in Sections 4.2 and 6.1.

3. WODEN

To test the MS CLEAN and shapelet fitting methods, we first use simulated data to isolate fitting errors from the plethora of calibration and instrumental effects present in real data. To enable a fair comparison of the speed of point source, Gaussian, and shapelet visibility model generation, we found it necessary to write a bespoke simulator (WODEN), which we detail here. This gave us control over the optimisation of each generation method, allowing consistency. We found when testing model generation within the RTS that the architecture of the code is optimised for shapelet models, rather than models built with multiple point/Gaussian components, due to the original needs of the calibration design at the inception of the RTS. Work is underway to update the architecture but remains ongoing.

3.1. Point source generation

WODEN analytically generates a sky model for calibration directly in visibility space via the measurement equation (c.f. Thompson, Moran, & Swenson Reference Thompson, Moran and Swenson2001)

(8) \begin{align} & V(u,v,w) = \\[4pt] & \int \mathcal{B}(l,m) I(l,m) \exp[-2\pi i(ul + vm + w(n-1))] \frac{dldm}{n},\nonumber \end{align}

where V(u, v, w) is the measured visibility at baseline co-ordinates u, v, w, given the sky intensity I(l, m) and instrument beam pattern $\mathcal{B}(l,m)$ , which are functions of the direction cosines l, m, with $n=\sqrt{1-l^2-m^2}$ . Equation (9) is discretised for point sources such that

(9) \begin{align} & V(u_i,v_i,w_i) =\\[5pt] & \sum_j \mathcal{B}(l_j,m_j)I(l_j,m_j) \exp[-2\pi i(u_il_j + v_im_j + w_i(n_j-1))],\nonumber \end{align}

where $u_i,v_i,w_i$ are the visibility co-ordinates of the $i{\text{th}}$ baseline, and $l_j$ , $m_j$ , $n_j$ is the sky position of the $j{\text{th}}$ point source.

This is an embarrassingly parallel problem, well suited to optimisation on GPUs. The basic application of a GPU to a problem such as this is to calculate all iterations over i, j in parallel, and then sum over j after the fact. While this efficiently leverages the parallel computational structure of the GPU, it requires large arrays to store the $i \times j$ outputs, which can quickly use large amounts of GPU memory when j becomes large.

In recent years, the CUDA GPU-language (Nickolls et al. Reference Nickolls, Buck, Garland and Skadron2008) has made progress in making efficient memory-managed read–modify–write operations on a single address, or so-called atomic operations.Footnote h These allow calculations to be made in parallel, and the results written out to a single address with ‘hardware-level’ thread-safe management, with minimal slowdown. We make use of atomicAdd, which adds the output of a thread to a single memory address, to perform the summation over j in Equation (10). This reduces the size of the array needed to store outputs significantly.

We make no attempt to add any instrumental or ionospheric effects, and write WODEN to simply evaluate Equation (10) with $\mathcal{B}(l,m) \equiv 1$ for a given array layout and observation parameters. We use WODEN here simply to test the modelling efficiency and model-generation costs of both the MS CLEAN and shapelet methods.

3.2. Gaussian and shapelet generation

To simulate peeling using both shapelets and MS CLEAN outputs, we add the functionality to simulate Gaussians and shapelets into WODEN. We define a 2D Gaussian in image space as

(10) \begin{equation}G(x,y) = \exp \left( -4\ln(2)\left[\frac{(x - x_0)^2}{\theta_{\text{maj}}^2} + \frac{(y - y_0)^2}{\theta_{\text{min}}^2} \right] \right),\end{equation}

where $x_0,y_0$ is the central point, and $\theta_{\text{maj}},\theta_{\text{min}}$ are the full-width half-maximum (FWHM) major and minor axes. Right-ascension and declination are related to x, y via a rotation by the position angle $\phi_{\text{PA}}$ . We utilise the RTS methodology of inserting a visibility ‘envelope’ $\xi$ into Equation (10) to create a Gaussian or shapelet source:

(11) \begin{multline}V(u_i,v_i,w_i) = \\\sum_j \xi_j(u_i,v_i)I(l_j,m_j) \exp[-2\pi i(u_il_j + v_im_j + w_i(n_j-1))],\end{multline}

where for a Gaussian,

(12) \begin{align}\xi_j = \exp\left( -\frac{\pi^2}{4\ln(2)} \left( k_x^2\theta_{\text{maj}j}^2 + k_y^2\theta_{\text{min}j}^2\right) \right), \end{align}
(13) \begin{align}k_x = \cos(\phi_{PAj})v_i + \sin(\phi_{PAj})u_i, \end{align}
(14) \begin{align}k_y = -\sin(\phi_{PAj})v_i + \cos(\phi_{PAj})u_i, \end{align}

[noting that as visibilities are the Fourier dual of image space, the normalisation factor is inverted from Equation (10)] and for a shapelet,

(15) \begin{align}\xi_j = \sum^{p_k +p_l < p_{\text{max}}}_{k,l} C_{p_k,p_l} \Tilde{B}_{p_k,p_l}(k_x,k_y), \end{align}
(16) \begin{align}k_x = \frac{\pi}{\sqrt{2\ln(2)}} \left[\cos(\phi_{PA})v_{i,j} + \sin(\phi_{PA})u_{i,j} \right], \end{align}
(17) \begin{align}k_y = \frac{\pi}{\sqrt{2\ln(2)}} \left[-\sin(\phi_{PA})v_{i,j} + \cos(\phi_{PA})u_{i,j} \right], \end{align}

where $u_{i,j},v_{i,j}$ are visibility co-ordinates for baseline i, calculated with a phase-centre $RA_j,\delta_j$ , which corresponds to the central position $x_0,y_0$ used to fit the shapelet model in image space. Inserting the visibility envelope $\xi$ in this way causes a convolution in visibility space, which in turn causes a multiplication in image space. Using Equation (10) causes a convolution with a Gaussian kernel that sums to one, meaning the visibility can simply be multiplied by I(l, m) to give the correct flux density. When using Equation (15), one must scale the coefficients $C_{p_k,p_l}$ to ensure the kernel sums to one, to use Equation (11) for both Gaussian and shapelet components. In Equation (15), then we set $C_{p_k,p_l} \equiv C_{p_k,p_l} / I(l_j,m_j)$ , and note that the coefficients output by SHAMFI have this division by integrated flux density applied, hence Equation (15) is implemented in WODEN exactly as it appears.

The shapelet basis function values $\Tilde{B}_{p_k,p_l}(u,v)$ can be calculated by interpolating from one-dimensional (1D) look-up tables of $\Tilde{B}(k_x;\,1)$ [c.f. Equation (1)], and scaling by the appropriate $\beta$ . When using the scalings presented in Equations (13), (14), (16), and (17), the first-order shapelet basis function is equal to a 2D Gaussian when $\beta_1=\theta_{\text{maj}}, \beta_2=\theta_{\text{min}}$ , explicitly $G(x,y,\theta_{\text{maj}},\theta_{\text{min}}) \equiv B_{0,0}(x,y,\theta_{\text{maj}},\theta_{\text{min}})$ . This is useful to test the consistency between simulating Gaussians and shapelets.

For all simulation bench-marking used in this paper, we use a GeForce GTX 1080 Ti NVIDIA GPU, with 12 GB of memory. This allows us to store all time steps in GPU memory, meaning the optimisation over j includes all time steps. The net result of these optimisations is that per frequency channel, we launch a single CUDA kernel instance each for all point sources, Gaussians, and shapelets, all of which are optimised with a consistent strategy. As u, v, w only differ in frequency by a scaling factor, we only calculate them once per time step.

4. Simulating and modelling Fornax A

4.1. Very Large Array (VLA) Fornax A Model

The goal of this work is to compare shapelet and MS CLEAN methods to generate models of extended radio galaxies. To fairly test the two methods as implemented in SHAMFI and WSClean, in reality one would start with calibrated visibility data, deconvolve and image this data using WSClean to create MS CLEAN components, and create an image to use when modelling with SHAMFI. This would ensure both methods use the same data set, with the same angular resolution, and both suffer any systematics inherent to the data. As stated in Sections 1 and 3, we choose to test the methods first using simulated data, to reduce the number of systematics. To do this, we need to create simulated ‘observed’ data in visibility space. We choose to simulate observed visibility data using point sources alone, to minimise biasing towards either method. To build a realistic and high-resolution model of a complex and extended radio galaxy, we use a public VLA image of Fornax A at 1.4 GHz, obtained from the NASA/IPAC Extragalactic Database,Footnote i a digitised version of the data presented in Fomalont et al. (Reference Fomalont, Ebneter, van Breugel and Ekers1989). The online FITS file is a cut-out around the galaxy, containing some CLEAN residuals. To create a point source model, we first apply a Gaussian taper around the edge of the image (see Figure 1). Not only does this reduce CLEAN residuals around the image, but also removes the sharp boundary edge, minimising future ringing when imaging from the simulated visibility data. We then move the image from B1950 to J2000 co-ordinates, convert each pixel into a point source [explicitly we convert each pixel into a Dirac delta function, allowing us to run Equation (10) over all pixels], and scale the flux density of all pixels to sum 500 Jy at 180 MHz. We give each pixel a single spectral index of $-0.8$ . As the restoring beam of this CLEAN-ed image is smaller than the resolution of the instruments used in simulation, we make no attempt to remove it from the image.

Figure 1. Upper panel: Fornax A image obtained from NED, which includes CLEAN residuals around the edge of the lobes. Over-plotted in red is a boundary line outside of which a Gaussian taper was applied to create the model used in WODEN. Lower panel: The image after applying the taper and transforming from B1950 to J2000 centred coordinates. Any pixels that were not masked (the grey region) in the lower image were converted into point sources.

We simulate six MWA observations of 30.72 MHz bandwidth and 10 kHz spectral resolution using WODEN. These simulations use the observational parameters of the real data used to image Fornax A in Section 6.1. As the MWA has recently been upgraded with added receiver elements, dubbed Phase II, it can be operated in various configurations, with different angular resolutions and sensitivities to diffuse or compact emission (see Wayth et al. Reference Wayth2018, for details). Three of the observations use the Phase I configuration (maximum baseline $\sim3\,$ km), with the other three the Phase II extended configuration (maximum baseline $\sim$ 5.5 km, hereto referred to as Phase II). The Phase I observations have 2 s time resolution and the Phase II observations 0.5 s resolution to reduce time decorrelation on the longer baselines. We simulate 56 time steps for all observations regardless of the time resolution to reduce computational load. We use WSClean to perform an MS CLEAN on the simulated data, with the results shown in Figure 2. These images have not been beam corrected, given that WODEN does not include beam effects. We include an example WSClean command in Figure A1 to detail the settings used. We also include all software versions used during this research in Table B1.

Figure 2. MS CLEANed images of the WODEN simulation of Fornax A. Top: All Phase I configuration, imaged with briggs 0 weighing to balance the large number of short baselines present in Phase I data with a reasonable resolution. Bottom: A combination of both Phase I and Phase II configurations. These are imaged with uniform weighing to take advantage of the higher resolution Phase II data.

4.2. Modelling the Fornax A simulation

We fit the simulated Fornax A in the images in Figure 2 using SHAMFI as described in Section 2, and use Equation (5) to set $p_{\text{max}}$ . We set $\vartheta_{\text{max}} = 0.5^{\circ}$ , and set $\vartheta_{\text{min}}$ by oversampling the angular resolution of the array by 3, giving $\vartheta_{\text{min}} = \lambda / 3b_{\text{max}}$ , where $\lambda$ is the wavelength of the observation, and $b_{\text{max}}$ the maximum baseline of the array. We set the latter to 3 km for the Phase I array and 5.5 km for the Phase II array. This yields $p_{\text{max}} = 46$ for Phase I and $p_{\text{max}} = 86$ for Phase II.

Figure 3. Left: Example of initial Gaussian fit used to set $\phi_{\text{PA}}$ of the basis functions for one of the simulated Phase 1+2 lobe images. The red line shows the FWHM, with the white lines demonstrating the PA found. Right: Example of the grid-based approach for fitting $\beta_1$ and $\beta_2$ . The colour scale here represents the residuals in (Jy/pixel) $^2$ left after fitting the image in the left panel with all basis functions up to $p_{\text{max}} = 86$ .

Figure 4. Left: MS CLEANed Phase 1 simulated data Middle: Fitted shapelet model recreated in image space, using restoring beam convolved basis functions Right: The model subtracted from the data. The footprints of the two separate shapelet models (one for each lobe) are clearly visible in the residual plot.

During modelling, we found the choice of location of the zero pixel $x,y = 0,0$ of the basis functions severely affected the quality of fit. We found that for a very extended, double-lobed radio galaxy like Fornax A, the best solution was to split the galaxy into two lobes and fit each lobe separately. To avoid double-fitting flux, we divided the image in two by fitting each lobe with a normalised 2D Gaussian, which we label $N_1(x,y)$ , $N_2(x,y)$ , and then applying weights $w_1(x,y),w_2(x,y)$ to each pixel such that $w_1 = N_1 / (N_1 + N_2)$ , $w_2 = N_2 / (N_1 + N_2)$ . Figure 3 shows an example of fitting for $\phi$ and the grid-search optimising for $\beta_1,\beta_2$ . The fitting results for the simulated Phase I data are shown in Figure 4.

As is also prescribed in the LOFAR cookbook, we found it necessary to subtract any compact point source-like emission from the images including Phase II data, which reduced the number of higher-order shapelets required for a good fit. We do this manually by subtracting image-based Gaussian components, as we found traditional source finders struggle to deal with the complexity of Fornax A to successfully isolate the point-like details in the image. A full example of the image preparation needed to model Fornax A is given in Section 6.1. We note the tools to split a radio galaxy into Gaussian masked regions, and manually subtract Gaussians, are included in the SHAMFI package.

5. Simulated peeling results

The main motivation behind this paper is to understand how modelling Fornax A might impact a PS estimation of the EoR through residuals left after peeling. We therefore only test peeling on Phase I data here, as the Phase II layout does not have the necessary short baselines required to detect the EoR signal. To mimic peeling, we simulate the full point source model from Section 4.1, the MS CLEAN component models from the images in Figure 1, and the fitted shapelet models described in Section 4.2, with the same time and frequency cadence (8 s, 80 kHz), all phase-centred on Fornax A, for a single 2-min observation. We then fit a complex gain per tile (as we simulate no beam, we have no polarisation, and so only need to fit a single complex gain) for each set of MS CLEAN and shapelet visibilities to the full simulation, apply the gains to the model, and subtract from the full simulated visibilities. While no antenna gains were ever added during the simulation, real peeling includes a calibration step, so we include that step in our analysis. We reiterate that we have also not added thermal noise to the simulations, making this a strongly idealised case.

5.1. Peeling method on simulated data

To fit the gains, we follow the calibration scheme implemented in YANDAsoft.Footnote j As this scheme has yet to be published, we detail the formalism here. We label the ideal visibility model for baseline ij as $\mathcal{I}_{ij}$ . The visibility model $M_{ij}$ (over the range of time and frequency that antenna gains are assumed constant) is

(18) \begin{equation} M_{ij}(t,\nu) = g_ig_j^* \mathcal{I}_{ij}(t,\nu).\end{equation}

where $g_i$ is the gain on antenna i. The measured visibility $V_{ij}$ is

(19) \begin{equation} V_{ij}(t,\nu) = (g_i+e_i)(g_j+e_j)^*\mathcal{I}_{ij}(t,\nu),\end{equation}

where $e_i$ is an additive error on the gain $g_i$ . We label the residual visibilities as

(20) \begin{equation} R_{ij}(t,\nu) = V_{ij}(t,\nu) - M_{ij}(t,\nu).\end{equation}

The gains can be optimised by iteratively solving for and following local error gradients. These are found using linear least-squares with data $\Re(R)$ and $\Im(R)$ and free parameters $\Re(e)$ and $\Im(e)$ . The complex conjugation of gains in the equations above cannot be simply represented by a linear operator. So rather than solving for complex solutions, we instead solve for the real and imaginary parts separately and multiply the relevant coefficients by $-1$ . Assuming the error-squared term is small allows the problem to be linearised as $\mathbf{A}\mathbf{x} = \mathbf{b}$ , where $\mathbf{A}$ is the design matrix, $\mathbf{x}$ are the parameters, and $\mathbf{b}$ is the residual vector, e.g. the real parts of the residual vector $\mathbf{b}$ are

(21) \begin{equation} \Re[R_{ij}(t,\nu)] = \Re[ e_j^*g_iI_{ij}(t,\nu) + e_ig_j^*I_{ij}(t,\nu)].\end{equation}

This can be implemented to solve for all $e_i$ in the vector $\mathbf{x}$ as

(22) \begin{equation}\mathbf{x} = \left[ \mathbf{A}^\intercal \mathbf{A} \right]^{-1}\mathbf{A}^\intercal \mathbf{b}.\end{equation}

The assumption of linearity will improve as the system converges. We found that as the simulations had no instrumental or source side-lobe noise, the gains consistently converged after 10 iterations of the calibration loop, and so used 10 iterations for all the results shown in this and following sections. We fit a separate gain for each time and frequency step. Once we have the gains, we then fit a cubic spline across the entire bandwidth using the scipy.interpolate.interp1d Footnote k function, to ensure we apply spectrally smooth gains during peeling. This was found to be necessary for an EoR detection by Mouri Sardarabadi & Koopmans (Reference Mouri Sardarabadi and Koopmans2019), as without smooth gains, the EoR signal can be suppressed. The caveat to this being that we are assuming a linearised calibration case, where spectrally smoothing the gains does yield improvement. In the non-linear regime, this may not be true; however given that these simulations have zero noise and instrumental effects, the error terms in calibration are likely small.

Figure 5. Simulated peeling results. Plots (a) to (d) show MS CLEANed peel residuals of an integration over the full bandwidth and time span of a single simulated Phase 1 observation, each peeled with a different model for Fornax A. The models are: (a) a shapelet model from Phase I data; (b) a shapelet model from Phase 1 and 2 data; (c) an MS CLEAN model from Phase I data; (d) an MS CLEAN model from Phase I and II data. A contour plot of the unpeeled power is shown on each plot as a angular scale reference. Plot (f) shows the 1D PS as estimated with CHIPS for the models in (a) to (d). For reference, the power without peeling is shown for the unpeeled WODEN simulation in (e). The line style and marker for each 1D PS plotted in (f) is also plotted in each plot from (a) to (d) on the lower right for reference.

5.2. Peeling results

The results of peeling all four models from the full simulation are shown in Figures 5 and 6. We image the peeling residuals in Figure 5 using the same WSClean commands used to image the full simulation in Figure 2. We make 1D (Figure 5) and 2D (Figure 6) PS estimates using CHIPS (Trott et al. Reference Trott2016), and note that the values of the powers in Figures 5 and 6 are indicative, rather than absolute. This is because CHIPS grids using the primary beam shape of the MWA, and normalises the outputs based on the expected cosmological observing volume of the telescope. Neither are present in WODEN simulations. All PS estimates were run with identical settings however, and so all differences are solely due to the models used for peeling.

Figure 6. The 2D PS as estimated by CHIPS of the simulated peeling results. Plots (a) and (b) show the shapelet model peel results for Phase I and Phase I + II, with (c) showing the difference between the 2D PS in (a) and (b). The MS CLEAN peel results are shown in a similar manner in plots (d), (e), and (f). In the difference PS of (c) and (f), blue means the Phase I model subtracted less power than the Phase I + II model, and red means the Phase I model subtracted more power.

Figure 7. Results from compressing the MS CLEAN and shapelet models. Plots (a), (b), and (c) on the left show results from Phase I models, where (d), (e), and (f) on the right show results from Phase I + II models. The top row show residual 1D PS left after peeling MS CLEAN models (a) and (d) and shapelet models (b) and (e), at various levels of truncation. The percentage in the legend is the percentage of remaining components after truncation. We take slices through the truncation results at specific low k values, shown by the vertical dotted lines, and plot them as a function of GPU time in the bottom row (c) and (f). In both plots, dashed lines are MS CLEAN, and solid lines are shapelet results. Matching colours plot matching k-modes.

Figure 8. Stages of fitting for the Fornax A image using real MWA data as described in Section 6.1 and shown in (a). Four compact sources (outlined in red circles) were found through trial and error to cause fitting problems, and so were subtracted as Gaussians. The image was then split into two lobes, (b) and (c), and fitted separately. (d) shows the fitted shapelet model recreated in image space, using restoring beam convolved basis functions, as well as the subtracted Gaussians. (e) shows the fitting residuals.

The 1D PS is an average over the Fourier inverse (or k-space) of three dimensions; two angular dimensions across the sky, and one derived from the spectral dimension of the data. The 1D PS is currently used to generate upper limits on an EoR detection. In Figure 5, we include a fiducial EoR signal as a point of reference to the reader. This is taken from the public websiteFootnote l associated with the work of Mesinger, Greig, & Sobacchi (Reference Mesinger, Greig and Sobacchi2016). We select a redshift that matches the centre of the simulated frequency bandwidth. The greatest ratio of expected EoR signal to astrophysical foregrounds, after foreground removal/avoidance, is expected to lie at low k-modes in the 1D PS. In Figure 5, we concentrate on the smallest k-modes. We find little difference when peeling with the Phase I MS CLEAN and shapelet models (purple line with squares, and blue line with circles, respectively). We see a difference in behaviour in the Phase I + II models; however, including Phase II data in the model improves the subtraction at small k with the shapelet model, but actually worsens the subtraction with the MS CLEAN model.

To understand the difference, we turn to the 2D PS, shown in Figure 6. The 2D PS averages the k-modes obtained from angular scales upon the sky ( $k_\perp$ , horizontal axis), and plots them against the k-modes derived from the spectral response ( $k_\parallel$ , vertical axis). Broadly, astrophysical foregrounds are expected to be spectrally smooth, and so should inhabit the bottom of the 2D PS, where $k_\parallel$ is small. Instrumental chromaticity causes the foregrounds to be swept up at higher $k_\perp$ into the so-called ‘wedge’ on the lower right, with the upper left being known as the ‘window’, offering the cleanest area of k-space in which to attempt a detection. If the peeled residuals were perfectly spectrally smooth, we would expect this window to be devoid of emission; however, there is clearly power present. There are two processes that could have added in spectral structure: the calibration step during peeling, and the gridding kernel of CHIPS, which expects the data to have the imprint of an MWA beam pattern, which was not included in the simulation. Visibilities with quickly changing spectral structure have the effect of throwing power that would normally sit at low $k_\parallel$ up into high $k_\parallel$ , contaminating the window. This effect can happen in real data, and the net effect is more power in the wedge at a certain $k_\perp$ will also throw more power into the window at that $k_\perp$ .

This effect goes some way to explaining the difference in 1D power when adding higher angular resolution data into the modelling process between shapelets and MS CLEAN. When looking at the difference of the Phase I and Phase I + II shapelet model residuals [Figure 6(c)], we see more power has been subtracted out of the wedge at all $k_\perp$ beside the very smallest $k_\perp$ -bin, which has meant there is less power in the window also. Comparing the MS CLEAN models, the introduction of higher angular resolution data has yielded a large improvement in the power subtracted at large $k_\perp$ (small angular scales), but has worsened the subtraction at low $k_\perp$ , contaminating the window. It is difficult to isolate what might cause worsening subtraction at large angular scales with MS CLEAN. It is possible that as including the Phase II configuration reduces the angular size and shape of the synthesised PSF, an MS CLEAN is more biased to smaller angular scales. There are two WSClean parameters that can be manually set to change the MS CLEAN settings: -multiscale-scales, which forces the CLEAN to specific scales, and -multiscale-scale-bias, which biases the CLEAN to larger or smaller scales. We tested forcing the scales that were CLEAN-ed for the Phase I + II image to be the same as the scales CLEAN-ed for the Phase I alone (plus an extra smaller scale to account for the increase in resolution). We also changed -multiscale-scale-bias to favour larger scales. We propagated both these changes through to the PS, and found either no change in residuals, or worsened residuals at all scales. While not an exhaustive parameter search, we find the default settings give the best results, and so use them throughout the paper. What is clear is that the SHAMFI model was able to incorporate the increased angular resolution, while simultaneously still fitting the large-scale structure.

5.3. Model truncation

As discussed in Section 5, calculating a value for a single shapelet basis function requires more calculation than for a single point source or Gaussian. As a point of reference, for the simulated Phase I + II image, WSClean produced a total of 6 331 MS CLEAN components (4 540 point source and 1 791 Gaussian). Fitting a separate shapelet model for each lobe, each with $p_{\text{max}} = 86$ , gives a total of 7 656 basis functions to extrapolate. Therefore, for shapelets to be computationally competitive, some compression/truncation of the full model is necessary. We apply a basic truncation, by simply removing the MS CLEAN components or shapelet basis functions that contribute the smallest absolute flux density to the total number of components $N_{\text{tot}}$ , up until some fraction $f_{\textrm{comp}}$ of all components remain. Explicitly, we find the number of components $N_{\textrm{comp}}$ that satisfies

(23) \begin{align}\sum^{N_{\textrm{comp}}}_j |I_j| <= f_{\textrm{comp}} \sum^{N_\textrm{tot}}_j |I_j|, \end{align}
(24) \begin{align}\textrm{where}\quad\quad & |I_j| > |I_{j+1}|.\end{align}

For a point source or Gaussian, $I_j$ is simply the total integrated flux density, with the absolute value necessary as MS CLEAN components and shapelets can be positive or negative. To find $I_j \equiv I_{p_1,p_2}$ for the shapelet basis function $B_{p_1,p_2}$ , we image and sum the absolute of the restoring beam $G(\theta_{\text{bmaj}},\theta_{\text{bmin}})$ convolved basis function,

(25) \begin{equation} I_{p_1,p_2} = \sum_s \sum_t \left| C_{p_1,p_2} G(\theta_{\text{bmaj}},\theta_{\text{bmin}}) \ast B_{p_1,p_2}(x_s,y_t) \right|,\end{equation}

where $\ast$ is a convolution, for all $s \times t$ pixels used during the model fit.

We have implemented this type of truncation within SHAMFI, which has the necessary code base to allow the compressed (reduced) set of shapelet basis functions to be fit to the image again. We apply this extra step in the following results.

We apply three levels of truncation to both MS CLEAN and shapelets, for both the Phase I and Phase II models. We then simulate these models through WODEN for the same 2-min observation as used in Section 5, and apply the same peeling technique, fitting a single complex gain per antenna with 10 iterations of Equation (22). When running the simulations, we used nvprof,Footnote m the NVIDIA line profiler, to measure the time spent performing calculations and memory allocations on the GPU. For each level of truncation, we ran the peeled outputs through CHIPS, so we could compare the effect of peeling against time taken in generating the visibilities. The results are shown in Figure 7. As expected, the uncompressed shapelet model takes more than twice the time to generate than the MS CLEAN model. However, for both the Phase I and Phase I + II models, shapelets can be compressed to a level where they incur comparative computational expense, and still peel out more power at the largest spatial scales. By comparing the blue dashed lines in the bottom plots of Figure 7, we can see that the 100% MS CLEAN model subtracts more large-scale power when only including Phase I information, (bottom left), rather than including both Phases I and II. The amount of power subtracted by the shapelet model is comparable when incorporating just Phase I information or both Phases I and II.

Table 1. Observational parameters of the real data used to test peeling.

6. Real data results

6.1. Imaging and modelling real MWA data

We use six 2-min MWA observations to create the image of Fornax A in Figure 8. Three observations were Phase I, taken in December 2014 as part of the GLEAM survey (Hurley-Walker et al. Reference Hurley-Walker2017; Wayth et al. Reference Wayth2015), and the other three were Phase II, taken in February 2018 as part of the GLEAM-X survey (Hurley-Walker Reference Hurley-Walker2020, in preparation). Calibration observations of the bright radio galaxy Pictor A were also taken at the same frequency on the same nights as the Fornax A observations. The Pictor A observations were used to solve for an initial set of calibration solutions, which were then applied to the Fornax A data. An image was made with the full 30.72 MHz bandwidth of the MWA, centred at 185 MHz using joint deconvolution with WSClean. This joint deconvolution was enabled by the implementation of Image Domain Gridding (IDG, van der Tol, Veenboer, & Offringa Reference van der Tol, Veenboer and Offringa2018) into WSClean, which uses the MWA primary beam (Sokolowski et al. Reference Sokolowski2017) to properly take into account the changing beam shapes for each snapshot observation in the gridding process. Several rounds of self-calibration were then performed with CALIBRATE (Offringa et al. Reference Offringa2016) to improve the antenna gain solutions, using the MS CLEAN component model produced by WSClean.

In Figure 8, we detail the image manipulation required when running SHAMFI to obtain a good model using real data, including the necessary subtraction of compact emission to reduce the need for higher-order (and therefore higher-resolution) shapelet basis functions. The process of fitting the real Fornax A image, along with all commands used, is available online as a tutorial in the documentation of SHAMFI.Footnote n

Figure 9. Real data peel results for Fornax A data detailed in Table 1: (a) the calibrated data before subtraction; (b) and (c) the residuals after peeling the shapelet model and MS CLEAN model, respectively; (d) and (e) the 2D PS of the residuals after peeling the shapelet model and MS CLEAN model, respectively; (f) the difference between the two 2D PS shown in (d) and (e), with the MS CLEAN PS subtracted from the shapelet PS. In (f), red means the shapelet model subtracted more power during the peel, and blue means the MS CLEAN model subtracted more.

Figure 10. Real data peel results for the EoR1 data detailed in Table 1: (a) and (b) the new Phase I + II shapelet model simulated using the RTS and the residuals after peeling with this model, respectively; (c) and (d) the old shapelet model and residuals after peeling, respectively; (e) an east-west polarisation difference 2D PS where the residuals of (d) (old shapelets) are subtracted from (b) (new shapelets); (f) 1D PS of the residuals left behind after peeling the new shapelet mode (blue line with squares) and the old model (orange with triangles), with an estimate of the thermal noise in the east-west polarisation; (g) the ratio of the 1D PS shown in (f), with the new model divide by the old model. In (e), red means the new model subtracted more power during the peel, and blue means the old model subtracted more. We only show east-west here, but the north-south display the same behaviour.

6.2. Calibrating and peeling real data

To carry out calibration and peeling on the real data sets, we use the RTS, software which is used by the Australian MWA EoR collaboration (see Jacobs et al. Reference Jacobs2016, for an overview of RTS processing). The RTS has the capacity to perform direction-dependent calibration, correct for first-order ionospheric effects, and peel sources (phase-rotate, calibrate, and subtract a source, see Noordam Reference Noordam and Oschmann2004). During the imaging of the real data, WSClean produced 2 537 MS CLEAN components. These components, as well as the shapelet model described in Figure 8, were used to peel with the RTS. We peel Fornax A from two data sets: the first set where Fornax A is near the primary beam centre, and therefore contributes a large fraction of the total power in the sky; the second set made of standard MWA EoR observations, where Fornax A is far from pointing centre, often at a total Stokes I beam power of $\sim 30\%$ . The latter data set is pointed towards the ‘EoR1’ field at $RA=4^h,\delta=-27^\circ$ . We use the ionospheric metrics detailed in Jordan et al. (Reference Jordan2017) to pick an ionospherically quiet night for the EoR1 data. The key points of both data sets are summarised in Table 1. For both data sets, we use the same sky model, and simply switch in the MS CLEAN or shapelet model of Fornax A. The sky model is mostly based on the GLEAM catalogue (Hurley-Walker et al. Reference Hurley-Walker2017), cross-matched to the following catalogues and frequencies: VLSSr, 74 MHz (Lane et al. Reference Lane, Cotton, Helmboldt and Kassim2012); TGSS ADR1, 150 MHz (Intema et al. Reference Intema, Jagannathan, Mooley and Frail2017); MRC, 408 MHz (Large et al. Reference Large, Mills, Little, Crawford and Sutton1981); SUMSS, 843 MHz (Mauch et al. Reference Mauch, Murphy, Buttery, Curran, Hunstead, Piestrzynski, Robertson and Sadler2003); NVSS, 1 400 MHz (Condon et al. Reference Condon, Cotton, Greisen, Yin, Perley, Taylor and Broderick1998). We use PUMA (Line et al. Reference Line, Webster, Pindor, Mitchell and Trott2017) for the cross-match, using higher-resolution data for position or source morphology where possible. We use the sky model developed in Procopio et al. (Reference Procopio2017) in the EoR1 field. This sky model has been used by the MWA EoR analysis for a number of years, most recently in the limits published in Trott et al. (Reference Trott2020).

We use the RTS to perform foreground subtraction in two steps: an initial direction-independent calibration to a sky model of 1 000 sources; a subtraction step where a number of sources are fitted for first-order ionospheric effects and peeled. Even though we are only comparing the Fornax A models, we need to subtract off a number of sources in the sky to be able to see the difference in residuals, lest it be swamped by the rest of the sky. Typically, we peel off 1 000 sources in the latter stage. As discussed in Section 3 however, the architecture of the RTS is currently such that creating a single calibrator source with thousands of components causes significant computational expense. We found as the Fornax A data set had a higher time resolution, when processing with the MS CLEAN model, we could only peel off 500 sources before running into GPU memory issues. The processing time using the MS CLEAN model was also around 4 times slower than when using the shapelet model. The results of peeling 500 sources from the Fornax A data are shown in Figure 9, which shows both image residuals, and 2D PS.

In image space, the MS CLEAN and shapelet models leave similar residuals. A limitation of the shapelet fitting routine developed in this paper is exposed in Figure 9(b). The residuals are surrounded by a negative over-peel, which we attribute to side-lobe noise from the original image that was fitted into the shapelet model. This adds in a ‘floor’ to the model, which the RTS then has to calibrate for, leading to a slight under-peel of the lobes, and a slight over-peel of the floor. Interestingly, a similar negative border surrounds the MS CLEAN residuals, which is possibly due to the largest scale fitted for during the CLEANing process, or a consequence of the peeling algorithms internal to the RTS. While the 2D difference PS in Figure 9 suggests that the shapelet model has subtracted more power at all scales, the difference in the window is negligible, with the 1D PS looking near identical.

Given the small difference seen between the two models, we decided to compare our shapelet model to an earlier shapelet model created from early Phase I data, for the EoR1 data. This model has historically been used by the Australian EoR team. Fornax A has long been suspected of causing poor limits in the 1D PS. The results of comparing these two models are shown in Figure 10, and again we see little difference in the resultant 1D PS, regardless of the stark contrast in residuals left behind in the image. We explore possible reasons for the small differences in the following section.

Figure 11. Residual 1D PS comparing peeling 1 000 simulated sources, including the new Fornax A MS CLEAN model, when peeling sources including the new shapelet model (blue with squares) and including the old shapelet model (orange with triangles).

7. Discussion

While the simulation results in Section 5 indicate that the new shapelet model of Formax A we have generated should improve the 1D PS, the overall amplitude of the improvements seen in the simulated PS were several orders of magnitude lower than the current systematics present in real data. This is of course expected, given the simulations contained no other sources were noiseless, and had no instrumental effects. We did expect however that this improvement might scale somewhat with real data, given the residuals seen in real data were far worse. The results using real data in Section 6 resolutely disagree with this hypothesis, showing little to no difference.

To address this, we return to simulations. Recently, functionality was added to the RTS to be run as a simulator, allowing the software to use an input sky model to generate visibilities, which it can then calibrate and peel. For a single 2-min zenith observation, we weigh our sky model by the MWA primary beam, and select the 1 000 apparently brightest sources. We simulate these 1 000 sources (with the primary beam and MWA bandpass), including the new MS CLEAN component model from the real image of Fornax A. We then perform the standard direction-independent calibration, followed by direction-dependent ionospheric calibration to peel those 1 000 sources, as we did with the real EoR1 data. We performed the peel with both the new and old shapelet models; this left us with two sets of residuals comparable to the those seen in Figure 10. We show the differences in the residual 1D PS in Figure 11.

This shows a greater than two orders of magnitude difference in residuals after peeling with the two models, and also explains the lack of a difference in EoR1 data; the residuals in the RTS simulation vary of order $\sim 10^0$ $10^4$ , whereas in the real EoR1 data are of order $>\!\!10^5$ . Any improvement garnered through improving the Fornax A model is seemingly blown away by other systematics in the flagging, calibration, and peeling of real EoR1 data. There are a number of potential systematics that could be masking the improvement from the new Fornax A model. We list a handful of them here:

  • The calibration solutions from the RTS are not fit to be continuous in frequency. All twenty-four 1.28 MHz coarse channels are calibrated somewhat independently, which might inject false spectral structure into the residual visibilities.

  • We use AOFlagger (Offringa et al. Reference Offringa2015) to flag RFI from our data. Early work suggests there may be a higher flagging occupancy seen in EoR1 data, when compared to data of the same frequency which is pointed at a different patch of sky.

  • For each sky subtraction, Fornax A is only one of 1 000 sources being peeled. It is possible the rest of our sky model is inaccurate enough to leave residuals that cumulatively outweigh those left behind after subtracting Fornax A. Furthermore, there are the remaining sources that go unpeeled that through instrumental and calibration errors can still contribute power to the window.

Upcoming improved sky catalogues including LOBES (Lynch Reference Lynch2020, in preparation) and GLEAM-X (Hurley-Walker Reference Hurley-Walker2020, in preparation) which utilise Phase II MWA data should help improve the sky model and subtraction to reduce residuals.

We also note here that in all simulations, Fornax A was always given a single and known SI. In reality, the SI varies across the object, giving the residuals in real data more spectral structure as well. This could be contributing to the lack of improvement seen in updating the model on real data. In our current implementation, the SI cannot be varied across the shapelet model, but could easily be controlled for each component of the MS CLEAN model. This spectral structure issue could possibly result in MS CLEAN components to perform better subtractions on sources with complicated spectral behaviour. Future work could investigate fitting shapelet models to multiple frequency bands, and fitting the derived coefficients $C_{p_1,p_2}$ as a function of frequency. This may allow spectral behaviour to vary with angular position in a shapelet source.

8. Summary

We have successfully implemented a method to fit shapelets (SHAMFI) for use with the RTS, and explicitly specified how shapelets can be implemented to generate sky models in visibility space. We have written a new simulator, WODEN, in the CUDA language, allowing us to optimise simulating point sources, Gaussians, and shapelets in a consistent way. Through simulation, we see little difference in a 1D PS after peeling Fornax A, with shapelet and MS CLEAN models created from Phase I simulated MWA observations. When adding in Phase II (improved angular resolution) information to the simulation, we see that the shapelet model is able to subtract more power from the smallest k-modes than the equivalent MS CLEAN model. Interestingly, at small k-modes, the Phase I + II MS CLEAN model subtracts less power than the MS CLEAN model made from just Phase I data. Adding in the smaller angular resolution information does however improve the peel at large k-modes. We suggest that the change in scale of the synthesised PSF could be biasing the MS CLEAN components to smaller angular scales, but further work is required to verify this. Through simulation, we also find that with truncation, we are able to still better subtract power using the Phase I + II shapelet model for the same computational cost.

No difference could be measured with a 1D PS between peeling with a shapelet or MS CLEAN model, when using the RTS and CHIPS, on real data. Simulations suggest that modest, but not insignificant, improved residuals at the smallest k-modes could be achieved by using shapelets over MS CLEAN components, when using both Phase I and Phase II data. Further simulations show these improvements are orders of magnitude smaller than current systematics in our analysis and data. Pragmatically, when using the current Australian EoR pipeline, a model of Fornax A is essential, as an MS CLEAN model takes 4 times more CPU hours.

A potential weakness of SHAMFI was highlighted in Section 6, where it was found shapelets tend to fit the side-lobe noise of other sources surrounding the target object. One could avoid this by fitting shapelets just to the MS CLEAN components, rather than the restored image. In doing this, however, one will miss any diffuse emission that hasn’t been deconvolved, limiting the improvement shapelets that can bring over MS CLEAN components.

In this paper, we focus on a single object, Fornax A. In reality, thousands of sources must be peeled, ranging from a simple to complicated morphology. Shapelets are well suited to modelling sources with extended diffuse emission. These types of sources are well observed by the MWA, due to the compact layout of core receiving elements giving the MWA an excellent surface brightness sensitivity. Sources such as NGC 253 (Kapinska et al. Reference Kapinska2017) and extended sources described in Procopio et al. (Reference Procopio2017), which lie in MWA foregrounds, could be accurately modelled with shapelets.

Although not able to immediately make an impact on EoR limits, the simulations suggest that as the MWA EoR analysis progresses in pushing down the systematics from the instrument and software, shapelet models of Fornax A should improve the limits found. With an eye on future instruments such as the upcoming SKA_LOW, with far greater angular resolution, we note that the number of MS CLEAN components needed to create models will also increase. With the excellent response to truncation shown by shapelets, they might be a vital method to include highly complex sources in an EoR foreground sky model, for a lower computational cost.

Acknowledgements

This scientific work makes use of the Murchison Radio-astronomy Observatory, operated by CSIRO. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. We acknowledge the Pawsey Supercomputing Centre, which is supported by the Western Australian and Australian Governments.

This research was supported by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013. CMT is supported by an ARC Future Fellowship under grant FT180100321.

This work was supported by resources awarded under the merit allocation scheme of Astronomy Australia Ltd on the OzSTAR national facility at Swinburne University of Technology. OzSTAR is funded by Swinburne University of Technology and the National Collaborative Research Infrastructure Strategy (NCRIS).

During this work, we made extension use of the kvis (Gooch Reference Gooch1996) and DS9 (Joye & Mandel Reference Joye and Mandel2003) FITS file image viewers, and the astropy (Astropy Collaboration et al. 2013, 2018) and aplpy (Robitaille & Bressert Reference Robitaille and Bressert2012; Robitaille Reference Robitaille2019) python modules.

We thank the anonymous referee for their useful and constructive comments.

A. WSCLEAN COMMAND

Figure A1. Example WSClean command used on simulated data.

B. SOFTWARE VERSIONS

Table B1. Versions of software used.

References

Astropy Collaboration, et al. 2013, A&A, 558, 33Google Scholar
Astropy Collaboration, et al. 2018, AJ, 156, 123CrossRefGoogle Scholar
Bacon, D. J., Rowe, B., Patel, P., Beswick, R. J., Abdalla, F. B., & Smirnov, O. M. 2014, MNRAS, 444, 2893Google Scholar
Barry, N., et al. 2019, ApJ , 884, 1CrossRefGoogle Scholar
Bastien, D., Oozeer, N., & Somanah, R. 2017, IOP Conference Series: Materials Science and Engineering, 198CrossRefGoogle Scholar
Condon, J. J., Cotton, W. D., Greisen, E. W., Yin, Q. F., Perley, R. A., Taylor, G. B., & Broderick, J. J. 1998, AJ, 115, 1693CrossRefGoogle Scholar
Cornwell, T. J. 2008, IEEE Journal of Selected Topics in Signal Processing , 2, 793CrossRefGoogle Scholar
DeBoer, D. R., et al. 2017, PASP , 129, 045001CrossRefGoogle Scholar
Dewdney, P. E., Turner, W., Millenaar, R., Mccool, R., Lazio, J., & Cornwell, T. J. 2013, SKA Technical DocumentGoogle Scholar
Fomalont, E. B., Ebneter, K. A., van Breugel, W. J. M., & Ekers, R. D. 1989, ApJ , 346, L17CrossRefGoogle Scholar
Gooch, R. 1996, Karma: a Visualization Test-Bed. p. 80Google Scholar
Hurley-Walker, N. 2020, MNRAS, in preparation.Google Scholar
Hurley-Walker, N., et al. 2017, MNRAS , 464, 1146CrossRefGoogle Scholar
Intema, H. T., Jagannathan, P., Mooley, K. P., & Frail, D. A. 2017, A&A , 598, A78Google Scholar
Jacobs, D. C., et al. 2016, ApJ , 825, 114CrossRefGoogle Scholar
Jordan, C. H., et al. 2017, MNRAS , 471, 3974CrossRefGoogle Scholar
Joye, W. A. & Mandel, E., 2003. New Features of SAOImage DS9, p. 489Google Scholar
Kapinska, A. D., et al. 2017, ApJ , 838, 68CrossRefGoogle Scholar
Kazemi, S., Yatawatta, S., Zaroubi, S., Lampropoulos, P., de Bruyn, A. G., Koopmans, L. V. E., & Noordam, J. 2011, MNRAS , 414, 1656CrossRefGoogle Scholar
Kerrigan, J. R., et al. 2018, ApJ , 864, 131CrossRefGoogle Scholar
Lane, W. M., Cotton, W. D., Helmboldt, J. F., & Kassim, N. E. 2012, RaSc, 47CrossRefGoogle Scholar
Large, M. I, Mills, B. Y., Little, A. G., Crawford, D. F. & Sutton, J. M., 1981, MNRAS, 194, 693CrossRefGoogle Scholar
Li, W., et al. 2019, ApJ, 887, 141CrossRefGoogle Scholar
Line, J. L. B., Webster, R. L. L., Pindor, B., Mitchell, D. A. A., & Trott, C. M. M. 2017, PASA, 34, e003CrossRefGoogle Scholar
Lynch, C. 2020, PASA, in preparation.Google Scholar
Mauch, T., Murphy, T., Buttery, H. J., Curran, J., Hunstead, R. W., Piestrzynski, B., Robertson, J. G., & Sadler, E. M. 2003, MNRAS, 342, 1117CrossRefGoogle Scholar
McKinley, B., et al. 2013, MNRAS, 436, 1286CrossRefGoogle Scholar
Mellema, G., et al. 2013, ExA, 36, 235Google Scholar
Mertens, F. G., et al. 2020, MNRAS, 493, 1662CrossRefGoogle Scholar
Mesinger, A., Greig, B., & Sobacchi, E. 2016, MNRAS, 459, 2342CrossRefGoogle Scholar
Mitchell, D. A., Greenhill, L. J., Wayth, R. B., Sault, R. J., Lonsdale, C. J., Cappallo, R. J., Morales, M. F., & Ord, S. M. 2008, ISTSP, 2, 707Google Scholar
Morales, M. F., & Wyithe, J. S. B. 2010, ARA&A, 48, 127CrossRefGoogle Scholar
Mouri Sardarabadi, A., & Koopmans, L. V. E. 2019, MNRAS, 483, 5480CrossRefGoogle Scholar
Nickolls, J., Buck, I., Garland, M., & Skadron, K. 2008, Queue, 6, 40CrossRefGoogle Scholar
Noordam, J. E. 2004, in Ground-based Telescopes, ed. Oschmann, J. M. Jr. (Vol. 5489), 817. doi: 10.1117/12.544262CrossRefGoogle Scholar
Offringa, A. R. & Smirnov, O. 2017, MNRAS, 471, 301CrossRefGoogle Scholar
Offringa, A. R., et al. 2014, MNRAS, 444, 606CrossRefGoogle Scholar
Offringa, A. R., et al. 2015, PASA, 32, e008CrossRefGoogle Scholar
Offringa, A. R., et al. 2016, MNRAS, 458, 1057CrossRefGoogle Scholar
Parsons, A. R., et al. 2010, AJ, 139, 1468CrossRefGoogle Scholar
Patil, A. H., et al. 2017, ApJ, 838, 65CrossRefGoogle Scholar
Pritchard, J. R., & Loeb, A. 2012, RPPh. Physical Society (Great Britain), 75, 086901CrossRefGoogle Scholar
Procopio, P., et al. 2017, PASA, 34, e033CrossRefGoogle Scholar
Refregier, A. 2003, MNRAS, 338, 35CrossRefGoogle Scholar
Riding, J. L., Mitchell, D. A., & Webster, R. L. 2017, in Astronomical Data Analysis Software and Systems XXV. Astronomical Society of the Pacific Conference Series, doi: 2017ASPC.512.257RGoogle Scholar
Robitaille, T. 2019, APLpy v2.0: The Astronomical Plotting Library in Python, doi: 10.5281/zenodo.2567476CrossRefGoogle Scholar
Robitaille, T., & Bressert, E. 2012, APLpy: Astronomical Plotting Library in Python, Astrophysics Source Code Library (ascl:1208.017)Google Scholar
Schechtman-Rook, A., Bershady, M. A., & Wood, K. 2012, ApJ, 746CrossRefGoogle Scholar
Sokolowski, M., et al. 2017, PASA, 34, e062CrossRefGoogle Scholar
Tagore, A. S. & Jackson, N. 2016, MNRAS, 457, 3066CrossRefGoogle Scholar
Thompson, A. R., Moran, J. M., & Swenson, G. W. Jr. 2001, Interferometry and Synthesis inRadio Astronomy (2nd edn.; Wiley)Google Scholar
Tingay, S. J. J., et al. 2013, PASA, 30, e007CrossRefGoogle Scholar
Trott, C. M., et al. 2016, ApJ, 752, 34Google Scholar
Trott, C. M., et al. 2020, MNRAS, 19, 1Google Scholar
Wayth, R. B., et al. 2015, PASA, 32, e025CrossRefGoogle Scholar
Wayth, R. B., et al. 2018, PASA, 35, e033CrossRefGoogle Scholar
Yatawatta, S. 2010, in 2010 IEEE Sensor Array and Multichannel Signal Processing Workshop, SAM 2010 , 6972Google Scholar
Yatawatta, S., Zaroubi, S., de Bruyn, G., Koopmans, L., & Noordam, J. 2009, in 2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop (IEEE), 150155, doi: 10.1109/DSP.2009.4785912CrossRefGoogle Scholar
van Haarlem, M. P., et al. 2013, A&A, 556, A2Google Scholar
van der Tol, S., Veenboer, B., & Offringa, A. R., 2018, A&A, 616, A27Google Scholar
Figure 0

Figure 1. Upper panel: Fornax A image obtained from NED, which includes CLEAN residuals around the edge of the lobes. Over-plotted in red is a boundary line outside of which a Gaussian taper was applied to create the model used in WODEN. Lower panel: The image after applying the taper and transforming from B1950 to J2000 centred coordinates. Any pixels that were not masked (the grey region) in the lower image were converted into point sources.

Figure 1

Figure 2. MS CLEANed images of the WODEN simulation of Fornax A. Top: All Phase I configuration, imaged with briggs 0 weighing to balance the large number of short baselines present in Phase I data with a reasonable resolution. Bottom: A combination of both Phase I and Phase II configurations. These are imaged with uniform weighing to take advantage of the higher resolution Phase II data.

Figure 2

Figure 3. Left: Example of initial Gaussian fit used to set $\phi_{\text{PA}}$ of the basis functions for one of the simulated Phase 1+2 lobe images. The red line shows the FWHM, with the white lines demonstrating the PA found. Right: Example of the grid-based approach for fitting $\beta_1$ and $\beta_2$. The colour scale here represents the residuals in (Jy/pixel)$^2$ left after fitting the image in the left panel with all basis functions up to $p_{\text{max}} = 86$.

Figure 3

Figure 4. Left: MS CLEANed Phase 1 simulated data Middle: Fitted shapelet model recreated in image space, using restoring beam convolved basis functions Right: The model subtracted from the data. The footprints of the two separate shapelet models (one for each lobe) are clearly visible in the residual plot.

Figure 4

Figure 5. Simulated peeling results. Plots (a) to (d) show MS CLEANed peel residuals of an integration over the full bandwidth and time span of a single simulated Phase 1 observation, each peeled with a different model for Fornax A. The models are: (a) a shapelet model from Phase I data; (b) a shapelet model from Phase 1 and 2 data; (c) an MS CLEAN model from Phase I data; (d) an MS CLEAN model from Phase I and II data. A contour plot of the unpeeled power is shown on each plot as a angular scale reference. Plot (f) shows the 1D PS as estimated with CHIPS for the models in (a) to (d). For reference, the power without peeling is shown for the unpeeled WODEN simulation in (e). The line style and marker for each 1D PS plotted in (f) is also plotted in each plot from (a) to (d) on the lower right for reference.

Figure 5

Figure 6. The 2D PS as estimated by CHIPS of the simulated peeling results. Plots (a) and (b) show the shapelet model peel results for Phase I and Phase I + II, with (c) showing the difference between the 2D PS in (a) and (b). The MS CLEAN peel results are shown in a similar manner in plots (d), (e), and (f). In the difference PS of (c) and (f), blue means the Phase I model subtracted less power than the Phase I + II model, and red means the Phase I model subtracted more power.

Figure 6

Figure 7. Results from compressing the MS CLEAN and shapelet models. Plots (a), (b), and (c) on the left show results from Phase I models, where (d), (e), and (f) on the right show results from Phase I + II models. The top row show residual 1D PS left after peeling MS CLEAN models (a) and (d) and shapelet models (b) and (e), at various levels of truncation. The percentage in the legend is the percentage of remaining components after truncation. We take slices through the truncation results at specific low k values, shown by the vertical dotted lines, and plot them as a function of GPU time in the bottom row (c) and (f). In both plots, dashed lines are MS CLEAN, and solid lines are shapelet results. Matching colours plot matching k-modes.

Figure 7

Figure 8. Stages of fitting for the Fornax A image using real MWA data as described in Section 6.1 and shown in (a). Four compact sources (outlined in red circles) were found through trial and error to cause fitting problems, and so were subtracted as Gaussians. The image was then split into two lobes, (b) and (c), and fitted separately. (d) shows the fitted shapelet model recreated in image space, using restoring beam convolved basis functions, as well as the subtracted Gaussians. (e) shows the fitting residuals.

Figure 8

Table 1. Observational parameters of the real data used to test peeling.

Figure 9

Figure 9. Real data peel results for Fornax A data detailed in Table 1: (a) the calibrated data before subtraction; (b) and (c) the residuals after peeling the shapelet model and MS CLEAN model, respectively; (d) and (e) the 2D PS of the residuals after peeling the shapelet model and MS CLEAN model, respectively; (f) the difference between the two 2D PS shown in (d) and (e), with the MS CLEAN PS subtracted from the shapelet PS. In (f), red means the shapelet model subtracted more power during the peel, and blue means the MS CLEAN model subtracted more.

Figure 10

Figure 10. Real data peel results for the EoR1 data detailed in Table 1: (a) and (b) the new Phase I + II shapelet model simulated using the RTS and the residuals after peeling with this model, respectively; (c) and (d) the old shapelet model and residuals after peeling, respectively; (e) an east-west polarisation difference 2D PS where the residuals of (d) (old shapelets) are subtracted from (b) (new shapelets); (f) 1D PS of the residuals left behind after peeling the new shapelet mode (blue line with squares) and the old model (orange with triangles), with an estimate of the thermal noise in the east-west polarisation; (g) the ratio of the 1D PS shown in (f), with the new model divide by the old model. In (e), red means the new model subtracted more power during the peel, and blue means the old model subtracted more. We only show east-west here, but the north-south display the same behaviour.

Figure 11

Figure 11. Residual 1D PS comparing peeling 1 000 simulated sources, including the new Fornax A MS CLEAN model, when peeling sources including the new shapelet model (blue with squares) and including the old shapelet model (orange with triangles).