Open in Colab: https://colab.research.google.com/github/casangi/cngi_prototype/blob/master/docs/imaging/continuum_imaging_example.ipynb


Continuum Imaging

This notebook will demonstrate how to create a continuum dirty image with natural weighting using ngCASA. The resulting image will be compared with an image created by CASA.

For this demonstration data from the ALMA First Look at Imaging CASAguide (https://casaguides.nrao.edu/index.php/First_Look_at_Imaging) will be used. The measurement set has been converted to vis.zarr (using convert_ms in cngi.conversion)

This walkthrough is designed to be run in a Jupyter notebook on Google Colaboratory. To open the notebook in colab, go here.

Installation and Dataset Download

[1]:
import os
os.system("pip install cngi-prototype==0.0.91")

!gdown -q --id 1PNL0ANqnN7eyYOpQ_vMslQlorUhtrUmE
!unzip sis14_twhya_field_5_lsrk_pol_xx.vis.zarr.zip > /dev/null

!gdown -q --id 19VlXp9Qyx60ld0rBSAYMVFTq5Y1KuxO8
!unzip casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr.zip > /dev/null

%matplotlib widget
print('complete')
complete

Load Dataset

Two datasets are are needed for this notebook sis14_twhya_field_5_lsrk_pol_xx.vis.zarr and casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr (for more information about the img.zarr format go here and for the vis.zarr format go here).

The sis14_twhya_field_5_lsrk_pol_xx.vis.zarr dataset is used to create a continuum image. The dataset was created by using the mstransform command in CASA

mstransform('sis14_twhya_calibrated_flagged.ms',
            outputvis='sis14_twhya_field_5_lsrk_pol_xx.ms',
            regridms=True, outframe='LSRK', datacolumn='data',
            correlation='XX',field='5')

and then convert_ms in cngi.conversion

infile = 'sis14_twhya_field_5_lsrk_pol_xx.ms'
outfile = 'sis14_twhya_field_5_lsrk_pol_xx.vis.zarr'
chunk_shape=(270, 210, 12, 1)
convert_ms(infile, outfile=outfile , chunk_shape=chunk_shape)

The conversion to ‘LSRK’ is necessary because cngi does not currently have an implementation and tclean does a conversion to ‘LSRK’ before imaging.

To check the ngcasa imaging results the casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr dataset is used. This dataset was generated by running tclean in CASA

tclean(vis='sis14_twhya_field_5_lsrk_pol_xx.ms',
       imagename='twhya_standard_gridder_lsrk_mfs_natural',
       specmode='mfs',
       deconvolver='hogbom',
       imsize=[200,400],
       cell=['0.08arcsec'],
       weighting='natural',
       threshold='0mJy',
       niter=0,stokes='XX')

and then image_ms in cngi.conversion

infile = 'twhya_standard_gridder_lsrk_mfs_natural.image'
outfile = 'casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr'
convert_image(infile=infile,outfile=outfile)
[2]:
import xarray as xr
from cngi.dio import read_vis, read_image

xr.set_options(display_style="html")

mxds = read_vis("sis14_twhya_field_5_lsrk_pol_xx.vis.zarr",chunks={'chan':192})
mxds.xds0
overwrite_encoded_chunks True
[2]:
<xarray.Dataset>
Dimensions:         (baseline: 210, chan: 384, pol: 1, pol_id: 1, spw_id: 1, time: 270, uvw_index: 3)
Coordinates:
  * baseline        (baseline) int64 0 1 2 3 4 5 6 ... 204 205 206 207 208 209
  * chan            (chan) float64 3.725e+11 3.725e+11 ... 3.728e+11 3.728e+11
    chan_width      (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
    effective_bw    (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
  * pol             (pol) int32 9
  * pol_id          (pol_id) int32 0
    resolution      (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
  * spw_id          (spw_id) int32 0
  * time            (time) datetime64[ns] 2012-11-19T07:56:26.544000626 ... 2...
Dimensions without coordinates: uvw_index
Data variables: (12/17)
    ANTENNA1        (baseline) int32 dask.array<chunksize=(210,), meta=np.ndarray>
    ANTENNA2        (baseline) int32 dask.array<chunksize=(210,), meta=np.ndarray>
    ARRAY_ID        (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    DATA            (time, baseline, chan, pol) complex128 dask.array<chunksize=(270, 210, 192, 1), meta=np.ndarray>
    DATA_WEIGHT     (time, baseline, chan, pol) float64 dask.array<chunksize=(270, 210, 192, 1), meta=np.ndarray>
    EXPOSURE        (time, baseline) float64 dask.array<chunksize=(270, 210), meta=np.ndarray>
    ...              ...
    OBSERVATION_ID  (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    PROCESSOR_ID    (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    SCAN_NUMBER     (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    STATE_ID        (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    TIME_CENTROID   (time, baseline) float64 dask.array<chunksize=(270, 210), meta=np.ndarray>
    UVW             (time, baseline, uvw_index) float64 dask.array<chunksize=(270, 210, 3), meta=np.ndarray>
Attributes: (12/13)
    bbc_no:           2
    corr_product:     [[0, 0]]
    data_groups:      [{'0': {'data': 'DATA', 'flag': 'FLAG', 'id': '0', 'uvw...
    freq_group:       0
    freq_group_name:
    if_conv_chain:    0
    ...               ...
    name:             ALMA_RB_07#BB_2#SW-01#FULL_RES
    net_sideband:     2
    num_chan:         384
    num_corr:         1
    ref_frequency:    372520022603.63745
    total_bandwidth:  234366781.0546875

Note that the chunks parameter in cngi and ngcasa functions specifies the size of a chunk and not the number of chunks (in CASA tclean chanchunks refers to the number of channel chunks).

The dimensionality of the sis14_twhya_field_5_lsrk_pol_xx.vis.zarr dataset is (time:270,baseline:210,chan:384,pol:1) and a zarr chunk size of (time:270,baseline:210,chan:12,pol:1) was chosen. With the cngi.dio.read_vis function the dask chunk size was set to (time:270,baseline:210,chan:192,pol:1). For more information concerning chunking go here.

Flag Data and Create Imaging Weights

The apply_flags cngi.vis function sets all values that should be flagged to nan. The ngcasa.imaging code does not internally apply flags but does ignore nan values. apply_flags documentation

The make_imaging_weight cngi.imaging function takes the WEIGHT data variables and creates IMAGING_WEIGHT data variable that has dimensions time x baseline x chan x pol (matches the visibility DATA variable). Weighting schemes that are supported include natural, uniform, briggs, briggs_abs. make_imaging_weight documentation

[3]:
from cngi.vis import apply_flags
from ngcasa.imaging import make_imaging_weight

mxds = apply_flags(mxds, 'xds0', flags='FLAG')

imaging_weights_parms = {}
imaging_weights_parms['weighting'] = 'natural'

sel_parms = {}
sel_parms['xds'] = 'xds0'
sel_parms['data_group_in_id'] = 0

grid_parms = {}
grid_parms['chan_mode'] = 'continuum'
grid_parms['image_size'] = [200,400]
grid_parms['cell_size'] = [0.08,0.08]

mxds = make_imaging_weight(mxds, imaging_weights_parms, grid_parms, sel_parms)

mxds.xds0
######################### Start make_imaging_weights #########################
Setting data_group_in  to  {'data': 'DATA', 'flag': 'FLAG', 'id': '0', 'uvw': 'UVW', 'weight': 'DATA_WEIGHT'}
Setting default data_group_out  to  {'data': 'DATA', 'flag': 'FLAG', 'id': '0', 'uvw': 'UVW', 'weight': 'DATA_WEIGHT', 'imaging_weight': 'IMAGING_WEIGHT'}
Since weighting is natural input weight will be reused as imaging weight.
######################### Created graph for make_imaging_weight #########################
[3]:
<xarray.Dataset>
Dimensions:         (baseline: 210, chan: 384, pol: 1, pol_id: 1, spw_id: 1, time: 270, uvw_index: 3)
Coordinates:
  * baseline        (baseline) int64 0 1 2 3 4 5 6 ... 204 205 206 207 208 209
  * chan            (chan) float64 3.725e+11 3.725e+11 ... 3.728e+11 3.728e+11
    chan_width      (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
    effective_bw    (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
  * pol             (pol) int32 9
  * pol_id          (pol_id) int32 0
    resolution      (chan) float64 dask.array<chunksize=(192,), meta=np.ndarray>
  * spw_id          (spw_id) int32 0
  * time            (time) datetime64[ns] 2012-11-19T07:56:26.544000626 ... 2...
Dimensions without coordinates: uvw_index
Data variables: (12/17)
    ANTENNA1        (baseline) int32 dask.array<chunksize=(210,), meta=np.ndarray>
    ANTENNA2        (baseline) int32 dask.array<chunksize=(210,), meta=np.ndarray>
    ARRAY_ID        (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    DATA            (time, baseline, chan, pol) complex128 dask.array<chunksize=(270, 210, 192, 1), meta=np.ndarray>
    DATA_WEIGHT     (time, baseline, chan, pol) float64 dask.array<chunksize=(270, 210, 192, 1), meta=np.ndarray>
    EXPOSURE        (time, baseline) float64 dask.array<chunksize=(270, 210), meta=np.ndarray>
    ...              ...
    OBSERVATION_ID  (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    PROCESSOR_ID    (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    SCAN_NUMBER     (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    STATE_ID        (time, baseline) int32 dask.array<chunksize=(270, 210), meta=np.ndarray>
    TIME_CENTROID   (time, baseline) float64 dask.array<chunksize=(270, 210), meta=np.ndarray>
    UVW             (time, baseline, uvw_index) float64 dask.array<chunksize=(270, 210, 3), meta=np.ndarray>
Attributes: (12/13)
    bbc_no:           2
    corr_product:     [[0, 0]]
    data_groups:      [{'0': {'data': 'DATA', 'flag': 'FLAG', 'id': '0', 'uvw...
    freq_group:       0
    freq_group_name:
    if_conv_chain:    0
    ...               ...
    name:             ALMA_RB_07#BB_2#SW-01#FULL_RES
    net_sideband:     2
    num_chan:         384
    num_corr:         1
    ref_frequency:    372520022603.63745
    total_bandwidth:  234366781.0546875

Create Dirty Continuum Image and Primary Beam

The make_image cngi.imaging function grids the data (using the prolate spheroidal function as an anti-aliasing filter), fast Fourier transform the gridded data to an image and normalizes the image. The make_pb function currently supports rotationally symmetric airy disk primary beams. The write_zarr function is now used to trigger a compute (which includes applying the flags, creating the imaging weights and making the image).

make_pb documentation

make_image documentation

[4]:
from ngcasa.imaging import make_image
from ngcasa.imaging import make_pb
from cngi.dio import write_image
import dask

grid_parms = {}
grid_parms['chan_mode'] = 'continuum'
grid_parms['image_size'] = [200,400]
grid_parms['cell_size'] = [0.08,0.08]
grid_parms['phase_center'] = mxds.FIELD.PHASE_DIR[0,0,:].data.compute()

vis_sel_parms = {}
vis_sel_parms['xds'] = 'xds0'
vis_sel_parms['data_group_in_id'] = 0

img_sel_parms = {}
img_sel_parms['data_group_out_id'] = 0

img_xds = xr.Dataset() #empty dataset

img_xds = make_image(mxds, img_xds, grid_parms, vis_sel_parms, img_sel_parms)


make_pb_parms = {}
make_pb_parms['function'] = 'casa_airy'
make_pb_parms['list_dish_diameters'] = [10.7]
make_pb_parms['list_blockage_diameters'] = [0.75]

sel_parms = {}
sel_parms['img_description_in_indx'] = 0


img_xds = make_pb(img_xds,make_pb_parms, grid_parms, sel_parms)


img_xds = write_image(img_xds, 'twhya_standard_gridder_lsrk_mfs_natural.img.zarr')

######################### Start make_image #########################
Setting default image_center  to  [100 200]
Setting default fft_padding  to  1.2
Setting data_group_in  to  {'data': 'DATA', 'flag': 'FLAG', 'id': '0', 'uvw': 'UVW', 'weight': 'DATA_WEIGHT', 'imaging_weight': 'DATA_WEIGHT'}
Setting default data_group_out  to  {'data': 'DATA', 'flag': 'FLAG', 'id': '1', 'uvw': 'UVW', 'weight': 'DATA_WEIGHT', 'imaging_weight': 'DATA_WEIGHT'}
Setting default data_group_in  to  {'id': '0'}
Setting default data_group_out [' sum_weight ']  to  SUM_WEIGHT
Setting default data_group_out [' image ']  to  IMAGE
######################### Created graph for make_image #########################
######################### Start make_pb #########################
Setting default data_group_in  to  {'id': '0', 'sum_weight': 'SUM_WEIGHT', 'image': 'IMAGE'}
Setting default data_group_out  to  {'id': '0', 'sum_weight': 'SUM_WEIGHT', 'image': 'IMAGE', 'pb': 'PB'}
Setting default image_center  to  [100 200]
Setting default fft_padding  to  1.2
######################### Created graph for make_pb #########################
Time to store and execute graph  write_zarr 7.705185174942017
[5]:
casa_img_xds = read_image("casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr")
casa_img_xds
[5]:
<xarray.Dataset>
Dimensions:            (chan: 1, l: 200, m: 400, pol: 1, time: 1)
Coordinates:
  * chan               (chan) float64 3.726e+11
    declination        (l, m) float64 dask.array<chunksize=(200, 400), meta=np.ndarray>
  * l                  (l) float64 3.879e-05 3.84e-05 ... -3.801e-05 -3.84e-05
  * m                  (m) float64 -7.757e-05 -7.718e-05 ... 7.679e-05 7.718e-05
  * pol                (pol) float64 9.0
    right_ascension    (l, m) float64 dask.array<chunksize=(200, 400), meta=np.ndarray>
  * time               (time) datetime64[ns] 2012-11-19T07:56:26.544000626
Data variables:
    IMAGE              (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    IMAGE_MASK0        (l, m, time, chan, pol) bool dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    IMAGE_PBCOR        (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    IMAGE_PBCOR_MASK0  (l, m, time, chan, pol) bool dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    MODEL              (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    PB                 (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    PB_MASK0           (l, m, time, chan, pol) bool dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    PSF                (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    RESIDUAL           (l, m, time, chan, pol) float64 dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    RESIDUAL_MASK0     (l, m, time, chan, pol) bool dask.array<chunksize=(200, 400, 1, 1, 1), meta=np.ndarray>
    SUMWT              (time, chan, pol) float64 dask.array<chunksize=(1, 1, 1), meta=np.ndarray>
Attributes: (12/19)
    axisnames:            ['Right Ascension', 'Declination', 'Time', 'Frequen...
    axisunits:            ['rad', 'rad', 'datetime64[ns]', 'Hz', '']
    commonbeam:           [0.6637904644012451, 0.5050938129425049, -65.900085...
    commonbeam_units:     ['arcsec', 'arcsec', 'deg']
    date_observation:     2012/11/19/07
    direction_reference:  j2000
    ...                   ...
    restoringbeam:        [0.6637904644012451, 0.5050938129425049, -65.900085...
    spectral__reference:  lsrk
    telescope:            alma
    telescope_position:   [2.22514e+06m, -5.44031e+06m, -2.48103e+06m] (itrf)
    unit:                 Jy/beam
    velocity__type:       radio

Plot and Compare With CASA

[6]:
import matplotlib.pylab as plt
import numpy as np

img_xds = read_image("twhya_standard_gridder_lsrk_mfs_natural.img.zarr")
casa_img_xds = read_image("casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr")
plt.close('all')

#### Compare dirty images ####
dirty_image = img_xds['IMAGE'].isel(chan=0,time=0,pol=0) #Image created by ngCASA
casa_dirty_image = casa_img_xds['RESIDUAL'].isel(chan=0,time=0,pol=0) #Image created by CASA

#Plotting Images
minmin = np.min([np.min(casa_dirty_image.data), np.min(dirty_image.data)])
maxmax = np.max([np.max(casa_dirty_image.data), np.max(dirty_image.data)])
extent = extent=(np.min(casa_dirty_image.m),np.max(casa_dirty_image.m),np.min(casa_dirty_image.l),np.max(casa_dirty_image.l))

fig0, ax0 = plt.subplots(1, 2, sharey=True,figsize=(10, 5))
im0 = ax0[0].imshow(casa_dirty_image,extent=extent)
im1 = ax0[1].imshow(dirty_image,extent=extent)
ax0[0].title.set_text('CASA Dirty Image')
ax0[1].title.set_text('ngCASA Dirty Image')
ax0[0].set_xlabel('m')
ax0[1].set_xlabel('m')
ax0[0].set_ylabel('l')
ax0[1].set_ylabel('l')
fig0.colorbar(im0, ax=ax0[0], fraction=0.046, pad=0.04)
fig0.colorbar(im1, ax=ax0[1], fraction=0.046, pad=0.04)
plt.show()

dif_image = casa_dirty_image - dirty_image
plt.figure()
plt.imshow(dif_image,extent=extent)
plt.title('Difference Dirty Image')
plt.xlabel('m')
plt.ylabel('l')
plt.colorbar(fraction=0.046, pad=0.04)
plt.show()

dirty_image = dirty_image / np.max(np.abs(dirty_image))
casa_dirty_image = casa_dirty_image / np.max(np.abs(casa_dirty_image))

# Calculate max error
max_error_dirty_image = np.max(np.abs(dirty_image - casa_dirty_image)).values
print('Max Error',max_error_dirty_image)

# Calculate root mean square error
rms_error_dirty_image = np.linalg.norm(dirty_image - casa_dirty_image, 'fro')
print('RMS Error',rms_error_dirty_image)
../_images/imaging_continuum_imaging_example_12_0.png
../_images/imaging_continuum_imaging_example_12_1.png
Max Error 2.2810074962187343e-07
RMS Error 1.6647486690733482e-05

The reason for the small difference between ngCASA and CASA is due to ngCASA using a different implementation of the Fast Fourier Transform.

[7]:
import matplotlib.pylab as plt
import numpy as np
import xarray as xr
import dask.array as da
import dask
from cngi.dio import read_image

img_xds = read_image("twhya_standard_gridder_lsrk_mfs_natural.img.zarr").isel(chan=0, pol=0, time=0, dish_type=0)
casa_img_xds = read_image("casa_twhya_standard_gridder_lsrk_mfs_natural.img.zarr").isel(chan=0, time=0, pol=0)
plt.close('all')


#### Primary Beam Corrected Images ####
pb_limit = 0.2

primary_beam = img_xds.PB.isel(l=100).where(img_xds.PB.isel(l=100) > pb_limit,other=0.0)
dirty_image_pb_cor = img_xds.IMAGE/img_xds.PB
dirty_image_pb_cor = dirty_image_pb_cor.where(img_xds.PB > pb_limit,other=np.nan)

#print(casa_img_xds)
casa_primary_beam = casa_img_xds['PB'].isel(l=100).data #Primary beam created by CASA
casa_dirty_image_pb_cor = (casa_img_xds['IMAGE_PBCOR']).where(casa_img_xds['PB'] > pb_limit,other=np.nan) #Image created by CASA

#Plot Primary Beams
fig0, ax0, = plt.subplots(1, 2, sharey=True,figsize=(10, 5))
im0 = ax0[0].plot(casa_dirty_image.m,casa_primary_beam)
im1 = ax0[1].plot(casa_dirty_image.m,primary_beam)
ax0[0].title.set_text('CASA Primary Beam Cross Section')
ax0[1].title.set_text('ngCASA Primary Beam Cross Section')
ax0[0].set_xlabel('m')
ax0[1].set_xlabel('m')
ax0[0].set_ylabel('Amplitude')
ax0[1].set_ylabel('Amplitude')
plt.show()

plt.figure()
plt.plot(casa_dirty_image.m,casa_primary_beam-primary_beam)
plt.title('Difference Primary Beam')
plt.xlabel('m')
plt.ylabel('Amplitude')
plt.show()


extent=(np.min(casa_dirty_image.m),np.max(casa_dirty_image.m),np.min(casa_dirty_image.l),np.max(casa_dirty_image.l))

#Plotting Images
fig0, ax0 = plt.subplots(1, 2, sharey=True,figsize=(10, 5))
im0 = ax0[0].imshow(casa_dirty_image_pb_cor,extent=extent)
im1 = ax0[1].imshow(dirty_image_pb_cor,extent=extent)
ax0[0].title.set_text('CASA PB Corrected Dirty Image')
ax0[1].title.set_text('ngCASA PB Corrected Dirty Image')
ax0[0].set_xlabel('m')
ax0[1].set_xlabel('m')
ax0[0].set_ylabel('l')
ax0[1].set_ylabel('l')
fig0.colorbar(im0, ax=ax0[0], fraction=0.046, pad=0.04)
fig0.colorbar(im1, ax=ax0[1], fraction=0.046, pad=0.04)
plt.show()

plt.figure()
plt.imshow(casa_dirty_image_pb_cor - dirty_image_pb_cor,extent=extent)
plt.title('Difference Dirty Image')
plt.xlabel('m')
plt.ylabel('l')
plt.colorbar(fraction=0.046, pad=0.04)
plt.show()

dirty_image_pb_cor = dirty_image_pb_cor / np.nanmax(np.abs(dirty_image_pb_cor))
casa_dirty_image_pb_cor = casa_dirty_image_pb_cor / np.nanmax(np.abs(casa_dirty_image_pb_cor))
norm_diff_image_pb_cor = dirty_image_pb_cor - casa_dirty_image_pb_cor

# Calculate max error
max_error_dirty_image = np.nanmax(np.abs(norm_diff_image_pb_cor))
print('Max Normalized Error',max_error_dirty_image)

# Calculate root mean square error
rms_error_dirty_image = np.sqrt(np.nansum(np.square(norm_diff_image_pb_cor)))
print('RMS Normalized Error',rms_error_dirty_image)
../_images/imaging_continuum_imaging_example_14_0.png
../_images/imaging_continuum_imaging_example_14_1.png
../_images/imaging_continuum_imaging_example_14_2.png
../_images/imaging_continuum_imaging_example_14_3.png
Max Normalized Error 0.0005570343301252167
RMS Normalized Error 0.011389306569008918

The difference in primary beam is due to CASA using a sampled 1D function while ngCASA calculates the PB for each pixel. If it is found that PB creation becomes a bottleneck for ngCASA the implementation will be changed to match CASA.