cngi.conversion.convert_ms

convert_ms(infile, outfile=None, ddis=None, ignore=['HISTORY'], compressor=None, chunks=(100, 400, 32, 1), sub_chunks=10000, append=False)

Convert legacy format MS to xarray Visibility Dataset and zarr storage format

This function requires the CASA6 casatools module. The CASA MSv2 format is converted to the MSv3 schema as defined here: https://drive.google.com/file/d/10TZ4dsFw9CconBc-GFxSeb2caT6wkmza/view?usp=sharing

The MS is partitioned by DDI, which guarantees a fixed data shape per partition. This results in different subdirectories under the main vis.zarr folder. There is no DDI in MSv3, so this simply serves as a partition id in the zarr directory.
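The partitioning idea can be illustrated with a minimal, standalone sketch. It does not call convert_ms or read a real MS; the `rows` list and the `partition_by_ddi` helper are hypothetical and exist only for illustration. Each DDI in an MS refers to one spectral window and one polarization setup, so rows that share a DDI share the same channel and polarization counts.

```python
from collections import defaultdict

# Hypothetical main-table rows: (ddi, n_channels, n_polarizations).
# Real values would come from the MS; these are made up for illustration.
rows = [
    (0, 64, 4),
    (1, 128, 2),
    (0, 64, 4),
    (1, 128, 2),
    (0, 64, 4),
]


def partition_by_ddi(rows):
    """Group the per-row (channel, polarization) shapes by DDI."""
    parts = defaultdict(list)
    for ddi, nchan, npol in rows:
        parts[ddi].append((nchan, npol))
    return dict(parts)


partitions = partition_by_ddi(rows)

# Collect the distinct shapes seen in each partition.
shapes = {ddi: set(found) for ddi, found in partitions.items()}

for ddi in sorted(shapes):
    print(f"DDI {ddi}: {len(partitions[ddi])} rows, shapes {shapes[ddi]}")
```

Every partition in this sketch contains exactly one distinct (channel, polarization) shape, which is the property that lets each DDI be stored as a regular array.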

Parameters
  • infile (str) – Input MS filename

  • outfile (str) – Output zarr filename. If None, will use infile name with .vis.zarr extension

  • ddis (list) – List of specific DDIs to convert. DDIs are integer values, or use the string ‘global’ for subtables. Leave as None to convert the entire MS

  • ignore (list) – List of subtables to ignore (case sensitive and generally all uppercase). This is useful if a particular subtable is causing errors. The default is temporarily set to ['HISTORY'] due to a CASA6 issue in the table tool affecting a small set of test cases; set it to None if HISTORY is needed

  • compressor (numcodecs.blosc.Blosc) – The blosc compressor to use when saving the converted data to disk using zarr. If None, the zstd compression algorithm is used with compression level 2.

  • chunks (4-D tuple of ints) – Shape of desired chunking in the form of (time, baseline, channel, polarization). Use -1 to place an entire axis in one chunk. Default is (100, 400, 32, 1). Note: chunk size is the product of the four numbers, and data is batch processed along the time axis, so the chunk size drives the memory needed for conversion.

  • sub_chunks (int) – Chunking used for subtable conversion (except for POINTING, which will use the time/baseline dims from the chunks parameter). This is a single integer used for the row-axis (d0) chunking only; no other dims in the subtables will be chunked.

  • append (bool) – Keep destination zarr store intact and add new DDIs to it. Note that duplicate DDIs will still be overwritten. The default, False, deletes and replaces the entire directory.
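Because chunk size is the product of the four chunks values, it can help to estimate it before converting. The following is a rough sketch, not part of cngi: the helper names, the example dataset shape, and the 8 bytes per element (as for a complex64 visibility value) are all assumptions made for illustration.

```python
from math import prod


def resolve_chunks(chunks, shape):
    """Replace each -1 entry with the full length of the matching axis."""
    return tuple(full if c == -1 else c for c, full in zip(chunks, shape))


def chunk_elements(chunks, shape):
    """Number of elements in one chunk: the product of the four chunk lengths."""
    return prod(resolve_chunks(chunks, shape))


def chunk_bytes(chunks, shape, itemsize=8):
    """Approximate bytes per chunk. itemsize=8 assumes complex64 data."""
    return chunk_elements(chunks, shape) * itemsize


# Hypothetical dataset shape: (time, baseline, channel, polarization)
shape = (2000, 351, 64, 4)

default_chunks = (100, 400, 32, 1)   # default from the function signature
whole_axes = (100, -1, -1, 1)        # full baseline and channel axes per chunk

for chunks in (default_chunks, whole_axes):
    n = chunk_elements(chunks, shape)
    mb = chunk_bytes(chunks, shape) / 1e6
    print(f"{chunks}: {n} elements, about {mb:.1f} MB per chunk")
```

A tuple chosen this way would then be passed as the chunks argument of convert_ms. Actual memory use during conversion depends on the data types and columns in the MS, so treat the figure as an order-of-magnitude guide only.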

Returns

Master xarray dataset of datasets for this visibility set

Return type

xarray.core.dataset.Dataset