cngi.conversion.convert_table

convert_table(infile, outfile=None, subtable=None, keys=None, timecols=None, ignorecols=None, compressor=None, chunks=(10000, -1), append=False, nofile=False)

Convert casacore table format to xarray Dataset and zarr storage format.

This function requires the CASA6 casatools module. Table rows may be renamed or expanded to n-dim arrays based on the column values specified in keys.
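To make the idea of expanding rows by key columns concrete, here is a minimal pure-Python sketch. It is not the cngi implementation: the helper `expand_rows`, the sample rows, and the two-dimension restriction are all assumptions made for illustration. It shows how a keys-style mapping could turn flat rows into a grid indexed by the unique values of the key columns.

```python
# Hypothetical flat table: one row per (TIME, ANTENNA1, ANTENNA2) combination.
rows = [
    {"TIME": 10.0, "ANTENNA1": 0, "ANTENNA2": 1, "DATA": 1.5},
    {"TIME": 10.0, "ANTENNA1": 0, "ANTENNA2": 2, "DATA": 2.5},
    {"TIME": 20.0, "ANTENNA1": 0, "ANTENNA2": 1, "DATA": 3.5},
    {"TIME": 20.0, "ANTENNA1": 0, "ANTENNA2": 2, "DATA": 4.5},
]


def key_value(row, src):
    """Return the key for one row; a tuple of columns yields a tuple of values."""
    return tuple(row[c] for c in src) if isinstance(src, tuple) else row[src]


def expand_rows(rows, keys, value_col, fill=None):
    """Pivot flat rows into a 2-D grid (illustrative helper, two dims only)."""
    if len(keys) != 2:
        raise ValueError("this sketch handles exactly two target dimensions")
    (src0, dim0), (src1, dim1) = keys.items()
    # Coordinates are the sorted unique key values found in the rows.
    coords = {
        dim0: sorted({key_value(r, src0) for r in rows}),
        dim1: sorted({key_value(r, src1) for r in rows}),
    }
    grid = [[fill] * len(coords[dim1]) for _ in coords[dim0]]
    for r in rows:
        i = coords[dim0].index(key_value(r, src0))
        j = coords[dim1].index(key_value(r, src1))
        grid[i][j] = r[value_col]
    return coords, grid


# A tuple key combines two source columns into a single "baseline" dimension.
keys = {"TIME": "time", ("ANTENNA1", "ANTENNA2"): "baseline"}
coords, grid = expand_rows(rows, keys, "DATA")
```

In this sketch the row axis disappears and each value is addressed by (time, baseline) instead. Combinations missing from the input stay at the fill value.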

Parameters
  • infile (str) – Input table filename

  • outfile (str) – Output zarr filename. If None, will use infile name with .tbl.zarr extension

  • subtable (str) – Name of the subtable to process. If None, main table will be used

  • keys (dict or str) – Source column mappings to dimensions. Can be a dict mapping source columns to target dims (use a tuple when combining columns, e.g. {('ANTENNA1','ANTENNA2'):'baseline'}) or a string to rename the row axis dimension to the specified value. Default is None

  • timecols (list) – list of strings specifying column names to convert to datetime format from casacore time. Default is None

  • ignorecols (list) – list of column names to ignore. This is useful if a particular column is causing errors. Default is None

  • compressor (numcodecs.blosc.Blosc) – The blosc compressor to use when saving the converted data to disk using zarr. If None, the zstd compression algorithm is used with compression level 2.

  • chunks (int) – Shape of desired chunking in the form of (dim0, dim1, …, dimN), use -1 for entire axis in one chunk. Default is (10000, -1). Chunking is applied per column / data variable. If too few dimensions are specified, the last chunk size is reused as necessary. Note: chunk size is the product of the chunk dimensions, and data is batch processed along the first axis, so that axis drives the memory needed for conversion.

  • append (bool) – Append an xarray dataset as a new partition to an existing zarr directory. False will overwrite zarr directory with a single new partition

  • nofile (bool) – Allows legacy table to be directly read without file conversion. If set to true, no output file will be written and entire table will be held in memory. Requires ~4x the memory of the table size. Default is False
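The chunks rules above (reuse of the last chunk size, -1 for a whole axis, chunk size as a product) can be sketched as follows. The helper `effective_chunks` is hypothetical and is not part of cngi. Clamping a chunk to the axis length is also an assumption of this sketch rather than documented behavior.

```python
import math


def effective_chunks(shape, chunks):
    """Resolve a chunks tuple against a column's shape (illustrative only)."""
    chunks = tuple(chunks)
    # If too few dimensions are given, reuse the last chunk size.
    missing = max(0, len(shape) - len(chunks))
    padded = chunks[: len(shape)] + (chunks[-1],) * missing
    # -1 means the entire axis in one chunk; otherwise clamp to the axis
    # length (the clamping is an assumption of this sketch).
    return tuple(dim if c == -1 else min(c, dim) for dim, c in zip(shape, padded))


# A hypothetical 3-D column with 250000 rows, chunked with the default tuple.
resolved = effective_chunks((250000, 4, 2), (10000, -1))

# Chunk size in elements is the product of the resolved chunk dimensions.
elements_per_chunk = math.prod(resolved)
```

Because conversion is batched along the first axis, lowering the first entry of chunks is the most direct way to reduce the memory needed per batch.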

Returns

New xarray Dataset of table data contents. One element in list per DDI plus the metadata global.

Return type

xarray.core.dataset.Dataset
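A call might look like the sketch below, which uses only the parameters listed in the signature. The wrapper name `convert_antenna_subtable`, its path arguments, and the dimension name "antenna" are hypothetical, and "ANTENNA" is assumed to be a subtable present in the input. The import is placed inside the function so the snippet can be loaded without cngi or casatools installed.

```python
def convert_antenna_subtable(table_path, zarr_path=None):
    """Convert one subtable to zarr and return the resulting Dataset.

    table_path and zarr_path are placeholder arguments for this sketch.
    """
    # Imported lazily: requires cngi and the CASA6 casatools module.
    from cngi.conversion import convert_table

    return convert_table(
        table_path,
        outfile=zarr_path,    # None falls back to the infile name with .tbl.zarr
        subtable="ANTENNA",   # assumed to exist in the input table
        keys="antenna",       # string form: rename the row axis dimension
        chunks=(10000, -1),   # the default from the signature
    )
```

Passing nofile=True instead would skip writing output and hold the table in memory, at the roughly 4x memory cost noted above.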