cngi.vis.join_vis

join_vis(mxds, vis1, vis2)[source]

Concatenate two visibility xds’s of compatible shape from the same mxds.

The data variables of the two datasets are merged together, with some limitations (see “Current Limitations” in the Notes section).

Coordinate values that are not also being used as dimensions are compared for equality.

Certain known attributes, namely “ddi”, are updated. The remaining attributes are merged: keys present in only one dataset are kept, and where both datasets share a key, the value from the first dataset overrides the value from the second.
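The merge rule for the remaining attributes can be sketched with plain dictionaries. This is a minimal illustration, not the cngi implementation: `merge_attrs` is a hypothetical helper and the attribute names are made up. The special handling of “ddi” is left out because its update rule is not spelled out here.

```python
def merge_attrs(attrs1, attrs2):
    """Merge two attribute dicts; on shared keys the first dataset wins."""
    merged = dict(attrs2)   # start from the second dataset's attributes
    merged.update(attrs1)   # the first dataset's values override shared keys
    return merged


# Hypothetical attribute dictionaries for two visibility datasets
attrs1 = {"telescope": "A", "observer": "x"}
attrs2 = {"telescope": "B", "project": "p"}
merged = merge_attrs(attrs1, attrs2)
```

Here `merged` keeps "observer" and "project" from whichever dataset had them, and takes "telescope" from the first dataset.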

Parameters
  • mxds (xarray.core.dataset.Dataset) – input multi-xarray Dataset with global data

  • vis1 (str) – first visibility partition in the mxds to join

  • vis2 (str) – second visibility partition in the mxds to join

Returns

New output multi-xarray Dataset with global data

Return type

xarray.core.dataset.Dataset

Warning

Joins are highly discouraged for datasets that don’t share a common ‘global’ DDI (i.e., datasets sourced from different .zarr archives). Think carefully about whether a join would even be meaningful before doing one.

Warning

DDIs are separated by spectral window and correlation (polarization) because this separation is a good indicator of how the data were gathered in hardware. If source data come from different DDIs, the data followed different paths through the hardware during the measurement. This matters because discontinuities are more likely across DDIs than within them. Therefore, don’t join DDIs haphazardly, because doing so could break the inherent link between data and hardware.

Notes

Conflicts in data variable values between datasets:

There are many ways that data values could end up differing between datasets for the same coordinate values. For example, the error bars represented in SIGMA or WEIGHT could differ from one spw to another.

There are many possible solutions to dealing with conflicting data values:

  1. only allow joins that don’t have conflicts (current solution)

  2. add extra indexes CHAN, POL, and/or SPW to the data variables that conflict

  3. add extra indexes CHAN, POL, and/or SPW to all data variables

  4. numerically merge the values (average, max, min, etc)

  5. override the values in xds1 with the values in xds2
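To make options 4 and 5 concrete, the sketch below expresses each as a small resolver function applied wherever two datasets share a coordinate. None of this is part of cngi: the function names are hypothetical, and each data variable is modelled as a plain `{coordinate: value}` dictionary rather than an xarray object.

```python
from statistics import mean


def resolve_average(v1, v2):
    """Option 4: numerically merge the two values by averaging."""
    return mean([v1, v2])


def resolve_override(v1, v2):
    """Option 5: the value from xds2 replaces the value from xds1."""
    return v2


def merge_with(resolver, values1, values2):
    """Merge two {coordinate: value} dicts, resolving shared coordinates."""
    merged = dict(values1)
    for coord, v2 in values2.items():
        merged[coord] = resolver(merged[coord], v2) if coord in merged else v2
    return merged


# Hypothetical WEIGHT values keyed by a single coordinate index
xds1_weight = {0: 1.0, 1: 2.0}
xds2_weight = {1: 4.0, 2: 8.0}
averaged = merge_with(resolve_average, xds1_weight, xds2_weight)
overridden = merge_with(resolve_override, xds1_weight, xds2_weight)
```

Only coordinate 1 is shared, so the two strategies differ only there: averaging gives 3.0, overriding gives 4.0.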

Joins are allowed for:

Datasets that have all different dimension values.

  • Example: xds1 covers time range 22:00-22:59, and xds2 covers time range 23:00-24:00

Datasets that have overlapping dimension values with matching data values at all of those coordinates.

  • Example: xds1.PROCESSOR_ID[0][0] == xds2.PROCESSOR_ID[0][0]

Current Limitations:

Joins are not allowed for datasets that have overlapping dimension values with mismatched data values at any of those coordinates.

  • Example: xds1.PROCESSOR_ID[0][0] != xds2.PROCESSOR_ID[0][0]

  • See “Conflicts in data variable values”, above
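The rule described above (join when coordinates are disjoint or overlapping values agree, refuse otherwise) can be sketched as follows. This is an illustration under simplifying assumptions, not the cngi code: `join_values` is a hypothetical helper, and each data variable is modelled as a dictionary keyed by coordinate tuples.

```python
def join_values(values1, values2):
    """Join two {coordinate: value} mappings, refusing conflicting overlaps."""
    overlap = set(values1) & set(values2)
    conflicts = sorted(c for c in overlap if values1[c] != values2[c])
    if conflicts:
        raise ValueError(f"conflicting values at coordinates: {conflicts}")
    joined = dict(values1)
    joined.update(values2)
    return joined


# Hypothetical data keyed by (time, channel)
a = {("22:00", 0): 1.0, ("22:30", 0): 2.0}

# Disjoint coordinates: joinable.
b = {("23:00", 0): 3.0}
joined_disjoint = join_values(a, b)

# Overlapping coordinates with matching values: joinable.
c = {("22:30", 0): 2.0, ("23:00", 0): 3.0}
joined_overlap = join_values(a, c)

# Overlapping coordinates with mismatched values: rejected.
d = {("22:30", 0): 9.0}
try:
    join_values(a, d)
    rejected = False
except ValueError:
    rejected = True
```

Raising an error on the first detected conflict corresponds to option 1 in the list of possible solutions.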

Joins between ‘global’ datasets, such as those returned by cngi.dio.read_vis(ddi='global'), are probably meaningless and should be avoided.

Datasets do not need to have the same shape.

  • Example: xds1.DATA.shape != xds2.DATA.shape

Examples

Use cases (some of them), to be turned into examples. Note: these use cases come from CASA’s mstransform(combinespws=True) and may not apply to ddijoin.

  • universal calibration across spws

  • autoflagging with broadband rfi

  • uvcontfit and uvcontsub

  • joining datasets that had previously been split, operated on, and are now being re-joined