recursive_diff
Recursively compare Python objects.
See also its most commonly used wrapper, recursive_eq().
xarray_extras.recursive_diff.recursive_diff(lhs, rhs, *, rel_tol=1e-09, abs_tol=0.0, brief_dims=())
Compare two objects and yield all differences. The two objects must be any of:
- basic types (str, int, float, bool)
- basic collections (list, tuple, dict, set, frozenset)
- numpy.ndarray
- pandas.Series
- pandas.DataFrame
- pandas.Index
- xarray.DataArray
- xarray.Dataset
- any recursive combination of the above
- any other object (compared with ==)
Different types receive special treatment:
- floats and ints are compared with tolerance, using math.isclose()
- NaN equals NaN
- bools are only equal to other bools
- numpy arrays are compared elementwise and with tolerance, also testing the dtype
- pandas and xarray objects are compared elementwise, with tolerance, and without order, and do not support duplicate indexes
- xarray dimensions and variables are compared without order
- collections (list, tuple, dict, set, frozenset) are recursively descended into
- generic/unknown objects are compared with ==
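The scalar rules above can be illustrated with the standard library alone. The following is a minimal sketch, not the library's implementation: scalars_match is a hypothetical helper introduced here only to show how tolerance, NaN handling and the bool rule interact.

```python
import math


def scalars_match(lhs, rhs, rel_tol=1e-09, abs_tol=0.0):
    """Hypothetical helper (not part of xarray_extras) mimicking the scalar rules."""
    # bools are only equal to other bools, so True does not match 1
    if isinstance(lhs, bool) or isinstance(rhs, bool):
        return isinstance(lhs, bool) and isinstance(rhs, bool) and lhs == rhs
    if isinstance(lhs, (int, float)) and isinstance(rhs, (int, float)):
        # NaN is treated as equal to NaN, unlike the plain == operator
        if math.isnan(lhs) and math.isnan(rhs):
            return True
        # floats and ints are compared with tolerance
        return math.isclose(lhs, rhs, rel_tol=rel_tol, abs_tol=abs_tol)
    # generic/unknown objects fall back to ==
    return lhs == rhs


print(scalars_match(1.0, 1.1))               # False
print(scalars_match(1.0, 1.1, abs_tol=0.5))  # True
print(scalars_match(True, 1))                # False
```

Note that math.isclose() applies rel_tol relative to the larger of the two magnitudes, so the default abs_tol=0.0 never treats a non-zero number as close to zero.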
Custom classes can be registered to benefit from the above behaviour; see documentation in cast().
Parameters:
- lhs – left-hand-side data structure
- rhs – right-hand-side data structure
- rel_tol (float) – relative tolerance when comparing numbers. Applies to floats, integers, and all numpy-based data.
- abs_tol (float) – absolute tolerance when comparing numbers. Applies to floats, integers, and all numpy-based data.
- brief_dims – One of:
  - sequence of strings representing xarray dimensions. If one or more differences are found along one of these dimensions, only one message will be reported, stating the number of differences.
  - "all", to produce only one line for every xarray variable that differs
  Omit to output a line for every single different cell.
Yields strings containing difference messages, prepended by the path to the point that differs.
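To make the "yield one message per difference, prefixed by its path" behaviour concrete, here is a toy generator written with the standard library only. It is a sketch under simplifying assumptions: simple_diff and _is_number are hypothetical names, only dicts, lists and numbers are handled, and the message format is loosely modelled on the examples in this page rather than copied from the library.

```python
import math


def _is_number(obj):
    # bool is a subclass of int; exclude it so bools are not compared with tolerance
    return isinstance(obj, (int, float)) and not isinstance(obj, bool)


def simple_diff(lhs, rhs, *, rel_tol=1e-09, abs_tol=0.0, path=()):
    """Toy recursive comparison (hypothetical, not the library's implementation)."""
    # Build the path prefix, e.g. "[a][1]: "
    prefix = "".join(f"[{p}]" for p in path)
    if prefix:
        prefix += ": "
    kwargs = {"rel_tol": rel_tol, "abs_tol": abs_tol}

    if isinstance(lhs, dict) and isinstance(rhs, dict):
        # Visit keys in a deterministic order and descend into shared ones
        for key in sorted(lhs.keys() | rhs.keys(), key=repr):
            if key not in rhs:
                yield f"{prefix}key {key!r} only in lhs"
            elif key not in lhs:
                yield f"{prefix}key {key!r} only in rhs"
            else:
                yield from simple_diff(lhs[key], rhs[key], path=path + (key,), **kwargs)
    elif isinstance(lhs, list) and isinstance(rhs, list):
        if len(lhs) != len(rhs):
            yield f"{prefix}lhs has {len(lhs)} elements, rhs has {len(rhs)}"
        for i, (left, right) in enumerate(zip(lhs, rhs)):
            yield from simple_diff(left, right, path=path + (i,), **kwargs)
    elif _is_number(lhs) and _is_number(rhs):
        if not math.isclose(lhs, rhs, **kwargs):
            yield f"{prefix}{lhs} != {rhs}"
    elif lhs != rhs:
        yield f"{prefix}{lhs!r} != {rhs!r}"


lhs = {"a": [1.0, 2.0], "b": "x"}
rhs = {"a": [1.0, 2.5], "b": "y"}
print(list(simple_diff(lhs, rhs)))
# ['[a][1]: 2.0 != 2.5', "[b]: 'x' != 'y'"]
print(list(simple_diff(lhs, rhs, abs_tol=1.0)))
# ["[b]: 'x' != 'y'"]
```

Because the function is a generator, callers can stop at the first difference or collect all of them with list().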
xarray_extras.recursive_diff.cast(obj, brief_dims)
Helper function of recursive_diff(). Cast objects into simpler object types:
- Cast tuple to list
- Cast frozenset to set
- Cast all numpy-based objects to xarray.DataArray, as it is the most generic format that can describe all use cases:
  - numpy.ndarray
  - pandas.Series
  - pandas.DataFrame
  - pandas.Index, except pandas.RangeIndex, which is instead returned unaltered
  - xarray.Dataset
  The data will be potentially wrapped by a dict to hold the various attributes and marked so that it doesn’t trigger an infinite recursion.
- Do nothing for any other object types.
Parameters:
- obj – complex object that must be simplified
- brief_dims (tuple) – sequence of xarray dimensions that must be compacted. See documentation on recursive_diff().
Returns: simpler object to compare
Custom objects
This is a single dispatch function which can be extended to compare custom objects. Take for example this custom class:
>>> class Rectangle:
...     def __init__(self, w, h):
...         self.w = w
...         self.h = h
...
...     def __eq__(self, other):
...         return self.w == other.w and self.h == other.h
...
...     def __repr__(self):
...         return 'Rectangle(%f, %f)' % (self.w, self.h)
The above can be processed by recursive_diff, because it supports the == operator against objects of the same type, and when converted to string it conveys meaningful information:
>>> list(recursive_diff(Rectangle(1, 2), Rectangle(3, 4)))
['Rectangle(1.000000, 2.000000) != Rectangle(3.000000, 4.000000)']
However, it doesn’t support the more powerful features of recursive_diff, namely recursion and tolerance:
>>> list(recursive_diff(
...     Rectangle(1, 2), Rectangle(1.1, 2.2), abs_tol=.5))
['Rectangle(1.000000, 2.000000) != Rectangle(1.100000, 2.200000)']
This can be fixed by registering a custom cast function:
>>> @cast.register(Rectangle)
... def _(obj, brief_dims):
...     return {'w': obj.w, 'h': obj.h}
After doing so, w and h will be compared with tolerance and, if they are collections, will be recursively descended into:
>>> list(recursive_diff(
...     Rectangle(1, 2), Rectangle(1.1, 2.7), abs_tol=.5))
['[h]: 2.0 != 2.7 (abs: 7.0e-01, rel: 3.5e-01)']
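For readers unfamiliar with single dispatch, the registration mechanism can be sketched with functools.singledispatch from the standard library. This is an illustration only: simplify is a hypothetical stand-in for cast() that omits the brief_dims argument and the numpy/pandas/xarray handling, and it is not guaranteed to match how the library is written internally.

```python
from functools import singledispatch


@singledispatch
def simplify(obj):
    """Hypothetical stand-in for cast(): by default, do nothing."""
    return obj


@simplify.register(tuple)
def _(obj):
    # Cast tuple to list
    return list(obj)


@simplify.register(frozenset)
def _(obj):
    # Cast frozenset to set
    return set(obj)


class Rectangle:
    def __init__(self, w, h):
        self.w = w
        self.h = h


@simplify.register(Rectangle)
def _(obj):
    # Expose the attributes as a dict, so that a recursive comparison
    # could descend into them one by one
    return {'w': obj.w, 'h': obj.h}


print(simplify((1, 2)))           # [1, 2]
print(simplify(Rectangle(1, 2)))  # {'w': 1, 'h': 2}
print(simplify('unchanged'))      # unchanged
```

The dispatch is chosen from the type of the first argument, which is why registering a handler for a new class requires no change to the comparison code itself.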