sort

Sorting functions

xarray_extras.sort.topk(a, k, dim, split_every=None)

Extract the k largest elements from a on the given dimension, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.

This assumes that k is small. All results will be returned in a single chunk along the given axis.

xarray_extras.sort.argtopk(a, k, dim, split_every=None)

Extract the indexes of the k largest elements from a on the given dimension, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.

This assumes that k is small. All results will be returned in a single chunk along the given axis.

xarray_extras.sort.take_along_dim(a, ind, dim)

Use the output of argtopk() to pick points from a.

Parameters:
  • a – any xarray object
  • ind – array of ints, as returned by argtopk()
  • dim – dimension along which argtopk was executed

An example that uses all of the above functions is source attribution. Given a generic function \(y = f(x_{0}, x_{1}, ..., x_{i})\), which is embarassingly parallel along a given dimension, one wants to find:

  • the top k elements of y along the dimension
  • the elements of all x’s that generated the top k elements of y
>>> from xarray import DataArray
>>> from xarray_extras.sort import *
>>> x = DataArray([[5, 3, 2, 8, 1],
>>>                [0, 7, 1, 3, 2]], dims=['x', 's'])
>>> y = x.sum('x')  # y = f(x), embarassingly parallel among dimension 's'
>>> y
<xarray.DataArray (s: 5)>
array([ 5, 10,  3, 11,  3])
Dimensions without coordinates: s
>>> top_y = topk(y, 3, 's')
>>> top_y
<xarray.DataArray (s: 3)>
array([11, 10,  5])
Dimensions without coordinates: s
>>> top_x = take_along_dim(x, argtopk(y, 3, 's'), 's')
>>> top_x
<xarray.DataArray (x: 2, s: 3)>
array([[8, 3, 5],
       [3, 7, 0]])
Dimensions without coordinates: x, s