# sort

Sorting functions

xarray_extras.sort.argtopk(a: TV, k: int, dim: Hashable, split_every: = None) -> TV

Extract the indexes of the k largest elements from a on the given dimension, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.

This assumes that k is small. All results will be returned in a single chunk along the given axis.
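The sign convention for k can be illustrated with a small NumPy reference. This is only a sketch of the documented semantics, not the library's implementation: the helper name `argtopk_reference` is hypothetical, it works on plain arrays rather than xarray objects, and it ignores chunking and `split_every`. Tie-breaking between equal values is not specified above, so the sample data avoids ties.

```python
import numpy as np


def argtopk_reference(a, k, axis=-1):
    """Hypothetical NumPy reference for the documented argtopk semantics."""
    a = np.asarray(a)
    # Indices that sort `a` in ascending order along `axis`.
    order = np.argsort(a, axis=axis, kind="stable")
    if k >= 0:
        # Reverse so that the largest elements come first.
        order = np.flip(order, axis=axis)
    # Keep the first |k| indices along `axis`.
    return np.take(order, np.arange(abs(k)), axis=axis)


y = np.array([5, 10, 3, 11, 4])
largest = argtopk_reference(y, 3)    # indices of the 3 largest, largest first
smallest = argtopk_reference(y, -2)  # indices of the 2 smallest, smallest first
```

Indexing `y` with `largest` gives the values in descending order, and indexing with `smallest` gives them in ascending order, which mirrors the ordering described for positive and negative k.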

xarray_extras.sort.take_along_dim(a: xarray_extras.sort.T, ind: xarray_extras.sort.T, dim: collections.abc.Hashable) -> xarray_extras.sort.T

Use the output of argtopk() to pick points from a.

Parameters
• a – xarray.DataArray or xarray.Dataset

• ind – array of ints, as returned by argtopk()

• dim – dimension along which argtopk was executed
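As a rough plain-NumPy analogue of picking points by index, the sketch below covers two cases: one index vector shared by every row, and a separate set of indices per row. It assumes that take_along_dim() behaves like these NumPy calls combined with xarray's broadcasting by dimension name; that is an assumption for illustration, not something stated above, and the variable names are hypothetical.

```python
import numpy as np

x = np.array([[5, 3, 2, 8, 1],
              [0, 7, 1, 3, 2]])

# Case 1: one index vector shared by every row, e.g. indices computed on a
# reduction of x. np.take applies the same positions to each row.
shared_ind = np.array([3, 1, 0])
picked_shared = np.take(x, shared_ind, axis=1)

# Case 2: per-row indices with the same number of dimensions as x.
# Here: the positions of the 2 largest values in each row, largest first.
per_row_ind = np.argsort(-x, axis=1, kind="stable")[:, :2]
picked_per_row = np.take_along_axis(x, per_row_ind, axis=1)
```

In case 1 every row is sampled at the same positions; in case 2 each row is sampled at its own positions.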

xarray_extras.sort.topk(a: TV, k: int, dim: Hashable, split_every: = None) -> TV

Extract the k largest elements from a on the given dimension, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.

This assumes that k is small. All results will be returned in a single chunk along the given axis.

An example that uses all of the above functions is source attribution. Given a generic function $$y = f(x_{0}, x_{1}, \ldots, x_{i})$$ that is embarrassingly parallel along a given dimension, one wants to find:

• the top k elements of y along the dimension

• the elements of all x’s that generated the top k elements of y

>>> from xarray import DataArray
>>> from xarray_extras.sort import argtopk, take_along_dim, topk
>>> x = DataArray([[5, 3, 2, 8, 1],
...                [0, 7, 1, 3, 2]], dims=['x', 's'])
>>> y = x.sum('x')  # y = f(x), embarrassingly parallel along dimension 's'
>>> y
<xarray.DataArray (s: 5)>
array([ 5, 10,  3, 11,  3])
Dimensions without coordinates: s
>>> top_y = topk(y, 3, 's')
>>> top_y
<xarray.DataArray (s: 3)>
array([11, 10,  5])
Dimensions without coordinates: s
>>> top_x = take_along_dim(x, argtopk(y, 3, 's'), 's')
>>> top_x
<xarray.DataArray (x: 2, s: 3)>
array([[8, 3, 5],
       [3, 7, 0]])
Dimensions without coordinates: x, s