moocore.apply_within_sets
- moocore.apply_within_sets(x, sets, func, **kwargs)
Split x by row according to sets and apply func to each sub-array.
- Parameters:
x (ArrayLike) – 2D array to be divided into sub-arrays.
sets (ArrayLike) – A list or 1D array of length equal to the number of rows of x. The values are used as-is to determine the groups and do not need to be sorted.
func (Callable[..., Any]) – A function that can take a 2D array as input. This function may return (1) a 2D array with the same number of rows as the input, (2) a 1D array as long as the number of input rows, (3) a scalar value, or (4) a 2D array with a single row.
kwargs – Additional keyword arguments to func.
- Returns:
ndarray – An array whose shape depends on the output of func. See Examples below.
Examples
>>> import numpy as np
>>> import moocore
>>> sets = np.array([3, 1, 2, 4, 2, 3, 1])
>>> x = np.arange(len(sets) * 2).reshape(-1, 2)
>>> x = np.hstack((x, sets.reshape(-1, 1)))
If func returns an array with the same number of rows as the input (case 1), then the output is ordered in exactly the same way as the input.
>>> moocore.apply_within_sets(x, sets, lambda x: x)
array([[ 0,  1,  3],
       [ 2,  3,  1],
       [ 4,  5,  2],
       [ 6,  7,  4],
       [ 8,  9,  2],
       [10, 11,  3],
       [12, 13,  1]])
This is also the behavior if func returns a 1D array with one value per input row (case 2).
>>> moocore.apply_within_sets(x, sets, lambda x: x.sum(axis=1))
array([ 4,  6, 11, 17, 19, 24, 26])
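To make the row-order guarantee of cases 1 and 2 concrete, the following is a minimal sketch written with plain NumPy boolean masks. It does not call moocore and is not the library's implementation; center_within_sets is a hypothetical helper introduced only for illustration. It subtracts each group's column means from the rows of that group and writes the results back to the original row positions.

```python
import numpy as np

sets = np.array([3, 1, 2, 4, 2, 3, 1])
x = np.arange(len(sets) * 2).reshape(-1, 2)


def center_within_sets(x, sets):
    """Hypothetical helper: subtract per-group column means, keeping row order."""
    x = np.asarray(x, dtype=float)
    sets = np.asarray(sets)
    out = np.empty_like(x)
    for label in np.unique(sets):
        mask = sets == label  # rows that belong to this group
        # The result has as many rows as the group, so it can be written
        # back to the same positions the rows came from.
        out[mask] = x[mask] - x[mask].mean(axis=0)
    return out


centered = center_within_sets(x, sets)
print(centered)
```

Under the behavior described above, a call such as moocore.apply_within_sets(x, sets, lambda g: g - g.mean(axis=0)) would be expected to give the same values, since the function returns one output row per input row.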
If func returns a single scalar (case 3) or a 2D array with a single row (case 4), then the output follows the order in which the unique values first appear in sets, without sorting them. This matches what pandas.Series.unique() returns and NOT what numpy.unique() returns.
>>> moocore.apply_within_sets(x, sets, lambda x: x.max())
array([11, 13,  9,  7])
>>> moocore.apply_within_sets(x, sets, lambda x: [x.max(axis=0)])
array([[10, 11,  3],
       [12, 13,  1],
       [ 8,  9,  2],
       [ 6,  7,  4]])
In the previous example, func returns a 2D array with a single row. The following will produce an error because it returns a 1D array, which is interpreted as case 2, but the number of values does not match the number of input rows.
>>> moocore.apply_within_sets(
...     x, sets, lambda x: x.max(axis=0)
... )
Traceback (most recent call last):
    ...
ValueError: `func` returned an array of length 3 but the input has length 2 for rows [0 5]
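The group ordering used for cases 3 and 4 can be reproduced with NumPy alone, which may help when matching results back to group labels. The sketch below is an illustration under that assumption and does not call moocore; the variable names are chosen here for the example. It recovers the labels in order of first appearance from the indices returned by numpy.unique and then computes one maximum per group in that order.

```python
import numpy as np

sets = np.array([3, 1, 2, 4, 2, 3, 1])
x = np.arange(len(sets) * 2).reshape(-1, 2)
x = np.hstack((x, sets.reshape(-1, 1)))

# numpy.unique sorts the labels.
sorted_labels = np.unique(sets)

# Order of first appearance: take the index of each label's first
# occurrence, sort those indices, and look the labels up again.
_, first_index = np.unique(sets, return_index=True)
appearance_labels = sets[np.sort(first_index)]

# One scalar per group, listed in order of first appearance (case 3).
group_max = np.array([x[sets == label].max() for label in appearance_labels])

print(sorted_labels)      # [1 2 3 4]
print(appearance_labels)  # [3 1 2 4]
print(group_max)          # [11 13  9  7]
```

The values in group_max agree with the case 3 output shown earlier, so appearance_labels can serve as the labels for the rows or entries of a case 3 or case 4 result.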