apply_within_sets
- moocore.apply_within_sets(x, sets, func, **kwargs)
Split x by row according to sets and apply func to each sub-array.
- Parameters:
  x (ArrayLike) – 2D array to be divided into sub-arrays.
  sets (ArrayLike) – A list or 1D array of length equal to the number of rows of x. The values are used as-is to determine the groups and do not need to be sorted.
  func (Callable[..., Any]) – A function that can take a 2D array as input. This function may return (1) a 2D array with the same number of rows as the input, (2) a 1D array as long as the number of input rows, (3) a scalar value, or (4) a 2D array with a single row.
  kwargs – Additional keyword arguments to func.
- Returns:
  ndarray – An array whose shape depends on the output of func. See Examples below.
Examples
>>> import numpy as np
>>> import moocore
>>> sets = np.array([3, 1, 2, 4, 2, 3, 1])
>>> x = np.arange(len(sets) * 2).reshape(-1, 2)
>>> x = np.hstack((x, sets.reshape(-1, 1)))
If func returns an array with the same number of rows as the input (case 1), then the output is ordered in exactly the same way as the input.
>>> moocore.apply_within_sets(x, sets, lambda x: x)
array([[ 0,  1,  3],
       [ 2,  3,  1],
       [ 4,  5,  2],
       [ 6,  7,  4],
       [ 8,  9,  2],
       [10, 11,  3],
       [12, 13,  1]])
This is also the behavior if func returns a 1D array with one value per input row (case 2).
>>> moocore.apply_within_sets(x, sets, lambda x: x.sum(axis=1))
array([ 4,  6, 11, 17, 19, 24, 26])
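The row-preserving behavior of cases 1 and 2 can be illustrated with a small NumPy-only sketch. The helper name apply_rowwise_sketch is hypothetical and this is not moocore's implementation; it only assumes that each group's result is written back to the positions of that group's rows and that extra keyword arguments are forwarded to func.

```python
import numpy as np


def apply_rowwise_sketch(x, sets, func, **kwargs):
    """Hypothetical stand-in for cases 1 and 2 (not moocore's code)."""
    x = np.asarray(x)
    sets = np.asarray(sets)
    out = None
    # The iteration order over groups is irrelevant here, because every
    # result is scattered back to the original row positions.
    for s in np.unique(sets):
        rows = np.flatnonzero(sets == s)
        res = np.asarray(func(x[rows], **kwargs))
        if out is None:
            # Allocate the output once the per-row result shape is known.
            out = np.empty((len(sets),) + res.shape[1:], dtype=res.dtype)
        out[rows] = res
    return out


sets = np.array([3, 1, 2, 4, 2, 3, 1])
x = np.arange(len(sets) * 2).reshape(-1, 2)
x = np.hstack((x, sets.reshape(-1, 1)))

# Keyword arguments (here axis=1) are passed through to func.
row_sums = apply_rowwise_sketch(x, sets, np.sum, axis=1)
identity = apply_rowwise_sketch(x, sets, lambda a: a)
print(row_sums)
```

Under these assumptions, row_sums lines up with the rows of x regardless of how the rows are grouped.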
If func returns a single scalar (case 3) or a 2D array with a single row (case 4), then the output follows the order in which the unique values first appear in sets, without sorting them. This is the order that pandas.Series.unique() returns and NOT the sorted order that numpy.unique() returns.
>>> moocore.apply_within_sets(x, sets, lambda x: x.max())
array([11, 13,  9,  7])
>>> moocore.apply_within_sets(x, sets, lambda x: [x.max(axis=0)])
array([[10, 11,  3],
       [12, 13,  1],
       [ 8,  9,  2],
       [ 6,  7,  4]])
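The difference between first-appearance order and sorted order can be checked with NumPy alone. The following is a minimal sketch, not part of moocore; the variable names are illustrative, and the per-group maximum is computed by hand only to show which ordering matches the case 3 output above.

```python
import numpy as np

sets = np.array([3, 1, 2, 4, 2, 3, 1])
x = np.arange(len(sets) * 2).reshape(-1, 2)
x = np.hstack((x, sets.reshape(-1, 1)))

# numpy.unique() returns the unique values sorted.
sorted_unique = np.unique(sets)

# Order of first appearance: take the index of each value's first
# occurrence, sort those indices, and read the values back from sets.
_, first_idx = np.unique(sets, return_index=True)
appearance_order = sets[np.sort(first_idx)]

# Per-group maxima listed in first-appearance order.
group_max = np.array([x[sets == s].max() for s in appearance_order])

print(sorted_unique)      # sorted order
print(appearance_order)   # first-appearance order
print(group_max)
```

Here group_max reproduces the case 3 result because the groups are visited as 3, 1, 2, 4 rather than 1, 2, 3, 4.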
In the previous example, func returns a 2D array with a single row. The following will produce an error because it returns a 1D array, which is interpreted as case 2, but the number of values does not match the number of input rows.
>>> moocore.apply_within_sets(
...     x, sets, lambda x: x.max(axis=0)
... )
Traceback (most recent call last):
    ...
ValueError: `func` returned an array of length 3 but the input has length 2 for rows [0 5]
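The shape difference behind this error can be seen on a single group in plain NumPy. This sketch uses the two rows of set 3 (rows 0 and 5 of x) and shows two ways, besides wrapping the result in a list, to get a 2D array with a single row. That such a result is treated as case 4 follows from the parameter description above, not from testing against moocore.

```python
import numpy as np

# Rows 0 and 5 of x from the examples, i.e. the sub-array for set 3.
block = np.array([[0, 1, 3],
                  [10, 11, 3]])

# 1D result of length 3: looks like "one value per row" (case 2),
# but the block has only 2 rows, hence the length mismatch.
col_max_1d = block.max(axis=0)

# 2D results with a single row (case 4).
col_max_keepdims = block.max(axis=0, keepdims=True)
col_max_newaxis = block.max(axis=0)[np.newaxis, :]

print(col_max_1d.shape, col_max_keepdims.shape, col_max_newaxis.shape)
```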