oasis.stratification.stratify_by_scores

oasis.stratification.stratify_by_scores(scores, goal_n_strata='auto', method='cum_sqrt_F', n_bins='auto')

Stratify by binning the items based on their scores.

Parameters
scores : array-like, shape=(n_items,)

ordered array of scores that quantify the classifier's confidence for the items in the pool. A high score indicates high confidence that the true label is “1”; a low score indicates high confidence that the true label is “0”.

goal_n_strata : int or ‘auto’, optional, default ‘auto’

desired number of strata. If set to ‘auto’, the number is selected using the Freedman-Diaconis rule. Note that for the ‘cum_sqrt_F’ method this number is only a goal: the actual number of strata created may be smaller.

method : {‘cum_sqrt_F’, ‘equal_size’}, optional, default ‘cum_sqrt_F’

stratification method to use. ‘equal_size’ aims to create s

Returns
Strata instance
Other Parameters
n_bins : int or ‘auto’, optional, default ‘auto’

number of bins to use when estimating the distribution of the scores. This is used when goal_n_strata = 'auto' and/or when method = 'cum_sqrt_F'. If set to ‘auto’, the number is selected using the Freedman-Diaconis rule.
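
Examples

The following is a minimal, standard-library sketch of the two ideas named in the parameter descriptions: the Freedman-Diaconis rule for choosing a bin count, and the textbook cumulative square root of frequency rule for placing stratum boundaries. It is not taken from the oasis source, and the library's own implementation may differ in its details. The helper names `freedman_diaconis_n_bins` and `cum_sqrt_f_allocations` are hypothetical and exist only for this illustration.

```python
import math
import statistics
from itertools import accumulate


def freedman_diaconis_n_bins(scores):
    """Bin count from the Freedman-Diaconis rule: width = 2 * IQR / n^(1/3)."""
    n = len(scores)
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    width = 2.0 * (q3 - q1) / n ** (1.0 / 3.0)
    span = max(scores) - min(scores)
    if width == 0 or span == 0:
        return 1  # degenerate scores: fall back to a single bin
    return max(1, math.ceil(span / width))


def cum_sqrt_f_allocations(scores, goal_n_strata, n_bins):
    """Assign each score a stratum index using the cumulative sqrt(F) rule.

    Hypothetical helper for illustration; not the oasis implementation.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0] * len(scores)  # identical scores: one stratum
    bin_width = (hi - lo) / n_bins
    # Histogram bin of every score (the maximum falls in the last bin).
    bin_of = [min(int((s - lo) / bin_width), n_bins - 1) for s in scores]
    counts = [0] * n_bins
    for b in bin_of:
        counts[b] += 1
    # Cumulative sum of the square roots of the bin frequencies.
    csf = list(accumulate(math.sqrt(c) for c in counts))
    step = csf[-1] / goal_n_strata
    # Cut the cumulative scale into goal_n_strata equal-width intervals and
    # map every histogram bin to the interval its cumulative value falls in.
    raw = [
        min(goal_n_strata - 1, max(0, math.ceil(v / step - 1e-9) - 1))
        for v in csf
    ]
    # Intervals containing no occupied bin vanish, so relabel consecutively.
    used = sorted({raw[b] for b in bin_of})
    relabel = {old: new for new, old in enumerate(used)}
    return [relabel[raw[b]] for b in bin_of]


scores = [i / 100 for i in range(100)]
n_bins = freedman_diaconis_n_bins(scores)
allocations = cum_sqrt_f_allocations(scores, goal_n_strata=10, n_bins=n_bins)
print(n_bins, len(set(allocations)))
```

In this sketch the number of strata can never exceed the number of occupied histogram bins, which is one way a goal of 10 strata can yield fewer in practice. This matches the caveat given for goal_n_strata, although the exact conditions under which oasis returns fewer strata are not documented here.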