nimble.calculate.performanceFunction

nimble.calculate.performanceFunction(optimal, best=None, predict=None, validate=True, requires1D=True, samePtCount=True, sameFtCount=True, allowEmpty=False, allowMissing=False)

Decorator factory for Nimble performance functions.

A convenient way to make a function compatible with Nimble’s testing API. The decorated function must have the signature function(knownValues, predictedValues), because Nimble supplies these two inputs during testing (including validation testing). The optimal, best, and predict arguments become attributes of the function, which allow Nimble to compare and analyze performance. The remaining parameters control validation of the data passed to the decorated function. By default, both knownValues and predictedValues must be non-empty Nimble data objects with one feature, an equal number of points, and no missing values.
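To make the attribute mechanism concrete, the following is a minimal plain-Python sketch of a decorator factory that attaches optimal, best, and predict to the function it wraps. It is not Nimble's implementation: sketchPerformanceFunction and meanAbsoluteError are hypothetical names, plain lists stand in for Nimble data objects, and input validation is omitted.

```python
import functools

def sketchPerformanceFunction(optimal, best=None, predict=None):
    """Hypothetical stand-in for the decorator factory; not Nimble's code."""
    if optimal not in ('max', 'min'):
        raise ValueError("optimal must be 'max' or 'min'")

    def decorator(func):
        @functools.wraps(func)
        def wrapped(knownValues, predictedValues):
            # A real implementation would validate the inputs here.
            return func(knownValues, predictedValues)
        # Expose the metadata so that a caller can compare scores.
        wrapped.optimal = optimal
        wrapped.best = best
        wrapped.predict = predict
        return wrapped
    return decorator

@sketchPerformanceFunction('min', best=0)
def meanAbsoluteError(knownValues, predictedValues):
    pairs = list(zip(knownValues, predictedValues))
    return sum(abs(k - p) for k, p in pairs) / len(pairs)

score = meanAbsoluteError([1, 2, 3], [1, 2, 5])  # (0 + 0 + 2) / 3
```

After decoration, meanAbsoluteError.optimal is 'min' and meanAbsoluteError.best is 0, so code comparing two scores can tell that the lower one is better and that 0 cannot be improved upon.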

The predict parameter informs Nimble how to generate the predictedValues for the decorated function. The values None, ‘bestScore’, and ‘allScores’ use TrainedLearner.apply, with predict passed as the scoreMode argument. More complex cases are handled by passing a custom function as predict, which must have the form: predict(trainedLearner, knownValues, arguments). With access to the TrainedLearner instance, the knownValues, and the arguments provided to the testing function, it should be possible to generate the desired predicted values.
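The two branches described above can be sketched in plain Python. This only illustrates the documented behavior and does not show how Nimble implements it: StubLearner, generatePredictedValues, and customPredict are hypothetical names, and the stub's apply method returns canned values rather than real predictions.

```python
class StubLearner:
    """Hypothetical stand-in for a TrainedLearner; returns canned values."""
    def apply(self, testX, scoreMode=None):
        if scoreMode is None:
            return ['label' for _ in testX]
        return [(scoreMode, row) for row in testX]

def generatePredictedValues(learner, testX, knownValues, arguments, predict):
    # None, 'bestScore' and 'allScores' are forwarded as scoreMode.
    if predict is None or isinstance(predict, str):
        return learner.apply(testX, scoreMode=predict)
    # Otherwise predict is a custom function with the documented signature.
    return predict(learner, knownValues, arguments)

def customPredict(trainedLearner, knownValues, arguments):
    # Toy behavior: repeat one argument value for every known value.
    return [arguments['fill']] * len(knownValues)

learner = StubLearner()
labels = generatePredictedValues(learner, [[0], [1]], [0, 1], {}, None)
scores = generatePredictedValues(learner, [[0], [1]], [0, 1], {}, 'allScores')
custom = generatePredictedValues(learner, [[0], [1]], [0, 1], {'fill': 7},
                                 customPredict)
```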

Note

Common performance functions are available in nimble.calculate.

Parameters:
  • optimal (str) – Either ‘max’ or ‘min’ indicating whether higher or lower values are better.

  • best (int, float, None) – The best possible value for the performance function. None assumes that the values have no bound in the optimal direction.

  • predict (str, function, None) – Informs Nimble how to produce the predictedValues. May be None, ‘bestScore’, or ‘allScores’ to utilize TrainedLearner.apply(), or a custom function in the form: predict(trainedLearner, knownValues, arguments).

  • validate (bool) – Whether to perform validation on the function inputs. If False, none of the parameters below will apply.

  • requires1D (bool) – Checks that the predictedValues object is one-dimensional.

  • samePtCount (bool) – Checks that the knownValues and predictedValues have the same number of points.

  • sameFtCount (bool) – Checks that the knownValues and predictedValues have the same number of features.

  • allowEmpty (bool) – Allow the knownValues and predictedValues objects to be empty.

  • allowMissing (bool) – Allow the knownValues and predictedValues to contain missing values.
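As a rough illustration of how the validation flags interact, the sketch below applies comparable checks to plain lists of rows. It is an assumption-laden stand-in, not Nimble's validation code: validateInputs and isRejected are hypothetical names, "one-dimensional" is interpreted as exactly one feature, and missing values are represented by None.

```python
def validateInputs(knownValues, predictedValues, requires1D=True,
                   samePtCount=True, sameFtCount=True, allowEmpty=False,
                   allowMissing=False):
    """Hypothetical checks on lists of rows; not Nimble's implementation."""
    for name, data in (('knownValues', knownValues),
                       ('predictedValues', predictedValues)):
        if not allowEmpty and (len(data) == 0 or len(data[0]) == 0):
            raise ValueError(name + ' is empty')
        if not allowMissing and any(v is None for row in data for v in row):
            raise ValueError(name + ' contains missing values')
    if requires1D and predictedValues and len(predictedValues[0]) != 1:
        raise ValueError('predictedValues must have exactly one feature')
    if samePtCount and len(knownValues) != len(predictedValues):
        raise ValueError('point counts differ')
    if sameFtCount and knownValues and predictedValues:
        if len(knownValues[0]) != len(predictedValues[0]):
            raise ValueError('feature counts differ')

def isRejected(*args, **kwargs):
    # Helper that reports whether the checks raise for the given inputs.
    try:
        validateInputs(*args, **kwargs)
    except ValueError:
        return True
    return False

known = [[0], [1]]          # one feature of known labels
votes = [[3, 0], [1, 2]]    # two features of per-label scores
```

With the defaults, votes is rejected because it has two features. A function that scores such a matrix therefore needs both requires1D=False and sameFtCount=False, which is the combination used in the examples on this page.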

See also

nimble.calculate

Examples

Here, correctVoteRatio finds the number of times the correct label received a vote and divides by the total number of votes. The best score is 1, indicating that 100% of the votes were for the correct labels. As inputs it expects the knownValues to be a feature vector of known labels and the predictedValues to be a matrix of vote counts for each label.

>>> import nimble
>>> from nimble.calculate import performanceFunction
>>> @performanceFunction('max', 1, 'allScores', requires1D=False,
...                      sameFtCount=False)
... def correctVoteRatio(knownValues, predictedValues):
...     cumulative = 0
...     totalVotes = 0
...     for true, votes in zip(knownValues, predictedValues.points):
...         cumulative += votes[true]
...         totalVotes += sum(votes)
...     return cumulative / totalVotes
...
>>> trainX = nimble.data([[0, 0], [2, 2], [-2, -2]] * 10)
>>> trainY = nimble.data([0, 1, 2] * 10).T
>>> testX = nimble.data([[0, 0], [1, 1], [2, 2], [1, 1], [-1, -2]])
>>> testY = nimble.data([0, 0, 1, 1, 2]).T
>>> knn = nimble.train('nimble.KNNClassifier', trainX, trainY, k=3)
>>> # Can visualize our predictedValues for this case using apply()
>>> knn.apply(testX, scoreMode='allScores') # 12/15 votes correct
<Matrix 5pt x 3ft
     0  1  2
   ┌────────
 0 │ 3  0  0
 1 │ 2  1  0
 2 │ 0  3  0
 3 │ 2  1  0
 4 │ 0  0  3
>
>>> knn.test(correctVoteRatio, testX, testY)
0.8
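
The 12/15 figure can be checked by hand without Nimble. The sketch below repeats the arithmetic of correctVoteRatio on plain lists copied from the matrix printed above; the variable names are illustrative only.

```python
knownLabels = [0, 0, 1, 1, 2]
voteCounts = [[3, 0, 0], [2, 1, 0], [0, 3, 0], [2, 1, 0], [0, 0, 3]]

# Votes received by the correct label of each point: 3 + 2 + 3 + 1 + 3
correctVotes = sum(votes[label]
                   for label, votes in zip(knownLabels, voteCounts))
# Every point receives k=3 votes, so 5 points give 15 votes in total.
totalVotes = sum(sum(votes) for votes in voteCounts)
ratio = correctVotes / totalVotes
```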

Here, averageDistanceToCenter calculates the average distance from each point to its cluster center. This function expects the predictedValues to be the cluster center for the predicted label of each point in the knownValues. The string options for predict do not cover this case, so a custom function, labelsWithCenters, is passed as predict.

>>> def labelsWithCenters(trainedLearner, knownValues, arguments):
...     labels = trainedLearner.apply(knownValues)
...     centers = trainedLearner.getAttributes()['cluster_centers_']
...     return nimble.data([centers[l] for l in labels])
...
>>> @performanceFunction('min', 0, predict=labelsWithCenters,
...                      requires1D=False, sameFtCount=False)
... def averageDistanceToCenter(knownValues, predictedValues):
...     rootSqDiffs = ((knownValues - predictedValues) ** 2) ** 0.5
...     distances = rootSqDiffs.points.calculate(lambda pt: sum(pt))
...     return sum(distances) / len(distances.points)
...
>>> X = nimble.data([[0, 0], [4, 0], [0, 4], [4, 4]] * 25)
>>> X += nimble.random.data(100, 2, 0, randomSeed=1) # add noise
>>> round(nimble.trainAndTest('skl.KMeans', averageDistanceToCenter, X,
...                     n_clusters=4, randomSeed=1), 6)
1.349399
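
For readers who want to see which distance is being averaged, here is a plain-Python sketch of the same arithmetic, assuming the ** operator in the example acts elementwise on Nimble objects. Under that assumption, ((a - b) ** 2) ** 0.5 equals abs(a - b) for each coordinate, so the per-point value is a sum of absolute coordinate differences. The points and centers below are made-up toy values, not the data from the example.

```python
def averageDistance(points, centers):
    distances = []
    for pt, center in zip(points, centers):
        # ((a - b) ** 2) ** 0.5 equals abs(a - b) for each coordinate.
        distances.append(sum(((a - b) ** 2) ** 0.5
                             for a, b in zip(pt, center)))
    return sum(distances) / len(distances)

# Toy values for illustration only.
points = [[0.0, 0.0], [4.0, 1.0], [1.0, 3.0]]
centers = [[1.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
result = averageDistance(points, centers)  # distances are 1, 1 and 2
```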