Features.normalize
- Features.normalize(function, applyResultTo=None, features=None, *, useLog=None)
Modify all features in this object using the given function.
Normalize the data by applying the provided function to each feature. If the function allows, the normalization can also be applied to a second object. Examples of normalizations provided in Nimble are meanNormalize, percentileNormalize, and the others in the examples below. See nimble.calculate.normalize for all the default provided types of normalization.
Parameters:
function – The function applying the normalization. Functions must accept a feature view and output the normalized feature data. When applyResultTo is not None, the function must accept a second feature view and return a two-tuple (normalized feature from the calling object, normalized feature from applyResultTo). Common normalizations can be found in nimble.calculate.normalize.
applyResultTo (nimble Base object, None) – The secondary object to apply the normalization to. Must have the same number of features as the calling object.
features (identifier, list of identifiers, None) – Select specific features to apply the normalization to. If features is None, the normalization will be applied to all features.
useLog (bool, None) – Local control for whether to send object creation to the logger. If None (default), use the value as specified in the “logger” “enabledByDefault” configuration option. If True, send to the logger regardless of the global option. If False, do NOT send to the logger, regardless of the global option.
Examples
Calling object only.
>>> import nimble
>>> from nimble.calculate import range0to1Normalize
>>> lstTrain = [[5, 9.8, 92],
...             [3, 6.2, 58],
...             [2, 3.0, 29]]
>>> pts = ['movie1', 'movie2', 'movie3']
>>> fts = ['review1', 'review2', 'review3']
>>> train = nimble.data(lstTrain, pts, fts)
>>> train.features.normalize(range0to1Normalize)
>>> train
<Matrix 3pt x 3ft
          review1  review2  review3
        ┌──────────────────────────
 movie1 │  1.000    1.000    1.000
 movie2 │  0.333    0.471    0.460
 movie3 │  0.000    0.000    0.000
>
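The values in the output above are consistent with ordinary min-max scaling applied to each feature separately. The following is a minimal sketch in plain Python, not Nimble's implementation: the helper name `range0to1` and the formula `(x - min) / (max - min)` are assumptions inferred from the printed result, and plain lists stand in for feature views.

```python
def range0to1(values):
    """Min-max scale a list of numbers onto [0, 1] (assumed formula)."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A constant feature has no range; this sketch maps it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / span for v in values]

# Columns (features) of lstTrain from the example above.
review1 = [5, 3, 2]
review2 = [9.8, 6.2, 3.0]
review3 = [92, 58, 29]

scaled = [range0to1(col) for col in (review1, review2, review3)]
rounded = [[round(v, 3) for v in col] for col in scaled]
print(rounded)
# [[1.0, 0.333, 0.0], [1.0, 0.471, 0.0], [1.0, 0.46, 0.0]]
```

Each inner list matches one column of the normalized matrix, which suggests the function is applied feature by feature rather than across the whole object.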
With applyResultTo.
>>> import nimble
>>> from nimble.calculate import meanStandardDeviationNormalize
>>> lstTrain = [[5, 9.8, 92],
...             [3, 6.2, 58],
...             [2, 3.0, 10]]
>>> lstTest = [[4, 9.1, 43],
...            [3, 5.1, 88]]
>>> fts = ['review1', 'review2', 'review3']
>>> trainPts = ['movie1', 'movie2', 'movie3']
>>> train = nimble.data(lstTrain, trainPts, fts)
>>> testPts = ['movie4', 'movie5']
>>> test = nimble.data(lstTest, testPts, fts)
>>> train.features.normalize(meanStandardDeviationNormalize,
...                          applyResultTo=test)
>>> train
<Matrix 3pt x 3ft
          review1  review2  review3
        ┌──────────────────────────
 movie1 │   1.336    1.248    1.149
 movie2 │  -0.267   -0.048    0.139
 movie3 │  -1.069   -1.200   -1.288
>
>>> test
<Matrix 2pt x 3ft
          review1  review2  review3
        ┌──────────────────────────
 movie4 │   0.535    0.996   -0.307
 movie5 │  -0.267   -0.444    1.031
>
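The test values above can be reproduced by standardizing the second object with statistics computed from the calling object only. The sketch below checks this for the review1 feature in plain Python. It is an illustration under assumptions, not Nimble's code: the helper name `meanStdWithTrainStats` is hypothetical, lists stand in for feature views, and the use of the population standard deviation (dividing by n) is inferred from the printed numbers.

```python
import math

def meanStdWithTrainStats(trainValues, testValues):
    """Standardize both lists using the mean and std of trainValues only."""
    n = len(trainValues)
    mean = sum(trainValues) / n
    # Population standard deviation (divide by n); inferred from the output above.
    std = math.sqrt(sum((v - mean) ** 2 for v in trainValues) / n)
    trainNorm = [(v - mean) / std for v in trainValues]
    # The test feature reuses the train mean and std; it has no statistics of its own.
    testNorm = [(v - mean) / std for v in testValues]
    return trainNorm, testNorm

# The review1 feature from lstTrain and lstTest in the example above.
trainReview1 = [5, 3, 2]
testReview1 = [4, 3]

trainNorm, testNorm = meanStdWithTrainStats(trainReview1, testReview1)
print([round(v, 3) for v in trainNorm])  # [1.336, -0.267, -1.069]
print([round(v, 3) for v in testNorm])   # [0.535, -0.267]
```

Reusing the training statistics keeps both objects on the same scale, which is usually the reason for passing applyResultTo instead of normalizing the second object on its own.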
With a user-defined normalization function.
>>> import nimble
>>> import numpy as np
>>> lstTrain = [[482], [30000], [7900], [35], [600]]
>>> pts = ['user1', 'user2', 'user3', 'user4', 'user5']
>>> fts = ['miles']
>>> train = nimble.data(lstTrain, pts, fts)
>>> train
<Matrix 5pt x 1ft
         miles
       ┌──────
 user1 │   482
 user2 │ 30000
 user3 │  7900
 user4 │    35
 user5 │   600
>
>>> def logNormalize(ft):
...     return np.log(ft)
>>> train.features.normalize(logNormalize)
>>> train
<Matrix 5pt x 1ft
         miles
       ┌───────
 user1 │  6.178
 user2 │ 10.309
 user3 │  8.975
 user4 │  3.555
 user5 │  6.397
>
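Per the description of the function parameter, a user-defined function that should also work with applyResultTo has to accept a second feature and return a two-tuple. The sketch below shows one possible shape for that contract. It is an assumption-laden illustration: plain lists stand in for feature views, `math.log` replaces `np.log` so the snippet runs without Nimble or NumPy, and the `otherMiles` values are hypothetical data invented for the demonstration.

```python
import math

def logNormalize(ft, ft2=None):
    """Natural-log transform; returns a two-tuple when a second feature is given."""
    ftLog = [math.log(v) for v in ft]
    if ft2 is None:
        # Calling object only: return just the normalized feature.
        return ftLog
    # With a second feature: (from the calling object, from applyResultTo).
    return ftLog, [math.log(v) for v in ft2]

miles = [482, 30000, 7900, 35, 600]   # the 'miles' feature from the example above
otherMiles = [1200, 50]               # hypothetical second object, not from the source

single = logNormalize(miles)
pair = logNormalize(miles, otherMiles)
print([round(v, 3) for v in single])  # [6.178, 10.309, 8.975, 3.555, 6.397]
print(len(pair))                      # 2
```

A log transform needs no statistics from the calling object, so the second element of the tuple is computed independently. A normalization that depends on the calling object's mean or range would instead compute those values from the first argument and reuse them for the second.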
Keywords: standardize, scale, rescale, divide, length