nimble.normalizeData
nimble.normalizeData(learnerName, trainX, trainY=None, testX=None, arguments=None, randomSeed=None, *, useLog=None, **kwarguments)
Modify data according to a produced model.
Calls on the functionality of a package to train on some data and then return the modified trainX and testX (if provided) according to the results of the trained model. If only trainX is provided, the normalized trainX is returned. If testX is also provided, a tuple (normalizedTrain, normalizedTest) is returned. The name of the learner will be added to each normalized object’s name attribute to indicate the normalization that has been applied. Point and feature names are preserved when possible.
Parameters:
learnerName (str) – The learner to be called. This can be a string in the form ‘package.learner’ or the learner class object.
trainX (nimble Base object) – Data to be used for training.
trainY (identifier, nimble Base object) – A name or index of the feature in trainX containing the labels, or another nimble Base object containing the labels that correspond to trainX.
testX (nimble Base object) – Data to be used for testing.
arguments (dict) – Mapping argument names (strings) to their values, to be used during training and application (e.g., {'dimensions': 5, 'k': 5}). To provide an argument that is an object from the same package as the learner, use a nimble.Init object with the object name and its instantiation arguments (e.g., {'optimizer': nimble.Init('SGD', learning_rate=0.01)}). Note: learner arguments can also be passed as kwarguments, so this dictionary will be merged with any keyword arguments.
randomSeed (int) – Set a random seed for the operation. When None, the randomness is controlled by Nimble’s random seed. Ignored if the learner does not depend on randomness.
useLog (bool, None) – Local control for whether to send object creation to the logger. If None (default), use the value as specified in the “logger” “enabledByDefault” configuration option. If True, send to the logger regardless of the global option. If False, do NOT send to the logger, regardless of the global option.
kwarguments – Keyword arguments specifying variables that are passed to the learner. These are combined with the arguments parameter. To provide an argument that is an object from the same package as the learner, use a nimble.Init object with the object name and its instantiation arguments (e.g., optimizer=nimble.Init('SGD', learning_rate=0.01)).
Examples
Normalize a single data set.
>>> lst = [[20, 1.97, 89], [28, 1.87, 75], [24, 1.91, 81]]
>>> trainX = nimble.data(lst, pointNames=['a', 'b', 'c'],
...                      featureNames=['age', 'height', 'weight'],
...                      returnType="Matrix")
>>> normTrainX = nimble.normalizeData('scikitlearn.StandardScaler',
...                                   trainX)
>>> normTrainX
<Matrix 3pt x 3ft
       age    height  weight
   ┌───────────────────────
 a │ -1.225   1.298   1.279
 b │  1.225  -1.136  -1.162
 c │  0.000  -0.162  -0.116
>
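The values in the output above can be checked by hand. The following is a minimal sketch in plain Python, not Nimble's or scikit-learn's implementation: it assumes the scaler standardizes each feature to zero mean and unit population standard deviation, and the helper name zscore_columns is hypothetical.

```python
import math


def zscore_columns(rows):
    """Standardize each column: subtract its mean, divide by its population std."""
    columns = list(zip(*rows))  # transpose so each tuple holds one feature
    means = [sum(col) / len(col) for col in columns]
    # Population standard deviation (divide by n, not n - 1); this is an
    # assumption about the scaler that is consistent with the output above.
    stds = [
        math.sqrt(sum((value - mean) ** 2 for value in col) / len(col))
        for col, mean in zip(columns, means)
    ]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in rows
    ]


lst = [[20, 1.97, 89], [28, 1.87, 75], [24, 1.91, 81]]
normalized = zscore_columns(lst)

# Print each point rounded to three decimals for comparison with the table.
for name, row in zip(['a', 'b', 'c'], normalized):
    print(name, [round(value, 3) for value in row])
```

Rounded to three decimals, each row should agree with the corresponding point in the Matrix shown above.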
Normalize training and testing data.
>>> lst1 = [[0, 1, 3], [-1, 1, 2], [1, 2, 2]]
>>> trainX = nimble.data(lst1)
>>> lst2 = [[-1, 0, 5]]
>>> testX = nimble.data(lst2)
>>> pcaTrain, pcaTest = nimble.normalizeData('scikitlearn.PCA',
...                                          trainX, testX=testX,
...                                          n_components=2)
>>> pcaTrain
<Matrix 3pt x 2ft
       0       1
   ┌───────────────
 0 │ -0.216   0.713
 1 │ -1.005  -0.461
 2 │  1.221  -0.253
>
>>> pcaTest
<Matrix 1pt x 2ft
       0      1
   ┌──────────────
 0 │ -1.739  2.588
>
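The train-then-apply pattern in this example, where the model is learned from trainX only and then applied to both data sets, can be illustrated without any learner package. The sketch below is an assumption-laden stand-in rather than Nimble's code: it swaps PCA for simple mean-centering, and the names fit_column_means and center_sketch are hypothetical. It mirrors the documented return behavior: one object when only training data is given, a tuple when test data is also given.

```python
def fit_column_means(train_rows):
    """'Train' step: learn one statistic per feature from the training data."""
    columns = list(zip(*train_rows))
    return [sum(col) / len(col) for col in columns]


def center_sketch(train_rows, test_rows=None):
    """Center data using means learned from the training rows only.

    Returns the centered training rows, or a (train, test) tuple when
    test rows are supplied, mirroring the return shape described above.
    """
    means = fit_column_means(train_rows)

    def apply(rows):
        # The same training means are reused for every data set.
        return [[value - mean for value, mean in zip(row, means)] for row in rows]

    centered_train = apply(train_rows)
    if test_rows is None:
        return centered_train
    return centered_train, apply(test_rows)


lst1 = [[0, 1, 3], [-1, 1, 2], [1, 2, 2]]
lst2 = [[-1, 0, 5]]

only_train = center_sketch(lst1)                    # single result
centered_train, centered_test = center_sketch(lst1, lst2)  # tuple result
print(centered_test)
```

The test row is shifted by the training means rather than its own, which is the reason the training data must be supplied even when only the transformed test data is of interest.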
Keywords: modify, apply, standardize, scale, rescale, encode, center, mean, standard deviation, z-scores, z scores