Points.splitByCollapsingFeatures

Points.splitByCollapsingFeatures(featuresToCollapse, featureForNames, featureForValues, *, useLog=None)

Separate feature/value pairs into unique points.

Split each point in this object into k points, one for each featureName/value pair in featuresToCollapse. In all k points, the uncollapsed features are copied from the original point. The collapsed features are replaced by two features, which hold a different featureName/value pair in each of the k points. An object containing n points and m features, k of which are collapsed, will contain (n * k) points and (m - k + 2) features after this operation.
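
The reshaping described above can be sketched in plain Python. This is only an illustration of the semantics, not nimble's implementation: the helper name split_by_collapsing is hypothetical, it accepts feature names only (no indices), and it assumes the two new features are placed after the uncollapsed ones. For the temperature data used in the Examples section, 3 points with 3 collapsed features yield 3 * 3 = 9 points and 4 - 3 + 2 = 3 features.

```python
def split_by_collapsing(rows, names, to_collapse, name_ft, value_ft):
    """Hypothetical pure-Python sketch of the collapse; not nimble code."""
    # Positions of the features that are copied unchanged into every new point.
    kept = [i for i, name in enumerate(names) if name not in to_collapse]
    # Positions of the features being collapsed, in the order requested.
    collapsed = [names.index(name) for name in to_collapse]

    # Assumption: the two new features follow the uncollapsed ones.
    new_names = [names[i] for i in kept] + [name_ft, value_ft]

    new_rows = []
    for row in rows:
        base = [row[i] for i in kept]
        # One new point per collapsed feature: (feature name, its value).
        for i in collapsed:
            new_rows.append(base + [names[i], row[i]])
    return new_names, new_rows


names = ['city', 'jan', 'feb', 'mar']
rows = [['NYC', 4, 5, 10],
        ['LA', 20, 21, 21],
        ['CHI', 0, 2, 7]]
to_collapse = ['jan', 'feb', 'mar']

new_names, new_rows = split_by_collapsing(rows, names, to_collapse,
                                          'month', 'temp')

n, m, k = len(rows), len(names), len(to_collapse)
print(len(new_rows) == n * k)         # point count check
print(len(new_names) == m - k + 2)    # feature count check
```

Each original point contributes exactly k new points, which is why the point count scales with the number of collapsed features rather than with the total number of features.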

Parameters:
  • featuresToCollapse (list) – Names and/or indices of the features that will be collapsed. The first of the two resulting features will contain the names of these features. The second resulting feature will contain the values of these features.

  • featureForNames (str) – Name of the new feature that will contain the collapsed feature names.

  • featureForValues (str) – Name of the new feature that will contain the values from the collapsed features.

  • useLog (bool, None) – Local control for whether to send this operation to the logger. If None (default), use the value specified in the “logger” “enabledByDefault” configuration option. If True, send to the logger regardless of the global option. If False, do NOT send to the logger, regardless of the global option.

Notes

A visual representation of the Example:

temp.points.splitByCollapsingFeatures(['jan', 'feb', 'mar'],
                                      'month', 'temp')

      temp (before)                     temp (after)
+------------------------+       +---------------------+
| city | jan | feb | mar |       | city | month | temp |
+------+-----+-----+-----+       +------+-------+------+
| NYC  | 4   | 5   | 10  |       | NYC  | jan   | 4    |
+------+-----+-----+-----+  -->  +------+-------+------+
| LA   | 20  | 21  | 21  |       | NYC  | feb   | 5    |
+------+-----+-----+-----+       +------+-------+------+
| CHI  | 0   | 2   | 7   |       | NYC  | mar   | 10   |
+------+-----+-----+-----+       +------+-------+------+
                                 | LA   | jan   | 20   |
                                 +------+-------+------+
                                 | LA   | feb   | 21   |
                                 +------+-------+------+
                                 | LA   | mar   | 21   |
                                 +------+-------+------+
                                 | CHI  | jan   | 0    |
                                 +------+-------+------+
                                 | CHI  | feb   | 2    |
                                 +------+-------+------+
                                 | CHI  | mar   | 7    |
                                 +------+-------+------+

This function was inspired by the pivot_longer function from the tidyr library created by Hadley Wickham [1] in the R programming language.

References

Examples

>>> lst = [['NYC', 4, 5, 10],
...        ['LA', 20, 21, 21],
...        ['CHI', 0, 2, 7]]
>>> fts = ['city', 'jan', 'feb', 'mar']
>>> temp = nimble.data(lst, featureNames=fts)
>>> temp.points.splitByCollapsingFeatures(['jan', 'feb', 'mar'],
...                                        'month', 'temp')
>>> temp
<DataFrame 9pt x 3ft
     city  month  temp
   ┌──────────────────
 0 │ NYC    jan     4
 1 │ NYC    feb     5
 2 │ NYC    mar    10
 3 │  LA    jan    20
 4 │  LA    feb    21
 5 │  LA    mar    21
 6 │ CHI    jan     0
 7 │ CHI    feb     2
 8 │ CHI    mar     7
>

Keywords: gather, melt, unpivot, fold, pivot_longer, tidy, tidyr