Points.mapReduce

Points.mapReduce(mapper, reducer, *, useLog=None)

Transforms each point in this object using a specified mapper function and then aggregates the data using the specified reducer function.

Returns a new object containing the aggregated results of the given mapper and reducer functions along the Points axis.

Parameters:
  • mapper (function) – Input a point and output an iterable of two-tuples, each pairing a mapping identifier with a value computed from that point's features. A single point may produce more than one tuple.

  • reducer (function) – Input an identifier and the list of all values the mapper produced for that identifier, and output a two-tuple containing the identifier and the reduced value.

  • useLog (bool, None) – Local control for whether to send object creation to the logger. If None (default), use the value as specified in the “logger” “enabledByDefault” configuration option. If True, send to the logger regardless of the global option. If False, do NOT send to the logger, regardless of the global option.

Examples

Use mapReduce to find the average distance traveled per state, where each point's distance is HOURS multiplied by MPH.

>>> def distanceMapper(pt):
...     location = pt[0]
...     distance = pt[1] * pt[2]
...     return [(location, distance)]
>>> def distanceReducer(location, distance):
...     return (location, sum(distance)/len(distance))
>>> travelData = [['Iowa', 0.5, 19],
...               ['Maryland', 1.5, 48],
...               ['Maryland', 2, 40],
...               ['Texas', 3.2, 50],
...               ['Texas', 3, 45]]
>>> fts = ['STATE', 'HOURS', 'MPH']
>>> X = nimble.data(travelData, featureNames=fts)
>>> X
<DataFrame 5pt x 3ft
      STATE    HOURS  MPH
   ┌─────────────────────
 0 │     Iowa  0.500   19
 1 │ Maryland  1.500   48
 2 │ Maryland  2.000   40
 3 │    Texas  3.200   50
 4 │    Texas  3.000   45
>
>>> X.points.mapReduce(distanceMapper, distanceReducer)
<DataFrame 3pt x 2ft
        0         1
   ┌──────────────────
 0 │     Iowa    9.500
 1 │ Maryland   76.000
 2 │    Texas  147.500
>
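
The grouping step between the mapper and the reducer is implicit in the example above. The following is a minimal pure-Python sketch of that flow, written only to illustrate how the two functions are called; map_reduce_sketch is a hypothetical helper, not part of nimble, and it works on a plain list of lists rather than a nimble object. It assumes the reducer receives each identifier together with a list of that identifier's mapped values, as distanceReducer does.

```python
from collections import defaultdict

def map_reduce_sketch(points, mapper, reducer):
    """Hypothetical illustration of the map/group/reduce flow (not nimble's code)."""
    grouped = defaultdict(list)
    # Map: each point may emit any number of (identifier, value) pairs.
    for pt in points:
        for identifier, value in mapper(pt):
            grouped[identifier].append(value)
    # Reduce: call the reducer once per identifier with all of its values.
    return [reducer(identifier, values) for identifier, values in grouped.items()]

def distanceMapper(pt):
    location = pt[0]
    distance = pt[1] * pt[2]  # HOURS * MPH
    return [(location, distance)]

def distanceReducer(location, distance):
    # distance is the list of every value mapped to this location
    return (location, sum(distance) / len(distance))

travelData = [['Iowa', 0.5, 19],
              ['Maryland', 1.5, 48],
              ['Maryland', 2, 40],
              ['Texas', 3.2, 50],
              ['Texas', 3, 45]]

result = map_reduce_sketch(travelData, distanceMapper, distanceReducer)
for location, average in result:
    print(location, average)
```

Each tuple in result corresponds to one point of the object returned by mapReduce: the identifier becomes the first feature and the reduced value the second.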

Keywords: map, reduce, apply