Item-at-a-time Reduce Functions in Python
by Gianni Tedesco
If you need to write a Reduce function in python, there’s a number of ways of doing it. I’m going to assume you already know what that is and leap right in. Suffice to say, you can think of that as something like an aggregate function in an SQL query.
Perhaps the most obvious would be to use functools.reduce.
from functools import reduce from operator import add reduce(add, range(20))
But in this case you would need the entire iterable to be ready at the time of
reduce. What if you want to call the
reduce one step at a time?
This would be particularly important, for example, if you were receiving the
items over a network connection or reading them from a huge file.
Another way might be to use a generator, but that’s not without it’s problems either:
from operator import add def genreduce(init_val, func): val = init_val yield init_val while True: nxt = yield val val = func(val, nxt) def online_reduce(func, init_val, gen): state = genreduce(init_val, func) init_val = next(state) for item in gen: init_val = state.send(item) return init_val
So now things have got pretty ponderous. And the two things we want to specify here are quite cumbersome to carry around - the function defining the reduction and it’s initial value.
With Function Attributes
What I’m really after is a concise way to specify the callees - the reduce functions themselves. And an intuitive pythonic protocol for a caller to use them.
Luckily, in Python, we can just add attributes to functions.
def foldsum(prev_val, val): return prev_val + val foldsum.init_val = 0 def online_reduce(func, gen): val = func.init_val for item in gen: val = func(val, item) return val
This is nice, it lets us combine the function and the initial-value without
resorting to defining some new class, or just using a tuple like
which is essentially untyped and can get confusing.
It does make type checking blow up though.
With a Decorator
Lets try again but make this a function decorator. We want to be able to
decorate our own functions as well as things like
operator.add, so we’ll need
an extra level of indirection so we don’t start adding attributes on to members
When it comes to typing, we want the caller to be able to access
func.init_val without causing a type error. We can define a custom protocol
for that using the
But also, our actual decorator needs to not blow up
mypy and that gets a
little bit tricky. I couldn’t figure out how to avoid using a
With all that said, here’s all the tedious boilerplate for the decorator:
from typing import Callable,Any,cast from typing_extensions import Protocol from functools import wraps class ReduceFunc(Protocol): init_val : Any def __call__(prev_val, val): raise NotImplementedError def reducer(parm : Any): def decorator(func) -> ReduceFunc: @wraps(func) def wrapper(a, b): return func(a, b) rf = cast(ReduceFunc, wrapper) rf.init_val = parm return rf return decorator __all__ = ( 'reducer' )
And here’s two ways of defining a
reduce function. First for a user-defined
function, secondly when wrapping a function from somewhere else like the
@reducer(0) def foldsum(prev_val, val): return prev_val + val
from operator import add foldsum = reducer(0)(add)