Deriving Average Without Totals
Even an old dog like me learns something new every day.
Let’s say you need a program that receives a sequence of numbers and output the current average. How do you do this efficiently? The naive solution is to to keep track of two variables:
total # add up all numbers seen so far
count # a count of how many numbers seen so far
This way, at any step, you can calculate the current average using total/count
.
It turns out there is another way to do this that allows you to produce the average. You will track
current_average # the value of `average` that was last calculated
count # a count of how many numbers seen so far
With each new number that you see, you can recalculate current_average
using the formula
current_average_new = current_average_previous + (( new_number - current_average_previous ) / count)
In terms of space complexity, the storage is the same: two variables. However, if you need to show the average repeatedly, the second approach only performs calculation once. Nice!
A good step-by-step explanation is stolen from Henry @ Math StackExchange
And inspired by a snippet from Anson Wong’s multi-armed bandit code.