Question
Help with a Python problem. I only need to complete
running_medians(iterable):
# YOUR CODE HERE
--------------------------------------------------------------------------
2. Computing the Running Median
The median of a series of numbers is simply the middle term if ordered by magnitude, or, if there is no middle term, the average of the two middle terms. E.g., the median of the series [3, 1, 9, 25, 12] is 9, and the median of the series [8, 4, 11, 18] is 9.5.
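As a quick check of the definition, the two example series can be run through the standard library's `statistics.median`. The variable names below are illustrative and not part of the assignment.

```python
import statistics

# The two example series from the problem statement.
odd_series = [3, 1, 9, 25, 12]   # sorted: [1, 3, 9, 12, 25], middle term is 9
even_series = [8, 4, 11, 18]     # sorted: [4, 8, 11, 18], middle terms are 8 and 11

print(statistics.median(odd_series))   # 9
print(statistics.median(even_series))  # 9.5
```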
If we are in the process of accumulating numerical data, it is useful to be able to compute the running median where, as each new data point is encountered, an updated median is computed. This should be done, of course, as efficiently as possible.
The following function demonstrates a naive way of computing the running medians based on the series passed in as an iterable.
In [10]:
def running_medians_naive(iterable):
    values = []
    medians = []
    for i, x in enumerate(iterable):
        values.append(x)
        values.sort()
        if i%2 == 0:
            medians.append(values[i//2])
        else:
            medians.append((values[i//2] + values[i//2+1]) / 2)
    return medians
In [*]:
running_medians_naive([3, 1, 9, 25, 12])
Out[11]:
[3, 2.0, 3, 6.0, 9]
In [*]:
running_medians_naive([8, 4, 11, 18])
Note that the function keeps track of all the values encountered during the iteration and uses them to compute the running medians, which are returned at the end as a list. The final running median, naturally, is simply the median of the entire series.
Unfortunately, because the function sorts the list of values during every iteration it is incredibly inefficient. Your job is to implement a version that computes each running median in O(log N) time using, of course, the heap data structure!
Hints
You will need to use two heaps for your solution: one min-heap, and one max-heap.
The min-heap should be used to keep track of all values greater than the most recent running median, and the max-heap for all values less than the most recent running median. This way, the median will lie between the minimum value on the min-heap and the maximum value on the max-heap (both of which can be efficiently extracted).
In addition, the difference between the number of values stored in the min-heap and max-heap must never exceed 1 (to ensure the median is being computed). This can be taken care of by intelligently popping/adding elements between the two heaps.
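To make the hints concrete: Python's standard `heapq` module only provides a min-heap, so a common workaround is to simulate the max-heap by storing negated values. The short sketch below assumes numeric data; the names `min_heap` and `max_heap` and the sample values are illustrative only.

```python
import heapq

min_heap = []  # values above the current median (smallest on top)
max_heap = []  # values below the current median, stored negated (largest on top)

for v in [12, 25]:
    heapq.heappush(min_heap, v)
for v in [1, 3]:
    heapq.heappush(max_heap, -v)   # negate on the way in

smallest_of_upper = min_heap[0]    # peek without popping: 12
largest_of_lower = -max_heap[0]    # negate on the way out: 3

# Moving one element across (here from the lower half to the upper half)
# is a pop followed by a push, each O(log N).
heapq.heappush(min_heap, -heapq.heappop(max_heap))
```

After the move, `min_heap` holds 3, 12, 25 and `max_heap` holds only 1, so a rebalancing step like this is what keeps the two sizes within 1 of each other.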
In [8]:
def running_medians(iterable):
    # YOUR CODE HERE
    values = []
    medians = []
    for i, x in enumerate(iterable):
        values.append(x)
        values.sort()
        if i%2 == 0:
            medians.append(values[i//2])
        else:
            medians.append((values[i//2] + values[i//2+1]) / 2)
    return medians
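The cell above still sorts the whole list on every iteration, so it behaves exactly like `running_medians_naive`. One way to follow the hints instead is sketched here. It is a minimal sketch, not necessarily the intended solution: it assumes numeric inputs, simulates the max-heap by negating values with `heapq`, and uses the illustrative names `lo` and `hi`.

```python
import heapq

def running_medians(iterable):
    lo = []  # max-heap of the smaller half, stored as negated values
    hi = []  # min-heap of the larger half
    medians = []
    for x in iterable:
        # Route x through lo, then move lo's largest value to hi.
        # This keeps every value in lo <= every value in hi.
        heapq.heappush(lo, -x)
        heapq.heappush(hi, -heapq.heappop(lo))
        # Rebalance so that lo has the same size as hi, or one more.
        if len(hi) > len(lo):
            heapq.heappush(lo, -heapq.heappop(hi))
        if len(lo) > len(hi):
            # Odd count: the median is the largest value of the lower half.
            medians.append(-lo[0])
        else:
            # Even count: average the two middle values.
            medians.append((-lo[0] + hi[0]) / 2)
    return medians
```

Each new value costs a constant number of heap pushes and pops, so each running median is produced in O(log N) time. On `[3, 1, 9, 25, 12]` this returns `[3, 2.0, 3, 6.0, 9]`, the same as the naive version.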
In [9]:
# (2 points)
from unittest import TestCase
tc = TestCase()
tc.assertEqual([3, 2.0, 3, 6.0, 9], running_medians([3, 1, 9, 25, 12]))
In [ ]:
# (2 points)
import random
from unittest import TestCase
tc = TestCase()
vals = [random.randrange(10000) for _ in range(1000)]
tc.assertEqual(running_medians_naive(vals), running_medians(vals))
In [ ]:
# (4 points) MUST COMPLETE IN UNDER 10 seconds!
import random
from unittest import TestCase
tc = TestCase()
vals = [random.randrange(100000) for _ in range(100001)]
m_mid = sorted(vals[:50001])[50001//2]
m_final = sorted(vals)[len(vals)//2]
running = running_medians(vals)
tc.assertEqual(m_mid, running[50000])
tc.assertEqual(m_final, running[-1])