If you ever need to capture the smallest or largest “n” from a stream of data, the approach more often than not will be to use a simple data-structure called the Priority Queue.
Priority Queues do one thing very well – once a bunch of data is added, it can return the lowest value (or the highest value) in constant time.
How is this useful to answer a top or bottom “n” type question. Let’s see.
Consider this hypothetical stream of data:
And you have to answer the smallest 2 at any point as this stream of data comes in. The trick is to use a Priority Queue
- with a size limit of 2
- which returns the largest of these 2 when asked for (sometimes referred to as a Max Priority Queue)
Two considerations as data streams in:
- if the size of the priority queue is less than 2 then add the value to the priority queue
- if the size of the priority queue is equal to 2, then compare the value from the stream with the largest in the queue
- if less then remove the largest and add the new value
- if more then do nothing
At the bottom is the state of the Priority Queue as each data in the stream is processed:
See how it always holds the bottom 2.
Similarly for the largest 3, a Priority Queue with a max capacity of 3 which returns the smallest (referred to as a Min Priority Queue) can be used the following way:
- if the size of the priority queue is less than 3, then add to the priority queue
- if the size is equal to 2, then check the value from the stream with the smallest in the queue
- if more then remove smallest add the value from stream and ignore otherwise
Here is a simple kotlin based implementation that uses the built in
PriorityQueue in Java standard library.
The implementation is very straightforward and follows the outlined algorithm to the letter.