python-2.5.2/win32/Lib/heapq.py
author Deepak Modgil <Deepak.Modgil@Nokia.com>
Fri, 03 Apr 2009 17:19:34 +0100
changeset 0 ae805ac0140d
permissions -rw-r--r--
DP tools release version Revision: 200912
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     1
# -*- coding: Latin-1 -*-
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     2
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     3
"""Heap queue algorithm (a.k.a. priority queue).
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     4
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     5
Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     6
all k, counting elements from 0.  For the sake of comparison,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     7
non-existing elements are considered to be infinite.  The interesting
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     8
property of a heap is that a[0] is always its smallest element.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
     9
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    10
Usage:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    11
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    12
heap = []            # creates an empty heap
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    13
heappush(heap, item) # pushes a new item on the heap
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    14
item = heappop(heap) # pops the smallest item from the heap
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    15
item = heap[0]       # smallest item on the heap without popping it
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    16
heapify(x)           # transforms list into a heap, in-place, in linear time
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    17
item = heapreplace(heap, item) # pops and returns smallest item, and adds
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    18
                               # new item; the heap size is unchanged
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    19
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    20
Our API differs from textbook heap algorithms as follows:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    21
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    22
- We use 0-based indexing.  This makes the relationship between the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    23
  index for a node and the indexes for its children slightly less
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    24
  obvious, but is more suitable since Python uses 0-based indexing.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    25
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    26
- Our heappop() method returns the smallest item, not the largest.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    27
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    28
These two make it possible to view the heap as a regular Python list
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    29
without surprises: heap[0] is the smallest item, and heap.sort()
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    30
maintains the heap invariant!
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    31
"""
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    32
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    33
# Original code by Kevin O'Connor, augmented by Tim Peters and Raymond Hettinger
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    34
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    35
__about__ = """Heap queues
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    36
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    37
[explanation by François Pinard]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    38
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    39
Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    40
all k, counting elements from 0.  For the sake of comparison,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    41
non-existing elements are considered to be infinite.  The interesting
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    42
property of a heap is that a[0] is always its smallest element.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    43
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    44
The strange invariant above is meant to be an efficient memory
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    45
representation for a tournament.  The numbers below are `k', not a[k]:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    46
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    47
                                   0
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    48
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    49
                  1                                 2
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    50
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    51
          3               4                5               6
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    52
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    53
      7       8       9       10      11      12      13      14
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    54
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    55
    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    56
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    57
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    58
In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'.  In
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    59
an usual binary tournament we see in sports, each cell is the winner
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    60
over the two cells it tops, and we can trace the winner down the tree
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    61
to see all opponents s/he had.  However, in many computer applications
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    62
of such tournaments, we do not need to trace the history of a winner.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    63
To be more memory efficient, when a winner is promoted, we try to
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    64
replace it by something else at a lower level, and the rule becomes
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    65
that a cell and the two cells it tops contain three different items,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    66
but the top cell "wins" over the two topped cells.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    67
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    68
If this heap invariant is protected at all time, index 0 is clearly
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    69
the overall winner.  The simplest algorithmic way to remove it and
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    70
find the "next" winner is to move some loser (let's say cell 30 in the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    71
diagram above) into the 0 position, and then percolate this new 0 down
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    72
the tree, exchanging values, until the invariant is re-established.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    73
This is clearly logarithmic on the total number of items in the tree.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    74
By iterating over all items, you get an O(n ln n) sort.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    75
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    76
A nice feature of this sort is that you can efficiently insert new
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    77
items while the sort is going on, provided that the inserted items are
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    78
not "better" than the last 0'th element you extracted.  This is
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    79
especially useful in simulation contexts, where the tree holds all
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    80
incoming events, and the "win" condition means the smallest scheduled
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    81
time.  When an event schedule other events for execution, they are
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    82
scheduled into the future, so they can easily go into the heap.  So, a
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    83
heap is a good structure for implementing schedulers (this is what I
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    84
used for my MIDI sequencer :-).
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    85
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    86
Various structures for implementing schedulers have been extensively
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    87
studied, and heaps are good for this, as they are reasonably speedy,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    88
the speed is almost constant, and the worst case is not much different
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    89
than the average case.  However, there are other representations which
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    90
are more efficient overall, yet the worst cases might be terrible.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    91
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    92
Heaps are also very useful in big disk sorts.  You most probably all
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    93
know that a big sort implies producing "runs" (which are pre-sorted
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    94
sequences, which size is usually related to the amount of CPU memory),
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    95
followed by a merging passes for these runs, which merging is often
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    96
very cleverly organised[1].  It is very important that the initial
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    97
sort produces the longest runs possible.  Tournaments are a good way
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    98
to that.  If, using all the memory available to hold a tournament, you
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
    99
replace and percolate items that happen to fit the current run, you'll
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   100
produce runs which are twice the size of the memory for random input,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   101
and much better for input fuzzily ordered.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   102
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   103
Moreover, if you output the 0'th item on disk and get an input which
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   104
may not fit in the current tournament (because the value "wins" over
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   105
the last output value), it cannot fit in the heap, so the size of the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   106
heap decreases.  The freed memory could be cleverly reused immediately
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   107
for progressively building a second heap, which grows at exactly the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   108
same rate the first heap is melting.  When the first heap completely
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   109
vanishes, you switch heaps and start a new run.  Clever and quite
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   110
effective!
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   111
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   112
In a word, heaps are useful memory structures to know.  I use them in
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   113
a few applications, and I think it is good to keep a `heap' module
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   114
around. :-)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   115
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   116
--------------------
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   117
[1] The disk balancing algorithms which are current, nowadays, are
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   118
more annoying than clever, and this is a consequence of the seeking
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   119
capabilities of the disks.  On devices which cannot seek, like big
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   120
tape drives, the story was quite different, and one had to be very
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   121
clever to ensure (far in advance) that each tape movement will be the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   122
most effective possible (that is, will best participate at
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   123
"progressing" the merge).  Some tapes were even able to read
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   124
backwards, and this was also used to avoid the rewinding time.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   125
Believe me, real good tape sorts were quite spectacular to watch!
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   126
From all times, sorting has always been a Great Art! :-)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   127
"""
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   128
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   129
__all__ = ['heappush', 'heappop', 'heapify', 'heapreplace', 'nlargest',
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   130
           'nsmallest']
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   131
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   132
from itertools import islice, repeat, count, imap, izip, tee
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   133
from operator import itemgetter, neg
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   134
import bisect
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   135
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   136
def heappush(heap, item):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   137
    """Push item onto heap, maintaining the heap invariant."""
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   138
    heap.append(item)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   139
    _siftdown(heap, 0, len(heap)-1)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   140
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   141
def heappop(heap):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   142
    """Pop the smallest item off the heap, maintaining the heap invariant."""
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   143
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   144
    if heap:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   145
        returnitem = heap[0]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   146
        heap[0] = lastelt
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   147
        _siftup(heap, 0)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   148
    else:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   149
        returnitem = lastelt
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   150
    return returnitem
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   151
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   152
def heapreplace(heap, item):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   153
    """Pop and return the current smallest value, and add the new item.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   154
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   155
    This is more efficient than heappop() followed by heappush(), and can be
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   156
    more appropriate when using a fixed-size heap.  Note that the value
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   157
    returned may be larger than item!  That constrains reasonable uses of
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   158
    this routine unless written as part of a conditional replacement:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   159
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   160
        if item > heap[0]:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   161
            item = heapreplace(heap, item)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   162
    """
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   163
    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   164
    heap[0] = item
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   165
    _siftup(heap, 0)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   166
    return returnitem
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   167
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   168
def heapify(x):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   169
    """Transform list into a heap, in-place, in O(len(heap)) time."""
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   170
    n = len(x)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   171
    # Transform bottom-up.  The largest index there's any point to looking at
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   172
    # is the largest with a child index in-range, so must have 2*i + 1 < n,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   173
    # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   174
    # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   175
    # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   176
    for i in reversed(xrange(n//2)):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   177
        _siftup(x, i)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   178
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   179
def nlargest(n, iterable):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   180
    """Find the n largest elements in a dataset.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   181
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   182
    Equivalent to:  sorted(iterable, reverse=True)[:n]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   183
    """
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   184
    it = iter(iterable)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   185
    result = list(islice(it, n))
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   186
    if not result:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   187
        return result
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   188
    heapify(result)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   189
    _heapreplace = heapreplace
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   190
    sol = result[0]         # sol --> smallest of the nlargest
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   191
    for elem in it:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   192
        if elem <= sol:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   193
            continue
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   194
        _heapreplace(result, elem)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   195
        sol = result[0]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   196
    result.sort(reverse=True)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   197
    return result
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   198
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   199
def nsmallest(n, iterable):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   200
    """Find the n smallest elements in a dataset.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   201
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   202
    Equivalent to:  sorted(iterable)[:n]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   203
    """
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   204
    if hasattr(iterable, '__len__') and n * 10 <= len(iterable):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   205
        # For smaller values of n, the bisect method is faster than a minheap.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   206
        # It is also memory efficient, consuming only n elements of space.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   207
        it = iter(iterable)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   208
        result = sorted(islice(it, 0, n))
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   209
        if not result:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   210
            return result
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   211
        insort = bisect.insort
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   212
        pop = result.pop
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   213
        los = result[-1]    # los --> Largest of the nsmallest
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   214
        for elem in it:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   215
            if los <= elem:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   216
                continue
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   217
            insort(result, elem)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   218
            pop()
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   219
            los = result[-1]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   220
        return result
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   221
    # An alternative approach manifests the whole iterable in memory but
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   222
    # saves comparisons by heapifying all at once.  Also, saves time
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   223
    # over bisect.insort() which has O(n) data movement time for every
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   224
    # insertion.  Finding the n smallest of an m length iterable requires
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   225
    #    O(m) + O(n log m) comparisons.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   226
    h = list(iterable)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   227
    heapify(h)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   228
    return map(heappop, repeat(h, min(n, len(h))))
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   229
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   230
# 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   231
# is the index of a leaf with a possibly out-of-order value.  Restore the
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   232
# heap invariant.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   233
def _siftdown(heap, startpos, pos):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   234
    newitem = heap[pos]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   235
    # Follow the path to the root, moving parents down until finding a place
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   236
    # newitem fits.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   237
    while pos > startpos:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   238
        parentpos = (pos - 1) >> 1
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   239
        parent = heap[parentpos]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   240
        if parent <= newitem:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   241
            break
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   242
        heap[pos] = parent
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   243
        pos = parentpos
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   244
    heap[pos] = newitem
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   245
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   246
# The child indices of heap index pos are already heaps, and we want to make
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   247
# a heap at index pos too.  We do this by bubbling the smaller child of
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   248
# pos up (and so on with that child's children, etc) until hitting a leaf,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   249
# then using _siftdown to move the oddball originally at index pos into place.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   250
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   251
# We *could* break out of the loop as soon as we find a pos where newitem <=
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   252
# both its children, but turns out that's not a good idea, and despite that
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   253
# many books write the algorithm that way.  During a heap pop, the last array
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   254
# element is sifted in, and that tends to be large, so that comparing it
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   255
# against values starting from the root usually doesn't pay (= usually doesn't
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   256
# get us out of the loop early).  See Knuth, Volume 3, where this is
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   257
# explained and quantified in an exercise.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   258
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   259
# Cutting the # of comparisons is important, since these routines have no
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   260
# way to extract "the priority" from an array element, so that intelligence
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   261
# is likely to be hiding in custom __cmp__ methods, or in array elements
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   262
# storing (priority, record) tuples.  Comparisons are thus potentially
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   263
# expensive.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   264
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   265
# On random arrays of length 1000, making this change cut the number of
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   266
# comparisons made by heapify() a little, and those made by exhaustive
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   267
# heappop() a lot, in accord with theory.  Here are typical results from 3
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   268
# runs (3 just to demonstrate how small the variance is):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   269
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   270
# Compares needed by heapify     Compares needed by 1000 heappops
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   271
# --------------------------     --------------------------------
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   272
# 1837 cut to 1663               14996 cut to 8680
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   273
# 1855 cut to 1659               14966 cut to 8678
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   274
# 1847 cut to 1660               15024 cut to 8703
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   275
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   276
# Building the heap by using heappush() 1000 times instead required
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   277
# 2198, 2148, and 2219 compares:  heapify() is more efficient, when
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   278
# you can use it.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   279
#
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   280
# The total compares needed by list.sort() on the same lists were 8627,
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   281
# 8627, and 8632 (this should be compared to the sum of heapify() and
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   282
# heappop() compares):  list.sort() is (unsurprisingly!) more efficient
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   283
# for sorting.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   284
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   285
def _siftup(heap, pos):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   286
    endpos = len(heap)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   287
    startpos = pos
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   288
    newitem = heap[pos]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   289
    # Bubble up the smaller child until hitting a leaf.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   290
    childpos = 2*pos + 1    # leftmost child position
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   291
    while childpos < endpos:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   292
        # Set childpos to index of smaller child.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   293
        rightpos = childpos + 1
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   294
        if rightpos < endpos and heap[rightpos] <= heap[childpos]:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   295
            childpos = rightpos
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   296
        # Move the smaller child up.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   297
        heap[pos] = heap[childpos]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   298
        pos = childpos
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   299
        childpos = 2*pos + 1
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   300
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   301
    # to its final resting place (by sifting its parents down).
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   302
    heap[pos] = newitem
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   303
    _siftdown(heap, startpos, pos)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   304
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   305
# If available, use C implementation
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   306
try:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   307
    from _heapq import heappush, heappop, heapify, heapreplace, nlargest, nsmallest
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   308
except ImportError:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   309
    pass
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   310
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   311
# Extend the implementations of nsmallest and nlargest to use a key= argument
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   312
_nsmallest = nsmallest
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   313
def nsmallest(n, iterable, key=None):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   314
    """Find the n smallest elements in a dataset.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   315
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   316
    Equivalent to:  sorted(iterable, key=key)[:n]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   317
    """
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   318
    in1, in2 = tee(iterable)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   319
    it = izip(imap(key, in1), count(), in2)                 # decorate
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   320
    result = _nsmallest(n, it)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   321
    return map(itemgetter(2), result)                       # undecorate
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   322
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   323
_nlargest = nlargest
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   324
def nlargest(n, iterable, key=None):
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   325
    """Find the n largest elements in a dataset.
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   326
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   327
    Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   328
    """
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   329
    in1, in2 = tee(iterable)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   330
    it = izip(imap(key, in1), imap(neg, count()), in2)      # decorate
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   331
    result = _nlargest(n, it)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   332
    return map(itemgetter(2), result)                       # undecorate
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   333
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   334
if __name__ == "__main__":
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   335
    # Simple sanity test
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   336
    heap = []
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   337
    data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   338
    for item in data:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   339
        heappush(heap, item)
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   340
    sort = []
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   341
    while heap:
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   342
        sort.append(heappop(heap))
ae805ac0140d DP tools release version Revision: 200912
Deepak Modgil <Deepak.Modgil@Nokia.com>
parents:
diff changeset
   343
    print sort