Problem 1-3d: This problem asks you to modify mergesort so as to count number
of inversions in permutations. The answer given in class was glib and incorrect.
Here's a better one:
Imagine writing the original permuted data above its final locations, and connecting
each original element to its destination with a line, either slanting left, if the element
ends up earlier than it was; or vertical,if it ends up at the same place; or slanting right
if it ends up later.
Pictorially, an inversion is represented by two lines crossing. The total number of
crossings is the number of inversions.
Now the same description can be used during mergesort, where elements do their
crossings in lg(n) rounds, rather than all at once. The count will still be correct,
because elements always move TOWARDS their destinations during mergesort.
In each recursive call to mergesort, an element starts out in one half
of a subarray,
called either L or R, and ends up in either L or R.
These observations make it possible to do the count:
- Elements in L never cross one another, ditto for R. (Because L and R are already
sorted.
- Through the lg(n) rounds elements progress only in the direction they're destined
for. Therefore we can sum up the crossings through the rounds.
- A crossing takes place only when an R element becomes an L element, and the
number of L elements it crosses is exactly mid - l_index + 1. So these numbers are
what we must sum. (By symmetry, we could also discuss L elements crossing R
elements.)
So
Merge'(A, lo, mid, hi)
...while l_ptr and r_ptr in bounds
......do if A[l_ptr] <= A[r_ptr]
............B[k] <- A[l_ptr]; l_ptr <- l_ptr + 1; k <- k+1;
.........else B[k <- A[r_ptr]; r_ptr <- r_ptr + 1; k <- k+1;
............inversions <- inversions + mid - l_ptr + 1;
....../* number of L elements crossed */
etc.
In writing Merge, a number of people indexed R (the right subarray) with A[mid + i + j].
But this index increases whether L is copied or R is copied, so it will never finish an
input like [1,3,2,4].
Also, any scheme where "inversions" is incremented by 1 each time is bound to run in
worst-case OMEGA(n^2), so it can't work.