Apply optimizations to repartitioning routines
The repartition-by-swapping routines can benefit from the following improvements:
- repartition in parallel by pairing up the lowest-weight part with the highest-weight part, the second-lowest-weight part with the second-highest-weight part, etc.
- sort the larger part weights (size n) to achieve O(log n) lookup
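
A minimal sketch of the two ideas above, in Python for illustration only (it is not the actual mp_i4 routine). The function names `pair_parts` and `closest_weight_index` are hypothetical, and the sketch assumes that "lookup" means finding the element in the sorted weights of the larger part whose weight is closest to a target value:

```python
from bisect import bisect_left


def pair_parts(part_weights):
    """Pair the i-th lightest part with the i-th heaviest part.

    Returns a list of (light_part, heavy_part) index pairs. With an odd
    number of parts, the median part is left unpaired.
    """
    # Part indices ordered from lightest to heaviest.
    order = sorted(range(len(part_weights)), key=lambda p: part_weights[p])
    half = len(order) // 2
    return [(order[i], order[-1 - i]) for i in range(half)]


def closest_weight_index(sorted_weights, target):
    """Return the index of the entry closest to target in O(log n).

    sorted_weights must be in ascending order (assumption: these are the
    weights held by the larger part of a pair). Ties go to the lower index.
    """
    if not sorted_weights:
        raise ValueError("sorted_weights must not be empty")
    pos = bisect_left(sorted_weights, target)
    if pos == 0:
        return 0
    if pos == len(sorted_weights):
        return len(sorted_weights) - 1
    before, after = sorted_weights[pos - 1], sorted_weights[pos]
    return pos if after - target < target - before else pos - 1
```

Because each part appears in at most one pair, the pairs are disjoint and could in principle be repartitioned independently of each other.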
This issue is mostly for documenting the progress of an ongoing implementation. The above is done for the multi-process variant with 4-byte integer weights (mp_i4). The following are required for completion:
The multi-threaded variants have yet to be written.