[Pharo-project] Revisiting Issue 1718

Levente Uzonyi leves at elte.hu
Fri Nov 19 13:50:36 CET 2010


For 1718:
- don't use #asSortedCollection for sorting.
- SortedCollection uses quicksort, not merge sort as stated in a comment. 
That's why #asSortedCollection is slow when there are only a few different 
values. Quicksort performance degrades to O(n^2) in this case, but that's 
totally valid. However merge sort has guaranteed O(n*log(n)) runtime.

On Fri, 19 Nov 2010, Stephan Eggermont wrote:

> I would be interested in improvements. This is just a straight translation
> of the java code.

It has some errors. If you define this method in Array, replace 'array' 
with 'self' and evaluate this code, it'll raise an error:

a := (1 to: 50) asArray.
[ a dualPivotQuicksort: 1 to: a size split: 3 ] bench

It's also not general. #sort: #sorted: and #asSortedCollection: use 
a single block for comparison, while this method uses #<, #>, #= and #~= 
in a mixed way.

>
> Yaroslavskiy's Dual-Pivot Quicksort seems to perform better for large collections:
>
> A straightforward
> defaultSort: i to: j
> 	self dualPivotQuicksort: i to: j split: 3
>
> dualPivotQuicksort: left to: right split: div
> 	|len third m1 m2 pivot1 pivot2 less great dist newDiv|
> 	len := right - left.
> 	newDiv := div.
> 	len < 27 ifTrue: [
> 		left+1 to: right do: [ :i | | j |
> 			j := i.
> 			[(j>left) and: [(array at:j) < (array at: (j-1))]] whileTrue: [
> 				array swap: j with: j-1.
> 				j := j-1]]]

That looks like an optimized bubble sort, ouch.


Levente

> 	ifFalse: [
> 		third := len // div.
> 		m1 := left+third.
> 		m2 := right-third.
> 		m1 <= left ifTrue: [m1 := left+1].
> 		m2 >= right ifTrue: [m2 := right-1].
> 		(array at: m1) < (array at: m2) ifTrue: [
> 			array swap: m1 with: left.
> 			array swap: m2 with: right]
> 		ifFalse: [
> 			array swap: m1 with: right.
> 			array swap: m2 with: left].
> 		pivot1 := array at: left.
> 		pivot2 := array at: right.
> 		less := left+1.
> 		great := right-1.
> 		less to: great do: [ :k |
> 			(array at: k) < pivot1 ifTrue: [
> 				array swap: k with: less.
> 				less := less+1.]
> 			ifFalse: [
> 				(array at: k ) > pivot2 ifTrue: [
> 					[(k < great and: [(array at: great) > pivot2])] whileTrue: [
> 						great := great -1].
> 					array swap: k with: great.
> 					great := great-1.
> 					(array at: k) < pivot1 ifTrue: [
> 						array swap: k with: less.
> 						less := less+1]
> 				]
> 			]].
> 		dist := great-less.
> 		dist < 13 ifTrue: [ newDiv := div+1].
> 		array swap: (less-1) with: left.
> 		array swap: (great+1) with: right.
> 		self dualPivotQuicksort: left to: (less-2) split: newDiv.
> 		self dualPivotQuicksort: (great+2) to: right split: newDiv.
> 		(dist > (len -13) and: [ pivot1 ~= pivot2]) ifTrue:[
> 			less to: great do: [ :k |
> 				(array at: k) = pivot1 ifTrue: [
> 					array swap: k with: less.
> 					less := less+1]
> 				ifFalse: [
> 					(array at: k) = pivot2 ifTrue: [
> 						array swap: k with: great.
> 						great := great - 1.
> 						(array at: k) = pivot1 ifTrue: [
> 							array swap: k with: less.
> 							less := less+1]
> 					]
> 				]
> 			]
> 		].
> 		pivot1 < pivot2 ifTrue: [
> 			self dualPivotQuicksort: less to: great split: newDiv]
> 	]
>
> Do we have a performance test set	for sorting?
>




More information about the Pharo-project mailing list