[Pharo-project] Preallocation behavior

Toon Verwaest toon.verwaest at gmail.com
Thu Apr 28 12:12:58 CEST 2011


>> In the first example, you are making a single string with all A's of size 9000 repeated 500 times.
>> In the second example, you are making 9000 strings with all A's of size 10000  repeated 500 times.
> Why?
Because it's not true :) The problem is rather that you are doing 
nextPutAll: with a string, which is obviously slower than nextPut: with 
a character. NextPutAll: has to loop over the input string to copy the 
values and as an optimization (which works against you in this case) it 
increases the size of the string to be as big as you want. Look at their 
code:

nextPutAll: aCollection

      | newEnd |
      collection class == aCollection class ifFalse:
          [^ super nextPutAll: aCollection ].

      newEnd := position + aCollection size.
      newEnd > writeLimit ifTrue:
          [self growTo: newEnd + 10].

      collection replaceFrom: position+1 to: newEnd  with: aCollection 
startingAt: 1.
      position := newEnd.

     ^ aCollection

vs

nextPut: anObject
     "Primitive. Insert the argument at the next position in the Stream
     represented by the receiver. Fail if the collection of this stream 
is not an
     Array or a String. Fail if the stream is positioned at its end, or 
if the
     position is out of bounds in the collection. Fail if the argument 
is not
     of the right type for the collection. Optional. See Object 
documentation
     whatIsAPrimitive."

<primitive: 66>
     position >= writeLimit
         ifTrue: [^ self pastEndPut: anObject]
         ifFalse:
             [position := position + 1.
             ^collection at: position put: anObject]

The primitive will only fail if you are trying to write past the end. 
This check is very fast since it's primitive... and we'll only go in 
Smalltalk code when it fails. The first version on the other hand is 
always first some smalltalk code (quite a lot for your basic example) 
before it goes into a slightly more heavy primitive 
(replaceFrom:to:with:startingAt:).

Of course it's both the same big-O (unlike the first example), but the 
constant factor is pretty different because of smalltalk vs primitive 
behavior.
>>>> An optimization is to use a Stream. Here is my source code:
>>>> ===
>>>> MessageTally spyOn:
>>>> [ 500 timesRepeat: [
>>>> | str |
>>>> *str := WriteStream on: (String new)*.
>>>> 9000 timesRepeat: [ str nextPut: $A ]]].
>>>> ===
>>>>
>>>> The result appears after *812 ms*, which is a large improvement.
>>>> Now, we could optimize again using the preallocation. Here is my source
>>>> code:
>>>>
>>>> ====
>>>> MessageTally spyOn:
>>>> [ 500 timesRepeat: [
>>>> | str |
>>>> *str := WriteStream on: (String new: 10000)*.
>>>> 9000 timesRepeat: [ str nextPutAll: 'A' ]]].
>>>> ====
>>>>
>>>> And the result is strange: it makes 2 times slower to display the result.
>>>> The result appears after 1656 ms.
>> In the first example, you are making a single string with all A's of size 9000 repeated 500 times.
>> In the second example, you are making 9000 strings with all A's of size 10000  repeated 500 times.
>




More information about the Pharo-project mailing list