[Pharo-project] Small improvement for WriteStream >> nextPutAll:

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Jan 9 22:07:57 CET 2012


2012/1/9 Henrik Sperre Johansen <henrik.s.johansen at veloxit.no>:
> On 09.01.2012 00:18, Eliot Miranda wrote:
>>
>>
>>
>> On Sun, Jan 8, 2012 at 5:46 AM, Mariano Martinez Peck
>> <marianopeck at gmail.com <mailto:marianopeck at gmail.com>> wrote:
>>
>>    Hi guys. Some time ago Henry spot to us a small improvement for
>>    WriteStream >> nextPutAll: that we are using in Fuel, but I think
>>    it can be general. Henry comment was exactly:
>>
>>    /Also: For variableBytes classes, if you rewrite:
>>
>>    WriteStream
>>    nextPutAll: aCollection
>>
>>         | newEnd |
>>    *collection class instSpec == aCollection class instSpec ifFalse:*
>>
>>             [^ super nextPutAll: aCollection ].
>>
>>         newEnd := position + aCollection size.
>>         newEnd > writeLimit ifTrue:
>>             [self growTo: newEnd + 10].
>>
>>    collection replaceFrom: position+1 to: newEnd  with: aCollection
>>    startingAt: 1.
>>         position := newEnd.
>>
>>        ^ aCollection
>>
>>    You can now pass all variableByte classes (Like ByteString)
>>    directly to a stream with a ByteArray collection, and the
>>    replaceFrom:to:with:/
>>    /startingAt: primitive will work correctly, just like the file
>>
>>    primitive does
>>    This means you don't need special Serializers for these either,
>> using f.ex. clunky nextStringPutAll: methods with manual
>>    asByteArray conversions./
>>
>>
>>
>>    So...if you agree, I can commit the patch.
>>
>>
>> Surely this will break special encoded strings where the encode/decode is
>> done in at:/at:put:.  Moving the bytes isn't safe in general.  However, if
>> the programmer knows they themselves can use next:putAll:startingAt: or some
>> such.  I don't think your suggestion is safe in general for WriteStream.
>>  Instead if  next:putAll:startingAt: doesn't do the job invent some new
>> protocol that will.
>

While analyzing the implications, I went to same conclusion as Eliot.
I feel like he is right
If we can just copy bytes uninterpreted, semantically it's a cast...
While such trick might be inevitable at a few localizations,
it does not look like a good thing to promote it as the default
behaviour, see below

> In principle I agree, though the only such case I can think of in an image
> atm is the newly introduced:
> wa := WordArray new: 2.
> ws := wa writeStream.
> ws nextPutAll: 1.0
>

Sure, or
    ByteArray new writeStream nextPutAll: SmallInteger maxVal + 1.

Our both examples seems a bit weird but perfectly intentional.
But what if we wish to nextPutAll: self words, and #words answer a Float ?
Whether intentional or not we can't tell...
Such behavior might hide an error or at least delay it's discovery.
Sacrificing least surprising principle for the sake of efficiency is a mistake.
But I have a better argument, see below.

> The string example in itself is rather theoretical, the "convention" is to
> use ByteStrings for all encodings, and let the programmer fend for himself
> when it comes to remembering which ByteStrings he's converted to which
> encoding.
> (A practice which, sadly, is only enabled by the collection class =
> aCollection class check (the one in next:putAll:startingAt:), as
> encoding/decoding a ByteString to a ByteString is much faster than the
> equivalent operation resulting in a ByteArray... )
> So in practice, even the current class check does not protect against what
> would be the common error.
>

Sure! We know what we want...
- a ByteArray for holding codes,
- an Encoder for producing these codes
- and a Decoder for interpreting them.
We trust this representation as a bit more safe and expressive because
we can't mix anyByteArrayWhateverTheEncoding and aString together, and
know exactly which form is encoded in which format (as long as we do
not loose the associated codec)...

We'll certainly encounter some performance bottlenecks like this in
our way and be tempted to optimize...
But our "optimization" is introducing exactly what we rejected: we can now write
    aString writeStream nextPutAll: anyByteArrayWhateverTheEncoding
Horror!

Having the current design broken shall not give us a licence to make
it potentially worse ;)

> Though, for a general, collection-agnostic approach which can still use
> primitives between compatible classes, I guess giving the nextPutAll:
> argument a say in what method is actually used to do a
> next:putAll:startingAt: would be a better approach than simply removing the
> check and crossing our fingers. :)
>

I didn't understand this sentence... You mean using explicitly an
optimized method in the sender?

Nicolas

> Cheers,
> Henry
>



More information about the Pharo-project mailing list