[Pharo-project] Prematurely ending document on purpose with the SAXParser

Stéphane Ducasse stephane.ducasse at inria.fr
Fri Jan 6 13:09:02 CET 2012

Is the XMLPullParser part of XMLSupport?


>>>> I wonder what to do in this kind of situation. Consider:
>>>> 1) XML is one of the most important packages
>>>> 2) There is no mailing list for XML?
>>>> 3) Nobody replies, and
>>>> 4) You want the latest updates from that package
>>>> Any suggestions?
>>> Maybe you could try to reach one of the administrators/developers directly; there are quite a few on SS.
>> I'm one of those listed, but jaayer is the developer of the XML package. He picked up Michael's implementation and extended it. So there is only one guy who can give an elaborate answer.
>> Hernan, I don't really understand your problem. It sounds like you need to mess with the method checkEOD in order to get your use case done. The real problem is that SAX is about reading the whole file in one go. Because this is inappropriate for a lot of cases, StAX/pull parsers have been implemented. So the real issue is that there is no StAX/pull parser in Smalltalk (or at least in Pharo).
>> Norbert
> Hi,
> Of course the ideal would be using a full StAX implementation, or
> even better a vtd-xml one, that would really be a cool use case for
> attracting more developers to Smalltalk. As there is no
> specification of SAX, we have to go by consensus. So while SAX is
> commonly thought of as a read-all solution, there are
> workarounds to stop the reading in Java [*]. So yes, I've used a
> workaround for that case. Today I played with XMLPullParser (StAX) and
> after a few changes I could parse and stop parsing when a specific
> node is reached. Example:
> | parser |
> parser := XMLPullParser parse: '<?xml version="1.0"?>
> <!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
> "NCBI_BlastOutput.dtd">
> <BlastOutput>
>  <BlastOutput_program>blastn</BlastOutput_program>
>  <BlastOutput_version>BLASTN 2.2.26+</BlastOutput_version>
>  <BlastOutput_reference>Zheng Zhang, Scott Schwartz, Lukas Wagner,
> and Webb Miller (2000), &quot;A greedy algorithm for aligning DNA
> sequences&quot;, J Comput Biol 2000;
> 7(1-2):203-14.</BlastOutput_reference>
>  <BlastOutput_param>
>    <Parameters>
>      <Parameters_expect>10</Parameters_expect>
>      <Parameters_sc-match>1</Parameters_sc-match>
>    </Parameters>
>  </BlastOutput_param>
> <BlastOutput_iterations>'.
> "Advance one event at a time until the BlastOutput_param start tag is reached."
> [ parser isStartTag: 'BlastOutput_param' ]
> 	whileFalse: [
> 		Transcript show: parser text; cr.
> 		parser next ].
> I've sent the changes to Ken Treis so he can integrate them in
> XMLPullParser package.
> Cheers,
> Hernán
> [*] http://www.ibm.com/developerworks/xml/library/x-tipsaxstop/
> http://stackoverflow.com/questions/3405702/using-sax-to-parse-common-xml-elements/3409270#3409270
> http://stackoverflow.com/questions/1345293/how-to-stop-parsing-xml-document-with-sax-at-any-time
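
For readers who do not follow the links: the usual Java workaround is to throw an exception from a SAX handler callback and catch it around the parse call. The following is only a minimal sketch of that idea, not code from the linked articles. The class name StopEarlySax, the marker exception StopParsing, and the helper elementsBefore are made up for illustration, and the sample document is a shortened stand-in for the BLAST output above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StopEarlySax {

    /** Hypothetical marker exception: signals a deliberate stop, not a parse error. */
    static class StopParsing extends SAXException {
        StopParsing() {
            super("parsing stopped on purpose");
        }
    }

    /**
     * Collects the names of the elements opened before stopTag is seen,
     * then aborts the parse by throwing from the handler.
     */
    public static List<String> elementsBefore(String xml, String stopTag) throws Exception {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) throws SAXException {
                if (qName.equals(stopTag)) {
                    // Leaving the callback with an exception ends the parse here.
                    throw new StopParsing();
                }
                seen.add(qName);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Deliberate early exit; real parse errors are other SAXExceptions
            // and still propagate to the caller.
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<BlastOutput>"
                + "<BlastOutput_program>blastn</BlastOutput_program>"
                + "<BlastOutput_param><Parameters/></BlastOutput_param>"
                + "</BlastOutput>";
        System.out.println(elementsBefore(xml, "BlastOutput_param"));
    }
}
```

Using a dedicated exception type keeps the intentional stop distinguishable from genuine well-formedness errors. The control flow is still driven by the parser, which is why a pull parser is the more natural fit for this use case.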

