[Pharo-project] petit parser help

Ricardo Moran richi.moran at gmail.com
Tue Apr 26 04:24:12 CEST 2011


I have only played a little with PetitParser, but I think the answer is in
PetitXml>>#element. Its action block compares the "qualified" names of the
open and close tags and returns a PPFailure if they differ. The same block
also handles the inlineTag case, by checking whether the fifth node is '/>'.

element
    "[39]   element   ::=    EmptyElemTag | STag content ETag"
    ^ $< asParser , qualified , attributes , whitespace optional ,
        ('/>' asParser
            / ($> asParser , content , [ :stream | stream position ] asParser ,
                '</' asParser , qualified , whitespace optional , $> asParser))
        ==> [ :nodes |
            "Inline tag: the fifth node is the '/>' terminator."
            nodes fifth = '/>'
                ifTrue: [ Array with: nodes second with: nodes third with: #() ]
                ifFalse: [
                    "The open and close tag names must match."
                    nodes second = nodes fifth fifth
                        ifTrue: [ Array with: nodes second with: nodes third with: nodes fifth second ]
                        ifFalse: [ PPFailure
                            message: 'Expected </' , nodes second qualifiedName , '>'
                            at: nodes fifth third ] ] ]
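
For what it's worth, the same idea can be tried in a workspace on a
stripped-down grammar. This is only a sketch, assuming PetitParser is loaded:
the variable names (name, element) and the toy grammar (letters-only tag
names, no attributes, no nesting) are made up for illustration, and
PPFailure class>>message:at: is the selector used in the method above, which
may differ in other PetitParser versions.

```smalltalk
"Toy grammar, not PetitXml's: <name>text</name> with a name check in the action."
| name element |
name := #letter asParser plus flatten.
element := $< asParser , name , $> asParser ,
    $< asParser negate star flatten ,           "text up to the next $<"
    [ :stream | stream position ] asParser ,    "remember where the close tag starts"
    '</' asParser , name , $> asParser
    ==> [ :nodes |
        "nodes second is the open tag name, nodes seventh the close tag name."
        nodes second = nodes seventh
            ifTrue: [ Array with: nodes second with: nodes fourth ]
            ifFalse: [ PPFailure
                message: 'Expected </' , nodes second , '>'
                at: nodes fifth ] ].

"Matching names: should answer an Array holding the tag name and the text."
element parse: '<b>bold</b>'.

"Mismatched names: should answer a failure (check with isPetitFailure)."
element parse: '<b>bold</i>'.
```

Returning the PPFailure from the action block is what makes the enclosing
choice or sequence treat the mismatch like any other failed parse.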

I hope this helps.
Cheers,
Richo

On Mon, Apr 25, 2011 at 6:42 PM, Esteban Lorenzano <estebanlm at gmail.com> wrote:

> Hi Lukas, all
> I'm finally working on an HTML petit parser (a very basic one, based on the
> XML petit parser) and I have a serious problem (well... besides my complete
> ignorance about petit parser, heh...)
> I need to match this pattern:
>
> openTag, contents, closeTag   (that will be something like "<html> ... </html>")
> inlineTag                     (that will be something like "<br/>")
> openTag                       (that will be something like "<link ...>" or "<img src='anUrl'>")
>
> so, after trying some variants... I came up with this construct:
>
> element
>        "[39]           element    ::=           EmptyElemTag | STag content
> ETag"
>
>        ^(self inlineTag / (self openTag, content, self closeTag) / self
> openTag)
>                ==> [ :nodes | ].
>
> openTag
>        ^ $< asParser, qualified, whitespace optional, attributes,
> whitespace optional, $> asParser
>
> inlineTag
>        ^ $< asParser, qualified, whitespace optional, attributes,
> whitespace optional, '/>' asParser
>
> closeTag
>        ^'</' asParser , qualified , whitespace optional , $> asParser
>
>
> so... the problem here is that the statement
>
> self openTag, content, self closeTag
>
> matches
>
> ...
>        <link ...>
> </html>
>
> and for that reason, the resulting tree is invalid.
>
> So, I need a way to ensure the openTag name is equal to the closeTag name.
>
> How can I do that?
>
> Cheers,
> Esteban
>