Talk:Regex Posix

From HaskellWiki
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Just a quick note that a lot of the other-implementations-are-not-compliant examples appear to be about empty patterns (e.g., issues with the matching of () in (()|.)(b)).

If you read the linked to POSIX standard, however, it seems that such empty expressions are not actually valid regexs. For example, the defined extended regex grammar is

extended_reg_exp   :                      ERE_branch
                   | extended_reg_exp '|' ERE_branch
                   ;
ERE_branch         :            ERE_expression
                   | ERE_branch ERE_expression
                   ;
ERE_expression     : one_char_or_coll_elem_ERE
                   | '^'
                   | '$'
                   | '(' extended_reg_exp ')'
                   | ERE_expression ERE_dupl_symbol
                   ;
one_char_or_coll_elem_ERE  : ORD_CHAR
                   | QUOTED_CHAR
                   | '.'
                   | bracket_expression
                   ;
ERE_dupl_symbol    : '*'
                   | '+'
                   | '?'
                   | '{' DUP_COUNT               '}'
                   | '{' DUP_COUNT ','           '}'
                   | '{' DUP_COUNT ',' DUP_COUNT '}'
                   ;

from which I don't see how you can form () as it must contain a extended_reg_exp which has to consist of at least one ERE_branch which must consist of at least one ERE_expression which must have at least one character of some sort.