regexp matches of sgml tags


Richard Goerwitz
From a recent posting:

       Unix nitpick:  the question has been raised as to what kind
       of regular expression might match SGML tags the best. Certainly,
       as Erik has pointed out, the initial suggestion
         /<.*>/
       is indeed far too destructive.

Not only that: on many Unix systems you'll also run into line-length
limits.  I checked GNU sed, and from a brief look at the code, it seems
to allocate extra space when it needs it.  But that isn't true of all
sed implementations.

       Isn't a more elegant solution the regexp:

            <[^<>]*>

(Originally, your regexp had ^Z's in it, and no initial caret in the
brackets - isn't the above what you meant?)
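
To make the difference between the two patterns concrete, here is a minimal sketch. It uses Python's re module rather than sed, purely for illustration; the sample string and variable names are made up for this example, and only the two regexps come from the discussion above. On input like this, sed's matching should give the same results.

```python
import re

# Hypothetical sample line with two pairs of tags (not from the original post).
sample = 'See <b>bold</b> and <i>italic</i> text.'

# The initial suggestion: ".*" is greedy, so the match runs from the
# first "<" on the line to the last ">" on the line.
greedy = re.compile(r'<.*>')

# The proposed fix: a tag body may contain neither "<" nor ">",
# so each match stops at the first closing ">".
bounded = re.compile(r'<[^<>]*>')

greedy_matches = greedy.findall(sample)
bounded_matches = bounded.findall(sample)

# Deleting the matches shows how destructive the greedy form is.
greedy_stripped = greedy.sub('', sample)
bounded_stripped = bounded.sub('', sample)

print(greedy_matches)    # one match covering everything between the outer tags
print(bounded_matches)   # one match per tag
print(greedy_stripped)
print(bounded_stripped)
```

The greedy pattern swallows the text between the tags along with the tags themselves, while the bracketed version removes only the tags. Note that the bracketed version still stops at any ">" inside a tag, for instance one inside a quoted attribute value.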

One thing I'm wondering about is how you specify a literal (i.e.
non-meta) "<" or ">".

-Richard
