• naptera@feddit.de

    For the purpose of algorithm verification, finite and/or pushdown automata, or probably sometimes even Turing machines, are used, because they are easier to work with. "Real" regular expressions are, I think, mainly nice as a grammar for regular languages that can easily be interpreted by the computer. The thing is that regexes in the *nix and programming language world are also used for searching, which is why there are additional special characters to indicate things like "it has to end with …", and there are shortcuts for when you want a character or sequence to occur

    • at least once,
    • once or never, or
    • a specified number of times back to back (see the sketch below).
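
    To illustrate (the patterns and test strings below are just made-up examples, using Python's re module):

```python
import re

# Extended (*nix / programming) regex shortcuts mentioned above:
#   +     -> at least one occurrence
#   ?     -> once or never
#   {n}   -> exactly n occurrences back to back
#   $     -> "it has to end with ..."
print(bool(re.fullmatch(r"ab+", "abbb")))    # True: 'b' at least once
print(bool(re.fullmatch(r"ab?", "a")))       # True: 'b' once or never
print(bool(re.fullmatch(r"ab{3}", "abbb")))  # True: 'b' exactly three times
print(bool(re.search(r"c$", "abc")))         # True: string ends with 'c'
```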

    In “standard” regex, you would only have

    • () for grouping,
    • * for 0 or any number of occurrences (so a* means blank or a or aa or …),
    • + for alternation (the union of two characters/groups, so a+b means a or b; in programming, a+ mostly means the same as aa*, so this is a difference),
    • and sometimes some way to have a shortcut for (a+b+c+…+z) if you want to allow any lower case character as the next one.

    So there are only 4 special characters, which have the same expressive power as the extended syntax, with the exception of not being able to indicate that something has to occur at the end or beginning of a string/line (and even that could have been dropped if the tools we now have had been given different functions or options instead).
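
    As a rough sketch of that expressive-power claim (again just made-up patterns in Python, where the formal union + is written |), each extended shortcut can be rewritten with only grouping, alternation and the star:

```python
import re

# Each extended shortcut on the left can be rewritten using only
# grouping, alternation (the formal '+', written '|' here) and '*':
equivalent_pairs = [
    (r"a+",    r"aa*"),       # at least once
    (r"a?",    r"(|a)"),      # once or never
    (r"a{3}",  r"aaa"),       # exactly three times
    (r"[a-c]", r"(a|b|c)"),   # character-class shortcut
]

words = ["", "a", "aa", "aaa", "b", "c", "ab"]
for extended, basic in equivalent_pairs:
    same = all(
        bool(re.fullmatch(extended, w)) == bool(re.fullmatch(basic, w))
        for w in words
    )
    print(f"{extended!r:8} == {basic!r:10}: {same}")
```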

    So one could say that *nix regex is bloated /s