Fun with regular languages

13 May 2008

For Augeas, I wanted to support subtraction of regular expresions, so that you can say

let key_re = /[A-Za-z]+/ - /(Allow|Deny)(Groups|Users)/

which would make key_re match all words made up of lower and upper case letters except for AllowGroups, AllowUsers, DenyGroups and DenyUsers — the reason being, that those four special cases are handled differently from “generic” keys.

Since the - can’t be expressed in regular language notation, it needs to be constructed by compiling its two operands into a finite automaton, subtracting the two automata from each other, and then converting the automaton back into a regular expression. All these operations, except for the conversion from automaton to regular expression, were already supported by libfa.

Implementing the conversion was quite a bit of fun, and the implementation follows almost literally the proof that every language recognized by a finite automaton is regular. For some reason, these graph algorithms are always fun to implement, especially when they wind up working ;)

Creative Commons License Watzmann.Blog by David Lutterkort is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

Generated with Jekyll