I just released Augeas 0.1.1. Without my really planning it, the last two weeks were mostly spent on fixing bugs (besides the regular expression enhancement I blogged about previously, though the real reason for doing that was that the typechecker had a serious bug, and subtraction of regular languages is needed to make the fixed typechecker usable).
The reference counting code in the interpreter had some serious leaks. I had known about them for a while, but never tried to track them down systematically, partly because I thought it would be way too hairy. As it turned out, they weren't that hard to track down; the key ingredient in squashing them was writing little test scripts that only exercised a small number of operations, like
let l = key /a/
and then running Valgrind a lot, and gdb a little. Of course, the real trick is to figure out what little toy scripts to write ...
Besides memory leaks, I also realized, using Valgrind's massif tool, that compiled regular expressions are huge, and I was hanging on to them for way too long.
With all that, Augeas 0.1.1 has no known memory leaks, and uses a reasonable amount of memory. Most of the honor for that goes to Valgrind, which is an amazingly useful tool.
For Augeas, I wanted to support subtraction of regular expressions, so that you can say
let key_re = /[A-Za-z]+/ - /(Allow|Deny)(Groups|Users)/
which would make key_re match all words made up of lower and upper case letters except for AllowGroups, AllowUsers, DenyGroups and DenyUsers. The reason is that those four special cases are handled differently from "generic" keys.
Since the - operator has no counterpart in ordinary regular expression notation, the result needs to be constructed by compiling the two operands into finite automata, subtracting the second automaton from the first, and then converting the resulting automaton back into a regular expression. All these operations, except for the conversion from automaton to regular expression, were already supported by libfa.
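To make the subtraction step concrete, here is a minimal sketch of the textbook product construction for the difference of two complete DFAs. It is not libfa's code; the `DFA` class, the `difference` function and the two toy automata are made up for illustration, and the alphabet is cut down to two letters to keep the tables small.

```python
class DFA:
    """A complete DFA: every state has a transition for every symbol."""

    def __init__(self, alphabet, states, start, accepting, delta):
        self.alphabet = alphabet
        self.states = states
        self.start = start
        self.accepting = accepting
        self.delta = delta  # dict: (state, symbol) -> state

    def accepts(self, word):
        state = self.start
        for ch in word:
            state = self.delta[(state, ch)]
        return state in self.accepting


def difference(a, b):
    """Build a DFA for L(a) - L(b) by running a and b in lockstep."""
    assert a.alphabet == b.alphabet
    start = (a.start, b.start)
    delta = {}
    seen = {start}
    todo = [start]
    while todo:
        p, q = todo.pop()
        for ch in a.alphabet:
            nxt = (a.delta[(p, ch)], b.delta[(q, ch)])
            delta[((p, q), ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # A pair accepts when a accepts and b does not.
    accepting = {(p, q) for (p, q) in seen
                 if p in a.accepting and q not in b.accepting}
    return DFA(a.alphabet, seen, start, accepting, delta)


# Toy stand-ins (assumptions): /[ab]+/ minus the single word "ab".
all_words = DFA("ab", {0, 1}, 0, {1},
                {(0, "a"): 1, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1})
only_ab = DFA("ab", {0, 1, 2, 3}, 0, {2},
              {(0, "a"): 1, (0, "b"): 3, (1, "a"): 3, (1, "b"): 2,
               (2, "a"): 3, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3})

diff = difference(all_words, only_ab)
print(diff.accepts("ab"), diff.accepts("abb"))  # False True
```

The same idea scales to the key_re example: one automaton for all words of letters, one for the four special names, and the product accepts exactly the words the first accepts and the second rejects.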
Implementing the conversion was quite a bit of fun, and the implementation follows almost literally the proof [pdf] that every language recognized by a finite automaton is regular. For some reason, these graph algorithms are always fun to implement, especially when they wind up working.
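For the curious, the conversion can be sketched with the usual state elimination argument: add a fresh start and final state, label edges with regular expressions, and remove the original states one at a time while patching up the edges around them. This is a rough sketch of that textbook construction, not the code in libfa; `dfa_to_regex` and its helpers are hypothetical names, and the output is a Python `re` pattern that makes no attempt at being short.

```python
import re


def union(r, s):
    """Union of two edge labels; None stands for the empty language."""
    if r is None:
        return s
    if s is None:
        return r
    return "(?:" + r + "|" + s + ")"


def star(r):
    """Kleene star of an edge label; the star of nothing is the empty word."""
    if r is None or r == "":
        return ""
    return "(?:" + r + ")*"


def dfa_to_regex(states, start, accepting, delta):
    """Return a regex for the automaton, or None if it accepts nothing.

    delta maps (state, symbol) -> state and may be partial.
    """
    S, F = object(), object()  # fresh start and final states
    edge = {}                  # (src, dst) -> regex label

    def add(src, dst, r):
        edge[(src, dst)] = union(edge.get((src, dst)), r)

    add(S, start, "")
    for q in accepting:
        add(q, F, "")
    for (p, ch), q in delta.items():
        add(p, q, re.escape(ch))

    for k in states:
        loop = edge.pop((k, k), None)
        ins = [(p, r) for (p, q), r in edge.items() if q == k]
        outs = [(q, r) for (p, q), r in edge.items() if p == k]
        for key in [e for e in edge if k in e]:
            del edge[key]
        # Every path p -> k -> q is replaced by a direct edge p -> q.
        for p, r_in in ins:
            for q, r_out in outs:
                add(p, q, r_in + star(loop) + r_out)
    return edge.get((S, F))


# Toy automaton (an assumption): words over {a, b} with an even number of a's.
even_as = dfa_to_regex([0, 1], 0, {0},
                       {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1})
print(even_as)
```

The resulting expressions grow quickly with the number of states, so a practical implementation would want to simplify labels as it goes.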
A while ago I had what would be a hallway conversation with Mark if we worked in the same office (or country, for that matter.) Something he said set me thinking that getting a better handle on the mess of file formats in /etc would be possible, and in a way that would hide much of the pain those different formats inflict when config files need to be changed programmatically; it's actually nice that config data is stored in text files for interactive use (yes, that means vi), but a smoldering trainwreck when changes need to be scripted.
Editorial Note: we apologize for the length of this entry. If you don't want to read all this slipslop, feel free to go straight to the Augeas website. Just tell lutter how much fun you had reading his blog.
It's a commonplace that the colorful variety of files and file formats used to configure the average Linux (or Unixy) system keeps us from having any sort of API to modify config data, and that any attempt to change that is doomed. Almost exactly a year ago, I argued precisely that point (convincingly, I thought): that the best we can hope for is to have a few better tools for each service to modify its configuration. Maybe we can even build something on top of those tools, but that's about as far as any such attempt could ever go in practice.
After that non-hallway conversation, it dawned on me that the various attempts to deal with this situation boiled down to three different approaches:
All these have been tried, and they all have serious limitations:
With all this in mind, my list of requirements for Augeas roughly looked like this:
After banging my head against the above for a while, and learning most, if not all, of the ways in which not to achieve it, I came across some work by the programming languages group at Penn, in particular Harmony and Boomerang. That work sent me down the right path. Because of the nice theoretical foundation laid by Harmony and Boomerang, Augeas checks descriptions statically (i.e., before they are ever used to modify a single file) to guard against a whole host of possible problems. Some of these problems are quite subtle, and are much easier for a computer to detect than for a human.
The Augeas website contains an introductory tour showing how the API is used, and more details on how file formats are described.
Recently, I needed a finite automata library written in C. (For those of you who don't remember their formal language classes too well, finite automata are the theoretical underpinning of regular expressions.) In a nutshell, a finite automaton represents the set of all strings matching a regular expression.
Such a library is a little different from regular expression matching, for which there are plenty of libraries, like GNU regex, because it supports operations that are impossible with regular expressions alone, such as intersection and the more exotic deciding of ambiguity.
Unfortunately, there isn't a well-maintained open-source C library to do that. Lucky for me, there is a very well-written Java library, dk.brics.automaton by Anders Møller. Based on that, I just finished implementing libfa. It is built as a separate DSO, but it's not distributed separately from Augeas yet; if you need an FA library and libfa seems like it would be useful, drop me a line and I'll split it out. It's mostly a matter of wrestling with autotools.
If you're curious about what you can do with finite automata that you can't with regular expressions alone, deciding ambiguous concatenation is a good example: the concatenation of two regular expressions r1 and r2 is ambiguous if there is a string upv, with p nonempty, such that both u and up match r1 and both pv and v match r2 (so that upv matches r1r2 in two different ways).
Ambiguity is important if you attach actions to r1 and r2, for example, delete strings matching r1 from the output and convert everything matching r2 to uppercase: if you're confronted with an ambiguous string upv, you don't know what to do with p. Should you delete it (splitting upv into up and v) or should you upcase it (splitting upv into u and pv)? What's worse, when you match r1r2 against upv, you'll never know that there is ambiguity, and how it gets split largely depends on minute details of the implementation of the regular expression matcher.
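To see the definition in action, here is a small brute-force sketch that hunts for such a split using Python's `re` module. It only tries strings up to a fixed length, so it can find a witness but can never prove that none exists; deciding the question properly takes automata, which is the whole point of a library like libfa. The function name `ambiguous_split` and the sample expressions are made up for this illustration.

```python
import re
from itertools import product


def ambiguous_split(r1, r2, alphabet, max_len):
    """Search for (u, p, v), p nonempty, where u and u+p match r1
    and p+v and v match r2. Returns the first witness or None."""
    m1, m2 = re.compile(r1), re.compile(r2)
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            for i in range(n + 1):
                for j in range(i + 1, n + 1):
                    u, p, v = w[:i], w[i:j], w[j:]
                    if (m1.fullmatch(u) and m1.fullmatch(u + p)
                            and m2.fullmatch(p + v) and m2.fullmatch(v)):
                        return u, p, v
    return None


# a* followed by a*b: the leading a's can go to either side.
print(ambiguous_split("a*", "a*b", "ab", 4))
# a+ followed by b+: every split is forced, so no witness turns up.
print(ambiguous_split("a+", "b+", "ab", 4))
```

In the first case the search comes back with an empty u, p = "a" and v = "b": the string "ab" can be cut before or after the a, and a matcher will silently pick one.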
At long last, Ruby on Rails 2 is part of Fedora. Packages are already in rawhide and will show up in the testing repos for F-7 and F-8 really soon now. The package is called rubygem-rails, as it's based on the rubygems for Rails.
For those few who installed the rubygem-rails-1.2.6 package either from my yum repo or an updates-testing repo, you need to first get rid of rubygem-actionwebservice. Running yum erase rubygem-actionwebservice before yum install rubygem-rails will suffice. There's no smoother update path, but since rails-1.2.6 was only ever in updates-testing, I didn't bother finding a better fix for the switch from actionwebservice to activeresource with Rails 2.