

A tagging database

I am a great advocate of the paperless office. After a few years of scientific work, I have therefore amassed quite a number of articles in electronic form. Until now, I named them according to first author and bibliographic data and sorted them into a directory hierarchy. But this approach has problems: what if I remember reading about some specific aspect of some system and want to look it up again quickly? In most cases I have to traverse a number of possible categories/directories until I find what I was looking for (in fact, if I remember the first author, I just do a recursive find). Or what if I want to review all my saved articles on elemental Ni? These are spread over a number of directories according to the principal physical effect or measurement method.

So the point is that a tree-like organization is not optimal. What is needed is an electronic equivalent of the old-fashioned library keyword catalogues. In newspeak, this concept is called tagging. A quick solution would be to keep a spreadsheet with a vertical list of all articles and a horizontal list of all tags, and to enter a logical TRUE in each cell where a tag applies to an article. You could then probably do some cell arithmetic that yields a short list for a given logical combination of tags. What I did instead was to write a set of small programs that manipulate and query a database, which I call litdb (for literature database, although you could use it equally well for music or whatever).
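To make the spreadsheet idea concrete, here is a minimal sketch in Python of such a logical matrix and an AND query over it. The file names, the tag assignments and the helper articles_with_all are made up for illustration; this is not code from litdb.

```python
# Hypothetical example data: file names and tag assignments are invented.
articles = ["smith_2001.pdf", "mueller_2005.pdf", "tanaka_2010.pdf"]
tags = ["phonons", "Ni", "Zr"]

# Dense logical matrix: one row per article, one column per tag.
matrix = [
    [True,  True,  False],   # smith_2001.pdf
    [True,  False, True],    # mueller_2005.pdf
    [False, True,  False],   # tanaka_2010.pdf
]


def articles_with_all(wanted):
    """Return the articles for which every tag in `wanted` is set (logical AND)."""
    cols = [tags.index(t) for t in wanted]
    return [name for name, row in zip(articles, matrix)
            if all(row[c] for c in cols)]


print(articles_with_all(["phonons", "Zr"]))
```

With a few thousand articles and a few hundred tags, almost all cells of such a matrix are FALSE, which is why a sparse storage format becomes attractive.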

The database can be viewed as exactly the spreadsheet described above (i.e., a logical matrix), stored in compressed sparse column format; the programs accessing it are likewise optimized for efficiency. Interaction with the database happens via the command line, so if I want to list all publications by my group on the phonons in Zr, I just issue find_db phonons "own munich" Zr. For a version 2.0, I will probably implement a form of tab completion.
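The following Python sketch illustrates how a logical matrix can be held in compressed sparse column form and queried for a combination of tags. It assumes that tags are the columns and articles the rows, and that a query is the logical AND of its tags. The actual on-disk layout of litdb and the inner workings of find_db are not described here, so the function names (build_csc, column, intersect_sorted, find) and the example data are purely hypothetical.

```python
def build_csc(rows_per_tag, tag_names):
    """Pack per-tag article indices into two CSC arrays.

    Every stored entry means TRUE, so no separate value array is needed:
    col_ptr[j]..col_ptr[j+1] delimits the sorted row indices of column j.
    """
    col_ptr = [0]
    row_idx = []
    for tag in tag_names:
        row_idx.extend(sorted(rows_per_tag[tag]))
        col_ptr.append(len(row_idx))
    return col_ptr, row_idx


def column(col_ptr, row_idx, j):
    """Return the sorted article indices that carry tag number j."""
    return row_idx[col_ptr[j]:col_ptr[j + 1]]


def intersect_sorted(a, b):
    """Intersect two sorted index lists in a single linear merge pass."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def find(col_ptr, row_idx, tag_names, wanted):
    """Return indices of the articles that carry all tags in `wanted`."""
    if not wanted:
        return []
    cols = [column(col_ptr, row_idx, tag_names.index(t)) for t in wanted]
    cols.sort(key=len)  # start with the rarest tag to keep intermediates small
    result = cols[0]
    for c in cols[1:]:
        result = intersect_sorted(result, c)
    return result


# Hypothetical data: five articles (rows 0..4) and three tags (columns).
tag_names = ["phonons", "own munich", "Zr"]
rows_per_tag = {"phonons": [0, 1, 3], "own munich": [1, 2, 3], "Zr": [1, 4]}
col_ptr, row_idx = build_csc(rows_per_tag, tag_names)
print(find(col_ptr, row_idx, tag_names, ["phonons", "own munich", "Zr"]))
```

Storing the matrix by column suits this kind of query: each tag's article list is a contiguous slice, and combining tags reduces to merging a few short sorted lists instead of scanning every article.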

Here is the source code, available under the GNU General Public License, version 2.
Phone: +49-89-289-11762 | Email: | last modified: 10.06.2013