Wednesday, November 27, 2013

recoll: indexed file searching that works

Today I discovered recoll, a tool that indexes files (both names and contents, including email) and runs searches over the index. It has both graphical and command-line interfaces and, unlike some other open-source tools that claim to do this job, it really seems to work. Traditional tools like grep and find work fine, but grep was not designed to search with multiple search keys, and neither of them uses an index, so every search walks the filesystem anew. Another traditional tool, locate, uses an index for speed, but it only searches file names and paths, not file contents.
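
To make the multiple-search-key point concrete, here is a rough sketch of what finding files that contain two terms takes with grep alone. It assumes GNU grep and GNU xargs (for -Z, -0 and -r); the directory, file names and terms are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical demo data: three small files in a throwaway directory.
demo_dir=$(mktemp -d)
printf 'alpha beta\n' > "$demo_dir/both.txt"
printf 'alpha only\n' > "$demo_dir/first.txt"
printf 'beta only\n'  > "$demo_dir/second.txt"

# Files containing BOTH terms: the first grep lists files that match
# "alpha", the second keeps only those that also match "beta".
# -Z and -0 pass NUL-separated file names so odd names survive the pipe;
# -r (GNU xargs) skips the second grep when the first finds nothing.
matches=$(grep -rlZ 'alpha' "$demo_dir" | xargs -0 -r grep -l 'beta')
echo "$matches"
```

Each extra term means another stage in the pipeline, and every run reads the files from disk again, which is the work an index avoids.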

Anyway, recoll appears to be the open-source file indexing and searching tool that I was looking for. Here's how to install it in openSUSE:

  1. Go to software.opensuse.org
  2. In the search box, type "recoll-qt3"
  3. Under your version of openSUSE, click on "Show unstable packages"
  4. Find version 1.14.4 from the KDE:KDE3 repository and click on "1 Click Install"

Once the tool is installed, you need to set up an index like so:
  1. review the configuration file documentation:
    $ less /usr/share/recoll/examples/recoll.conf
    
  2. as your normal user, create a .recoll directory in your home directory:
    $ mkdir ~/.recoll
    
  3. create a basic recoll.conf in that directory:
    $ echo "topdirs = ~" > ~/.recoll/recoll.conf
    
    (topdirs is all you need to get started)
  4. build the index:
    $ recollindex 2>/dev/null
    
    (run without 2>/dev/null to see the error messages, but they probably aren't important)
  5. run some simple searches
    $ recollq searchterm
    $ recollq searchterm1
    $ recollq searchterm1 searchterm2
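
Steps 2 and 3 above can be wrapped in a small helper so that re-running them never overwrites an existing configuration. This is only a sketch: write_recoll_conf is a made-up name, not part of recoll, and the demonstration writes into a scratch directory instead of ~/.recoll so it can be tried safely.

```shell
#!/bin/sh
# write_recoll_conf is a hypothetical helper, not a recoll command.
# Usage: write_recoll_conf CONFDIR [TOPDIRS]
write_recoll_conf() {
    conf_dir=$1
    top_dirs=${2:-}
    # Default to the home directory, written literally as "~" as in step 3.
    [ -n "$top_dirs" ] || top_dirs='~'

    mkdir -p "$conf_dir" || return 1

    # Leave an existing configuration untouched.
    if [ ! -f "$conf_dir/recoll.conf" ]; then
        echo "topdirs = $top_dirs" > "$conf_dir/recoll.conf"
    fi
}

# Demonstration in a scratch directory rather than the real ~/.recoll.
scratch=$(mktemp -d)
write_recoll_conf "$scratch"
cat "$scratch/recoll.conf"
```

For real use, call write_recoll_conf "$HOME/.recoll" and then build the index as in step 4.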
    

5 comments:

  1. Recoll 1.14.4 is probably beginning to show its age... If you wish to update, I think there is a much more recent release in KDE:Extra:
    https://build.opensuse.org/package/show?package=recoll&project=KDE%3AExtra

    1. Yes, but it appears to be missing the command-line client (recollq).

  2. As Jean-François noted, why use such a dated version? And why the qt3 package?
    There is a nice package in KDE:Extra, plus an additional kio recoll...

    1. Did you read my response to Jean-François? I'm primarily interested in the command-line interface.

  3. recoll -t behaves exactly like recollq (no connection to X server, same query syntax, same output). The only reason to use recollq is for people who don't have the qt and x11 libs installed at all, for example on a NAS box.
