
How to recover Cyrus when you have DB errors

I’ll try to explain some methods and tips for recovering from a mix-up of db library versions, or from messages like:

  • DBERROR: reading /var/lib/cyrus/db/skipstamp, assuming the worst: No such file or directory
  • DBERROR db4: PANIC: fatal region error detected; run recovery
  • DBERROR: critical database situation

Structure of Cyrus database

The database is split into two parts. The first part is the real database, normally located in /var/lib/cyrus:

mail:/usr/src/cyrus-imapd-2.2.10# cd /var/lib/cyrus
mail:/var/lib/cyrus# ll
total 60
-rw------- 1 cyrus mail 144 Feb 20 16:54 annotations.db
drwxr-xr-x 2 cyrus mail 4096 Feb 20 16:54 db
drwx------ 2 cyrus mail 4096 Feb 20 16:54 db.backup1
-rw------- 1 cyrus mail 8192 Feb 20 16:54 deliver.db
drwxr-xr-x 2 cyrus mail 4096 Feb 20 16:46 log
-rw------- 1 cyrus mail 11644 Feb 20 16:54 mailboxes.db
drwxr-xr-x 2 cyrus mail 4096 Feb 20 16:46 msg
drwxr-xr-x 2 cyrus mail 4096 Feb 20 17:15 proc
drwxr-xr-x 2 cyrus mail 4096 Feb 20 16:46 ptclient
drwxr-xr-x 2 cyrus mail 4096 Feb 20 16:55 socket
-rw------- 1 cyrus mail 8192 Feb 20 16:54 tls_sessions.db

In this part, the only file that really matters (and the most important one in all of Cyrus) is mailboxes.db. Everything else can be reconstructed from scratch.

The second part contains all the emails and the Sieve filters. They are usually located in /var/spool/cyrus/. I advise you to make a recursive backup of these two directories in case something goes really wrong.
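
A minimal sketch of such a backup, assuming the default paths mentioned above. The function name backup_cyrus and the destination directory are my own illustrative choices, not part of Cyrus. It is best run while Cyrus is stopped, so the files do not change while tar reads them.

```shell
#!/bin/sh
# backup_cyrus DEST DIR...
# Archive each DIR (absolute path expected) into DEST as a dated tarball.
# The function name and DEST are illustrative assumptions, not Cyrus tooling.
backup_cyrus() {
    dest=$1; shift
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest" || return 1
    for dir in "$@"; do
        [ -d "$dir" ] || { echo "skipping $dir: not a directory" >&2; continue; }
        # Build the archive name from the path: /var/lib/cyrus -> var_lib_cyrus
        name=$(printf '%s' "$dir" | sed 's|^/||; s|/|_|g')
        # Archive relative to / so the tarball does not store a leading slash.
        tar -C / -czf "$dest/$name-$stamp.tar.gz" "${dir#/}" || return 1
    done
}
```

Typical use would be something like backup_cyrus /root/cyrus-backup /var/lib/cyrus /var/spool/cyrus, where /root/cyrus-backup is a hypothetical destination with enough free space for the whole mail spool.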

Possible problems, and their solution

Look for every copy of the Cyrus executables you may have. The easiest way is the locate command (after refreshing its database with updatedb). Try locate ctl_mboxlist, for example. Then run ldd on each file found (ctl_mboxlist here, but any other Cyrus executable works the same way):

mail:/var/lib/cyrus# ldd /usr/cyrus/bin/ctl_mboxlist
libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x4001c000)
libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0x40031000)
libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0x40063000)
libresolv.so.2 => /lib/tls/libresolv.so.2 (0x40162000)
libdb-4.2.so => /usr/lib/libdb-4.2.so (0x40174000)
libc.so.6 => /lib/tls/libc.so.6 (0x4024a000)
libdl.so.2 => /lib/tls/libdl.so.2 (0x4037f000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

The important thing is to have all of your executables linked with the same libdb version (here it is libdb-4.2.so).
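
Checking each binary by eye gets tedious, so here is a rough sketch that does the comparison for you. The two function names are hypothetical helpers of mine. It assumes the library shows up in the ldd output as libdb-X.Y.so, as in the listing above, and that your grep supports the -o option.

```shell
#!/bin/sh
# libdb_versions: read ldd output on stdin and print each distinct
# libdb library name it mentions (e.g. libdb-4.2.so), one per line.
libdb_versions() {
    grep -o 'libdb-[0-9][0-9.]*\.so' | sort -u
}

# check_libdb BINARY...
# Run ldd on every binary and report whether they all use one libdb version.
check_libdb() {
    versions=$(for bin in "$@"; do ldd "$bin" 2>/dev/null; done | libdb_versions)
    # Count non-empty lines; grep -c exits non-zero on a count of 0, hence || true.
    count=$(printf '%s\n' "$versions" | grep -c . || true)
    if [ "$count" -le 1 ]; then
        echo "OK: ${versions:-no libdb found}"
    else
        # Unquoted on purpose: joins the versions on one line.
        echo "MISMATCH: $(echo $versions)"
        return 1
    fi
}
```

With the layout shown above you would call it as check_libdb /usr/cyrus/bin/*, adding any other directory where locate found Cyrus executables.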

The first thing to do, before anything else, is to verify the integrity of your Cyrus installation. Stop the Cyrus daemon, restart it, and look at the logs. I often see errors about daemons that cannot be found. For example, one of them (ctl_cyrusdb) is supposed to recover automatically from a db crash, but when you have the following error:

Feb 20 07:13:48 mail master[11873]: about to exec /usr/sbin/ctl_cyrusdb
Feb 20 07:13:48 mail master[11873]: can't exec /usr/sbin/ctl_cyrusdb on schedule: No such file or directory
Feb 20 07:13:48 mail master[10665]: process 11873 exited, status 71

you can be sure that a disaster will follow.
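
To spot this quickly, you can filter the log for these messages. The sketch below relies only on the message format shown in the excerpt above; the function name missing_execs is a hypothetical helper, and the location of your mail log depends on your syslog configuration.

```shell
#!/bin/sh
# missing_execs: read syslog lines on stdin and print each executable
# that the Cyrus master process reported it could not exec.
# Matches the "can't exec PATH ..." format from the log excerpt above.
missing_execs() {
    sed -n "s/.*master\[[0-9]*\]: can't exec \([^ ]*\) .*/\1/p" | sort -u
}
```

For example, missing_execs < /var/log/mail.log (the log path is an assumption) prints one missing path per line. Any path it prints should be fixed in your Cyrus configuration, or the binary installed there, before going further.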

One common problem is that the database was created with a different libdb version from the one currently in use. To correct this, switch to the Cyrus admin user (normally cyrus) and run the following command while watching the syslog files:

mail:/tmp# su - cyrus
mail:/var/lib/cyrus$ ctl_cyrusdb -r

It is normal to get no output from the command: the output goes to syslog. When the recovery is successful, you’ll see this:

Feb 20 16:00:07 mail ctl_cyrusdb[31041]: recovering cyrus databases
Feb 20 16:00:07 mail ctl_cyrusdb[31041]: skiplist: recovered /var/lib/cyrus/mailboxes.db (124 records, 13092 bytes) in 0 seconds
Feb 20 16:00:08 mail ctl_cyrusdb[31041]: skiplist: recovered /var/lib/cyrus/annotations.db (0 records, 144 bytes) in 1 second
Feb 20 16:00:08 mail ctl_cyrusdb[31041]: done recovering cyrus databases

Here, you see that Cyrus found 124 mailboxes.

If this is not sufficient, you’ll have to delete the contents of /var/lib/cyrus/db/ and re-run ctl_cyrusdb -r.
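
A more cautious variant of this step moves the files aside instead of deleting them, so you can still put them back if things get worse. This is my own substitution for the plain delete, sketched with a hypothetical helper name; it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# set_aside_db DBDIR
# Move everything inside DBDIR into a new sibling directory named
# DBDIR.old-<timestamp>, leaving DBDIR itself empty. Prints the new directory.
# Hypothetical helper: a reversible stand-in for deleting DBDIR's contents.
set_aside_db() {
    dbdir=$1
    aside="${dbdir%/}.old-$(date +%Y%m%d-%H%M%S)"
    mkdir "$aside" || return 1
    # Move every top-level entry, dotfiles included, out of the db directory.
    find "$dbdir" -mindepth 1 -maxdepth 1 -exec mv {} "$aside"/ \; || return 1
    echo "$aside"
}
```

With Cyrus stopped, you would run set_aside_db /var/lib/cyrus/db and then ctl_cyrusdb -r as the cyrus user, exactly as before.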

If this is still not sufficient, you’ll have to go to the extreme: rebuild the whole /var/lib/cyrus directory. Before doing so, first try to get a “text” version (an export) of mailboxes.db. If you manage to do this, you’re saved. To export the list of mailboxes, copy /var/lib/cyrus/mailboxes.db to /tmp, then run the following as user cyrus:

mail:/tmp$ /usr/cyrus/bin/ctl_mboxlist -d -f /tmp/mailboxes.db

This prints the users/mailboxes to stdout. Verify that the list looks right, then re-run the command and redirect the output to a file:

mail:/tmp$ /usr/cyrus/bin/ctl_mboxlist -d -f /tmp/mailboxes.db > /tmp/mboxlist.txt
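
Since the next step destroys the original database, it is worth checking the export file first. A small sketch, assuming the dump holds one mailbox per line; check_dump is a hypothetical helper name.

```shell
#!/bin/sh
# check_dump FILE
# Fail if the exported mailbox list is missing or empty; otherwise print
# how many lines (assumed to be one mailbox each) it contains.
check_dump() {
    file=$1
    [ -s "$file" ] || { echo "dump $file is missing or empty" >&2; return 1; }
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "$file: $lines mailbox entries"
}
```

Run check_dump /tmp/mboxlist.txt and compare the count with the number of records reported in the ctl_cyrusdb recovery log, if you have one. Keep a copy of the file outside /tmp as well.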

Then completely remove the /var/lib/cyrus directory. Recreate it with the following commands:

mail:/tmp# /usr/bin/cyrus/tools/mkimap
mail:/tmp# chown -R cyrus:mail /var/lib/cyrus
mail:/tmp# /usr/cyrus/bin/reconstruct -r -f user

Here, you’ll get no output, meaning it found no users. That’s normal, as we have only just created a clean new mailboxes.db. If you ran reconstruct as root, do not forget to immediately restore the ownership of the files:

mail:/tmp# chown -R cyrus:mail /var/lib/cyrus

Now we just have to rebuild the mailboxes.db file:

mail:/tmp# su - cyrus
mail:/tmp$ cat /tmp/mboxlist.txt | /usr/cyrus/bin/ctl_mboxlist -u
mail:/tmp$ /usr/cyrus/bin/reconstruct -r -f user

This time, you’ll get a list of the users found in the database.

Now try to start Cyrus again.

Comments

 
  1. Thank you, this was a saver for me.
    Just one remark:
    before
    /usr/cyrus/bin/ctl_mboxlist -d -f /tmp/mailboxes.db
    you have to copy /var/lib/cyrus/mailboxes.db to /tmp

    Pim

  2. Thanks very much, that was really, really helpful. I needed to delete /var/lib/cyrus/db/* and then rerun ctl_cyrusdb -r, and we are back.