Message Archiving Benchmark: General Statistics

A few days ago I updated the mod_archive (XEP-0136) ejabberd module to support PostgreSQL with its text search feature (tsearch2). Simple benchmarks looked good, so to test it on real traffic I ran it with automated archiving enabled on one of two nodes for 24 hours on a business day. It worked smoothly, and I didn’t notice any performance degradation or higher CPU load.

Here are some results:

Database size: 197,096,948 bytes
Messages: 474,562
Collections: 39,836

The other node should show the same results, so enabling automated archiving by default would require about 400 MB/day, or roughly 120-140 GB/year (I don’t have weekend statistics).
Only “normal” and “chat” message types were handled. Note that messages between two users on this node were counted twice.
Each message requires 415 bytes on disk on average.
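
The per-message figure and the storage projection follow directly from the numbers above. Here is a minimal sketch of that arithmetic in Python. It assumes the second node produces the same volume as the measured one. The `weekend_factor` parameter and the 261/104 weekday/weekend split are hypothetical, since weekend traffic was not measured.

```python
# Figures reported for one node over 24 hours on a business day.
DB_SIZE_BYTES = 197_096_948
MESSAGES = 474_562
COLLECTIONS = 39_836
NODES = 2  # assumption: the second node carries the same load

bytes_per_message = DB_SIZE_BYTES / MESSAGES
messages_per_collection = MESSAGES / COLLECTIONS
daily_bytes = DB_SIZE_BYTES * NODES


def yearly_gb(weekend_factor: float) -> float:
    """Project yearly storage in GB (10**9 bytes).

    weekend_factor is a hypothetical knob: the weekend load as a
    fraction of the measured business-day load (0.0 = no traffic,
    1.0 = same as a business day).
    """
    weekdays, weekend_days = 261, 104  # assumed split of a 365-day year
    equivalent_days = weekdays + weekend_days * weekend_factor
    return daily_bytes * equivalent_days / 1e9


print(f"bytes per message:       {bytes_per_message:.0f}")
print(f"messages per collection: {messages_per_collection:.1f}")
print(f"daily, both nodes:       {daily_bytes / 1e6:.0f} MB")
print(f"yearly, quiet weekends:  {yearly_gb(0.0):.0f} GB")
print(f"yearly, busy weekends:   {yearly_gb(1.0):.0f} GB")
```

The two yearly values only bracket the projection. The real figure depends on the weekend load, which this benchmark did not cover.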

A collection is a set of messages between the same user and a full JID in which the interval between consecutive messages is no more than 15 minutes. The biggest collection contains 1657 messages and spans 6 hours 45 minutes. :)
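
The grouping rule can be illustrated with a short Python sketch. This is not the mod_archive code (the module is written in Erlang). It only shows the 15-minute rule as described above, applied to the message timestamps of a single user and full JID pair. The function name `split_into_collections` and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=15)


def split_into_collections(timestamps, max_gap=MAX_GAP):
    """Split the timestamps of one conversation into collections.

    A message joins the current collection when it arrives no more
    than max_gap after the previous message; otherwise it starts a
    new collection.
    """
    collections = []
    for ts in sorted(timestamps):
        if collections and ts - collections[-1][-1] <= max_gap:
            collections[-1].append(ts)
        else:
            collections.append([ts])
    return collections


# Hypothetical sample: messages at minutes 0, 5, 19, 40 and 41.
start = datetime(2000, 1, 1, 9, 0)
stamps = [start + timedelta(minutes=m) for m in (0, 5, 19, 40, 41)]
groups = split_into_collections(stamps)
print([len(g) for g in groups])
```

Here the 21-minute pause between minutes 19 and 40 starts a second collection, so the sample splits into groups of 3 and 2 messages.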

The numbers of incoming and outgoing messages are almost equal:

Messages sent: 241,044
Messages received: 233,518
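
As a quick sanity check, the two counters should add up to the total message count reported earlier. A small Python sketch using only the numbers from this post:

```python
SENT = 241_044
RECEIVED = 233_518
TOTAL_MESSAGES = 474_562  # total reported in the results above

# The two directions should account for every archived message.
counts_match = SENT + RECEIVED == TOTAL_MESSAGES

sent_share = SENT / TOTAL_MESSAGES
difference = SENT - RECEIVED

print(f"counts match total: {counts_match}")
print(f"sent share:         {sent_share:.1%}")
print(f"difference:         {difference} messages")
```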

Soon I’ll publish more detailed statistics about the collected messages.
