A few days ago I updated the mod_archive (XEP-0136) ejabberd module to support PostgreSQL with its text search feature (tsearch2). Simple benchmarks looked good, so to test it on real traffic I ran it with automated archiving enabled on one of the two jabber.ru nodes for 24 hours on a business day. It ran smoothly, and I didn't notice any performance degradation or increase in CPU load.
Here are some results:
Database size: 197,096,948 bytes
Messages: 474,562
Collections: 39,836
The other node should show similar results, so enabling automated archiving by default on jabber.ru would require about 400 MB/day, or roughly 120-140 GB/year (I don't have weekend statistics yet).
Only "normal" and "chat" message types were handled. Note that messages between two users on this node were counted twice.
On average, each message takes 415 bytes on disk.
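The arithmetic behind these estimates can be reproduced with a short Python sketch. It uses only the figures reported above; the assumptions that the second node produces the same volume and that every day is as busy as the measured business day are simplifications, so the yearly figure is an upper bound rather than a forecast.

```python
# Figures measured on one node over 24 hours (taken from the results above).
DB_SIZE_BYTES = 197_096_948
MESSAGES = 474_562

# Assumption: the second jabber.ru node produces the same volume.
NODES = 2

# Average on-disk cost of a single archived message.
bytes_per_message = DB_SIZE_BYTES / MESSAGES

# Daily growth for the whole service, in bytes and in decimal megabytes.
daily_bytes = DB_SIZE_BYTES * NODES
daily_mb = daily_bytes / 10**6

# Upper bound for a year: assumes all 365 days match the measured business day.
yearly_gb_upper = daily_bytes * 365 / 10**9

print(f"bytes per message: {bytes_per_message:.0f}")
print(f"daily growth: {daily_mb:.0f} MB")
print(f"yearly upper bound: {yearly_gb_upper:.0f} GB")
```

The upper bound comes out slightly above the 120-140 GB range quoted earlier, which is consistent with that range allowing for quieter weekends.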
A collection is a set of messages between the same user and full JID in which the interval between consecutive messages is no more than 15 minutes. The biggest collection contains 1,657 messages and spans 6 hours 45 minutes. :)
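To illustrate the grouping rule, here is a minimal Python sketch that splits the message timestamps of one (user, full JID) pair into collections. It is not the mod_archive code (which is written in Erlang); the function name `split_into_collections` and the sample timestamps are made up for this example.

```python
from datetime import datetime, timedelta

# Maximum allowed gap between consecutive messages of one collection.
MAX_GAP = timedelta(minutes=15)


def split_into_collections(timestamps, max_gap=MAX_GAP):
    """Group message timestamps of one (user, full JID) pair into collections.

    Hypothetical helper: a new collection starts whenever the gap to the
    previous message is longer than max_gap.
    """
    collections = []
    for ts in sorted(timestamps):
        if collections and ts - collections[-1][-1] <= max_gap:
            # Close enough to the previous message: same collection.
            collections[-1].append(ts)
        else:
            # First message, or the gap is too long: open a new collection.
            collections.append([ts])
    return collections


# Arbitrary sample data: minutes after an arbitrary starting point.
base = datetime(2007, 1, 1, 12, 0)
stamps = [base + timedelta(minutes=m) for m in (0, 5, 20, 36, 40)]
groups = split_into_collections(stamps)
print([len(g) for g in groups])  # [3, 2]
```

In the sample, the 15-minute gap between the second and third messages still counts as the same collection, while the 16-minute gap that follows starts a new one.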
The numbers of incoming and outgoing messages are almost equal:
Messages sent: 241,044
Messages received: 233,518
Soon I’ll publish more detailed statistics about the collected messages.