
Helping Zabbix housekeeper to catch up

A howling wind of nightmares
Howling through barren streets
Frozen in time
The city woke up – paralyzed
(At the Gates – At War with Reality)

Zabbix is an old-generation monitoring system that stores its time-series data in a regular SQL database backend. This design decision has several drawbacks, performance issues and storage size being the two most noticeable ones.

To tackle these problems, Zabbix divides performance metric data into two segments: history and trends. History stores all values exactly as they are received from sensors. After an expiration time, which can be set per individual sensor, Zabbix averages the history data on an hourly basis and discards it. This hourly averaged data is called trends.
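Conceptually, the trending step boils down to SQL along these lines. This is only an illustrative sketch, not Zabbix's actual internal query; the column names follow the stock trends schema, but the cutoff placeholder is hypothetical:

```sql
-- Hypothetical sketch of hourly trend aggregation (not Zabbix's real code):
INSERT INTO trends_uint (itemid, clock, num, value_min, value_avg, value_max)
SELECT itemid,
       clock - clock % 3600,          -- truncate the epoch timestamp to the hour
       count(*), min(value), avg(value), max(value)
FROM history_uint
WHERE clock < :expiration_cutoff      -- :expiration_cutoff is a placeholder for the per-item retention boundary
GROUP BY itemid, clock - clock % 3600;
```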

The backend process responsible for deleting old data is called the HouseKeeper. Unfortunately, the HouseKeeper is not perfect: it runs every HousekeepingFrequency hours and deletes at most MaxHousekeeperDelete entries in each pass.
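For reference, both knobs live in zabbix_server.conf. The values below are, to the best of my knowledge, the stock defaults, shown here only to illustrate the format:

```
# zabbix_server.conf (defaults shown, not a recommendation)
HousekeepingFrequency=1      # run housekeeping every 1 hour
MaxHousekeeperDelete=5000    # delete at most 5000 rows per task per cycle
```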

What happens if a HouseKeeper deletion pass lasts longer than the interval between two runs? The next run is skipped, because two HouseKeeper processes can't run simultaneously. This is like a government budget: you run a deficit each year and the cruft (external debt) just piles up.

This is especially noticeable if you run Zabbix datastore on a relatively slow block device…

There is an easy way to clean out the Zabbix datastore, but it requires a short Zabbix downtime. The procedure goes like this:

  • stop zabbix server
  • backup, backup, backup !!!
  • copy schema of history table(s) to history_copy
  • copy data newer than the last X days from history to history_copy
  • drop history table
  • rename history_copy to history
  • re-create indexes on history table (if you’re running PgSQL)
  • start zabbix server
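For a single table, the steps above translate to roughly the following SQL. This is a sketch assuming the default history schema; the 20-day window and the index definition are illustrative, so check your own schema dump before running anything:

```sql
-- Sketch of the manual procedure for the history table (PostgreSQL):
CREATE TABLE history_copy (LIKE history INCLUDING DEFAULTS);
INSERT INTO history_copy
    SELECT * FROM history
    WHERE clock > extract(epoch FROM now() - interval '20 days');
DROP TABLE history;
ALTER TABLE history_copy RENAME TO history;
-- re-create the indexes from your schema dump, e.g.:
CREATE INDEX history_1 ON history (itemid, clock);
```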

As ever: if you're doing something twice, it has to be automated the third time.

So I created a script that automates this job for Zabbix servers running the PostgreSQL backend. The script is pretty simple, so I'll paste it here:


#!/bin/bash
set -e

# Database name; adjust to match your setup
DBNAME="zabbix"

# History tables to clean; adjust the list to your Zabbix version
TABLES="history history_log history_old history_str history_text history_uint"

# Keep the last 20 days of data
EPOCH_TIME=$(date --date '20 days ago' +'%s')

for table in $TABLES; do
    # remove old schema dumps
    rm -f ${table}.pre-data.sql ${table}.post-data.sql

    # dump table definition (pre-data) and indexes/constraints (post-data)
    pg_dump --section=pre-data  -t $table $DBNAME > ${table}.pre-data.sql
    pg_dump --section=post-data -t $table $DBNAME > ${table}.post-data.sql

    # create a new, empty copy of the table
    sed -r "s/\b${table}\b/${table}_copy/" ${table}.pre-data.sql \
    | psql $DBNAME -f -

    # copy only the recent data into the new table
    echo "INSERT INTO ${table}_copy \
          SELECT * FROM ${table} WHERE clock > ${EPOCH_TIME};" \
    | psql $DBNAME -f -

    # swap tables; the old table is kept as a backup and can be dropped later
    echo "ALTER TABLE ${table} RENAME TO ${table}_old;"  | psql $DBNAME -f -
    echo "ALTER TABLE ${table}_copy RENAME TO ${table};" | psql $DBNAME -f -

    # drop the indexes (they stayed with the renamed table and still occupy
    # the original names), then re-create them on the new table
    sed 's/CREATE/DROP/' ${table}.post-data.sql \
    | sed 's/ ON.*/;/' \
    | psql $DBNAME -f -
    psql $DBNAME -f ${table}.post-data.sql
done

Note: the script works only on a PostgreSQL backend.
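To see what the index-handling pipeline at the end of the loop actually does, here is the transformation applied to a sample CREATE INDEX line, such as pg_dump's post-data section would emit (the index name here is illustrative):

```shell
# A sample CREATE INDEX line from a pg_dump post-data dump:
line='CREATE INDEX history_1 ON public.history USING btree (itemid, clock);'

# CREATE -> DROP, then cut everything from " ON" onward:
echo "$line" | sed 's/CREATE/DROP/' | sed 's/ ON.*/;/'
# prints: DROP INDEX history_1;
```

The resulting DROP INDEX statements free up the index names so the unmodified post-data dump can re-create them on the new table.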

  1. September 8, 2017 at 12:46 am

    Hi – this is a great help to me and I’ll try it out.

    One question – the script will rename the current table to table_old, but not drop it.

    Am I reading that correctly?

    • September 8, 2017 at 1:08 am

      You are correct, tables will get renamed, it’s just a backup measure. You can drop them later if everything works.

  2. September 13, 2017 at 5:11 am

    I have now run this (with a few small changes I needed, and after a lot of testing) on our prod zabbix install.

    Went from 101Gb DB to 20Gb. I am only keeping 7 days of data though in history(s).

    Super result, many thanks!
