Archive for the ‘Uncategorized’ Category

Helping Zabbix housekeeper to catch up

December 2, 2016 3 comments

A howling wind of nightmares
Howling through barren streets
Frozen in time
The city woke up – paralyzed
(At the Gates – At War with Reality)

Zabbix monitoring system is old generation system storing timeseries data in regular SQL database backend. This design decision has many drawbacks: performance issues and storage size being two most noticeable ones.

To tackle these problems, Zabbix divided performance metric data into two segments: history and trends. History data stores all values as they are received by sensors. After the expiration time, which can be set per individual sensor, Zabbix does averaging (trending) on hourly basis and discards history data. Hourly average data is called trends.

Backend process designed used for deleting old data is called “HouseKeeper”. HouseKeeper is not perfect unfortunately. It runs every HousekeepingFrequency hours and deletes MaxHousekeeperDelete entries in each pass.

What happens if HouseKeeper deletion process lasts longer then interval between two runs? Next run will skip cause two HouseKeeper processes can’t run simultaneously. This is like a government budget: you’re in deficit each year and cruft (external debt) just piles up.

This is especially noticeable if you run Zabbix datastore on a relatively slow block device…

There is an easy way to clean out Zabbix datastore, but it requires a short Zabbix downtime. Procudere goes like this:

  • stop zabbix server
  • backup, backup, backup !!!
  • copy schema of history table(s) to history_copy
  • copy data newer then last X days from history to history_copy
  • drop history table
  • rename history_copy to history
  • re-create indexes on history table (if you’re running PgSQL)
  • start zabbix server

As ever – if you’re doing something twice, it has to be automatized third time.

So, I created a script for Zabbix Servers running PostgreSQL backend that automates this job for me. Script is pretty simple so I’ll paste it here:


TABLES="history history_log history_old history_str history_text history_uint"

EPOCH_TIME=$(date --date '20 days ago' +'%s')

for table in $TABLES; do
    # remove old schema dumps
    rm -f ${table}.pre-data.sql ${table}.post-data.sql

    # dump schema
    pg_dump --section=pre-data  -t $table $DBNAME > ${table}.pre-data.sql
    pg_dump --section=post-data -t $table $DBNAME > ${table}.post-data.sql

    # create new table
    sed -r "s/\b${table}\b/${table}_copy/" ${table}.pre-data.sql \
    | psql $DBNAME -f -

    # copy data
    echo "INSERT INTO ${table}_copy \
          SELECT * FROM ${table} WHERE clock > ${EPOCH_TIME};" \
    | psql zabbix -f -

    # swap tables
    echo "ALTER TABLE ${table} RENAME TO ${table}_old;"  | psql zabbix -f -
    echo "ALTER TABLE ${table}_copy RENAME TO ${table};" | psql zabbix -f -

    # drop indexes from saved table and create new one
    sed 's/CREATE/DROP/' ${table}.post-data.sql \
    | sed 's/ ON.*/;/' \
    | psql $DBNAME -f -
    cat ${table}.post-data.sql | psql $DBNAME -f -

Note: Script works only on PostgreSQL backend.

Categories: Uncategorized
%d bloggers like this: