Archive for March, 2013

rsyslog filtering iptables from messages

March 25, 2013

Where did we come from?
Why are we here?
Where do we go when we die?

(Dream Theater – Spirit carries on)

Every now and then I face the problem of iptables DoS-ing my log files. When I can, I use the ULOG target and leave iptables logs to ulogd. But sometimes ulogd is not an option, for example on shared OpenVZ hosting. So today I decided to harness the power of rsyslogd.

Rsyslogd is the new standard logging daemon on major Linux distributions, where it replaced the old sys(k)logd. It has a powerful built-in regex and scripting engine, which can be used for many cool things.

So, to solve the problem of iptables logs, let’s first mark them somehow, so that we can recognize them later. Here are example rules that generate logs:

-A INPUT   -j LOG --log-level info --log-prefix "iptables INPUT   DROP: "
-A FORWARD -j LOG --log-level info --log-prefix "iptables FORWARD DROP: "

Of course, this is written in ‘iptables-save/restore’ format.

Obviously, we can recognize a log entry by the word ‘iptables’.

Now, let’s add the following to rsyslog.conf:

:msg, startswith, "iptables"				/var/log/iptables
&~

Note that these two lines have to come before the ‘/var/log/messages’ entry to take effect.

The first line directs rsyslog to write all messages that start with “iptables” to /var/log/iptables, and the second line discards those messages. Because rsyslog processes its rules from top to bottom, that discard is what keeps iptables noise out of every log file configured later in rsyslog.conf.
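To make the ordering point concrete, here is a toy model in Python of "write, then discard". It is only a sketch of the idea, not how rsyslog is implemented: the `route` function, the rule tuples and the sample messages are all made up for illustration.

```python
# Toy model of top-to-bottom rule processing with a discard step.
# This is NOT rsyslog code; names and sample messages are hypothetical.

def route(messages, rules):
    """Apply rules in order and return {destination: [messages]}.

    Each rule is a tuple (matches, destination, discard_after).
    """
    files = {}
    for msg in messages:
        for matches, dest, discard_after in rules:
            if not matches(msg):
                continue
            files.setdefault(dest, []).append(msg)
            if discard_after:
                # Plays the role of the "&~" line: stop processing this message.
                break
    return files


RULES = [
    (lambda m: m.startswith("iptables"), "/var/log/iptables", True),
    (lambda m: True, "/var/log/messages", False),
]

# Made-up sample messages, only for the demonstration.
SAMPLE = [
    "iptables INPUT   DROP: IN=eth0 SRC=192.0.2.1",
    "sshd[123]: Accepted publickey for root",
]

if __name__ == "__main__":
    for dest, msgs in route(SAMPLE, RULES).items():
        print(dest, msgs)
```

If you reverse `RULES`, the iptables message lands in both files, which is exactly why the two lines must sit above the ‘/var/log/messages’ entry. Also worth knowing: newer rsyslog releases flag `&~` as deprecated and suggest `& stop` instead.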

Save rsyslog.conf and restart the daemon, and that’s it!

Linux reaction to AMD Phenom X4 overheating

March 10, 2013

I’ll beat you with your spinal cord
Split your skull in two
I’ll feast on your intestines
There’s nothing I can’t do

(Iced Earth – Violator)

The next day, my SSH console greeted me with some nice and unexpected output:

Message from syslogd@mon at Mar  7 20:51:28 ...
 kernel:[Hardware Error]: MC4_STATUS[-|CE|MiscV|-|AddrV|CECC]: 0x9c5c40e1011c011b

Message from syslogd@mon at Mar  7 20:51:28 ...
 kernel:[Hardware Error]: Northbridge Error (node 0, core 3): L3 ECC data cache error.

Message from syslogd@mon at Mar  7 20:51:28 ...
 kernel:[Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

L3 is a CPU cache, so I guessed my CPU had gone berserk, or at least one of its cores had. It’s a 3GHz AMD Phenom(tm) II X4 945 processor, with 4 cores. I have heard stories about 3-core CPUs being 4-core parts with one core locked, because that core showed some instability during the manufacturing process. So I thought maybe this CPU was on the borderline of being marked as a 3-core? But to my surprise, a new error appeared, this time saying “node0, core0”.

This is how the error looks in /var/log/messages:

Mar  7 06:51:28 node kernel: [Hardware Error]: MC4_STATUS[Over|CE|MiscV|-|AddrV|CECC]: 0xdc5c40e0011c017b
Mar  7 06:51:28 node kernel: [Hardware Error]: Northbridge Error (node 0): L3 ECC data cache error.
Mar  7 06:51:28 node kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: EV
Mar  7 06:51:28 node kernel: [Hardware Error]: Machine check events logged
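The flags inside MC4_STATUS[...] are just bits of the hex value printed next to them. Below is a small Python sketch that rebuilds the flag string from the two status words above. The bit positions are my reading of how the kernel's AMD machine-check decoder prints this line, so treat them as assumptions; `decode_status` is a made-up helper name.

```python
# Sketch: rebuild the "MC4_STATUS[...]" flag string from the raw status word.
# Bit positions are assumptions based on the kernel's AMD MCE decoder output.

OVER = 1 << 62   # an earlier error was overwritten
UC = 1 << 61     # uncorrected error (clear means "CE", corrected)
MISCV = 1 << 59  # MISC register holds valid data
ADDRV = 1 << 58  # ADDR register holds a valid address
PCC = 1 << 57    # processor context corrupt


def decode_status(status):
    """Return the flag string in the same order the kernel prints it."""
    ecc = (status >> 45) & 0x3  # bits 46:45 carry the ECC error type
    flags = [
        "Over" if status & OVER else "-",
        "UE" if status & UC else "CE",
        "MiscV" if status & MISCV else "-",
        "PCC" if status & PCC else "-",
        "AddrV" if status & ADDRV else "-",
    ]
    if ecc:
        flags.append("CECC" if ecc == 2 else "UECC")
    return "|".join(flags)


if __name__ == "__main__":
    for word in (0x9c5c40e1011c011b, 0xdc5c40e0011c017b):
        print(hex(word), decode_status(word))
```

Both words decode to a corrected ECC error (CE, CECC). The second one additionally has the Over bit set, meaning an earlier error was overwritten before it could be read.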

A friend of mine had an idea: maybe everything is OK with the hardware, but it’s overheating?! Well, nice idea, let’s check:

fan1:       3139 RPM  (min =    0 RPM)
fan2:          0 RPM  (min =    0 RPM)
fan3:          0 RPM  (min =    0 RPM)
fan5:          0 RPM  (min =    0 RPM)
temp1:       +32.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +77.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:       +68.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
cpu0_vid:   +1.050 V

So off we went to clean out the dust. After vacuuming, the temperatures dropped sharply, as did the RPM of the CPU fan:

fan1:       2410 RPM  (min =    0 RPM)
fan2:          0 RPM  (min =    0 RPM)
fan3:          0 RPM  (min =    0 RPM)
fan5:          0 RPM  (min =    0 RPM)
temp1:       +30.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +46.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:       +43.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
cpu0_vid:   +1.050 V
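If you want the before/after difference as numbers, a few lines of Python can pull the temperatures out of the two dumps. This is just a sketch that assumes the line layout shown above; `parse_temps` and the regex are my own and not part of any sensor tool.

```python
import re

# Matches lines such as "temp2:       +77.0°C  (low  = ...)".
# The layout is assumed from the readings quoted in this post.
TEMP_LINE = re.compile(r"^(temp\d+):\s+\+?(-?\d+(?:\.\d+)?)°C")


def parse_temps(text):
    """Return {sensor_name: degrees_celsius} for every tempN line."""
    temps = {}
    for line in text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            temps[match.group(1)] = float(match.group(2))
    return temps


BEFORE = """\
temp1:       +32.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +77.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:       +68.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
"""

AFTER = """\
temp1:       +30.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:       +46.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:       +43.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
"""

if __name__ == "__main__":
    before, after = parse_temps(BEFORE), parse_temps(AFTER)
    for name in sorted(before):
        drop = before[name] - after[name]
        print(f"{name}: {before[name]:.1f} -> {after[name]:.1f} (drop {drop:.1f})")
```

For the readings above, that is a drop of 31 degrees on temp2 (the thermal diode) and 25 degrees on temp3, while temp1 barely moved.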

I guess that was it: 4 days and no more errors from the CPU…
