15

We are running CentOS 6.4, and kipmi0 is showing 99.8% CPU and 0.0% memory usage, with a load average of 1.00. What should we do to fix this?

Anthon
  • 79,293
biz14
  • 471
  • you should start by reading this http://www-01.ibm.com/support/docview.wss?uid=nas7d580df3d15874988862575fa0050f604 – squareborg May 06 '13 at 19:31
  • 2
    I have read it before; it just says to ignore the issue. Should I just ignore it, even though my other machines are not having this issue? – biz14 May 06 '13 at 19:40
  • Are the other systems identical to this system? You're going to have to determine that they are. There has to be something that's fundamentally different between them. Firmware? Same RPM versions? – slm May 07 '13 at 01:08
  • Yes, there are two identical machines with the same CentOS 6.4. What should I look for now? – biz14 May 07 '13 at 03:44
  • Comparing the outputs from lshw and dmidecode would be my next step. – slm May 07 '13 at 18:00
  • I guess lshw needs to be installed, right? Do you want me to compare the output from both machines manually, or should I post the output here? – biz14 May 07 '13 at 19:18
  • Please post the files. Use something like http://pastebin.com/ so we can see all the files thus far. – slm May 07 '13 at 19:41
  • I have posted please see the comments to your answer. – biz14 May 08 '13 at 18:21

6 Answers

24

According to the IPMI Document:

this thread can use a lot of CPU depending on the interface's performance. This can waste a lot of CPU and cause various issues with detecting idle CPU and using extra power. To avoid this, the kipmid_max_busy_us sets the maximum amount of time, in microseconds, that kipmid will spin before sleeping for a tick. This value sets a balance between performance and CPU waste and needs to be tuned to your needs. Maybe, someday, auto-tuning will be added, but that's not a simple thing and even the auto-tuning would need to be tuned to the user's desired performance.

So we can run this command to set the kipmid_max_busy_us parameter:

echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us

On our system, after setting this parameter, the CPU usage of kipmi0 dropped to 15%.

You can try this.
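
It can help to see the old value before overwriting it. The following is only a minimal sketch, assuming the parameter is exposed at the sysfs path used in the command above. The function name set_kipmid_max_busy and its optional second argument (a different file, for a dry run) are made up for illustration and are not part of any tool.

```shell
# Hypothetical helper: show the old value, write the new one, show the result.
# The default path is the one from the echo command above; run as root.
set_kipmid_max_busy() {
    new_value=$1
    param_file=${2:-/sys/module/ipmi_si/parameters/kipmid_max_busy_us}

    # Accept only a non-negative integer (the value is in microseconds).
    case $new_value in
        ''|*[!0-9]*)
            echo "usage: set_kipmid_max_busy <microseconds> [file]" >&2
            return 2
            ;;
    esac

    # The file is missing if ipmi_si is not present, and unwritable without root.
    if [ ! -w "$param_file" ]; then
        echo "cannot write $param_file" >&2
        return 1
    fi

    echo "old: $(cat "$param_file")"
    echo "$new_value" > "$param_file"
    echo "new: $(cat "$param_file")"
}
```

Calling set_kipmid_max_busy 100 would then be equivalent to the echo command above, with the previous value printed first so it can be restored if needed.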

To make the change persistent, you can configure the options for the ipmi_si kernel module.
Create a file in /etc/modprobe.d/, e.g. /etc/modprobe.d/ipmi.conf, and add the following content:
# Prevent kipmi0 from consuming 100% CPU
options ipmi_si kipmid_max_busy_us=100

Now every time the ipmi_si kernel module is loaded into the kernel, the parameter should be automatically and correctly set.

Jeff Schaller
  • 67,283
d0ngw
  • 341
  • Although this may be the correct answer, it is considered best practice on SE sites to detail the reasoning as part of your answer, as well as quoting any external links. That way, if the external link becomes defunct, the logic and reasoning is still viewable here. – Drav Sloan Sep 16 '13 at 14:32
  • Is there a standard way of making that take effect permanently? – tgharold Sep 17 '13 at 13:27
  • On CentOS/RHEL, that command can be made permanent by adding it to /etc/rc.d/rc.local. The rc.local runs after all of the other init scripts. – tgharold Sep 17 '13 at 13:33
6

Debugging the issue

Are the other systems identical to this system? You're going to have to determine that they are. There has to be something that's fundamentally different between them. Firmware? Same RPM versions?

You can use tools such as lshw and dmidecode, and look at the dmesg log, for clues about what's different and what the root cause is.

I'd get a good baseline of the installed RPMs by running this command on both a system that isn't exhibiting the issue and the one that is, then compare the package lists to make sure they're all at the same versions.

 # machine #1
 $ rpm -aq | sort -rn > machine1_rpms.txt

 # machine #2
 $ rpm -aq | sort -rn > machine2_rpms.txt     

Then copy both files to the same machine and run sdiff on them:

 sdiff machine1_rpms.txt machine2_rpms.txt
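
If the side-by-side output is hard to scan, the differences can also be split into two plain lists and saved to a file. This is a rough sketch using sort and comm rather than sdiff. The function name compare_rpm_lists is hypothetical, and the file names in the usage note are the ones from the commands above.

```shell
# Hypothetical helper: print the packages that appear in only one of two lists.
compare_rpm_lists() {
    list1=$1
    list2=$2
    tmpdir=$(mktemp -d) || return 1

    # comm needs both inputs sorted the same way, so re-sort them here
    # regardless of how the original lists were ordered.
    LC_ALL=C sort "$list1" > "$tmpdir/a"
    LC_ALL=C sort "$list2" > "$tmpdir/b"

    echo "# only in $list1"
    LC_ALL=C comm -23 "$tmpdir/a" "$tmpdir/b"
    echo "# only in $list2"
    LC_ALL=C comm -13 "$tmpdir/a" "$tmpdir/b"

    rm -rf "$tmpdir"
}
```

For example, compare_rpm_lists machine1_rpms.txt machine2_rpms.txt > rpm_differences.txt writes both lists into one file. A package installed at different versions shows up once in each list.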

Potential cause #1

The IBM website had a technote on this issue titled "Kipmi0 May Show Increased CPU Utilization on Linux". According to this technote, you can essentially ignore the problem.

Description of issue

The kipmi0 process may show increased CPU utilization in Linux. The utilization may increase up to 100% when the IPMI (Intelligent Platform Management Interface) device, such as a BMC (Baseboard Management Controller) or IMM (Integrated Management Controller) is busy or non-responsive.

Fix

No fix required. You should ignore increased CPU utilization as it has no impact on actual system performance.

Work-around

  1. If using an IPMI device, reset the BMC or reboot the system.
  2. If not using an IPMI device, stop the IPMI service by issuing the following command:

    service ipmi stop

Potential cause #2

I found a post on someone's blog simply titled "kipmi0 problem". The problem it describes sounded identical to yours. It was traced to 2 kernel modules that were getting loaded as part of the lm_sensors package.

These were the 2 kernel modules:

  • ipmi_si
  • ipmi_msghandler
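
Before removing anything, it is worth confirming that these are actually loaded as modules on the affected machine. A minimal sketch, assuming the usual /proc/modules format where the first field of each line is the module name; the function name module_loaded is made up for illustration, and its second argument exists only so the check can be pointed at a saved copy of the file.

```shell
# Hypothetical helper: succeed if the named module is listed in /proc/modules
# (the same data lsmod prints).
module_loaded() {
    name=$1
    modules_file=${2:-/proc/modules}
    # Match the module name exactly against the first field of each line.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}
```

For example, module_loaded ipmi_si && echo "ipmi_si is loaded". A driver that is compiled into the kernel image is not listed in /proc/modules, so a negative result only means it is not loaded as a separate module.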

Work-around

You can manually remove these with the following commands (ipmi_si first, because it depends on ipmi_msghandler):

rmmod ipmi_si
rmmod ipmi_msghandler

To make this fix permanent, you'll need to disable the loading of these particular kernel modules within one of the lm_sensors configuration files, by commenting them out like so:

# /etc/sysconfig/lm_sensors
# MODULE_0=ipmi-si
# MODULE_1=ipmisensors
# MODULE_2=coretemp

Restart lm_sensors after making these changes:

/etc/init.d/lm_sensors restart
Stefan Lasiewski
  • 19,754
slm
  • 369,824
  • I have been to both websites, and on my system I can't find the file /etc/sysconfig/lm_sensors. Something funny: when I do the sort, the first file is ascending but the second file is descending? Secondly, how do I output the differences into a file? Yes, I can see quite a number of differences too. – biz14 May 07 '13 at 04:07
  • Yes, the second time I ran it, it was sorted in descending order as expected. I don't understand how to use the grep "|". What else should I do to fix this problem? – biz14 May 07 '13 at 12:05
  • All I was saying was to do this: sdiff machine1_rpms.txt machine2_rpms.txt | grep "|" will pull out all the differences b/w the 2 .txt files. There are other ways to do it but that's one way. – slm May 07 '13 at 14:00
  • I ran this command and here is the output sdiff 12_rpms.txt 11_rpms.txt | grep "|" perl-DBI-1.609-4.el6.x86_64 | perl-Digest-SHA-5.47-131.el6_4.x86_64 . The 12_rpms is the problem machine and the other one is without the issue. But when I manually look 12_rpms have 247 lines and 11_rpms have 263 but the sdiff is just one? So what should be my next step now based on this difference? – biz14 May 07 '13 at 17:56
  • Please post these files as well on http://pastebin.com/. – slm May 07 '13 at 19:41
  • Here is the link for the machine with no problem for the rpms list http://pastebin.com/Csu51LBr and for the machine with problem http://pastebin.com/2f6jis2P – biz14 May 08 '13 at 17:55
  • Here is the link for the machine with no problem for the dmidecode list http://pastebin.com/agW3R65J and for the machine with problem http://pastebin.com/XZCBiQKM . For lshw I am having an issue: on the machine with the kipmi0 problem, yum install lshw says there is no package to install. How do I solve this? – biz14 May 08 '13 at 18:20
  • Rebooting the service processor on the machine worked for me. I'm also running CentOS 6.4 and the machine is a Sun Fire X4140. The kipmi0 process went down to almost zero per top. – Banjer Sep 16 '13 at 13:54
  • Using ipmitool bmc reset cold doesn't fix this for me, nor does resetting from the web interface :( – Stefan Lasiewski Apr 14 '14 at 18:35
1

kipmi0 can be disabled entirely on CentOS 6 by adding ipmi_si.force_kipmid=0 as a kernel parameter.

Test this at the GRUB boot screen by highlighting the kernel you want to boot, pressing 'a' to modify its parameters, and appending ipmi_si.force_kipmid=0.

Make it permanent by appending ipmi_si.force_kipmid=0 to the relevant kernel lines in /boot/grub/grub.conf.

NOTE: In distros that have ipmi_si as a separate kernel module, using a modprobe.d conf file is more appropriate. In CentOS ipmi_si is built in to the kernel image, so modprobe configs do not work.
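
After rebooting, one way to confirm that the parameter really reached the kernel is to look at /proc/cmdline. A small sketch, assuming parameters on the command line are separated by spaces; the function name cmdline_has is hypothetical, and the optional second argument is only there so the check can be run against a saved copy.

```shell
# Hypothetical helper: succeed if the exact parameter is on the kernel command line.
cmdline_has() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each word on its own line, then look for an exact, literal match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, cmdline_has ipmi_si.force_kipmid=0 && echo "parameter is active".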

Dev
  • 111
1

CentOS 6 has the IPMI driver compiled into the kernel. If you do not need IPMI support, just disable it by adding the following to the kernel line in grub.conf:

ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0
1

I found the following helps with this issue:

ipmitool bmc info

This seems to wake up IPMI and then it stops using 100% of a core.

I also found the following helpful:

echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us

In the past I have also been able to resolve the 100% CPU usage on some servers with:

ipmitool lan print

and

ipmitool bmc reset cold

but in my most recent experience these commands just caused ipmitool to hang, forcing me to interrupt it with Ctrl+C.
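
If ipmitool tends to hang like this, it can be run under a time limit instead of being interrupted by hand. A sketch assuming the timeout command from GNU coreutils is installed; the wrapper name run_limited is made up for illustration.

```shell
# Hypothetical wrapper: give up on a command after a number of seconds.
# GNU timeout exits with status 124 when the time limit is reached.
run_limited() {
    seconds=$1
    shift
    timeout "$seconds" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${seconds}s: $*" >&2
    fi
    return "$status"
}
```

For example, run_limited 10 ipmitool bmc info gives up after ten seconds. The 10 second limit is an arbitrary choice and may need adjusting for a slow BMC.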

Hopefully this helps someone.

0

I ran into this on CentOS 7 while trying to figure out what was using the CPU.

For me, it was Supermicro's "ipmicfg" running from a script I wrote or something. I just pkilled it and the kipmi0 usage went away.

Locane
  • 125