

How to fix hp-health errors on CentOS 6.3

1. # /etc/init.d/mcelogd stop

 

2. # /etc/init.d/hp-health start

 

3. # /etc/init.d/mcelogd start

 

4. # /etc/init.d/mcelogd status
Checking for mcelog
mcelog (pid  13879) is running...


5. # vi /etc/init.d/hp-health

# chkconfig: 2345 91 2
Change the line above to the following:

# chkconfig: 2345 31 2

 

6. # /sbin/chkconfig --del hp-health

 

7. # /sbin/chkconfig --add hp-health

 

----- Original article -----

 

Fix for hp-health on DL100 series running CentOS6

I have an HP DL160 G6 server running CentOS 6.2. I installed hp-health on it, which is common practice for getting useful HP hardware insights.

The installation was successful, however when I tried to start the hp-health service the following Segmentation fault error occurred.

/etc/init.d/hp-health: line 666: 11390 Segmentation fault (core dumped) $PNAME $PARGS < /dev/null >> $LOGFILE 2>&1

Snapshots:

[root@gagan ~]# /etc/init.d/hp-health start
  Using Proliant Standard
     IPMI based 1XX System Health Monitor
  Using standard Linux IPMI device driver
Starting ipmi drivers:                                     [  OK  ]
  Starting Proliant Standard
     IPMI based 1XX System Health Monitor (hpasmpld):
/etc/init.d/hp-health: line 666: 11390 Segmentation fault      (core dumped) $PNAME $PARGS < /dev/null >> $LOGFILE 2>&1

The following is what I could find in the /var/log/messages:

Aug 16 19:45:52 gagan hpasmpld[11298]: ehpsmb_parse_SMBIOS: SMBIOSInitTable was not successful.
Aug 16 19:45:52 gagan kernel: hpasmpld:11298 map pfn expected mapping type uncached-minus for 9e000-a0000, got write-back
Aug 16 19:45:52 gagan kernel: hpasmpld[11298]: segfault at 0 ip 0000000000414918 sp 00007fff09822e18 error 4 in hpasmpld[400000+2a000]
Aug 16 19:45:52 gagan abrt[11299]: saved core dump of pid 11298 (/opt/hp/hp-health/bin/hpasmpld) to /var/spool/abrt/ccpp-2012-08-16-19:45:52-11298.new/coredump (516096 bytes)
Aug 16 19:45:52 gagan abrtd: Directory 'ccpp-2012-08-16-19:45:52-11298' creation detected
Aug 16 19:45:52 gagan abrtd: Package 'hp-health' isn't signed with proper key
Aug 16 19:45:52 gagan abrtd: Corrupted or bad dump /var/spool/abrt/ccpp-2012-08-16-19:45:52-11298 (res:2), deleting

I tried many combinations of configuration changes, but could not get the hp-health service to start.

I finally came across the following temporary workaround for hp-health on DL100 series servers running CentOS6.

Fix for hp-health on DL100 series running CentOS6

The problem is that hp-health tries to read a memory block that another process is already reading. The process in question is mcelogd.

A fix for this problem is to stop mcelogd.

[root@gagan ~]# /etc/init.d/mcelogd stop

Output:

[root@root ~]# /etc/init.d/mcelogd stop
Stopping mcelog
[root@root ~]#                                  [  OK  ]

And then start the hp-health service.

[root@root ~]# /etc/init.d/hp-health start

Output:

[root@gagan ~]# /etc/init.d/hp-health start
  Using Proliant Standard
 	IPMI based 1XX System Health Monitor
  Using standard Linux IPMI device driver
Starting ipmi drivers:                                     [  OK  ]
  Starting Proliant Standard
 	IPMI based 1XX System Health Monitor (hpasmpld): 
                                                           [  OK  ]
[root@gagan ~]# /etc/init.d/hp-health status
  Using Proliant Standard
 	IPMI based 1XX System Health Monitor
  Using standard Linux IPMI device driver
  
ipmi_msghandler module loaded.
ipmi_si module loaded.
ipmi_devintf module loaded.
/dev/ipmi0 exists.
  
  (hpasmpld) is running...                                 [  OK  ]

Now start mcelogd

[root@gagan ~]# /etc/init.d/mcelogd start

Output:

[root@gagan ~]# /etc/init.d/mcelogd start
Starting mcelog daemon
[root@gagan ~]# /etc/init.d/mcelogd status
Checking for mcelog
mcelog (pid  23478) is running...
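
The stop, start, start sequence above can be wrapped in a small shell function. This is only a minimal sketch and not part of the original post: the function name start_hp_health_workaround and its optional directory argument are made up for illustration, and it assumes both init scripts live in /etc/init.d as shown above.

```shell
#!/bin/sh
# Sketch: start hp-health while mcelogd is stopped, then bring mcelogd back.
# The optional first argument (init script directory) is a hypothetical
# convenience; it defaults to /etc/init.d.
start_hp_health_workaround() {
    init_dir="${1:-/etc/init.d}"

    # Stop mcelogd first so it does not hold the memory block hp-health reads.
    "$init_dir/mcelogd" stop || return 1

    # Remember the hp-health result so mcelogd is restarted either way.
    "$init_dir/hp-health" start
    rc=$?

    "$init_dir/mcelogd" start || return 1
    return "$rc"
}
```

Run as root, calling start_hp_health_workaround with no arguments uses the real init scripts. The function returns non-zero if hp-health failed to start, but still restarts mcelogd in that case.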

A more permanent fix, so that the problem does not come back after the next reboot, is to change the startup order of these two services:

Modify the file /etc/init.d/hp-health

[root@gagan ~]# vim /etc/init.d/hp-health

Change the following line in the file:

# chkconfig: 2345 91 2

TO

# chkconfig: 2345 31 2
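
If you prefer not to edit the file by hand, the same change can be made with sed. The following is a sketch under stated assumptions: the function name set_start_priority is hypothetical, the header is assumed to have exactly the "# chkconfig: <runlevels> <start> <stop>" form shown above, and a copy of the original script is kept with a .bak suffix.

```shell
#!/bin/sh
# Sketch: rewrite only the start priority in an init script's chkconfig header.
# Usage: set_start_priority <init-script> <new-start-priority>
set_start_priority() {
    script="$1"
    new_start="$2"

    # Keep a backup; writing back through a redirect preserves the
    # permissions of the original file.
    cp -p "$script" "$script.bak" || return 1

    # \1 = runlevels, \2 = stop priority; only the middle number changes.
    sed "s/^# chkconfig: \([0-9-][0-9-]*\) [0-9][0-9]* \([0-9][0-9]*\)/# chkconfig: \1 $new_start \2/" \
        "$script.bak" > "$script"
}
```

For the change described here the call would be set_start_priority /etc/init.d/hp-health 31, run as root. The new header only takes effect once the service is re-registered with chkconfig.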

Remove the hp-health service from chkconfig and add it again so the new priority is picked up.

[root@gagan ~]# /sbin/chkconfig --del hp-health
[root@gagan ~]# /sbin/chkconfig --add hp-health

Ensure that the startup priority for hp-health is higher (lower in the SXX number) in comparison to mcelogd.

[root@gagan ~]# ls -lah /etc/rc*.d/ | grep -E "hp-health|mcelogd"
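
The same check can be scripted instead of read by eye. This is a rough sketch, not something from the original post: the function name starts_before is hypothetical, it assumes the usual SNNname link naming in the rc directories, and the service names are used as-is inside a regular expression.

```shell
#!/bin/sh
# Sketch: succeed if service <first> has a lower SNN start number than
# service <second> in the given rc directory.
# Returns 2 if either start link is missing.
starts_before() {
    rc_dir="$1"
    first="$2"
    second="$3"

    # Extract the two-digit number from links such as S31hp-health.
    a=$(ls "$rc_dir" | sed -n "s/^S\([0-9][0-9]\)$first\$/\1/p" | head -n 1)
    b=$(ls "$rc_dir" | sed -n "s/^S\([0-9][0-9]\)$second\$/\1/p" | head -n 1)

    [ -n "$a" ] && [ -n "$b" ] || return 2
    [ "$a" -lt "$b" ]
}
```

For example, starts_before /etc/rc3.d hp-health mcelogd && echo "order OK" checks runlevel 3. Repeat for each runlevel you care about.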

 

 

Source: http://gaganonthenet.com/2012/08/22/fix-for-hp-health-on-dl100-series-running-centos6/