이것~저것~

Red Hat Enterprise Linux systems using the TSC clock source reboot or panic when 'sched_clock()' overflows after an uptime of 208.5 days

말괄량이현이 2015. 9. 8. 15:55

This is a bug related to an uptime of 208.5 days.

Issue

Linux Kernel panics when sched_clock() overflows after an uptime of around 208.5 days
Red Hat Enterprise Linux system reboots with sched_clock() overflow after an uptime of around 208.5 days
This symptom may happen on a system using the Time Stamp Counter (TSC) clock source
Some processes may generate the following log message:

BUG: soft lockup - CPU#N stuck for 4278190091s!
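One way to check whether a system has already logged this symptom is to search the syslog for the message pattern. The sketch below is illustrative only: the helper name `is_soft_lockup_line` and the `LOG_FILE` variable are hypothetical, and `/var/log/messages` is assumed as the log location (the usual syslog path on RHEL 5 and 6).

```shell
#!/bin/sh
# Pattern for the soft lockup message quoted in the article.
SOFT_LOCKUP_PATTERN='BUG: soft lockup - CPU#[0-9]+ stuck for [0-9]+s!'

# Hypothetical helper: succeeds if the given line contains a soft lockup message.
is_soft_lockup_line() {
    printf '%s\n' "$1" | grep -Eq "$SOFT_LOCKUP_PATTERN"
}

# Assumed log location; override by setting LOG_FILE.
log_file=${LOG_FILE:-/var/log/messages}
if [ -r "$log_file" ]; then
    grep -E "$SOFT_LOCKUP_PATTERN" "$log_file" \
        || echo "no soft lockup messages found in $log_file"
fi
```

An implausibly large "stuck for" value, such as the one in the sample message, suggests a corrupted time calculation rather than a CPU that was genuinely stuck for that long.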

Environment

Red Hat Enterprise Linux (RHEL) 6

Red Hat Enterprise Linux 6.0 (some kernels)
Red Hat Enterprise Linux 6.1 (some kernels)
Red Hat Enterprise Linux 6.2 (some kernels)
With TSC clock source

Red Hat Enterprise Linux (RHEL) 5

Red Hat Enterprise Linux 5.0
Red Hat Enterprise Linux 5.1
Red Hat Enterprise Linux 5.2
Red Hat Enterprise Linux 5.3 (some kernels)
Red Hat Enterprise Linux 5.4
Red Hat Enterprise Linux 5.5
Red Hat Enterprise Linux 5.6 (some kernels)
Red Hat Enterprise Linux 5.7
Red Hat Enterprise Linux 5.8 (some kernels)
With TSC clock source

Red Hat Enterprise MRG 1.3 Realtime kernel
An uptime of approximately 208.5 days

Resolution

Red Hat Enterprise Linux 6

RHEL 6.x

Update to kernel-2.6.32-279.el6 or later (RHSA-2012-0862)
This kernel is already part of Red Hat Enterprise Linux 6.3 GA

RHEL 6.2

Update to kernel-2.6.32-220.4.2.el6 or later (RHBA-2012-0124)

RHEL 6.1 Extended Update Support

Update to kernel-2.6.32-131.26.1.el6 or later (RHBA-2012-0424)

Red Hat Enterprise Linux 5 x86_64 (64bit)

RHEL 5.x

Update to kernel-2.6.18-348.el5 or later (RHBA-2013-0006)
Red Hat Enterprise Linux 5.9 GA and later already contain this fix

RHEL 5.8.z

Update to kernel-2.6.18-308.11.1.el5 or later (RHSA-2012-1061)

RHEL 5.6.z

Update to kernel-2.6.18-238.40.1.el5 or later (RHSA-2012-1087)

RHEL 5.3.z

Update to kernel-2.6.18-128.39.1.el5 or later (RHBA-2012-1093)

Red Hat Enterprise Linux 5 x86 (32bit)

RHEL 5.x

Update to kernel-2.6.18-348.el5 or later (RHBA-2013-0006)
Red Hat Enterprise Linux 5.9 GA and later already contain this fix

RHEL 5.8.z

Update to kernel-2.6.18-308.13.1.el5 or later (RHSA-2012-1174)

RHEL 5.6.z

Update to kernel-2.6.18-238.40.1.el5 or later (RHSA-2012-1087)

RHEL 5.3.z

Update to kernel-2.6.18-128.39.1.el5 or later (RHBA-2012-1093)

Red Hat Enterprise MRG 1.3 Realtime kernel

Red Hat Enterprise MRG 1.3 Realtime kernel

Update to kernel-rt-2.6.33.9-rt31.86.el5rt or later (RHBA-2013:0927)

Root Cause

An insufficiently designed calculation in the CPU accelerator in the previous kernel caused an arithmetic overflow in the sched_clock() function

This overflow led to a kernel panic or other unpredictable behavior on systems using the TSC clock source
This problem will occur only when system uptime reaches or exceeds 208.5 days
This update corrects the aforementioned calculation so that this arithmetic overflow and kernel panic can no longer occur under these circumstances
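The 208.5-day figure can be reproduced with simple arithmetic. The sketch below assumes the commonly cited explanation, which this article does not state: the cycles-to-nanoseconds conversion multiplies in 64 bits and then shifts right by a 10-bit scale factor, so the intermediate product overflows once the clock reaches 2^54 ns. It needs a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Assumption (not from the article): overflow threshold is 2^64 / 2^10 = 2^54 ns.
threshold_ns=$(( 1 << 54 ))

# Convert nanoseconds to whole seconds, then to days with one decimal place.
threshold_s=$(( threshold_ns / 1000000000 ))
threshold_days=$(awk -v s="$threshold_s" 'BEGIN { printf "%.1f", s / 86400 }')

echo "overflow threshold: ${threshold_ns} ns = ${threshold_s} s = ${threshold_days} days"
```

Under that assumption the result matches the uptime given in the article.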

On RHEL5, this problem is a timing issue and is very unlikely to be encountered.
Switching to another clock source is usually not a viable workaround for most workloads:

The TSC is a fast-access clock, whereas the HPET and PMTimer are both slow-access clocks
Booting with notsc would incur a significant performance hit
In RHEL5, the affected sched_clock() uses the TSC regardless of clock source selection.
Also, in some situations, the system may hit this issue even if a non-TSC clock source is selected (with notsc or through current_clocksource).

Diagnostic Steps

Note: this issue could occur in numerous locations that deal with time in the kernel

For example, a user running a non-Red Hat kernel could have a kernel panic with a soft lockup in __ticket_spin_lock

The system must be running a kernel version earlier than the releases listed in the "Resolution" section
Use the following command to determine the current clock source

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
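The checks above can be combined with the current uptime to see how close a host is to the 208.5-day mark. This is a minimal sketch, not an official diagnostic: the helper `days_remaining` is a hypothetical name, and the threshold is simply the approximate figure quoted in this article.

```shell
#!/bin/sh
# Approximate threshold from the article, in days.
THRESHOLD_DAYS=208.5

# Hypothetical helper: print the days left before the given uptime
# (in seconds) reaches the threshold; negative means already past it.
days_remaining() {
    awk -v up="$1" -v limit="$THRESHOLD_DAYS" \
        'BEGIN { printf "%.1f\n", limit - up / 86400 }'
}

clocksource_file=/sys/devices/system/clocksource/clocksource0/current_clocksource
if [ -r "$clocksource_file" ]; then
    echo "clock source: $(cat "$clocksource_file")"
fi

if [ -r /proc/uptime ]; then
    # The first field of /proc/uptime is the uptime in seconds.
    read -r uptime_s idle_s < /proc/uptime
    echo "kernel: $(uname -r)"
    echo "days until the 208.5-day mark: $(days_remaining "$uptime_s")"
fi
```

Compare the reported kernel version against the fixed releases by hand. On RHEL5 the clock source output is not conclusive, since the affected sched_clock() uses the TSC regardless of the selection.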
