OCI , OIC, IDCS, IDM ,DBCS, Databases, EBS ERP ,Linux, GoldenGate, Fusion Applications Practices : BUG: soft lockup - CPU#11 stuck for 10s! [updatedb:20793] update db mlocate consuming 100% of cpu resources

Updatedb process consumes the 100% of resources and the NFS or iscsci storage mount points staled

the update db process consumes log resources when the Server io is high so in the high io system we can stop the updatedb command by commenting the mlocate file in the /etc/cron.daily

if you check the /var/log/messages you can find the error messages

Aug 5 04:03:57 fahmid kernel: BUG: soft lockup - CPU#11 stuck for 10s! [updatedb:20793]

Aug 5 04:03:57 fahmid kernel: CPU 11:

Aug 5 04:03:57 fahmid kernel: Modules linked in: autofs4 hidp nfs fscache nfs_acl rfcomm l2cap bluetooth lockd sunrpc bonding be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport st joydev sr_mod cdrom sg qla2xxx hpilo scsi_transport_fc bnx2 serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd

Aug 5 04:03:57 fahmid kernel: Pid: 20793, comm: updatedb Not tainted 2.6.18-194.el5 #1

Aug 5 04:03:57 fahmid kernel: RIP: 0010:[<ffffffff88052f98>] [<ffffffff88052f98>] :ext3:ext3_find_entry+0x20f/0x575

Aug 5 04:03:57 fahmid kernel: RSP: 0018:ffff8106ba77fb48 EFLAGS: 00000206

Aug 5 04:03:57 fahmid kernel: RAX: 0000000000000000 RBX: ffff81025c11bc10 RCX: 000000003386148f

Aug 5 04:03:57 fahmid kernel: RDX: ffff81061cf64c00 RSI: ffff8106ba77fb28 RDI: ffff8101157326e8

Aug 5 04:03:57 fahmid kernel: RBP: 0000000000000000 R08: ffff8109bf933a48 R09: 0000000000000034

Aug 5 04:03:57 fahmid kernel: R10: ffff81063510c790 R11: 000000005019c3a3 R12: 0000000000000002

Aug 5 04:03:57 fahmid kernel: R13: ffff81025c11bc10 R14: ffffffff80064ab0 R15: ffff810b8ded8740

Aug 5 04:03:57 fahmid kernel: FS: 00002ae9f2950260(0000) GS:ffff810c1fc25cc0(0000) knlGS:0000000000000000

Aug 5 04:03:57 fahmid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b

Aug 5 04:03:57 fahmid kernel: CR2: 000000009ff81bbc CR3: 00000001a233d000 CR4: 00000000000006e0

Aug 5 04:03:57 fahmid kernel:

Step 1 Check the top command for the highest consuming resources

[root@fahmid cron.daily]# top

top - 17:42:35 up 25 days, 8:02, 8 users, load average: 775.61, 776.08, 775.54

Tasks: 1594 total, 6 running, 1588 sleeping, 0 stopped, 0 zombie

Cpu(s): 0.3%us, 6.4%sy, 0.9%ni, 18.4%id, 74.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 49454804k total, 49333240k used, 121564k free, 107464k buffers

Swap: 50331636k total, 5675056k used, 44656580k free, 4303088k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

20793 root 34 19 3904 592 504 R 100.0 0.0 5139:00 updatedb

380 oracle 21 0 3693m 1.4g 4804 S 6.2 3.0 1560:36 java

432 oracle 20 0 3579m 1.4g 4024 S 5.6 3.0 1674:38 java

Step 2 Check the updatedb process

[root@fahmid log]# ps -ef |grep updatedb

root 6046 1 0 Aug03 ? 00:00:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?securityfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?nfs?nfs4?autofs

root 6657 1 0 04:03 ? 00:00:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?securityfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?nfs?nfs4?autofs

root 20793 1 99 Aug02 ? 3-14:02:05 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?securityfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?nfs?nfs4?autofs

root 21571 1 0 Aug04 ? 00:00:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?securityfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?nfs?nfs4?autofs

root 21982 20160 0 18:05 pts/16 00:00:00 grep updatedb

Step 3 Kill the process by using the hard kill

Try to kill all the process by hard killing but still it will not kill the process

[root@fahmid log]# kill -9 6046

[root@fahmid log]# kill -9 6657

[root@fahmid log]# kill -9 20793

[root@fahmid log]# kill -9 21571

[root@fahmid log]# ps -ef |grep updatedb

root 20793 1 99 Aug02 ? 3-14:05:54 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?securityfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?nfs?nfs4?autofs

root 22232 20160 0 18:09 pts/16 00:00:00 grep updatedb

Step 4 Check the mlocate in the /etc/cron.daily

[root@fahmid cron.daily]# pwd

/etc/cron.daily

[root@fahmid cron.daily]# ls

0anacron 0logwatch cups logrotate makewhatis.cron mlocate.cron prelink rpm tetex.cron tmpwatch

[root@fahmid cron.daily]# vi mlocate.cron

#!/bin/sh

nodevs=$(< /proc/filesystems awk '$1 == "nodev" { print $2 }')

renice +19 -p $$ >/dev/null 2>&1

/usr/bin/updatedb -f "$nodevs"

Step 5 comment the mlocate file in the cron .daily

change the mlocate to another name

[root@fahmid cron.daily]# vi mlocateold.cron

#!/bin/sh

#nodevs=$(< /proc/filesystems awk '$1 == "nodev" { print $2 }')

#renice +19 -p $$ >/dev/null 2>&1

#/usr/bin/updatedb -f "$nodevs"

Step 6 Try to kill the process but still the process is alive then restart the server

[root@fahmid cron.daily]# reboot

Broadcast message from root (pts/16) (Sun Aug 5 18:35:36 2012):

The system is going down for reboot NOW!

Some times the reboot from the terminal does not work if we had the console then try to reboot through the console

or physical reboot

OCI , OIC, IDCS, IDM ,DBCS, Databases, EBS ERP ,Linux, GoldenGate, Fusion Applications Practices

Sunday, August 5, 2012

BUG: soft lockup - CPU#11 stuck for 10s! [updatedb:20793] update db mlocate consuming 100% of cpu resources

Updatedb process consumes the 100% of resources and the NFS or iscsci storage mount points staled

No comments:

Post a Comment