Kernel anomaly detection mechanisms: soft lockup, hard lockup, and hung task

幸运没有眷顾

License: CC 4.0 BY-SA

Tags: linux, servers

Article link: https://blog.csdn.net/qq_37294304/article/details/142856029

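Contents

1. Soft lockup and hard lockup
- 1.1 Definitions
2. Hung task

1. Soft lockup and hard lockup

1.1 Definitions

Lockup detection is an important kernel mechanism for catching lockups, that is, abnormal states in which a CPU keeps executing in kernel mode for too long. A warning like the following in the kernel log points to this kind of problem: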

    watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
    pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
    pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
    lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
    sp : ffff8000d83ef290
    x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000
    x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000
    x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0
    x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000
    x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0
    x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000
    x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc
    x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa
    x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a
    x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
     __arm_smmu_tlb_inv_range+0x118/0x254
     arm_smmu_tlb_inv_range_asid+0x6c/0x130
     arm_smmu_mm_invalidate_range+0xa0/0xa4
     __mmu_notifier_invalidate_range_end+0x88/0x120
     unmap_vmas+0x194/0x1e0
     unmap_region+0xb4/0x144
     do_mas_align_munmap+0x290/0x490
     do_mas_munmap+0xbc/0x124
     __vm_munmap+0xa8/0x19c
     __arm64_sys_munmap+0x28/0x50
     invoke_syscall+0x78/0x11c
     el0_svc_common.constprop.0+0x58/0x1c0
     do_el0_svc+0x34/0x60
     el0_svc+0x2c/0xd4
     el0t_64_sync_handler+0x114/0x140
     el0t_64_sync+0x1a4/0x1a8

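There are two kinds of lockup, defined as follows.

Term	Definition	Possible causes
Hard lockup	The CPU stays in kernel mode for more than 10 s (by default). In this state it can neither schedule tasks nor respond to any interrupt.	1. Hardware faults, such as a stalled CPU; 2. Interrupt subsystem faults; 3. Deadlock; 4. Interrupts disabled for too long
Soft lockup	The CPU stays in kernel mode for more than 20 s (by default). In this state it gives no other task a chance to be scheduled, but it can still respond to interrupts.	1. Preemption disabled for too long; 2. Interrupt handlers (softirq or hardirq) running for too long; 3. Abnormal locking, such as holding a spinlock for too long (which amounts to disabling preemption for too long) or a deadlock; 4. Long-running kernel-mode work, such as a long loop in kernel mode that never calls schedule/sleep to yield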
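
The 10 s and 20 s defaults in the table both derive from one tunable, the watchdog_thresh sysctl (default 10). The sketch below is a user-space illustration of that arithmetic rather than kernel code. The function names are hypothetical, and the relationships (hard threshold = watchdog_thresh, soft threshold = 2 × watchdog_thresh, timer period = 1/5 of the soft threshold) follow kernel/watchdog.c as commonly understood.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hard lockup threshold in seconds: watchdog_thresh itself. */
static unsigned int hardlockup_thresh_secs(unsigned int watchdog_thresh)
{
    return watchdog_thresh;
}

/* Soft lockup threshold in seconds: twice watchdog_thresh. */
static unsigned int softlockup_thresh_secs(unsigned int watchdog_thresh)
{
    return watchdog_thresh * 2;
}

/* Period of the per-CPU watchdog hrtimer in ns: 1/5 of the soft threshold. */
static uint64_t sample_period_ns(unsigned int watchdog_thresh)
{
    /* Assumed bound for this sketch, mirroring the sysctl's 0..60 range. */
    assert(watchdog_thresh <= 60);
    return (uint64_t)softlockup_thresh_secs(watchdog_thresh) * (NSEC_PER_SEC / 5);
}
```

With the default of 10, this gives a 10 s hard threshold, a 20 s soft threshold, and a 4 s timer period. Changing watchdog_thresh scales all three together.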

1.2 Lockup detection principle

Soft lockup and hard lockup detection share a single entry point in the kernel: a cpuhp (CPU hotplug) callback, which guarantees that it is invoked on every CPU as that CPU comes online. The code is shown below:

/* CPU hotplug callbacks */
static struct cpuhp_step cpuhp_hp_states[] = {
	/* ... */
	[CPUHP_AP_WATCHDOG_ONLINE] = {
		.name			= "lockup_detector:online",
		.startup.single		= lockup_detector_online_cpu,
		.teardown.single	= lockup_detector_offline_cpu,
	},
	/* ... */
};

int lockup_detector_online_cpu(unsigned int cpu)
{
	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
		watchdog_enable(cpu);
	return 0;
}

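Both soft and hard lockup detection start from watchdog_enable, which registers the hrtimer callback watchdog_timer_fn. On every tick that callback kicks the hard lockup detector, asks the stopper thread to refresh the soft lockup timestamp, and checks for a soft lockup.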

static void watchdog_enable(unsigned int cpu)
{
	struct hrtimer *hrtimer = this_cpu_ptr(&watchdog_hrtimer);
	struct completion *done = this_cpu_ptr(&softlockup_completion);

	WARN_ON_ONCE(cpu != smp_processor_id());

	init_completion(done);
	complete(done);

	/*
	 * Start the timer first to prevent the hardlockup watchdog triggering
	 * before the timer has a chance to fire.
	 */
	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
	hrtimer->function = watchdog_timer_fn;
	hrtimer_start(hrtimer, ns_to_ktime(sample_period),
		      HRTIMER_MODE_REL_PINNED_HARD);

	/* Initialize timestamp */
	update_touch_ts();
	/* Enable the hardlockup detector */
	if (watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED)
		watchdog_hardlockup_enable(cpu);
}


/* watchdog kicker functions */
static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
{
	unsigned long touch_ts, period_ts, now;
	struct pt_regs *regs = get_irq_regs();
	int duration;
	int softlockup_all_cpu_backtrace = sysctl_softlockup_all_cpu_backtrace;

	if (!watchdog_enabled)
		return HRTIMER_NORESTART;

	watchdog_hardlockup_kick();

	/* kick the softlockup detector */
	if (completion_done(this_cpu_ptr(&softlockup_completion))) {
		reinit_completion(this_cpu_ptr(&softlockup_completion));
		stop_one_cpu_nowait(smp_processor_id(),
				softlockup_fn, NULL,
				this_cpu_ptr(&softlockup_stop_work));
	}

	/* .. and repeat */
	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));

	/*
	 * Read the current timestamp first. It might become invalid anytime
	 * when a virtual machine is stopped by the host or when the watchog
	 * is touched from NMI.
	 */
	now = get_timestamp();
	/*
	 * If a virtual machine is stopped by the host it can look to
	 * the watchdog like a soft lockup. This function touches the watchdog.
	 */
	kvm_check_and_clear_guest_paused();
	/*
	 * The stored timestamp is comparable with @now only when not touched.
	 * It might get touched anytime from NMI. Make sure that is_softlockup()
	 * uses the same (valid) value.
	 */
	period_ts = READ_ONCE(*this_cpu_ptr(&watchdog_report_ts));

	/* Reset the interval when touched by known problematic code. */
	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
		if (unlikely(__this_cpu_read(softlockup_touch_sync))) {
			/*
			 * If the time stamp was touched atomically
			 * make sure the scheduler tick is up to date.
			 */
			__this_cpu_write(softlockup_touch_sync, false);
			sched_clock_tick();
		}

		update_report_ts();
		return HRTIMER_RESTART;
	}

	/* Check for a softlockup. */
	touch_ts = __this_cpu_read(watchdog_touch_ts);
	duration = is_softlockup(touch_ts, period_ts, now);
	if (unlikely(duration)) {
		/*
		 * Prevent multiple soft-lockup reports if one cpu is already
		 * engaged in dumping all cpu back traces.
		 */
		if (softlockup_all_cpu_backtrace) {
			if (test_and_set_bit_lock(0, &soft_lockup_nmi_warn))
				return HRTIMER_RESTART;
		}

		/* Start period for the next softlockup warning. */
		update_report_ts();

		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
			smp_processor_id(), duration,
			current->comm, task_pid_nr(current));
		print_modules();
		print_irqtrace_events(current);
		if (regs)
			show_regs(regs);
		else
			dump_stack();

		if (softlockup_all_cpu_backtrace) {
			trigger_allbutcpu_cpu_backtrace(smp_processor_id());
			clear_bit_unlock(0, &soft_lockup_nmi_warn);
		}

		add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
		if (softlockup_panic)
			panic("softlockup: hung tasks");
	}

	return HRTIMER_RESTART;
}
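
watchdog_timer_fn leaves the actual decision to is_softlockup(), which the listing does not show. The following is a user-space sketch of what that helper is understood to do, assuming timestamps in whole seconds. The names ts_after and softlockup_duration are illustrative, and the threshold is passed as a parameter instead of being read from kernel state.

```c
#include <assert.h>
#include <stdbool.h>

/* Wraparound-safe "a is later than b", the same idea as the kernel's time_after(). */
static bool ts_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Returns how long the CPU has been stuck, in seconds, or 0 if no soft
 * lockup should be reported. thresh is the soft lockup threshold in
 * seconds; 0 disables the check.
 */
static unsigned long softlockup_duration(unsigned long touch_ts,
                                         unsigned long period_ts,
                                         unsigned long now,
                                         unsigned long thresh)
{
    /* Model invariant: the report timestamp never precedes the touch timestamp. */
    assert(!ts_after(touch_ts, period_ts));

    if (thresh != 0 && ts_after(now, period_ts + thresh))
        return now - touch_ts;
    return 0;
}
```

The two timestamps serve different purposes. period_ts decides when a warning is due, and update_report_ts() moves it forward after each report so the next warning needs another full threshold. touch_ts records when the CPU last proved it could schedule, so the reported duration keeps growing across repeated warnings.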

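1.2.1 Soft lockup detection principle

Soft lockup detection is built on two pieces: the high-resolution timer (hrtimer) and the CPU stopper mechanism (stop_machine).

The detector starts one hrtimer on every CPU. With the default 20 s threshold, the timer fires every 4 s (one fifth of the soft lockup threshold) as a hard interrupt, and the handler does two things in order:

1. It calls stop_one_cpu_nowait to ask this CPU's stopper thread, migration, to update the timestamp.
2. It then checks whether more than 20 s have passed since the last timestamp update, and reports a soft lockup warning if so.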
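
The two steps can be traced with a small user-space simulation. It is a simplified model under stated assumptions, not kernel code: time advances in whole seconds, the stopper thread stamps immediately after the interrupt returns whenever it can run, and blocked_from (a hypothetical parameter) is the moment after which the CPU never reaches a scheduling point again.

```c
#include <assert.h>
#include <stdbool.h>

#define SAMPLE_PERIOD 4   /* seconds between hrtimer interrupts */
#define SOFT_THRESH   20  /* soft lockup threshold in seconds */

/*
 * Returns the time of the first soft lockup warning, or -1 if none fires
 * up to and including `end`.
 */
static int first_warning_time(int blocked_from, int end)
{
    int touch_ts = 0;  /* last time the stopper thread stamped */

    assert(blocked_from >= 0 && end >= 0);

    for (int now = SAMPLE_PERIOD; now <= end; now += SAMPLE_PERIOD) {
        /* Step 1: request a stamp; it is served only if the thread can be scheduled. */
        bool stopper_ran = now < blocked_from;

        /* Step 2: the request is asynchronous, so the check sees the previous stamp. */
        if (now > touch_ts + SOFT_THRESH)
            return now;

        if (stopper_ran)
            touch_ts = now;
    }
    return -1;
}
```

In this model a CPU that gets stuck at t = 10 s last stamped at t = 8 s, so the first interrupt that sees a gap larger than 20 s is the one at t = 32 s. Because the check runs only once per sample period, the reported stuck time lands somewhat above the threshold, which fits the "stuck for 26s" in the log earlier.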

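Why is it done this way?

First, what is the migration thread?
migration is the stopper thread. There is one per CPU, it has the highest priority of any task in the system, and it is self-parking. Despite its priority, it does not preempt the running task directly. It waits until the CPU reaches its next scheduling point and is then picked by priority. It has no notion of a time slice: once running, it keeps the CPU until it gives it up voluntarily.

If even this highest-priority thread cannot be scheduled, the CPU must be in an abnormal state in which no task can be scheduled at all. Possible causes include:

- Preemption disabled for a long time, so no task can be scheduled
- Long-running interrupt handlers, deeply nested interrupts, or interrupt storms, so no task can be scheduled
- Abnormal locking, such as holding a spinlock for too long (which also disables preemption) or a deadlock

If any link in this chain breaks so that the timestamp is not refreshed in time, the hrtimer interrupt sees a stale timestamp and the kernel reports a soft lockup. The check itself runs in the hrtimer interrupt, so it only works while that interrupt can still fire.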

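For reference, the scheduling classes in descending priority order:

Scheduling class	Scheduling policy	Priority	Preemption ability	Typical use cases
Stop class	`N/A`	Highest	Does not preempt directly; runs once the current task reaches a scheduling point	Kernel management work, CPU migration, CPU hotplug
Deadline class	`SCHED_DEADLINE`	High	Preempts the real-time, fair, and idle classes, plus lower-priority tasks of its own class (those with later deadlines)	Industrial control, audio/video processing, autonomous driving
Real-time class	`SCHED_FIFO`, `SCHED_RR`	Medium-high	Preempts the fair and idle classes, plus lower-priority tasks of its own class	Real-time audio/video, low-latency tasks, data acquisition
Fair class	`SCHED_NORMAL`, `SCHED_BATCH`, `SCHED_IDLE`	Normal	Preempts only the idle class and lower-priority tasks of its own class	Ordinary user applications, batch jobs, low-priority background maintenance
Idle class	`N/A` (the `SCHED_IDLE` policy is handled by the fair class)	Lowest	Cannot preempt anything	The per-CPU idle task, which runs only when nothing else is runnable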
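
The priority column can be read as a fixed search order: at each scheduling point the scheduler looks at the classes from highest to lowest and takes the first one with a runnable task. The sketch below models only that ordering, under simplifying assumptions. The enum, the per-class counters, and pick_next_class are illustrative names, and the real scheduler works on per-CPU run queues, not counters.

```c
#include <assert.h>
#include <stddef.h>

/* Scheduling classes in descending priority order. */
enum sched_class_id {
    CLASS_STOP,
    CLASS_DEADLINE,
    CLASS_RT,
    CLASS_FAIR,
    CLASS_IDLE,
    CLASS_COUNT
};

/*
 * nr_runnable[c] is the number of runnable tasks in class c on this CPU.
 * Returns the class whose task would run next.
 */
static enum sched_class_id pick_next_class(const unsigned int nr_runnable[CLASS_COUNT])
{
    assert(nr_runnable != NULL);

    for (int c = CLASS_STOP; c < CLASS_COUNT; c++) {
        if (nr_runnable[c] > 0)
            return (enum sched_class_id)c;
    }
    /* Nothing else is runnable: the idle task always is. */
    return CLASS_IDLE;
}
```

This is why a woken migration thread wins the next scheduling decision on its CPU regardless of what else is queued, and why its failure to run for 20 s means the CPU never reached a scheduling point at all.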