Kernel Anomaly Detection Mechanisms: Soft Lockup, Hard Lockup, and Hung Task

1. Soft lockup and hard lockup

1.1. Definitions

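Lockup detection is an important kernel mechanism. It detects kernel lockups, an abnormal state in which a CPU keeps executing in kernel mode for too long. A warning like the following in the kernel log points to this kind of problem: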

    watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
    pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
    pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
    lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
    sp : ffff8000d83ef290
    x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000
    x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000
    x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0
    x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000
    x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0
    x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000
    x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc
    x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa
    x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a
    x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
     __arm_smmu_tlb_inv_range+0x118/0x254
     arm_smmu_tlb_inv_range_asid+0x6c/0x130
     arm_smmu_mm_invalidate_range+0xa0/0xa4
     __mmu_notifier_invalidate_range_end+0x88/0x120
     unmap_vmas+0x194/0x1e0
     unmap_region+0xb4/0x144
     do_mas_align_munmap+0x290/0x490
     do_mas_munmap+0xbc/0x124
     __vm_munmap+0xa8/0x19c
     __arm64_sys_munmap+0x28/0x50
     invoke_syscall+0x78/0x11c
     el0_svc_common.constprop.0+0x58/0x1c0
     do_el0_svc+0x34/0x60
     el0_svc+0x2c/0xd4
     el0t_64_sync_handler+0x114/0x140
     el0t_64_sync+0x1a4/0x1a8
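
When scanning logs for reports like the one above, the first line carries the two most useful fields: the CPU number and how long it was stuck. The following is a minimal sketch of a user-space parser for that line. The helper name `parse_soft_lockup` is hypothetical and not part of any kernel or library API; the format string is taken from the warning shown above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): extract the CPU number and the
 * stuck duration in seconds from a soft lockup report line.
 * Returns 0 on success, -1 if the line is not a soft lockup report.
 */
static int parse_soft_lockup(const char *line, int *cpu, unsigned int *secs)
{
	/* Skip any prefix such as "watchdog: " or a timestamp. */
	const char *p = strstr(line, "BUG: soft lockup - CPU#");

	if (!p)
		return -1;
	/* Both conversions must succeed for the line to count as a match. */
	if (sscanf(p, "BUG: soft lockup - CPU#%d stuck for %us!", cpu, secs) != 2)
		return -1;
	return 0;
}
```

For the sample log above, this yields CPU 244 and a duration of 26 seconds; register dump and call trace lines are rejected.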

First, the two kinds of lockup are defined as follows.

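| Term | Definition | Possible causes |
| --- | --- | --- |
| Hard lockup | The CPU stays in kernel mode for more than 10 s (the default threshold). In this state it can neither schedule tasks nor respond to any interrupt. | 1. Hardware fault, such as a stalled CPU; 2. Interrupt subsystem fault; 3. Deadlock; 4. Interrupts disabled for too long |
| Soft lockup | The CPU stays in kernel mode for more than 20 s (the default threshold) without giving any other task a chance to run, but it can still respond to interrupts. | 1. Preemption disabled for too long; 2. An interrupt handler (softirq or hardirq) running for too long; 3. Abnormal lock behavior, such as holding a spinlock for too long (which amounts to disabling preemption for too long) or a deadlock; 4. Long-running kernel-mode work, such as a long loop in kernel mode that never yields through schedule/sleep |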

1.2. How lockups are detected

Soft lockup and hard lockup detection share the same entry point in the kernel. That entry point is a cpuhp callback, which guarantees that it is invoked on every CPU as the CPU comes online:

/* CPU hotplug callbacks */
static struct cpuhp_step cpuhp_hp_states[] = {
	/* ... */
	[CPUHP_AP_WATCHDOG_ONLINE] = {
		.name			= "lockup_detector:online",
		.startup.single		= lockup_detector_online_cpu,
		.teardown.single	= lockup_detector_offline_cpu,
	},
	/* ... */
};

int lockup_detector_online_cpu(unsigned int cpu)
{
	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
		watchdog_enable(cpu);
	return 0;
}

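Both soft lockup and hard lockup detection start from watchdog_enable. This function registers the hrtimer callback watchdog_timer_fn, which requests the timestamp update, runs the soft lockup check, and kicks the hard lockup detector.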

static void watchdog_enable(unsigned int cpu)
{
	struct hrtimer *hrtimer = this_cpu_ptr(&watchdog_hrtimer);
	struct completion *done = this_cpu_ptr(&softlockup_completion);

	WARN_ON_ONCE(cpu != smp_processor_id());

	init_completion(done);
	complete(done);

	/*
	 * Start the timer first to prevent the hardlockup watchdog triggering
	 * before the timer has a chance to fire.
	 */
	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
	hrtimer->function = watchdog_timer_fn;
	hrtimer_start(hrtimer, ns_to_ktime(sample_period),
		      HRTIMER_MODE_REL_PINNED_HARD);

	/* Initialize timestamp */
	update_touch_ts();
	/* Enable the hardlockup detector */
	if (watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED)
		watchdog_hardlockup_enable(cpu);
}


/* watchdog kicker functions */
static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
{
	unsigned long touch_ts, period_ts, now;
	struct pt_regs *regs = get_irq_regs();
	int duration;
	int softlockup_all_cpu_backtrace = sysctl_softlockup_all_cpu_backtrace;

	if (!watchdog_enabled)
		return HRTIMER_NORESTART;

	watchdog_hardlockup_kick();

	/* kick the softlockup detector */
	if (completion_done(this_cpu_ptr(&softlockup_completion))) {
		reinit_completion(this_cpu_ptr(&softlockup_completion));
		stop_one_cpu_nowait(smp_processor_id(),
				softlockup_fn, NULL,
				this_cpu_ptr(&softlockup_stop_work));
	}

	/* .. and repeat */
	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));

	/*
	 * Read the current timestamp first. It might become invalid anytime
	 * when a virtual machine is stopped by the host or when the watchdog
	 * is touched from NMI.
	 */
	now = get_timestamp();
	/*
	 * If a virtual machine is stopped by the host it can look to
	 * the watchdog like a soft lockup. This function touches the watchdog.
	 */
	kvm_check_and_clear_guest_paused();
	/*
	 * The stored timestamp is comparable with @now only when not touched.
	 * It might get touched anytime from NMI. Make sure that is_softlockup()
	 * uses the same (valid) value.
	 */
	period_ts = READ_ONCE(*this_cpu_ptr(&watchdog_report_ts));

	/* Reset the interval when touched by known problematic code. */
	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
		if (unlikely(__this_cpu_read(softlockup_touch_sync))) {
			/*
			 * If the time stamp was touched atomically
			 * make sure the scheduler tick is up to date.
			 */
			__this_cpu_write(softlockup_touch_sync, false);
			sched_clock_tick();
		}

		update_report_ts();
		return HRTIMER_RESTART;
	}

	/* Check for a softlockup. */
	touch_ts = __this_cpu_read(watchdog_touch_ts);
	duration = is_softlockup(touch_ts, period_ts, now);
	if (unlikely(duration)) {
		/*
		 * Prevent multiple soft-lockup reports if one cpu is already
		 * engaged in dumping all cpu back traces.
		 */
		if (softlockup_all_cpu_backtrace) {
			if (test_and_set_bit_lock(0, &soft_lockup_nmi_warn))
				return HRTIMER_RESTART;
		}

		/* Start period for the next softlockup warning. */
		update_report_ts();

		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
			smp_processor_id(), duration,
			current->comm, task_pid_nr(current));
		print_modules();
		print_irqtrace_events(current);
		if (regs)
			show_regs(regs);
		else
			dump_stack();

		if (softlockup_all_cpu_backtrace) {
			trigger_allbutcpu_cpu_backtrace(smp_processor_id());
			clear_bit_unlock(0, &soft_lockup_nmi_warn);
		}

		add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
		if (softlockup_panic)
			panic("softlockup: hung tasks");
	}

	return HRTIMER_RESTART;
}

1.2.1. How soft lockup detection works

At its core, soft lockup detection is built on two mechanisms: the high-resolution timer (hrtimer) and the stop-class scheduling mechanism (stop_machine).

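The soft lockup detector starts one hrtimer on every CPU. The timer uses a 4 s sampling period (one fifth of the soft lockup threshold) and fires as a hard interrupt. The interrupt handler does two things in order:

  1. It calls stop_one_cpu_nowait to ask this CPU's stopper thread, migration, to update the timestamp.
  2. It then checks whether more than 20 s have passed since the last timestamp update, and reports a soft lockup warning if so.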
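
The 4 s and 20 s figures are not independent constants. As far as I can tell from kernel/watchdog.c, both derive from the watchdog_thresh setting (10 s by default): the soft lockup threshold is twice that value, and the sampling period is one fifth of the soft lockup threshold. A minimal user-space sketch of that arithmetic follows; the function names are illustrative and are not the kernel's own.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Soft lockup threshold in seconds: twice the hard lockup threshold. */
static unsigned int softlockup_thresh_secs(unsigned int watchdog_thresh)
{
	return watchdog_thresh * 2;
}

/* hrtimer sampling period in nanoseconds: one fifth of the soft threshold. */
static uint64_t sample_period_ns(unsigned int watchdog_thresh)
{
	return softlockup_thresh_secs(watchdog_thresh) * (NSEC_PER_SEC / 5);
}
```

With the assumed default of 10, this gives a 20 s soft lockup threshold and a 4 s sampling period, matching the numbers in the text.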

Why is it done this way?

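First, it helps to know what the migration thread is.
migration is a stop-class kernel thread, and every CPU has one. It is the highest-priority task in the whole system and has the self-parking property. Despite having the highest priority, it does not take the CPU the instant it is woken: it runs once the CPU reaches its next scheduling point, where the scheduler picks it ahead of everything else. It has no notion of a time slice, so it keeps the CPU until it gives it up voluntarily.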

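If even migration, the highest-priority task, cannot be scheduled normally, the CPU is in some abnormal state that keeps it from scheduling any task. Possible causes include:

  1. Preemption disabled for a long time, so no task can be scheduled
  2. Long-running interrupts, deeply nested interrupts, or an interrupt storm, so no task can be scheduled
  3. Abnormal lock behavior, such as holding a spinlock for too long (which also disables preemption) or a deadlock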

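If the migration thread cannot run and the timestamp update is delayed past the threshold, the next check in the hrtimer hard interrupt reports a soft lockup. If the hrtimer interrupt itself cannot fire, the check never runs on that CPU; that situation is what hard lockup detection covers.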
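
The interplay between the timer interrupt and the stopper thread can be sketched as a small user-space simulation. This is a simplified model under stated assumptions, not the kernel implementation: time is in whole seconds, the struct and function names are hypothetical, the threshold is the default 20 s, and the model omits the report-period reset (update_report_ts), so it would keep reporting on every later tick.

```c
#include <stdbool.h>

#define SOFT_THRESH_SECS 20 /* assumed default soft lockup threshold */

/* Hypothetical per-CPU state for the simulation. */
struct cpu_sim {
	unsigned long touch_ts; /* last time the stopper updated the timestamp */
	bool stopper_can_run;   /* false: the CPU never reaches a scheduling point */
};

/*
 * One timer interrupt at time `now`. Returns the stuck duration in seconds
 * if a soft lockup would be reported, otherwise 0.
 */
static unsigned long timer_tick(struct cpu_sim *c, unsigned long now)
{
	unsigned long stuck = 0;

	/* Check against the timestamp left by the previous update. */
	if (now - c->touch_ts > SOFT_THRESH_SECS)
		stuck = now - c->touch_ts;

	/*
	 * The update is only requested from the interrupt; it takes effect
	 * after the interrupt returns, and only if the stopper gets scheduled.
	 */
	if (c->stopper_can_run)
		c->touch_ts = now;

	return stuck;
}
```

In this model, if the stopper last ran at t = 8 s and is blocked afterwards, the ticks at 12 s through 28 s stay silent and the tick at 32 s reports 24 s, because the comparison is strictly greater than the threshold.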

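| Scheduling class | Scheduling policy | Priority | Preemption ability | Typical uses |
| --- | --- | --- | --- | --- |
| Stop class | N/A | Highest | Does not preempt directly; it is scheduled once the current task reaches a scheduling point | Kernel management work, CPU migration, hotplug |
| Deadline class | SCHED_DEADLINE | High | Preempts all real-time, fair, and idle class tasks, and lower-priority tasks of its own class | Industrial control, audio/video processing, autonomous driving |
| Real-time class | SCHED_FIFO, SCHED_RR | Medium-high | Preempts all fair and idle class tasks, and lower-priority tasks of its own class | Real-time audio/video, low-latency tasks, data acquisition |
| Fair class | SCHED_NORMAL, SCHED_BATCH | Normal | Can preempt only lower-priority tasks of its own class | Ordinary user applications, batch jobs |
| Idle class | SCHED_IDLE | Lowest | Cannot preempt any other task | Tasks that run when the system is idle, background maintenance |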
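
The priority ordering in the table can be illustrated with a toy picker that walks the classes from highest to lowest and returns the first one with a runnable task. This is only a sketch of the ordering; the enum and function names are made up for illustration, and the real scheduler's pick logic is considerably more involved.

```c
/* Scheduling classes in descending priority, in the order the table lists them. */
enum sched_class_id {
	CLASS_STOP,
	CLASS_DEADLINE,
	CLASS_RT,
	CLASS_FAIR,
	CLASS_IDLE,
	CLASS_COUNT
};

/*
 * runnable[i] is the number of runnable tasks in class i.
 * Returns the class whose task would run next; CLASS_IDLE if nothing
 * else is runnable.
 */
static enum sched_class_id pick_next_class(const unsigned int runnable[CLASS_COUNT])
{
	for (int i = CLASS_STOP; i < CLASS_IDLE; i++) {
		if (runnable[i] > 0)
			return (enum sched_class_id)i;
	}
	return CLASS_IDLE;
}
```

A runnable stop-class task such as migration always wins this walk, which is why its failure to run is a reliable sign that the CPU is not reaching a scheduling point at all.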

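In other words, each CPU periodically goes through the following cycle:

[Figure: the periodic cycle on each CPU, showing the migration context, the hrtimer hard interrupt context, and the soft lockup check]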