Linux Kernel Multi-Core Data Synchronization Primitives: What smp_load_acquire Does and How to Use It

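smp_load_acquire in the Linux Kernel

In the Linux kernel, smp_load_acquire is a macro that reads a shared variable with acquire semantics. It is used to synchronize data access between CPUs so that reads of shared variables follow the correct memory order. It prevents both the compiler and the CPU from moving memory accesses that follow the load to a point before it. As a result, once the load observes a value written by a matching smp_store_release, every write the other CPU made before that store is visible as well.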

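What smp_load_acquire does

Acquire semantics mean that no memory access that follows the load in program order can be reordered before it. Accesses that precede the load are not constrained, so this is a one-way barrier. This matters for synchronization on multi-core systems: when the load reads a value published by a release store on another CPU, the data written before that store is guaranteed to be visible, which keeps CPUs from acting on inconsistent or half-published state.

Acquire semantics are typically used where shared state has to be read, for example in lock implementations or when reading a flag held in a shared variable, so that the operations after taking the lock or seeing the flag observe the shared state that was published before it.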
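As a rough illustration of the lock case, here is a minimal userspace sketch that uses C11 atomics as a stand-in for kernel primitives. The names toy_lock, toy_trylock and toy_unlock are hypothetical and are not kernel APIs; the kernel's real spinlocks are considerably more involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace lock type; not the kernel's spinlock_t. */
struct toy_lock {
    atomic_flag held;
};

#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock. Acquire ordering keeps the accesses of the
 * critical section from being moved before the point where the lock is
 * taken. Returns true if the lock was obtained. */
bool toy_trylock(struct toy_lock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Drop the lock. Release ordering keeps the accesses of the critical
 * section from being moved after the unlock. */
void toy_unlock(struct toy_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The acquire on the lock side pairs with the release on the unlock side, so whatever the previous holder wrote inside its critical section is visible to the next holder.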

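Usage

smp_load_acquire performs a read with acquire semantics through the given pointer. It is implemented as a macro; written as a pseudo-prototype, its form is: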

 type smp_load_acquire(type *ptr);
  • ptr: pointer to the shared variable to read.
  • Return value: the current value of the shared variable.

Example

The following simple example uses smp_load_acquire to read a shared variable that another CPU publishes with smp_store_release:

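#include <asm/barrier.h>   /* smp_load_acquire(), smp_store_release() */

static int shared_flag = 0;

void producer(void)
{
    // The producer sets the shared variable with release semantics
    smp_store_release(&shared_flag, 1);
}

void consumer(void)
{
    // The consumer reads the shared variable with acquire semantics
    int flag = smp_load_acquire(&shared_flag);

    if (flag == 1) {
        // The producer has set the flag, so further work can proceed
        // Continue processing the shared data here
    }
}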

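In the code above:

  1. producer sets shared_flag to 1 with smp_store_release. Release semantics ensure that no memory access made before the store can be reordered after it.
  2. consumer reads shared_flag with smp_load_acquire. Acquire semantics ensure that no memory access made after the read of shared_flag can be reordered before it.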
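The pairing only pays off when the flag guards some other data. The following is a minimal userspace sketch of that pattern, assuming C11 atomics and pthreads as stand-ins for the kernel macros: atomic_store_explicit with memory_order_release plays the role of smp_store_release, and atomic_load_explicit with memory_order_acquire plays the role of smp_load_acquire. The payload shared_data and the helper run_message_passing are hypothetical additions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int shared_data;         /* plain payload, published through the flag */
static atomic_int shared_flag;  /* 0 = not ready, 1 = ready */

/* Producer: write the payload first, then publish it with a release store
 * (userspace stand-in for smp_store_release). */
static void *producer(void *arg)
{
    (void)arg;
    shared_data = 42;
    atomic_store_explicit(&shared_flag, 1, memory_order_release);
    return NULL;
}

/* Consumer: spin on an acquire load (stand-in for smp_load_acquire).
 * Once the flag reads 1, the payload written before the release store
 * is guaranteed to be visible. */
static void *consumer(void *arg)
{
    int *out = arg;

    while (atomic_load_explicit(&shared_flag, memory_order_acquire) != 1)
        ;  /* busy-wait; real code would back off or sleep */
    *out = shared_data;
    return NULL;
}

/* Hypothetical driver: runs one producer and one consumer thread and
 * returns the payload value the consumer observed. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;
    int rc;

    shared_data = 0;
    atomic_store(&shared_flag, 0);

    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

If either side were weakened to a relaxed access, the consumer could in principle see the flag set while still reading a stale payload on weakly ordered hardware.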

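Related functions

  • smp_store_release(ptr, value): the counterpart of smp_load_acquire. Release semantics ensure that all memory operations before the write to the shared variable have completed before the write becomes visible.
  • READ_ONCE() and WRITE_ONCE(): ensure that reads and writes of a shared variable are not optimized away, merged, or repeated by the compiler. They provide no memory barrier, so in multi-core synchronization they must be combined with explicit barriers to be correct.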
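To show what "combined with explicit barriers" can look like, here is a minimal userspace sketch, assuming C11 atomics. Relaxed atomic accesses act as a rough analog of WRITE_ONCE()/READ_ONCE(), and atomic_thread_fence supplies the ordering that kernel code would traditionally get from a pair such as smp_wmb() and smp_rmb(). The names payload, ready, publish and try_consume are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_int payload;  /* data being handed over */
static atomic_int ready;    /* 0 = nothing published yet */

/* Writer: two relaxed stores separated by a release fence, so the payload
 * store cannot be reordered after the flag store. */
void publish(int value)
{
    atomic_store_explicit(&payload, value, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader: a relaxed load of the flag, then an acquire fence, so the payload
 * load cannot be reordered before the flag load. Returns false if nothing
 * has been published yet. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = atomic_load_explicit(&payload, memory_order_relaxed);
    return true;
}
```

Using smp_store_release and smp_load_acquire folds each access and its barrier into a single call, which is usually easier to read and harder to get wrong than separate accesses and fences.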

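Summary

  • smp_load_acquire provides an acquire barrier on the read of a shared variable: every memory operation that follows the read can only take effect after it.
  • These semantics are very useful when implementing locks, semaphores, status flags and other multi-core synchronization mechanisms, where they guarantee both the visibility of data and the ordering of operations.

In Linux kernel code, using smp_load_acquire and smp_store_release as a matched pair avoids data races and memory-consistency problems, and so keeps the code correct in multi-core environments.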

/*
 * Notes on Program-Order guarantees on SMP systems.
 *
 *  MIGRATION
 *
 * The basic program-order guarantee on SMP systems is that when a task [t]
 * migrates, all its activity on its old CPU [c0] happens-before any subsequent
 * execution on its new CPU [c1].
 *
 * For migration (of runnable tasks) this is provided by the following means:
 *
 *  A) UNLOCK of the rq(c0)->lock scheduling out task t
 *  B) migration for t is required to synchronize *both* rq(c0)->lock and
 *     rq(c1)->lock (if not at the same time, then in that order).
 *  C) LOCK of the rq(c1)->lock scheduling in task
 *
 * Release/acquire chaining guarantees that B happens after A and C after B.
 * Note: the CPU doing B need not be c0 or c1
 *
 * Example:
 *
 *   CPU0            CPU1            CPU2
 *
 *   LOCK rq(0)->lock
 *   sched-out X
 *   sched-in Y
 *   UNLOCK rq(0)->lock
 *
 *                                   LOCK rq(0)->lock // orders against CPU0
 *                                   dequeue X
 *                                   UNLOCK rq(0)->lock
 *
 *                                   LOCK rq(1)->lock
 *                                   enqueue X
 *                                   UNLOCK rq(1)->lock
 *
 *                   LOCK rq(1)->lock // orders against CPU2
 *                   sched-out Z
 *                   sched-in X
 *                   UNLOCK rq(1)->lock
 *
 *
 *  BLOCKING -- aka. SLEEP + WAKEUP
 *
 * For blocking we (obviously) need to provide the same guarantee as for
 * migration. However the means are completely different as there is no lock
 * chain to provide order. Instead we do:
 *
 *   1) smp_store_release(X->on_cpu, 0)   -- finish_task()
 *   2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
 *
 * Example:
 *
 *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
 *
 *   LOCK rq(0)->lock LOCK X->pi_lock
 *   dequeue X
 *   sched-out X
 *   smp_store_release(X->on_cpu, 0);
 *
 *                    smp_cond_load_acquire(&X->on_cpu, !VAL);
 *                    X->state = WAKING
 *                    set_task_cpu(X,2)
 *
 *                    LOCK rq(2)->lock
 *                    enqueue X
 *                    X->state = RUNNING
 *                    UNLOCK rq(2)->lock
 *
 *                                          LOCK rq(2)->lock // orders against CPU1
 *                                          sched-out Z
 *                                          sched-in X
 *                                          UNLOCK rq(2)->lock
 *
 *                    UNLOCK X->pi_lock
 *   UNLOCK rq(0)->lock
 *
 *
 * However, for wakeups there is a second guarantee we must provide, namely we
 * must ensure that CONDITION=1 done by the caller can not be reordered with
 * accesses to the task state; see try_to_wake_up() and set_current_state().
 */

/**
 * try_to_wake_up - wake up a thread
 * @p: the thread to be awakened
 * @state: the mask of task states that can be woken
 * @wake_flags: wake modifier flags (WF_*)
 *
 * Conceptually does:
 *
 *   If (@state & @p->state) @p->state = TASK_RUNNING.
 *
 * If the task was not queued/runnable, also place it back on a runqueue.
 *
 * This function is atomic against schedule() which would dequeue the task.
 *
 * It issues a full memory barrier before accessing @p->state, see the comment
 * with set_current_state().
 *
 * Uses p->pi_lock to serialize against concurrent wake-ups.
 *
 * Relies on p->pi_lock stabilizing:
 *  - p->sched_class
 *  - p->cpus_ptr
 *  - p->sched_task_group
 * in order to do migration, see its use of select_task_rq()/set_task_cpu().
 *
 * Tries really hard to only take one task_rq(p)->lock for performance.
 * Takes rq->lock in:
 *  - ttwu_runnable()    -- old rq, unavoidable, see comment there;
 *  - ttwu_queue()       -- new rq, for enqueue of the task;
 *  - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
 *
 * As a consequence we race really badly with just about everything. See the
 * many memory barriers and their comments for details.
 *
 * Return: %true if @p->state changes (an actual wakeup was done),
 *         %false otherwise.
 */

Analyze the comment above and explain the design rationale behind it.
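One way to read the BLOCKING part of that comment: because no chain of runqueue locks connects the CPU that switches the task out with the CPU that wakes it, the on_cpu field itself carries the ordering. The old CPU clears it with a release store after it has finished with the task, and the waker waits for it to clear with an acquire load before touching the task. Below is a minimal userspace sketch of that handshake, assuming C11 atomics. The struct toy_task and the functions toy_finish_task and toy_wait_then_read are hypothetical, heavily reduced stand-ins, not the kernel's task_struct, finish_task() or try_to_wake_up().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the kernel's task_struct. */
struct toy_task {
    int saved_context;  /* state written while the task is switched out */
    atomic_int on_cpu;  /* 1 while the task still runs on its old CPU */
};

/* Old CPU: finish switching the task out, then clear on_cpu with release
 * ordering. Mirrors step 1) in the comment, smp_store_release(X->on_cpu, 0). */
void toy_finish_task(struct toy_task *t, int context)
{
    assert(t != NULL);
    t->saved_context = context;
    atomic_store_explicit(&t->on_cpu, 0, memory_order_release);
}

/* Waker: spin until on_cpu reads 0 with acquire ordering, and only then
 * read the task's state. Mirrors step 2) in the comment,
 * smp_cond_load_acquire(&X->on_cpu, !VAL). */
int toy_wait_then_read(struct toy_task *t)
{
    assert(t != NULL);
    while (atomic_load_explicit(&t->on_cpu, memory_order_acquire) != 0)
        ;  /* busy-wait until the old CPU is done with the task */
    return t->saved_context;
}
```

In a real scenario the two functions would run on different CPUs. The release/acquire pair ensures that everything the old CPU did to the task happens-before anything the waker, and later the new CPU, does with it, which is the same guarantee the lock chain provides in the MIGRATION case.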