【PyTorch】【distributions】normal.py

资源存储库

Last modified: 2024-12-29 11:45:05

License: CC 4.0 BY-SA

Categories: notes, python. Tags: pytorch, artificial intelligence, python

First published: 2024-12-29 11:44:18

Article link: https://blog.csdn.net/wq6qeg88/article/details/144802859

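This code defines a Normal (Gaussian) distribution class that supports sampling, log probability, the cumulative distribution function (CDF), and related computations.

The class inherits from ExponentialFamily and stores the distribution's parameters as PyTorch tensors.

The sections below go through each method and its formula in turn.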

1. Class construction and properties:

Constructor

def __init__(self, loc, scale, validate_args=None):
    self.loc, self.scale = broadcast_all(loc, scale)
    if isinstance(loc, Number) and isinstance(scale, Number):
        batch_shape = torch.Size()
    else:
        batch_shape = self.loc.size()
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)

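loc: the mean of the normal distribution, usually written μ.
scale: the standard deviation of the normal distribution, usually written σ (not the variance $\sigma^2$).
batch_shape: if loc and scale are both plain Python numbers, the batch shape is the empty torch.Size() (a single scalar distribution); otherwise loc and scale are first broadcast against each other by broadcast_all, and the batch shape is the shape of the broadcast loc.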
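The batch-shape rule can be checked with a short sketch. It assumes the public torch.distributions.Normal behaves like the constructor shown here; the variable names are illustrative only.

```python
import torch
from torch.distributions import Normal

# Both parameters are plain Python numbers: the batch shape is empty.
scalar_dist = Normal(0.0, 1.0)
scalar_batch_shape = scalar_dist.batch_shape

# A (3,) loc and a (2, 1) scale broadcast to a (2, 3) batch of distributions.
batched_dist = Normal(torch.zeros(3), torch.ones(2, 1))
batched_batch_shape = batched_dist.batch_shape

print(scalar_batch_shape)   # torch.Size([])
print(batched_batch_shape)  # torch.Size([2, 3])
```

After construction, both loc and scale carry the broadcast shape, so later methods never need to broadcast them again.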

Properties:

mean: returns the mean μ.
stddev: returns the standard deviation σ.
variance: returns the variance, $\sigma^2$.
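As a small illustration of these three properties (the parameter values are arbitrary, chosen only for this example):

```python
import torch
from torch.distributions import Normal

dist = Normal(torch.tensor([1.0, -2.0]), torch.tensor([0.5, 3.0]))

mean = dist.mean          # same tensor values as loc
stddev = dist.stddev      # same tensor values as scale
variance = dist.variance  # stddev squared, element by element

print(mean, stddev, variance)
```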

2. Sampling methods:

`sample()` method

def sample(self, sample_shape=torch.Size()):
    shape = self._extended_shape(sample_shape)
    with torch.no_grad():
        return torch.normal(self.loc.expand(shape), self.scale.expand(shape))

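sample draws from the normal distribution and returns a tensor of shape sample_shape + batch_shape. The draw happens inside torch.no_grad(), so no gradient flows back to loc or scale.
torch.normal(mean, std) generates the normally distributed random samples, where mean is the mean and std is the standard deviation.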
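A brief sketch of the resulting shape and the missing gradient, assuming a batch of three distributions; the seed and sizes are arbitrary.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)
loc = torch.zeros(3, requires_grad=True)
dist = Normal(loc, torch.ones(3))  # batch_shape is (3,)

# Five independent draws for each of the three batch elements.
draws = dist.sample((5,))

print(draws.shape)          # torch.Size([5, 3])
print(draws.requires_grad)  # False, because sample() runs under no_grad
```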

`rsample()` method

def rsample(self, sample_shape=torch.Size()):
    shape = self._extended_shape(sample_shape)
    eps = _standard_normal(shape, dtype=self.loc.dtype, device=self.loc.device)
    return self.loc + eps * self.scale

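Reparameterized sampling: a technique widely used in deep learning, especially in variational inference and generative models. A sample is produced by applying an affine transformation ( $\mu + \epsilon \cdot \sigma$ ) to a standard normal sample eps, so the result stays differentiable with respect to loc and scale.
eps is noise drawn from the standard normal distribution $\mathcal{N}(0, 1)$.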
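The practical consequence is that gradients can flow through the sample. The following sketch is illustrative only; the loss (the sample mean) is chosen just to make the gradients easy to verify by hand.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)
loc = torch.tensor(1.0, requires_grad=True)
scale = torch.tensor(2.0, requires_grad=True)
dist = Normal(loc, scale)

# z = loc + eps * scale, so z is a differentiable function of loc and scale.
z = dist.rsample((1000,))
z.mean().backward()

# Recover the noise that was used, to check the gradient with respect to scale.
eps = (z.detach() - loc.detach()) / scale.detach()

print(loc.grad)                # d mean(z) / d loc = 1
print(scale.grad, eps.mean())  # d mean(z) / d scale = mean(eps)
```

Calling sample() instead of rsample() here would make backward() fail, since the result would not be attached to the computation graph.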

3. Log probability:

def log_prob(self, value):
    if self._validate_args:
        self._validate_sample(value)
    var = (self.scale ** 2)
    log_scale = math.log(self.scale) if isinstance(self.scale, Real) else self.scale.log()
    return -((value - self.loc) ** 2) / (2 * var) - log_scale - math.log(math.sqrt(2 * math.pi))

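This method computes the log probability density of the normal distribution at value:

$\log p(x) = -\frac{(x - \mu)^2}{2\sigma^2} - \log(\sigma) - \frac{1}{2} \log(2\pi)$

Here μ is the mean, σ is the standard deviation, and x is the observed value.
This is the logarithm of the normal probability density function (PDF); the last term in the code, math.log(math.sqrt(2 * math.pi)), equals $\frac{1}{2} \log(2\pi)$.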
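To see that the method and the formula agree, the sketch below evaluates both at one point; the values of mu, sigma, and x are arbitrary.

```python
import math

import torch
from torch.distributions import Normal

mu, sigma, x = 1.0, 2.0, 0.5

# Value returned by the library method.
library_log_prob = Normal(mu, sigma).log_prob(torch.tensor(x)).item()

# The same quantity written out term by term from the formula.
manual_log_prob = (
    -((x - mu) ** 2) / (2 * sigma ** 2)
    - math.log(sigma)
    - 0.5 * math.log(2 * math.pi)
)

print(library_log_prob, manual_log_prob)
```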

4. Cumulative distribution function (CDF):

def cdf(self, value):
    if self._validate_args:
        self._validate_sample(value)
    return 0.5 * (1 + torch.erf((value - self.loc) * self.scale.reciprocal() / math.sqrt(2)))

This method computes the cumulative distribution function $P(X \leq x)$ of the normal distribution:

$F(x) = \frac{1}{2} \left[ 1 + \text{erf}\left(\frac{x - \mu}{\sigma \sqrt{2}}\right) \right]$

Here erf is the error function, which PyTorch implements as torch.erf.
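A quick numerical check of this formula, using math.erf from the Python standard library as an independent reference; the parameter values are arbitrary.

```python
import math

import torch
from torch.distributions import Normal

mu, sigma, x = 1.0, 2.0, 0.5
dist = Normal(mu, sigma)

library_cdf = dist.cdf(torch.tensor(x)).item()
manual_cdf = 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# At the mean, exactly half of the probability mass lies to the left.
cdf_at_mean = dist.cdf(torch.tensor(mu)).item()

print(library_cdf, manual_cdf, cdf_at_mean)
```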

5. Inverse cumulative distribution function (ICDF):

def icdf(self, value):
    return self.loc + self.scale * torch.erfinv(2 * value - 1) * math.sqrt(2)

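The inverse cumulative distribution function (inverse CDF), also called the quantile function, maps a probability value p (between 0 and 1) back to the corresponding value x of the random variable.
The formula is:

$x = \mu + \sigma \cdot \sqrt{2} \cdot \text{erf}^{-1}(2p - 1)$

Here p is the probability value and $\text{erf}^{-1}$ is the inverse of the error function.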
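Because icdf inverts cdf, applying one after the other should return the original probabilities. A minimal sketch, with arbitrarily chosen probabilities:

```python
import torch
from torch.distributions import Normal

dist = Normal(1.0, 2.0)
p = torch.tensor([0.025, 0.5, 0.975])

# Quantiles for the given probabilities; p = 0.5 maps to the mean.
quantiles = dist.icdf(p)

# Feeding the quantiles back into the CDF recovers p (up to rounding).
round_trip = dist.cdf(quantiles)

print(quantiles)
print(round_trip)
```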

6. Entropy:

def entropy(self):
    return 0.5 + 0.5 * math.log(2 * math.pi) + torch.log(self.scale)

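Entropy is a concept from information theory that measures the uncertainty of a random variable. The (differential) entropy of the normal distribution is:

$H(X) = \frac{1}{2} \log(2\pi e \sigma^2)$

Here self.scale is the standard deviation σ. Expanding the logarithm gives $\frac{1}{2} + \frac{1}{2} \log(2\pi) + \log(\sigma)$, which is exactly the expression the code returns.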
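The closed form and the library method can be compared directly; sigma below is an arbitrary example value, and the mean plays no role in the entropy.

```python
import math

from torch.distributions import Normal

sigma = 2.0

library_entropy = Normal(0.0, sigma).entropy().item()
manual_entropy = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

# Shifting the mean leaves the entropy unchanged.
shifted_entropy = Normal(10.0, sigma).entropy().item()

print(library_entropy, manual_entropy, shifted_entropy)
```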

7. Natural parameters:

@property
def _natural_params(self):
    return (self.loc / self.scale.pow(2), -0.5 * self.scale.pow(2).reciprocal())

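The natural parameters are one way of parameterizing a distribution in the exponential family. For the normal distribution:

$\eta_1 = \frac{\mu}{\sigma^2}, \quad \eta_2 = -\frac{1}{2\sigma^2}$

Natural parameters matter in variational inference and optimization because they put every exponential-family distribution in a common form; in this class, the ExponentialFamily base class is built around them together with the log normalizer.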
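The mapping between (μ, σ) and (η₁, η₂) can be inverted, which the sketch below illustrates. Note that _natural_params is a private property taken from the listing in this article, so it may change between PyTorch versions; the recovered_* names are hypothetical.

```python
import torch
from torch.distributions import Normal

mu, sigma = torch.tensor(1.0), torch.tensor(2.0)
dist = Normal(mu, sigma)

# Private property from the listing: (mu / sigma^2, -1 / (2 * sigma^2)).
eta1, eta2 = dist._natural_params

# Invert the mapping to get the usual parameters back.
recovered_var = -1.0 / (2.0 * eta2)
recovered_mu = eta1 * recovered_var

print(eta1, eta2)
print(recovered_mu, recovered_var.sqrt())
```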

8. Log normalizer:

def _log_normalizer(self, x, y):
    return -0.25 * x.pow(2) / y + 0.5 * torch.log(-math.pi / y)

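The log normalizer $A(\eta)$ is the term that makes the exponential-family density integrate to 1. In the code, x and y are the natural parameters $\eta_1$ and $\eta_2$, so the method returns:

$A(\eta_1, \eta_2) = -\frac{\eta_1^2}{4\eta_2} + \frac{1}{2} \log\left(-\frac{\pi}{\eta_2}\right)$

Substituting $\eta_1 = \frac{\mu}{\sigma^2}$ and $\eta_2 = -\frac{1}{2\sigma^2}$ gives:

$A = \frac{\mu^2}{2\sigma^2} + \frac{1}{2} \log(2\pi \sigma^2)$

The second term is the negative log of the familiar Gaussian normalizing coefficient $\frac{1}{\sqrt{2\pi \sigma^2}}$; the first term comes from expanding $(x - \mu)^2$ in the exponent.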
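Working the code through with the natural parameters from section 7 gives $\frac{\mu^2}{2\sigma^2} + \frac{1}{2} \log(2\pi \sigma^2)$. The sketch below checks this numerically; it relies on the private members _natural_params and _log_normalizer from the listing, which are not part of the public API, and the example values are arbitrary.

```python
import math

import torch
from torch.distributions import Normal

mu, sigma = torch.tensor(1.0), torch.tensor(2.0)
dist = Normal(mu, sigma)

# Evaluate the log normalizer at the distribution's own natural parameters.
eta1, eta2 = dist._natural_params
log_normalizer = dist._log_normalizer(eta1, eta2)

# Closed form in terms of mu and sigma.
expected = mu ** 2 / (2 * sigma ** 2) + 0.5 * torch.log(2 * math.pi * sigma ** 2)

print(log_normalizer, expected)
```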

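Summary

This code defines a Normal distribution class that supports sampling, log probability, cumulative probability, entropy, and related operations in PyTorch. The main formulas are:

Log probability: $\log p(x) = -\frac{(x - \mu)^2}{2\sigma^2} - \log(\sigma) - \frac{1}{2} \log(2\pi)$
Cumulative distribution function: $F(x) = \frac{1}{2} \left[ 1 + \text{erf}\left(\frac{x - \mu}{\sigma \sqrt{2}}\right) \right]$
Inverse cumulative distribution function: $x = \mu + \sigma \cdot \sqrt{2} \cdot \text{erf}^{-1}(2p - 1)$
Entropy: $H(X) = \frac{1}{2} \log(2\pi e \sigma^2)$

Together, these formulas and methods cover the basic functionality of a normal distribution and can be used for probability calculations, sample generation, variational inference, and similar tasks.

The complete listing discussed above follows.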

import math
from numbers import Number, Real

import torch
from torch.distributions import constraints
from torch.distributions.exp_family import ExponentialFamily
from torch.distributions.utils import _standard_normal, broadcast_all


class Normal(ExponentialFamily):
    r"""
    Creates a normal (also called Gaussian) distribution parameterized by
    :attr:`loc` and :attr:`scale`.

    Example::

        >>> m = Normal(torch.tensor([0.0]), torch.tensor([1.0]))
        >>> m.sample()  # normally distributed with loc=0 and scale=1
        tensor([ 0.1046])

    Args:
        loc (float or Tensor): mean of the distribution (often referred to as mu)
        scale (float or Tensor): standard deviation of the distribution
            (often referred to as sigma)
    """
    arg_constraints = {'loc': constraints.real, 'scale': constraints.positive}
    support = constraints.real
    has_rsample = True
    _mean_carrier_measure = 0

    @property
    def mean(self):
        return self.loc

    @property
    def stddev(self):
        return self.scale

    @property
    def variance(self):
        return self.stddev.pow(2)

    def __init__(self, loc, scale, validate_args=None):
        self.loc, self.scale = broadcast_all(loc, scale)
        if isinstance(loc, Number) and isinstance(scale, Number):
            batch_shape = torch.Size()
        else:
            batch_shape = self.loc.size()
        super(Normal, self).__init__(batch_shape, validate_args=validate_args)

    def expand(self, batch_shape, _instance=None):
        new = self._get_checked_instance(Normal, _instance)
        batch_shape = torch.Size(batch_shape)
        new.loc = self.loc.expand(batch_shape)
        new.scale = self.scale.expand(batch_shape)
        super(Normal, new).__init__(batch_shape, validate_args=False)
        new._validate_args = self._validate_args
        return new

    def sample(self, sample_shape=torch.Size()):
        shape = self._extended_shape(sample_shape)
        with torch.no_grad():
            return torch.normal(self.loc.expand(shape), self.scale.expand(shape))

    def rsample(self, sample_shape=torch.Size()):
        shape = self._extended_shape(sample_shape)
        eps = _standard_normal(shape, dtype=self.loc.dtype, device=self.loc.device)
        return self.loc + eps * self.scale

    def log_prob(self, value):
        if self._validate_args:
            self._validate_sample(value)
        # compute the variance
        var = (self.scale ** 2)
        log_scale = math.log(self.scale) if isinstance(self.scale, Real) else self.scale.log()
        return -((value - self.loc) ** 2) / (2 * var) - log_scale - math.log(math.sqrt(2 * math.pi))

    def cdf(self, value):
        if self._validate_args:
            self._validate_sample(value)
        return 0.5 * (1 + torch.erf((value - self.loc) * self.scale.reciprocal() / math.sqrt(2)))

    def icdf(self, value):
        return self.loc + self.scale * torch.erfinv(2 * value - 1) * math.sqrt(2)

    def entropy(self):
        return 0.5 + 0.5 * math.log(2 * math.pi) + torch.log(self.scale)

    @property
    def _natural_params(self):
        return (self.loc / self.scale.pow(2), -0.5 * self.scale.pow(2).reciprocal())

    def _log_normalizer(self, x, y):
        return -0.25 * x.pow(2) / y + 0.5 * torch.log(-math.pi / y)