How torch.nn.Bilinear Computes Its Output

License: CC 4.0 BY-SA

1. Official documentation for torch.nn.Bilinear
2. The author's explanation
3. Example

1. Official documentation for torch.nn.Bilinear

(Figure: screenshot of the official PyTorch documentation for torch.nn.Bilinear.)

2. The author's explanation

To instantiate the torch.nn.Bilinear class, write Bilinear_layer = nn.Bilinear(in1_features, in2_features, out_features).

Calling the Bilinear_layer object requires two inputs, $x_1$ and $x_2$, for example output = Bilinear_layer(input1, input2). Here input1.shape is (batch_size, in1_features) and input2.shape is (batch_size, in2_features).

The weight W of the bilinear layer has shape (out_features, in1_features, in2_features);
the bias B of the bilinear layer has shape (out_features,);
$x_1$ has shape (batch_size, in1_features);
$x_2$ has shape (batch_size, in2_features).
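
The shapes listed above can be checked directly on a layer instance. The following is a minimal sketch; the sizes 10, 20, 30 and the batch size 4 are arbitrary values chosen for illustration, and the variable names are not part of the PyTorch API.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions; any positive integers work)
in1_features, in2_features, out_features = 10, 20, 30
batch_size = 4

layer = nn.Bilinear(in1_features, in2_features, out_features)

x1 = torch.randn(batch_size, in1_features)
x2 = torch.randn(batch_size, in2_features)
out = layer(x1, x2)

print(tuple(layer.weight.shape))  # (30, 10, 20) -> (out_features, in1_features, in2_features)
print(tuple(layer.bias.shape))    # (30,)        -> (out_features,)
print(tuple(out.shape))           # (4, 30)      -> (batch_size, out_features)
```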

The bilinear transformation $y = x_1^T W x_2 + B$ is computed as follows:

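1. Loop over dimension 0 of W (of size out_features). In each iteration, take the slice W[k] of shape (in1_features, in2_features) and perform steps 2 and 3.
2. Matrix-multiply $x_1$ of shape (batch_size, in1_features) by W[k] of shape (in1_features, in2_features), then multiply the result element-wise by $x_2$ of shape (batch_size, in2_features). This gives an array of shape (batch_size, in2_features).
3. Sum that array over its in2_features dimension (axis 1), which gives a vector of shape (batch_size,).
4. After steps 1 to 3 there are out_features vectors of shape (batch_size,). Stack them as columns to form an array of shape (batch_size, out_features), then add the bias to obtain the output $y$.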
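
Written per element, the steps above amount to the following sum, where the index names $b$ (batch), $k$ (output feature), $i$ and $j$ are introduced here only for illustration:

$$
y_{b,k} = \sum_{i=1}^{\text{in1\_features}} \sum_{j=1}^{\text{in2\_features}} (x_1)_{b,i} \, W_{k,i,j} \, (x_2)_{b,j} + B_k
$$

As a sketch of the same contraction without an explicit loop, the code below uses torch.einsum and compares it with the layer's own output. The sizes, the seed, and the sample and feature indices picked for the single-element check are arbitrary assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so repeated runs use the same data

layer = nn.Bilinear(10, 20, 30)
x1 = torch.randn(128, 10)
x2 = torch.randn(128, 20)

with torch.no_grad():
    expected = layer(x1, x2)  # (128, 30)

    # y[b, k] = sum_i sum_j x1[b, i] * W[k, i, j] * x2[b, j] + B[k]
    y_einsum = torch.einsum("bi,kij,bj->bk", x1, layer.weight, x2) + layer.bias

    # Literal x1^T W x2 + B for one sample (b = 0) and one output feature (k = 3)
    single = x1[0] @ layer.weight[3] @ x2[0] + layer.bias[3]

print(torch.allclose(y_einsum, expected, atol=1e-4))
print(torch.allclose(single, expected[0, 3], atol=1e-4))
```

The einsum subscripts mirror the index names in the formula, so the string "bi,kij,bj->bk" can be read as a direct transcription of the double sum.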

3. Example

import torch
import torch.nn as nn
import numpy as np


m = nn.Bilinear(10, 20, 30)
input1 = torch.randn(128, 10)
input2 = torch.randn(128, 20)
output = m(input1, input2)
print(output.size())
arr_output = output.detach().cpu().numpy()

# Copy the parameters and inputs to NumPy and recompute y with the procedure described above
weight = m.weight.detach().cpu().numpy()
bias = m.bias.detach().cpu().numpy()
x1 = input1.detach().cpu().numpy()
x2 = input2.detach().cpu().numpy()
print(x1.shape, weight.shape, x2.shape, bias.shape)
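y = np.zeros((x1.shape[0], weight.shape[0]))    # (128, 30)
for k in range(weight.shape[0]):
    buff = np.dot(x1, weight[k])    # (128, 10) @ (10, 20) -> (128, 20)
    buff = buff * x2    # element-wise product, (128, 20)
    buff = np.sum(buff, axis=1)    # sum over in2_features, (128,)
    y[:, k] = buff
y += bias    # the (30,) bias is broadcast over the batch dimension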

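# Mean absolute error between the manual result and the PyTorch output
dif = y - arr_output
print(np.mean(np.abs(dif.flatten())))

# Output (the last value varies from run to run because the inputs and weights are random):
# torch.Size([128, 30])
# (128, 10) (30, 10, 20) (128, 20) (30,)
# 1.6828271327540277e-07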
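
A mean error of about 1e-7 is consistent with single-precision rounding rather than with a mistake in the formula. The NumPy-only sketch below illustrates this under stated assumptions: the arrays are made-up random stand-ins with the same shapes as in the example (not the actual layer parameters), and the helper name bilinear is hypothetical. It evaluates the same bilinear form once in float32 and once in float64 and compares the two.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the data are stand-ins, not real layer weights

# Same shapes as x1, x2, weight and bias in the example above
x1 = rng.standard_normal((128, 10)).astype(np.float32)
x2 = rng.standard_normal((128, 20)).astype(np.float32)
weight = rng.uniform(-0.3, 0.3, size=(30, 10, 20)).astype(np.float32)
bias = rng.uniform(-0.3, 0.3, size=30).astype(np.float32)


def bilinear(x1, x2, weight, bias):
    """y[b, k] = sum_i sum_j x1[b, i] * weight[k, i, j] * x2[b, j] + bias[k]."""
    return np.einsum("bi,kij,bj->bk", x1, weight, x2) + bias


y32 = bilinear(x1, x2, weight, bias)  # computed in float32
y64 = bilinear(*(a.astype(np.float64) for a in (x1, x2, weight, bias)))  # float64 reference

mean_abs_err = np.mean(np.abs(y32 - y64))
print(y32.dtype, y64.dtype)
print(mean_abs_err, np.finfo(np.float32).eps)  # error next to float32 machine epsilon
print(np.allclose(y32, y64, atol=1e-4))
```

When comparing results like these, a tolerance-based check such as np.allclose is usually more informative than inspecting the raw mean error by eye.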