Copy all files from the ~/bcc/build/src/python/bcc-python3/bcc directory to /usr/lib/python3/dist-packages/bcc
To copy all files from `~/bcc/build/src/python/bcc-python3/bcc` into `/usr/lib/python3/dist-packages/bcc`, follow these steps:
1. Open a terminal and change into the source directory:
```
cd ~/bcc/build/src/python/bcc-python3/bcc
```
2. Use the `rsync` command to synchronize the directory contents, preserving file permissions and attributes:
```
rsync -avh --copy-links . /usr/lib/python3/dist-packages/bcc
```
`-a` enables archive mode: recursive copying that preserves permissions, timestamps, and other attributes;
`-v` enables verbose output, showing what is being transferred;
`-h` prints sizes in human-readable units;
`--copy-links` copies the files that symbolic links point to, rather than the symlinks themselves.
3. Check the target directory to confirm that the files were copied successfully:
```
cd /usr/lib/python3/dist-packages/bcc
ls
```
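As an additional check, you can confirm that the interpreter actually imports the copied package (assuming `python3` is the interpreter that uses `/usr/lib/python3/dist-packages`):
```
python3 -c "import bcc; print(bcc.__file__)"
```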
Note that on some systems, writing to `/usr/lib/python3/dist-packages` requires administrator privileges (sudo).
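In that case, the same `rsync` invocation can simply be prefixed with `sudo`:
```
sudo rsync -avh --copy-links . /usr/lib/python3/dist-packages/bcc
```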
Related question
Open the bcc lib library produced by building bcc and replace the installed one. The location of the built library depends on where the git repository was cloned; the Python3 bcc library lives by default in /usr/lib/python3/dist-packages/bcc. How do I replace it with the build output generated by cmake?
To replace the installed bcc library with the one produced by building bcc, follow these steps:
1. **Locate the built library files**:
First, find the library files produced by the bcc build. They are usually under the `build` directory of your cloned bcc project. Assuming the project lives at `~/bcc`, the libraries are likely under `~/bcc/build/lib`; for example:
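A quick way to locate the built artifacts, assuming the build-tree layout used above (adjust the paths if your clone lives elsewhere):
```sh
# Shared libraries produced by the build
find ~/bcc/build -name "libbcc*"
# Python bindings produced by the build
ls ~/bcc/build/src/python/bcc-python3/bcc
```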
2. **Back up the existing bcc library**:
Before replacing it, back up the currently installed bcc library files in case anything goes wrong.
```sh
sudo cp -r /usr/lib/python3/dist-packages/bcc /usr/lib/python3/dist-packages/bcc_backup
```
3. **Copy the new library files**:
Copy the newly built library files into the bcc installation directory.
```sh
sudo cp ~/bcc/build/lib/* /usr/lib/python3/dist-packages/bcc/
```
4. **Update the shared-library cache**:
If shared libraries were replaced, run `ldconfig` to refresh the dynamic linker cache.
```sh
sudo ldconfig
```
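To confirm that the dynamic linker now sees the replaced library (assuming the shared object is named `libbcc.so`, bcc's usual library name), you can query the cache:
```sh
ldconfig -p | grep libbcc
```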
5. **Verify the replacement**:
After the files are replaced, verify the result by importing the bcc module and running a small piece of example code.
```python
import bcc
print(bcc.__version__)
```
6. **Build with the cmake-generated build files**:
If you are building with the cmake-generated build files, the usual sequence is:
```sh
cd ~/bcc
mkdir -p build
cd build
cmake ..
make
sudo make install
```
Depending on the cmake configuration, this installs the new library files, including the Python bindings under `/usr/lib/python3/dist-packages/bcc`.
Following these steps, you can replace the library files produced by the bcc build.
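If the Python3 bindings are not produced by default, bcc's install guide builds them explicitly by passing `-DPYTHON_CMD=python3` to cmake and building the `src/python` target; a sketch of that flow, assuming the same `~/bcc/build` directory:
```sh
cd ~/bcc/build
cmake -DPYTHON_CMD=python3 ..   # regenerate build files with Python3 bindings enabled
make -C src/python              # build the Python package
sudo make install               # install, including the bindings
```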
Related question
```
NPUDoublePsEmbFunctor Init Done. Table: UnirecEmbedding
[E compiler_depend.ts:280] call aclnnMatmul failed, detail:EZ9903: [PID: 543476] 2025-06-16-10:46:41.752.975
rtKernelLaunchWithFlagV2 failed: 107003
Solution: In this scenario, collect the plog when the fault occurs and locate the fault based on the plog.
TraceBack (most recent call last):
Kernel launch failed, stream is not in current ctx, stream_id=3.[FUNC:KernelLaunch][FILE:api_impl.cc][LINE:465]
rtKernelLaunchWithFlagV2 execute failed, reason=[stream not in current context][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
rtKernelLaunchWithFlagV2 failed: 107003
#### KernelLaunch failed: /usr/local/Ascend/ascend-toolkit/8.0.T60/opp/built-in/op_impl/ai_core/tbe//kernel/ascend910b/mem_set/MemSet_1a6864193b99ef93ef38616f04a712ab_high_performance.o
Memset Output Tensor failed
Kernel Run failed. opType: 3, MatMulV2 launch failed for MatMulV2, errno:361001.
[ERROR] 2025-06-16-10:46:41 (PID:543476, Device:0, RankID:-1) ERR01100 OPS call acl api failed
Exception raised from operator() at build/CMakeFiles/torch_npu.dir/compiler_depend.ts:84 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7ff5a3392617 in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7ff5a334d98d in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0xed1162 (0x7ff4fcad1162 in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #3: <unknown function> + 0x14c27bc (0x7ff4fd0c27bc in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #4: <unknown function> + 0x7365b7 (0x7ff4fc3365b7 in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #5: <unknown function> + 0x736a76 (0x7ff4fc336a76 in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #6: <unknown function> + 0x73468f (0x7ff4fc33468f in /usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so)
frame #7: <unknown function> + 0xdbbf4 (0x7ff5736c7bf4 in /usr/local/miniconda3/lib/libstdc++.so.6)
frame #8: <unknown function> + 0x8937a (0x7ff5bcc8b37a in /lib64/libc.so.6)
frame #9: <unknown function> + 0x109e4c (0x7ff5bcd0be4c in /lib64/libc.so.6)
[W compiler_depend.ts:286] Warning: (function ExecFunc)
Traceback (most recent call last):
  File "/workspace/doupleps_emb_test.py", line 258, in <module>
    test_simple_trainning()
  File "/workspace/doupleps_emb_test.py", line 158, in test_simple_trainning
    my_out0, my_loss, my_out1 = _training_loop(my_model, lr, "unirec")
  File "/workspace/doupleps_emb_test.py", line 119, in _training_loop
    out = model(index)
  File "/usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/doupleps_emb_test.py", line 44, in forward
    x0 = self.linear1(emb_out)
  File "/usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/miniconda3/envs/torchrec/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: The Inner error is reported as above. The process exits for this inner error, and the current working operator name is aclnnMatmul. Since the operator is called asynchronously, the stacktrace may be inaccurate. If you want to get the accurate stacktrace, pleace set the environment variable ASCEND_LAUNCH_BLOCKING=1.
[ERROR] 2025-06-16-10:46:41 (PID:543476, Device:0, RankID:-1) ERR00100 PTA call acl api failed
[W compiler_depend.ts:305] Warning: NPU warning, error code is 107003[Error]: [Error]: The stream is not in the current context. Check whether the context where the stream is located is the same as the current context.
EE9999: Inner Error!
EE9999: [PID: 543476] 2025-06-16-10:46:41.760.771 Stream destroy failed, stream is not in current ctx, stream_id=3.[FUNC:StreamDestroy][FILE:api_impl.cc][LINE:1104]
TraceBack (most recent call last):
rtStreamDestroyForce execute failed, reason=[stream not in current context][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
destroy stream force failed, runtime result = 107003[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
(function operator())
I0616 10:46:42.707563 543476 hbm_table_manager.h:75] NpuHbmSparseTableManager destructor called
Segmentation fault (core dumped)
```
### Fixing the failed aclnnMatmul call
Error codes `EZ9903` and `107003` are usually related to NPU environment configuration, context management, and the parameters of the matrix-multiplication operation. A detailed analysis and suggested fixes follow.
#### Causes of error code EZ9903
Error code `EZ9903` means the matrix multiplication did not complete successfully when `aclnnMatmul` was called. Possible causes include:
- The input tensor shapes do not match, or the data type is not supported[^1].
- The device does not have enough free memory to allocate the resources needed for the matrix operation.
- The current context is not set correctly, so the NPU stream is not in the current context[^2].
To address this, check the following:
- Make sure the input tensor dimensions satisfy the requirements of matrix multiplication.
- Monitor device memory usage with Ascend's `npu-smi` tool (the NPU counterpart of `nvidia-smi`; see the command below) and make sure enough memory is available.
- Check that the context and device are set correctly, for example via `acl.rt.set_context` and `acl.rt.set_device`[^1].
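A minimal check of device health and memory usage (assuming the Ascend driver's `npu-smi` tool is on the PATH, as it is on a standard driver installation):
```sh
# Show chip status, memory usage, and utilization for all NPUs
npu-smi info
```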
#### Causes of error code 107003
Error code `107003` is usually related to NPU stream and context management. Possible causes include:
- The current stream is not bound to the active context.
- Context initialization failed, or the context was not released correctly.
- The parameters passed at kernel launch exceed hardware limits.
To address this, try the following:
- Make sure the context and stream are correctly initialized before calling `aclnnMatmul`.
- Check that context resources are released properly to avoid leaks.
- Following the referenced note[^2], adjust the environment variable settings in `.bashrc` so that only the required libraries are loaded and conflicting versions are avoided (a sketch follows this list).
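A sketch of the corresponding environment setup; the CANN toolkit prefix `/usr/local/Ascend/ascend-toolkit` is taken from the error log above, so adjust it to your installation:
```sh
# Load the CANN toolkit environment (LD_LIBRARY_PATH, PYTHONPATH, etc.)
source /usr/local/Ascend/ascend-toolkit/set_env.sh

# Report NPU errors synchronously so the failing operator and stack trace are accurate
export ASCEND_LAUNCH_BLOCKING=1
```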
#### Example: verifying the context setup for the matmul
The following is a minimal sketch (assuming `torch_npu` is installed and NPU device 0 is available) that exercises the same `aclnnMatmul` path shown in the error log; with `ASCEND_LAUNCH_BLOCKING=1` set, a context/stream mismatch such as error 107003 is reported synchronously at the failing call:
```python
import os

# Report NPU errors synchronously so the failing call is identified accurately.
# This must be set before the first NPU operation.
os.environ["ASCEND_LAUNCH_BLOCKING"] = "1"

import torch
import torch_npu  # registers the "npu" device with PyTorch

# Bind this process to NPU device 0 (torch_npu manages the context and stream).
torch.npu.set_device(0)

try:
    # Same shapes as in the original example: (3, 4) x (4, 5) -> (3, 5)
    input_a = torch.rand(3, 4, device="npu")
    input_b = torch.rand(4, 5, device="npu")

    # Dispatches to aclnnMatmul on the NPU.
    output = torch.matmul(input_a, input_b)

    # Wait for the kernel; context/stream errors such as 107003 surface here.
    torch.npu.synchronize()
    print("Matmul succeeded, output shape:", tuple(output.shape))
except RuntimeError as e:
    print(f"Matmul failed: {e}")
```
#### Driver version and library compatibility
If the problem persists, the driver version or runtime libraries may be incompatible with the hardware. Following the referenced note[^3], upgrade the NPU driver to the latest version and rebuild/reinstall the software stack so that the driver, CANN toolkit, and framework plugin versions match.
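Before upgrading, it can help to record the currently installed versions; the paths below are the default Ascend install locations seen in the log and may differ on your system:
```sh
# Installed CANN toolkit versions (the log above shows 8.0.T60 here)
ls /usr/local/Ascend/ascend-toolkit/

# Driver/firmware status as reported by the device tool
npu-smi info
```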