  File "/home/user/work/AI/sentout/utils/networks2_quant.py", line 138, in forward
    output = self.float_func.add(output, row_output)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/ao/nn/quantized/modules/functional_modules.py", line 45, in add
    r = torch.add(x, y)
        ^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::add.out' with arguments from the 'QuantizedCPU' backend

Time: 2025-07-15 20:56:52
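This error means that, inside a PyTorch quantized model, the float operator `torch.add` was called on quantized tensors (the `QuantizedCPU` backend), and that operator has no kernel registered for this backend. The traceback shows a plain `torch.add(x, y)` inside `functional_modules.py`, which suggests that `self.float_func` is still a `FloatFunctional` that was never swapped for its quantized counterpart. Detailed analysis and solutions follow:

---

### **1. Cause of the error**

- **Operator restrictions on quantized tensors**: quantized tensors (`QuantizedCPU`) support only a subset of the regular `torch.*` operators, and **element-wise addition through `torch.add` is not one of them**. Quantized arithmetic needs output quantization parameters (`scale`/`zero_point`), so quantized addition is exposed separately as `torch.ops.quantized.add`, which `QFunctional.add` wraps.
- **Typical triggers**:
  1. Adding the outputs of two quantized layers directly inside a quantized model (residual connections, multi-branch fusion).
  2. The `FloatFunctional` that wraps the addition was not converted to `QFunctional` (for example, it had no qconfig or was not part of the module that was prepared and converted), so the float `torch.add` receives quantized inputs.

---

### **2. Solutions**

#### **Method 1: Dequantize explicitly before the operation**

For tensors that need to be added, dequantize them first (`dequantize`), perform the addition in floating point, then quantize the result again:

```python
import torch
from torch.ao.quantization import DeQuantStub, QuantStub

class QuantizedAddModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dequant = DeQuantStub()  # converts quantized tensors back to float
        self.quant = QuantStub()      # re-quantizes the sum; its scale/zero_point come from calibration

    def forward(self, x, y):
        x_float = self.dequant(x)      # dequantize to float
        y_float = self.dequant(y)
        sum_float = x_float + y_float  # float addition
        return self.quant(sum_float)   # re-quantize
```

#### **Method 2: Use `QFunctional.add` (requires support in your PyTorch version)**

`QFunctional` provides a quantized `add`. Whether it is available depends on your PyTorch version and build, so check the documentation for your release:

```python
from torch.ao.nn.quantized import QFunctional

qfunc = QFunctional()
# scale/zero_point describe the OUTPUT of the add;
# convert() normally fills them in from calibration statistics
qfunc.scale = ...
qfunc.zero_point = ...
output = qfunc.add(output, row_output)  # may still depend on version support
```

In eager-mode static quantization you normally do not construct `QFunctional` yourself: you keep a `FloatFunctional` as a submodule, give it a qconfig, calibrate, and let `convert` replace it.

#### **Method 3: Upgrade PyTorch or use a custom operator**

- **Upgrade PyTorch**: make sure you are on a recent stable release (for example ≥2.0); some quantized operators are only supported in newer versions.
- **Custom operator**: if you need a high-performance quantized addition that PyTorch does not provide, you can write a custom kernel (see the [PyTorch C++ API](https://pytorch.org/cppdocs/)). Note that the failing backend here is `QuantizedCPU`, so a CUDA kernel alone would not address this error.

---

### **3. Example fix**

Suppose the original model contains the following quantized addition:

```python
# Failing code: float add applied to quantized tensors
output = self.float_func.add(quant_output1, quant_output2)
```

**Rewritten with an explicit dequantize step**:

```python
import torch
from torch.ao.quantization import DeQuantStub, QuantStub

class QuantizedResidualBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.quant_out = QuantStub()       # separate stub so the sum gets its own observer
        self.dequant = DeQuantStub()
        self.conv = torch.nn.Conv2d(...)   # convolution layer to be quantized

    def forward(self, x):
        x_quant = self.quant(x)
        conv_output = self.conv(x_quant)
        # Dequantize both branches, then add in floating point
        residual = self.dequant(x_quant)   # residual branch
        output = self.dequant(conv_output) + residual
        return self.quant_out(output)      # re-quantize
```

---

### **4. Verifying quantization compatibility**

Check what your installed PyTorch supports:

```python
import torch
from torch.ao.quantization.quantization_mappings import (
    get_default_qconfig_propagation_list,
)

# Quantized engines compiled into this build (e.g. fbgemm, qnnpack)
print(torch.backends.quantized.supported_engines)
# Whether a quantized add operator is registered
print(hasattr(torch.ops.quantized, "add"))
# Module types that receive a qconfig by default in eager mode
print(get_default_qconfig_propagation_list())
```

---

### **5. Alternative: mixing dynamic and static quantization**

If static quantization does not cover the addition well, consider:

1. **Using dynamic quantization for the supported layers** (such as `nn.Linear`). Their activation `scale/zero_point` are determined at run time and their inputs and outputs stay in floating point, so an addition between them is an ordinary float operation:

```python
# Wrap the layer in a container; conversion swaps child modules
dynamic_quantized_layer = torch.ao.quantization.quantize_dynamic(
    torch.nn.Sequential(torch.nn.Linear(...)), {torch.nn.Linear}, dtype=torch.qint8
)
```

2. **Quantizing only the compute-heavy layers** (such as convolutions) and keeping the addition in floating point.

---

### **6. Common pitfalls**

1. **Assuming every tensor operation supports quantized inputs automatically**: verify explicitly that the operation exists under the `torch.ops.quantized` namespace.
2. **Ignoring `scale/zero_point` consistency**: in a dequantize, add, re-quantize flow, each input must be dequantized with its own parameters and the result re-quantized with parameters calibrated for the output; mixing them up produces wrong values.
3. **Misunderstanding `QFunctional`**: `QFunctional` is not for weights. It wraps stateless activation operations (such as `add`, `mul`, and `cat`) and stores the output `scale/zero_point`. It is normally produced by converting a calibrated `FloatFunctional` rather than being configured by hand.

---

### **7. Further suggestions**

- **Use TensorRT or ONNX Runtime**: these runtimes may offer more complete support for quantized addition; you can export the quantized model to ONNX and run it in another inference engine.
- **Refer to the PyTorch quantization examples**: official tutorial: [Quantization API Examples](https://pytorch.org/tutorials/recipes/quantization.html)

---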
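
To make the `scale`/`zero_point` pitfall from section 6 concrete, here is a minimal pure-Python sketch of affine quantization arithmetic. It is not PyTorch's implementation: `QParams`, `quantize`, `dequantize`, and `quantized_add` are hypothetical helpers introduced only for illustration, the 0..255 range is an assumed unsigned 8-bit range, and the parameter values are made up. It shows why integer codes from two tensors cannot simply be added when their quantization parameters differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QParams:
    """Affine quantization parameters (hypothetical helper, not a PyTorch class)."""
    scale: float
    zero_point: int
    qmin: int = 0    # assumed unsigned 8-bit range
    qmax: int = 255


def quantize(x: float, qp: QParams) -> int:
    """Map a real value to an integer code, clamped to [qmin, qmax]."""
    q = round(x / qp.scale) + qp.zero_point
    return max(qp.qmin, min(qp.qmax, q))


def dequantize(q: int, qp: QParams) -> float:
    """Map an integer code back to the real value it represents."""
    return (q - qp.zero_point) * qp.scale


def quantized_add(qa: int, qp_a: QParams, qb: int, qp_b: QParams,
                  qp_out: QParams) -> int:
    """Dequantize each input with its OWN parameters, add, re-quantize for the output."""
    real_sum = dequantize(qa, qp_a) + dequantize(qb, qp_b)
    return quantize(real_sum, qp_out)


# Two inputs with different (made-up) quantization parameters
qp_a = QParams(scale=0.5, zero_point=10)
qp_b = QParams(scale=0.25, zero_point=0)
qp_out = QParams(scale=0.5, zero_point=5)

qa = quantize(3.0, qp_a)  # integer code for 3.0 under qp_a
qb = quantize(1.5, qp_b)  # integer code for 1.5 under qp_b

q_sum = quantized_add(qa, qp_a, qb, qp_b, qp_out)
correct = dequantize(q_sum, qp_out)    # value recovered by the proper flow
naive = dequantize(qa + qb, qp_out)    # adding raw integer codes instead

print(correct, naive)
```

The proper flow recovers 4.5 (the true sum of 3.0 and 1.5), while adding the raw integer codes yields 8.5. This is the reason a quantized add needs explicit output parameters rather than reusing the float `torch.add` path.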

process failed to start. See stack trace for the root cause. (deepseek_vllm) root@user-X99:/home/user/Desktop# /root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown warnings.warn('resource_tracker: There appear to be %d ' (deepseek_vllm) root@user-X99:/home/user/Desktop#

---------------------------------------------------------------------------
RasterioIOError                           Traceback (most recent call last)
Cell In[1], line 91
     89 model.train()
     90 epoch_loss = 0
---> 91 for inputs, labels in train_loader:
     92     inputs = inputs.to(device)
     93     labels = labels.to(device)

File /usr/local/lib/python3.8/site-packages/torch/utils/data/dataloader.py:633, in _BaseDataLoaderIter.__next__(self)
    630 if self._sampler_iter is None:
    631     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    632     self._reset()  # type: ignore[call-arg]
--> 633 data = self._next_data()
    634 self._num_yielded += 1
    635 if self._dataset_kind == _DatasetKind.Iterable and \
    636         self._IterableDataset_len_called is not None and \
    637         self._num_yielded > self._IterableDataset_len_called:

File /usr/local/lib/python3.8/site-packages/torch/utils/data/dataloader.py:1345, in _MultiProcessingDataLoaderIter._next_data(self)
   1343 else:
   1344     del self._task_info[idx]
-> 1345     return self._process_data(data)

File /usr/local/lib/python3.8/site-packages/torch/utils/data/dataloader.py:1371, in _MultiProcessingDataLoaderIter._process_data(self, data)
   1369 self._try_put_index()
   1370 if isinstance(data, ExceptionWrapper):
-> 1371     data.reraise()
   1372 return data

File /usr/local/lib/python3.8/site-packages/torch/_utils.py:644, in ExceptionWrapper.reraise(self)
    640 except TypeError:
    641     # If the exception takes multiple arguments, don't try to
    642     # instantiate since we don't know how to
    643     raise RuntimeError(msg) from None
--> 644 raise exception

RasterioIOError: Caught RasterioIOError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "rasterio/_base.pyx", line 310, in rasterio._base.DatasetBase.__init__
  File "rasterio/_base.pyx", line 221, in rasterio._base.open_dataset
  File "rasterio/_err.pyx", line 221, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: /openbayes/input/input1/output_cut/rgb/52983.tif: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/tmp/ipykernel_477/3259673788.py", line 22, in __getitem__
    with rasterio.open(self.rgb_paths[idx]) as src:
  File "/output/.pylibs/lib/python3.8/site-packages/rasterio/env.py", line 451, in wrapper
    return f(*args, **kwds)
  File "/output/.pylibs/lib/python3.8/site-packages/rasterio/__init__.py", line 304, in open
    dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 312, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: /openbayes/input/input1/output_cut/rgb/52983.tif: No such file or directory

What is the cause of this error?
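
Both tracebacks above end in `No such file or directory` for `/openbayes/input/input1/output_cut/rgb/52983.tif`, so the most likely cause is that the dataset's path list references a file that is not on disk (for example a tile that was never written, or a wrong root directory). A minimal sketch for checking the list before training starts, assuming the dataset keeps a plain list of paths as `self.rgb_paths` appears to in the traceback; the helper name `split_existing` is hypothetical and not part of rasterio or PyTorch:

```python
import os
from typing import Iterable, List, Tuple


def split_existing(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split paths into (existing, missing) lists using os.path.isfile."""
    existing: List[str] = []
    missing: List[str] = []
    for path in paths:
        # A path that is absent, or that points at a directory, counts as missing.
        if os.path.isfile(path):
            existing.append(path)
        else:
            missing.append(path)
    return existing, missing
```

Calling `split_existing(dataset.rgb_paths)` once before building the `DataLoader` would surface every missing tile at the start, rather than one at a time inside a worker process.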

/home/chenxingyue/anaconda3/envs/py39/bin/python /home/chenxingyue/codes/caopengfei/CMeKG_tools/test4.py
Loading a TensorFlow model in PyTorch, requires both PyTorch and TensorFlow to be installed. Please see https://pytorch.org/ and https://www.tensorflow.org/install/ for installation instructions.
Loading a TensorFlow model in PyTorch, requires both PyTorch and TensorFlow to be installed. Please see https://pytorch.org/ and https://www.tensorflow.org/install/ for installation instructions.
Traceback (most recent call last):
  File "/home/chenxingyue/codes/caopengfei/CMeKG_tools/test4.py", line 9, in <module>
    my_pred=medical_ner()
  File "/home/chenxingyue/codes/caopengfei/CMeKG_tools/medical_ner.py", line 21, in __init__
    self.model = BERT_LSTM_CRF('/home/chenxingyue/codes/caopengfei/medical_ner', tagset_size, 768, 200, 2,
  File "/home/chenxingyue/codes/caopengfei/CMeKG_tools/model_ner/bert_lstm_crf.py", line 16, in __init__
    self.word_embeds = BertModel.from_pretrained(bert_config,from_tf=True)
  File "/home/chenxingyue/anaconda3/envs/py39/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2612, in from_pretrained
    model, loading_info = load_tf2_checkpoint_in_pytorch_model(
  File "/home/chenxingyue/anaconda3/envs/py39/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 390, in load_tf2_checkpoint_in_pytorch_model
    import tensorflow as tf  # noqa: F401
ModuleNotFoundError: No module named 'tensorflow'

Does this error mean TensorFlow needs to be installed on my local machine, or on the Linux machine?
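
The final `ModuleNotFoundError` is raised by `import tensorflow` inside the interpreter that runs `test4.py` (the `py39` conda environment shown in the traceback), because `from_tf=True` asks transformers to load a TensorFlow checkpoint. TensorFlow would therefore need to be importable in that same environment on the machine where the script runs. A minimal sketch for checking this from the same interpreter; `is_installed` is a hypothetical helper name, and only top-level module names are assumed:

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.executable shows which environment is actually being checked.
    print("interpreter:", sys.executable)
    for name in ("torch", "tensorflow"):
        print(name, "installed:", is_installed(name))
```

If a PyTorch-format checkpoint of the same model is available, loading it without `from_tf=True` would probably avoid the TensorFlow dependency altogether, though that depends on which files the model directory contains.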

[2025-06-16 05:44:39.016+0800] [1869236] [281473290531168] [batchscheduler] [ERROR] [model.py:59] : [Model] >>> Exception:External Comm Manager: Create the hccl communication group failed. export ASDOPS_LOG_LEVEL=ERROR, export ASDOPS_LOG_TO_STDOUT=1 to see more details. Default log path is $HOME/atb/log.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/model_wrapper/model.py", line 57, in initialize
    return self.python_model.initialize(config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/model_wrapper/standard_model.py", line 117, in initialize
    self.generator = Generator(
                     ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/generator.py", line 230, in __init__
    self.cache_manager = self.warm_up(
                         ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/generator.py", line 427, in warm_up
    raise e
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/generator.py", line 420, in warm_up
    self._generate_inputs_warm_up_backend(cache_manager, input_metadata, inference_mode, dummy=True)
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/generator.py", line 523, in _generate_inputs_warm_up_backend
    self.generator_backend._warm_up(model_inputs, inference_mode=inference_mode)
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/adapter/generator_torch.py", line 507, in _warm_up
    super()._warm_up(model_inputs)
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/adapter/generator_backend.py", line 219, in _warm_up
    _ = self.forward(model_inputs, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/utils/decorators/time_decorator.py", line 69, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/adapter/generator_torch.py", line 197, in forward
    logits = self._forward(model_inputs, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/text_generator/adapter/generator_torch.py", line 527, in _forward
    logits = self.model_wrapper.forward(model_inputs, self.cache_pool.npu_cache, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/modeling/model_wrapper/atb/atb_model_wrapper.py", line 165, in forward
    result = self.forward_tensor(
             ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mindie_llm/modeling/model_wrapper/atb/atb_model_wrapper.py", line 205, in forward_tensor
    result = self.model_runner.forward(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/runner/model_runner.py", line 297, in forward
    res = self.model.forward(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/models/base/flash_causal_lm.py", line 491, in forward
    self.init_ascend_weight()
  File "/usr/local/Ascend/atb-models/atb_llm/models/qwen2/flash_causal_qwen2.py", line 287, in init_ascend_weight
    self.acl_encoder_operation.set_param(json.dumps({**encoder_param}))
RuntimeError: External Comm Manager: Create the hccl communication group failed. export ASDOPS_LOG_LEVEL=ERROR, export ASDOPS_LOG_TO_STDOUT=1 to see more details. Default log path is $HOME/atb/log.

[2025-07-16 16:14:04,825][controller.model.diffusion.transformer_for_action_diffusion][INFO] - Number of parameters in DiT: 7.385703e+07
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/fishros/hdx/DexGraspVLA/deploy1.py", line 4, in <module>
    main()
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/main.py", line 90, in decorated_main
    _run_hydra(
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
    _run_app(
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/utils.py", line 452, in _run_app
    run_and_report(
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/utils.py", line 453, in <lambda>
    lambda: hydra.run(
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/fishros/hdx/DexGraspVLA/agent/hdx_angle_output.py", line 298, in main
    robot = Robot(cfg)
  File "/home/fishros/hdx/DexGraspVLA/agent/hdx_angle_output.py", line 81, in __init__
    self.controller.to(self.device)
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
  File "/home/fishros/anaconda3/envs/hdx/lib/python3.9/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

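In the traceback above, `RuntimeError: No CUDA GPUs are available` is raised when `self.controller.to(self.device)` triggers CUDA initialization and PyTorch finds no usable GPU (for instance a missing driver, or a `CUDA_VISIBLE_DEVICES` value that hides every device). A minimal sketch for choosing the device defensively, assuming a CPU fallback is acceptable for the workload; `pick_device` is a hypothetical helper, not part of the DexGraspVLA code:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Printing `pick_device()` in the same environment, before constructing `Robot(cfg)`, would show whether the GPU is visible to PyTorch at all.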