Traceback (most recent call last):
  File "main_test.py", line 146, in <module>
    main_worker(args, args_main)
  File "main_test.py", line 46, in main_worker
    sys.stdout = Logger(os.path.join(log_path, "log_test.txt"))
  File "/media/lele/e/zzg/OTLA/utils.py", line 262, in __init__
    self.file = open(fpath, 'w')
PermissionError: [Errno 13] Permission denied: 'sysu_semi-supervised_otla-reid/sysu_log/log_test.txt'

This error means the process does not have permission to write the file. You can try the following:

1. Make sure you have write permission for the target path. Check the permissions on the file and its directory and confirm that the user running the script is allowed to write there.
2. If you are running in a restricted environment, for example on a shared server, you may need to contact an administrator or run the program with elevated privileges.
3. If you opened a file with the same name before starting the program, close it and run the program again.
4. If another program or process is accessing the file, close it or wait for it to finish before rerunning.

If none of these work, the code and the environment setup need closer inspection to locate the problem.
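A minimal sketch of a pre-flight check (the `Logger` call and the relative `log_path` are taken from the traceback above; adjust them to the actual setup): it creates the log directory if it is missing and verifies it is writable before stdout is redirected.

```python
import os
import sys

# Relative path from the error message; it is resolved against the current working directory.
log_path = "sysu_semi-supervised_otla-reid/sysu_log"

os.makedirs(log_path, exist_ok=True)          # create the directory tree if it does not exist
if not os.access(log_path, os.W_OK):          # verify the running user can write there
    raise PermissionError(f"No write permission for '{log_path}'; "
                          "fix ownership/permissions or pick another log directory")

log_file = os.path.join(log_path, "log_test.txt")
# sys.stdout = Logger(log_file)               # original call from main_test.py, once the path is writable
```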
Related questions

Error executing method 'init_device'. This might cause deadlock in distributed execution.
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574] Traceback (most recent call last):
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]   File "/mnt/vllm0.7.2/vllm/worker/worker_base.py", line 566, in execute_method
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]     return run_method(target, method, args, kwargs)
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]   File "/mnt/vllm0.7.2/vllm/utils.py", line 2220, in run_method
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]     return func(*args, **kwargs)
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]   File "/mnt/vllm0.7.2/vllm/worker/worker.py", line 155, in init_device
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]     torch.cuda.set_device(self.device)
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]   File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py", line 420, in set_device
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574]     torch._C._cuda_setDevice(device)
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574] RuntimeError: HIP error: invalid device ordinal
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574] HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574] For debugging consider passing AMD_SERIALIZE_KERNEL=3
(RayWorkerWrapper pid=326, ip=10.16.6.31) ERROR 03-26 06:01:54 worker_base.py:574] Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

How do I fix this?

For the deadlock around `init_device` in Ray-based distributed execution and the `RuntimeError: HIP error: invalid device ordinal`, the steps below address the most common causes. (HIP is AMD's GPU runtime, but "invalid device ordinal" means the same thing as in CUDA: the device index passed to `torch.cuda.set_device` does not exist in the process's view of the GPUs.)

---

### **1. Check that the device ordinal is valid**
- **Root cause**: setting a GPU index that exceeds the number of devices actually visible (e.g. only 2 GPUs but `device_id=3`) triggers `invalid device ordinal`.
- **Fix**:
  - Use `torch.cuda.device_count()` and make sure every `device_id` satisfies $0 \leq \text{device\_id} < \text{device\_count}$.
  - Add an assertion as a guard:
    ```python
    import torch
    device_id = 0  # example value; adjust to the worker's assignment
    assert device_id < torch.cuda.device_count(), "device ordinal out of range"
    ```

---

### **2. Isolate GPUs per worker with environment variables**
- **Root cause**: several Ray workers competing for the same GPU can deadlock or corrupt CUDA context initialization.
- **Fix**:
  - Use `CUDA_VISIBLE_DEVICES` so each worker sees only its own GPU.
  - Set the variable when the Ray worker starts:
    ```python
    import os
    import ray
    import torch

    @ray.remote(num_gpus=1)  # each worker gets exclusive use of one GPU
    class RayWorkerWrapper:
        def __init__(self, worker_id):
            os.environ["CUDA_VISIBLE_DEVICES"] = str(worker_id)  # restrict device visibility
            self.device = "cuda:0"  # this worker sees only one GPU, which is local device 0

        def init_device(self):
            torch.cuda.set_device(self.device)  # set the current device explicitly
    ```
  - **Key point**: `num_gpus=1` makes Ray assign a dedicated GPU to each worker, avoiding resource contention[^1].

---

### **3. Initialize the device in the right order**
- **Root cause**: setting the device after a CUDA context has already been created leads to conflicts.
- **Fix**:
  - **Call `set_device` before any other CUDA work**:
    ```python
    def init_device(device_id):
        torch.cuda.set_device(device_id)  # set the device first
        # only then load models, create tensors, etc.
    ```
  - **Avoid implicit CUDA initialization**: do not run anything that touches the GPU (such as `torch.tensor(...).cuda()`) before `init_device` has run.

---

### **4. Synchronize the distributed training processes**
- **Root cause**: PyTorch's `DistributedDataParallel` requires the process group to be initialized correctly.
- **Fix**:
  - Initialize the process group with `torch.distributed.init_process_group`:
    ```python
    import torch.distributed as dist

    def setup(rank, world_size):
        dist.init_process_group(
            backend="nccl",        # NCCL backend for GPUs
            init_method="env://",  # requires MASTER_ADDR / MASTER_PORT in the environment
            rank=rank,
            world_size=world_size,
        )
        torch.cuda.set_device(rank)  # bind this process to its GPU
    ```
  - **Ordering**: call `setup` to initialize the process group first, then build the model and optimizer.

---

### **5. Tracking down and resolving the deadlock**
- **Root cause**: Ray workers can deadlock through resource contention or bad synchronization.
- **Fix**:
  - **Reduce inter-process dependencies**: let every worker initialize independently; avoid shared locks or shared files during startup.
  - **Add timeouts** so tasks do not wait forever:
    ```python
    ray.init(_system_config={"health_check_initial_delay_ms": 0, "health_check_period_ms": 1000})
    ```
  - **Log aggressively** to pinpoint where the deadlock occurs.

---

### **Putting it together**
```python
import os

import ray
import torch
import torch.distributed as dist

@ray.remote(num_gpus=1)
class RayWorkerWrapper:
    def __init__(self, rank, world_size):
        os.environ["CUDA_VISIBLE_DEVICES"] = str(rank)  # this worker only sees GPU `rank`
        self.rank = rank
        self.world_size = world_size
        self.setup()

    def setup(self):
        dist.init_process_group(
            backend="nccl",
            init_method="env://",  # MASTER_ADDR / MASTER_PORT must be set
            rank=self.rank,
            world_size=self.world_size,
        )
        # After CUDA_VISIBLE_DEVICES isolation, this worker's only GPU is local device 0.
        torch.cuda.set_device(0)

    def train(self):
        model = Model().cuda()  # Model is a placeholder for the actual network
        ddp_model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[0])
        # training loop ...

# Launch the distributed job
ray.init()
workers = [RayWorkerWrapper.remote(i, 2) for i in range(2)]
ray.get([worker.train.remote() for worker in workers])
```

---

Error executing method 'init_device'. This might cause deadlock in distributed execution. [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609] Traceback (most recent call last): [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker_base.py", line 601, in execute_method [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]     return run_method(target, method, args, kwargs) [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]   File "/usr/local/lib/python3.10/dist-packages/vllm/utils.py", line 2311, in run_method [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]     return func(*args, **kwargs) [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]   File "/usr/local/lib/python3.10/dist-packages/vllm/spec_decode/spec_decode_worker.py", line 355, in init_device [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]     self.scorer_worker.load_model() [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 388, in load_model [repeated 90x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]     self.model_runner.load_model() [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 17:09:26 worker_base.py:609]     self.model = get_model(vllm_config=self.vllm_config) [repeated 30x across cluster]
(RayWorkerWrapper pid=18360, ip=13.13.6.12) ERROR 04-02 1

### Possible causes

In a distributed execution environment, a deadlock triggered by the `init_device` method is usually related to resource allocation, communication-framework configuration, or the underlying hardware support. Common causes and their fixes:

#### 1. **TensorFlow server creation errors**
If `tf.errors.OpError` or one of its subclasses is raised, the TensorFlow server failed to start[^1]. In that case:
- Make sure every machine in the cluster has TensorFlow installed and that the versions match.
- Verify network connectivity, especially in multi-node setups.

#### 2. **NCCL installation or configuration problems**
For GPU-based distributed training, NCCL is the core library for efficient inter-device communication. A mismatched or badly initialized NCCL can cause deadlocks or other synchronization problems[^2]. You can check which NCCL version PyTorch sees with:
```bash
python -c "import torch; print(torch.cuda.nccl.version())"
```
In addition, a few debug-oriented environment variables help narrow down the problem:
```bash
export NCCL_DEBUG=INFO
export NCCL_P2P_DISABLE=1
export NCCL_SHM_DISABLE=1
```
These respectively enable verbose logging, disable peer-to-peer transfers, and disable the shared-memory transport; toggling them works around compatibility issues in some environments.

#### 3. **Resource leaks or insufficient permissions**
The `resource_tracker` warning about leaked semaphores suggests that some system resources were never released. Likewise, files on remote storage such as NFS are prone to locking conflicts under concurrent access. In this situation:
- Audit the code for missing `close()` calls (or missing context managers).
- If temporary data is written to a network mount, switch the working directory to a local disk to remove that source of interference.

#### Example
The snippet below shows one way to manage resource lifetimes safely and avoid leaked file handles:
```python
from contextlib import ExitStack

def safe_open_files(filepaths):
    # ExitStack tracks every opened file and closes all of them when the block exits,
    # even if an exception is raised part-way through.
    with ExitStack() as stack:
        files = [stack.enter_context(open(fp)) for fp in filepaths]
        # Perform operations on 'files'
        pass

if __name__ == "__main__":
    paths = ["file1.txt", "file2.txt"]
    try:
        safe_open_files(paths)
    except Exception as e:
        print(f"An error occurred: {e}")
```
This example relies on `ExitStack` to track every opened object and close it before leaving the scope, which rules out accidentally leaked handles.

---

Related recommendations

(cvnets) D:\code\ml-cvnets-main>cvnets-train --common.config-file config/segmentation/pascal_voc/deeplabv3_mobilevitv2.yaml --common.results-loc deeplabv3_mobilevitv2_results/width_1_0_0 --common.override-kwargs model.classification.pretrained="LOCATION_OF_IMAGENET_1k_CHECKPOINT"
NOTE: Redirects are currently not supported in Windows or MacOs.
C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\torchvision\models\detection\anchor_utils.py:63: UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xe (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:77.)
  device: torch.device = torch.device("cpu"),
C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available.
  warnings.warn("No audio backend is available.")
2025-03-04 21:57:30 - DEBUG - Cannot load internal arguments, skipping.
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe
Traceback (most recent call last):
  File "C:\Users\boardman\.conda\envs\cvnets\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\boardman\.conda\envs\cvnets\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\boardman\.conda\envs\cvnets\Scripts\cvnets-train.exe\__main__.py", line 7, in <module>
  File "D:\code\ml-cvnets-main\main_train.py", line 193, in main_worker
    opts = get_training_arguments(args=args)
  File "D:\code\ml-cvnets-main\options\opts.py", line 332, in get_training_arguments
    parser = METRICS_REGISTRY.all_arguments(parser=parser)
  File "D:\code\ml-cvnets-main\utils\registry.py", line 180, in all_arguments
    self._load_all()
  File "D:\code\ml-cvnets-main\utils\registry.py", line 97, in _load_all
    import_modules_from_folder(dir_name, extra_roots=self.internal_dirs)
  File "D:\code\ml-cvnets-main\utils\import_utils.py", line 41, in import_modules_from_folder
    importlib.import_module(module_name)
  File "C:\Users\boardman\.conda\envs\cvnets\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "D:\code\ml-cvnets-main\metrics\average_precision.py", line 11, in <module>
    from sklearn.metrics import average_precision_score
  File "C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\sklearn\__init__.py", line 82, in <module>
    from .base import clone
  File "C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\sklearn\base.py", line 17, in <module>
    from .utils import _IS_32BIT
  File "C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\sklearn\utils\__init__.py", line 17, in <module>
    from scipy.sparse import issparse
  File "C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\scipy\sparse\__init__.py", line 267, in <module>
    from ._csr import *
  File "C:\Users\boardman\.conda\envs\cvnets\lib\site-packages\scipy\sparse\_csr.py", line 10, in <module>
    from ._sparsetools import (csr_tocsc, csr_tobsr, csr_count_blocks,
ImportError: numpy.core.multiarray failed to import

Please analyze this line by line.
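The failure here is a NumPy C-API (ABI) mismatch: the compiled extensions in this environment expect a newer NumPy C-API (0xf / 0x10) than the installed numpy provides (0xe), so `numpy.core.multiarray` fails to import inside scipy. A hedged sketch of the usual remedy inside the same conda environment (exact version pins depend on the cvnets requirements, so treat these commands as a starting point rather than the definitive fix):

```bash
# Bring numpy's C-API up to what torch/torchvision/scipy were built against,
# or reinstall the binary packages so they all share one numpy ABI.
pip install --upgrade numpy
pip install --force-reinstall scipy scikit-learn
```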

Traceback (most recent call last): File "<string>", line 1, in <module> File "C:\Users\泽熙\.conda\envs\material\lib\multiprocessing\spawn.py", line 105, in spawn_main exitcode = _main(fd) File "C:\Users\泽熙\.conda\envs\material\lib\multiprocessing\spawn.py", line 114, in _main prepare(preparation_data) File "C:\Users\泽熙\.conda\envs\material\lib\multiprocessing\spawn.py", line 225, in prepare _fixup_main_from_path(data['init_main_from_path']) File "C:\Users\泽熙\.conda\envs\material\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path run_name="__mp_main__") File "C:\Users\泽熙\.conda\envs\material\lib\runpy.py", line 263, in run_path pkg_name=pkg_name, script_name=fname) File "C:\Users\泽熙\.conda\envs\material\lib\runpy.py", line 96, in _run_module_code mod_name, mod_spec, pkg_name, script_name) File "C:\Users\泽熙\.conda\envs\material\lib\runpy.py", line 85, in _run_code exec(code, run_globals) File "D:\yolo_picture\yolo\yolov5-5.0\train.py", line 12, in <module> import torch.distributed as dist File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\__init__.py", line 721, in <module> import torch.utils.data File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\utils\data\__init__.py", line 38, in <module> from torch.utils.data.dataloader_experimental import DataLoader2 File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\utils\data\dataloader_experimental.py", line 11, in <module> from torch.utils.data.datapipes.iter import IterableWrapper File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\utils\data\datapipes\__init__.py", line 3, in <module> from . import dataframe File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\utils\data\datapipes\dataframe\__init__.py", line 4, in <module> from torch.utils.data.datapipes.dataframe.datapipes import ( File "C:\Users\泽熙\.conda\envs\material\lib\site-packages\torch\utils\data\datapipes\datafr
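The traceback above is cut off, but it shows the classic Windows `spawn` pattern: a child process created by multiprocessing re-imports `train.py` as `__mp_main__` (via `spawn_main` → `run_path`), so any top-level code runs again in every worker. A minimal sketch of the standard guard, assuming a yolov5-style `train.py` entry point (names are illustrative, not the actual file):

```python
import torch  # heavy imports at module level are fine; only *work* must be guarded

def main():
    # parse arguments, build the model, create DataLoader(num_workers=...), run training ...
    pass

if __name__ == "__main__":
    # Without this guard, each spawned DataLoader/multiprocessing worker on Windows
    # would re-execute the training code while re-importing this file.
    main()
```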

[rank0]: Traceback (most recent call last): [rank0]: File "test.py", line 213, in <module> [rank0]: main() [rank0]: File "test.py", line 209, in main [rank0]: eval_single_ckpt(model, test_loader, args, eval_output_dir, logger, epoch_id, dist_test=dist_test) [rank0]: File "test.py", line 72, in eval_single_ckpt [rank0]: eval_utils.eval_one_epoch( [rank0]: File "/home/mtr/MTR/tools/eval_utils/eval_utils.py", line 42, in eval_one_epoch [rank0]: for i, batch_dict in enumerate(dataloader): [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 630, in __next__ [rank0]: data = self._next_data() [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1324, in _next_data [rank0]: return self._process_data(data) [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data [rank0]: data.reraise() [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/_utils.py", line 706, in reraise [rank0]: raise exception [rank0]: AssertionError: Caught AssertionError in DataLoader worker process 2. [rank0]: Original Traceback (most recent call last): [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop [rank0]: data = fetcher.fetch(index) # type: ignore[possibly-undefined] [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch [rank0]: data = [self.dataset[idx] for idx in possibly_batched_index] [rank0]: File "/root/anaconda3/envs/mtr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in [rank0]: data = [self.dataset[idx] for idx in possibly_batched_index] [rank0]: File "/home/mtr/MTR/tools/../mtr/datasets/waymo/waymo_dataset.py", line 68, in __getitem__ [rank0]:

/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/tyro/_parsers.py:332: UserWarning: The field model.action-expert-variant is annotated with type typing.Literal['dummy', 'gemma_300m', 'gemma_2b', 'gemma_2b_lora'], but the default value gemma_300m_lora has type <class 'str'>. We'll try to handle this gracefully, but it may cause unexpected behavior. warnings.warn(message) 19:07:30.004 [I] Running on: shuo-hp (10287:train.py:195) INFO:2025-05-12 19:07:30,228:jax._src.xla_bridge:945: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' 19:07:30.228 [I] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig' (10287:xla_bridge.py:945) INFO:2025-05-12 19:07:30,228:jax._src.xla_bridge:945: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory 19:07:30.228 [I] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory (10287:xla_bridge.py:945) 19:07:30.500 [I] Wiped checkpoint directory /home/shuo/VLA/openpi/checkpoints/pi0_ours_aloha/your_experiment_name (10287:checkpoints.py:25) 19:07:30.500 [I] Created BasePyTreeCheckpointHandler: pytree_metadata_options=PyTreeMetadataOptions(support_rich_types=False), array_metadata_store=None (10287:base_pytree_checkpoint_handler.py:332) 19:07:30.500 [I] Created BasePyTreeCheckpointHandler: pytree_metadata_options=PyTreeMetadataOptions(support_rich_types=False), array_metadata_store=None (10287:base_pytree_checkpoint_handler.py:332) 19:07:30.500 [I] [thread=MainThread] Failed to get flag value for EXPERIMENTAL_ORBAX_USE_DISTRIBUTED_PROCESS_ID. (10287:multihost.py:375) 19:07:30.500 [I] [process=0][thread=MainThread] CheckpointManager init: checkpointers=None, item_names=None, item_handlers={'assets': <openpi.training.checkpoints.CallbackHandler object at 0x72e5cae0ff50>, 'train_state': <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa0e90>, 'params': <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa05d0>}, handler_registry=None (10287:checkpoint_manager.py:622) 19:07:30.501 [I] Deferred registration for item: "assets". Adding handler <openpi.training.checkpoints.CallbackHandler object at 0x72e5cae0ff50> for item "assets" and save args <class 'openpi.training.checkpoints.CallbackSave'> and restore args <class 'openpi.training.checkpoints.CallbackRestore'> to _handler_registry. (10287:composite_checkpoint_handler.py:239) 19:07:30.501 [I] Deferred registration for item: "train_state". Adding handler <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa0e90> for item "train_state" and save args <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeSaveArgs'> and restore args <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeRestoreArgs'> to _handler_registry. (10287:composite_checkpoint_handler.py:239) 19:07:30.501 [I] Deferred registration for item: "params". 
Adding handler <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa05d0> for item "params" and save args <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeSaveArgs'> and restore args <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeRestoreArgs'> to _handler_registry. (10287:composite_checkpoint_handler.py:239) 19:07:30.501 [I] Deferred registration for item: "metrics". Adding handler <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x72e5cad7fd10> for item "metrics" and save args <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonSaveArgs'> and restore args <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonRestoreArgs'> to _handler_registry. (10287:composite_checkpoint_handler.py:239) 19:07:30.501 [I] Initialized registry DefaultCheckpointHandlerRegistry({('assets', <class 'openpi.training.checkpoints.CallbackSave'>): <openpi.training.checkpoints.CallbackHandler object at 0x72e5cae0ff50>, ('assets', <class 'openpi.training.checkpoints.CallbackRestore'>): <openpi.training.checkpoints.CallbackHandler object at 0x72e5cae0ff50>, ('train_state', <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeSaveArgs'>): <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa0e90>, ('train_state', <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeRestoreArgs'>): <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa0e90>, ('params', <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeSaveArgs'>): <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa05d0>, ('params', <class 'orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeRestoreArgs'>): <orbax.checkpoint._src.handlers.pytree_checkpoint_handler.PyTreeCheckpointHandler object at 0x72e5cafa05d0>, ('metrics', <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonSaveArgs'>): <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x72e5cad7fd10>, ('metrics', <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonRestoreArgs'>): <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x72e5cad7fd10>}). 
(10287:composite_checkpoint_handler.py:508) 19:07:30.501 [I] orbax-checkpoint version: 0.11.1 (10287:abstract_checkpointer.py:35) 19:07:30.501 [I] [process=0][thread=MainThread] Using barrier_sync_fn: <function get_barrier_sync_fn.<locals>.<lambda> at 0x72e5cacb85e0> timeout: 7200 secs and primary_host=0 for async checkpoint writes (10287:async_checkpointer.py:80) 19:07:30.501 [I] Found 0 checkpoint steps in /home/shuo/VLA/openpi/checkpoints/pi0_ours_aloha/your_experiment_name (10287:checkpoint_manager.py:1528) 19:07:30.501 [I] Saving root metadata (10287:checkpoint_manager.py:1569) 19:07:30.501 [I] [process=0][thread=MainThread] Skipping global process sync, barrier name: CheckpointManager:save_metadata (10287:multihost.py:293) 19:07:30.501 [I] [process=0][thread=MainThread] CheckpointManager created, primary_host=0, CheckpointManagerOptions=CheckpointManagerOptions(save_interval_steps=1, max_to_keep=1, keep_time_interval=None, keep_period=5000, should_keep_fn=None, best_fn=None, best_mode='max', keep_checkpoints_without_metrics=True, step_prefix=None, step_format_fixed_length=None, step_name_format=None, create=False, cleanup_tmp_directories=False, save_on_steps=frozenset(), single_host_load_and_broadcast=False, todelete_subdir=None, enable_background_delete=False, read_only=False, enable_async_checkpointing=True, async_options=AsyncOptions(timeout_secs=7200, barrier_sync_fn=None, post_finalization_callback=None, create_directories_asynchronously=False), multiprocessing_options=MultiprocessingOptions(primary_host=0, active_processes=None, barrier_sync_key_prefix=None), should_save_fn=None, file_options=FileOptions(path_permission_mode=None), save_root_metadata=True, temporary_path_class=None, save_decision_policy=None), root_directory=/home/shuo/VLA/openpi/checkpoints/pi0_ours_aloha/your_experiment_name: <orbax.checkpoint.checkpoint_manager.CheckpointManager object at 0x72e5cadffd10> (10287:checkpoint_manager.py:797) 19:07:30.553 [I] Loaded norm stats from s3://openpi-assets/checkpoints/pi0_base/assets/trossen (10287:config.py:166) Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). 19:07:30.553 [W] Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). (10287:_snapshot_download.py:213) Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). 19:07:30.554 [W] Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). (10287:_snapshot_download.py:213) Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). 19:07:30.555 [W] Returning existing local_dir /home/shuo/VLA/lerobot/aloha-real-data as remote repo cannot be accessed in snapshot_download (None). 
(10287:_snapshot_download.py:213) Traceback (most recent call last): File "/home/shuo/VLA/openpi/scripts/train.py", line 273, in <module> main(_config.cli()) File "/home/shuo/VLA/openpi/scripts/train.py", line 226, in main batch = next(data_iter) ^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/src/openpi/training/data_loader.py", line 177, in __iter__ for batch in self._data_loader: File "/home/shuo/VLA/openpi/src/openpi/training/data_loader.py", line 257, in __iter__ batch = next(data_iter) ^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 708, in __next__ data = self._next_data() ^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1480, in _next_data return self._process_data(data) ^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1505, in _process_data data.reraise() File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/_utils.py", line 733, in reraise raise exception KeyError: Caught KeyError in DataLoader worker process 0. Original Traceback (most recent call last): File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop data = fetcher.fetch(index) # type: ignore[possibly-undefined] ^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch data = [self.dataset[idx] for idx in possibly_batched_index] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 52, in data = [self.dataset[idx] for idx in possibly_batched_index] ~~~~~~~~~~~~^^^^^ File "/home/shuo/VLA/openpi/src/openpi/training/data_loader.py", line 47, in __getitem__ return self._transform(self._dataset[index]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/src/openpi/transforms.py", line 70, in __call__ data = transform(data) ^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/src/openpi/transforms.py", line 101, in __call__ return jax.tree.map(lambda k: flat_item[k], self.structure) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/jax/_src/tree.py", line 155, in map return tree_util.tree_map(f, tree, *rest, is_leaf=is_leaf) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/jax/_src/tree_util.py", line 358, in tree_map return treedef.unflatten(f(*xs) for xs in zip(*all_leaves)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/shuo/VLA/openpi/.venv/lib/python3.11/site-packages/jax/_src/tree_util.py", line 358, in <genexpr> return treedef.unflatten(f(*xs) for xs in zip(*all_leaves)) ^^^^^^ File "/home/shuo/VLA/openpi/src/openpi/transforms.py", line 101, in <lambda> return jax.tree.map(lambda k: flat_item[k], self.structure) ~~~~~~~~~^^^ KeyError: 'observation.images.cam_low'
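The `KeyError: 'observation.images.cam_low'` means the flattened dataset sample has no entry under that key while the repack/transform structure expects one, typically because the recorded LeRobot episodes do not contain that camera or the camera names in the config do not match the data. A small, hypothetical inspection helper (plain Python, independent of the openpi/lerobot APIs) to list the keys actually present in one sample:

```python
def flat_keys(tree, prefix=""):
    """Recursively collect dotted key paths from a nested dict-like sample."""
    keys = []
    if isinstance(tree, dict):
        for k, v in tree.items():
            keys.extend(flat_keys(v, prefix + k + "."))
    else:
        keys.append(prefix.rstrip("."))
    return keys

# sample = dataset[0]  # whatever the data loader indexes into (assumption: a nested dict)
# print(sorted(flat_keys(sample)))
# Compare the output against 'observation.images.cam_low'; if it is absent, fix the
# camera keys in the config/repack transforms or re-record the data with that camera.
```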

INFO 07-06 22:01:35 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager INFO 07-06 22:01:36 cuda.py:178] Cannot use FlashAttention-2 backend for Volta and Turing GPUs. INFO 07-06 22:01:36 cuda.py:226] Using XFormers backend. INFO 07-06 22:01:40 __init__.py:207] Automatically detected platform cuda. (VllmWorkerProcess pid=8965) INFO 07-06 22:01:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks (VllmWorkerProcess pid=8965) INFO 07-06 22:01:41 cuda.py:178] Cannot use FlashAttention-2 backend for Volta and Turing GPUs. (VllmWorkerProcess pid=8965) INFO 07-06 22:01:41 cuda.py:226] Using XFormers backend. ERROR 07-06 22:01:41 engine.py:400] Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your NVIDIA GeForce RTX 2080 Ti GPU has compute capability 7.5. You can use float16 instead by explicitly setting thedtype flag in CLI, for example: --dtype=half. ERROR 07-06 22:01:41 engine.py:400] Traceback (most recent call last): ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine ERROR 07-06 22:01:41 engine.py:400] engine = MQLLMEngine.from_engine_args(engine_args=engine_args, ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 124, in from_engine_args ERROR 07-06 22:01:41 engine.py:400] return cls(ipc_path=ipc_path, ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 76, in __init__ ERROR 07-06 22:01:41 engine.py:400] self.engine = LLMEngine(*args, **kwargs) ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 273, in __init__ ERROR 07-06 22:01:41 engine.py:400] self.model_executor = executor_class(vllm_config=vllm_config, ) ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 271, in __init__ ERROR 07-06 22:01:41 engine.py:400] super().__init__(*args, **kwargs) ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__ ERROR 07-06 22:01:41 engine.py:400] self._init_executor() ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 124, in _init_executor ERROR 07-06 22:01:41 engine.py:400] self._run_workers("init_device") ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers ERROR 07-06 22:01:41 engine.py:400] driver_worker_output = run_method(self.driver_worker, sent_method, ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/utils.py", line 2196, in run_method ERROR 
07-06 22:01:41 engine.py:400] return func(*args, **kwargs) ERROR 07-06 22:01:41 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 157, in init_device ERROR 07-06 22:01:41 engine.py:400] _check_if_gpu_supports_dtype(self.model_config.dtype) ERROR 07-06 22:01:41 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 525, in _check_if_gpu_supports_dtype ERROR 07-06 22:01:41 engine.py:400] raise ValueError( ERROR 07-06 22:01:41 engine.py:400] ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your NVIDIA GeForce RTX 2080 Ti GPU has compute capability 7.5. You can use float16 instead by explicitly setting thedtype flag in CLI, for example: --dtype=half. Process SpawnProcess-1: ERROR 07-06 22:01:41 multiproc_worker_utils.py:124] Worker VllmWorkerProcess pid 8965 died, exit code: -15 INFO 07-06 22:01:41 multiproc_worker_utils.py:128] Killing local vLLM worker processes Traceback (most recent call last): File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 402, in run_mp_engine raise e File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine engine = MQLLMEngine.from_engine_args(engine_args=engine_args, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 124, in from_engine_args return cls(ipc_path=ipc_path, ^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 76, in __init__ self.engine = LLMEngine(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 273, in __init__ self.model_executor = executor_class(vllm_config=vllm_config, ) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 271, in __init__ super().__init__(*args, **kwargs) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__ self._init_executor() File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 124, in _init_executor self._run_workers("init_device") File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers driver_worker_output = run_method(self.driver_worker, sent_method, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/utils.py", line 2196, in run_method return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 157, in init_device _check_if_gpu_supports_dtype(self.model_config.dtype) File 
"/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 525, in _check_if_gpu_supports_dtype raise ValueError( ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your NVIDIA GeForce RTX 2080 Ti GPU has compute capability 7.5. You can use float16 instead by explicitly setting thedtype flag in CLI, for example: --dtype=half. Traceback (most recent call last): File "/root/miniconda3/envs/deepseek_vllm/bin/vllm", line 8, in <module> sys.exit(main()) ^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/cli/main.py", line 73, in main args.dispatch_function(args) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/cli/serve.py", line 34, in cmd uvloop.run(run_server(args)) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/uvloop/__init__.py", line 105, in run return runner.run(wrapper()) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/uvloop/__init__.py", line 61, in wrapper return await main ^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 947, in run_server async with build_async_engine_client(args) as engine_client: File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 139, in build_async_engine_client async with build_async_engine_client_from_engine_args( File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 233, in build_async_engine_client_from_engine_args raise RuntimeError( RuntimeError: Engine process failed to start. See stack trace for the root cause. (deepseek_vllm) root@user-X99:/home/user/Desktop# ^C

INFO 07-06 22:09:23 __init__.py:207] Automatically detected platform cuda. (VllmWorkerProcess pid=10542) INFO 07-06 22:09:23 multiproc_worker_utils.py:229] Worker ready; awaiting tasks (VllmWorkerProcess pid=10542) INFO 07-06 22:09:24 cuda.py:178] Cannot use FlashAttention-2 backend for Volta and Turing GPUs. (VllmWorkerProcess pid=10542) INFO 07-06 22:09:24 cuda.py:226] Using XFormers backend. (VllmWorkerProcess pid=10542) INFO 07-06 22:09:25 utils.py:916] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=10542) INFO 07-06 22:09:25 pynccl.py:69] vLLM is using nccl==2.21.5 INFO 07-06 22:09:25 utils.py:916] Found nccl from library libnccl.so.2 INFO 07-06 22:09:25 pynccl.py:69] vLLM is using nccl==2.21.5 INFO 07-06 22:09:25 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json INFO 07-06 22:09:39 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json (VllmWorkerProcess pid=10542) INFO 07-06 22:09:39 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json WARNING 07-06 22:09:39 custom_all_reduce.py:145] Custom allreduce is disabled because your platform lacks GPU P2P capability or P2P test failed. To silence this warning, specify disable_custom_all_reduce=True explicitly. (VllmWorkerProcess pid=10542) WARNING 07-06 22:09:39 custom_all_reduce.py:145] Custom allreduce is disabled because your platform lacks GPU P2P capability or P2P test failed. To silence this warning, specify disable_custom_all_reduce=True explicitly. INFO 07-06 22:09:39 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_b99a0acb'), local_subscribe_port=57279, remote_subscribe_port=None) INFO 07-06 22:09:39 model_runner.py:1110] Starting to load model /home/user/Desktop/DeepSeek-R1-Distill-Qwen-32B... (VllmWorkerProcess pid=10542) INFO 07-06 22:09:39 model_runner.py:1110] Starting to load model /home/user/Desktop/DeepSeek-R1-Distill-Qwen-32B... ERROR 07-06 22:09:39 engine.py:400] CUDA out of memory. Tried to allocate 136.00 MiB. GPU 0 has a total capacity of 21.49 GiB of which 43.50 MiB is free. Including non-PyTorch memory, this process has 21.44 GiB memory in use. Of the allocated memory 21.19 GiB is allocated by PyTorch, and 20.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) ERROR 07-06 22:09:39 engine.py:400] Traceback (most recent call last): ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine ERROR 07-06 22:09:39 engine.py:400] engine = MQLLMEngine.from_engine_args(engine_args=engine_args, ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 124, in from_engine_args ERROR 07-06 22:09:39 engine.py:400] return cls(ipc_path=ipc_path, ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 76, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.engine = LLMEngine(*args, **kwargs) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 273, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.model_executor = executor_class(vllm_config=vllm_config, ) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 271, in __init__ ERROR 07-06 22:09:39 engine.py:400] super().__init__(*args, **kwargs) ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__ ERROR 07-06 22:09:39 engine.py:400] self._init_executor() ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 125, in _init_executor ERROR 07-06 22:09:39 engine.py:400] self._run_workers("load_model", ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers ERROR 07-06 22:09:39 engine.py:400] driver_worker_output = run_method(self.driver_worker, sent_method, ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/utils.py", line 2196, in run_method ERROR 07-06 22:09:39 engine.py:400] return func(*args, **kwargs) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 183, in load_model ERROR 07-06 22:09:39 engine.py:400] self.model_runner.load_model() ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/model_runner.py", line 1112, in load_model ERROR 07-06 22:09:39 engine.py:400] self.model = get_model(vllm_config=self.vllm_config) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py", 
line 14, in get_model ERROR 07-06 22:09:39 engine.py:400] return loader.load_model(vllm_config=vllm_config) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 406, in load_model ERROR 07-06 22:09:39 engine.py:400] model = _initialize_model(vllm_config=vllm_config) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 125, in _initialize_model ERROR 07-06 22:09:39 engine.py:400] return model_class(vllm_config=vllm_config, prefix=prefix) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 453, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.model = Qwen2Model(vllm_config=vllm_config, ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 151, in __init__ ERROR 07-06 22:09:39 engine.py:400] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs) ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 307, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.start_layer, self.end_layer, self.layers = make_layers( ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 557, in make_layers ERROR 07-06 22:09:39 engine.py:400] [PPMissingLayer() for _ in range(start_layer)] + [ ERROR 07-06 22:09:39 engine.py:400] ^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 558, in ERROR 07-06 22:09:39 engine.py:400] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}")) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 309, in <lambda> ERROR 07-06 22:09:39 engine.py:400] lambda prefix: Qwen2DecoderLayer(config=config, ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 220, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.mlp = Qwen2MLP( ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 82, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.down_proj = RowParallelLinear( ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/layers/linear.py", line 1062, in __init__ ERROR 07-06 22:09:39 engine.py:400] self.quant_method.create_weights( ERROR 07-06 22:09:39 
engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/layers/linear.py", line 129, in create_weights ERROR 07-06 22:09:39 engine.py:400] weight = Parameter(torch.empty(sum(output_partition_sizes), ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/torch/utils/_device.py", line 106, in __torch_function__ ERROR 07-06 22:09:39 engine.py:400] return func(*args, **kwargs) ERROR 07-06 22:09:39 engine.py:400] ^^^^^^^^^^^^^^^^^^^^^ ERROR 07-06 22:09:39 engine.py:400] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB. GPU 0 has a total capacity of 21.49 GiB of which 43.50 MiB is free. Including non-PyTorch memory, this process has 21.44 GiB memory in use. Of the allocated memory 21.19 GiB is allocated by PyTorch, and 20.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) Process SpawnProcess-1: INFO 07-06 22:09:39 multiproc_worker_utils.py:128] Killing local vLLM worker processes Traceback (most recent call last): File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap self.run() File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 402, in run_mp_engine raise e File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine engine = MQLLMEngine.from_engine_args(engine_args=engine_args, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 124, in from_engine_args return cls(ipc_path=ipc_path, ^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/multiprocessing/engine.py", line 76, in __init__ self.engine = LLMEngine(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 273, in __init__ self.model_executor = executor_class(vllm_config=vllm_config, ) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 271, in __init__ super().__init__(*args, **kwargs) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__ self._init_executor() File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 125, in _init_executor self._run_workers("load_model", File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers driver_worker_output = run_method(self.driver_worker, sent_method, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/utils.py", line 2196, in run_method return func(*args, **kwargs) 
^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/worker.py", line 183, in load_model self.model_runner.load_model() File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/worker/model_runner.py", line 1112, in load_model self.model = get_model(vllm_config=self.vllm_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model return loader.load_model(vllm_config=vllm_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 406, in load_model model = _initialize_model(vllm_config=vllm_config) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 125, in _initialize_model return model_class(vllm_config=vllm_config, prefix=prefix) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 453, in __init__ self.model = Qwen2Model(vllm_config=vllm_config, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 151, in __init__ old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 307, in __init__ self.start_layer, self.end_layer, self.layers = make_layers( ^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 557, in make_layers [PPMissingLayer() for _ in range(start_layer)] + [ ^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 558, in maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}")) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 309, in <lambda> lambda prefix: Qwen2DecoderLayer(config=config, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 220, in __init__ self.mlp = Qwen2MLP( ^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/models/qwen2.py", line 82, in __init__ self.down_proj = RowParallelLinear( ^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/layers/linear.py", line 1062, in __init__ self.quant_method.create_weights( File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/model_executor/layers/linear.py", line 129, in create_weights weight = Parameter(torch.empty(sum(output_partition_sizes), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/torch/utils/_device.py", line 106, in __torch_function__ return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB. GPU 0 has a total capacity of 21.49 GiB of which 43.50 MiB is free. Including non-PyTorch memory, this process has 21.44 GiB memory in use. 
Of the allocated memory 21.19 GiB is allocated by PyTorch, and 20.25 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) [rank0]:[W706 22:09:40.345098611 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator()) Traceback (most recent call last): File "/root/miniconda3/envs/deepseek_vllm/bin/vllm", line 8, in <module> sys.exit(main()) ^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/cli/main.py", line 73, in main args.dispatch_function(args) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/cli/serve.py", line 34, in cmd uvloop.run(run_server(args)) File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/uvloop/__init__.py", line 105, in run return runner.run(wrapper()) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/uvloop/__init__.py", line 61, in wrapper return await main ^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 947, in run_server async with build_async_engine_client(args) as engine_client: File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 139, in build_async_engine_client async with build_async_engine_client_from_engine_args( File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/contextlib.py", line 210, in __aenter__ return await anext(self.gen) ^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/deepseek_vllm/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 233, in build_async_engine_client_from_engine_args raise RuntimeError( RuntimeError: Engine process failed to start. See stack trace for the root cause. (deepseek_vllm) root@user-X99:/home/user/Desktop# /root/miniconda3/envs/deepseek_vllm/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown warnings.warn('resource_tracker: There appear to be %d ' (deepseek_vllm) root@user-X99:/home/user/Desktop#
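The OOM message itself points at the allocator setting below. Note, though, that fp16 weights for a 32B-parameter model are roughly 64 GB, while these two GPUs provide about 43 GiB in total, so the unquantized checkpoint cannot fit regardless of allocator tuning; a quantized checkpoint or more GPU memory is ultimately needed. This is only the fragmentation mitigation quoted in the log:

```bash
# Allocator setting suggested in the torch.OutOfMemoryError message; set it before relaunching vllm serve.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```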

(Output of five worker threads, interleaved: Thread-1 through Thread-5 all raise the same exception, and the lines "Worker Thread-3 (worker) modified num: 7", "Worker Thread-4 (worker) modified num: 6" and "Worker Thread-5 (worker) modified num: 5" are printed in between. A representative traceback:)

Exception in thread Thread-1 (worker):
Traceback (most recent call last):
  File "D:\python\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\python\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "c:\Users\12732\Desktop\qp\test.py", line 31, in worker
    fcntl.flock(f, fcntl.LOCK_EX)
AttributeError: module 'fcntl' has no attribute 'LOCK_EX'
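The root cause is that fcntl is a POSIX-only interface, so fcntl.LOCK_EX is not available on the Windows machine the script runs on (the paths point to D:\python and c:\Users\...). Below is a small sketch of a portable replacement, assuming the intent in test.py is simply to serialize access to a shared file or counter; the helper names lock_file/unlock_file are made up for illustration:

```python
import sys
import threading

# For threads inside one process, an ordinary lock is usually all that is needed.
num_lock = threading.Lock()

# If a real *file* lock is required, pick the platform-specific API:
if sys.platform == "win32":
    import msvcrt

    def lock_file(f):
        # Lock 1 byte at the start of the file (a common whole-file-lock idiom).
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)

    def unlock_file(f):
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
else:
    import fcntl

    def lock_file(f):
        fcntl.flock(f, fcntl.LOCK_EX)

    def unlock_file(f):
        fcntl.flock(f, fcntl.LOCK_UN)
```

If adding a dependency is acceptable, a third-party package such as portalocker wraps exactly this platform switch.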

/home/dwh/anaconda3/envs/egpo_a/bin/python3.7 /home/dwh/EGPO/training_script/train_cql.py
WARNING:tensorflow:From /home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating: non-resource variables are not supported in the long term
It seems you don't install our cython utilities yet! Please reinstall MetaDrive via: or ! But don't worry! We are now falling back to native python implementation!
WARNING:root:It seems you don't install our cython utilities yet! Please reinstall MetaDrive via: or ! But don't worry! We are now falling back to native python implementation!
It seems you don't install our cython utilities yet! Please reinstall MetaDrive via: or ! But don't worry! We are now falling back to native python implementation!
WARNING:root:It seems you don't install our cython utilities yet! Please reinstall MetaDrive via: or ! But don't worry! We are now falling back to native python implementation!
Successfully registered the following environments: ['MetaDrive-test-v0', 'MetaDrive-validation-v0', 'MetaDrive-v0', 'MetaDrive-10envs-v0', 'MetaDrive-1000envs-v0', 'MetaDrive-training0-v0', 'MetaDrive-training1-v0', 'MetaDrive-training2-v0'].
/home/dwh/EGPO/training_script/expert_traj_500.json
Traceback (most recent call last):
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/aiohttp/client.py", line 92, in <module>
    import cchardet as chardet
ModuleNotFoundError: No module named 'cchardet'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dwh/EGPO/training_script/train_cql.py", line 82, in <module>
    custom_callback=ILCallBack,
  File "/home/dwh/EGPO/egpo_utils/train/train.py", line 33, in train
    initialize_ray(test_mode=test_mode, local_mode=local_mode, num_gpus=num_gpus, **init_kws)
  File "/home/dwh/EGPO/egpo_utils/train/utils.py", line 22, in initialize_ray
    **kwargs
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/worker.py", line 718, in init
    ray_params=ray_params)
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/node.py", line 220, in __init__
    self.start_head_processes()
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/node.py", line 840, in start_head_processes
    self.start_dashboard(require_dashboard=False)
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/node.py", line 680, in start_dashboard
    port=self._ray_params.dashboard_port)
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/ray/_private/services.py", line 1150, in start_dashboard
    import aiohttp  # noqa: F401
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/aiohttp/__init__.py", line 6, in <module>
    from .client import (
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/aiohttp/client.py", line 94, in <module>
    import charset_normalizer as chardet  # type: ignore[no-redef]
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/charset_normalizer/__init__.py", line 23, in <module>
    from charset_normalizer.api import from_fp, from_path, from_bytes, normalize
  File "/home/dwh/anaconda3/envs/egpo_a/lib/python3.7/site-packages/charset_normalizer/api.py", line 10, in <module>
    from charset_normalizer.md import mess_ratio
AttributeError: module 'charset_normalizer' has no attribute 'md__mypyc'
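The chain above starts with aiohttp failing to import cchardet and then tripping over a broken charset_normalizer installation (md__mypyc refers to its compiled mypyc extension), so the problem lies in the Python environment rather than in the EGPO or Ray code. Below is a quick diagnostic sketch for the Python 3.7 environment shown in the log; the package list and the remedies in the comments are the commonly suggested ones, not taken from this log:

```python
# Print the versions that aiohttp's import chain depends on, to spot the mismatch.
import pkg_resources  # ships with setuptools, works on Python 3.7

for pkg in ("charset-normalizer", "cchardet", "aiohttp"):
    try:
        print(pkg, pkg_resources.get_distribution(pkg).version)
    except pkg_resources.DistributionNotFound:
        print(pkg, "not installed")

# Typical remedies (run in the shell, inside the same conda env):
#   pip install --force-reinstall charset_normalizer
#   pip install cchardet   # aiohttp then prefers cchardet and skips the broken import
```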

Deploying the model with vllm serve fails with the following error:
ERROR 03-21 09:05:00 engine.py:400] 'Gemma3Config' object has no attribute 'vocab_size'
ERROR 03-21 09:05:00 engine.py:400] Traceback (most recent call last):
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine
ERROR 03-21 09:05:00 engine.py:400]     engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 124, in from_engine_args
ERROR 03-21 09:05:00 engine.py:400]     return cls(ipc_path=ipc_path,
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/engine/multiprocessing/engine.py", line 76, in __init__
ERROR 03-21 09:05:00 engine.py:400]     self.engine = LLMEngine(*args, **kwargs)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 273, in __init__
ERROR 03-21 09:05:00 engine.py:400]     self.model_executor = executor_class(vllm_config=vllm_config, )
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 03-21 09:05:00 engine.py:400]     self._init_executor()
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
ERROR 03-21 09:05:00 engine.py:400]     self.collective_rpc("load_model")
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 03-21 09:05:00 engine.py:400]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/utils.py", line 2196, in run_method
ERROR 03-21 09:05:00 engine.py:400]     return func(*args, **kwargs)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm_ascend/worker/worker.py", line 179, in load_model
ERROR 03-21 09:05:00 engine.py:400]     self.model_runner.load_model()
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm_ascend/worker/model_runner.py", line 818, in load_model
ERROR 03-21 09:05:00 engine.py:400]     self.model = get_model(vllm_config=self.vllm_config)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 03-21 09:05:00 engine.py:400]     return loader.load_model(vllm_config=vllm_config)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/model_executor/model_loader/loader.py", line 406, in load_model
ERROR 03-21 09:05:00 engine.py:400]     model = _initialize_model(vllm_config=vllm_config)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/model_executor/model_loader/loader.py", line 125, in _initialize_model
ERROR 03-21 09:05:00 engine.py:400]     return model_class(vllm_config=vllm_config, prefix=prefix)
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/vllm/model_executor/models/transformers.py", line 135, in __init__
ERROR 03-21 09:05:00 engine.py:400]     self.vocab_size = config.vocab_size
ERROR 03-21 09:05:00 engine.py:400]   File "/usr/local/python3.10/lib/python3.10/site-packages/transformers/configuration_utils.py", line 214, in __getattribute__
ERROR 03-21 09:05:00 engine.py:400]     return super().__getattribute__(key)

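Here vLLM (through the vllm-ascend worker) falls back to the generic transformers.py model wrapper, which reads config.vocab_size directly; Gemma 3 ships a composite configuration in which the vocabulary size sits on the nested text config, so the top-level lookup raises. Below is a minimal sketch of that config layout, assuming a local multimodal Gemma 3 checkpoint at a placeholder path:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("/path/to/gemma-3")  # placeholder path

# Gemma3Config nests the language-model settings under text_config,
# so vocab_size is typically not a top-level attribute.
print(type(cfg).__name__)                 # e.g. Gemma3Config
print(getattr(cfg, "vocab_size", None))   # usually None at the top level
print(cfg.text_config.vocab_size)         # the actual vocabulary size
```

In practice the usual remedy is to run a vLLM / vllm-ascend build that registers Gemma 3 natively (so this fallback wrapper is never used) together with a transformers version that matches it.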