C:\Application\Python312\Lib\site-packages\whisper\__init__.py:150: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)
Traceback (most recent call last):
  File "C:\Users\C502\PycharmProjects\PythonProject\语音识别.py", line 11, in <module>
    result = model.transcribe(audio_file, fp16=False, language="zh")
  File "C:\Application\Python312\Lib\site-packages\whisper\transcribe.py", line 133, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
  File "C:\Application\Python312\Lib\site-packages\whisper\audio.py", line 140, in log_mel_spectrogram
    audio = load_audio(audio)
  File "C:\Application\Python312\Lib\site-packages\whisper\audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "C:\Application\Python312\Lib\subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
Time: 2025-06-10 13:42:36
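### Resolving the `torch.load` Security Warning and the Audio Loading Failure in Whisper
When running speech recognition with Whisper, the output above shows two separate issues: a `FutureWarning` from `torch.load`, and a crash while loading the audio file. Each is addressed below.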
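#### 1. The `torch.load` security warning
`torch.load` emits this `FutureWarning` because it is called without `weights_only`, so it defaults to `weights_only=False` and unpickles the checkpoint with the standard pickle module, which can execute arbitrary code if the file is malicious. It is only a warning and is not what makes the script crash. The call sits inside Whisper's own `whisper/__init__.py`, so it cannot be changed from user code, and `map_location` (already passed there) only selects the device the tensors are loaded onto and has no effect on the warning. For checkpoints that Whisper downloads from its official source, the warning can be ignored or filtered. For files you do not fully control, the message itself recommends `weights_only=True`, with any required classes allowlisted through `torch.serialization.add_safe_globals`[^1]. The example that follows loads a different model (Silero STT) through `torch.hub` and only illustrates the related `trust_repo` option.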
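```python
import torch

# Load the Silero STT model through torch.hub on the CPU.
# Note: this is not Whisper, and it does not change Whisper's internal torch.load call.
model, decoder, utils = torch.hub.load(
    repo_or_dir='snakers4/silero-models',
    model='silero_stt',
    language='en',
    device=torch.device('cpu'),
    trust_repo=True,  # skip the trust prompt; only use this for repositories you trust
)
```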
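Setting `trust_repo=True` suppresses `torch.hub`'s untrusted-repository prompt and warning[^2]. It does not affect the `weights_only` `FutureWarning`, which has to be filtered separately or left in place.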
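If the `FutureWarning` is just noise for a checkpoint you trust, it can be filtered with the standard `warnings` module. The following is a minimal sketch, not part of Whisper: the helper name `suppress_torch_load_warning` is hypothetical, and the `message` pattern assumes the warning text begins exactly as in the output quoted above.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def suppress_torch_load_warning():
    """Temporarily hide the torch.load `weights_only` FutureWarning."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message=r"You are using `torch\.load` with `weights_only=False`",
            category=FutureWarning,
        )
        yield


# Usage (assumes openai-whisper is installed; "base" is only an example size):
# import whisper
# with suppress_torch_load_warning():
#     model = whisper.load_model("base")
```

The filter is scoped to the `with` block, so other `FutureWarning` messages stay visible in the rest of the program.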
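#### 2. Audio file loading failure
The traceback ends inside `whisper/audio.py` at `load_audio`, which starts an external process through `subprocess.run`; in Whisper that process is the `ffmpeg` command-line tool. The traceback is cut off before the exception type, but a failure at `Popen` most often means the executable could not be started. Common causes and fixes:
- **`ffmpeg` is missing or not on `PATH`**: install `ffmpeg` and make sure `ffmpeg -version` works in the same terminal that runs the script.
- **Wrong file path**: make sure the audio file path is correct and the file exists.
- **Unsupported file format**: Whisper relies on `ffmpeg` for decoding, so most formats work once it is installed. If you load audio with TorchAudio instead, some formats may need extra dependencies.
- **Dependency version conflicts**: if you use `torchaudio`, make sure its version matches `torch`. For example, with CUDA acceleration, install the matching cu build[^2].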
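Because Whisper's `load_audio` shells out to the `ffmpeg` executable, a quick check before calling `model.transcribe` can tell a missing file apart from a missing tool. The following is a minimal sketch using only the standard library; the helper name `diagnose_audio_input` and the example file name are hypothetical.

```python
import os
import shutil


def diagnose_audio_input(audio_path, tool="ffmpeg"):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if not os.path.isfile(audio_path):
        problems.append(f"audio file not found: {audio_path}")
    # shutil.which searches PATH the same way the shell would.
    if shutil.which(tool) is None:
        problems.append(f"'{tool}' executable not found on PATH")
    return problems


if __name__ == "__main__":
    # "speech_orig.wav" is a placeholder; substitute your own file.
    for problem in diagnose_audio_input("speech_orig.wav"):
        print("Problem:", problem)
```

If the second check fails, installing `ffmpeg` and reopening the terminal or IDE so that the updated `PATH` is picked up is usually the next step.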
The following is a complete audio loading example using `torchaudio`:
```python
from glob import glob

import torchaudio

# Make sure the audio file path is correct
test_files = glob('speech_orig.wav')
if not test_files:
    raise FileNotFoundError("Audio file not found; check the path")

# Load the audio file; the tensor has shape (channels, samples)
audio, sr = torchaudio.load(test_files[0])

# Resample if the sample rate does not match (Whisper works on 16 kHz audio)
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)

# Prepare the input: downmix to a mono 1-D waveform
input_audio = audio.mean(dim=0)
```
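Also note that Whisper works on 30-second windows. `model.transcribe` moves this window across longer recordings by itself, so audio longer than 30 seconds should not raise an error on that path; the limit applies when you call the lower-level decoding functions on a single segment[^3]. If you still want shorter files, for example to bound memory use or to process pieces separately, you can split the audio: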
```python
import librosa
import soundfile as sf


def split_audio(file_path, max_duration=30):
    """Split an audio file into chunks of at most max_duration seconds."""
    y, sr = librosa.load(file_path, sr=None)  # keep the native sample rate
    chunk_len = int(max_duration * sr)
    if len(y) <= chunk_len:
        return [file_path]
    paths = []
    for i, start in enumerate(range(0, len(y), chunk_len)):
        path = f"chunk_{i}.wav"
        # librosa.output.write_wav was removed in librosa 0.8; soundfile replaces it
        sf.write(path, y[start:start + chunk_len], sr)
        paths.append(path)
    return paths
```
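When the chunks are only needed in memory, writing intermediate files can be skipped. The following is a minimal sketch (the helper name `chunk_bounds` is hypothetical) that computes the sample offsets of fixed-length chunks and keeps the final, shorter chunk:

```python
def chunk_bounds(total_samples, sample_rate, max_duration=30.0):
    """Return (start, end) sample offsets of consecutive chunks; the last may be shorter."""
    chunk_len = int(max_duration * sample_rate)
    if chunk_len <= 0:
        raise ValueError("max_duration * sample_rate must be at least one sample")
    return [
        (start, min(start + chunk_len, total_samples))
        for start in range(0, total_samples, chunk_len)
    ]


# 65 s of 16 kHz audio: two full 30 s chunks plus a 5 s remainder
print(chunk_bounds(65 * 16000, 16000))
# → [(0, 480000), (480000, 960000), (960000, 1040000)]
```

Each pair can be used to slice a loaded waveform, for example `y[start:end]`.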
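#### Summary
The `torch.load` `FutureWarning` is harmless for trusted checkpoints and can be filtered, while the audio loading failure should be addressed by checking that `ffmpeg` is installed and that the file path is valid. Long recordings are handled by `model.transcribe` in 30-second windows, so splitting the audio is optional.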