C:\Users\faroh\Desktop\nari>py test3.txt
Traceback (most recent call last):
  File "C:\Users\faroh\Desktop\nari\test3.txt", line 18, in <module>
    copy_txt_file(source_path, target_path)
  File "C:\Users\faroh\Desktop\nari\test3.txt", line 9, in copy_txt_file
    line = source_file.readline()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb7 in position 20: invalid start byte
Date: 2025-06-17 13:42:22
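### Fixing the UnicodeDecodeError when copying a txt file in Python
When working with text files, a `UnicodeDecodeError` usually means the file's actual encoding does not match the encoding used to read it. For example, decoding a non-UTF-8 file as `utf-8` raises this error[^1]. In the traceback above, byte `0xb7` cannot start a UTF-8 sequence, so the source file is very likely not UTF-8.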
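The mismatch can be reproduced without any file. The following is a minimal sketch, assuming the text was saved as GBK (an assumption for illustration; the original file's encoding is not known):

```python
# Encode a short Chinese string as GBK, a legacy encoding common on Chinese Windows.
gbk_bytes = "中文".encode("gbk")

try:
    # Reading GBK bytes as UTF-8 fails because the byte sequence is not valid UTF-8.
    gbk_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print(f"utf-8 failed: {exc}")

# Decoding with the encoding that was actually used succeeds.
roundtrip = gbk_bytes.decode("gbk")
print(roundtrip)
```

The same bytes are valid under one codec and invalid under another, which is why the fix is to name the right encoding rather than to change the file.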
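The following approaches address the problem:
#### Method 1: Detect and specify the correct encoding
If you are not sure of the file's actual encoding, you can detect it with the third-party `chardet` or `charset-normalizer` library.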
```python
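import chardet  # third-party: pip install chardet

def detect_encoding(file_path):
    """Guess the file's encoding from its raw bytes."""
    with open(file_path, 'rb') as f:
        raw_data = f.read()
    result = chardet.detect(raw_data)
    # 'encoding' may be None if chardet cannot make a guess
    return result['encoding']

source_path = "source.txt"
encoding = detect_encoding(source_path)
print(f"Detected encoding: {encoding}")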
```
Then pass the detected encoding when reading the file:
```python
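target_path = "target.txt"  # example output path

# Read and write with the detected encoding so the text is copied unchanged
with open(source_path, mode='r', encoding=encoding) as source_file, \
     open(target_path, mode='w', encoding=encoding) as target_file:
    for line in source_file:
        target_file.write(line)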
```
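#### Method 2: Ignore or replace undecodable characters
If exact preservation of every character is not required, pass `errors='ignore'` or `errors='replace'` to skip or substitute the bytes that cannot be decoded. Both options lose data, so they are best suited to files that are mostly valid UTF-8 with a few stray bytes.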
```python
# errors='ignore' silently drops any byte that is not valid UTF-8
with open(source_path, mode='r', encoding='utf-8', errors='ignore') as source_file, \
     open(target_path, mode='w', encoding='utf-8') as target_file:
    for line in source_file:
        target_file.write(line)
```
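To make the difference between the two error handlers concrete, here is a small sketch on an in-memory byte string. The sample bytes are made up for illustration; `0xb7` is the same offending byte reported in the traceback.

```python
# A mostly-ASCII byte string with one byte that is not valid UTF-8.
raw = b"abc\xb7def"

# 'ignore' drops the bad byte entirely.
ignored = raw.decode("utf-8", errors="ignore")

# 'replace' substitutes U+FFFD (the Unicode replacement character).
replaced = raw.decode("utf-8", errors="replace")

print(len(ignored), len(replaced))
```

`errors='replace'` at least leaves a visible marker where data was lost, which makes damaged spots easier to find later.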
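#### Method 3: Read in binary mode and decode manually
For harder encoding problems, open the file in binary mode, read it line by line, and try different encodings on each line.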
```python
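# newline='' stops Windows from turning the '\r\n' already in each line into '\r\r\n'
with open(source_path, 'rb') as source_file, \
     open(target_path, 'w', encoding='utf-8', newline='') as target_file:
    for line in source_file:
        try:
            decoded_line = line.decode('utf-8')
        except UnicodeDecodeError:
            # Fall back to another encoding. latin1 never raises, but the
            # text will be wrong if the file is really GBK or similar.
            decoded_line = line.decode('latin1')
        target_file.write(decoded_line)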
```
#### Method 4: Use a general-purpose encoding library
The `charset-normalizer` library can detect the encoding and decode the file in one step.
```python
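from charset_normalizer import from_path  # third-party: pip install charset-normalizer

best_match = from_path(source_path).best()
if best_match is None:
    raise ValueError(f"Could not detect an encoding for {source_path}")
# str() on the match returns the decoded text; newline='' keeps line endings as-is
with open(target_path, 'w', encoding='utf-8', newline='') as target_file:
    target_file.write(str(best_match))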
```
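Choose among these methods according to your needs so that the file is read and written correctly[^3].
### Notes
- If the file mixes content in several encodings, more elaborate handling may be needed.
- When processing Chinese text on Windows, you may run into GBK-encoded files[^2]. In that case, try setting the encoding to `gbk` or `cp936`.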
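
The GBK suggestion above can be combined with a simple fallback using only the standard library. This is a rough sketch rather than a definitive implementation: the function name `copy_text_with_fallback` and the default candidate list are hypothetical, chosen here for illustration.

```python
from pathlib import Path

def copy_text_with_fallback(source_path, target_path, encodings=("utf-8", "gbk")):
    """Decode the source with the first candidate encoding that works,
    then write the text to the target as UTF-8. Returns the encoding used."""
    raw = Path(source_path).read_bytes()
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        # newline='' writes the original line endings unchanged
        with open(target_path, "w", encoding="utf-8", newline="") as target_file:
            target_file.write(text)
        return encoding
    raise ValueError(f"none of {encodings} could decode {source_path}")
```

Order matters: strict codecs such as `utf-8` should come first, because a permissive codec may decode the wrong bytes without raising and produce garbled text.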