Edit distance Python implementation: edit distance in Python

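The author is developing a Python spell-checking program and needs to find the dictionary words at an edit distance of 2 from a given invalid word. The question asks how to generate candidate words efficiently with Levenshtein distance without traversing the dictionary multiple times. The post shares how Levenshtein distance is computed, along with a Python implementation.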

I'm programming a spellcheck program in Python. I have a list of valid words (the dictionary) and I need to output a list of words from this dictionary that have an edit distance of 2 from a given invalid word.

I know I need to start by generating a list of words with an edit distance of 1 from the invalid word (and then run that again on all the generated words). I have three methods, inserts(...), deletions(...) and changes(...), that should each output a list of words with an edit distance of 1: inserts outputs all valid words with one more letter than the given word, deletions outputs all valid words with one fewer letter, and changes outputs all valid words with one different letter.

I've checked a bunch of places but I can't seem to find an algorithm that describes this process. All the ideas I've come up with involve looping through the dictionary list multiple times, which would be extremely time-consuming. If anyone could offer some insight, I'd be extremely grateful.
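
For illustration, the three helpers described above might be sketched as follows. This is only a rough sketch under stated assumptions: the names edits1 and valid_words_at_distance_two are hypothetical, the alphabet is assumed to be lowercase ASCII, and the dictionary is assumed to be held in a set so that each validity check is a hash lookup rather than a scan of the word list.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string exactly one edit away from `word` (hypothetical helper)."""
    # Every way of splitting the word into a prefix and a suffix.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # Remove the first character of the suffix.
    deletions = {a + b[1:] for a, b in splits if b}
    # Replace the first character of the suffix with a different letter.
    changes = {a + c + b[1:] for a, b in splits if b for c in alphabet if c != b[0]}
    # Insert one letter between prefix and suffix.
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return deletions | changes | inserts


def valid_words_at_distance_two(word, valid_words):
    """Return the valid words exactly two edits away from `word`.

    `valid_words` is assumed to be a set, so the final intersection
    does not require looping over the dictionary once per candidate.
    """
    one = edits1(word)
    # Applying a second edit reaches everything within two edits.
    two = {e2 for e1 in one for e2 in edits1(e1)}
    # Drop strings that are also reachable with zero or one edit.
    return (two - one - {word}) & valid_words
```

The candidate set grows roughly quadratically with the word length, so this approach trades memory for avoiding repeated passes over the dictionary.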

Solution

What you are describing is called edit distance, and there is a nice explanation on wiki. There are many ways to define a distance between two words; the one you want is called the Levenshtein distance. Here is a dynamic-programming (DP) implementation in Python.

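def levenshteinDistance(s1, s2):
    # Make s1 the shorter string so each DP row is as short as possible.
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    # Row 0: distance from each prefix of s1 to the empty string.
    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        # First cell: distance from the empty prefix of s1 to s2[:i2 + 1].
        distances_ = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                # Characters match: carry the diagonal value over unchanged.
                distances_.append(distances[i1])
            else:
                # One more edit than the cheapest of the three neighbouring cells.
                distances_.append(1 + min(distances[i1], distances[i1 + 1], distances_[-1]))
        distances = distances_
    return distances[-1]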
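
To show how a distance function like this could answer the original question in a single pass over the dictionary, here is a minimal sketch. The names edit_distance and words_at_distance are hypothetical and not part of the answer above; the distance function is repeated only so that the example runs on its own. The length check is an added assumption-free shortcut: two words whose lengths differ by more than the target cannot be within that distance, so the DP is skipped for them.

```python
def edit_distance(s1, s2):
    """Levenshtein distance, keeping only the previous DP row in memory."""
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    previous = list(range(len(s1) + 1))
    for i2, c2 in enumerate(s2):
        current = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                current.append(previous[i1])
            else:
                current.append(1 + min(previous[i1], previous[i1 + 1], current[-1]))
        previous = current
    return previous[-1]


def words_at_distance(word, valid_words, target=2):
    """Return the words in `valid_words` exactly `target` edits from `word`."""
    return [
        w for w in valid_words
        # Cheap length filter first, full distance computation second.
        if abs(len(w) - len(word)) <= target and edit_distance(word, w) == target
    ]
```

For example, with the hypothetical dictionary ["hello", "help", "hell", "world", "halo"], calling words_at_distance("hxlo", ...) keeps "hello", "help" and "hell" and drops "halo" (distance 1) and "world" (distance 4). Each dictionary word is visited once, at a cost proportional to the product of the two word lengths.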
