编辑距离又称为Levenshtein Distance,PHP已经有专门的一个函数Levenshtein()
Levenshtein距离,又称编辑距离,用于比较字符串(String)的相似程度。 LD(s1, s2)=把s1变成s2需要的最小单字符操作(插入、删除和替换)的次数。
它的PHP实现如下:
function Levenshtein_Shore($from,$to)
{
$lf = strlen($from);
$lt = strlen($to);
foreach(range(0,$lf) as $i)
{
$d[0][$i] = $i;
}
foreach(range(0,$lt) as $j)
{
$d[$j][0] = $j;
}
foreach(range(1,$lf) as $i)
{
foreach(range(1,$lt) as $j)
{
if($from[$i - 1] == $to[$j - 1])
$d[$i][$j] = $d[$i - 1][$j - 1];
else
$d[$i][$j] = min($d[$i - 1][$j] + 1,$d[$i][$j - 1] + 1,$d[$i - 1][$j - 1] + 1);
}
}
return $d[$lf][$lt];
}
echo Levenshtein_Shore('saturday','sunday');
echo Levenshtein('saturday','sunday');
麻烦的似乎是证明过程,类似归纳法,详见此处
更多算法问题在这。
DP的解释见这。
A DP is an algorithmic technique which is usually based on a recurrent formula and one (or some) starting states. A sub-solution of the problem is constructed from previously found ones. DP solutions have a polynomial complexity which assures a much faster running time than other techniques like backtracking, brute-force etc.
