Commit 12679b8b authored by Robert Haas

In levenshtein_internal(), describe algorithm a bit more clearly.

parent 54c88dee
@@ -277,15 +277,25 @@ levenshtein_internal(text *s, text *t,
 	++n;
 	/*
-	 * Instead of building an (m+1)x(n+1) array, we'll use two different
-	 * arrays of size m+1 for storing accumulated values. At each step one
-	 * represents the "previous" row and one is the "current" row of the
-	 * notional large array.
+	 * One way to compute Levenshtein distance is to incrementally construct
+	 * an (m+1)x(n+1) matrix where cell (i, j) represents the minimum number
+	 * of operations required to transform the first i characters of s into
+	 * the first j characters of t. The last column of the final row is the
+	 * answer.
+	 *
+	 * We use that algorithm here with some modification. In lieu of holding
+	 * the entire array in memory at once, we'll just use two arrays of size
+	 * m+1 for storing accumulated values. At each step one array represents
+	 * the "previous" row and one is the "current" row of the notional large
+	 * array.
 	 */
 	prev = (int *) palloc(2 * m * sizeof(int));
 	curr = prev + m;
-	/* Initialize the "previous" row to 0..cols */
+	/*
+	 * To transform the first i characters of s into the first 0 characters
+	 * of t, we must perform i deletions.
+	 */
 	for (i = 0; i < m; i++)
 		prev[i] = i * del_c;
@@ -297,8 +307,8 @@ levenshtein_internal(text *s, text *t,
 		int y_char_len = n != t_bytes + 1 ? pg_mblen(y) : 1;
 		/*
-		 * First cell must increment sequentially, as we're on the j'th row of
-		 * the (m+1)x(n+1) array.
+		 * To transform the first 0 characters of s into the first j
+		 * characters of t, we must perform j insertions.
 		 */
 		curr[0] = j * ins_c;
......
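
The two-row scheme that the new comments describe can be sketched in isolation as follows. This is a minimal standalone illustration under stated assumptions, not the PostgreSQL implementation: the names `lev_two_rows` and `min3` are hypothetical, `malloc` stands in for `palloc`, the strings are treated as single-byte characters (so nothing corresponding to `pg_mblen` appears), and the substitution cost parameter `sub_c` is assumed by analogy with `ins_c` and `del_c`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: smallest of three ints. */
static int
min3(int a, int b, int c)
{
	int		lo = a < b ? a : b;

	return lo < c ? lo : c;
}

/*
 * Illustrative two-row Levenshtein distance over single-byte strings.
 * Cell (i, j) of the notional matrix is the cost of transforming the first
 * i characters of s into the first j characters of t.  Only row j-1 ("prev")
 * and row j ("curr") are kept in memory.  Returns -1 if allocation fails.
 */
int
lev_two_rows(const char *s, const char *t, int ins_c, int del_c, int sub_c)
{
	int		m = (int) strlen(s) + 1;	/* cells per row: i = 0..strlen(s) */
	int		n = (int) strlen(t) + 1;	/* number of rows: j = 0..strlen(t) */
	int	   *base = malloc(2 * (size_t) m * sizeof(int));
	int	   *prev;
	int	   *curr;
	int		i;
	int		j;
	int		result;

	if (base == NULL)
		return -1;
	prev = base;
	curr = base + m;

	/* First i characters of s -> first 0 characters of t: i deletions. */
	for (i = 0; i < m; i++)
		prev[i] = i * del_c;

	for (j = 1; j < n; j++)
	{
		int	   *tmp;

		/* First 0 characters of s -> first j characters of t: j insertions. */
		curr[0] = j * ins_c;

		for (i = 1; i < m; i++)
		{
			int		ins = prev[i] + ins_c;		/* from cell (i, j-1) */
			int		del = curr[i - 1] + del_c;	/* from cell (i-1, j) */
			int		sub = prev[i - 1] +
				(s[i - 1] == t[j - 1] ? 0 : sub_c);	/* from cell (i-1, j-1) */

			curr[i] = min3(ins, del, sub);
		}

		/* The row just filled becomes "previous" for the next iteration. */
		tmp = curr;
		curr = prev;
		prev = tmp;
	}

	/* After the final swap, prev holds the last row; its last cell is the answer. */
	result = prev[m - 1];
	free(base);
	return result;
}
```

With unit costs, `lev_two_rows("kitten", "sitting", 1, 1, 1)` evaluates to 3. Swapping the two row pointers rather than copying keeps each outer iteration linear in the length of `s` while using only `2 * m` integers of storage.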