Estimator
for Levenshtein distance. #
The usual algorithm for computing Levenshtein distances
provides successively better lower bounds for the Levenshtein distance as it runs,
as proved in Mathlib.Data.List.EditDistance.Bounds
.
In this file we package that fact as an instance of
Estimator (Thunk.mk fun _ => levenshtein C xs ys) (LevenshteinEstimator C xs ys)
allowing us to use the Estimator
framework for Levenshtein distances.
This is then used in the implementation of rewrite_search
to avoid needing the entire edit distance calculation in unlikely search paths.
Data showing that the Levenshtein distance from xs
to ys
is bounded below by the minimum Levenshtein distance between some suffix of xs
and a particular suffix of ys
.
This bound is (non-strict) monotone as we take longer suffixes of ys
.
This is an auxiliary definition for the later LevenshteinEstimator
:
this variant constructs a lower bound for the pair consisting of
the Levenshtein distance from xs
to ys
,
along with the length of ys
.
- pre_rev : List β
The prefix of
ys
that is not is not involved in the bound, in reverse order. - suff : List β
The suffix of
ys
, such that the distance fromxs
toys
is bounded below by the minimum distance from any suffix ofxs
to this suffix. - split : List.reverse self.pre_rev ++ self.suff = ys
Witness that
ys
has been decomposed into a prefix and suffix. - distances : { r : List δ // 0 < List.length r }
The distances from each suffix of
xs
tosuff
. - distances_eq : self.distances = suffixLevenshtein C xs self.suff
Witness that
distances
are correct. The current bound on the pair (distance from
xs
toys
, length ofys
).- bound_eq : self.bound = match self.pre_rev, (_ : List.reverse self.pre_rev ++ self.suff = ys) with | [], split => ((↑self.distances)[0], List.length ys) | x, split => (List.minimum_of_length_pos (_ : 0 < List.length ↑self.distances), List.length self.suff)
Predicate describing the current bound.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
An estimator for Levenshtein distances.
Equations
- LevenshteinEstimator C xs ys = Estimator.fst { fn := fun (x : Unit) => (levenshtein C xs ys, List.length ys) } (LevenshteinEstimator' C xs ys)
Instances For
Equations
- One or more equations did not get rendered due to their size.
The initial estimator for Levenshtein distances.
Equations
- One or more equations did not get rendered due to their size.