Description |
The Hamming and the edit metrics are two common notions of measuring
distances between pairs of strings x,y lying in the Boolean hypercube. The
edit distance between x and y is defined as the minimum number of
character insertion, deletion, and bit flips needed for converting x into
y. Whereas, the Hamming distance between x and y is the number of bit
flips needed for converting x to y.
In this talk I will present a randomized injective embedding of the edit
distance into the Hamming distance with a small (qudratic) distortion.
Namely, for any satisfying that their edit distance equals , the
Hamming distance between the embedding of and is with
high probability. This improves over the distortion ratio of $O(\log n
\log^* n)$ obtained by Jowhari (2012) for small values of . Moreover,
the embedding output size is linear in the input size and the embedding
can be computed using a single pass over the input.
I will mention several applications for this embedding. Among our results
we provide a one-pass (streaming) algorithm for edit distance running in
space and computing edit distance exactly up-to distance .
This algorithm is based on kernelization for edit distance that is of
independent (joint work with Diptarka Chakraborty and Elazar Goldenberg).
|