School of Technology and Computer Science Seminars
Streaming Algorithms for Embedding and Computing Edit Distance in the Low Distance Regime
by Michal Koucky (Charles University, Prague, Czech Republic)
Wednesday, April 6, 2016
from
to
(Asia/Kolkata)
at A-201 (STCS Seminar Room)
at A-201 (STCS Seminar Room)
Description |
The Hamming and the edit metrics are two common notions of measuring distances between pairs of strings x,y lying in the Boolean hypercube. The edit distance between x and y is defined as the minimum number of character insertion, deletion, and bit flips needed for converting x into y. Whereas, the Hamming distance between x and y is the number of bit flips needed for converting x to y. In this talk I will present a randomized injective embedding of the edit distance into the Hamming distance with a small (qudratic) distortion. Namely, for any satisfying that their edit distance equals , the Hamming distance between the embedding of and is with high probability. This improves over the distortion ratio of $O(\log n \log^* n)$ obtained by Jowhari (2012) for small values of . Moreover, the embedding output size is linear in the input size and the embedding can be computed using a single pass over the input. I will mention several applications for this embedding. Among our results we provide a one-pass (streaming) algorithm for edit distance running in space and computing edit distance exactly up-to distance . This algorithm is based on kernelization for edit distance that is of independent (joint work with Diptarka Chakraborty and Elazar Goldenberg). |