School of Technology and Computer Science Seminars

Latent Dirichlet Allocation for Text Segmentation

by Dr. Hemant Misra (Xerox Research Center Europe, France)

Tuesday, December 28, 2010 (Asia/Kolkata)
at Colaba Campus (A-212)
Description
In this presentation, we first review latent Dirichlet allocation (LDA), an unsupervised topic model, and propose applying it to the task of text segmentation. The proposed method achieves state-of-the-art performance on a benchmark database, can perform segmentation online, and assigns a meaningful topic distribution to each segment. This last property is particularly useful for information retrieval at the segment level. We will also discuss how the computational cost of the dynamic programming (DP) algorithm typically used for the search can be reduced by more than 95%, and how this result applies to text segmentation in general.
Organised by John Barretto