Algebraic Techniques for Multilingual Document Clustering February 8, 2011


Google Tech Talks
January 25, 2011

Presented by Brett Bader.

ABSTRACT

Multilingual documents pose difficulties for clustering by topic, not least because translating everything to a common language is not feasible with a large corpus or many languages. This presentation will address those difficulties with a variety of novel algebraic methods for efficiently clustering multilingual text documents, and briefly illustrate their implementation via high-performance computing. The methods use a multilingual parallel corpus as a 'Rosetta Stone' from which algorithmic variations (including statistical morphological analysis to bypass the need for stemming) of Latent Semantic Analysis (LSA) are able to learn concepts in term space. New documents are projected into this concept space to produce language-independent feature vectors for subsequent use in similarity calculations or machine learning applications. Our experiments show that the new methods have better performance than LSA, and possess some interesting and counter-intuitive properties.
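To make the 'Rosetta Stone' idea concrete, here is a minimal Python sketch of the baseline the abstract compares against: plain LSA trained on a parallel corpus, not the new methods presented in the talk. The toy English/Spanish corpus, the `en:`/`es:` term prefixes, the rank `k = 2`, and every function name are assumptions made for illustration only. Weighting schemes and the scaling by inverse singular values that some LSA variants apply when projecting are left out for brevity.

```python
import numpy as np

# Hypothetical toy parallel corpus: each entry is one document
# given in English and in Spanish (illustrative data only).
parallel_corpus = [
    ("cat dog pet", "gato perro mascota"),
    ("dog pet food", "perro mascota comida"),
    ("stock market bank", "bolsa mercado banco"),
    ("bank market money", "banco mercado dinero"),
]

def build_vocab(corpus):
    """Map every language-tagged term to a row index."""
    terms = set()
    for en, es in corpus:
        terms.update("en:" + w for w in en.split())
        terms.update("es:" + w for w in es.split())
    return {term: i for i, term in enumerate(sorted(terms))}

def vectorize(text, lang, vocab):
    """Raw term-count vector for a single-language text."""
    vec = np.zeros(len(vocab))
    for word in text.split():
        idx = vocab.get(lang + ":" + word)
        if idx is not None:
            vec[idx] += 1.0
    return vec

vocab = build_vocab(parallel_corpus)

# Each column stacks the terms of both translations of one document,
# so co-occurring English and Spanish terms are tied together.
X = np.column_stack([
    vectorize(en, "en", vocab) + vectorize(es, "es", vocab)
    for en, es in parallel_corpus
])

# Truncated SVD: the first k left singular vectors span the concept space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
U_k = U[:, :k]

def project(text, lang):
    """Project a new single-language document into the concept space."""
    return U_k.T @ vectorize(text, lang, vocab)

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

sim_same_topic = cosine(project("dog pet", "en"), project("perro mascota", "es"))
sim_diff_topic = cosine(project("dog pet", "en"), project("banco mercado", "es"))
print(sim_same_topic, sim_diff_topic)
```

In this sketch the English query and its Spanish counterpart share no terms, yet the same-topic pair should score well above the cross-topic pair, because the parallel training documents place the co-occurring terms of both languages in the same concept directions. The resulting vectors could then be passed to any standard clustering routine.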

Brett W. Bader received his Ph.D. in computer science from the University of Colorado at Boulder, studying higher-order methods for optimization and solving systems of nonlinear equations. In 2003, Brett received the John von Neumann Research Fellowship at Sandia National Laboratories, where he now develops algorithms for multi-way data analysis and machine learning for informatics applications in networks and text.
