MONGOOSE: Ingest, Monitor, Rinse, Repeat October 29, 2009

Google Tech Talk
October 23, 2009

ABSTRACT

Presented by Daniel Gruhl.

Data analytics technology is currently in high demand as people try to extract as much value as possible from their most valuable resource: the information around them, whether held within their organizations or freely and publicly available. Unfortunately, though many data analytics efforts are focused on a particularly interesting (and often difficult) question whose answer hopefully lies in the data, these projects tend to spend most of their cycles acquiring and ingesting data. Thus, the focus of these efforts tends to tilt away from data analysis and towards data ingestion. MONGOOSE is:

1) A suite of technologies that one can plug domain knowledge cartridges into and that outputs data suitable for OLAP or BI consumption. One plugs in small amounts of domain knowledge that involve pulling in unstructured, semi-structured and structured data, and MONGOOSE converts it all into structured form.

2) A Platform for Worst-Case Scenario Workflow Management. MONGOOSE is built on the assumption that failure happens and must be handled quickly and seamlessly, so that it does not stop or hinder information ingest.

3) A Platform for Community-Based Information Extraction around specific phenomena, whose output can be fed into statistical analysis tools.

Daniel Gruhl (dgruhl@almaden.ibm.com) is a research staff member in the Computer Science Department of IBM Almaden Research Center, San Jose, CA. Dan is currently in the Health Informatics research group. Dan specializes in very large scale text analytics for a variety of applications from healthcare to pop music. Dan co-architected IBM's Unstructured Information Management Architecture (UIMA), which is now the de facto standard for text analytics projects. He earned his Ph.D. in electrical engineering from the Massachusetts Institute of Technology in 2000 with thesis work on distributed text analytics systems. Dan was named in MIT's Technology Review Top 100 (TR 100) in 2004.

Varun Bhagwan (vbhagwan@us.ibm.com) is an advisory software engineer in the Computer Science Department of IBM Almaden Research Center, San Jose, CA. His interests lie in the fields of text analytics, data mining, machine learning/AI, internet technologies, and services science. Since joining IBM Research in 2001, Varun has worked at multiple levels of a large-scale text mining project, ranging from cluster management, to indexing a multi-billion page corpus, to crawling the internet. He is currently a member of the Health Informatics research group. Varun holds a Master's degree in Computer Science from the University of Florida, Gainesville and is currently pursuing a Ph.D. at the University of California, Santa Cruz.

Tyrone Grandison (tyroneg@us.ibm.com) manages the Intelligent Information Systems team in the Computer Science Department at the IBM Almaden Research Center, San Jose, CA. Tyrone's research interests are in data disclosure management as it applies to industry verticals. Over the years, Tyrone has worked in data privacy, RFID data management, privacy-preserving mobile data management and text analytics. Tyrone is a senior member of both the ACM and IEEE and was named Pioneer of the Year by NSBE in 2009. Tyrone received a Ph.D. from Imperial College, London and M.Sc. and B.Sc. degrees from the University of the West Indies, Mona, Jamaica.
