Customizing collections
50:18  - 2 years ago
Google Tech Talks, May 31, 2007

ABSTRACT

Efficient book scanning and increasingly sophisticated OCR have laid the foundation for collections that are far larger, but also much less structured, than the carefully curated collections of literary and historical materials on which demanding study has depended. This talk describes how high-performance services can be built on top of these large collections. The challenge is to provide mechanisms whereby particular communities customize the content and the services that underlie very large collections. No centralized entity can optimize its services for every community. We need mechanisms with which communities can extend OCR (e.g., adding new language models and/or character sets), multilingual services (e.g., adding language-specific modules such as morphological analyzers, and either adding or pointing to knowledge sources such as machine-readable dictionaries and parallel texts), and named entity identification/information extraction services (e.g., adding domain-specific gazetteers, biographical encyclopedias, etc.). In such an environment, communities need not only APIs to search and visualization services but also reasonable methods with which to improve the performance of core services on their particular materials. This talk will review services on which demanding work within several areas of the humanities will depend.