Pattern-oriented access to document collections.
Details
  • Author:Dworman, Irving Garett
  • Degree:Doctor
  • Year:1999
  • Advisor:Kimbrough, Steven
  • Institution:University of Pennsylvania
  • Field:Information Science; Library Science; Computer Science
  • ISBN:0599559802
  • CBH:9953525
  • Country:USA
  • Language:English
  • FileSize:13036519
  • Pages:374
Abstract
This dissertation investigates pattern-oriented access to collections of unstructured text documents. A pattern-oriented information search differs from a more traditional record-oriented search just as the study of an entire forest differs from the inspection of specific trees. For example, to enjoy Abraham Lincoln's eloquence, we might look up a particular speech such as the Gettysburg Address (a trees perspective); to understand the evolution of Lincoln's ideas, we must seek trends across the collection of his public statements (a forest perspective). Data mining seeks this forest perspective by finding statistical patterns in data. Unfortunately, data mining is applied only to highly structured data, and therefore ignores much, if not most, of the world's information, which exists as unstructured text.

Evidence from the Information Retrieval, Information Visualization, Bibliometrics, and Library Science literatures demonstrates that pattern-oriented access to document collections is a critically important task, one in which people often engage even if they do not have tools designed for this purpose. Informed by these literatures, a prototypical pattern-discovery system named Homer is introduced and applied in two empirical studies. The first study required subjects to answer specific questions about the prose of a photographer's captions; the second study required subjects to respond to open-ended medical questions based on a collection of emergency room medical reports. Results show Homer users learning more and taking less time, on average, than users of more traditional record-oriented systems. These results, combined with evidence from the literature, argue strongly that pattern-oriented access to document collections is possible, and can potentially tap vast, previously unavailable sources of knowledge by helping us find the stories hidden within our document collections.
