Abstract
CiteSeer, a scientific literature search engine that focuses on documents in the computer science and information science domains, suffers from scalability issues in both the number of requests and the size of the indexed document collection, which have increased dramatically over the years. CiteSeerX is an effort to re-architect the search engine. In this paper, we present our initial design of a framework for caching query results, indices, and documents. The design is based on an analysis of the logged workload in CiteSeer. Our experiments, based on mock client requests that simulate actual user behavior, confirm that our approach is effective in enhancing system performance.