Monrai Blog

News about Cypher, Semantic Web, Natural Language Processing, and Computational Linguistics

Saturday, August 05, 2006

AOL's Data Collection

eBiquity has a post on AOL's release of a large-scale collection of natural language questions and answers, which includes 20K manually annotated queries:

AOL Research has released some interesting data collections, including:

  • 20K hand labeled, classified queries
  • 3.5M web question answering queries (who, what, where, when …)
  • Query streams for 500K users over three months (20M queries)
  • Query arrival rates for queuing analysis
  • 2M queries against US Government domains

Additional datasets are promised in the future.

It would be nice if someone took the time to evaluate how useful this data could be for Cypher.

