Data Sets and Codebases

  1. Data sets and code from my thesis (also used in our IJCAI 2005, ACL 2007, and NIPS 2008 papers)
  2. Codebase for the LEX algorithm (from our IJCAI 2007 paper, Locating Complex Named Entities in Web Text)
  3. HMM-LM: an MPI-based parallel HMM package for language modeling (from our NAACL 2010 paper, Improved Extraction Assessment through Better Language Models)
  4. The Atlasify 240 semantic relatedness data set (from our SIGIR 2012 paper, Explanatory Semantic Relatedness and Explicit Spatialization for Exploratory Search)