My research is focused on natural language processing, machine learning, and artificial intelligence. My research group investigates:
Learning from large data sets: Exploding amounts of data are creating new opportunities for artificial intelligence. Our group is focused on unsupervised and semi-supervised learning from large data sets, in particular the Web. We are interested in:
- Statistical models of language built from large bodies of text, to enable automatic knowledge extraction. E.g.:
  - Efficient Methods for Inferring Large Sparse Topic Hierarchies. Doug Downey, Chandra Sekhar Bhagavatula, Yi Yang. (ACL 2015)
  - Learning Representations for Weakly Supervised Natural Language Processing Tasks. Fei Huang, Arun Ahuja, Doug Downey, Yi Yang, Yuhong Guo, and Alexander Yates. (Computational Linguistics, 2014)
  - Using Natural Language to Integrate, Evaluate, and Optimize Extracted Knowledge Bases. Doug Downey, Chandra Sekhar Bhagavatula, Alexander Yates. (AKBC 2013)
  - Overcoming the Memory Bottleneck in Distributed Training of Latent Variable Models of Text. Yi Yang, Alex Yates, Doug Downey. (NAACL-HLT 2013)
- Novel techniques and theoretical understanding for learning from lots of data and limited human input. E.g.:
  - Active Learning with Constrained Topic Model. Yi Yang, Shimei Pan, Doug Downey, Kunpeng Zhang. (ILLVI workshop, ACL 2014)
  - Scaling Semi-supervised Naive Bayes with Feature Marginals. Michael Lucas, Doug Downey. (ACL 2013)
  - Look Ma, No Hands: Analyzing the Monotonic Feature Abstraction for Text Classification. Doug Downey and Oren Etzioni. (NIPS 2008)
  - Analysis of a Probabilistic Model of Redundancy in Unsupervised Information Extraction. Doug Downey, Oren Etzioni, and Stephen Soderland. (Artificial Intelligence, 2010)
Applications of knowledge extraction from the Web: We're exploring applications of our work on extracting and synthesizing the Web's information, in order to enable new and improved Web search experiences. E.g.:
- TextJoiner: On-demand Information Extraction with Multi-Pattern Queries. Chandra Sekhar Bhagavatula, Thanapon Noraset, Doug Downey. (AKBC 2014)
- Adding High-Precision Links to Wikipedia. Thanapon Noraset, Chandra Sekhar Bhagavatula, Doug Downey. (EMNLP 2014)
- Analyzing the Content Emphasis of Web Search Engines. Mohammed Alam, Doug Downey. (SIGIR 2014)
- Methods for Exploring and Mining Tables on Wikipedia. Chandra Sekhar Bhagavatula, Thanapon Noraset, Doug Downey. (IDEA 2013)
See also our project page for Web Information Extraction, an NSF-funded project related to the above directions, focused on scaling and integrating knowledge extracted automatically from the Web.