Devendra Singh Sachan

I am a PhD student at the School of Computer Science, McGill University, and a member of Mila, where I am fortunate to be advised by Joelle Pineau. I was previously advised by William Hamilton (now at Citadel LLC).

My research interests are in natural language processing and machine learning. I have done internships at DeepMind (with Manzil Zaheer and Rob Fergus), Meta AI (with Luke Zettlemoyer, Mike Lewis, and Scott Wen-tau Yih), Amazon AI (with Marcello Federico), and NVIDIA (with Bryan Catanzaro, Mohammad Shoeybi, and Mostofa Patwary). I have also collaborated with the awesome Dani Yogatama from DeepMind (now at USC).

I completed my master's at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Before that, I did my bachelor's at the Indian Institute of Technology, Guwahati.

Email  /  CV (updated April 2022)  /  Google Scholar  /  Twitter  /  GitHub


I'm interested in natural language processing, machine learning, optimization, and deep learning. More recently, my research has focused on information retrieval, open-domain question answering, and pre-trained language models.

Questions Are All You Need to Train a Dense Passage Retriever
Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer
TACL 2023 (presented at ACL 2023)
[poster] [slides] [code]
Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions
Cuong Hoang, Devendra Singh Sachan, Prashant Mathur, Brian Thompson, Marcello Federico
EACL 2023 (Findings Track)
Improving Passage Retrieval with Zero-Shot Question Generation
Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer
EMNLP 2022
[poster] [code]
End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
NeurIPS 2021
[slides] [poster] [code]
End-to-End Training of Neural Retrievers for Open-Domain Question Answering
Devendra Singh Sachan, Mostofa Patwary, Mohammad Shoeybi, Neel Kant, Wei Ping, William Hamilton, Bryan Catanzaro
ACL 2021
Oral Presentation
[slides] [code]
Do Syntax Trees Help Pre-trained Transformers Extract Information?
Devendra Singh Sachan, Yuhao Zhang, Peng Qi, William Hamilton
EACL 2021
Oral Presentation
[slides] [poster] [code]
Stronger Transformers for Neural Multi-Hop Question Generation
Devendra Singh Sachan, Lingfei Wu, Mrinmaya Sachan, William Hamilton
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wangrong Zhu, Devendra Singh Sachan, Eric P. Xing
ACL 2019
Best Demo Paper Nomination
Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function
Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov
AAAI 2019
Spotlight Presentation
Adaptive Methods for Nonconvex Optimization
Manzil Zaheer, Sashank Reddi, Devendra Singh Sachan, Satyen Kale, Sanjiv Kumar
NeurIPS 2018
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models
Devendra Singh Sachan, Graham Neubig
WMT 2018
Oral Presentation
[slides] [code]
Investigating the Working of Text Classifiers
Devendra Singh Sachan, Manzil Zaheer, Ruslan Salakhutdinov
Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition
Devendra Singh Sachan, Pengtao Xie, Mrinmaya Sachan, Eric P Xing
MLHC 2018
Spotlight Presentation
When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?
Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Janani Padmanabhan, Graham Neubig
NAACL 2018
XNMT: The eXtensible Neural Machine Translation Toolkit
Graham Neubig, Matthias Sperber, Xinyi Wang, Matthieu Felix, Austin Matthews, Sarguna Padmanabhan, Ye Qi, Devendra Singh Sachan, Philip Arthur, Pierre Godard, John Hewitt, Rachid Riad, Liming Wang
AMTA 2018
Sports Video Event Classification from Multimodal Information using Deep Learning
Devendra Singh Sachan, Umesh Tekwani, Amit Sethi
AAAI 2013 Fall Symposium Series
Academic Services
Reviewer: NeurIPS (2022, 2021, 2020), ICLR (2022, 2021), ACL (2023, 2019), NAACL (2021, 2019), CoNLL 2018, EMNLP (2022, 2018)
  • Highlighted reviewer of ICLR 2022

Program Committee Member: Spa-NLP 2022, Rep4NLP 2020, WMT 2019, WMT 2018
Teaching Experience
TA for COMP-424 Artificial Intelligence, Spring 2020 at McGill University, Montreal, Canada.

TA for IFT-6390 Fundamentals of Machine Learning, Fall 2019 at University of Montreal, Montreal, Canada.

TA for 10-725 Convex Optimization, Fall 2017 at Carnegie Mellon University, Pittsburgh, USA.
Comments highlighted in the MIT Technology Review magazine.

Design and source code from Jon Barron's website.