# Unsupervised Question Decomposition for Question Answering (@ EMNLP 2020)

### Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela

This paper tackles a problem that seems fundamentally quite important to me. Given a question, how can we decompose it into easier questions? This is clearly related to how humans answer questions. Suppose you don't know Alice and Bob, and I ask you: "who is older, Alice or Bob?". That naturally maps to at least three subquestions: "how old is Alice?", "how old is Bob?", and "which number is greater?". It turns out we have Question Answering models that are good at answering each of these simpler questions. But how do we break down a complex question?
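To make the decompose-then-recompose idea concrete, here is a minimal sketch (not the paper's model): the comparison question is answered by routing two subquestions to a single-hop QA model and combining the sub-answers. `simple_qa` is a hypothetical stand-in for any such model; here it is just a toy lookup.

```python
def simple_qa(question, facts):
    # Toy stand-in for a single-hop QA model: answer by direct lookup.
    return facts[question]

def answer_who_is_older(facts):
    # Decompose "Who is older, Alice or Bob?" into two subquestions,
    # then recompose the sub-answers with a comparison.
    age_alice = simple_qa("How old is Alice?", facts)
    age_bob = simple_qa("How old is Bob?", facts)
    return "Alice" if age_alice > age_bob else "Bob"

facts = {"How old is Alice?": 42, "How old is Bob?": 35}
print(answer_who_is_older(facts))  # -> Alice
```

The hard part the paper addresses is, of course, generating the subquestions themselves without supervision; the recomposition step above is the easy half.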
There are a number of things in the paper that I find inelegant, solution-wise. For example, the model always decomposes a question into exactly two subquestions. If you give it a simple question that the base QA model can already answer, it still decomposes it; and it cannot produce more than two subquestions, which their objective makes computationally hard to extend. The model is also quite complicated, an amalgamation of different unsupervised objectives, which has been common in unsupervised NLP.

Finally, the idea of maximally diverging subquestions does not appear sound to me. For example, "how old is Alice?" and "how old is Bob?" are very similar, yet they are the right questions to ask. Their model seems to produce paraphrases (e.g. "how many years ago was Bob born?") to get around that, which doesn't feel like what should happen. I think you want questions that provide different (complementary) useful bits of information, not necessarily questions that are as divergent as possible. For instance, you don't want to ask both "how old is Alice?" and "what year was Alice born in?", even though they are quite different on the surface.