# What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA

<https://www.aclweb.org/anthology/D07-1003>

## Abstract

This paper presents a syntax-driven approach to question answering, specifically the answer-sentence selection problem for short-answer questions. Rather than using syntactic features to augment existing statistical classifiers (as in previous work), we build on the idea that questions and their (correct) answers relate to each other via loose but predictable syntactic transformations. We propose a probabilistic quasi-synchronous grammar, inspired by one proposed for machine translation (D. Smith and Eisner, 2006), and parameterized by mixtures of a robust nonlexical syntax/alignment model with a(n optional) lexical-semantics-driven log-linear model. Our model learns soft alignments as a hidden variable in discriminative training. Experimental results using the TREC dataset are shown to significantly outperform strong state-of-the-art baselines.

## 1 Introduction and Motivation

Open-domain question answering (QA) is a widely studied and fast-growing research problem. State-of-the-art QA systems are extremely complex. They usually take the form of a pipeline architecture, chaining together modules that perform tasks such as answer type analysis (identifying whether the correct answer will be a person, location, date, etc.), document retrieval, answer candidate extraction, and answer reranking. This architecture is so predominant that each task listed above has evolved into its own subfield and is often studied and evaluated independently (Shima et al., 2006).

At a high level, the QA task boils down to only two essential steps (Echihabi and Marcu, 2003). **The first step**, retrieval, narrows down the search space from a corpus of millions of documents to a focused set of maybe a few hundred using an IR engine, where efficiency and recall are the main focus. **The second step**, selection, assesses each candidate answer string proposed by the first step, and finds the one that is most likely to be an answer to the given question. The granularity of the target answer string varies depending on the type of the question. For example, answers to factoid questions (e.g., Who, When, Where) are usually single words or short phrases, while definitional questions and other more complex question types (e.g., How, Why) look for sentences or short passages. In this work, we fix the granularity of an answer to a single sentence.
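The two-step view above can be sketched in a few lines of Python. This is a toy illustration and not the paper's system: the names `retrieve`, `select`, and `content_overlap`, the keyword-overlap scoring, the sample question, and the three-sentence corpus are all assumptions made up for the example. A real system would use an IR engine for the first step and a learned model for the second.

```python
def tokens(text):
    """Lowercase whitespace tokenization (a toy assumption)."""
    return text.lower().split()


def retrieve(question, corpus, k):
    """Step 1 (retrieval): keep up to k sentences that share a token with the question."""
    q = set(tokens(question))
    scored = [(len(q & set(tokens(doc))), doc) for doc in corpus]
    # Stable sort by overlap only, highest first; ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for overlap, doc in scored[:k] if overlap > 0]


def select(question, candidates, score):
    """Step 2 (selection): return the highest-scoring candidate, or None."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(question, c))


def content_overlap(question, sentence):
    """Hypothetical scorer: count shared tokens longer than three characters."""
    q = {t for t in tokens(question) if len(t) > 3}
    return len(q & set(tokens(sentence)))


# Toy data, invented for illustration only.
corpus = [
    "Bush later met with French President Jacques Chirac .",
    "The weather in Paris was mild .",
    "Stocks fell sharply on Monday .",
]
question = "Who is the president of France ?"

candidates = retrieve(question, corpus, k=2)
best = select(question, candidates, content_overlap)
print(best)
```

Retrieval here is deliberately permissive (any shared token counts), mirroring the emphasis on recall in the first step, while the selection step applies a stricter score to pick a single sentence.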

Earlier work on answer selection relies only on the surface-level text information. Two approaches are most common: surface pattern matching, and similarity measures on the question and answer, represented as bags of words. In the former, patterns for a certain answer type are either crafted manually (Soubbotin and Soubbotin, 2001) or acquired from training examples automatically (Ittycheriah et al., 2001; Ravichandran et al., 2003; Licuanan and Weischedel, 2003). In the latter, measures like cosine similarity are applied to (usually) bag-of-words representations of the question and answer. Although many of these systems have achieved very good results in TREC-style evaluations, shallow methods using the bag-of-words representation clearly have their limitations. Examples of cases where the bag-of-words approach fails abound in QA literature; here we borrow an example used by Echihabi and Marcu (2003). The question is “*Who is the leader of France?*”, and the sentence “Henri Hadjenberg, *who is the leader of France* ’s Jewish community, endorsed ...” (note tokenization), which is not the correct answer, matches all keywords in the question in exactly the same order. (The correct answer is found in “Bush later met with French President Jacques Chirac.”)
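The failure case above can be reproduced with a minimal bag-of-words cosine similarity. This is a sketch under stated assumptions, not any cited system: the helper names `bag_of_words` and `cosine_similarity`, the lowercasing, and the whitespace tokenization are choices made for this example, and the incorrect sentence is used only up to the point where the quotation is truncated.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Map a token list to a lowercased term-frequency vector."""
    return Counter(t.lower() for t in tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two Counter term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Pre-tokenized text; punctuation and the possessive 's are separate tokens.
question = "Who is the leader of France ?".split()
# Fragment as quoted in the text; the original sentence continues past "endorsed".
wrong = ("Henri Hadjenberg , who is the leader of France 's "
         "Jewish community , endorsed").split()
right = "Bush later met with French President Jacques Chirac .".split()

q = bag_of_words(question)
score_wrong = cosine_similarity(q, bag_of_words(wrong))
score_right = cosine_similarity(q, bag_of_words(right))

print(f"incorrect sentence: {score_wrong:.3f}")
print(f"correct sentence:   {score_right:.3f}")
```

Under these assumptions the incorrect sentence scores well above the correct one, which shares no tokens with the question at all ("French" and "France" are different strings), so a purely lexical measure ranks the wrong sentence first.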
