Dynamic Pooling And Unfolding Recursive Autoencoders For Paraphrase Detection

Paraphrase detection is the task of examining two sentences and determining whether they have the same meaning. In order to obtain high accuracy on this task, thorough syntactic and semantic analysis of the two statements is needed. We introduce a method for paraphrase detection based on recursive autoencoders (RAE). Our unsupervised RAEs are based on a novel unfolding objective and learn feature vectors for phrases in syntactic trees. These features are used to measure the word- and phrase-wise similarity between two sentences. Since sentences may be of arbitrary length, the resulting matrix of similarity measures is of variable size. We introduce a novel dynamic pooling layer which computes a fixed-sized representation from the variable-sized matrices. The pooled representation is then used as input to a classifier. Our method outperforms other state-of-the-art approaches on the challenging MSRP paraphrase corpus.
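The variable-sized similarity matrix described above can be illustrated with a minimal sketch. It assumes each parse-tree node (word or phrase) already has a feature vector and that Euclidean distance is the comparison measure. The function names and toy vectors are hypothetical and are not taken from the released code.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarity_matrix(nodes_a, nodes_b):
    """Pairwise distances between all node vectors of two sentences.

    nodes_a, nodes_b: lists of feature vectors, one per parse-tree node.
    The result has len(nodes_a) rows and len(nodes_b) columns, so its
    shape changes with the lengths of the two sentences.
    """
    return [[euclidean(u, v) for v in nodes_b] for u in nodes_a]


# Toy 2-dimensional vectors standing in for learned phrase features.
nodes_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
nodes_b = [[0.0, 0.0], [3.0, 4.0]]
S = similarity_matrix(nodes_a, nodes_b)
```

Here `S` is a 3 x 2 matrix. A different sentence pair would generally give a different shape, which is why a fixed-size pooling step is needed before classification.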


An overview of our paraphrase model. The recursive autoencoder learns phrase features for each node in a parse tree. The distances between all nodes then fill a similarity matrix whose size depends on the length of the sentences. Using a novel dynamic pooling layer, we can compare the variable-sized sentences and classify pairs as being paraphrases or not.
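The dynamic pooling step in this overview can be sketched as follows. This is an illustration under stated assumptions, not the released implementation: the matrix is split into a `parts` x `parts` grid of roughly equal regions, each region is reduced with `min` (so the closest pair in a region is kept), and matrices smaller than the grid are enlarged by repeating rows and columns. The names `split_bounds`, `upsample`, and `dynamic_pool` are hypothetical.

```python
def split_bounds(length, parts):
    """Return `parts` (start, end) ranges covering range(length) as evenly as possible."""
    return [(i * length // parts, (i + 1) * length // parts) for i in range(parts)]


def upsample(matrix, parts):
    """Repeat rows and columns so both dimensions are at least `parts`."""
    rows, cols = len(matrix), len(matrix[0])
    row_rep = -(-parts // rows)  # ceiling division; 1 when rows >= parts
    col_rep = -(-parts // cols)
    out = []
    for row in matrix:
        wide = [x for x in row for _ in range(col_rep)]
        out.extend(list(wide) for _ in range(row_rep))
    return out


def dynamic_pool(matrix, parts, reduce=min):
    """Map a matrix of any shape to a fixed parts x parts grid.

    Each output cell is `reduce` applied to one rectangular region of the
    (possibly upsampled) input. Min pooling is an assumption made here.
    """
    m = upsample(matrix, parts)
    row_bounds = split_bounds(len(m), parts)
    col_bounds = split_bounds(len(m[0]), parts)
    return [
        [
            reduce(m[r][c] for r in range(r0, r1) for c in range(c0, c1))
            for (c0, c1) in col_bounds
        ]
        for (r0, r1) in row_bounds
    ]


# A 4 x 6 matrix and a 1 x 1 matrix both pool to a 2 x 2 grid.
big = [[r * 6 + c + 1 for c in range(6)] for r in range(4)]
pooled_big = dynamic_pool(big, 2)
pooled_small = dynamic_pool([[5.0]], 2)
```

Because the output shape depends only on `parts`, the pooled grid can be flattened and passed to any classifier that expects fixed-length input.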

Download Paper

Download Code

Full Paraphrase System

Computing Compositional Vectors

Updated Related Work

Bibtex
