Semantic Compositionality Through Recursive Matrix-Vector Spaces

Single-word vector space models have been very successful at learning lexical information. However, they cannot capture the compositional meaning of longer phrases, which prevents them from reaching a deeper understanding of language. We introduce a recursive neural network (RNN) model that learns compositional vector representations for phrases and sentences of arbitrary syntactic type and length. Our model assigns a vector and a matrix to every node in a parse tree: the vector captures the inherent meaning of the constituent, while the matrix captures how it changes the meaning of neighboring words or phrases. This matrix-vector RNN can learn the meaning of operators in propositional logic and natural language. The model obtains state-of-the-art performance on three different experiments: predicting fine-grained sentiment distributions of adverb-adjective pairs; classifying sentiment labels of movie reviews; and classifying semantic relationships such as cause-effect or topic-message between nouns using the syntactic path between them.
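To make the vector-plus-matrix idea concrete, here is a minimal sketch of how two child nodes might be combined into a parent node. It assumes the composition functions as given in the paper, p = g(W [Ba; Ab]) for the parent vector and P = W_M [A; B] for the parent matrix, with g taken to be tanh. The helper names (matvec, matmul, compose) and the toy weights are illustrative assumptions and do not come from the released code.

```python
import math


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(X, Y):
    """Multiply matrix X (r x k) by matrix Y (k x c)."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def compose(a, A, b, B, W, W_M):
    """Combine children (a, A) and (b, B) into a parent (p, P).

    a, b : child vectors of length n
    A, B : child matrices, n x n
    W    : n x 2n weights for the parent vector (assumed shape)
    W_M  : n x 2n weights for the parent matrix (assumed shape)
    """
    # Each child's matrix modifies the other child's vector.
    stacked_vec = matvec(B, a) + matvec(A, b)            # length 2n
    p = [math.tanh(z) for z in matvec(W, stacked_vec)]   # length n
    # Stack the child matrices row-wise (2n x n), then project back to n x n.
    P = matmul(W_M, A + B)
    return p, P


# Toy example with n = 2. All values are made up for illustration.
identity = [[1.0, 0.0], [0.0, 1.0]]
a, b = [0.5, -0.5], [1.0, 2.0]
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]    # picks out the B*a half
W_M = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # sums A and B
p, P = compose(a, identity, b, identity, W, W_M)
```

With identity child matrices, each word leaves its neighbor's vector unchanged, so the toy parent vector is just tanh applied to a, and the toy parent matrix is A + B. In a trained model W, W_M and every word's vector and matrix would be learned; none of that training is shown here.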

Comments

For remarks, critical comments or other thoughts on the paper.

Richard Socher, 13 April 2014, 23:52

Hi Thomas, I think you're mixing up datasets. There is a document-level one from the same authors on which results reach 90%. Best, Richard

Thomas, 17 March 2014, 22:18

Hi,

This is very interesting work. I have a question about the movie review benchmark you ran with the recursive neural network. In Table 1 of the "Semantic Compositionality..." paper, you list some state-of-the-art results, the best being 79% using MV-RNN. I must say I was very surprised by that, because others have tried many techniques that can reach 90+% accuracy.

I am wondering if I am missing some important assumption in the paper. Please help me understand your method better.

Brody, 2 August 2013

Hi Peter,

Sorry for the delay. The problem occurs because the binary executable for the sst tagger, which gathers external features like POS and NER, was accidentally included in the download. You'll need to download and build it on your machine in order for it to work with external features. You can download it here: http://sourceforge.net/projects/supersensetag/

After you build it, reset the pathToSST variable within classifyRelations.sh to point to the new folder. Let me know if you run into any more issues.

Best, Brody

Peter Hastings, 19 July 2013, 05:41

No, I was wrong about that one. I substituted a known word for the unknown one, and it didn't make a difference. But I am curious about this line (around line 50) in the processTags.m file:

     splitLine = splitLine(1:end-5);

It looks like it's dropping the features for the last word in the sentence, but I can't see why. And when I comment it out, I don't get the error.
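For readers less familiar with MATLAB indexing, the following is a small Python analogue of what `splitLine(1:end-5)` does: it keeps every element except the last five. The function name drop_last and the placeholder fields are hypothetical, and the sketch does not say whether those five trailing fields really belong to the last word, which is the open question above.

```python
def drop_last(items, n=5):
    """Python analogue of MATLAB's items(1:end-n): keep all but the last n items."""
    if n <= 0:
        # items[:-0] would be empty in Python, while 1:end-0 keeps everything in MATLAB.
        return list(items)
    return list(items[:-n])


# Placeholder fields; the real contents of splitLine are not shown in this thread.
split_line = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8"]
kept = drop_last(split_line)      # the first three fields survive
short = drop_last(["a", "b"])     # fewer than five fields: nothing survives
```

If the tagger output carries fewer fields than expected, a slice like this can silently shorten or empty the list, which would be consistent with a later index running past the end.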

Peter Hastings, 18 July 2013, 22:53

Hi. I tried to run the relationClassification code using the four sentences provided in the README file as the input. (Thanks for making it available!) But I got the error shown below. I tried to track it down, and it looks like in the second sentence, "The <e1>company</e1> fabricates plastic <e2>chairs</e2>.", the word "fabricates" comes out as unknown in the allSStr matrix. allSHyp appears to hold the parser tags for each sentence, but for the second sentence there are only 4 items: ['0' 'B-noun.group' 'B-verb.creation' 'B-noun.substance'] (because there was an UNK word?). Then, because the target word is at index 5, it gets the "exceeds dimensions" error below. Am I on the right track with that? Thanks! -Peter

    ... No labels, will just output predictions
    >> >> >> >> >> >> >> Getting external features...
    Index exceeds matrix dimensions.

    Error in processTags>getHypVecIndicies (line 264)
        h2 = allSHyp{s}{level,e2};

    Error in processTags>getHyp (line 190)
        allHypVecs = getHypVecIndicies(allSHyp,allIndicies,hyp2Ind);

    Error in processTags (line 7)
        allSHypVecs = getHyp(allSHyp,allIndicies,params);

    Error in loadExternalFeatures (line 12)
        [external.allHypVecs,external.allNERVecs,external.allPOSVecs] = ...

    Error in addExternalFeatures (line 3)
        [allHypVecs allNERVecs allPOSVecs] = loadExternalFeatures(params,type);

    Error in test_with_external_features (line 8)
        test_data = addExternalFeatures(test_data,params,type);
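
The failure mode in this trace, a 1-based entity position of 5 looked up in a tag list with only 4 entries, can be reproduced in miniature. The sketch below is a Python analogue rather than the MATLAB code from the download; the function names and the fallback tag "0" are assumptions made for illustration. It is a diagnostic aid only, since the fix given in the reply above is to rebuild the SST tagger and reset pathToSST.

```python
def entity_index_in_range(tags, entity_pos):
    """Return True if the 1-based position entity_pos has an entry in tags."""
    return 1 <= entity_pos <= len(tags)


def safe_tag(tags, entity_pos, default="0"):
    """Look up a tag by 1-based position, falling back to a default (assumed) tag."""
    if entity_index_in_range(tags, entity_pos):
        return tags[entity_pos - 1]   # convert the 1-based position to a 0-based index
    return default


# The four tags reported for the second README sentence.
tags = ["0", "B-noun.group", "B-verb.creation", "B-noun.substance"]
e2_pos = 5                                        # position of the second entity, as reported
in_range = entity_index_in_range(tags, e2_pos)    # False: the list is too short
fallback = safe_tag(tags, e2_pos)
second = safe_tag(tags, 2)
```

A check like this, run before the lookup, would show which sentence has fewer tags than words instead of stopping with "Index exceeds matrix dimensions".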