Convolutional-Recursive Deep Learning for 3D Object Classification

Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng

Recent advances in 3D sensing technologies make it possible to easily record color and depth images which together can improve object recognition. Most current methods rely on very well-designed features for this new 3D modality. We introduce a model based on a combination of convolutional and recursive neural networks (CNN and RNN) for learning features and classifying RGB-D images. The CNN layer learns low-level translationally invariant features which are then given as inputs to multiple, fixed-tree RNNs in order to compose higher order features. RNNs can be seen as combining convolution and pooling into one efficient, hierarchical operation. Our main result is that even RNNs with random weights compose powerful features. Our model obtains state of the art performance on a standard RGB-D object dataset while being more accurate and faster during training and testing than comparable architectures such as two-layer CNNs.

Download Paper


Example Object Images


Comments, Critique, Questions



Everal, 22 November 2014, 09:37

Hi Richard, I read your paper, and one question is bothering me. As you say, the convolution results in K filter responses, each of dimensionality dI - dP + 1. These are then pooled, and the pooled response has size r = (dI - dL)/s + 1. Why is it not r = (dI - dP + 1 - dL)/s + 1? I hope you can answer. Thank you.
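The arithmetic behind this question can be checked with a short sketch. It assumes a stride-1 convolution without padding followed by unpadded pooling; the function names and the numeric values are illustrative assumptions, not taken from the released code.

```python
def conv_output_size(d_image: int, d_patch: int) -> int:
    """Side length of a filter response for an unpadded, stride-1 convolution."""
    return d_image - d_patch + 1


def pooled_size(d_in: int, d_pool: int, stride: int) -> int:
    """Side length after pooling unpadded square windows of side d_pool."""
    return (d_in - d_pool) // stride + 1


# Illustrative values only (assumed for this example).
d_I, d_P, d_L, s = 148, 9, 10, 5

conv = conv_output_size(d_I, d_P)   # pooling operates on this, not on d_I
r = pooled_size(conv, d_L, s)       # i.e. (d_I - d_P + 1 - d_L) / s + 1
print(conv, r)
```

Under these assumptions the pooling input is the filter response, so the pooled size depends on dP as well as on dI, dL, and s.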

Sven, 07 November 2014, 17:53

Hi Richard, I think that there is a minor bug in your pretraining code. In pretrain.m from line 104 on you sample image patches for the pretraining step. The variable "patchesToKeep" includes the indices to those image patches that are sufficiently covered by the mask. But then you use the variable "keepInds" to randomly draw indices. These keepInds do _not_ point to the image, but they are indices to the patchesToKeep indices! In line 112 the code uses keepInds directly as indices to image patches anyway.

This should be fixed if you replace the three lines of code in the "subsample all possible patches" block with the following two:

    numToTake = min(round(length(patchesToKeep)*0.005), numWant-numHave);
    keepInds = patchesToKeep(randperm(length(patchesToKeep), numToTake));
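The index-of-index pitfall described in this comment can be shown in isolation. The following is a minimal Python analogue, not the project's MATLAB code; the function name, the sample indices, and the sampling fraction are hypothetical and chosen only for illustration.

```python
import random


def sample_patch_indices(patches_to_keep, fraction, num_want, num_have, rng):
    """Draw a random subset of the valid patch indices themselves."""
    num_to_take = min(round(len(patches_to_keep) * fraction), num_want - num_have)
    num_to_take = max(num_to_take, 0)  # nothing to draw once the quota is met
    return rng.sample(patches_to_keep, num_to_take)


rng = random.Random(0)
patches_to_keep = [10, 42, 57, 80, 93, 120]  # made-up indices of valid patches

# Pitfall: these are positions into patches_to_keep (0..5), not patch indices.
positions = rng.sample(range(len(patches_to_keep)), 3)

# Intended behavior: every drawn value is itself a valid patch index.
picked = sample_patch_indices(patches_to_keep, 0.5, 100, 0, rng)
print(positions, picked)
```

Using `positions` directly to address image patches would select patches 0 to 5 regardless of the mask, which is the behavior the comment reports.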

Salem, 26 August 2014, 23:26

Hi Richard,

I am not from the computer science department, but I would like to use this code and paper in my PhD work. Could you please let me know if you have any manual or post explaining how to run the code from the beginning?



RichardSocher, 24 July 2014, 09:18

You can set the minibatches to the size that works for your memory. We ran our experiments on machines with very large memory (more than 64 GB).

@Anran You can use the automatically obtained segmentation to improve results but at test time it works with and without the segmentation.

20 June 2014, 15:29

Hi Richard, I'm currently trying to run your code on the RGB-D object dataset. My computer is able to complete the run in debug mode, but whenever I turn off debug mode, Matlab runs out of memory (during forwardCNN). I even tried it on a cluster node with 32 GB of RAM, but it still crashes. Could you please tell me what kind of computer you ran your evaluation on, and how much memory is needed? Thanks!

Wang Anran, 11 May 2014, 09:02

Hi Richard, Did you use segmented images or raw images?

Andy, 20 February 2014, 04:29

Hi Richard, problem solved! I changed params.dataFolder = '../data/'; to params.dataFolder = './data/'; and it seems to work. Sorry to bother you. Thanks!

Andy, 20 February 2014, 02:38

Hi Richard, thanks for your reply. Yes, I extracted the whole 'rgbd-dataset' folder into the data folder, but it doesn't work. Should I change anything in the program?

RichardSocher, 19 February 2014, 17:47

Hi Andy,

Do you have images in your data folder?

Andy, 17 February 2014, 05:18

I failed to run your program:

>> runCRNN

            debug: 1
       dataFolder: '../data/'
            split: 2
       numFilters: 128
           numRNN: 64
            depth: 0
    extraFeatures: 1

Forward propagating RGB data
Pretraining CNN Filters...
Index exceeds matrix dimensions.

Error in pretrain>getPatches (line 65)
    if isValid(categories(mod(count-1,length(categories))+1).name)

Error in pretrain (line 6)
    patches = getPatches(params);

Error in runCRNN>forwardProp (line 39)
    [filters params] = pretrain(params);

Error in runCRNN (line 11)
    [rgbTrain rgbTest] = forwardProp(params);

Would you please tell me how to fix it? Thanks!