Convolutional-Recursive Deep Learning for 3D Object Classification
Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng

Recent advances in 3D sensing technologies make it possible to easily record color and depth images which together can improve object recognition. Most current methods rely on very well-designed features for this new 3D modality. We introduce a model based on a combination of convolutional and recursive neural networks (CNN and RNN) for learning features and classifying RGB-D images. The CNN layer learns low-level translationally invariant features which are then given as inputs to multiple, fixed-tree RNNs in order to compose higher order features. RNNs can be seen as combining convolution and pooling into one efficient, hierarchical operation. Our main result is that even RNNs with random weights compose powerful features. Our model obtains state-of-the-art performance on a standard RGB-D object dataset while being more accurate and faster during training and testing than comparable architectures such as two-layer CNNs.
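To make the composition step concrete, here is a minimal NumPy sketch of one fixed-tree recursive network with random, untrained weights applied to a grid of CNN features. The block size of 3, the tanh nonlinearity, the weight range, and the function name `random_rnn_features` are assumptions made for illustration; the released code is MATLAB and may differ in detail.

```python
import numpy as np

def random_rnn_features(x, block=3, seed=0):
    """Merge non-overlapping block x block groups of K-dim vectors
    until a single K-dim vector remains.

    x: array of shape (K, r, r), where r is a power of `block`.
    """
    rng = np.random.default_rng(seed)
    k = x.shape[0]
    # One random weight matrix, reused for every merge in the tree (assumption).
    w = rng.uniform(-0.1, 0.1, size=(k, block * block * k))
    while x.shape[1] > 1:
        r = x.shape[1]
        if r % block != 0:
            raise ValueError("spatial size must be divisible by the block size")
        m = r // block
        # Split the r x r grid into m x m blocks and flatten each block's children.
        children = (x.reshape(k, m, block, m, block)
                     .transpose(1, 3, 0, 2, 4)
                     .reshape(m * m, k * block * block))
        # parent = tanh(W @ [child_1; ...; child_{block^2}]) for every block at once
        parents = np.tanh(children @ w.T)      # shape (m*m, k)
        x = parents.T.reshape(k, m, m)
    return x.reshape(k)

# Toy input: 8 feature maps on a 27 x 27 grid (27 -> 9 -> 3 -> 1).
cnn_maps = np.random.default_rng(1).standard_normal((8, 27, 27))
features = random_rnn_features(cnn_maps)
print(features.shape)
```

Running the same input through several such networks with different seeds and concatenating the outputs would be one way to mirror the "multiple RNNs" idea from the abstract.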
Download Paper
Code
- For full training and testing code of the final convolutional-recursive neural network, download CRNNCodeTrainTest.zip
Example Object Images
Bibtex
- Please cite the following paper when you use the data set or code:
@incollection{SocherEtAl2012:CRNN,
title = {{Convolutional-Recursive Deep Learning for 3D Object Classification}},
author = {Richard Socher and Brody Huval and Bharath Bhat and Christopher D. Manning and Andrew Y. Ng},
booktitle = {{Advances in Neural Information Processing Systems 25}},
year = {2012}
}
Comments, Critique, Questions
I failed in running your program:
>> runCRNN
debug: 1
dataFolder: '../data/'
split: 2
numFilters: 128
numRNN: 64
depth: 0
extraFeatures: 1
Forward propagating RGB data
Pretraining CNN Filters...
Index exceeds matrix dimensions.
Error in pretrain>getPatches (line 65)
if isValid(categories(mod(count-1,length(categories))+1).name)
Error in pretrain (line 6)
patches = getPatches(params);
Error in runCRNN>forwardProp (line 39)
[filters params] = pretrain(params);
Error in runCRNN (line 11)
[rgbTrain rgbTest] = forwardProp(params);
would you please tell me how to fix it? thx!
lj? — 28 July 2015, 11:14
Hi Richard, I'm currently trying to run your code on the RGB-D object dataset, but the result I got is ans = 100. Would you please tell me the reason? Thanks!!
Hi Richard, I read your paper, and one question bothers me. As you say, the convolution results in K filter responses, each of dimensionality dI - dP + 1. These then get pooled, and the pooled response has size r = (dI - dL)/s + 1. Why is it not
r = (dI - dP + 1 - dL)/s + 1? I hope you can answer. Thank you.
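The two formulas in this question can be compared numerically. The short Python sketch below does that; the helper names and the example sizes (dI = 148, dP = 9, dL = 10, s = 5) are assumptions chosen for illustration, not values confirmed anywhere in this thread.

```python
def conv_output_size(d_image, d_patch):
    """Width of a valid (no padding, stride 1) convolution response."""
    return d_image - d_patch + 1

def pool_output_size(d_input, d_pool, stride):
    """Number of pooling windows of width d_pool that fit at the given stride."""
    return (d_input - d_pool) / stride + 1

# Illustrative sizes only (assumption).
d_image, d_patch, d_pool, stride = 148, 9, 10, 5

response = conv_output_size(d_image, d_patch)                  # 140
r_from_image = pool_output_size(d_image, d_pool, stride)       # uses the image width
r_from_response = pool_output_size(response, d_pool, stride)   # uses the response width
print(response, r_from_image, r_from_response)
```

With these sizes, only the version that starts from the filter response width gives a whole number of pooling windows (27); the version that starts from the image width does not. That is consistent with pooling being applied to the filter responses rather than to the raw image.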
Sven? — 07 November 2014, 17:53
Hi Richard,
I think that there is a minor bug in your pretraining code.
In pretrain.m from line 104 on you sample image patches for the pretraining step. The variable "patchesToKeep" includes the indices to those image patches that are sufficiently covered by the mask. But then you use the variable "keepInds" to randomly draw indices. These keepInds do _not_ point to the image, but they are indices to the patchesToKeep indices! In line 112 the code uses keepInds directly as indices to image patches anyway.
This should be fixed if you replace the three lines of code in the "subsample all possible patches" block with the following two:
+ numToTake = min(round(length(patchesToKeep)*0.005), numWant-numHave);
+ keepInds = patchesToKeep(randperm(length(patchesToKeep), numToTake));
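The index-of-an-index mix-up described in this comment is easy to reproduce outside MATLAB. The Python sketch below uses made-up data; `patches_to_keep`, `num_to_take`, and the sample values are hypothetical and only mirror the variable names from the comment.

```python
import random

# Hypothetical indices of image patches that are sufficiently covered by the mask.
patches_to_keep = [12, 40, 41, 97, 130, 255]
num_to_take = 3

rng = random.Random(0)
# Random positions into patches_to_keep. These are NOT patch indices.
positions = rng.sample(range(len(patches_to_keep)), num_to_take)

# Buggy variant: treats positions as patch indices, so it can only ever
# pick patches 0..5, none of which passed the mask test here.
buggy_inds = positions

# Fixed variant: map the positions back through patches_to_keep,
# which is what the proposed two-line change does.
keep_inds = [patches_to_keep[p] for p in positions]

print(buggy_inds, keep_inds)
```

Every entry of `keep_inds` is a validated patch index, while `buggy_inds` stays in the range of list positions regardless of which patches were actually valid.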
Salem? — 26 August 2014, 23:26
Hi Richard,
I am not from a computer science department, but I would like to use this code and paper in my PhD work. Could you please let me know if you have a manual or post explaining how to run the code from the beginning?
Thanks
Salem
You can set the minibatches to the size that works for your memory. We ran our experiments on machines with very large memory (>64GB).
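As a rough illustration of the minibatch advice, the Python sketch below pushes data through a feature extractor in slices so that only one batch of intermediate results is held at a time. The names `forward_in_batches` and `toy_forward` and the toy data are hypothetical stand-ins, not part of the released MATLAB code.

```python
import numpy as np

def forward_in_batches(images, forward_fn, batch_size):
    """Apply forward_fn to consecutive slices of `images` and stack the results."""
    outputs = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        outputs.append(forward_fn(batch))
    return np.concatenate(outputs, axis=0)

def toy_forward(batch):
    """Stand-in for an expensive feature extractor (assumption): mean pixel value."""
    return batch.reshape(len(batch), -1).mean(axis=1, keepdims=True)

images = np.arange(10 * 4 * 4, dtype=float).reshape(10, 4, 4)
features = forward_in_batches(images, toy_forward, batch_size=3)
print(features.shape)
```

A smaller `batch_size` lowers peak memory at the cost of more loop iterations; the result is the same as processing everything at once.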
@Anran
You can use the automatically obtained segmentation to improve results, but at test time the model works with or without the segmentation.
Hi Richard,
I'm currently trying to run your code on the RGB-D object dataset. My computer is able to complete a run in debug mode, but whenever I turn off debug mode, Matlab runs out of memory (during forwardCNN). I even tried it on a cluster node with 32 GB of RAM, but it still crashes. Could you please tell me what kind of computer you ran your evaluation on, and how much memory is needed?
Thanks!
Hi Richard, did you use segmented images or raw images?
Andy? — 20 February 2014, 04:29
Hi Richard, problem solved!
I changed:
params.dataFolder = '../data/';
to:
params.dataFolder = './data/';
seems like it works.
sorry to bother you. thx!
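A quick sanity check before launching the pipeline can catch this kind of path problem early. The Python sketch below assumes one subfolder per category under the data folder, with image files somewhere beneath it; that layout, the `.png` extension, and the helper name `count_category_images` are assumptions, not details confirmed in this thread.

```python
import tempfile
from pathlib import Path

def count_category_images(data_folder, pattern="*.png"):
    """Return {category_name: number_of_images} for each subfolder of data_folder."""
    root = Path(data_folder)
    if not root.is_dir():
        raise FileNotFoundError(f"data folder not found: {root.resolve()}")
    return {d.name: sum(1 for _ in d.rglob(pattern))
            for d in sorted(root.iterdir()) if d.is_dir()}

# Demo on a throwaway directory tree with a hypothetical category and file name.
with tempfile.TemporaryDirectory() as tmp:
    instance_dir = Path(tmp) / "apple" / "apple_1"
    instance_dir.mkdir(parents=True)
    (instance_dir / "sample.png").touch()
    counts = count_category_images(tmp)
    print(counts)
```

An empty result or a `FileNotFoundError` would suggest that the relative path is being resolved from a different working directory than expected, which matches the fix of switching from `'../data/'` to `'./data/'`.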
Andy? — 20 February 2014, 02:38
Hi Richard, Thanks for your reply.
Yes. I extracted the whole 'rgbd-dataset' folder into the data folder, but it doesn't work. Should I change anything in the program?
Hi Andy,
Do you have images in your data folder?
Andy? — 17 February 2014, 05:18
I failed in running your program:
>> runCRNN
debug: 1
dataFolder: '../data/'
split: 2
numFilters: 128
numRNN: 64
depth: 0
extraFeatures: 1
Forward propagating RGB data
Pretraining CNN Filters...
Index exceeds matrix dimensions.
Error in pretrain>getPatches (line 65)
if isValid(categories(mod(count-1,length(categories))+1).name)
Error in pretrain (line 6)
patches = getPatches(params);
Error in runCRNN>forwardProp (line 39)
[filters params] = pretrain(params);
Error in runCRNN (line 11)
[rgbTrain rgbTest] = forwardProp(params);
would you please tell me how to fix it? thx!