Spectral Networks and Locally Connected Networks on Graphs
Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun
24 Dec 2013 arXiv 8 Comments
Conference Track
Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain. In this paper we consider possible generalizations of CNNs to signals defined on more general domains without the action of a translation group. In particular, we propose two constructions, one based upon a hierarchical clustering of the domain, and another based on the spectrum of the graph Laplacian. We show through experiments that for low-dimensional graphs it is possible to learn convolutional layers with $O(1)$ parameters, resulting in efficient deep architectures.
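As a rough illustration of the spectral construction the abstract mentions (this is not the authors' code; the function name, the toy path graph, and the random weights below are invented purely for demonstration), a "spectral convolution" replaces translation-based filtering with pointwise multiplication in the eigenbasis of the graph Laplacian:

```python
import numpy as np

def spectral_conv(x, V, theta):
    """Filter a graph signal x through diagonal spectral weights theta.

    x     : (n,) signal, one value per graph vertex
    V     : (n, k) first k eigenvectors of the graph Laplacian
    theta : (k,) learned filter coefficients, one per retained frequency
    """
    x_hat = V.T @ x          # graph Fourier transform: project onto the eigenbasis
    y_hat = theta * x_hat    # pointwise multiplication plays the role of convolution
    return V @ y_hat         # transform back to the vertex domain

# Toy usage on a path graph with 5 vertices
n = 5
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)  # adjacency matrix
L = np.diag(A.sum(axis=1)) - A                                # combinatorial Laplacian
_, V = np.linalg.eigh(L)
x = np.random.randn(n)
theta = np.random.randn(n)   # here k = n; the paper keeps only a subset of frequencies
y = spectral_conv(x, V, theta)
```

Roughly speaking, the O(1) parameter count comes from parametrizing theta with a fixed number of smooth coefficients (e.g., spline knots) rather than one free weight per eigenvector.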
Submission and review timeline: Joan Bruna requested endorsement for oral presentation (Conference Track) and revealed the document "Spectral Networks and Locally Connected Networks on Graphs" on 24 Dec 2013. Review requests were sent to Anonymous 9d8d, Anonymous ff10, and Anonymous 9f60 on 14 Jan 2014 (due 04 Feb 2014); all three were completed.

8 Comments

Anonymous ff10 07 Feb 2014
This paper investigates the construction of convolutional(-like) neural networks (CNNs) on data with a graph structure that is not a simple grid like 2D images. Two types of constructions are proposed to generalize CNNs: one in the spectral domain (of the graph Laplacian) and one in the spatial domain (based on a multi-scale clustering of the graph). Experiments on variants of the MNIST dataset show that such constructions can lead to networks much smaller than fully connected networks, with similar or better generalization abilities.

Of the 4 papers I reviewed, this is definitely the one I spent the most time on, but also the one I understand the least. It is a very dense paper, with lots of interesting ideas and observations, but without detailed enough explanations in my opinion, which makes it pretty difficult to follow. I should mention, though, that my knowledge of CNNs is limited to having seen figures of LeNetX architectures a few times, and hopefully people more familiar with CNNs will be able to better grasp the ideas presented here.

My first suggestion would be to start with the spatial construction (2.2) rather than the spectral one, as it is probably easier to visualize. And speaking of visualization, a picture showing the neighborhoods and how the various scales are used would be very helpful (I believe I understand what is being done here, but to be honest I think I need a picture to be sure). On the spectral construction, if there is a way to put eq. 2.2/2.3 in pictures, it would be great as well. Something unclear about these equations is that we seem to keep only a given number of components at each step, but the definition of y_k does not show that.

The main point I failed to understand here is what it means for the group structure to "interact correctly with the Laplacian". Unfortunately the example in 2.1.1 is not clear enough for me: instead of saying it recovers a "standard convolutional net", could you describe the exact net structure, in particular in terms of weight sharing, pooling and subsampling layers? For 2.1.2 I do not really have any specific comment/question -- I was quite lost at this point -- except that you could say what a cubic spline kernel is and why it makes sense to use it.

For the experiments, please first say in the intro of section 4 that the full description of the projected dataset will be given in 4.2. I read the intro of section 4 several times, trying to understand what it meant (it did not help that I understood "the 2d unit sphere" as "the unit sphere in 2d", i.e. a circle)... before finally giving up. (I think I got the idea now after reading 4.2 and looking at the pictures, but the description is still confusing: what are e1, e2 and e3, and what is the motivation for the choice of their norms?)

Some other comments on the experiments:
- I find the color maps hard to read. Would they look better in grey scale? (Fig. 4 (a)(b): how are we supposed to see it is the same feature?)
- Codenames for models in the results tables do not seem to be documented.
- In Fig. 2, is (a) really the finest? It looks like the coarsest.

Overall, I do believe this is a paper worth publishing. Taking advantage of the inner (unknown) structure of the input variables is definitely a direction that could bring substantial improvements, the early experiments presented here are encouraging, and the paper brings some new ideas to the table. I hope, however, that the authors can increase the readability of the paper by adding more figures and explanations for those less familiar with CNNs.
A few more small remarks:
- The O(1) used in the intro could be a bit misleading; I think it is more O(kd), with k the local neighborhood size and d the number of layers (or O(d log d) if k decreases exponentially with d).
- "just an in the case of the grid": typo
- In 2.1.1 I am wondering how important the assumption of equal variance is, and whether you could use the correlation instead.
- "Suppose have a real valued nonlinearity": typo
- "by a dropping a set number of coefficients": typo
- "The upshot is that the the construction": typo
- "navie choice": typo
- w_{k-1} under eq. 2.4 should be W_{k-1}?
- "gauranteed": typo
- "the property that the subsampling the Fourier functions on the grid to a coarser grid": typo?
- Figure references in section 4 are messed up (all are Fig. 4.1).
- You could mention "Learning the 2D topology of images" in the related work section.
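For readers who, like this reviewer, want a concrete picture of the spatial construction, here is a minimal sketch of one plausible reading of Section 2.2; the function names and the k-nearest-neighbor rule are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def neighborhoods(W, k):
    """Indices of the k most similar vertices to each vertex, under a
    symmetric similarity matrix W (larger W[i, j] means closer)."""
    return np.argsort(-W, axis=1)[:, :k]

def locally_connected(x, nbrs, weights):
    """One locally connected layer: output unit i sees only x[nbrs[i]].
    Unlike a grid convolution, the weights are NOT shared across units."""
    return np.array([weights[i] @ x[nbrs[i]] for i in range(len(nbrs))])

# Toy usage: 10 vertices, neighborhoods of size 3
rng = np.random.default_rng(0)
W = rng.random((10, 10)); W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)                 # a vertex is not its own neighbor
nbrs = neighborhoods(W, k=3)
weights = rng.standard_normal((10, 3))   # O(k) parameters per output unit
y = locally_connected(rng.standard_normal(10), nbrs, weights)
```

Pooling and subsampling would then correspond to merging vertices into clusters at the next scale, which is where the multi-scale clustering the review mentions enters.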
Joan Bruna 18 Feb 2014
We thank all the reviewers for their insightful and relevant comments. We are preparing a new version of the document where we will address the questions you raised. We will keep you informed.
Joan Bruna 20 Feb 2014
To all reviewers: We just uploaded a new version of the paper on arXiv, which should be accessible in a few hours. This version has been almost entirely rewritten to address the concerns about accessibility for an audience not familiar with harmonic analysis, taking into account the feedback from the reviewers. We have also included figures illustrating the construction and fixed the typos. In particular, I would like to thank Anonymous ff10 for her/his extensive and valuable comments, which greatly helped to increase the quality of the paper.
Joan Bruna 25 Feb 2014
The new version of the paper is now available at http://arxiv.org/pdf/1312.6203v2.pdf

Joan
Olivier Delalleau 26 Feb 2014
Thanks for the update. I will definitely check it out, but I probably won't be able to do it before the end of the "official" discussion period (which I guess ends this week... actually I'm not sure).
Anonymous 9f60 08 Feb 2014
> A brief summary of the paper's contributions, in the context of prior work.

Exploiting the grid structure of different types of data (e.g., images) with convolutional neural networks has been essential to the recent breakthrough results in various pattern recognition tasks. This paper explores generalizing convolutional neural networks from grids to weighted graphs.

The reviewer finds weighted graph inputs best motivated by two datasets constructed at the end of the paper in order to test the proposed techniques. Both datasets are MNIST derivatives. The first subsamples the MNIST pixels in a disorganized manner, destroying the grid structure. The second projects MNIST onto a sphere, giving the input a more complicated manifold structure. Both of these are interpreted as weighted graphs in the natural manner. (A couple of real-world examples of such structures occur to the reviewer: geo-spatial data and surfaces in 3D graphics.)

If we are persuaded that weighted graphs are an interesting type of input for a neural net, how can we generalize convolutional neural networks to them? The paper introduces two broad approaches. The first is to use a metric on the graph to define neighborhoods and build a locally-connected network. The second, "spectral," approach is a bit more complicated. It can be understood by analogy with the Fourier view of convolutional neural networks: a regular convolution can be thought of as pointwise multiplication in the Fourier domain. Drawing on the harmonic analysis of graphs, the paper uses the eigenvectors of the Laplacian, which have similar properties. Functions on the graph can be decomposed into coefficients of these eigenvectors and pointwise multiplied to achieve a convolution-like effect.

As mentioned previously, the authors test their techniques on two constructed datasets. For the subsampled MNIST, they are able to beat a fully-connected network with a locally-connected one, but only tie it with the spectral approach (though the spectral approach uses almost two orders of magnitude fewer parameters). For MNIST on a sphere, both approaches achieve slightly worse results than the fully-connected network (but, again, use fewer parameters).

> An assessment of novelty and quality.

The reviewer is not familiar with this area but believes this work to be novel. The ideas in this paper seem quite deep, and the experiments performed are interesting. In fact, the constructed datasets alone are interesting. The exposition of the paper could be a bit stronger. This matters all the more because most people in the neural networks community will not have the mathematical background the paper presently requires. A little more motivation and hand-holding could make the paper more accessible. That said, this does not seem like something that should be a barrier to publication.

> A list of pros and cons (reasons to accept/reject)

Pros:
* Generalizing convolutional neural networks to graphs seems like a valuable enterprise.
* Explores some very intriguing ideas. In particular, the spectral generalization of convolutional networks feels quite deep.
* Constructs cute datasets to test the ideas on.

Cons:
* Paper could be more accessible (see above).
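The Fourier analogy invoked in this review can be checked numerically. The snippet below verifies the standard convolution theorem on a 1-D periodic grid (a textbook fact, not code from the paper); the spectral construction swaps the DFT basis for the eigenvectors of a graph Laplacian:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
h = rng.standard_normal(8)

# Circular convolution computed directly in the signal domain
direct = np.array([sum(h[j] * x[(i - j) % 8] for j in range(8)) for i in range(8)])

# The same result via pointwise multiplication in the Fourier domain
via_fft = np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)).real

assert np.allclose(direct, via_fft)
```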
Anonymous 9d8d 11 Feb 2014
Spectral networks

This paper aims at applying convolutional neural networks to data that do not fit into the standard convolutional framework. The authors do so by considering that the coordinates lie on a graph and using the Laplacian of that graph. The topic is of utmost interest, as CNNs consistently achieve very high performance while keeping the number of parameters in the network low. I am glad to see advances in that direction.

This excitement is moderated by the extreme difficulty with which I read the paper. In fact, most of the paper assumes advanced notions of harmonic analysis which I do not possess. I fully understand that such notions are necessary to fully apprehend this work, but I would have appreciated it if the authors had provided pointers or tried to give intuition about the concepts. As it is, only people familiar with the field will capture the full gist of the method. Additionally, I find the experimental section a bit weak, in great part because of the sole use of the ubiquitous MNIST dataset (albeit in distorted versions). That being said, I want this work to be disseminated so that CNNs can be used in wider contexts.

Pros:
- Great extension of CNNs.
- Results are based on a profound understanding of harmonic analysis and not just trial and error.

Cons:
- Extremely difficult to read for an audience not familiar with harmonic analysis.
- Experimental section a bit weak.
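As a hedged sketch of the setup this review summarizes (coordinates lying on a graph whose Laplacian the network uses), one can build a weighted graph from raw coordinates and take its Laplacian; the Gaussian kernel and all names below are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def gaussian_graph_laplacian(coords, sigma=1.0):
    """Weighted graph from coordinates: W[i, j] = exp(-||c_i - c_j||^2 / sigma^2);
    the unnormalized Laplacian is L = D - W, with D the diagonal degree matrix."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / sigma**2)
    np.fill_diagonal(W, 0.0)           # no self-loops
    return np.diag(W.sum(axis=1)) - W

# Toy usage: 6 points scattered in the plane
L = gaussian_graph_laplacian(np.random.randn(6, 2))
eigvals, eigvecs = np.linalg.eigh(L)   # the eigenvectors generalize Fourier modes
```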