Today in the CFAR seminar, Dr. Yi Li and his student Fang Wang from the Australian National University presented their work on Sketch-based 3D Shape Retrieval using Convolutional Neural Networks. Here is the paper on arXiv. Here is my summary:

Yet-another-CNN-application

IMG_6691

  • The quest for superior performance
    • What if I don’t have 100 GPUs like Google / Facebook?
    • What if I don’t want to wait for a month?
    • What’s the insight of the performance gain?
  • The quest for original ideas
    • Beyond simply applying the network
    • Beyond standard network structure
    • How to tailor the CNN for specific applications
    • A new perspective of classical CV problems

DeepVision

They organized the Deep Learning Workshop (DeepVision) at CVPR 2014 and CVPR 2015.

Motivation

IMG_6702

  • Defining hand-crafted feature representations
  • Object tracking

Demo

IMG_6709-ANIMATION

Rotation Aware Shape Recognition

IMG_6796

 

IMG_6800

Sketch-based 3D Shape Retrieval Using Convolutional Neural Networks

Presented by Fang Wang

Abstract

Retrieving 3D models from 2D human sketches has received considerable attention in the areas of graphics, image retrieval, and computer vision. Almost always in state of the art approaches a large amount of “best views” are computed for 3D models, with the hope that the query sketch matches one of these 2D projections of 3D models using predefined features. We argue that this two stage approach (view selection – matching) is pragmatic but also problematic because the “best views” are subjective and ambiguous, which makes the matching inputs obscure. This imprecise nature of matching further makes it challenging to choose features manually. Instead of relying on the elusive concept of “best views” and the hand-crafted features, we propose to define our views using a minimalism approach and learn features for both sketches and views. Specifically, we drastically reduce the number of views to only two predefined directions for the whole dataset. Then, we learn two Siamese Convolutional Neural Networks (CNNs), one for the views and one for the sketches. The loss function is defined on the within-domain as well as the cross-domain similarities. Our experiments on three benchmark datasets demonstrate that our method is significantly better than state of the art approaches, and outperforms them in all conventional metrics.
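To make the phrase "within-domain as well as cross-domain similarities" concrete, here is a minimal sketch of how such a loss could be assembled from three pairwise terms. It assumes the classic margin-based contrastive loss for each pair, which may differ from the exact form used in the paper. The names `pair_loss`, `total_loss`, and `margin` are my own, and the embeddings are plain Python lists standing in for CNN outputs.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pair_loss(a, b, same_class, margin=1.0):
    """Margin-based contrastive loss for one pair (assumed form, not
    necessarily the paper's): pull same-class pairs together and push
    different-class pairs at least `margin` apart."""
    d = euclidean(a, b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def total_loss(s1, s2, v1, v2, same_class, margin=1.0):
    """Sum of two within-domain terms and one cross-domain term.
    s1, s2: embeddings of two sketches; v1, v2: embeddings of two views.
    Item 1 (s1, v1) and item 2 (s2, v2) either share a class or do not."""
    within_sketch = pair_loss(s1, s2, same_class, margin)  # sketch vs. sketch
    within_view = pair_loss(v1, v2, same_class, margin)    # view vs. view
    cross_domain = pair_loss(s1, v1, same_class, margin)   # sketch vs. view
    return within_sketch + within_view + cross_domain
```

With this form, same-class embeddings that coincide incur zero loss, while different-class embeddings that coincide incur the maximum penalty of `0.5 * margin ** 2` for each term.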

IMG_6799 IMG_6820

Siamese Neural Network

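A Siamese network passes two inputs through branches that share the same weights and compares the resulting embeddings. The architecture is widely used in face and handwritten-digit recognition.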
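The weight sharing that defines a Siamese network can be illustrated with a toy example. This is a sketch under strong assumptions: the hypothetical `embed` function is a single tanh layer standing in for a full CNN branch, and `weights` is a small hand-written matrix rather than anything learned.

```python
import math

def embed(x, weights):
    """Hypothetical one-layer embedding, tanh(W x), standing in for a CNN branch."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def siamese_distance(x1, x2, weights):
    """Embed both inputs with the SAME weights, then compare the embeddings."""
    e1 = embed(x1, weights)
    e2 = embed(x2, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))

# Toy usage: two 2-D inputs and an arbitrary 2x2 weight matrix.
W = [[0.5, -0.2], [0.1, 0.3]]
d = siamese_distance([1.0, 0.0], [0.0, 1.0], W)
```

Because both inputs go through the same function with the same parameters, identical inputs always map to distance zero, and training only has to shape one set of weights per domain.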

IMG_6819 IMG_6823

Live Demo

IMG_6832-ANIMATION

Check out their demo here: http://users.cecs.anu.edu.au/~yili/cnnsbsr/