Incorporating side-channel information into convolutional neural networks for robotic tasks


Conference Paper

© 2017 IEEE. Convolutional neural networks (CNNs) are a deep learning technique that has achieved state-of-the-art prediction performance in computer vision and robotics, but they assume that the input data can be formatted as an image or video (e.g., predicting a robot grasping location given RGB-D image input). This paper considers the problem of augmenting a traditional CNN, which handles image-like input (called main-channel input), with additional, highly predictive, non-image-like input (called side-channel input). An example of such a task is predicting whether a robot path is collision-free given an occupancy grid of the environment and the path's start and goal configurations; the occupancy grid is the main-channel input, and the start and goal are the side-channel input. This paper presents several candidate network architectures for doing so. Empirical tests on robot collision prediction and control problems compare the proposed architectures in terms of learning speed, memory usage, learning capacity, and susceptibility to overfitting.
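The abstract does not say which architectures the paper proposes, so the following is only a minimal sketch of one generic way to combine the two input types: convolve the occupancy grid, flatten the resulting features, and concatenate the side-channel vector before a fully connected layer. Every name here (`conv2d_valid`, `init_params`, `predict_collision_free`), the layer sizes, and the 2D start/goal encoding are assumptions made for illustration and are not taken from the paper. The weights are random and untrained, so the output is a probability only in form.

```python
import numpy as np

def conv2d_valid(x, kernels):
    """Naive 'valid' cross-correlation. x: (H, W); kernels: (K, kh, kw)."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

def init_params(grid_shape, side_dim, n_kernels=4, ksize=3, hidden=16, seed=0):
    """Random, untrained weights (hypothetical sizes chosen for the sketch)."""
    rng = np.random.default_rng(seed)
    H, W = grid_shape
    n_feats = n_kernels * (H - ksize + 1) * (W - ksize + 1)
    return {
        "kernels": rng.normal(0.0, 0.1, (n_kernels, ksize, ksize)),
        # The first dense layer sees image features AND the side-channel vector.
        "W1": rng.normal(0.0, 0.1, (hidden, n_feats + side_dim)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, hidden),
        "b2": 0.0,
    }

def predict_collision_free(grid, side, params):
    """Main channel: occupancy grid. Side channel: start/goal vector."""
    feats = np.maximum(conv2d_valid(grid, params["kernels"]), 0.0)  # conv + ReLU
    fused = np.concatenate([feats.ravel(), side])  # fuse after the conv stage
    hidden = np.maximum(params["W1"] @ fused + params["b1"], 0.0)
    logit = params["W2"] @ hidden + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid output in (0, 1)

# Toy inputs: an 8x8 grid with one obstacle, and start/goal as (x, y) pairs.
grid = np.zeros((8, 8))
grid[3:5, 3:5] = 1.0
side = np.array([0.1, 0.1, 0.9, 0.9])
params = init_params(grid.shape, side.size)
p = predict_collision_free(grid, side, params)
```

Fusing after the convolutional stage keeps the convolution purely image-based. Other fusion points are possible, for example rendering the start and goal as extra image channels, and the trade-offs listed in the abstract (learning speed, memory usage, capacity, overfitting) are the kind of criteria on which such choices would be compared.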

Cited Authors

  • Zhou, Y; Hauser, K

Published Date

  • July 21, 2017

Start / End Page

  • 2177 - 2183

International Standard Serial Number (ISSN)

  • 1050-4729

International Standard Book Number 13 (ISBN-13)

  • 9781509046331

Digital Object Identifier (DOI)

  • 10.1109/ICRA.2017.7989251

Citation Source

  • Scopus