TY - GEN
T1 - Hybrid Frame-Event Solution for Vision-Based Grasp and Pose Detection of Objects
AU - Wang, Kyra
AU - Yang, Sihan
AU - Kumar, Deepesh
AU - Thakor, Nitish
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/8
Y1 - 2020/8
N2 - A key challenge in object manipulation using prosthetic hands is grasp detection and pose estimation, especially in cluttered scenes. Vision-based robotic grasping solutions typically use only conventional frame-based video cameras with high spatiotemporal redundancy, which is unsuitable for mobile platforms like prostheses with low processing power. On the other hand, while event-based dynamic vision sensors (DVS) have low spatiotemporal redundancy, their low resolution results in poor object segmentation and detection performance. In this paper, we outline a novel hybrid solution inspired by the two-streams hypothesis of the neural processing of vision, utilizing both a frame-based video camera and a DVS to counter the pitfalls of both systems. By using computationally efficient object detection methods on the frame-based camera to highlight regions of interest (ROIs) for the DVS, we are able to perform pose estimation by computing the smallest axis of DVS events generated in the ROI. The proposed approach allows us to rapidly determine the required wrist rotation and a suitable grasp type to pick up objects using a prosthetic hand. Results on a laptop show that our method matches the accuracy of a conventional solution that employs only a frame-based video camera, while achieving 77.29% faster inference speed.
AB - A key challenge in object manipulation using prosthetic hands is grasp detection and pose estimation, especially in cluttered scenes. Vision-based robotic grasping solutions typically use only conventional frame-based video cameras with high spatiotemporal redundancy, which is unsuitable for mobile platforms like prostheses with low processing power. On the other hand, while event-based dynamic vision sensors (DVS) have low spatiotemporal redundancy, their low resolution results in poor object segmentation and detection performance. In this paper, we outline a novel hybrid solution inspired by the two-streams hypothesis of the neural processing of vision, utilizing both a frame-based video camera and a DVS to counter the pitfalls of both systems. By using computationally efficient object detection methods on the frame-based camera to highlight regions of interest (ROIs) for the DVS, we are able to perform pose estimation by computing the smallest axis of DVS events generated in the ROI. The proposed approach allows us to rapidly determine the required wrist rotation and a suitable grasp type to pick up objects using a prosthetic hand. Results on a laptop show that our method matches the accuracy of a conventional solution that employs only a frame-based video camera, while achieving 77.29% faster inference speed.
KW - Computer vision
KW - Grasping
KW - Neuromorphic engineering
KW - Pose estimation
KW - Prosthetic hand
UR - http://www.scopus.com/inward/record.url?scp=85094165396&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094165396&partnerID=8YFLogxK
U2 - 10.1109/CASE48305.2020.9216970
DO - 10.1109/CASE48305.2020.9216970
M3 - Conference contribution
AN - SCOPUS:85094165396
T3 - IEEE International Conference on Automation Science and Engineering
SP - 1383
EP - 1388
BT - 2020 IEEE 16th International Conference on Automation Science and Engineering, CASE 2020
PB - IEEE Computer Society
T2 - 16th IEEE International Conference on Automation Science and Engineering, CASE 2020
Y2 - 20 August 2020 through 21 August 2020
ER -