Multi‐modal deep network for RGB‐D segmentation of clothes

Boris Joukovsky, Pengpeng Hu, Adrian Munteanu

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

In this Letter, the authors propose a deep-learning-based method for semantic segmentation of clothes from RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments, for a total of nine semantic classes. The proposed data-generation pipeline allows for fast production of RGB images, depth images, and ground-truth label maps. Second, a novel multi-modal encoder–decoder convolutional network is proposed which operates on the RGB and depth modalities. Multi-modal features are merged using trained fusion modules that apply multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the efficiency of the proposed model over existing methods.
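To make the fusion idea concrete, below is a minimal PyTorch sketch of one plausible multi-scale atrous fusion block. The class name `AtrousFusion`, the dilation rates, and the channel layout are illustrative assumptions for this sketch, not the authors' published architecture.

```python
# Hypothetical sketch (not the paper's exact module): RGB and depth feature
# maps are concatenated, passed through parallel atrous (dilated) convolutions
# at several rates, and the multi-scale responses are merged by a 1x1 conv.
import torch
import torch.nn as nn

class AtrousFusion(nn.Module):
    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        # One 3x3 atrous convolution per dilation rate, operating on the
        # concatenated RGB + depth features (2 * channels input channels).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(2 * channels, channels, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # 1x1 projection merging the multi-scale branches back to `channels`.
        self.project = nn.Conv2d(len(rates) * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb_feat, depth_feat], dim=1)
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.project(multi_scale)

# Example: fuse 64-channel RGB and depth features at one encoder scale.
fusion = AtrousFusion(channels=64)
rgb = torch.randn(1, 64, 60, 80)
depth = torch.randn(1, 64, 60, 80)
fused = fusion(rgb, depth)  # -> torch.Size([1, 64, 60, 80])
```

The appeal of atrous convolutions here is that different dilation rates capture context at different spatial scales without reducing resolution, which suits a fusion step placed between encoder and decoder stages.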
Original language: English
Pages (from-to): 432-435
Number of pages: 4
Journal: IET Electronics Letters
Volume: 56
Issue number: 9
Early online date: 13 Feb 2020
DOIs
Publication status: Published - 1 Apr 2020
Externally published: Yes

