An RGB-D Vision-Guided Robotic Depalletizing System for Irregular Camshafts with Transformer-Based Instance Segmentation and Flexible Magnetic Gripper

Accurate segmentation of densely stacked and weakly textured objects remains a core challenge in robotic depalletizing for industrial applications. To address this, we propose MaskNet, an instance segmentation network tailored for RGB-D input, designed to enhance recognition performance under occlus...


Bibliographic Details
Published in: Actuators
Main Authors: Runxi Wu, Ping Yang
Format: Article
Language: English
Published: MDPI AG 2025-07-01
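Subjects: depalletizing system; robot grasping; instance segmentation; RGB-D sensing; flexible magnetic gripper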
Online Access: https://www.mdpi.com/2076-0825/14/8/370
author Runxi Wu
Ping Yang
collection DOAJ
container_title Actuators
description Accurate segmentation of densely stacked and weakly textured objects remains a core challenge in robotic depalletizing for industrial applications. To address this, we propose MaskNet, an instance segmentation network tailored for RGB-D input, designed to enhance recognition performance under occlusion and low-texture conditions. Built upon a Vision Transformer backbone, MaskNet adopts a dual-branch architecture for RGB and depth modalities and integrates multi-modal features using an attention-based fusion module. Further, spatial and channel attention mechanisms are employed to refine feature representation and improve instance-level discrimination. The segmentation outputs are used in conjunction with regional depth to optimize the grasping sequence. Experimental evaluations on camshaft depalletizing tasks demonstrate that MaskNet achieves a precision of 0.980, a recall of 0.971, and an F1-score of 0.975, outperforming a YOLO11-based baseline. In an actual scenario, with a self-designed flexible magnetic gripper, the system maintains a maximum grasping error of 9.85 mm and a 98% task success rate across multiple camshaft types. These results validate the effectiveness of MaskNet in enabling fine-grained perception for robotic manipulation in cluttered, real-world scenarios.
format Article
id doaj-art-aa868c8d604f4e5582a3a758dd15d963
institution Directory of Open Access Journals
issn 2076-0825
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
spelling doaj-art-aa868c8d604f4e5582a3a758dd15d963
2025-08-27T13:58:53Z
eng
MDPI AG
Actuators
2076-0825
2025-07-01
Volume 14, Issue 8, Article 370
DOI: 10.3390/act14080370
Runxi Wu (School of Aerospace Engineering, Xiamen University, Xiamen 361102, China)
Ping Yang (School of Aerospace Engineering, Xiamen University, Xiamen 361102, China)
title An RGB-D Vision-Guided Robotic Depalletizing System for Irregular Camshafts with Transformer-Based Instance Segmentation and Flexible Magnetic Gripper
topic depalletizing system
robot grasping
instance segmentation
RGB-D sensing
flexible magnetic gripper
url https://www.mdpi.com/2076-0825/14/8/370