| Summary: | Accurate detection of Jinxiu Malus fruits in unstructured orchard environments is hampered by frequent overlap, occlusion, and variable illumination. To address these challenges, we propose YOLOv8-MSP-PD (YOLOv8 with Multi-Scale Pyramid Fusion and Proportional Distance IoU), a lightweight detector built on an enhanced YOLOv8 architecture. First, we replace the backbone with MobileNetV4, whose universal inverted bottleneck (UIB) modules and depthwise separable convolutions provide efficient feature extraction. Second, we introduce a spatial pyramid pooling-fast module with cross-stage partial connections (SPPFCSPC) for multi-scale feature fusion, together with a modified proportional distance IoU (MPD-IoU) loss that improves bounding-box regression. Finally, we compress the model with layer-adaptive magnitude pruning (LAMP) and recover accuracy through knowledge distillation. On our custom Jinxiu Malus dataset, YOLOv8-MSP-PD achieves a mean average precision (mAP) of 92.2%, which is 1.6% higher than the baseline, while reducing floating-point operations (FLOPs) by 59.9% and shrinking the model size to 2.2 MB. Five-fold cross-validation confirms the stability of these results, and comparisons with Faster R-CNN and SSD show that the proposed model is both more accurate and more efficient. This work offers a practical vision solution for agricultural robots and guidance for designing lightweight detectors in precision agriculture.
|