Abstract: An apple detection method (Light-YOLOv3) based on a lightweight YOLO (You Only Look Once) convolutional neural network is proposed to let apple picking robots detect apples quickly and accurately against the complex background of fruit trees. Firstly, to improve the traditional YOLOv3 deep convolutional neural network architecture, a feature extraction network containing cascaded homogeneous residual blocks was designed, and the dimensionality of the feature maps used for object detection was reduced. In this architecture, conventional convolution was replaced by depthwise separable convolution, and a multi-objective loss function was defined in terms of the mean square error loss and the cross-entropy loss. Secondly, the training images were collected from the Internet with a crawler program and then labelled. Data augmentation was used to expand the training data, and the data was normalized. Thirdly, a multistage learning optimization approach based on stochastic gradient descent (SGD) and adaptive moment estimation (Adam) was proposed to train the Light-YOLOv3 network. Finally, apple detection experiments in the complex background of fruit trees were performed on a computer workstation and on an embedded processor. The experimental results showed that the Light-YOLOv3 method improved detection speed and accuracy significantly: the detection speed reached 116.96 f/s on the computer workstation and 7.59 f/s on the embedded processor, the F1 value reached 94.57%, and the mean average precision (mAP) reached 94.69%.
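To illustrate why replacing conventional convolution with depthwise separable convolution lightens a network, the following minimal sketch compares the weight counts of the two layer types. The kernel size and channel counts are hypothetical values chosen for illustration only; they are not taken from the Light-YOLOv3 architecture, and bias terms are omitted.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, assumed for illustration (not from the paper)
k, c_in, c_out = 3, 256, 512

std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For a 3 x 3 kernel the ratio is dominated by the 1/k² term, so under these assumptions the separable layer needs roughly one ninth of the weights of the standard layer.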