Abstract: Camellia oleifera fruits are small, densely distributed, and variable in color. To identify them rapidly and accurately in complex natural scenes, and to determine a suitable clamping position for an automatic oscillating harvesting device from the density distribution of the fruit, a detection method for Camellia oleifera fruit images in natural scenes was studied using the YOLO v5s convolutional neural network. With data augmentation, a total of 3296 Camellia oleifera fruit images were obtained and compiled into a data set in PASCAL VOC format. The optimal weight model was obtained after 150 epochs of training. Its precision was 90.73%, its recall was 98.38%, its comprehensive evaluation index was 94.4%, its average precision was 98.71%, its detection time for a single image was 12.7 ms, and its model size was 14.08 MB. Compared with the mainstream one-stage detection algorithms YOLO v4-tiny and RetinaNet, precision was higher by 1.99 and 4.50 percentage points, recall was higher by 9.41 and 10.77 percentage points, and detection time was shorter by 96.39% and 96.25%, respectively. The weight file of the YOLO v5s model was also small, indicating a simpler network that can be deployed quickly; in future work it could be ported to edge devices, providing an algorithmic reference for the vision system of an automatic Camellia oleifera fruit harvesting device. Comparative experiments further showed that the model recognized and located fruits with high precision under dense, occluded, dim, and blurred conditions, demonstrating strong robustness. These results can serve as a reference for research on mechanical harvesting of Camellia oleifera fruit in complex natural environments.
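The comprehensive evaluation index reported in the abstract is consistent with the F1 score, the harmonic mean of precision and recall. The abstract does not state the formula, so this is an assumption. The minimal sketch below recomputes the value from the reported precision and recall; the function name `f1_score` is hypothetical and used only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the YOLO v5s model.
precision = 0.9073
recall = 0.9838

# Assumption: the "comprehensive evaluation index" is the F1 score.
f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 94.40%"
```

Under that assumption the recomputed value agrees with the reported 94.4%, which suggests the three figures are internally consistent.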