Abstract: Phenotype-based classification plays an essential role in plant research. Chrysanthemum has considerable economic and medicinal value and exhibits wide morphological and genetic diversity. Because manual classification by experts is limited and the genetic diversity is high, phenotype-based classification of chrysanthemum remains challenging. Machine learning and artificial intelligence technologies are developing rapidly, and semi-supervised learning offers an effective way to improve classification performance. The method proposed here combines graph-based label propagation with active learning. It uses a small number of classified chrysanthemum samples together with a large number of unlabeled samples to improve classification accuracy, and it exploits the unlabeled samples automatically, without relying on external interaction. Chrysanthemum phenotypic data were collected and manually annotated with category information to train the learning model. Encoding schemes for the categorical attributes were also studied. Graph-based label propagation was used to infer labels for the unlabeled chrysanthemums. To further improve semi-supervised classification, an active learning technique based on an entropy maximization strategy was applied to select difficult-to-identify samples. Extensive experiments and comparisons were conducted. 
The experimental results showed that the unlabeled chrysanthemum samples improved classification accuracy remarkably: as the labeled ratio increased from 6.25% to 23%, recognition accuracy rapidly reached 0.7, and at a labeled ratio of 81.25% the average recognition accuracy and recall reached 0.91 and 0.88, respectively. In conclusion, semi-supervised learning for the intelligent identification and effective management of chrysanthemum flowers is of great theoretical and practical significance for the study of chrysanthemum phenotypes.
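
The combination of graph-based label propagation and entropy-based sample selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian affinity, the `gamma` value, the iteration count, the function names, and the one-dimensional toy features are all assumptions chosen for illustration, and the phenotypic attributes are assumed to be already encoded as numbers.

```python
import math

def affinity(x, y, gamma=1.0):
    # Gaussian (RBF) edge weight between two samples; an assumed graph kernel.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def label_propagation(features, labels, n_classes, gamma=1.0, n_iter=200):
    """Propagate labels over a fully connected graph.

    labels[i] is a class index for labeled samples and None for unlabeled ones.
    Returns one class-probability row per sample.
    """
    n = len(features)
    # Edge weights without self-loops, then row-normalized transition matrix.
    W = [[affinity(features[i], features[j], gamma) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    T = [[w / (sum(row) or 1.0) for w in row] for row in W]

    def clamp(y):
        # One-hot distribution for a known label.
        return [1.0 if c == y else 0.0 for c in range(n_classes)]

    # Labeled rows start one-hot, unlabeled rows start uniform.
    F = [clamp(y) if y is not None else [1.0 / n_classes] * n_classes
         for y in labels]
    for _ in range(n_iter):
        F = [clamp(labels[i]) if labels[i] is not None
             else [sum(T[i][j] * F[j][c] for j in range(n))
                   for c in range(n_classes)]
             for i in range(n)]
    return F

def entropy(p):
    # Shannon entropy of a probability row; higher means less certain.
    return -sum(q * math.log(q) for q in p if q > 0)

def most_uncertain(F, labels):
    # Active learning step: pick the unlabeled sample with maximum entropy.
    unlabeled = [i for i, y in enumerate(labels) if y is None]
    return max(unlabeled, key=lambda i: entropy(F[i]))

# Toy data (hypothetical): two labeled samples at the ends, five unlabeled.
features = [[0.0], [0.2], [0.4], [1.0], [1.6], [1.8], [2.0]]
labels = [0, None, None, None, None, None, 1]

F = label_propagation(features, labels, n_classes=2)
predicted = [max(range(2), key=lambda c: row[c]) for row in F]
query = most_uncertain(F, labels)  # the midpoint sample, index 3
```

In an active learning loop, the sample at `query` would be given its true label and propagation would be rerun, so that each new annotation is spent on the sample the current model finds hardest to identify.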