Abstract: On internet agricultural technology platforms, thousands of new questions await answers from experts every day. Manual services are often called into question because of their slow response times and uncertain quality. An intelligent response system based on an agricultural technology knowledge base can help answer some questions automatically. To build the knowledge base, it is necessary to recognize triples of “crop-disease-pesticide” named entities from the large volume of existing question-and-answer data. However, few studies have been reported on recognition methods for named entities of diseases and pesticides in Chinese, and the accuracies of existing methods for named entities of crops are low. Thus, a recognition method based on conditional random fields (CRF) was proposed, which recognized crop, disease, and pesticide named entities from agricultural technology question-and-answer data. In the method, question and answer texts were formatted and split into corpus pieces. Each corpus piece was automatically annotated with several features, including whether it contained characteristic Chinese characters or characteristic radicals, whether it was a numeral, whether it was the left or right boundary of a compound word, and its part of speech. A CRF model was trained on these annotated texts to classify corpus pieces, that is, to judge whether each piece was part of a crop, disease, or pesticide named entity and to recognize its position within the named entity. With the trained model, the three types of named entities could be accurately recognized and the triples could be associated automatically. Recognition accuracy and model training time were optimized in experiments by choosing combinations of input features and adjusting the size of the context window. The accuracies of this method in recognizing crops, diseases, and pesticides were 97.72%, 87.63% and 98.05% respectively, which were significantly higher than those of existing methods.
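The feature annotation and position labelling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cue characters, the radical lookup table, the L/M/R/S boundary codes, the B-/I-/O position tags, and the function names `char_features`, `add_context` and `decode_bio` are all assumptions introduced for illustration, and CRF training itself is omitted (the resulting feature dictionaries would be passed to whichever CRF toolkit is in use).

```python
from typing import Dict, List, Tuple

# Hypothetical cue sets; the abstract does not list the actual characters or radicals.
CUE_CHARS = {"病", "虫", "灵", "剂"}
CUE_RADICALS = {"疒", "艹", "禾"}
# Tiny illustrative radical lookup (assumption); a real system needs a full dictionary.
RADICAL_OF = {"病": "疒", "疫": "疒", "茄": "艹", "菜": "艹", "稻": "禾"}
CN_NUMERALS = set("一二三四五六七八九十百千")


def char_features(words: List[Tuple[str, str]]) -> List[Dict[str, object]]:
    """Annotate each character of pre-segmented (word, part-of-speech) pairs."""
    rows = []
    for word, pos in words:
        for k, ch in enumerate(word):
            if len(word) == 1:
                bound = "S"          # single-character word
            elif k == 0:
                bound = "L"          # left boundary of a compound word
            elif k == len(word) - 1:
                bound = "R"          # right boundary of a compound word
            else:
                bound = "M"          # inside a compound word
            rows.append({
                "char": ch,
                "cue_char": ch in CUE_CHARS,
                "cue_radical": RADICAL_OF.get(ch) in CUE_RADICALS,
                "numeral": ch.isdigit() or ch in CN_NUMERALS,
                "bound": bound,
                "pos": pos,
            })
    return rows


def add_context(rows: List[Dict[str, object]], window: int = 1) -> List[Dict[str, object]]:
    """Copy the features of neighbours within +/- window positions into each row."""
    out = []
    for i in range(len(rows)):
        feats = {}
        for off in range(-window, window + 1):
            j = i + off
            if 0 <= j < len(rows):
                for name, value in rows[j].items():
                    feats[f"{off:+d}:{name}"] = value
        out.append(feats)
    return out


def decode_bio(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Turn per-character B-/I-/O tags (an assumed scheme) into (entity, type) pairs."""
    entities, current, kind = [], "", None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((current, kind))
            current, kind = ch, tag[2:]
        elif tag.startswith("I-") and kind == tag[2:]:
            current += ch
        else:
            if current:
                entities.append((current, kind))
            current, kind = "", None
    if current:
        entities.append((current, kind))
    return entities


if __name__ == "__main__":
    words = [("番茄", "n"), ("晚疫病", "n")]
    features = add_context(char_features(words), window=1)
    print(features[1])
    print(decode_bio(list("番茄晚疫病"), ["B-CROP", "I-CROP", "B-DIS", "I-DIS", "I-DIS"]))
```

In this sketch, enlarging `window` increases the number of features per character, which mirrors the trade-off between recognition accuracy and training time that the experiments tuned.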