Recognition of Crops, Diseases and Pesticides Named Entities in Chinese Based on Conditional Random Fields

Fund Project:

National Natural Science Foundation of China (61502500), Beijing Natural Science Foundation (4164090), and the Fundamental Research Funds for the Central Universities (2017QC077)

    Abstract:

    Internet agricultural technology Q&A platforms currently rely entirely on human experts to answer questions, so responses are slow and their quality is hard to guarantee, while thousands of new questions arrive every day. An intelligent response system backed by an agricultural technology knowledge base could answer some of these questions automatically. Building that knowledge base requires extracting “crop-disease-pesticide” named entity triples from the large volume of existing question-and-answer data. However, few studies address the recognition of agricultural named entities in Chinese, particularly diseases and pesticides, and the reported accuracy for crop named entities is low. A recognition method based on conditional random fields (CRF) is therefore proposed for crop, disease, and pesticide named entities in agricultural technology Q&A data, designed around the characteristics of these three entity types. The question and answer texts are first formatted and automatically segmented into words. Each segmented token is then automatically annotated with several features: whether it contains specific characteristic words, whether it contains specific radicals, whether it is a numeral, whether it is a specific left or right boundary word, and its part of speech. A CRF model trained on the annotated data classifies each token, judging whether it belongs to a crop, disease, or pesticide named entity and identifying its position within a compound named entity. The three types of named entities are recognized in this way, and the associated triples can then be built automatically. Experiments on feature combinations and context window sizes improved recognition accuracy and reduced model training time. The recognition accuracies for crop, disease, and pesticide named entities were 97.72%, 87.63%, and 98.05%, respectively, significantly higher than those of existing methods.
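The feature annotation and position labelling described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the cue characters, the radical lookup, the boundary words, the `m` part-of-speech code for numerals, and the B-/I-/O label scheme are all assumptions chosen for illustration, since the abstract does not list them. The sketch builds per-token feature dictionaries over a context window (the kind of input a CRF toolkit typically consumes) and shows how position labels could be merged back into whole entities.

```python
from typing import Dict, List, Tuple

# Hypothetical cue lists; the abstract does not give the real ones.
CUE_CHARS = {"病", "灵", "素"}
CUE_RADICALS = {"艹", "虫"}
RADICAL_OF = {"茄": "艹", "菜": "艹", "蚜": "虫", "螨": "虫"}  # toy radical lookup
LEFT_BOUNDARY_WORDS = {"用", "喷"}
RIGHT_BOUNDARY_WORDS = {"防治", "怎么办"}


def token_features(word: str, pos: str) -> Dict[str, object]:
    """Features of a single segmented token, mirroring the abstract's list."""
    return {
        "word": word,
        "pos": pos,
        "has_cue_char": any(ch in CUE_CHARS for ch in word),
        "has_cue_radical": any(RADICAL_OF.get(ch) in CUE_RADICALS for ch in word),
        "is_numeral": pos == "m",  # assumed POS code for numerals
        "is_left_boundary": word in LEFT_BOUNDARY_WORDS,
        "is_right_boundary": word in RIGHT_BOUNDARY_WORDS,
    }


def windowed_features(tokens: List[str], pos_tags: List[str],
                      i: int, window: int = 1) -> Dict[str, object]:
    """Collect features of token i and its neighbours within +/- window."""
    feats: Dict[str, object] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            for name, value in token_features(tokens[j], pos_tags[j]).items():
                feats[f"{offset:+d}:{name}"] = value
        else:
            feats[f"{offset:+d}:pad"] = True  # position falls outside the sentence
    return feats


def decode_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge B-/I- position labels into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    etype = None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if words:
                entities.append(("".join(words), etype))
            words, etype = [tok], lab[2:]
        elif lab.startswith("I-") and etype == lab[2:]:
            words.append(tok)
        else:  # "O", or an I- tag that does not continue the open entity
            if words:
                entities.append(("".join(words), etype))
            words, etype = [], None
    if words:
        entities.append(("".join(words), etype))
    return entities


tokens = ["番茄", "晚", "疫病", "用", "甲霜灵", "防治"]
pos_tags = ["n", "a", "n", "p", "n", "v"]
labels = ["B-CROP", "B-DISEASE", "I-DISEASE", "O", "B-PESTICIDE", "O"]

features = [windowed_features(tokens, pos_tags, i) for i in range(len(tokens))]
entities = decode_entities(tokens, labels)
```

In this toy sentence the labels are written by hand; in the described method they would come from the trained CRF model. A larger `window` gives the model more context per token at the cost of more features, which is the accuracy and training-time trade-off the abstract reports tuning.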

Cite this article

LI Xiang, WEI Xiaohong, JIA Lu, CHEN Xin, LIU Lei, ZHANG Yan’e. Recognition of Crops, Diseases and Pesticides Named Entities in Chinese Based on Conditional Random Fields[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017, 48(s1): 178-185.

History
  • Received: 2017-07-10
  • Published online: 2017-12-10