Named Entity Recognition for Agricultural Diseases and Pests Based on Radical Embedding and an Attention Mechanism
Fund project: National Key Research and Development Program of China (2016YFD0300710)


Recognition of Chinese Agricultural Diseases and Pests Named Entity with Joint Radical-embedding and Self-attention Mechanism

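    Summary:

    To address the loss of intrinsic semantic information, the tendency to overlook local contextual features, and the limited ability to capture long-distance dependencies in named entity recognition for agricultural diseases and pests, this study takes agricultural disease and pest texts as its object and proposes a recognition model based on radical embedding and an attention mechanism (Chinese agricultural diseases and pests named entity recognition with joint radical-embedding and self-attention, RS-ADP). First, the model integrates radical embeddings into character embeddings as input to enrich semantic information; three feature-extraction strategies are designed for the radical embeddings, namely a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and CNN-BiLSTM. Second, multiple CNN layers with different window sizes extract local contextual information at different scales. Third, on top of the global sequence features extracted by the BiLSTM, a self-attention mechanism further strengthens the model's ability to capture longer-distance dependencies. Finally, a conditional random field (CRF) jointly identifies entity boundaries and assigns entity categories. Experiments were conducted on a self-built agricultural disease and pest corpus containing 11 categories and 24715 annotated samples. The results show that RS-ADP achieves precision, recall, and F1 of 94.16%, 94.47%, and 94.32% on this dataset. By entity category, RS-ADP reaches F1 values of 95.81%, 97.76%, and 97.23% on easily recognized entities such as crops, diseases, and pests, while still maintaining F1 above 86% on hard-to-recognize entities such as weeds and pathogens. These results indicate that the proposed model recognizes agricultural disease and pest named entities effectively, with higher accuracy than the other models compared and a degree of generalization.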
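The final step, in which a CRF jointly decides entity boundaries and categories, can be illustrated with Viterbi decoding over BIO tags. What follows is a minimal sketch and not the authors' implementation: the tag set (`O`, `B-DIS`, `I-DIS`), the emission scores, and the transition penalty are all hypothetical values chosen for illustration, and the emissions merely stand in for what the BiLSTM and self-attention layers would produce.

```python
from typing import Dict, List, Tuple

def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[t][tag] is the per-character score of `tag` at position t;
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`
    (missing pairs default to 0.0).
    """
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags,
                       key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = (best[-1][prev]
                           + transitions.get((prev, cur), 0.0)
                           + emissions[t][cur])
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical tag set: one entity type ("DIS" for disease) in BIO format.
TAGS = ["O", "B-DIS", "I-DIS"]

# Hypothetical emission scores for a three-character disease name.
emissions = [
    {"O": 0.1, "B-DIS": 2.0, "I-DIS": 0.5},
    {"O": 1.0, "B-DIS": 0.2, "I-DIS": 0.9},
    {"O": 0.3, "B-DIS": 0.1, "I-DIS": 1.5},
]

# Hypothetical transition scores: "I-DIS" may not directly follow "O".
transitions = {("O", "I-DIS"): -10.0}

decoded = viterbi_decode(emissions, transitions, TAGS)
print(decoded)
```

Picking the best tag for each character independently would give `B-DIS, O, I-DIS`, which is not a valid span. With the transition penalty, the decoder prefers `B-DIS, I-DIS, I-DIS`, so boundary and category are settled together for the whole three-character entity.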

    Abstract:

    Chinese named entity recognition in the agricultural diseases and pests domain (CNER-ADP) supports agricultural natural language processing tasks such as relation extraction, agricultural knowledge graph construction, and agricultural question answering. The task still suffers from the neglect of inherent semantic information and local contextual features and from an insufficient ability to capture long-distance dependencies, which lowers accuracy and robustness. To address these problems, a Chinese named entity recognition method for agricultural diseases and pests that jointly uses radical embedding and self-attention (RS-ADP) was proposed. First, the model integrated radical embedding with character embedding as input to enrich semantic information; three strategies (CNN, BiLSTM, and CNN-BiLSTM) were designed to capture the radical-level embedding. Second, CNN layers with different kernel sizes were used to capture multi-scale local contextual features. Third, on top of the BiLSTM layer, a self-attention mechanism was used to further enhance the model's ability to extract longer-distance dependencies. Finally, a conditional random field (CRF) was used to identify entity boundaries and categories. The experiments were carried out on a corpus of agricultural diseases and pests, named AgCNER, which contained 11 categories and 24715 samples. At the macro level, the RS-ADP model achieved precision, recall, and F1 values of 94.16%, 94.47%, and 94.32%, respectively. In terms of specific categories, it achieved F1 values of 95.81%, 97.76%, and 97.23% on easily identifiable entities such as crop, disease, and pest, and it maintained an F1 value above 86% on harder-to-recognize entities such as weed and pathogen. The experimental results showed that the proposed model could effectively recognize named entities of agricultural diseases and pests without feature engineering. Moreover, it showed a degree of generalization and outperformed the other models compared.
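The overall metrics can be related to one another with a short calculation. The sketch below computes F1 as the harmonic mean of the reported precision and recall; the function name `f1_score` is illustrative only. The result is about 94.31, slightly below the reported 94.32. One plausible explanation, offered here as an assumption rather than something the abstract states, is that the reported figure was computed from unrounded precision and recall or averaged over per-category F1 values.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
overall_f1 = f1_score(94.16, 94.47)
print(f"{overall_f1:.2f}")  # prints 94.31
```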

Cite this article

GUO Xuchao, TANG Zhan, DIAO Lei, ZHOU Han, LI Lin. Recognition of Chinese Agricultural Diseases and Pests Named Entity with Joint Radical-embedding and Self-attention Mechanism[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(s2): 335-343.

History
  • Received: 2020-08-01
  • Published online: 2020-12-10
  • Publication date: 2020-12-10