
Named Entity Recognition in Chinese Rice Breeding Questions Based on Text Data Augmentation

Fund Project:

National Natural Science Foundation of China (62303472)

    Abstract:

    Existing rice breeding question answering systems suffer from low-level data management and coarse knowledge granularity. In addition, the rice breeding domain lacks publicly available labeled data for named entity recognition, and manual annotation is costly. To address these issues, a text data augmentation approach to named entity recognition in rice breeding questions was proposed. A rice breeding knowledge graph was constructed to subdivide large named entity categories, such as rice characteristics, into smaller subcategories, such as resistance to abiotic stress and eating quality, which sharpened entity boundaries and reduced knowledge granularity. Because high annotation costs for rice breeding data lead to suboptimal named entity recognition performance, the DA-BERT-BILSTM-CRF model was built by introducing a data augmentation layer into the BERT-BILSTM-CRF model. Using manually labeled rice breeding questions as training data, the proposed model was compared with three baseline models. In the overall named entity recognition experiment under the small-class entity division, the model achieved a precision of 93.86%, a recall of 92.82%, and an F1 score of 93.34%, exceeding the best-performing baseline, BERT-BILSTM-CRF, by 4.98, 5.3 and 5.15 percentage points, respectively. In single-class entity recognition, it achieved a precision of 94.26% and an F1 score of 93.32%. The experiments showed that the proposed approach outperformed the other methods in both the overall and the single-class named entity recognition tasks on rice breeding questions.
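The overall scores reported in the abstract can be checked against the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below does that arithmetic. The baseline precision and recall are not stated in the abstract; they are derived here by subtracting the reported gains, so treat them as an inference, not as published figures.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit for both, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall reported in the abstract (percent).
overall_f1 = f1_score(93.86, 92.82)
print(round(overall_f1, 2))  # 93.34, matching the reported F1

# Assumption: baseline scores inferred by subtracting the reported gains
# of 4.98 (precision) and 5.3 (recall) percentage points.
baseline_precision = 93.86 - 4.98
baseline_recall = 92.82 - 5.3
baseline_f1 = f1_score(baseline_precision, baseline_recall)
print(round(baseline_f1, 2))  # 88.19, i.e. 5.15 points below 93.34
```

The inferred baseline F1 agrees with the reported 5.15-point gain, so the three overall figures are mutually consistent.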
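The abstract does not say which operations the data augmentation layer performs. Purely as an illustration of text data augmentation for NER, the sketch below uses same-type mention replacement, a common technique in which each labeled entity is swapped for another mention of the same type and the BIO tags are regenerated. The function names, the `STRESS` label, the English tokens, and the lexicon entries are all hypothetical and are not taken from the paper.

```python
import random


def extract_spans(tags):
    """Return (start, end, type) entity spans from BIO tags; end is exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def replace_mentions(tokens, tags, lexicon, rng):
    """Swap every entity mention for a random same-type mention from the lexicon."""
    new_tokens, new_tags, cursor = [], [], 0
    for start, end, etype in extract_spans(tags):
        # Copy the untouched text before this entity.
        new_tokens += tokens[cursor:start]
        new_tags += tags[cursor:start]
        # Pick a replacement; keep the original mention if the type is unknown.
        candidates = lexicon.get(etype)
        mention = rng.choice(candidates) if candidates else tokens[start:end]
        new_tokens += mention
        new_tags += [f"B-{etype}"] + [f"I-{etype}"] * (len(mention) - 1)
        cursor = end
    new_tokens += tokens[cursor:]
    new_tags += tags[cursor:]
    return new_tokens, new_tags


# Hypothetical labeled question and lexicon (illustrative values only).
tokens = ["which", "varieties", "have", "strong", "cold", "tolerance", "?"]
tags = ["O", "O", "O", "O", "B-STRESS", "I-STRESS", "O"]
lexicon = {
    "STRESS": [["drought", "tolerance"], ["salt", "tolerance"], ["heat", "resistance"]],
}

rng = random.Random(0)  # seeded so repeated runs give the same sample
new_tokens, new_tags = replace_mentions(tokens, tags, lexicon, rng)
print(list(zip(new_tokens, new_tags)))
```

Because the tags are rebuilt from the replacement's length, the augmented question stays correctly labeled even when the new mention has a different number of tokens than the original.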

Cite this article

NIU Peiyu, HOU Chen. Named Entity Recognition in Chinese Rice Breeding Questions Based on Text Data Augmentation[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(8): 333-343.

History
  • Received: 2023-12-07
  • Published online: 2024-08-10