Feature Mining of News Communication Topic Elements Based on BERT Model

Volume 7, Issue 3, June 2023     |     PP. 133-157     |     Pub. Date: May 7, 2023
DOI: 10.54647/sociology841046

Author(s)

Fei Yang Zheng, The University of Chicago, Edward H. Levi Hall 5801 S. Ellis Ave. Chicago, IL 60637, USA

Abstract
To address the lack of standardization, fuzzy semantics, and sparse features in news topic texts, this paper proposes a BERT-based method for mining the features of news communication topic elements. Two approaches are studied. In the first, multiple fully connected layers extract features from the BERT output for the news topic text, and the final extracted text features are purified by the feature projection method to improve classification. In the second, the feature projection network (FP net) is fused into the hidden layers of the BERT model, so that the classification features are enhanced and purified through feature projection in the hidden layers. Experiments are performed on the Toutiao, Sohu News, THUC News-L, and THUC News-S datasets. Compared with the baseline BERT method, both approaches achieve better accuracy and macro-average F1, with highest accuracies of 86.96%, 86.17%, 94.40%, and 93.73%, respectively, which supports the feasibility and effectiveness of the proposed method. It is concluded that combining BERT with FP net is effective and efficient for news topic text classification.
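The abstract does not give the projection formulas, so the following is only a minimal sketch of what "purifying" a feature by projection could look like. It assumes the common orthogonal-projection form: the component of a feature vector that lies along a shared (non-discriminative) direction is removed, leaving the part orthogonal to it. The names `project`, `purify`, `feature`, and `common` are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(x: Vector, y: Vector) -> Vector:
    """Orthogonal projection of x onto the direction of y."""
    denom = dot(y, y)
    if denom == 0.0:
        # A zero direction carries no information; the projection is zero.
        return [0.0] * len(x)
    scale = dot(x, y) / denom
    return [scale * v for v in y]


def purify(feature: Vector, common: Vector) -> Vector:
    """Remove from `feature` the component shared with `common` (assumed form)."""
    shared = project(feature, common)
    return [f - s for f, s in zip(feature, shared)]


# Toy vectors standing in for a text feature and a shared direction.
feature = [3.0, 4.0]
common = [1.0, 0.0]
purified = purify(feature, common)
print(purified)  # [0.0, 4.0]
```

In this toy case the purified vector is orthogonal to `common`, which is the property such a projection step is meant to provide. In a real model the vectors would be learned hidden representations, not hand-picked numbers.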

Keywords
pre-trained language model; text classification; news topics; BERT; feature projection network

Cite this paper
Fei Yang Zheng, Feature Mining of News Communication Topic Elements Based on BERT Model, SCIREA Journal of Sociology. Volume 7, Issue 3, June 2023 | PP. 133-157. DOI: 10.54647/sociology841046
