Abstract
Deep learning methods are used for prediction tasks in many areas such as computer vision, speech recognition, and natural language processing, where they offer more flexible and powerful models than traditional machine learning methods. Improvements in publicly available healthcare data sets have driven a growth of deep learning research in the healthcare domain, especially for predicting the medical codes of patients. In this work, a deep learning-based multi-modal method is proposed to predict medical codes from both text-based and time-series data. The method was developed on the publicly available MIMIC-III data set and achieves results competitive with similar healthcare prediction methods that use only text-based data. Furthermore, the proposed approach was also evaluated in combination with several state-of-the-art text-based models, showing that using both types of data improves results in terms of F1 score.
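The abstract describes fusing a text branch and a time-series branch into one multi-label code predictor. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; the layer choices (a 1D convolution over note tokens, a GRU over clinical measurements, concatenation before a linear multi-label head) and all dimensions are assumptions for illustration, not the architecture proposed in this work.

```python
import torch
import torch.nn as nn

class MultiModalCodeClassifier(nn.Module):
    """Hypothetical sketch: fuse a text encoder and a time-series encoder
    for multi-label medical-code prediction. All sizes are illustrative."""

    def __init__(self, vocab_size=1000, embed_dim=64, ts_features=17,
                 hidden_dim=64, num_codes=50):
        super().__init__()
        # Text branch: token embeddings + 1D convolution + max pooling
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.text_conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1)
        # Time-series branch: GRU over per-timestep clinical measurements
        self.ts_gru = nn.GRU(ts_features, hidden_dim, batch_first=True)
        # Fusion: concatenate both modality representations, then classify
        self.classifier = nn.Linear(2 * hidden_dim, num_codes)

    def forward(self, tokens, series):
        # tokens: (batch, seq_len) token ids; series: (batch, time, ts_features)
        emb = self.embedding(tokens).transpose(1, 2)         # (batch, embed_dim, seq_len)
        text_repr = torch.relu(self.text_conv(emb)).max(dim=2).values
        _, h = self.ts_gru(series)                           # h: (1, batch, hidden_dim)
        ts_repr = h.squeeze(0)
        fused = torch.cat([text_repr, ts_repr], dim=1)       # (batch, 2 * hidden_dim)
        return self.classifier(fused)  # raw logits; apply sigmoid for multi-label probs

model = MultiModalCodeClassifier()
tokens = torch.randint(0, 1000, (4, 128))   # 4 notes of 128 tokens each
series = torch.randn(4, 24, 17)             # 24 timesteps of 17 measurements
logits = model(tokens, series)
print(logits.shape)
```

In a multi-label setting such as ICD coding, the logits would be passed through a sigmoid (e.g. trained with `nn.BCEWithLogitsLoss`) so each code is predicted independently.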