Generating Clinical Reports From Medical Images Using Deep Graph-based Transformers and Large Language Models

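Author (Chinese):	阮英祿
Author (English):	Nguyen, Anh-Loc
Thesis title (Chinese):	使用基於圖形的深度轉換器與大型語言模型來從醫學影像產生臨床報告
Thesis title (English):	Generating Clinical Reports From Medical Images Using Deep Graph-based Transformers and Large Language Models
Advisor (Chinese):	蘇豐文
Advisor (English):	Soo, Von-Wun
Committee members (Chinese):	陳冠甫、郭柏志
Committee members (English):	Chen, Kuan-Fu; Kuo, Po-Chih
Degree:	Master's
University:	National Tsing Hua University
Department:	Institute of Information Systems and Applications
Student ID:	109065421
Year of publication (ROC calendar):	113
Academic year of graduation:	112
Language:	English
Number of pages:	109
Keywords (Chinese):	医学影像、生成模型、变换器、图神经网络、临床报告、大型语言模型
Keywords (English):	medical image, generative model, transformer, graph neural network, clinical reports, large language models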

Automated generation of clinical reports from medical images is a crucial aspect of modern healthcare, aiming to enhance diagnostic accuracy, efficiency, and patient care. This study explores approaches that combine a deep graph-based Transformer (BioGraph-Transformer) with Large Language Models (LLMs) to overcome the limitations of existing radiology report generation methods, such as suboptimal visual feature extraction, limited rare-disease detection capability, and difficulty in integrating external medical knowledge. Ensuring discourse coherence and mitigating biases in LLM outputs also remain challenging. Meanwhile, manual diagnosis is time-consuming and prone to variability, which underscores the need for automated report generation to improve efficiency and consistency.

The primary objective is to develop a framework that accurately interprets images and generates comprehensive reports, using deep-learning techniques to address these challenges. It comprises (1) a hybrid-structure BioGraph-Transformer model with a memory-driven decoder for generating radiology reports; (2) Graph Convolutional Networks (GCNs) for multi-label classification, which capture semantic relationships between medical objects in images through clinically relevant word embeddings; and (3) a fine-tuned ChatGPT LLM that refines the generated reports, improving clarity, coherence, and factual accuracy.
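As a rough illustration of component (2), the sketch below shows how a single graph convolution can turn per-label word embeddings into per-label classifiers that are then applied to an image feature. It is a minimal sketch under stated assumptions, not the thesis implementation: the adjacency matrix, the embeddings, the weights, and every function name here are hypothetical, and the symmetric normalization with self-loops is the commonly used GCN propagation rule rather than a detail confirmed by the abstract.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each label keeps part of its own embedding.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_norm, h, w, relu=True):
    """One graph convolution: act(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, h), w)
    if relu:
        out = [[max(0.0, v) for v in row] for row in out]
    return out


def label_scores(image_feature, classifiers):
    """Sigmoid of the dot product between the image feature and each label classifier."""
    return [1.0 / (1.0 + math.exp(-sum(f * c for f, c in zip(image_feature, clf))))
            for clf in classifiers]


# Toy data (all values are made up for illustration).
# Labels 0 and 1 are linked in the graph; label 2 is isolated.
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
embeddings = [[1.0, 0.0],   # hypothetical word embedding of label 0
              [0.0, 1.0],   # hypothetical word embedding of label 1
              [1.0, 1.0]]   # hypothetical word embedding of label 2
weights = [[1.0, 0.0],
           [0.0, 1.0]]      # identity weights keep the arithmetic easy to follow

a_norm = normalize_adjacency(adj)
classifiers = gcn_layer(a_norm, embeddings, weights, relu=False)
scores = label_scores([1.0, 0.0], classifiers)  # one score per label
```

In this toy setup the two linked labels end up with the same smoothed classifier, which is the point of propagating embeddings over the label graph: related findings share information. A real system would learn the weights, stack several layers, and take the image feature from a visual backbone.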

Extensive experiments were conducted to evaluate the proposed approach. The results show significant improvements in multi-label classification, report generation accuracy, and fluency. Key metrics, including Clinical Efficacy, Natural Language Generation, and Discourse Coherence, indicate improved clinical relevance and linguistic coherence. Notable findings include the superiority of Snomed2Vec embeddings, the performance of the BioGraph-Transformer model, and the positive effect of ChatGPT integration on report readability and coherence.

In conclusion, this research addresses critical gaps in current methods for automating medical image diagnosis and opens new avenues for future research. Its framework, which combines graph-based transformers with language models, has the potential to significantly enhance the automation of clinical workflows and to support healthcare professionals in delivering better patient care. Future work will focus on challenges in data handling, rare-disease detection, and mitigation of LLM hallucination. This research is a step toward automating medical diagnostics and improving healthcare services.

Contents
Abstract (Chinese)
Acknowledgements (Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
    1 The Transformer models
    2 Graph-based Learning Using Graph Convolution Network
    3 Image Captioning
        3.1 Integration of ResNeXt-50
    4 Data Augmentation
    5 Discourse Structure of X-ray Examination Report
    6 Evaluation Metrics For Generation Performances
    7 Large Language Models To Fine-Tune Report Generation
3 Methodologies
    1 Problem Description and Formulation
    2 Image Augmentation
    3 The Model Structure
        3.1 Multi-Label Classification via Graph-Learning
            3.1.1 Graph Construction
            3.1.2 Learning Representation of Vector Embeddings of Node Features
            3.1.3 Graph Convolutional Network Recap
            3.1.4 GCN for Multi-label Recognition
        3.2 Report Generation
            3.2.1 A Memory-Driven Transformer
            3.2.2 Generation Refinement With Fine-tuned ChatGPT LLM(s)
4 Experiments and Results
    1 Datasets
        1.1 Disease Distribution
    2 Experimental Design for Evaluation on Report Generation
    3 Implementation Details
        3.1 Graph Construction
        3.2 Node Feature Representation
        3.3 Results on Multi-label Classification
        3.4 Results on Report Generation
        3.5 Analysis and Discussion
            3.5.1 Comparative Analysis of Report Generation with and without ChatGPT
        3.6 Comparative Evaluation of Other LLMs for Clinical Report Generation
5 Discussion and Conclusion
Bibliography
A Adjacency Matrix Visualization

Bibliography
[1] Matthew Joseph Adiletta, Jesmin Jahan Tithi, Emmanouil Farsarakis,
Gerasimos Gerogiannis, Robert Adolf, Robert Benke, Sidharth Kashyap,
Samuel Hsia, Kartik Lakhotia, Fabrizio Petrini, Gu-Yeon Wei, and David M.
Brooks. Characterizing the scalability of graph convolutional networks
on intel® piuma. 2023 IEEE International Symposium on Performance
Analysis of Systems and Software (ISPASS), pages 168–177, 2023. URL
https://api.semanticscholar.org/CorpusID:259235396.
[2] Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain
Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millicah, Malcolm
Reynolds, Roman Ring, Eliza Rutherford, Serkan Cabi, Tengda Han, Zhi-
tao Gong, Sina Samangooei, Marianne Monteiro, Jacob Menick, Sebastian
Borgeaud, Andrew Brock, Aida Nematzadeh, Sahand Sharifzadeh, Mikolaj
Binkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, and Karen
Simonyan. Flamingo: a visual language model for few-shot learning. In Pro-
ceedings of the 36th International Conference on Neural Information Pro-
cessing Systems, NIPS ’22, Red Hook, NY, USA, 2024. Curran Associates
Inc. ISBN 9781713871088.
[3] Danial Alihosseini, Ehsan Montahaei, and Mahdieh Soleymani Baghshah.
Jointly measuring diversity and quality in text generation models. In Antoine
Bosselut, Asli Celikyilmaz, Marjan Ghazvininejad, Srinivasan Iyer, Urvashi Khandelwal, Hannah Rashkin, and Thomas Wolf, editors, Proceedings of
the Workshop on Methods for Optimizing and Evaluating Neural Language
Generation, pages 90–98, Minneapolis, Minnesota, June 2019. Association
for Computational Linguistics. doi: 10.18653/v1/W19-2311. URL https:
//aclanthology.org/W19-2311.
[4] Martin Arjovsky, Soumith Chintala, and L ́eon Bottou. Wasserstein genera-
tive adversarial networks. In Proceedings of the 34th International Confer-
ence on Machine Learning - Volume 70, ICML’17, page 214–223. JMLR.org,
2017.
[5] Aldo Badano, Craig Revie, Andrew Casertano, Wei-Chung Cheng, Phil J.
Green, Tom Kimpe, Elizabeth A. Krupinski, Christye Sisson, Stein Olav
Skrøvseth, Darren E. Treanor, Paul A. Boynton, David A. Clunie, Michael J.
Flynn, Tatsuo Heki, Stephen M. Hewitt, Hiroyuki Homma, Andy Masia,
Takashi Matsui, Bal ́azs Vince Nagy, Masahiro Nishibori, John Penczek,
Thomas R. Schopf, Yukako Yagi, and Hideto Yokoi. Consistency and stan-
dardization of color in medical imaging: a consensus report. Journal of
Digital Imaging, 28:41 – 52, 2014. URL https://api.semanticscholar.
org/CorpusID:13459257.
[6] Debapriya Banik and Debotosh Bhattacharjee. Mitigating data imbalance
issues in medical image analysis. pages 66–89, 06 2021. ISBN 9781799873730.
doi: 10.4018/978-1-7998-7371-6.ch004.
[7] Regina Barzilay and Mirella Lapata. Modeling local coherence: An entity-
based approach, 2005.
[8] Anastasiya Belyaeva, Justin Cosentino, Farhad Hormozdiari, Krish Eswaran,
Shravya Shetty, Greg Corrado, Andrew Carroll, Cory Y. McLean, and Nicholas A. Furlotte. Multimodal llms for health grounded in individual-
specific data. In Andreas K. Maier, Julia A. Schnabel, Pallavi Tiwari, and
Oliver Stegle, editors, Machine Learning for Multimodal Healthcare Data,
pages 86–102, Cham, 2024. Springer Nature Switzerland. ISBN 978-3-031-
47679-2.
[9] Gemma A. Bilkey, Belinda L. Burns, Emily P. Coles, Trinity Mahede,
Gareth S. Baynam, and Kristen J. Nowak. Optimizing precision medicine
for public health. Frontiers in Public Health, 7, 2019. URL https:
//api.semanticscholar.org/CorpusID:71140291.
[10] Siddharth Biswal, Cao Xiao, Lucas M. Glass, Brandon Westover, and Jimeng
Sun. Clara: Clinical report auto-completion. In Proceedings of The Web
Conference 2020, WWW ’20, page 541–550, New York, NY, USA, 2020.
Association for Computing Machinery. ISBN 9781450370233. doi: 10.1145/
3366423.3380137. URL https://doi.org/10.1145/3366423.3380137.
[11] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Ka-
plan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry,
Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger,
Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu,
Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin,
Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam Mc-
Candlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language
models are few-shot learners. In H. Larochelle, M. Ranzato, R. Had-
sell, M.F. Balcan, and H. Lin, editors, Advances in Neural Informa-
tion Processing Systems, volume 33, pages 1877–1901. Curran Associates,
Inc., 2020. URL https://proceedings.neurips.cc/paper_files/paper/
2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
[12] Marco Cascella, Federico Semeraro, Jonathan Montomoli, Valentina Bellini,
Ornella Piazza, and Bignami Elena. The breakthrough of large language
models release for medical applications: 1-year timeline and perspectives.
Journal of Medical Systems, 48, 02 2024. doi: 10.1007/s10916-024-02045-3.
[13] Feilong Chen, Duzhen Zhang, Minglun Han, Xiuyi Chen, Jing Shi, Shuang
Xu, and Bo Xu. Vlp: A survey on vision-language pre-training. Machine In-
telligence Research, 20:38–56, 2022. URL https://api.semanticscholar.
org/CorpusID:246996617.
[14] Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, and Yanwen Guo. Multi-label
image recognition with graph convolutional networks. In 2019 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), pages
5172–5181, 2019. doi: 10.1109/CVPR.2019.00532.
[15] Zhihong Chen, Yan Song, Tsung-Hui Chang, and Xiang Wan. Generating
radiology reports via memory-driven transformer. In Proceedings of the 2020
Conference on Empirical Methods in Natural Language Processing, Novem-
ber 2020.
[16] Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, and Ji-Rong Wen. Chainlm: Em-
powering large language models with improved chain-of-thought prompting.
In International Conference on Language Resources and Evaluation, 2024.
URL https://api.semanticscholar.org/CorpusID:268554107.
[17] Giampiero Chiaselotti, Tommaso Gentile, Federico G. Infusino, and Paolo A.
Oliverio. The adjacency matrix of a graph as a data table: a geometric per-
spective. Annali di Matematica Pura ed Applicata (1923 -), 196:1073 – 1112,
2016. URL https://api.semanticscholar.org/CorpusID:125560571.
[18] Kathleen Dahlgren. Discourse Coherence, pages 171–230. Springer US, Boston, MA, 1988. ISBN 978-1-4613-1075-4. doi:
10.1007/978-1-4613-1075-4 8. URL https://doi.org/10.1007/
978-1-4613-1075-4_8.
[19] Chuan Dai, Yajuan Wei, Zhijie Xu, Minsi Chen, and Ying Liu. Performance
analysis of graph laplacian matrices in node classification. In Andrew D. Ball,
Huajiang Ouyang, Jyoti K. Sinha, and Zuolu Wang, editors, Proceedings
of the UNIfied Conference of DAMAS, IncoME and TEPEN Conferences
(UNIfied 2023), pages 877–885, Cham, 2024. Springer Nature Switzerland.
[20] Dina Demner-Fushman, Sameer Antani, Matthew Simpson, and George
Thoma. Design and development of a multimodal biomedical information
retrieval system. Journal of Computing Science and Engineering, 6, 06 2012.
doi: 10.5626/JCSE.2012.6.2.168.
[21] Hantian Dong, Biaokai Zhu, Xinri Zhang, and Xiaomei Kong. Use data
augmentation for a deep learning classification model with chest x-ray clinical
imaging featuring coal workers’ pneumoconiosis. BMC Pulmonary Medicine,
22, 2022. URL https://api.semanticscholar.org/CorpusID:250534768.
[22] Karol Draszawka and Julian Szyma ́nski. From scores to predictions in multi-
label classification: Neural thresholding strategies. Applied Sciences, 2023.
URL https://api.semanticscholar.org/CorpusID:259688933.
[23] Mohamed Elgendi, Muhammad Umer Nasir, Qunfeng Tang, David L Smith,
John-Paul Grenier, Catherine Batte, Bradley M. Spieler, William D. Leslie,
Carlo Menon, Richard Rib ́on Fletcher, Newton Howard, Rabab Kreidieh
Ward, William Parker, and Savvas Nicolaou. The effectiveness of image
augmentation in deep learning networks for detecting covid-19: A geometric transformation perspective. Frontiers in Medicine, 8, 2021. URL https:
//api.semanticscholar.org/CorpusID:232070409.
[24] Caitlin Farmer, Allison Bourne, Denise O’Connor, Jeffrey Jarvik, and
Rachelle Buchbinder. Enhancing clinician and patient understanding of ra-
diology reports: a scoping review of international guidelines. Insights into
Imaging, 11, 12 2020. doi: 10.1186/s13244-020-00864-9.
[25] Megan Feely, Kristen D. Seay, Paul J. Lanier, Wendy F. Auslander, and
Patricia L. Kohl. Measuring fidelity in research studies: A field guide
to developing a comprehensive fidelity measurement system. Child and
Adolescent Social Work Journal, 35:139 – 152, 2017. URL https://api.
semanticscholar.org/CorpusID:254370211.
[26] Fabio Garcea, Alessio Serra, Fabrizio Lamberti, and L. Morra. Data aug-
mentation for medical imaging: A systematic literature review. Com-
puters in biology and medicine, 152:106391, 2022. URL https://api.
semanticscholar.org/CorpusID:254520707.
[27] Akshay Goel, Almog Gueta, Omry Gilon, Chang Liu, Sofia Erell, Lan Huong
Nguyen, Xiaohong Hao, Bolous Jaber, Shashir Reddy, Rupesh Kartha, Jean
Steiner, Itay Laish, and Amir Feder. Llms accelerate annotation for medi-
cal information extraction. 2023. URL https://proceedings.mlr.press/
v225/goel23a.
[28] Josu Goikoetxea, Eneko Agirre, and Aitor Soroa. Single or multiple? com-
bining word representations independently learned from text and wordnet.
In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence,
AAAI’16, page 2608–2614. AAAI Press, 2016.
[29] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and
H. E. Stanley. PhysioBank, PhysioToolkit, and PhysioNet: Compo-
nents of a new research resource for complex physiologic signals. Circu-
lation, 101(23):e215–e220, 2000 (June 13). Circulation Electronic Pages:
http://circ.ahajournals.org/content/101/23/e215.full PMID:1085218; doi:
10.1161/01.CIR.101.23.e215.
[30] Mounia Hamidouche, Laura Cottatellucci, and Konstantin Avrachenkov.
On the normalized laplacian spectra of random geometric graphs. Jour-
nal of Theoretical Probability, 36:46–77, 2022. URL https://api.
semanticscholar.org/CorpusID:246842605.
[31] David M. Hansell, Alexander A. Bankier, Heber Macmahon, Theresa C
McLoud, Nestor Luiz M ̈uller, and Jacques Remy. Fleischner society: glos-
sary of terms for thoracic imaging. Radiology, 246 3:697–722, 2008. URL
https://api.semanticscholar.org/CorpusID:207583334.
[32] Michael Hartung, Ian Bickle, Frank Gaillard, and Jeffrey Kanne. How to
create a great radiology report. Radiographics, 40:1658–1670, 10 2020. doi:
10.1148/rg.2020200020.
[33] Michael P. Hartung, Ian C. Bickle, Frank Gaillard, and Jeffrey P. Kanne.
How to create a great radiology report. RadioGraphics, 40(6):1658–1670,
2020. doi: 10.1148/rg.2020200020. URL https://doi.org/10.1148/rg.
2020200020. PMID: 33001790.
[34] Shihui He, Lijun Yun, and Haicheng Yi. Fusing graph transformer
with multi-aggregate gcn for enhanced drug-disease associations prediction.
BMC Bioinformatics, 25, 2024. URL https://api.semanticscholar.org/
CorpusID:267749483.
[35] Daibing Hou, Zijian Zhao, Yuying Liu, Faliang Chang, and Sanyuan Hu.
Automatic report generation for chest x-ray images via adversarial rein-
forcement learning. IEEE Access, 9:21236–21250, February 2021. doi:
10.1109/ACCESS.2021.3056175.
[36] Yipeng Hu, Daniel C. Alexander, and Thomy Mertzanidou. Image Reg-
istration, pages 632–639. Springer International Publishing, Cham, 2021.
ISBN 978-3-030-63416-2. doi: 10.1007/978-3-030-63416-2 194. URL https:
//doi.org/10.1007/978-3-030-63416-2_194.
[37] Mert Inan, Piyush Kumar Sharma, Baber Khalid, Radu Soricut, M. Stone,
and Malihe Alikhani. Cosmic: A coherence-aware generation metric for im-
age descriptions. In Conference on Empirical Methods in Natural Language
Processing, 2021. URL https://api.semanticscholar.org/CorpusID:
237491865.
[38] Jeremy A. Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-
Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn L. Ball,
Katie S. Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi,
Jesse K. Sandberg, Ricky H Jones, David B. Larson, C. Langlotz, Bhavik N.
Patel, Matthew P. Lungren, and A. Ng. Chexpert: A large chest radiograph
dataset with uncertainty labels and expert comparison. In AAAI, 2019.
[39] Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondˇrej Chum, and Cordelia
Schmid. Graph Convolutional Networks for Learning with Few Clean and
Many Noisy Labels, pages 286–302. 11 2020. ISBN 978-3-030-58606-5. doi:
10.1007/978-3-030-58607-2 17.
[40] Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko
Ishii, Yejin Bang, Delong Chen, Wenliang Dai, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation. ACM Com-
puting Surveys, 55:1 – 38, 2022. URL https://api.semanticscholar.org/
CorpusID:246652372.
[41] X. Jia, Y. Xiong, J. Zhang, Y. Zhang, and Y. Zhu. Few-shot radiology report
generation for rare diseases. In 2020 IEEE International Conference on
Bioinformatics and Biomedicine (BIBM), pages 601–608, Los Alamitos, CA,
USA, December 2020. IEEE Computer Society. doi: 10.1109/BIBM49941.
2020.9313563.
[42] Lanlan Jiang, Shengjun Yuan, and Jun Li. A discourse coherence analy-
sis method combining sentence embedding and dimension grid. Complex.,
2021:6654925:1–6654925:9, 2021. URL https://api.semanticscholar.
org/CorpusID:243803823.
[43] Alistair E. W. Johnson, T. Pollard, Seth J. Berkowitz, Nathaniel R. Green-
baum, M. Lungren, Chih ying Deng, R. Mark, and S. Horng. Mimic-cxr, a
de-identified publicly available database of chest radiographs with free-text
reports. Scientific Data, 6, 2019.
[44] R Addanki S Choudhury S Tamang R Rallo K Agarwal, T Eftimov.
Snomed2vec: Poincar ́e and random walk embeddings of a clinical knowledge
base for healthcare analytics. In KDD Workshop on Applied Data Science
for Healthcare: Bridging the Gap between Data and Knowledge, 2019.
[45] Mert Karabacak and Konstantinos Margetis. Embracing large language mod-
els for medical applications: Opportunities and challenges. Cureus, 15, 2023.
URL https://api.semanticscholar.org/CorpusID:258837444.
[46] Andrej Karpathy and Fei Li. Deep visual-semantic alignments for generating image descriptions. pages 3128–3137, 06 2015. doi: 10.1109/CVPR.2015.
7298932.
[47] Jing Ke, Yiqing Shen, Xiaoyao Liang, and Dinggang Shen. Contrastive learn-
ing based stain normalization across multiple tumor in histopathology. In In-
ternational Conference on Medical Image Computing and Computer-Assisted
Intervention, 2021. URL https://api.semanticscholar.org/CorpusID:
238208184.
[48] Aghiles Kebaili, J ́erˆome Lapuyade-Lahorgue, and Su Ruan. Deep learning
approaches for data augmentation in medical imaging: A review. Journal
of Imaging, 9, 2023. URL https://api.semanticscholar.org/CorpusID:
258156024.
[49] Satyam Khare and Isabelle Vedel. Recall bias and reduction measures: an
example in primary health care service utilization. Family Practice, 2019.
URL https://api.semanticscholar.org/CorpusID:208387557.
[50] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph
convolutional networks. In International Conference on Learning Represen-
tations (ICLR), 2017.
[51] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph
convolutional networks. In International Conference on Learning Represen-
tations, 2017. URL https://openreview.net/forum?id=SJU4ayYgl.
[52] Satyam Kumar, Dayima Musharaf, Seerat Musharaf, and Anil Kumar Sagar.
A comprehensive review of the latest advancements in large generative ai
models. In Rabindra Nath Shaw, Marcin Paprzycki, and Ankush Ghosh,
editors, Advanced Communication and Intelligent Systems, pages 90–103,
Cham, 2023. Springer Nature Switzerland. ISBN 978-3-031-45121-8.
[53] Alice Lai and Joel R. Tetreault. Discourse coherence in the wild: A dataset,
evaluation and methods. In SIGDIAL Conference, 2018. URL https://
api.semanticscholar.org/CorpusID:44105851.
[54] Mateusz Lango and Ondrej Dusek. Critic-driven decoding for mitigat-
ing hallucinations in data-to-text generation. In Houda Bouamor, Juan
Pino, and Kalika Bali, editors, Proceedings of the 2023 Conference on Em-
pirical Methods in Natural Language Processing, pages 2853–2862, Singa-
pore, December 2023. Association for Computational Linguistics. doi: 10.
18653/v1/2023.emnlp-main.172. URL https://aclanthology.org/2023.
emnlp-main.172.
[55] Dong Li, Dong Li, and Hao Liu. Multiview learning of homogeneous neigh-
borhood of nodes for the node representation of heterogeneous graph. Applied
Intelligence, 53:25184–25200, 2023. URL https://api.semanticscholar.
org/CorpusID:260677615.
[56] Jianing Li, Yanyan Lan, Jiafeng Guo, and Xueqi Cheng. On the relation be-
tween quality-diversity evaluation and distribution-fitting goal in text gen-
eration. In Proceedings of the 37th International Conference on Machine
Learning, ICML’20. JMLR.org, 2020.
[57] Rumeng Li, Xun Wang, and Hong Yu. Llamacare: An instruction fine-
tuned large language model for clinical nlp. In International Confer-
ence on Language Resources and Evaluation, 2024. URL https://api.
semanticscholar.org/CorpusID:269804667.
[58] Zhiruo Li and Yucheng Wu. The effectiveness of image augmentation in
breast cancer type classification using deep learning. 2021 3rd Interna-
tional Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), pages 679–684, 2021. URL https://api.semanticscholar.
org/CorpusID:247523637.
[59] Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu,
Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya
Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Chris-
tian Cosgrove, Christopher D. Manning, Christopher R’e, Diana Acosta-
Navas, Drew A. Hudson, E. Zelikman, Esin Durmus, Faisal Ladhak, Frieda
Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel J.
Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan S. Kim, Neel
Guha, Niladri S. Chatterji, O. Khattab, Peter Henderson, Qian Huang,
Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori
Hashimoto, Thomas F. Icard, Tianyi Zhang, Vishrav Chaudhary, William
Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, and Yuta Koreeda. Holistic
evaluation of language models. Annals of the New York Academy of Sci-
ences, 1525:140 – 146, 2023. URL https://api.semanticscholar.org/
CorpusID:253553585.
[60] Guanxiong Liu, Tzu-Ming Harry Hsu, Matthew McDermott, Willie Boag,
Wei-Hung Weng, Peter Szolovits, and Marzyeh Ghassemi. Clinically accurate
chest x-ray report generation. In Finale Doshi-Velez, Jim Fackler, Ken Jung,
David Kale, Rajesh Ranganath, Byron Wallace, and Jenna Wiens, editors,
Proceedings of the 4th Machine Learning for Healthcare Conference, volume
106 of Proceedings of Machine Learning Research, pages 249–269. PMLR, 09–
10 Aug 2019. URL https://proceedings.mlr.press/v106/liu19a.html.
[61] Siyang Liu, Sahand Sabour, Yinhe Zheng, Pei Ke, Xiaoyan Zhu, and Minlie
Huang. Rethinking and refining the distinct metric. In Annual Meeting of the Association for Computational Linguistics, 2022. URL https://api.
semanticscholar.org/CorpusID:247158518.
[62] Justin Lovelace and Bobak Mortazavi. Learning to generate clini-
cally coherent chest X-ray reports. In Findings of the Association
for Computational Linguistics: EMNLP 2020, pages 1235–1243, On-
line, November 2020. Association for Computational Linguistics. doi:
10.18653/v1/2020.findings-emnlp.110. URL https://aclanthology.org/
2020.findings-emnlp.110.
[63] Jiasen Lu, Caiming Xiong, Devi Parikh, and Richard Socher. Knowing
when to look: Adaptive attention via a visual sentinel for image caption-
ing. 2017 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pages 3242–3250, 2016. URL https://api.semanticscholar.
org/CorpusID:18347865.
[64] Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stene-
torp. Fantastically ordered prompts and where to find them: Overcom-
ing few-shot prompt order sensitivity. ArXiv, abs/2104.08786, 2021. URL
https://api.semanticscholar.org/CorpusID:233296494.
[65] Yangling Ma, Yixin Luo, and Zhouwang Yang. Gcn-based mil: multi-
instance learning utilizing structural relationships among instances. Signal,
Image and Video Processing, 2024. URL https://api.semanticscholar.
org/CorpusID:269938749.
[66] Lara Marques, B ́arbara Costa, Mariana Pereira, Abigail Silva, Joana San-
tos, Leo F. Saldanha, Isabel Silva, Paulo Magalh ̃aes, Stephan Schmidt, and
Nuno Vale. Advancing precision medicine: A review of innovative in silico
approaches for drug development, clinical pharmacology and personalized healthcare. Pharmaceutics, 16, 2024. URL https://api.semanticscholar.
org/CorpusID:268103402.
[67] Danielle C. McGeary. Pacs–an overview. Biomedical instrumentation &
technology, 43 2:127–30, 2009. URL https://api.semanticscholar.org/
CorpusID:40644053.
[68] Lei Meng, Zhonglin Ye, Yanlin Yang, and Haixing Zhao. Deepmcgcn: Multi-
channel deep graph neural networks. Int. J. Comput. Intell. Syst., 17:41,
2024. URL https://api.semanticscholar.org/CorpusID:268148955.
[69] Jing Miao, Charat Thongprayoon, Supawadee Suppadungsuk, Oscar A. Gar-
cia Valencia, and Wisit Cheungpasitporn. Integrating retrieval-augmented
generation with large language models in nephrology: Advancing practical
applications. Medicina, 60, 2024. URL https://api.semanticscholar.
org/CorpusID:268728365.
[70] Seyed Moezzi, Abdolrahman Ghaedi, Mojdeh Rahmanian, Seyedeh Mousavi,
and Ashkan Sami. Application of deep learning in generating structured ra-
diology reports: A transformer-based technique. Journal of Digital Imaging,
36, 08 2022. doi: 10.1007/s10278-022-00692-x.
[71] Jong Hak Moon, HyunGyung Lee, Won Young Shin, and E. Choi. Multi-
modal understanding and generation for medical images and text via vision-
language pre-training. IEEE Journal of Biomedical and Health Informatics,
26:6070–6080, 2021. URL https://api.semanticscholar.org/CorpusID:
235166527.
[72] Mohammad Amin Morid, Alireza Borjali, and Guilherme Del Fiol. A scop-
ing review of transfer learning research on medical image analysis using im-
agenet. Computers in Biology and Medicine, 128:104115, 2021. ISSN 0010-4825. doi: https://doi.org/10.1016/j.compbiomed.2020.104115. URL https:
//www.sciencedirect.com/science/article/pii/S0010482520304467.
[73] Chiranjib Mukherjee and Dr.Gyan Mukherjee. Role of adjacency matrix in
graph theory. IOSR Journal of Computer Engineering, 16:58–63, 2014. URL
https://api.semanticscholar.org/CorpusID:124775163.
[74] Takeshi Nakaura, Rintaro Ito, Daiju Ueda, Taiki Nozaki, Yasutaka Fushimi,
Yusuke Matsui, Masahiro Yanagawa, Akira Yamada, Takahiro Tsuboyama,
Noriyuki Fujima, Fuminari Tatsugami, Kenji Hirata, Shohei Fujita, Koji
Kamagata, Tomoyuki Fujioka, Mariko Kawamura, and Shinji Naganawa.
The impact of large language models on radiology: a guide for radiologists
on the latest innovations in ai. Japanese journal of radiology, 2024. URL
https://api.semanticscholar.org/CorpusID:268751675.
[75] Maximillian Nickel and Douwe Kiela. Poincar ́e embeddings for learning
hierarchical representations. In I. Guyon, U. Von Luxburg, S. Bengio,
H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances
in Neural Information Processing Systems, volume 30. Curran Associates,
Inc., 2017. URL https://proceedings.neurips.cc/paper/2017/file/
59dfa2df42d9e3d41f5b02bfc32229dd-Paper.pdf.
[76] Ye-Jean Park, Abhinav Pillai, Jiawen Deng, Eddie Guo, Mehul Gupta, Mike
Paget, and Christopher Naugler. Assessing the research landscape and clin-
ical utility of large language models: A scoping review. BMC Medical Infor-
matics and Decision Making, 24, 03 2024. doi: 10.1186/s12911-024-02459-6.
[77] Ye-Jean Park, Abhinav Pillai, Jiawen Deng, Eddie Guo, Mehul Gupta,
Mike Paget, and Christopher Naugler. Assessing the research landscape
and clinical utility of large language models: a scoping review. BMC Medical Informatics and Decision Making, 24, 2024. URL https://api.
semanticscholar.org/CorpusID:268371285.
[78] John Pavlopoulos, Vasiliki Kougia, and Ion Androutsopoulos. A survey
on biomedical image captioning. In Raffaella Bernardi, Raquel Fernandez,
Spandana Gella, Kushal Kafle, Christopher Kanan, Stefan Lee, and Moin
Nabi, editors, Proceedings of the Second Workshop on Shortcomings in Vi-
sion and Language, pages 26–36, Minneapolis, Minnesota, June 2019. Asso-
ciation for Computational Linguistics. doi: 10.18653/v1/W19-1803. URL
https://aclanthology.org/W19-1803.
[79] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe:
Global vectors for word representation. In Proceedings of the 2014 Confer-
ence on Empirical Methods in Natural Language Processing (EMNLP), pages
1532–1543, Doha, Qatar, October 2014. Association for Computational Lin-
guistics. doi: 10.3115/v1/D14-1162. URL https://aclanthology.org/
D14-1162.
[80] Matt Post. A call for clarity in reporting bleu scores. In Conference
on Machine Translation, 2018. URL https://api.semanticscholar.org/
CorpusID:13751870.
[81] Sophia Pressman, Sahar Borna, Cesar Gomez-Cabello, Syed Haider, Clifton
Haider, and Antonio Forte. Clinical and surgical applications of large
language models: A systematic review. Journal of Clinical Medicine, 13:3041,
05 2024. doi: 10.3390/jcm13113041.
[82] Esther Puyol-Antón, Bram Ruijsink, Stefan K. Piechnik, Stefan Neubauer,
Steffen E. Petersen, Reza Razavi, and Andrew P. King. Fairness in cardiac
MR image analysis: An investigation of bias due to data imbalance in deep
learning based segmentation. In Marleen de Bruijne, Philippe C. Cattin,
Stéphane Cotin, Nicolas Padoy, Stefanie Speidel, Yefeng Zheng, and
Caroline Essert, editors, Medical Image Computing and Computer Assisted
Intervention – MICCAI 2021, pages 413–423, Cham, 2021. Springer International
Publishing. ISBN 978-3-030-87199-4.
[83] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya
Sutskever, et al. Language models are unsupervised multitask learners.
OpenAI blog, 1(8):9, 2019.
[84] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel
Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin,
Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable
visual models from natural language supervision. In Marina Meila and Tong
Zhang, editors, Proceedings of the 38th International Conference on
Machine Learning, volume 139 of Proceedings of Machine Learning Research,
pages 8748–8763. PMLR, 18–24 Jul 2021. URL
https://proceedings.mlr.press/v139/radford21a.html.
[85] Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, and Samy Bengio.
Transfusion: Understanding Transfer Learning for Medical Imaging. Curran
Associates Inc., Red Hook, NY, USA, 2019.
[86] Pranav Rajpurkar and Matthew P. Lungren. The current and future state of
AI interpretation of medical images. New England Journal of Medicine,
388(21):1981–1990, 2023. doi: 10.1056/NEJMra2301725. URL
https://www.nejm.org/doi/full/10.1056/NEJMra2301725.
[87] Steven J. Rennie, Etienne Marcheret, Youssef Mroueh, Jerret Ross,
and Vaibhava Goel. Self-critical sequence training for image captioning.
2017 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pages 1179–1195, 2016. URL
https://api.semanticscholar.org/CorpusID:206594923.
[88] Laria Reynolds and Kyle McDonell. Prompt programming for large language
models: Beyond the few-shot paradigm. Extended Abstracts of the 2021 CHI
Conference on Human Factors in Computing Systems, 2021. URL
https://api.semanticscholar.org/CorpusID:231925131.
[89] Santanu Roy, Alok Kumar Jain, Shyam Lal, and Jyoti Ramnath Kini. A
study about color normalization methods for histopathology images. Micron,
114:42–61, 2018. URL https://api.semanticscholar.org/CorpusID:51958959.
[90] Marc Cicero Schubert, Wolfgang Wick, and Varun Venkataramani. Large
language model-driven evaluation of medical records using MedCheckLLM.
medRxiv, 2023. doi: 10.1101/2023.11.01.23297684. URL
https://www.medrxiv.org/content/early/2023/11/03/2023.11.01.23297684.
[91] Thibault Sellam, Dipanjan Das, and Ankur P. Parikh. BLEURT: Learning
robust metrics for text generation. In Annual Meeting of the Association
for Computational Linguistics, 2020. URL
https://api.semanticscholar.org/CorpusID:215548699.
[92] Zhiqi Shao, Dai Shi, Andi Han, Andrey Vasnev, Yi Guo, and Junbin
Gao. Enhancing framelet GCNs with generalized p-Laplacian
regularization. Int. J. Mach. Learn. Cybern., 15:1553–1573, 2023. URL
https://api.semanticscholar.org/CorpusID:264393210.
[93] Roshan Ramprasad Shetty and Prasad Narasimha Sarappadi. Self-sequential
attention layer based DenseNet for thoracic diseases detection. International
Journal of Intelligent Engineering and Systems, 2021. URL
https://api.semanticscholar.org/CorpusID:237653334.
[94] Connor Shorten and Taghi M. Khoshgoftaar. A survey on image data
augmentation for deep learning. Journal of Big Data, 6:1–48, 2019. URL
https://api.semanticscholar.org/CorpusID:195811894.
[95] Sonit Singh, Sarvnaz Karimi, Kevin Ho-Shon, and Len Hamey. Show, tell
and summarise: learning to generate and summarise radiology findings from
medical images. Neural Comput. Appl., 33(13):7441–7465, jul 2021. ISSN
0941-0643. doi: 10.1007/s00521-021-05943-6. URL
https://doi.org/10.1007/s00521-021-05943-6.
[96] K. Singhal, Shekoofeh Azizi, Tao Tu, Said Mahdavi, Jason Wei, Hyung Won
Chung, Nathan Scales, Ajay Kumar Tanwani, Heather J. Cole-Lewis,
Stephen J. Pfohl, P A Payne, Martin G. Seneviratne, Paul Gamble,
Chris Kelly, Nathaneal Scharli, Aakanksha Chowdhery, P. A. Mansfield,
Blaise Agüera y Arcas, Dale R. Webster, Greg S. Corrado, Yossi Matias,
Katherine Hui-Ling Chou, Juraj Gottweis, Nenad Tomašev, Yun Liu, Alvin
Rajkomar, Joëlle K. Barral, Christopher Semturs, Alan Karthikesalingam,
and Vivek Natarajan. Large language models encode clinical knowledge.
Nature, 620:172–180, 2022. URL
https://api.semanticscholar.org/CorpusID:255124952.
[97] Akshay Smit, Saahil Jain, Pranav Rajpurkar, Anuj Pareek, A. Ng, and
Matthew P. Lungren. CheXbert: Combining automatic labelers and expert
annotations for accurate radiology report labeling using BERT. In
Conference on Empirical Methods in Natural Language Processing, 2020. URL
https://api.semanticscholar.org/CorpusID:215827807.
[98] Luigi Libero Lucio Starace and Sergio Di Martino. Can large language
models automatically generate GIS reports? In International Workshop on
Web and Wireless Geographical Information Systems, 2024. URL
https://api.semanticscholar.org/CorpusID:269838344.
[99] Qi Sun, Kun Zhang, Laishui Lv, Xun Li, Kun Huang, and Ting Zhang.
Joint extraction of entities and overlapping relations by improved graph
convolutional networks. Applied Intelligence, 52:5212 – 5224, 2021. URL
https://api.semanticscholar.org/CorpusID:238795444.
[100] Nima Tajbakhsh, Jae Y. Shin, Suryakanth R. Gurudu, R. Todd Hurst,
Christopher B. Kendall, Michael B. Gotway, and Jianming Liang.
Convolutional neural networks for medical image analysis: Full training or fine
tuning? IEEE Transactions on Medical Imaging, 35(5):1299–1312, 2016.
doi: 10.1109/TMI.2016.2535302.
[101] Kanae Takahashi, Kouji Yamamoto, Aya Kuchiba, and Tatsuki Koyama.
Confidence interval for micro-averaged F1 and macro-averaged F1 scores.
Applied Intelligence (Dordrecht, Netherlands), 52:4961–4972, 2021. URL
https://api.semanticscholar.org/CorpusID:238818759.
[102] Ana Clara Teixeira, Vaishali Marar, Hamed Yazdanpanah, Aline Pezente,
and Mohammad Ghassemi. Enhancing credit risk reports generation using
LLMs: An integration of Bayesian networks and labeled guide prompting.
Proceedings of the Fourth ACM International Conference on AI in Finance,
2023. URL https://api.semanticscholar.org/CorpusID:265448396.
[103] Cagri Toraman, Eyup Halit Yilmaz, Furkan Şahinuç, and Oguzhan Ozcelik.
Impact of tokenization on language models: An analysis for Turkish. ACM
Transactions on Asian and Low-Resource Language Information Processing,
22:1–21, 2022. URL https://api.semanticscholar.org/CorpusID:248240018.
[104] Meimei Tuo, Wenzhong Yang, Fuyuan Wei, and Qicai Dai. A novel
Chinese overlapping entity relation extraction model using word-label based
on cascade binary tagging. Electronics, 2023. URL
https://api.semanticscholar.org/CorpusID:257074463.
[105] Ehsan Ullah, Anil Parwani, Mirza Baig, and Rajendra Singh. Challenges
and barriers of using large language models (LLM) such as ChatGPT for
diagnostic medicine with a focus on digital pathology – a recent scoping review.
Diagnostic Pathology, 19, 02 2024. doi: 10.1186/s13000-024-01464-7.
[106] Ehsan Ullah, Anil Parwani, Mirza Mansoor Baig, and Rajendra Singh.
Challenges and barriers of using large language models (LLM) such as
ChatGPT for diagnostic medicine with a focus on digital pathology – a
recent scoping review. Diagnostic Pathology, 19, 2024. URL
https://api.semanticscholar.org/CorpusID:268030962.
[107] Usman Ahmad Usmani, Ari Happonen, and Junzo Watada. Enhancing
medical diagnosis through deep learning and machine learning approaches in
image analysis. In Kohei Arai, editor, Intelligent Systems and Applications,
pages 449–468, Cham, 2024. Springer Nature Switzerland.
ISBN 978-3-031-47718-8.
[108] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you
need. In Proceedings of the 31st International Conference on Neural
Information Processing Systems, NIPS’17, pages 6000–6010, Red Hook, NY, USA,
2017. Curran Associates Inc. ISBN 9781510860964.
[109] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show
and tell: Lessons learned from the 2015 MSCOCO image captioning challenge.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 39:1–1,
07 2016. doi: 10.1109/TPAMI.2016.2587640.
[110] Huy-The Vu, Minh-Tien Nguyen, Van-Chien Nguyen, Minh-Hieu Pham,
Van-Quyet Nguyen, and Van-Hau Nguyen. Label-representative graph
convolutional network for multi-label text classification. Applied Intelligence,
53:14759–14774, 2022. URL
https://api.semanticscholar.org/CorpusID:253332969.
[111] Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke
Zettlemoyer, and Huan Sun. Towards understanding chain-of-thought prompting:
An empirical study of what matters. In Annual Meeting of the Association
for Computational Linguistics, 2022. URL
https://api.semanticscholar.org/CorpusID:254877569.
[112] Gaihua Wang, Lei Cheng, Jinheng Lin, Yingying Dai, and Tianlun Zhang.
Fine-grained classification based on multi-scale pyramid convolution
networks. PLoS ONE, 16, 2021. URL
https://api.semanticscholar.org/CorpusID:235786238.
[113] Rui Wang. Review of generative models. Applied and Computational
Engineering, 2023. URL
https://api.semanticscholar.org/CorpusID:260391342.
[114] Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri,
and Ronald M. Summers. ChestX-ray8: Hospital-scale chest x-ray database
and benchmarks on weakly-supervised classification and localization of
common thorax diseases. 2017 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), pages 3462–3471, 2017. URL
https://api.semanticscholar.org/CorpusID:263796294.
[115] Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A.
Smith, Daniel Khashabi, and Hannaneh Hajishirzi. Self-instruct:
Aligning language models with self-generated instructions. In Annual
Meeting of the Association for Computational Linguistics, 2022. URL
https://api.semanticscholar.org/CorpusID:254877310.
[116] Philip Watson and Brian McKinstry. A systematic review of interventions
to improve recall of medical advice in healthcare consultations. Journal of
the Royal Society of Medicine, 102:235–243, 2009. URL
https://api.semanticscholar.org/CorpusID:46259190.
[117] Lilian Weng. Contrastive representation learning.
lilianweng.github.io/lil-log, 2021. URL
https://lilianweng.github.io/lil-log/2021/05/31/contrastive-representation-learning.html.
[118] Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, and Weidi Xie.
MedKLIP: Medical knowledge enhanced language-image pre-training for x-ray
diagnosis. 2023 IEEE/CVF International Conference on Computer Vision
(ICCV), pages 21315–21326, 2023. URL
https://api.semanticscholar.org/CorpusID:255440664.
[119] Saining Xie, Ross B. Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He.
Aggregated residual transformations for deep neural networks. 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pages
5987–5995, 2016. URL https://api.semanticscholar.org/CorpusID:8485068.
[120] Saining Xie, Ross Girshick, Piotr Dollár, Z. Tu, and Kaiming He.
Aggregated residual transformations for deep neural networks. pages
5987–5995, 07 2017. doi: 10.1109/CVPR.2017.634.
[121] Hao Xiong, Zhongjun He, Hua Wu, and Haifeng Wang. Modeling
coherence for discourse neural machine translation. In Proceedings of
the Thirty-Third AAAI Conference on Artificial Intelligence and
Thirty-First Innovative Applications of Artificial Intelligence Conference and
Ninth AAAI Symposium on Educational Advances in Artificial Intelligence,
AAAI’19/IAAI’19/EAAI’19. AAAI Press, 2019. ISBN 978-1-57735-809-1.
doi: 10.1609/aaai.v33i01.33017338. URL
https://doi.org/10.1609/aaai.v33i01.33017338.
[122] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville,
Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. Show, attend and
tell: Neural image caption generation with visual attention. In Francis
Bach and David Blei, editors, Proceedings of the 32nd International
Conference on Machine Learning, volume 37 of Proceedings of Machine
Learning Research, pages 2048–2057, Lille, France, 07–09 Jul 2015. PMLR. URL
https://proceedings.mlr.press/v37/xuc15.html.
[123] Xingyi Yang, Muchao Ye, Quanzeng You, and Fenglong Ma. Writing by
memorizing: Hierarchical retrieval-based medical report generation. In
ACL/IJCNLP, 2021.
[124] Dengju Yao, Bailin Li, Xiaojuan Zhan, Xiaorong Zhan, and Liyang Yu.
GCNFORMER: graph convolutional network and transformer for predicting
lncRNA-disease associations. BMC Bioinformatics, 25, 2024. URL
https://api.semanticscholar.org/CorpusID:266727923.
[125] Zhang Yijia, Qingyu Chen, Zhihao Yang, Hongfei Lin, and Zhiyong Lu.
BioWordVec, improving biomedical word embeddings with subword
information and MeSH. Scientific Data, 6, 05 2019. doi: 10.1038/s41597-019-0055-0.
[126] Changchang Yin, Buyue Qian, Jishang Wei, Xiaoyu Li, Xianli Zhang,
Yinghong Li, and Qinghua Zheng. Automatic generation of medical
imaging diagnostic report with hierarchical recurrent neural network. 2019 IEEE
International Conference on Data Mining (ICDM), pages 728–737, 2019.
[127] Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How
transferable are features in deep neural networks? In Proceedings of the 27th
International Conference on Neural Information Processing Systems -
Volume 2, NIPS’14, pages 3320–3328, Cambridge, MA, USA, 2014. MIT Press.
[128] Renchun You, Zhiyao Guo, Lei Cui, Xiang Long, Sid Ying-Ze Bao, and
Shilei Wen. Cross-modality attention with semantic graph embedding for
multi-label classification. ArXiv, abs/1912.07872, 2020.
[129] Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping
Fan, and Yong Rui. Graph attention transformer network for
multi-label image classification. ACM Transactions on Multimedia Computing,
Communications and Applications, 19:1–16, 2022. URL
https://api.semanticscholar.org/CorpusID:247315249.
[130] Kevin Zhai, Mohammad S Yousef, Sawsan Mohammed, Nader I. Al-Dewik,
and M. Walid Qoronfleh. Optimizing clinical workflow using precision
medicine and advanced data analytics. Processes, 2023. URL
https://api.semanticscholar.org/CorpusID:257656238.
[131] Fan Zhang, Yang Song, Weidong (Tom) Cai, Adrien Depeursinge, and
Henning Müller. Text- and content-based medical image retrieval in the
VISCERAL retrieval benchmark. In Cloud-Based Benchmarking of Medical
Image Analysis, 2017. URL https://api.semanticscholar.org/CorpusID:14740768.
[132] Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, and
Daguang Xu. When radiology report generation meets knowledge graph.
Proceedings of the AAAI Conference on Artificial Intelligence,
34:12910–12917, 04 2020. doi: 10.1609/aaai.v34i07.6989.
[133] Tianqi Zhao, Thi Ngan Dong, Alan Hanjalic, and Megha Khosla.
Multi-label node classification on graph-structured data. Transactions on Machine
Learning Research, 2023. ISSN 2835-8856. URL
https://openreview.net/forum?id=EZhkV2BjDP.
[134] Wei Zhao, Michael Strube, and Steffen Eger. DiscoScore: Evaluating
text generation with BERT and discourse coherence. In Andreas
Vlachos and Isabelle Augenstein, editors, Proceedings of the 17th Conference
of the European Chapter of the Association for Computational Linguistics,
pages 3865–3883, Dubrovnik, Croatia, May 2023. Association for
Computational Linguistics. doi: 10.18653/v1/2023.eacl-main.278. URL
https://aclanthology.org/2023.eacl-main.278.
[135] Jing Zou, Bing Gao, Youyi Song, and Jing Qin. A review of deep learning-
based deformable medical image registration. Frontiers in Oncology, 12,
2022. URL https://api.semanticscholar.org/CorpusID:254295776.
