Efficient Optimization of Multimodal Language Models for Processing Radiological Data using the ROCO Dataset

Authors: Danilo Felipe Neto, Anderson Luis Amaral & Victor Hugo de Albuquerque

Abstract: This study investigates the computational efficiency and performance trade-offs of model optimization techniques such as QLoRA, Knowledge Distillation (KD), and Pruning when applied to large language models (LLMs) and small language models (SLMs) for radiology question-answering tasks. Using the ROCOv2 multimodal dataset, we systematically compare baseline models against their fine-tuned and compressed counterparts. The primary goal is to evaluate whether such methods can significantly reduce memory and computational demands while maintaining acceptable accuracy, enabling deployment on edge devices and in low-resource clinical environments. Experimental results show that SLMs enhanced with QLoRA retain competitive accuracy while reducing GPU usage by up to 80%, and that combining KD and Pruning further improves inference speed and hardware efficiency, making these models viable for real-world radiological decision support on edge computing devices.
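One of the compression techniques the abstract compares, pruning, is commonly implemented by zeroing out the lowest-magnitude weights of a layer. The minimal NumPy sketch below is illustrative only and is not the authors' implementation; the function name `magnitude_prune` and the global-threshold strategy are assumptions.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the fraction `sparsity` of
    smallest-magnitude entries set to zero (magnitude pruning)."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(weights.size * sparsity)  # number of entries to prune
    if k == 0:
        return weights.copy()
    # Find the k-th smallest absolute value; everything at or below it is pruned.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half of a 2x2 weight matrix.
w = np.array([[0.1, -2.0], [0.05, 3.0]])
pruned = magnitude_prune(w, 0.5)  # keeps -2.0 and 3.0, zeros the rest
```

In practice, libraries apply such masks per layer and then fine-tune the remaining weights to recover accuracy, which is the kind of speed/accuracy trade-off the paper evaluates.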

Keywords: Large Language Models; Small Language Models; QLoRA; Knowledge Distillation; Pruning; Radiology; Fine-tuning; Edge Computing.

Pages: 6

DOI: 10.21528/CBIC2025-1191702

Paper PDF: CBIC_2025_paper1191702.pdf

BibTeX file: CBIC_2025_1191702.bib