Published on 01 January 2025

Research on the application of LLaVA model based on Q-LoRA fine-tuning in medical teaching

Zhou, Shiling

Description

The Augmented Reality Large Language Model Medical Teaching System (ARLMT) integrates augmented reality (AR) with LLaVA-Med, a multimodal large language model built on LLaVA and designed for biomedical applications, and fine-tunes it with QLoRA for use in medical education. Deployed on resource-constrained AR devices such as INMO Air2 glasses, ARLMT overlays real-time visual annotations and textual feedback on medical scenarios to create an immersive, interactive learning environment. QLoRA reduces the memory footprint by 66% (from 15.2 GB to 5.1 GB), which allows efficient operation without compromising performance. The system reports an average response time of 1.009 seconds across various medical imaging categories, surpassing the GPT-4 baseline in both speed and accuracy, and achieves 98.3% diagnostic accuracy, demonstrating its reliability in real-time applications. By combining visual and textual elements, ARLMT enhances comprehension of complex medical concepts and provides a scalable, real-time solution that bridges technological innovation and pedagogical needs in medical training.
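The reported memory saving can be checked directly from the two figures in the description. The short sketch below is illustrative only: the function name `memory_reduction` is hypothetical and not part of the dataset, and the inputs are the 15.2 GB and 5.1 GB values quoted above.

```python
def memory_reduction(before_gb: float, after_gb: float) -> float:
    """Return the fractional reduction in memory footprint."""
    if before_gb <= 0:
        raise ValueError("before_gb must be positive")
    return (before_gb - after_gb) / before_gb


# Figures quoted in the description: 15.2 GB before QLoRA, 5.1 GB after.
reduction = memory_reduction(15.2, 5.1)
compression = 15.2 / 5.1

print(f"Reduction: {reduction:.1%}")        # Reduction: 66.4%
print(f"Compression: {compression:.2f}x")   # Compression: 2.98x
```

The result, roughly 66.4%, is consistent with the rounded 66% stated in the description and corresponds to a footprint about one third of the original.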

Metrics

Dataset Index

0.3

FAIR Score

13%

Citations

0

Mentions

0

Publication Details

Assigned Domain

Subfield

Computer Vision and Pattern Recognition

Field

Computer Science

Domain

Physical Sciences

Confidence Score

61%

Source

Scholar Data Model

Keywords

Artificial intelligence not elsewhere classified; Human-computer interaction

Normalization Factors

FT

13.46

CTw

1.00

MTw

1.00