Medical RAG Chatbot

Developed at Tab Next (HK) as an Algorithm Engineer Intern (2025.06 - 2025.09).

  • ReAct Agent Architecture: LLM autonomously calls 7 tool types (business query, appointment, form analysis, etc.)
  • CoT + FAISS Retrieval: Short-term memory and long-term vector retrieval for context-aware responses
  • Fine-tuning: LoRA-based SFT + DPO via LLaMA Factory
  • Multimodal: Qwen2.5-VL-7B-Instruct for WhatsApp interactions (+8% stability, +20% efficiency)
  • Performance: vLLM acceleration for multimodal analysis (+70%)
  • Dialogue AI: Real-time transcription with Qwen-ASR, voice demo with Qwen-TTS
  • Results: Customer satisfaction (CSAT 38%) on par with human agents, saving 20% in labor costs
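
A minimal sketch of the ReAct-style tool loop described above, for illustration only. It is not the production code: the two tools, the `Action: name[argument]` text format, and the `scripted_llm` stand-in (used in place of a real model call) are all assumptions made for this example, and only two of the seven tool types are mocked.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools; the real system's seven tool types are not shown here.
def business_query(arg: str) -> str:
    return f"Opening hours for {arg}: 09:00-18:00"

def book_appointment(arg: str) -> str:
    return f"Appointment booked: {arg}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "business_query": business_query,
    "book_appointment": book_appointment,
}

def scripted_llm(transcript: List[str]) -> str:
    """Stand-in for the LLM: chooses the next step from the observations so far."""
    observations = sum(1 for line in transcript if line.startswith("Observation:"))
    if observations == 0:
        return "Action: business_query[dental clinic]"
    if observations == 1:
        return "Action: book_appointment[dental clinic, Monday 10:00]"
    return "Final: Your appointment is booked for Monday 10:00."

def parse_action(step: str) -> Tuple[str, str]:
    """Split 'Action: name[argument]' into (name, argument)."""
    body = step[len("Action:"):].strip()
    name, _, rest = body.partition("[")
    return name.strip(), rest.rstrip("]")

def run_agent(
    question: str,
    llm: Callable[[List[str]], str] = scripted_llm,
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate model steps and tool calls until a final answer or the step limit."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm(transcript)
        transcript.append(step)
        if step.startswith("Final:"):
            return step[len("Final:"):].strip(), transcript
        name, arg = parse_action(step)
        tool = TOOLS.get(name)
        # Feed the tool result (or an error) back so the model can react to it.
        observation = tool(arg) if tool else f"unknown tool: {name}"
        transcript.append(f"Observation: {observation}")
    return "Stopped: step limit reached.", transcript

if __name__ == "__main__":
    answer, transcript = run_agent("Book me a dental appointment")
    print("\n".join(transcript))
    print("Answer:", answer)
```

The step limit is a design choice worth keeping in any such loop: if the model keeps requesting tools (or names one that does not exist), the agent stops with a fixed message instead of running indefinitely.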