Medical RAG Chatbot
Developed at Tab Next (HK) as Algorithm Engineer Intern (2025.06 - 2025.09).
- ReAct Agent Architecture: LLM autonomously calls 7 tool types (business query, appointment, form analysis, etc.)
- CoT + FAISS Retrieval: Short-term memory and long-term vector retrieval for context-aware responses
- Fine-tuning: LoRA-based SFT + DPO via LLaMA Factory
- Multimodal: Qwen2.5-VL-7B-Instruct for WhatsApp interactions (+8% stability, +20% efficiency)
- Performance: vLLM-accelerated inference for multimodal analysis (+70% speed)
- Dialogue AI: Real-time transcription with Qwen-ASR, voice demo with Qwen-TTS
- Business Impact: Customer satisfaction (CSAT 38%) on par with human agents, while saving 20% in labor costs
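
Illustrative sketch of the short-term memory plus long-term vector retrieval idea from the list above. This is not the production code: the `ChatMemory` class, its method names, and the toy 2-D embeddings are hypothetical, and a brute-force cosine search in plain Python stands in for FAISS (whose `IndexFlatIP` performs the same exact inner-product search over normalized vectors).

```python
from collections import deque
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class ChatMemory:
    """Hypothetical memory: a bounded recent-turn buffer plus a vector store."""

    def __init__(self, short_term_size=3):
        self.recent = deque(maxlen=short_term_size)  # short-term memory
        self.store = []  # long-term memory: (unit vector, text) pairs

    def add_turn(self, text, embedding):
        """Record a turn in both the recent buffer and the long-term store."""
        self.recent.append(text)
        self.store.append((normalize(embedding), text))

    def retrieve(self, query_embedding, k=2):
        """Return the k stored texts most similar to the query (brute force)."""
        q = normalize(query_embedding)
        scored = [(sum(a * b for a, b in zip(q, v)), t) for v, t in self.store]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in scored[:k]]

    def build_context(self, query_embedding, k=2):
        """Combine retrieved long-term items with recent turns, skipping duplicates."""
        recent = list(self.recent)
        retrieved = [t for t in self.retrieve(query_embedding, k) if t not in recent]
        return retrieved + recent
```

In use, each user or assistant turn would be passed to `add_turn` with an embedding from whatever encoder the system uses (an assumption here), and `build_context` would supply the text placed in the LLM prompt: older but relevant turns first, then the most recent ones.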