#visual-comprehension

LLaVA-Phi: The Training We Put It Through | HackerNoon

LLaVA-Phi utilizes a structured training pipeline to improve visual and language model capabilities through fine-tuning.

Introducing LLaVA-Phi: A Compact Vision-Language Assistant Powered By a Small Language Model | HackerNoon

LLaVA-Phi showcases the capabilities of smaller language models in multi-modal tasks with only 2.7B parameters.

GPT-4o lets you have real-time audio-video conversations with an "emotional" chatbot

OpenAI debuts GPT-4o for text, vision, and audio, offering real-time capabilities and multilingual translations.