google/paligemma2-3b-pt-448
Image-Text-to-Text • 3B • 32.3k • 48
Google ❤️ Open Source AI
CityRAG: Stepping Into a City via Spatially-Grounded Video Generation
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment