Multi-Modal Understanding
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published Apr 1, 2025 • 15
SERs
speechbrain/emotion-recognition-wav2vec2-IEMOCAP Audio Classification • Updated Jul 23, 2024 • 643k • 169
CAiRE/SER-wav2vec2-large-xlsr-53-eng-zho-all-age Audio Classification • Updated Jun 27, 2023 • 25 • 4