Kim ilyoung (ilyoungkim)
1 follower · 44 following
AI & ML interests
Researcher for Blockchain & Avatar Services
Recent Activity
- liked a dataset 2 days ago: fka/awesome-chatgpt-prompts
- reacted to yeonseok-zeticai's post with 🤗 2 months ago
⚡ ColBERT-ko-v1.0 Complete On-device Study: SOTA Korean Retrieval on Mobile

Major release: a comprehensive mobile deployment study of yoonjong's ColBERT-ko-v1.0, with detailed performance benchmarks across 50+ mobile devices!

🎯 Model Overview:
- Architecture: Korean-optimized ColBERT (late interaction)
- Parameters: 0.1B (compact and efficient)
- Specialty: Korean document retrieval and semantic search
- Performance: 1.0 recall@10, 0.966 nDCG@10 on the AutoRAGRetrieval benchmark
- Benchmark: outperforms Jina-ColBERT-v2 on Korean MTEB tasks

Mobile Performance Results:

Latency metrics:
- NPU (best): 3.17 ms average inference
- GPU: 11.67 ms average
- CPU: 21.36 ms average
- NPU advantage: 18.46x speedup over CPU

Memory efficiency:
- Model size: 567.89 MB (production optimized)
- Runtime memory: 170.87 MB peak consumption
- Load range: 4-614 MB across device categories
- FP32 memory: 642.65 MB (optional high precision)

Accuracy preservation:
- FP16 precision: 53.51 dB maintained
- Quantized mode: 32.77 dB (available for memory constraints)
- Retrieval quality: production-grade Korean semantic matching

🔬 Research Implications. This study demonstrates that:
- Late-interaction models are viable for mobile deployment
- Korean language models achieve SOTA performance on consumer hardware
- Sub-5 ms retrieval enables real-time RAG applications
- Privacy-first Korean AI is now technically feasible

Benchmark methodology:
- Standardized Korean query inputs
- Multiple inference cycles with statistical averaging
- Real-world Korean document corpus testing
- Thermal and battery impact assessment

Deployment recommendations:
- Use NPU acceleration for an 18x speedup on Android
- Implement MUVERA for real-time requirements
- Cache frequent queries for instant responses
- Use progressive loading for large document collections
- Use a hybrid approach for extremely large corpora (local + cloud)

Resources:
- Complete study: https://mlange.zetic.ai/p/Steve/colbert_kor
- Original model: https://huggingface.co/yoonjong0505/ColBERT-ko-v1.0
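The late-interaction architecture the post refers to scores a document by comparing every query token embedding against every document token embedding, taking the best match per query token, and summing. A minimal MaxSim sketch (toy shapes and L2-normalized embeddings assumed; this is an illustration, not the ColBERT-ko-v1.0 implementation):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late-interaction relevance score.

    query_emb: (num_query_tokens, dim) L2-normalized token embeddings
    doc_emb:   (num_doc_tokens, dim)   L2-normalized token embeddings
    """
    sim = query_emb @ doc_emb.T          # token-to-token cosine similarities
    return float(sim.max(axis=1).sum())  # best doc token per query token, summed
```

Because documents are encoded offline, only the short query passes through the encoder at search time, which is why per-query latency can stay in the low-millisecond range on an NPU.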
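The "cache frequent queries" recommendation can be sketched with a memoized retrieval entry point: a repeated query returns the cached hit list instead of re-running the encoder and search. Everything below is hypothetical (the corpus, the substring match standing in for the real encoder + MaxSim search, and the `retrieve` name are illustrative):

```python
from functools import lru_cache

# Toy in-memory corpus standing in for the on-device index (hypothetical).
CORPUS = {
    "doc-a": "korean semantic retrieval on mobile",
    "doc-b": "battery and thermal benchmarking",
}

@lru_cache(maxsize=256)
def retrieve(query: str) -> tuple:
    # Substring match stands in for encoding + MaxSim search; with caching,
    # a repeated query skips that inference pass entirely.
    return tuple(k for k, text in CORPUS.items() if query in text)
```

`lru_cache` requires hashable arguments and an immutable-friendly return type, which is why the hit list is returned as a tuple.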
- liked a model 4 months ago: o0dimplz0o/Fine-Tuned-Whisper-Large-v2-Zeroth-STT-KO
Organizations: None yet
ilyoung's activity
- upvoted a collection 5 months ago: DiffuCoder (4 items • updated Aug 25 • 9)