|
Anyone else fighting the “valid json, broken pipeline” problem in planner-executor stacks?
|
|
3
|
21
|
May 2, 2026
|
|
Made a Python failure dataset for DPO/RLHF — how do you source negative examples?
|
|
1
|
31
|
April 30, 2026
|
|
TikTok-10M Dataset
|
|
7
|
807
|
April 29, 2026
|
|
Dino Data Workflow Routing Preview: training models to route, structure, and prepare actions instead of only replying
|
|
2
|
18
|
April 30, 2026
|
|
Built a lane-based dataset bundle explorer for LLM training — would love feedback from the HF community
|
|
0
|
15
|
April 29, 2026
|
|
When Your “Labels” Aren’t Really Labels: Dealing with Entity-Based NLP Datasets
|
|
1
|
21
|
April 26, 2026
|
|
load_dataset() creates a duplicate in cache
|
|
1
|
40
|
April 25, 2026
|
|
Spanish Historical Web Corpus — unique categories (religion, folklore, conspiracies, BOE)
|
|
0
|
11
|
April 21, 2026
|
|
Dataset viewer broke after repo rename
|
|
5
|
47
|
April 20, 2026
|
|
Huggingface Dataset Download Stuck in Kaggle
|
|
8
|
137
|
April 14, 2026
|
|
Add new official benchmark on the Hub
|
|
3
|
56
|
April 13, 2026
|
|
Total AI beginner with a 25-year photography archive: is this useful for training?
|
|
0
|
16
|
April 10, 2026
|
|
QSBench: Synthetic quantum circuit datasets for QML benchmarking
|
|
0
|
33
|
April 6, 2026
|
|
I would like to get an opinion from knowledgeable people (since I don't understand anything about it myself)
|
|
26
|
203
|
April 4, 2026
|
|
Request to delete DOI-locked dataset: th1nhng0/vietnamese-legal-documents
|
|
2
|
27
|
April 1, 2026
|
|
Indic-faker: Generate realistic Indian synthetic data for NLP/ML — 8 languages, native scripts, batch DataFrame export
|
|
3
|
52
|
March 30, 2026
|
|
What are some AI/ML concepts or problems you found difficult while learning?
|
|
1
|
16
|
March 24, 2026
|
|
The downloads count of dataset hasn't been updated
|
|
2
|
29
|
March 19, 2026
|
|
Need help in fine-tuning of OCR model at production grade
|
|
1
|
87
|
March 12, 2026
|
|
Would a curated dataset of ~4000 social media design layouts be useful for training or fine-tuning design models?
|
|
1
|
25
|
March 10, 2026
|
|
Training LLM model for asking questions
|
|
5
|
350
|
March 10, 2026
|
|
Hugging Face dataset card does not work correctly
|
|
1
|
57
|
March 9, 2026
|
|
Fastdedup: Rust-based dataset deduplication — benchmarks on FineWeb sample-10BT
|
|
2
|
72
|
March 4, 2026
|
|
New Datasets: Human Vocality Primitives Series
|
|
0
|
40
|
March 4, 2026
|
|
Any way to streaming-preprocess a dataset to disk?
|
|
7
|
185
|
March 4, 2026
|
|
Inquiry About Dataset for AI-Driven Cloud Load Balancing and Auto scaling of instances
|
|
2
|
59
|
March 4, 2026
|
|
Looking for Data
|
|
2
|
60
|
March 4, 2026
|
|
Upload a large folder from S3 to a dataset
|
|
6
|
152
|
March 4, 2026
|
|
Dataset flagged as unsafe due to false positive - how to resolve?
|
|
6
|
129
|
March 4, 2026
|
|
Downloads count data not updating
|
|
0
|
30
|
March 4, 2026
|