| [SEEKING] Indic Document Dataset (India) — Invoices, Receipts, Utility Bills, Payment Advices, Packing Lists, Commercial Invoices, Credit Notes | 5 | 55 | June 25, 2026 |
| Dataset Viewer issue: ConfigNamesError | 3 | 61 | June 21, 2026 |
| Add Convence/ParseEmbed as an official benchmark on the Hub (If possible) | 6 | 74 | June 19, 2026 |
| Follow-up: the detector reliability check, now with a second human rater + two LLMs (fresh scenes) | 0 | 26 | June 18, 2026 |
| Welcome — questions, requests, and feedback: Board of Veterans’ Appeals decisions 2019-present | 3 | 25 | June 10, 2026 |
| Introducing BenSyc v1.1: A Benchmark for Conversational Sycophancy and Alignment in Bengali Social Contexts | 0 | 33 | June 9, 2026 |
| New Dataset Released! | 0 | 55 | June 8, 2026 |
| Documented our dataset's limits + ran a reliability check on its rule-based labels | 0 | 27 | June 8, 2026 |
| Hosting Dataset in Europe due to Ethics Constraints | 1 | 53 | June 4, 2026 |
| V7.2 update — Pattern F gap closed + Hard Negatives Batch 2 | 0 | 22 | May 31, 2026 |
| Objective Projection v7.1: Narrative Engineering Corpus targeting Summarization Bias | 0 | 31 | May 30, 2026 |
| For researchers: What physical interaction scenarios are underrepresented in world model training data? | 1 | 34 | May 24, 2026 |
| Integration of Benchmark Dataset for CHI-Bench | 0 | 36 | May 22, 2026 |
| Legal data creation | 1 | 67 | May 16, 2026 |
| [Dataset] CLI-1M: 975K NL→shell pairs — 13 languages, 6 shells, Apache-2.0 | 0 | 39 | May 14, 2026 |
| Synthetic Australian medical record PDF library (50-doc free sample) - feedback wanted on dataset | 0 | 66 | May 7, 2026 |
| PiC/phrase_retrieval dataset (PR-pass & PR-page) is broken — does anyone have a local copy? | 0 | 24 | May 5, 2026 |
| Anyone else fighting the “valid json, broken pipeline” problem in planner-executor stacks? | 2 | 64 | May 3, 2026 |
| TikTok-10M Dataset | 5 | 930 | April 29, 2026 |
| Dino Data Workflow Routing Preview: training models to route, structure, and prepare actions instead of only replying | 0 | 26 | April 29, 2026 |
| Built a lane-based dataset bundle explorer for LLM training — would love feedback from the HF community | 0 | 20 | April 29, 2026 |
| When Your “Labels” Aren’t Really Labels: Dealing with Entity-Based NLP Datasets | 1 | 43 | April 26, 2026 |
| Made a Python failure dataset for DPO/RLHF — how do you source negative examples? | 0 | 44 | April 26, 2026 |
| load_dataset() creates a duplicate in cache | 1 | 73 | April 25, 2026 |
| Spanish Historical Web Corpus — unique categories (religion, folklore, conspiracies, BOE) | 0 | 21 | April 21, 2026 |
| Dataset viewer broke after repo rename | 4 | 84 | April 20, 2026 |
| Huggingface Dataset Download Stuck in Kaggle | 7 | 396 | April 14, 2026 |
| Add new official benchmark on the Hub | 3 | 80 | April 13, 2026 |
| Total AI beginner with a 25-year photography archive—is this useful for training? | 0 | 22 | April 10, 2026 |
| QSBench: Synthetic quantum circuit datasets for QML benchmarking | 0 | 54 | April 6, 2026 |