Text Classification
Transformers
Safetensors
English
deberta-v2
deberta-v3
ecommerce
search
query-volume
seo
keyword-research
amazon
Eval Results (legacy)
text-embeddings-inference
How to use dejanseo/ecommerce-query-volume-classifier with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="dejanseo/ecommerce-query-volume-classifier")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dejanseo/ecommerce-query-volume-classifier")
model = AutoModelForSequenceClassification.from_pretrained("dejanseo/ecommerce-query-volume-classifier")
```
Update README.md
README.md
CHANGED
|
@@ -1,5 +1,200 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: other
|
| 3 |
-
license_name: link-attribution
|
| 4 |
-
license_link: https://
|
| 5 |
-
|
| 1 |
+
---
|
| 2 |
+
license: other
|
| 3 |
+
license_name: link-attribution
|
| 4 |
+
license_link: https://dejan.ai/blog/query-length-vs-volume/
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
library_name: transformers
|
| 8 |
+
pipeline_tag: text-classification
|
| 9 |
+
tags:
|
| 10 |
+
- deberta-v2
|
| 11 |
+
- deberta-v3
|
| 12 |
+
- ecommerce
|
| 13 |
+
- search
|
| 14 |
+
- query-volume
|
| 15 |
+
- seo
|
| 16 |
+
- keyword-research
|
| 17 |
+
- amazon
|
| 18 |
+
base_model: microsoft/deberta-v3-base
|
| 19 |
+
datasets:
|
| 20 |
+
- amazon/AmazonQAC
|
| 21 |
+
metrics:
|
| 22 |
+
- accuracy
|
| 23 |
+
- f1
|
| 24 |
+
model-index:
|
| 25 |
+
- name: ecommerce-query-volume-classifier
|
| 26 |
+
  results:
|
| 27 |
+
  - task:
|
| 28 |
+
      type: text-classification
|
| 29 |
+
      name: Search Query Volume Classification
|
| 30 |
+
    dataset:
|
| 31 |
+
      name: Amazon Shopping Queries (AmazonQAC)
|
| 32 |
+
      type: amazon/AmazonQAC
|
| 33 |
+
    metrics:
|
| 34 |
+
    - name: Accuracy
|
| 35 |
+
      type: accuracy
|
| 36 |
+
      value: 0.721
|
| 37 |
+
    - name: Macro F1
|
| 38 |
+
      type: f1
|
| 39 |
+
      value: 0.6877
|
| 40 |
+
    - name: Spearman Correlation
|
| 41 |
+
      type: spearmanr
|
| 42 |
+
      value: 0.896
|
| 43 |
+
---
|
| 44 |
+
|
| 45 |
+
# eCommerce Query Volume Classifier
|
| 46 |
+
|
| 47 |
+
A fine-tuned [DeBERTa v3 base](https://huggingface.co/microsoft/deberta-v3-base) model that predicts the search volume class of ecommerce product queries. Trained on 39.6 million unique queries from the [Amazon Shopping Queries](https://huggingface.co/datasets/amazon/AmazonQAC) dataset spanning 395.5 million search sessions.
|
| 48 |
+
|
| 49 |
+
**Blog post:** [Is Query Length a Reliable Predictor of Search Volume?](https://dejan.ai/blog/query-length-vs-volume/)
|
| 50 |
+
|
| 51 |
+
## Model Description
|
| 52 |
+
|
| 53 |
+
This model classifies ecommerce search queries into five volume tiers based on their expected search popularity:
|
| 54 |
+
|
| 55 |
+
| Label | Class | Occurrences | Description |
|
| 56 |
+
|-------|-------|-------------|-------------|
|
| 57 |
+
| 0 | `very_high` | 10,000+ | Head terms, major brands (e.g. "airpods", "laptop") |
|
| 58 |
+
| 1 | `high` | 1,000–9,999 | Popular product categories and well-known items |
|
| 59 |
+
| 2 | `medium` | 100–999 | Moderately specific queries |
|
| 60 |
+
| 3 | `low` | 10–99 | Niche or qualified queries |
|
| 61 |
+
| 4 | `very_low` | <10 | Long-tail, highly specific queries |
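The tier boundaries in the table can be expressed as a small bucketing function. This is a minimal sketch: the names `volume_class` and `LABELS` are illustrative, and it assumes the occurrence count is a non-negative integer taken from the dataset.

```python
# Label order follows the table: index 0 is the highest-volume tier.
LABELS = ["very_high", "high", "medium", "low", "very_low"]


def volume_class(occurrences: int) -> int:
    """Map a raw occurrence count to a label index using the table's thresholds."""
    if occurrences < 0:
        raise ValueError("occurrence count must be non-negative")
    if occurrences >= 10_000:
        return 0  # very_high: 10,000+
    if occurrences >= 1_000:
        return 1  # high: 1,000 to 9,999
    if occurrences >= 100:
        return 2  # medium: 100 to 999
    if occurrences >= 10:
        return 3  # low: 10 to 99
    return 4      # very_low: fewer than 10


if __name__ == "__main__":
    for count in (25_000, 1_000, 999, 10, 3):
        print(count, LABELS[volume_class(count)])
```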
|
| 62 |
+
|
| 63 |
+
The model learns semantic signals — brand recognition, category head terms, specificity markers — rather than superficial features like query length. Simple character/word-count heuristics achieve only ~25% accuracy on this task (barely above the 20% random baseline), while this model achieves **72.1% accuracy**.
|
| 64 |
+
|
| 65 |
+
## Usage
|
| 66 |
+
|
| 67 |
+
```python
|
| 68 |
+
from transformers import AutoTokenizer, AutoModelForSequenceClassification
|
| 69 |
+
import torch
|
| 70 |
+
|
| 71 |
+
model_name = "dejanseo/ecommerce-query-volume-classifier"
|
| 72 |
+
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| 73 |
+
model = AutoModelForSequenceClassification.from_pretrained(model_name)
|
| 74 |
+
model.eval()
|
| 75 |
+
|
| 76 |
+
labels = ["very_high", "high", "medium", "low", "very_low"]
|
| 77 |
+
|
| 78 |
+
queries = [
|
| 79 |
+
"airpods",
|
| 80 |
+
"wireless mouse",
|
| 81 |
+
"organic flurb capsules",
|
| 82 |
+
"replacement gasket for instant pot duo 8 quart",
|
| 83 |
+
]
|
| 84 |
+
|
| 85 |
+
inputs = tokenizer(queries, return_tensors="pt", padding=True, truncation=True, max_length=32)
|
| 86 |
+
|
| 87 |
+
with torch.no_grad():
|
| 88 |
+
    outputs = model(**inputs)
|
| 89 |
+
probs = torch.softmax(outputs.logits, dim=-1)
|
| 90 |
+
preds = torch.argmax(probs, dim=-1)
|
| 91 |
+
|
| 92 |
+
for query, pred, prob in zip(queries, preds, probs):
|
| 93 |
+
    label = labels[pred.item()]
|
| 94 |
+
    confidence = prob[pred.item()].item() * 100
|
| 95 |
+
    print(f"{query:50s} → {label:>10s} ({confidence:.1f}%)")
|
| 96 |
+
```
|
| 97 |
+
|
| 98 |
+
## Performance
|
| 99 |
+
|
| 100 |
+
### Evaluation (25K balanced sample, 5K per class)
|
| 101 |
+
|
| 102 |
+
| Method | Accuracy | Spearman ρ |
|
| 103 |
+
|--------|----------|------------|
|
| 104 |
+
| **This model** | **72.1%** | **0.896** |
|
| 105 |
+
| Word count heuristic | 25.4% | -0.345 |
|
| 106 |
+
| Char count heuristic | 24.9% | -0.336 |
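For reference, the two metrics in this table can be computed from class indices as sketched below. This is an illustrative pure-Python version, not the evaluation script behind the reported numbers. The helper names are hypothetical, and Spearman ρ is computed as the Pearson correlation of average ranks, which is the usual way to handle ties between class labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class index."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation between two sequences of class indices."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative ρ, as reported for the length heuristics, means the heuristic's ordering of queries tends to run opposite to the true volume ordering.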
|
| 107 |
+
|
| 108 |
+
### Per-Class F1 Scores (best validation checkpoint)
|
| 109 |
+
|
| 110 |
+
| Class | Precision | Recall | F1 |
|
| 111 |
+
|-------|-----------|--------|----|
|
| 112 |
+
| very_high | 0.892 | 0.980 | 0.934 |
|
| 113 |
+
| high | 0.727 | 0.921 | 0.813 |
|
| 114 |
+
| medium | 0.625 | 0.790 | 0.698 |
|
| 115 |
+
| low | 0.496 | 0.335 | 0.400 |
|
| 116 |
+
| very_low | 0.610 | 0.579 | 0.594 |
|
| 117 |
+
|
| 118 |
+
The model performs best on the high-volume classes (`very_high` and `high`) and struggles most with the `low` class, which sits in an ambiguous zone between `medium` and `very_low`.
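The F1 column follows from the precision and recall columns as their harmonic mean, and the macro F1 is the unweighted mean over the five classes. The sketch below recomputes both from the table values; the variable names are illustrative.

```python
# (precision, recall) per class, copied from the table above.
PER_CLASS = {
    "very_high": (0.892, 0.980),
    "high": (0.727, 0.921),
    "medium": (0.625, 0.790),
    "low": (0.496, 0.335),
    "very_low": (0.610, 0.579),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


scores = {name: f1(p, r) for name, (p, r) in PER_CLASS.items()}
macro_f1 = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name:>10s}  F1 = {score:.3f}")
    print(f"macro F1 = {macro_f1:.4f}")
```

Rounded to the table's precision, the recomputed values agree with the F1 column and with the macro F1 of 0.6877 reported in the model metadata.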
|
| 119 |
+
|
| 120 |
+
## Training Details
|
| 121 |
+
|
| 122 |
+
### Hyperparameters
|
| 123 |
+
|
| 124 |
+
| Parameter | Value |
|
| 125 |
+
|-----------|-------|
|
| 126 |
+
| Base model | `microsoft/deberta-v3-base` |
|
| 127 |
+
| Epochs | 20 |
|
| 128 |
+
| Batch size | 128 |
|
| 129 |
+
| Learning rate | 3e-5 |
|
| 130 |
+
| Max sequence length | 32 |
|
| 131 |
+
| Warmup ratio | 0.1 |
|
| 132 |
+
| Weight decay | 0.01 |
|
| 133 |
+
| Label smoothing | 0.1 |
|
| 134 |
+
| Scheduler | Linear with warmup |
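To make the warmup ratio and linear scheduler concrete, the sketch below derives step counts and the learning rate at a given step from the values in this table. It rests on assumptions not stated in the card: 324,000 training examples per epoch, one optimizer step per batch (no gradient accumulation), and the last partial batch kept. The function name `learning_rate` is illustrative.

```python
import math

TRAIN_EXAMPLES = 324_000  # assumed per-epoch training set size
BATCH_SIZE = 128
EPOCHS = 20
PEAK_LR = 3e-5
WARMUP_RATIO = 0.1

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)


def learning_rate(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("total steps:", total_steps)
    print("warmup steps:", warmup_steps)
    for step in (0, warmup_steps // 2, warmup_steps, total_steps):
        print(step, learning_rate(step))
```

Under these assumptions the learning rate peaks at the end of the second epoch and decays linearly over the remaining eighteen.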
|
| 135 |
+
|
| 136 |
+
### Sampling Strategy
|
| 137 |
+
|
| 138 |
+
Each epoch draws a new rebalanced sample with a different random seed, using fixed per-class quotas:
|
| 139 |
+
|
| 140 |
+
| Class | Samples per epoch |
|
| 141 |
+
|-------|-------------------|
|
| 142 |
+
| very_low | 100,000 |
|
| 143 |
+
| low | 100,000 |
|
| 144 |
+
| medium | 100,000 |
|
| 145 |
+
| high | 30,000 |
|
| 146 |
+
| very_high | 30,000 |
|
| 147 |
+
|
| 148 |
+
**Total per epoch:** 360,000 sampled queries, split into 324,000 train / 36,000 validation
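One way to implement per-epoch sampling with these quotas is sketched below. It is not the training script: the function `sample_epoch`, the seed rule `base_seed + epoch`, and the 10% validation fraction (inferred from the 324,000 / 36,000 totals) are assumptions, as is sampling with replacement when a class has fewer unique queries than its quota.

```python
import random

# Per-class quotas from the table above.
QUOTAS = {
    "very_low": 100_000,
    "low": 100_000,
    "medium": 100_000,
    "high": 30_000,
    "very_high": 30_000,
}


def sample_epoch(pools, quotas, epoch, base_seed=0, val_fraction=0.1):
    """Draw one epoch's (query, label) rows and split them into train/validation.

    pools maps each label to a list of candidate queries for that class.
    """
    rng = random.Random(base_seed + epoch)  # a different seed for every epoch
    rows = []
    for label, quota in quotas.items():
        pool = pools[label]
        if len(pool) >= quota:
            picked = rng.sample(pool, quota)     # without replacement
        else:
            picked = rng.choices(pool, k=quota)  # assumption: oversample small classes
        rows.extend((query, label) for query in picked)
    rng.shuffle(rows)
    n_val = round(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]
```

Calling `sample_epoch(pools, QUOTAS, epoch)` once per epoch gives the model a different slice of the large `low` and `very_low` classes each time, while the class mix stays fixed.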
|
| 149 |
+
|
| 150 |
+
### Hardware
|
| 151 |
+
|
| 152 |
+
- **GPU:** NVIDIA GeForce RTX 4090 (24 GB)
|
| 153 |
+
- **RAM:** 128 GB
|
| 154 |
+
- **OS:** Windows 11
|
| 155 |
+
- **Training time:** ~2 hours 16 minutes
|
| 156 |
+
- **Framework:** PyTorch + Transformers 4.57.1
|
| 157 |
+
|
| 158 |
+
### Dataset
|
| 159 |
+
|
| 160 |
+
[Amazon Shopping Queries (AmazonQAC)](https://huggingface.co/datasets/amazon/AmazonQAC): 395.5 million sessions, 39.6 million unique queries. Volume classes are derived from raw occurrence counts across sessions.
|
| 161 |
+
|
| 162 |
+
| Class | Unique Queries |
|
| 163 |
+
|-------|---------------|
|
| 164 |
+
| very_high | ~18K |
|
| 165 |
+
| high | ~30K |
|
| 166 |
+
| medium | ~321K |
|
| 167 |
+
| low | ~4.6M |
|
| 168 |
+
| very_low | ~34.7M |
|
| 169 |
+
|
| 170 |
+
## What the Model Learns
|
| 171 |
+
|
| 172 |
+
The model captures semantic patterns rather than surface-level features like query length:
|
| 173 |
+
|
| 174 |
+
- **Brand recognition:** "airpods" → very high, regardless of character count
|
| 175 |
+
- **Category head terms:** "laptop", "headphones", "dog food" → recognized as high-volume entry points
|
| 176 |
+
- **Specificity markers:** Size specs, compatibility constraints, and material callouts signal niche demand
|
| 177 |
+
- **Nonsense detection:** Gibberish queries like "blorf" and "wireless blorf adapter" are correctly classified as very low volume, confirming the model isn't just counting characters
|
| 178 |
+
|
| 179 |
+
## Limitations
|
| 180 |
+
|
| 181 |
+
- Trained exclusively on Amazon product search queries — may not generalize well to Google web search, informational queries, or non-English markets
|
| 182 |
+
- The `low` volume class is the weakest (F1 ≈ 0.40), reflecting genuine ambiguity in the boundary between medium and very low volume queries
|
| 183 |
+
- Volume thresholds are based on the Amazon QAC dataset's session counts, which may not map directly to other volume scales (e.g. Google Keyword Planner)
|
| 184 |
+
- Product trends shift over time; queries that were high volume in the training data may not remain so
|
| 185 |
+
|
| 186 |
+
## Citation
|
| 187 |
+
|
| 188 |
+
```bibtex
|
| 189 |
+
@article{petrovic2026querylength,
|
| 190 |
+
title={Is Query Length a Reliable Predictor of Search Volume?},
|
| 191 |
+
author={Petrovic, Dan},
|
| 192 |
+
year={2026},
|
| 193 |
+
month={March},
|
| 194 |
+
url={https://dejan.ai/blog/query-length-vs-volume/}
|
| 195 |
+
}
|
| 196 |
+
```
|
| 197 |
+
|
| 198 |
+
## Author
|
| 199 |
+
|
| 200 |
+
**Dan Petrovic** — [DEJAN AI](https://dejan.ai/)
|