---
license: apache-2.0
tags:
- image-classification
- computer-vision
- checkbox-detection
- efficientnet
datasets:
- wendys-llc/chkbx
metrics:
- accuracy
- f1
- precision
- recall
base_model: google/efficientnet-b0
model-index:
- name: checkbox-classifier-efficientnet
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      type: wendys-llc/chkbx
      name: Checkbox Detection Dataset
      split: validation
    metrics:
    - type: accuracy
      value: 0.97
      name: Validation Accuracy
library_name: transformers
pipeline_tag: image-classification
---

# Checkbox State Classifier - EfficientNet-B0

A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). This model achieves ~95% accuracy on UI checkbox detection.

## Model Description

This model is fine-tuned from [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0) on the [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx) dataset. It's designed to classify UI checkboxes in screenshots and interface images.

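
Because the classifier assigns one label to the whole image, a full screenshot would normally be cropped to a single checkbox before classification. The sketch below computes a padded crop box clamped to the image bounds. It assumes the checkbox bounding box is already known from some other step; the helper name `padded_crop_box`, the padding value, and the example coordinates are all hypothetical, and this card does not state whether padding helps accuracy.

```python
from typing import Tuple

# (left, upper, right, lower) in pixels, the same order PIL's Image.crop expects.
Box = Tuple[int, int, int, int]


def padded_crop_box(box: Box, image_size: Tuple[int, int], padding: int = 4) -> Box:
    """Expand a checkbox bounding box by `padding` pixels, clamped to the image bounds.

    `image_size` is (width, height), matching PIL's Image.size.
    """
    left, upper, right, lower = box
    width, height = image_size
    return (
        max(0, left - padding),
        max(0, upper - padding),
        min(width, right + padding),
        min(height, lower + padding),
    )


# Hypothetical example: a 20x20 checkbox near the left edge of a 1280x720 screenshot.
crop_box = padded_crop_box((2, 100, 22, 120), image_size=(1280, 720), padding=4)
print(crop_box)  # (0, 96, 26, 124)
```

The resulting tuple can be passed to PIL, for example `screenshot.crop(crop_box)`, and the cropped image handed to the classifier as in the usage examples.
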
### Key Features

- **No `trust_remote_code` required** - Uses native transformers support
- **Fast inference** - EfficientNet-B0 is optimized for speed
- **High accuracy** - ~95% on validation set
- **Simple API** - Works with transformers pipeline out of the box

## Usage

### Quick Start with Pipeline (Recommended)

```python
from transformers import pipeline
from PIL import Image

# Load the model
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")

# Classify an image
image = Image.open("checkbox.jpg")
results = classifier(image)

# Print results
for result in results:
    print(f"{result['label']}: {result['score']:.2%}")

# Get just the top prediction
top_result = classifier(image, top_k=1)[0]
print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
```

### Using AutoModel and AutoImageProcessor

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

# Load model and processor
processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")

# Prepare image
image = Image.open("checkbox.jpg")
inputs = processor(images=image, return_tensors="pt")

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get predicted class
predicted_class_idx = logits.argmax(-1).item()
predicted_label = model.config.id2label[predicted_class_idx]

# Get confidence scores
probabilities = torch.nn.functional.softmax(logits, dim=-1)
confidence = probabilities.max().item()

print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
```

### Batch Processing

```python
from transformers import pipeline
from PIL import Image

classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")

# Process multiple images
images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
results = classifier(images)

for i, result in enumerate(results):
    top_pred = result[0]  # Get top prediction
    print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
```

## Model Details

### Architecture

- **Base Model**: google/efficientnet-b0
- **Model Type**: EfficientNet for Image Classification
- **Number of Labels**: 2 (checked, unchecked)
- **Input Size**: 224x224 RGB images
- **Framework**: PyTorch via Transformers

### Training Details

- **Dataset**: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
  - ~4,800 training samples
  - ~1,200 validation samples
- **Training Configuration**:
  - Epochs: 15 (with early stopping)
  - Batch Size: 64 (on A100)
  - Learning Rate: Default AdamW
  - Mixed Precision: FP16
  - Hardware: NVIDIA A100 GPU

## Acknowledgments

- Base model: [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0)
- Dataset: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
- Framework: [HuggingFace Transformers](https://github.com/huggingface/transformers)

## License

This model is licensed under the Apache 2.0 License. See the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) file for details.