cnn-prelu-imagenet

A Convolutional Neural Network (CNN) trained on ImageNet-1k with PReLU activation.

Repository: github.com/chrisjob1021/model-examples

Model Description

This is a ResNet-style CNN architecture featuring:

  • Activation Function: PReLU
  • Number of Classes: 1000
  • Architecture: Deep residual network with bottleneck blocks
  • Training Dataset: ImageNet-1k
  • Code: See cnn/ directory in the repository for training scripts and model implementation

Key Features

  • Residual connections for better gradient flow
  • Batch normalization for training stability
  • PReLU activation for learnable non-linearity
  • Manual and built-in convolution implementations, included for educational purposes
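
To make the "learnable non-linearity" bullet concrete, here is a minimal sketch of PReLU on plain Python floats. The function names and the slope of 0.25 (the initial value proposed in the original PReLU paper) are illustrative assumptions. The repository's own implementation may be organized differently.

```python
def prelu(x: float, slope: float = 0.25) -> float:
    """PReLU: identity for positive inputs, `slope * x` otherwise.

    In a trained network `slope` is a learned parameter; 0.25 is only an
    illustrative starting value (an assumption, not read from this model).
    """
    return x if x > 0 else slope * x


def prelu_grad_slope(x: float) -> float:
    """Gradient of the output with respect to `slope`.

    It is non-zero only for negative inputs, which is what lets training
    adjust the slope.
    """
    return 0.0 if x > 0 else x


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 1.5):
        print(value, prelu(value), prelu_grad_slope(value))
```

Setting `slope=0.0` recovers plain ReLU, and a small fixed slope recovers Leaky ReLU.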

Training Details

  • Epochs: 300
  • Global Steps: 375,300
  • Training Loss: 0.8986
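
The step count above appears consistent with the epoch count under two assumptions that this card does not state: the standard ImageNet-1k training split of 1,281,167 images, and the last partial batch of each epoch being dropped. A small sketch of the arithmetic, using the effective batch size of 1024 listed under Optimization (variable names are hypothetical):

```python
# Assumptions (not stated in this card): the standard ImageNet-1k training
# split size, and that the final partial batch of every epoch is dropped.
IMAGENET_TRAIN_IMAGES = 1_281_167
EFFECTIVE_BATCH_SIZE = 1024  # from the Optimization section
EPOCHS = 300

steps_per_epoch = IMAGENET_TRAIN_IMAGES // EFFECTIVE_BATCH_SIZE  # floor division
global_steps = steps_per_epoch * EPOCHS

print(steps_per_epoch)  # 1251
print(global_steps)     # 375300
```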

Evaluation Results

  • Top-1 Accuracy: 78.01%
  • Top-5 Accuracy: 93.89%
  • Top-1 Error: 21.99%
  • Top-5 Error: 6.11%
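
For readers unfamiliar with these metrics, the sketch below shows how top-k accuracy and the matching error rate are typically computed. The scores and labels are made-up toy values, not outputs of this model, and `topk_accuracy` is a hypothetical helper rather than the repository's evaluation code.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  k: int) -> float:
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score for this sample.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy scores for 4 samples over 3 classes (made-up numbers, not model output).
toy_scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
toy_labels = [0, 1, 0, 0]

top1 = topk_accuracy(toy_scores, toy_labels, k=1)
top2 = topk_accuracy(toy_scores, toy_labels, k=2)
print(top1, 100.0 - top1)  # 25.0 75.0
print(top2, 100.0 - top2)  # 75.0 25.0
```

The error rate is 100% minus the accuracy, which is why 78.01% top-1 accuracy corresponds to 21.99% top-1 error.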

Comparison with ImageNet Benchmarks

Model         Top-1   Top-5   Parameters  Year  Notes
AlexNet       57.0%   80.3%   60M         2012  First deep CNN
VGG-16        71.5%   90.1%   138M        2014  Deep with small filters
ResNet-50     76.0%   93.0%   25M         2015  Baseline
ResNet-152    78.3%   94.3%   60M         2015  Deeper variant
Inception-v3  78.0%   93.9%   24M         2015  Multi-scale
This model    78.01%  93.89%  25.7M       2025  PReLU

Key Achievement: +2.01 percentage points Top-1 accuracy over the ResNet-50 baseline (78.01% vs. 76.0%)

Usage

from prelu_cnn import CNN

# Load the model
model = CNN.from_pretrained(
    "your-username/cnn-prelu-imagenet",
    use_prelu=True,
    num_classes=1000
)

# Use for inference
import torch
from torchvision import transforms
from PIL import Image

# Prepare image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Convert to RGB so grayscale, RGBA, or CMYK files match the 3-channel Normalize
image = Image.open("path/to/image.jpg").convert("RGB")
input_tensor = transform(image).unsqueeze(0)

# Get prediction
model.eval()
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    top5_prob, top5_catid = torch.topk(probabilities, 5)

Training Procedure

This model was trained on ImageNet-1k with the following optimization, augmentation, and regularization settings:

Optimization

  • Optimizer: AdamW (weight_decay=0.02)
  • Learning Rate: 0.1 with 10-epoch warmup
  • Schedule: Cosine annealing
  • Batch Size: 1024 effective (512 per GPU × 2 gradient accumulation steps)
  • Mixed Precision: fp16 for efficiency
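
The warmup and cosine settings above can be sketched as a per-step learning-rate function. The exact shape is an assumption: this sketch warms up linearly from zero, anneals to zero, and updates once per optimizer step, with 1251 steps per epoch inferred from 375,300 global steps over 300 epochs. The training scripts in the repository may differ in any of these details.

```python
import math

BASE_LR = 0.1            # from the list above
WARMUP_EPOCHS = 10       # from the list above
TOTAL_EPOCHS = 300       # from Training Details
STEPS_PER_EPOCH = 1251   # assumption: 375,300 global steps / 300 epochs


def learning_rate(step: int) -> float:
    """Linear warmup followed by cosine annealing to zero (assumed shape)."""
    warmup_steps = WARMUP_EPOCHS * STEPS_PER_EPOCH
    total_steps = TOTAL_EPOCHS * STEPS_PER_EPOCH
    if step < warmup_steps:
        # Ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / warmup_steps
    # Fraction of the post-warmup schedule completed, in [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 5, 10, 155, 300):
    print(epoch, learning_rate(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate peaks at 0.1 at the end of epoch 10, is back to half of that at epoch 155, and reaches zero at epoch 300.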

Data Augmentation

  • MixUp (α=0.2, prob=50%): Linearly combines pairs of images and labels [Zhang et al., 2017]
  • CutMix (α=1.0, prob=50%): Replaces image regions with patches from other images [Yun et al., 2019]
  • RandAugment (magnitude=9, std=0.5): Automated augmentation policy [Cubuk et al., 2020]
  • Standard: RandomResizedCrop (224×224), random horizontal flip, color jitter, random erasing (prob=0.25)
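
As an illustration of the MixUp and CutMix entries above, the sketch below mixes two toy samples and picks a CutMix rectangle using only the standard library. `mixup` and `cutmix_box` are hypothetical helpers following the formulations in the cited papers. They are not the repository's augmentation code, and real pipelines operate on image tensors rather than short lists.

```python
import math
import random
from typing import List, Tuple


def mixup(x_a: List[float], x_b: List[float],
          y_a: List[float], y_b: List[float],
          alpha: float, rng: random.Random
          ) -> Tuple[List[float], List[float], float]:
    """Blend two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


def cutmix_box(height: int, width: int, lam: float, rng: random.Random
               ) -> Tuple[Tuple[int, int, int, int], float]:
    """Pick a rectangle covering roughly (1 - lam) of the image area."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    # Clipping at the border changes the area, so recompute the label weight.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, bottom, left, right), lam_adjusted


rng = random.Random(0)  # seeded only to make the demo repeatable
pixels, label, lam = mixup([0.0, 1.0], [1.0, 0.0],
                           [1.0, 0.0], [0.0, 1.0], alpha=0.2, rng=rng)
box, lam_cut = cutmix_box(224, 224, lam=rng.betavariate(1.0, 1.0), rng=rng)
print(lam, label)
print(box, lam_cut)
```

In both cases the mixed label is a soft target, which is why the Regularization list describes label smoothing as coming via MixUp/CutMix.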

Regularization

  • Stochastic depth (drop_path_rate=0.1)
  • Label smoothing (via MixUp/CutMix)
  • Weight decay
  • Batch normalization

Model Architecture

ResNet-50 with PReLU

CNN(
  conv1: [ConvAct(3 → 64, 7×7, stride=2) + MaxPool(3×3, stride=2)]
    Input: 224×224 → 112×112 → 56×56

  conv2_x: 3× BottleneckBlock(64 → 64 → 256)
    56×56 (no downsampling)

  conv3_x: 4× BottleneckBlock(256 → 128 → 512)
    56×56 → 28×28 (first block stride=2)

  conv4_x: 6× BottleneckBlock(512 → 256 → 1024)
    28×28 → 14×14 (first block stride=2)

  conv5_x: 3× BottleneckBlock(1024 → 512 → 2048)
    14×14 → 7×7 (first block stride=2)

  avgpool: AdaptiveAvgPool2d(1×1)
    7×7 → 1×1

  fc: Linear(2048 → 1000)
)

Total Layers: 50 (1 + 3×3 + 4×3 + 6×3 + 3×3 = 49 conv + 1 fc)
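
The spatial sizes and the layer count in the listing above can be checked with a few lines of arithmetic. The padding values used here (3 for the 7×7 stem, 1 for every 3×3 operation) are assumptions based on the usual ResNet-50 layout, since the listing does not state them.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Padding values are assumptions (3 for the 7x7 stem, 1 for every 3x3 op);
# the architecture listing does not state them.
size = 224
trace = [size]
size = conv_out(size, kernel=7, stride=2, padding=3)   # conv1 stem
trace.append(size)
size = conv_out(size, kernel=3, stride=2, padding=1)   # max pool
trace.append(size)

blocks_per_stage = [3, 4, 6, 3]            # conv2_x .. conv5_x
for stage_index, _ in enumerate(blocks_per_stage):
    stride = 1 if stage_index == 0 else 2  # only conv3_x..conv5_x downsample
    # The strided 3x3 conv in the first block sets the stage's output size.
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    trace.append(size)

conv_layers = 1 + 3 * sum(blocks_per_stage)  # stem + 3 convs per bottleneck
total_layers = conv_layers + 1               # plus the final fc layer

print(trace)         # [224, 112, 56, 56, 28, 14, 7]
print(total_layers)  # 50
```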

Key Features

  • PReLU Activation: Learnable negative slope for adaptive non-linearity
  • Bottleneck Blocks: 1×1 → 3×3 → 1×1 design (the 3×3 convolution runs on 4× fewer channels than the block output)
  • Residual Connections: Skip connections for deep network training
  • ReZero Scaling: Learnable residual scaling (initialized at 0)
  • Stochastic Depth: DropPath with a drop rate that increases linearly with depth (0.0 → 0.1)
  • Batch Normalization: Momentum=0.01 for stable statistics
  • Global Average Pooling: Spatial invariance, zero parameters
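
The ReZero and stochastic depth entries interact inside each residual block. The scalar sketch below shows one plausible reading of them. Both function names are hypothetical, and rescaling the kept branch by 1/(1 - drop_rate) is a common convention assumed here, not something this card confirms.

```python
import random
from typing import List


def drop_path_rates(num_blocks: int, max_rate: float) -> List[float]:
    """Per-block drop probabilities rising linearly from 0.0 to max_rate."""
    if num_blocks == 1:
        return [max_rate]
    return [max_rate * i / (num_blocks - 1) for i in range(num_blocks)]


def residual_step(x: float, branch_out: float, alpha: float,
                  drop_rate: float, training: bool,
                  rng: random.Random) -> float:
    """One scalar residual update: x + alpha * DropPath(branch_out)."""
    if training and drop_rate > 0.0:
        if rng.random() < drop_rate:
            return x  # branch dropped: only the skip connection remains
        # Assumed convention: rescale so the expected branch output is unchanged.
        branch_out = branch_out / (1.0 - drop_rate)
    return x + alpha * branch_out


rates = drop_path_rates(num_blocks=16, max_rate=0.1)  # 3 + 4 + 6 + 3 blocks
print(rates[0], rates[-1])

# With the ReZero scale at its initial value of 0, the block is an identity map.
print(residual_step(1.0, 5.0, alpha=0.0, drop_rate=0.1, training=False,
                    rng=random.Random(0)))
```

Starting `alpha` at 0 means every block initially passes its input through unchanged, and the residual branches are phased in as training moves the scale away from zero.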

References

Citation

If you use this model, please cite:

@misc{cnn_prelu_imagenet,
  title={cnn-prelu-imagenet: CNN with PReLU for ImageNet Classification},
  year={2025},
  publisher={HuggingFace Hub},
}

Original PReLU Paper

This model uses PReLU activation. Please also cite the original paper:

@inproceedings{he2015delving,
  title={Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification},
  author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={1026--1034},
  year={2015}
}

License

This model is released under the MIT License.

Model size: 25.7M params (Safetensors, tensor type F32)