Linked paper: ReZero is All You Need: Fast Convergence at Large Depth (arXiv:2003.04887)
A Convolutional Neural Network (CNN) trained on ImageNet-1k with PReLU activation.
Repository: github.com/chrisjob1021/model-examples
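PReLU (Parametric ReLU) replaces ReLU's fixed zero slope for negative inputs with a learnable coefficient, f(x) = max(0, x) + a · min(0, x). As a minimal sketch of the activation itself in PyTorch (illustrative only, not the repository's code):

```python
import torch
import torch.nn as nn

# PReLU: f(x) = max(0, x) + a * min(0, x), where the slope `a` is learned during training.
# num_parameters=64 gives one slope per channel; num_parameters=1 shares a single slope.
prelu = nn.PReLU(num_parameters=64, init=0.25)

x = torch.randn(1, 64, 56, 56)
y = prelu(x)  # same shape as x; negative activations are scaled by the learned slopes
```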
This is a ResNet-style CNN architecture; see the cnn/ directory in the repository for the training scripts and model implementation.

ImageNet-1k classification results compared with reference models:

| Model | Top-1 | Top-5 | Parameters | Year | Notes |
|---|---|---|---|---|---|
| AlexNet | 57.0% | 80.3% | 60M | 2012 | First deep CNN |
| VGG-16 | 71.5% | 90.1% | 138M | 2014 | Deep with small filters |
| ResNet-50 | 76.0% | 93.0% | 25M | 2015 | Baseline |
| ResNet-152 | 78.3% | 94.3% | 60M | 2015 | Deeper variant |
| Inception-v3 | 78.0% | 93.9% | 24M | 2015 | Multi-scale |
| This model | 78.01% | 93.89% | ~23M | 2025 | PReLU |
Key achievement: +2.01 percentage points Top-1 accuracy over the ResNet-50 baseline
```python
import torch
from torchvision import transforms
from PIL import Image

from prelu_cnn import CNN

# Load the model
model = CNN.from_pretrained(
    "your-username/cnn-prelu-imagenet",
    use_prelu=True,
    num_classes=1000,
)

# Prepare image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("path/to/image.jpg")
input_tensor = transform(image).unsqueeze(0)

# Get prediction
model.eval()
with torch.no_grad():
    output = model(input_tensor)

probabilities = torch.nn.functional.softmax(output[0], dim=0)
top5_prob, top5_catid = torch.topk(probabilities, 5)
```
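To turn the predicted indices into human-readable labels, a class-index file is needed. The snippet below assumes a hypothetical `imagenet_classes.txt` with one ImageNet class name per line in index order; it is not part of the repository and is shown only as a usage sketch:

```python
# Hypothetical labels file: one ImageNet class name per line, in index order.
with open("imagenet_classes.txt") as f:
    categories = [line.strip() for line in f]

# Print the top-5 predictions with their probabilities.
for prob, catid in zip(top5_prob, top5_catid):
    print(f"{categories[catid]}: {prob.item():.4f}")
```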
This model was trained on ImageNet-1k. The architecture is summarized below:
```
CNN(
  conv1:   ConvAct(3 → 64, 7×7, stride=2) + MaxPool(3×3, stride=2)
           input 224×224 → 112×112 → 56×56
  conv2_x: 3× BottleneckBlock(64 → 64 → 256)
           56×56 (no downsampling)
  conv3_x: 4× BottleneckBlock(256 → 128 → 512)
           56×56 → 28×28 (first block stride=2)
  conv4_x: 6× BottleneckBlock(512 → 256 → 1024)
           28×28 → 14×14 (first block stride=2)
  conv5_x: 3× BottleneckBlock(1024 → 512 → 2048)
           14×14 → 7×7 (first block stride=2)
  avgpool: AdaptiveAvgPool2d(1×1)
           7×7 → 1×1
  fc:      Linear(2048 → 1000)
)
```

Total layers: 50 (1 + 3×3 + 4×3 + 6×3 + 3×3 = 49 conv + 1 fc)
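For reference, here is a hedged sketch of what one BottleneckBlock from the summary above could look like with PReLU activations. This is an illustrative reconstruction of a standard ResNet-style bottleneck (1×1 reduce, 3×3, 1×1 expand, projection shortcut), not the repository's exact code; see the cnn/ directory for the real implementation:

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """Illustrative ResNet-style bottleneck with PReLU: 1x1 reduce, 3x3, 1x1 expand."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, mid_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_ch)
        self.conv2 = nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_ch)
        self.conv3 = nn.Conv2d(mid_ch, out_ch, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_ch)
        self.act1 = nn.PReLU(mid_ch)
        self.act2 = nn.PReLU(mid_ch)
        self.act3 = nn.PReLU(out_ch)
        # Projection shortcut when the shape changes (e.g. first block of conv3_x..conv5_x).
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.act1(self.bn1(self.conv1(x)))
        out = self.act2(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.act3(out + self.shortcut(x))

# e.g. first block of conv3_x: 256 -> 128 -> 512, stride 2 (56x56 -> 28x28)
x = torch.randn(1, 256, 56, 56)
print(BottleneckBlock(256, 128, 512, stride=2)(x).shape)  # torch.Size([1, 512, 28, 28])
```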
If you use this model, please cite:

```bibtex
@misc{cnn_prelu_imagenet,
  title={cnn-prelu-imagenet: CNN with PReLU for ImageNet Classification},
  year={2025},
  publisher={HuggingFace Hub},
}
```
This model uses PReLU activation. Please also cite the original paper:

```bibtex
@inproceedings{he2015delving,
  title={Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification},
  author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={1026--1034},
  year={2015}
}
```
This model is released under the MIT License.