---
base_model: aswincandra/rgai-air-pollution-image-classification
metrics:
- accuracy
model-index:
- name: rgai-air-pollution-image-classification
  results:
  - task:
      name: Image Classification
      type: image-classification
    dataset:
      name: imagefolder
      type: imagefolder
      config: default
      split: train
      args: default
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8166
---

# RGAI Air Pollution Image Classification by Aswin Candra

## Example Usage

First, clone this repo. Then:

```python
from aswin_air_pollution import CustomCNN

model = CustomCNN.from_pretrained('aswincandra/rgai-air-pollution-image-classification')
```

Notebook example: [here](https://colab.research.google.com/drive/1MXI53GahSEvmLBz7eRMQX66tJwGfnI-w?usp=sharing)

## Model description

This model reproduces the architecture of [Utomo Sapdo et al.](https://dl.acm.org/doi/abs/10.1145/3582515.3609531) on the [Air Pollution Image Dataset from India and Nepal](https://www.kaggle.com/datasets/adarshrouniyar/air-pollution-image-dataset-from-india-and-nepal/data) Kaggle dataset. It achieves the following results on the testing set:

- Loss: 0.5276
- Accuracy: 0.8166

### Architecture

*Quoted from [Utomo Sapdo et al.](https://dl.acm.org/doi/abs/10.1145/3582515.3609531):*

> The proposed model accepts (224 x 224 x 3) RGB images as inputs. The initial model block is comprised of two CNN layers with 64 filters and one maxpooling layer. The second model block contains two CNN layers with 128 filters and one maxpooling layer. The third through fifth blocks use modified residual blocks that only apply one CNN layer, with the output of that CNN layer being added to the previous maxpooling output before being transmitted to the maxpooling layer. All maxpooling layers employ a kernel size of 3x3. Instead of ReLU, the activation function for all CNN layers is LeakyReLU. These five blocks are utilized for image feature extraction. The extracted features will then be flattened and transmitted to FC layer sets.
> The first FC layer consists of 256 neurons, whereas the second FC layer consists of 128 neurons. Additionally, these two FC layers use LeakyReLU as an activation function. The output of the last FC layer described above is then fed to a final fully-connected layer with 6 outputs, corresponding to the number of class labels.

### Output labels dictionary

* '0': 'a_Good'
* '1': 'b_Moderate'
* '2': 'c_Unhealthy_for_Sensitive_Groups'
* '3': 'd_Unhealthy'
* '4': 'e_Very_Unhealthy'
* '5': 'f_Severe'

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-4
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam
- num_epochs: 15

### Training results

| Epoch | Training Loss | Training Accuracy | Validation Loss | Validation Accuracy |
|-------|---------------|-------------------|-----------------|---------------------|
| 1     | 1.6998        | 0.2595            | 1.5246          | 0.3568              |
| 2     | 1.4923        | 0.3758            | 1.4040          | 0.4375              |
| 3     | 1.3921        | 0.4374            | 1.2898          | 0.4911              |
| 4     | 1.2737        | 0.5020            | 1.1851          | 0.5232              |
| 5     | 1.1706        | 0.5424            | 1.1138          | 0.5738              |
| 6     | 1.0749        | 0.5842            | 1.0104          | 0.6182              |
| 7     | 0.9780        | 0.6256            | 0.9365          | 0.6452              |
| 8     | 0.8919        | 0.6637            | 0.8426          | 0.6998              |
| 9     | 0.8184        | 0.7034            | 0.8146          | 0.7029              |
| 10    | 0.7486        | 0.7286            | 0.7454          | 0.7494              |
| 11    | 0.6851        | 0.7560            | 0.6980          | 0.7560              |
| 12    | 0.6305        | 0.7759            | 0.6384          | 0.7744              |
| 13    | 0.5859        | 0.7933            | 0.5911          | 0.7922              |
| 14    | 0.5358        | 0.8141            | 0.5786          | 0.7963              |
| 15    | 0.4971        | 0.8270            | 0.5441          | 0.8142              |
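### Architecture sketch

The quoted architecture description can be sketched in PyTorch roughly as below. Note that the excerpt does not specify the conv kernel sizes, the pooling stride, or the filter counts in blocks 3-5: the 3x3 convs with padding 1, stride-2 pooling, and 128 filters used here are assumptions chosen so the residual additions are shape-compatible, and this sketch is not the repo's actual `CustomCNN` class.

```python
# A minimal sketch of the described architecture, NOT the repo's CustomCNN.
# Assumed (not stated in the quote): 3x3 convs with padding 1, max-pool
# stride 2, and 128 filters in blocks 3-5 so the skip additions line up.
import torch
import torch.nn as nn


class AirPollutionCNNSketch(nn.Module):
    def __init__(self, num_classes: int = 6):
        super().__init__()
        act = nn.LeakyReLU()
        pool = lambda: nn.MaxPool2d(kernel_size=3, stride=2)  # 3x3 pool per the quote
        # Block 1: two 64-filter conv layers + one max-pool
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), act,
            nn.Conv2d(64, 64, 3, padding=1), act,
            pool(),
        )
        # Block 2: two 128-filter conv layers + one max-pool
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 128, 3, padding=1), act,
            nn.Conv2d(128, 128, 3, padding=1), act,
            pool(),
        )
        # Blocks 3-5: a single conv whose output is added to the previous
        # max-pool output (modified residual), then max-pooled
        self.res_convs = nn.ModuleList(
            nn.Sequential(nn.Conv2d(128, 128, 3, padding=1), act) for _ in range(3)
        )
        self.res_pools = nn.ModuleList(pool() for _ in range(3))
        # Flatten -> FC 256 -> FC 128 (both LeakyReLU) -> FC num_classes
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(256), nn.LeakyReLU(),  # LazyLinear infers the flattened size
            nn.Linear(256, 128), nn.LeakyReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        x = self.block2(self.block1(x))
        for conv, p in zip(self.res_convs, self.res_pools):
            x = p(conv(x) + x)  # residual addition before pooling
        return self.classifier(x)


model = AirPollutionCNNSketch()
logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 6])
```

A `(224 x 224 x 3)` input yields 6 logits, one per label in the dictionary above.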