---
datasets:
- google/speech_commands
pipeline_tag: audio-classification
tags:
- arXiv:1611.02361
---

# DS-CNN

DS-CNN (depthwise-separable CNN) keyword-spotting model from the MLCommons Tiny benchmark repository: https://github.com/mlcommons/tiny/tree/master/benchmark/training/keyword_spotting

The ONNX version was exported from the `.pb` model with the following commands:

```bash
# setup environment
python3.10 -m venv pytf
source pytf/bin/activate

# install latest officially compatible versions
pip install tensorflow==2.15.1 tf2onnx==1.16.1

# use most recent opset officially supported
python -m tf2onnx.convert \
  --saved-model <path/to/dir> \
  --output converted_ds_cnn.onnx --opset 18
```

The input format of this exported version is NHWC.
The following Python code fuses MatMul+Add into Gemm, folds the first Reshape operator, and saves the result as `ds_cnn.onnx`, which takes NCHW input.

```python
import onnx

import aidge_core as ai
import aidge_onnx

# Load the tf2onnx export and clean it with the NHWC input shape fixed
model_onnx = onnx.load_model("converted_ds_cnn.onnx")
model_onnx_clean_nhwc = aidge_onnx.onnx_cleaner.clean_onnx(
    model_onnx, {"input_1": [[1, 49, 10, 1]]}, "test_clean", opset_version=18
)

# Import the cleaned ONNX graph into Aidge
model = aidge_onnx.convert_onnx_to_aidge(model_onnx_clean_nhwc)

# Nodes to drop from the graph (they are replaced by an empty set below)
to_replace: set[ai.Node] = {
    model.get_node("StatefulPartitionedCall_functional_1_conv2d_BiasAdd__6"),
    model.get_node("new_shape__103_out0"),
}

model.replace(to_replace, set())
model.set_mandatory_inputs_first()

# Propagate dimensions using the new NCHW input shape
model.forward_dims(dims=[[1, 1, 49, 10]], allow_data_dependency=True)

# Export the Aidge graph back to ONNX
model_onnx_clean_nchw = aidge_onnx.convert_aidge_to_onnx(model, "ds_cnn", opset=18)
onnx.save_model(model_onnx_clean_nchw, "ds_cnn.onnx")
```
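
One way to sanity-check the conversion is to run both ONNX files on the same data with ONNX Runtime and compare the outputs. The sketch below is not part of the original export script. It assumes that `converted_ds_cnn.onnx` and `ds_cnn.onnx` are in the working directory, that each model has a single input and that its first output holds the class scores, and that transposing an NHWC tensor to NCHW gives the matching input for the converted model. The helper names (`make_paired_inputs`, `run_onnx`, `compare_models`) are hypothetical.

```python
import os

import numpy as np


def make_paired_inputs(seed=0):
    """Return the same random tensor in NHWC and NCHW layouts."""
    rng = np.random.default_rng(seed)
    x_nhwc = rng.standard_normal((1, 49, 10, 1)).astype(np.float32)
    # NHWC -> NCHW: move the channel axis right after the batch axis
    x_nchw = np.ascontiguousarray(np.transpose(x_nhwc, (0, 3, 1, 2)))
    return x_nhwc, x_nchw


def run_onnx(path, x):
    """Run one inference on CPU and return the first output (assumed: class scores)."""
    import onnxruntime as ort  # imported lazily so the helpers above work without it

    session = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: x})[0]


def compare_models(nhwc_path="converted_ds_cnn.onnx", nchw_path="ds_cnn.onnx"):
    """Return the largest absolute difference between the two models' outputs."""
    x_nhwc, x_nchw = make_paired_inputs()
    out_nhwc = run_onnx(nhwc_path, x_nhwc)
    out_nchw = run_onnx(nchw_path, x_nchw)
    return float(np.max(np.abs(out_nhwc - out_nchw)))


if __name__ == "__main__":
    # Only run the comparison when both files are actually present
    if os.path.exists("converted_ds_cnn.onnx") and os.path.exists("ds_cnn.onnx"):
        print("max abs difference:", compare_models())
```

If the conversion preserved the computation, the reported difference should be close to zero, up to floating-point rounding.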

## Aidge support

> Note: We tested this network for the following features. If you encounter any error, please open an [issue](https://gitlab.eclipse.org/groups/eclipse/aidge/-/issues). Features not tested in CI may not be functional.

|   Feature    | Tested in CI |
| :----------: | :----------: |
| ONNX import  |      ✔️      |
| Runtime CPU  |      ✔️      |
| Runtime CUDA |      ✔️      |
| Export CPU   |      ✔️      |


## Model

* operators: 43 (9 types)
  - AvgPooling2D: 1
  - Conv2D: 4
  - FC: 1
  - PaddedConv2D: 1
  - PaddedConvDepthWise2D: 4
  - Producer: 21
  - ReLU: 9
  - Reshape: 1
  - Softmax: 1

## Google Speech Commands v2

* Opset: 18
* Source: Google
* **Input**
    * size: [N, 1, 49, 10]
    * format: [N, C, H, W]
    * preprocessing:
      * ?
* **Output**
    * size: [N, 12]
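
The `[N, 12]` output can be reduced to the most likely classes per sample. The following is a minimal sketch that assumes the output rows are the Softmax scores listed in the operator summary; `top_predictions` is a hypothetical helper, and the mapping from class index to keyword is not given here, so only indices are returned.

```python
import numpy as np


def top_predictions(scores, k=3):
    """Return the indices and scores of the k highest-scoring classes per row."""
    scores = np.asarray(scores, dtype=np.float32)
    if scores.ndim != 2 or scores.shape[1] != 12:
        raise ValueError(f"expected shape [N, 12], got {scores.shape}")
    # Sort each row by descending score and keep the first k columns
    order = np.argsort(-scores, axis=1)[:, :k]
    best = np.take_along_axis(scores, order, axis=1)
    return order, best


# Made-up scores for one sample, used only to illustrate the call
fake_scores = np.full((1, 12), 0.05, dtype=np.float32)
fake_scores[0, 7] = 0.45
indices, values = top_predictions(fake_scores)
print(indices[0, 0], values[0, 0])
```

With real data, `scores` would be the array returned by the runtime for a batch of `[N, 1, 49, 10]` inputs.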