---
datasets:
- google/speech_commands
pipeline_tag: image-classification
tags:
- arXiv:1611.02361
---

# DS-CNN

DS-CNN model from the MLCommons repository: https://github.com/mlcommons/tiny/tree/master/benchmark/training/keyword_spotting

The ONNX version was exported from the `.pb` model as follows:

```bash
# setup environment
python3.10 -m venv pytf
source pytf/bin/activate

# install latest officially compatible versions
pip install tensorflow==2.15.1 tf2onnx==1.16.1

# use most recent opset officially supported
python -m tf2onnx.convert \
    --saved-model \
    --output converted_ds_cnn.onnx --opset 18
```

The input format of this exported version is NHWC.

The following Python code fuses MatMul+Add into Gemm and folds the first Reshape operator.

```python
import onnx
import onnxruntime
import aidge_core as ai
import aidge_onnx

# Load the model produced by tf2onnx (NHWC input).
model_onnx = onnx.load_model("converted_ds_cnn.onnx")

# Clean the ONNX graph with a fixed input shape of [1, 49, 10, 1].
model_onnx_clean_nhwc = aidge_onnx.onnx_cleaner.clean_onnx(
    model_onnx, {"input_1": [[1, 49, 10, 1]]}, "test_clean", opset_version=18
)

# Import the cleaned graph into Aidge.
model = aidge_onnx.convert_onnx_to_aidge(model_onnx_clean_nhwc)

# Remove the two nodes below by replacing them with an empty set.
to_replace: set[ai.Node] = set(
    [
        model.get_node("StatefulPartitionedCall_functional_1_conv2d_BiasAdd__6"),
        model.get_node("new_shape__103_out0"),
    ]
)
model.replace(to_replace, set())
model.set_mandatory_inputs_first()

# Propagate dimensions using the [1, 1, 49, 10] input shape.
model.forward_dims(dims=[[1, 1, 49, 10]], allow_data_dependency=True)

# Export the resulting graph back to ONNX.
model_onnx_clean_nchw = aidge_onnx.convert_aidge_to_onnx(model, "ds_cnn", opset=18)
onnx.save_model(model_onnx_clean_nchw, "ds_cnn.onnx")
```

## Aidge support

> Note: We tested this network for the following features. If you encounter any error, please open an [issue](https://gitlab.eclipse.org/groups/eclipse/aidge/-/issues). Features not tested in CI may not be functional.


| Feature | Tested in CI |
| :----------: | :----------: |
| ONNX import | ✔️ |
| Runtime CPU | ✔️ |
| Runtime CUDA | ✔️ |
| Export CPU | ✔️ |

## Model

* operators: 43 (9 types)
  - AvgPooling2D: 1
  - Conv2D: 4
  - FC: 1
  - PaddedConv2D: 1
  - PaddedConvDepthWise2D: 4
  - Producer: 21
  - ReLU: 9
  - Reshape: 1
  - Softmax: 1

## Google Speech Commands v2

* Opset: 18
* Source: Google
* **Input**
  * size: [N, 1, 49, 10]
  * format: [N, C, H, W]
  * preprocessing:
    * ?
* **Output**
  * size: [N, 12]
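
The input and output shapes listed above can be illustrated with a small NumPy sketch. It reorders a batch from the NHWC layout of the tf2onnx export (`[N, 49, 10, 1]`) to the NCHW layout expected by `ds_cnn.onnx` (`[N, 1, 49, 10]`), and picks the highest-scoring index from a `[N, 12]` output. The helper names `nhwc_to_nchw` and `top1` are hypothetical, the `float32` dtype is an assumption, and random values stand in for real preprocessed features, since the preprocessing is not documented here.

```python
import numpy as np

# Shapes taken from this model card: the tf2onnx export expects NHWC
# [N, 49, 10, 1]; the final ds_cnn.onnx expects NCHW [N, 1, 49, 10].
NUM_CLASSES = 12


def nhwc_to_nchw(batch: np.ndarray) -> np.ndarray:
    """Reorder a [N, H, W, C] batch to [N, C, H, W] as contiguous float32.

    The float32 dtype is an assumption, not stated in the model card.
    """
    if batch.ndim != 4:
        raise ValueError(f"expected a 4-D batch, got shape {batch.shape}")
    return np.ascontiguousarray(batch.transpose(0, 3, 1, 2), dtype=np.float32)


def top1(scores: np.ndarray) -> np.ndarray:
    """Return the index of the highest score for each row of a [N, 12] output."""
    if scores.ndim != 2 or scores.shape[1] != NUM_CLASSES:
        raise ValueError(f"expected shape [N, {NUM_CLASSES}], got {scores.shape}")
    return scores.argmax(axis=1)


# Hypothetical data: random values stand in for real preprocessed features.
rng = np.random.default_rng(0)
features_nhwc = rng.standard_normal((2, 49, 10, 1))
features_nchw = nhwc_to_nchw(features_nhwc)
print(features_nchw.shape)  # (2, 1, 49, 10)

# Hypothetical scores standing in for the model output.
fake_scores = np.eye(NUM_CLASSES, dtype=np.float32)[[3, 7]]
print(top1(fake_scores))  # [3 7]
```

An array produced by `nhwc_to_nchw` has the shape listed under **Input**, so it could be passed to whichever runtime loads `ds_cnn.onnx`. The mapping from output index to keyword is not given in this card, so `top1` returns indices only.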