CREST is a deployment-realistic hardware-in-the-loop (HIL) neural architecture search framework for embedded sensing systems on resource-constrained microcontrollers. It keeps the optimizer, HIL measurement boundary, logging, and replay workflow fixed while users vary workload, model family, target backend, runtime schedule, quantization mode, and selection policy.
CREST currently includes built-in support for OxIOD inertial odometry and UrbanSound8K prepared-feature audio classification, with target backends for Arduino Nano 33 BLE Sense, Arduino Portenta H7, and STM32 NUCLEO-N657X0-Q.
- Configurable studies that bind dataset, task, model family, target backend, runtime mode, quantization, and scoring policy through YAML.
- Hardware-in-the-loop measurement paths for deployment metrics such as memory, latency, energy, cadence telemetry, and status.
- Continuous-inference and cadenced sensing-window runtime modes.
- Search policies for scalar scoring, multi-objective Pareto exploration, pruning, and feasibility constraints.
- Replay utilities for remeasuring selected candidates across targets, schedules, and policies.
- Analysis scripts for reproducing publication-style plots and claim calculations from existing NAS and replay artifacts.
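The multi-objective policy listed above rests on Pareto dominance. The following is a minimal sketch of that idea, assuming every objective is minimized and that candidates are plain dictionaries. The metric names (`rmse`, `energy_mj`), the values, and the function names are hypothetical; they are not CREST's metric schema or search code.

```python
from typing import Dict, List, Sequence


def dominates(a: Dict[str, float], b: Dict[str, float], keys: Sequence[str]) -> bool:
    """True if `a` is no worse than `b` on every key and strictly better on one."""
    no_worse = all(a[k] <= b[k] for k in keys)
    strictly_better = any(a[k] < b[k] for k in keys)
    return no_worse and strictly_better


def pareto_front(candidates: List[Dict[str, float]], keys: Sequence[str]) -> List[Dict[str, float]]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c, keys) for other in candidates if other is not c)
    ]


# Hypothetical trial results; lower is better for both metrics.
trials = [
    {"trial": 0, "rmse": 0.30, "energy_mj": 5.0},
    {"trial": 1, "rmse": 0.25, "energy_mj": 7.0},
    {"trial": 2, "rmse": 0.35, "energy_mj": 6.0},  # worse than trial 0 on both metrics
]
front = pareto_front(trials, keys=("rmse", "energy_mj"))
print([t["trial"] for t in front])
```

With these made-up values, trial 2 is dominated by trial 0, while trials 0 and 1 trade accuracy against energy and both stay on the front.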
CREST is designed for controlled comparisons. A study can change the board, runtime schedule, workload, or policy while keeping the search loop, metric schema, HIL path, and replay machinery consistent. That makes it possible to ask whether a candidate was selected because of the deployment condition being tested rather than because a different script or measurement path was used.
CREST separates workload semantics from target-specific deployment mechanics. The NAS client samples and evaluates candidates, while the HIL server prepares target-specific artifacts, invokes the selected backend, and returns normalized metrics to the same scoring and logging path.
For board-level power measurement, the device under test is powered through an INA228 monitor. A separate harness microcontroller reads INA228 telemetry over I2C while observing DUT GPIO markers that delimit the measurement window. Continuous runs mark the repeated inference interval. Cadenced runs mark the scheduled window that includes active inference and the following sleep or wait interval.
The HIL harness lets CREST attach each energy value to a concrete candidate, runtime schedule, and trial outcome without oscilloscope inspection or manual trace segmentation.
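The marker-delimited window can be illustrated offline. This is a minimal sketch, assuming the harness yields timestamped power samples plus start and end marker timestamps. The sample format and the function name are hypothetical, and this is not CREST's harness code. It integrates power over the marked window with the trapezoidal rule.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (time in seconds, power in watts)


def window_energy_joules(samples: List[Sample], t_start: float, t_end: float) -> float:
    """Trapezoidal integral of power over the samples inside [t_start, t_end].

    Assumes samples are sorted by time. Partial intervals at the window
    edges are ignored, so markers are assumed to align with sample times.
    """
    inside = [(t, p) for t, p in samples if t_start <= t <= t_end]
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(inside, inside[1:]):
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy


# Hypothetical trace: the markers bracket the interval 0.25 s to 0.75 s.
trace = [(0.0, 0.1), (0.25, 0.5), (0.5, 0.5), (0.75, 0.25), (1.0, 0.1)]
print(window_energy_joules(trace, t_start=0.25, t_end=0.75))
```

For a cadenced run, the same integral would simply span a longer window that also covers the sleep or wait interval.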
Repo-owned CREST code, documentation, configuration, and tests are licensed under the BSD 3-Clause License unless otherwise noted. See LICENSE for the full license text.
Important exceptions and separately governed materials:
- STM32/ST/CMSIS material under sketches/stm32/crest_stm32_lrun/ remains subject to the upstream STMicroelectronics, CMSIS, and related license terms documented in that directory.
- STM32 vendor-derived and tool-generated build recipe files in the LRUN workspace are not relicensed by CREST headers.
- The Arduino TensorFlow Lite Micro dependency tracked as the tools/arduino-user/libraries/Arduino_TensorFlowLite submodule remains governed by its upstream Apache-2.0 license and is not relicensed by CREST.
- OxIOD split metadata and upstream OxIOD readme snippets are dataset-related materials; raw OxIOD data is downloaded separately and is not stored in this repository.
- UrbanSound8K downloaded audio and generated feature caches are governed by the UrbanSound8K dataset terms and are not stored in this repository.
- Training only. Hardware requirement: no hardware required. Use this when you want to run NAS or training without talking to hardware. For OxIOD, start from src/config/nas_config_flops_rmse.yaml or src/config/nas_config_memory_proxy.yaml. For UrbanSound8K audio DS-CNN training, use src/config/nas_config_audio_stm32.yaml as a template with device.hil: false and score/prune terms that do not require measured hardware metrics. Read src/config/README.md plus src/README.md before adapting HIL configs for hardware-free runs.
- Arduino HIL. Hardware requirement: development board required; HIL harness required for harness-assisted or harness-only measurement. Use this for Arduino CLI-backed DUTs and harness-backed measurement flows. Start from src/config/nas_config_ble.yaml for Nano 33 BLE or src/config/nas_config_portenta.yaml for Portenta H7. For the audio DS-CNN path, use src/config/nas_config_audio_portenta.yaml. Then read src/crest/microcontrollers/README.md and sketches/README.md.
- STM32 HIL. Hardware requirement: NUCLEO-N657X0-Q board required; HIL harness required for energy-measured runs. Use this for the STM32 NUCLEO-N657X0-Q backend. Start from src/config/nas_config_stm32.yaml, then use src/config/nas_config_audio_stm32.yaml for the audio DS-CNN HIL path. Then read src/crest/microcontrollers/README.md and the committed STM32 workspace notes under sketches/stm32/crest_stm32_lrun/README.md.
- Analysis scripts / one-off experiments. Hardware requirement: script-dependent. Use this for focused measurement or validation runs outside the main NAS loop. Most analysis scripts consume existing artifacts and do not touch hardware; the micro-workload probe uses the CREST HIL harness. Start with analysis_scripts/README.md, then open the package-specific README for the script family you need.
- Clone with submodules.
git clone --recurse-submodules https://github.com/nesl/crest.git
If you already cloned without submodules:
git submodule update --init --recursive
- Create and activate the Conda environment.
conda env create -f environment.yml -n crest
conda activate crest
- Install the repo in editable mode.
make install
- If you are using GPUs, install the repo-tested TensorFlow CUDA wheel set.
pip install --upgrade pip
pip install tensorflow[and-cuda]==2.20.0
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Re-run this step after recreating or replacing the Conda environment. The base environment.yml installs CPU-usable TensorFlow so CPU-only machines do not download CUDA runtime wheels by default; GPU servers need the tensorflow[and-cuda] extra installed after environment creation. If nvidia-smi sees GPUs but TensorFlow prints [], this CUDA wheel step is usually missing.
If you are CPU-only, the Conda environment already provides the dependencies needed by the repo.
For UrbanSound8K audio experiments, prepare the cached log-mel tensors before running audio training or HIL commands:
make prepare-audio-dataset
Use URBANSOUND8K_ARGS="--download --accept-license" when you want the
preparation script to download the dataset through soundata, or
URBANSOUND8K_ARGS="--fold-rotation" when you also need the fold-rotation
reporting caches.
For OxIOD:
- Download the OxIOD "Complete Dataset" zip from http://deepio.cs.ox.ac.uk/.
- Rename it to OxIOD.zip or pass an explicit path.
- Prepare the dataset from the repo root:
make prepare-dataset
# or: make prepare-dataset OXIOD_ZIP=/path/to/OxIOD.zip
This extracts the dataset into data/oxiod, normalizes folder names such as
slow walking -> slow_walking, and restores the curated tracked split files
for each activity. Dataset-specific details live in
data/dataset_download_and_splits/README.md.
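The folder-name normalization mentioned above amounts to replacing spaces with underscores. The following is a minimal sketch, assuming that rule is the whole story. The helper name is hypothetical, and the actual preparation script may handle more cases.

```python
def normalize_activity_name(name: str) -> str:
    """Join whitespace-separated words with single underscores."""
    return "_".join(name.split())


print(normalize_activity_name("slow walking"))
```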
All Arduino CLI state is kept inside tools/ so the repo does not need to
write into $HOME or system directories.
- Ensure crest is active.
- Bootstrap Arduino CLI and repo-local hooks:
make arduino-setup
- Reactivate the environment so the new hooks are loaded:
conda deactivate
conda activate crest
- Verify the CLI:
arduino-cli --config-file tools/arduino-cli.yaml version
- Install the board package you need. Example for Nano 33 BLE:
arduino-cli core install arduino:mbed_nano --config-file tools/arduino-cli.yaml
If Portenta uploads on Linux fail with LIBUSB_ERROR_ACCESS, add the udev
rules documented in
src/crest/microcontrollers/README.md.
The STM32 flow keeps STM32CubeCLT installed outside the repo while cloning
the STM32CubeN6 firmware package into tools/stm32/STM32CubeN6.
Run STM32 bootstrap only on the machine that is physically connected to the
STM32 board and will run python src/hil_server.py.
Before running the bootstrap, ensure these tools are on your shell PATH:
- ST-LINK_gdbserver
- arm-none-eabi-gdb
- STM32_Programmer_CLI
- arm-none-eabi-gcc
- arm-none-eabi-size
- arm-none-eabi-objdump
- STM32_SigningTool_CLI or STM32TrustedPackageCreator_CLI
For normal STM32 NAS/HIL candidate generation, also ensure stedgeai is on
PATH. The synthetic micro-workload probe documents its narrower STM32
requirements separately.
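A quick pre-check can confirm that the tools named above resolve on PATH before the bootstrap runs. This is a minimal sketch using the standard-library `shutil.which`. The `missing_tools` helper and the grouping of alternatives are illustrative assumptions; they do not reproduce the validation that the bootstrap script performs.

```python
import shutil
from typing import List

# Tool names taken from the list above; the signing entry accepts either CLI.
REQUIRED_TOOLS = [
    ("ST-LINK_gdbserver",),
    ("arm-none-eabi-gdb",),
    ("STM32_Programmer_CLI",),
    ("arm-none-eabi-gcc",),
    ("arm-none-eabi-size",),
    ("arm-none-eabi-objdump",),
    ("STM32_SigningTool_CLI", "STM32TrustedPackageCreator_CLI"),
    ("stedgeai",),  # needed for normal STM32 NAS/HIL candidate generation
]


def missing_tools(groups=REQUIRED_TOOLS, which=shutil.which) -> List[str]:
    """Return a label for each group where no alternative is found on PATH."""
    missing = []
    for alternatives in groups:
        if not any(which(name) for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing


if __name__ == "__main__":
    for label in missing_tools():
        print(f"not on PATH: {label}")
```

The `which` parameter exists only so the lookup can be swapped out in tests; on a real host the default is enough.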
Then run:
make stm32-setup
That script validates the toolchain, clones or repairs
tools/stm32/STM32CubeN6, checks out the pinned v1.3.0 baseline, and
refreshes the repo-local STM32 vendor subsets.
The shipped starting points are:
- src/config/nas_config_stm32.yaml STM32-oriented config for the STM32_NUCLEO_N657X0_Q backend. This is the main starting point for STM32 runs and the most complete commented example config in the repo.
- src/config/nas_config_ble.yaml BLE-focused starting point for ARDUINO_NANO_33_BLE_SENSE.
- src/config/nas_config_portenta.yaml Portenta H7-focused starting point.
- src/config/nas_config_audio_stm32.yaml UrbanSound8K audio DS-CNN starting point for desktop training and NUCLEO-N657X0-Q HIL work.
- src/config/nas_config_audio_portenta.yaml UrbanSound8K audio DS-CNN starting point for Arduino-backed Portenta H7 work.
The highest-signal fields for a first pass are:
- device.*: Target selection, HIL enable/disable, serial ports, runtime mode, and backend-specific nested options.
- dataset.*: Dataset adapter selection and dataset-local paths/parameters, including OxIOD windowing or UrbanSound8K cache locations.
- training.*: NAS epochs/trials, full-training epochs, quantization, and the runtime-side energy_aware/input_mode switches.
- nas.*: Score and prune configuration.
- dataset, task, model: Modular component selection blocks when you want to override the built-in defaults explicitly.
For the full config reference, score/prune schema, and current runtime caveats, see src/config/README.md.
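Dotted names such as device.hil refer to nested keys in the study YAML. As a minimal sketch, assuming the parsed config is a nested mapping, a dotted lookup could look like the following. The `get_dotted` helper, the example values, and the choice to treat a missing device.hil as "hardware in use" are all illustrative assumptions, not CREST's config loader.

```python
from collections.abc import Mapping
from typing import Any


def get_dotted(config: Mapping, dotted_key: str, default: Any = None) -> Any:
    """Resolve a dotted name such as 'device.hil' against nested mappings."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical fragment, shaped like the result of parsing a study YAML file.
config = {
    "device": {"hil": False},
    "outputs": {"models_dir": "models", "candidate_dir": "candidates"},
}

# Assumption: a missing device.hil is treated as "hardware in use".
hardware_free = get_dotted(config, "device.hil", default=True) is False
print(hardware_free)
```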
CREST runs the NAS/training client on a training host. The client talks to the HIL server, which runs on the host that is physically connected to the board.
cd /path/to/CREST
conda activate crest
python src/hil_server.py
For STM32, complete the STM32 setup section above on the board-connected host before starting the HIL server.
ssh -R "6001:127.0.0.1:6001" <gpu_server>
The default configs expect the HIL server at 127.0.0.1:6001. If the NAS
client and HIL server run on the same machine, the reverse SSH tunnel is not
needed.
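Before launching the NAS client, it can help to confirm that something accepts TCP connections at the expected address. This is a minimal sketch using the standard-library `socket` module. The `hil_server_reachable` helper is hypothetical and not part of CREST, and the sketch assumes the HIL server speaks TCP on the default 127.0.0.1:6001.

```python
import socket


def hil_server_reachable(host: str = "127.0.0.1", port: int = 6001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("HIL server reachable:", hil_server_reachable())
```

A successful connection only shows that a listener exists at that address. When the reverse SSH tunnel is in use, the tunnel endpoint may accept the connection even if the HIL server behind it is not running.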
cd /path/to/CREST
conda activate crest
# Quick smoke pass
python3 src/nas_model_client.py --smoke-test 3 --study-name smoke_run
# Full NAS run
python3 src/nas_model_client.py --study-name crest_run
Useful flags:
- --config /path/to/config.yaml
- --smoke-test N
- --study-name NAME
Artifacts are written under the configured outputs.models_dir and
outputs.candidate_dir. Typical outputs include:
- models/<study_name>/optuna.db
- models/<study_name>/trials.csv
- models/<study_name>/train_history.json
- models/<study_name>/summary.json
- generated TFLite and .keras artifacts
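After a run, a quick check can show which of the typical outputs are present. This is a minimal sketch, assuming the layout shown above with models as the configured outputs.models_dir. The `study_artifacts` helper is hypothetical and not part of CREST.

```python
from pathlib import Path
from typing import Dict

# File names taken from the list of typical outputs above.
EXPECTED_FILES = ("optuna.db", "trials.csv", "train_history.json", "summary.json")


def study_artifacts(models_dir: str, study_name: str) -> Dict[str, bool]:
    """Map each expected artifact name to whether it exists for the study."""
    study_dir = Path(models_dir) / study_name
    return {name: (study_dir / name).is_file() for name in EXPECTED_FILES}


if __name__ == "__main__":
    # "smoke_run" matches the study name used in the smoke-test command.
    for name, present in study_artifacts("models", "smoke_run").items():
        print(f"{name}: {'found' if present else 'missing'}")
```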
Case-study run configs live in src/config/case_study_configs/. They cover:
- Case Study 1: OxIOD/TCN proxy-vs-measured-energy selection across targets.
- Case Study 2: STM32/OxIOD continuous-vs-cadenced schedule comparison.
- Case Study 3: UrbanSound8K/DS-CNN application-level scoring on two targets.
Plotting and calculation utilities live under analysis_scripts/. Those scripts consume existing NAS and replay artifacts; they do not rerun NAS or touch hardware unless their package README says so.
- src/README.md Source architecture, shared abstractions, trial logging, replay, and extension paths.
- src/config/README.md Full config reference and scoring/pruning semantics.
- src/config/case_study_configs/README.md Case-study config index.
- src/crest/datasets/README.md Dataset adapter contributor guide.
- src/crest/tasks/README.md Task adapter contributor guide.
- src/crest/microcontrollers/README.md Backend contracts, bring-up, staging, compile, upload, and runtime flows.
- src/crest/model_families/README.md Model-family extension guide.
- sketches/README.md Shared Arduino sketch and STM32 workspace layout.
- analysis_scripts/README.md Analysis and validation utilities.
- data/dataset_download_and_splits/README.md OxIOD preparation plus UrbanSound8K audio cache preparation.
- If training-only runs should not touch hardware, start from a desktop-safe config or set device.hil: false and remove score/prune terms that require measured hardware metrics.
- If Arduino uploads fail on Linux with LIBUSB_ERROR_ACCESS, apply the udev rules documented in the MCU README.
- If STM32 bootstrap, HIL startup, or candidate generation fails, re-check the STM32 setup section above, then consult the MCU README for backend-specific diagnostics.
- If OxIOD preparation fails, confirm the zip exists and that the repo still contains the tracked split templates under data/oxiod/<activity>/.
- If audio runs fail while loading data, run make prepare-audio-dataset and confirm the UrbanSound8K cache path in the selected config exists.

