ESP32-S3 XIAO Sense — OV2640 / OV5640 firmware
CameraWebServer_for_esp-arduino_3.0.x — Project-specific notes
This is the firmware half of the ESP32-S3 + OV2640/OV5640 image pipeline, one of three hardware pipelines in the PhotogrammetricWAAM stack. See the parent overview for the bigger picture and how this pipeline relates to the IMX708 (RPi) and DSLR (gphoto2) pipelines.
Upstream `README.md` is preserved as-is: it is just an Espressif/Seeed compatibility note. This doc captures the project-specific divergence: our DEVICE_ID convention, MQTT telemetry, port layout, and the open role-switching work.
Hardware
- MCU board: Seeed Studio XIAO ESP32-S3 Sense (PSRAM-equipped, required).
- Sensor: OV2640 (default board) or OV5640 (5 MP daughter sensor). Both work with the same sketch; only the `frame_size` upper bound differs.
- Network: WiFi (most boards) or wired ETH via an external PHY (the “TSO” boards; see `esp32s3_eth.tmuxp.yml`).
- PSRAM is required for any `frame_size` above SVGA; the firmware refuses high-res mode if `psramFound()` is false.
Per-board identity — the DEVICE_ID convention
Every flashed board is uniquely identified by one preprocessor define near the top of the sketch:

    #define DEVICE_ID 143

That single number drives everything else on the network:
| Derived value | Pattern | Example for DEVICE_ID=143 |
|---|---|---|
| Static IP address | 172.31.1.<DEVICE_ID> | 172.31.1.143 |
| MQTT log topic | esp32s3/<DEVICE_ID>/log | esp32s3/143/log |
| MQTT temperature topic | esp32s3/<DEVICE_ID>/temp | esp32s3/143/temp |
| MQTT RSSI topic | esp32s3/<DEVICE_ID>/rssi | esp32s3/143/rssi |
| MQTT client id | esp32s3-<DEVICE_ID> | esp32s3-143 |
To deploy a new board: change just that one number, reflash, done.
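Because every derived value hangs off that one number, a compile-time guard can catch a bad DEVICE_ID before it reaches a board. The following is a minimal sketch, not taken from the firmware: it assumes the ID doubles as the last IPv4 octet (per the table above) and should not collide with the broker at 172.31.1.252. The names `kStaticIp` and `ipToString` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

#ifndef DEVICE_ID
#define DEVICE_ID 143  // same define the sketch uses; 143 is the doc's example value
#endif

// Assumption: the ID is used as a host octet, so 0 and 255 are not usable,
// and 252 is already taken by the MQTT broker on this subnet.
static_assert(DEVICE_ID >= 1 && DEVICE_ID <= 254, "DEVICE_ID must be a valid host octet");
static_assert(DEVICE_ID != 252, "DEVICE_ID collides with the MQTT broker at 172.31.1.252");

// Static address 172.31.1.<DEVICE_ID>, built at compile time.
constexpr std::array<std::uint8_t, 4> kStaticIp = {{172, 31, 1, DEVICE_ID}};

// Formats the four octets in dotted-decimal form (hypothetical helper).
std::string ipToString(const std::array<std::uint8_t, 4>& ip) {
    std::string out;
    for (std::size_t i = 0; i < ip.size(); ++i) {
        if (i != 0) out += '.';
        out += std::to_string(static_cast<unsigned>(ip[i]));
    }
    return out;
}
```

On the board itself the same four octets would presumably be handed to an Arduino `IPAddress`; the guard works the same either way because it runs in the preprocessor and compiler, not on the device.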
A small preprocessor trick concatenates DEVICE_ID into MQTT topic strings
at compile time:

    #define _STRINGIFY(x) #x
    #define _TOSTRING(x) _STRINGIFY(x)
    #define DEVICE_ID_STR _TOSTRING(DEVICE_ID)

    const char *mqtt_topic = "esp32s3/" DEVICE_ID_STR "/log";

Network endpoints
A flashed board exposes two HTTP servers and one MQTT client:
| Endpoint | Port | Server | Purpose |
|---|---|---|---|
| http://172.31.1.<id>:81/stream | 81 | esp_camera httpd | MJPG stream (the one consumed by image_publisher_node) |
| http://172.31.1.<id>:81/ | 81 | esp_camera httpd | Espressif’s stock HTML control UI (sliders for resolution, quality, etc.) |
| http://172.31.1.<id>:8080/update | 8080 | WebServer + ElegantOTA | OTA firmware update |
| mqtt://172.31.1.252:1883 | 1883 | PubSubClient (outbound) | Telemetry publishing (one-way today) |

`:81/stream` is the canonical Espressif port, different from the IMX708 streamer’s `:8000/stream`. Wire that into all `image_publisher_node` filenames accordingly.
MQTT telemetry topics
The firmware publishes (one-way, no callback) at fixed intervals:
| Topic | Interval | Payload |
|---|---|---|
| esp32s3/<DEVICE_ID>/log | 1 s | A monotonic counter (sanity / liveness check) |
| esp32s3/<DEVICE_ID>/temp | 5 s | Internal core temperature in °C (temperatureRead(), accurate to ±5–10 °C) |
| esp32s3/<DEVICE_ID>/rssi | 5 s | WiFi RSSI in dBm (negative; closer to 0 == stronger) |
Reconnect uses non-blocking 2-second backoff so a missing broker never stalls the camera/HTTP loop.
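The non-blocking pattern can be sketched as a small time gate that the main loop polls instead of sleeping. This is an illustration only, not the firmware's code: `IntervalGate` is a hypothetical name, and the timestamp is assumed to come from something like Arduino's `millis()`, which returns a 32-bit millisecond counter that wraps.

```cpp
#include <cstdint>

// Polled from loop(): returns true at most once per interval and never blocks.
class IntervalGate {
public:
    explicit IntervalGate(std::uint32_t intervalMs)
        : intervalMs_(intervalMs), lastMs_(0), started_(false) {}

    // nowMs is the current tick count in milliseconds (e.g. from millis()).
    bool due(std::uint32_t nowMs) {
        // Unsigned subtraction stays correct across counter wrap-around.
        const std::uint32_t elapsed = static_cast<std::uint32_t>(nowMs - lastMs_);
        if (!started_ || elapsed >= intervalMs_) {
            started_ = true;
            lastMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    std::uint32_t intervalMs_;
    std::uint32_t lastMs_;
    bool started_;
};
```

In a loop this would look roughly like `if (!connected && reconnectGate.due(now)) { tryConnect(); }`, where `tryConnect` stands in for whatever connect call the sketch makes. The same gate shape also fits the 1 s and 5 s publish intervals in the table above.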
Current camera operating point (as flashed)
The firmware is presently hard-pinned to a role-(1)-leaning still-quality configuration. This is not yet runtime-switchable; see Open work below.

    // From CameraWebServer_for_esp-arduino_3.0.x.ino, setup()
    config.frame_size = FRAMESIZE_5MP;        // 2592 × 1944 (OV5640 limit)
    config.jpeg_quality = 27;                 // overridden to 6 by set_quality() below (lower number = higher quality)
    config.fb_count = 1;                      // explicit override of the more usual fb_count=2
    config.grab_mode = CAMERA_GRAB_LATEST;
    config.fb_location = CAMERA_FB_IN_PSRAM;
    config.xclk_freq_hz = 20000000;           // 20 MHz
    config.pixel_format = PIXFORMAT_JPEG;

    // Then via sensor_t setters:
    s->set_framesize(s, FRAMESIZE_5MP);
    s->set_quality(s, 6);        // 6 = high quality (range 0..63, lower = better)
    s->set_wb_mode(s, 3);        // fixed white balance mode
    s->set_exposure_ctrl(s, 0);  // AE off
    s->set_aec_value(s, 800);    // manual exposure value
    s->set_gain_ctrl(s, 0);      // AGC off
    s->set_whitebal(s, 0);       // AWB off
    s->set_awb_gain(s, 0);       // AWB gain off
    s->set_raw_gma(s, 1);        // gamma correction on

Why this combination is role-(1)-leaning today:

- `FRAMESIZE_5MP` + `jpeg_quality=6` produces ~150–400 KB JPGs at the largest size the sensor can natively output.
- `fb_count=1` means only one frame’s worth of PSRAM is reserved, which is necessary to fit a 5 MP JPG in PSRAM at all on the XIAO.
- All the `…_ctrl(s, 0)` / `set_aec_value` / `set_whitebal(0)` calls lock exposure / gain / white balance, which is what you want for photogrammetry (frame-to-frame consistency) but not what you want for a viewfinder in changing lighting.

Despite this still-leaning configuration the boards are also currently
acting as role (2) MJPG streamers — the ROS 2 host pulls :81/stream and
republishes at 16 Hz (WiFi) / 24 Hz (ETH). They get away with it because the
JPGs are small enough to push, but the encode latency is higher than it
needs to be for real-time monitoring.
Open work — role switching (todo-esp32s3-fb-roles)
The firmware needs runtime support for switching between role (1) and role (2) without reflashing. Target settings, copied from `ros2_ws/edge/README.md`:
| Setting | Role (1) HQ stills | Role (2) low-latency MJPG |
|---|---|---|
config.fb_count | 1 (max single-frame size in PSRAM) | 2 (pipeline encoder, hide latency) |
config.frame_size | FRAMESIZE_5MP (2592×1944) | FRAMESIZE_HD or _SVGA |
set_quality() | 4–6 (highest quality) | 12–20 (smaller frames) |
config.grab_mode | CAMERA_GRAB_WHEN_EMPTY | CAMERA_GRAB_LATEST |
set_exposure_ctrl | manual, locked AEC value | auto |
set_whitebal | manual, locked WB | auto OK |
Why fb_count matters specifically on this MCU:
- Role (1): `fb_count=1` is essentially mandatory. With the OV5640 at full 5 MP and JPG quality 6, a single framebuffer can already approach the PSRAM ceiling; a second buffer either won’t allocate or will force a smaller resolution.
- Role (2): `fb_count=2` lets the JPG encoder run on buffer N while the sensor DMA is filling buffer N+1, hiding encode latency end-to-end. This is the dominant lever for “fastest stream”.
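The latency-hiding argument for role (2) can be made concrete with an idealized two-stage model. The sketch below is a back-of-envelope illustration, not a measurement: `framePeriodMs` is a hypothetical helper, and the stage times in the usage note are made-up numbers.

```cpp
#include <algorithm>

// Idealized steady-state frame period for a two-stage pipeline.
//   fillMs  : time for the sensor/DMA side to fill one buffer
//   drainMs : time for the consumer side to process and send it
// With one buffer the stages must take turns; with two or more they overlap,
// so the slower stage alone sets the pace.
double framePeriodMs(double fillMs, double drainMs, int fbCount) {
    if (fbCount >= 2) {
        return std::max(fillMs, drainMs);
    }
    return fillMs + drainMs;
}
```

With hypothetical stage times of 40 ms and 25 ms, the model gives a 65 ms period for `fb_count=1` and 40 ms for `fb_count=2`. Real gains depend on where the time actually goes on the board, so treat this as a way to reason about the lever rather than a prediction.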
Switch trigger candidates (decision pending — see also todo-mqtt-bridge):
- New MQTT topic `esp32s3/<DEVICE_ID>/role/{request,response}` that takes `{"role": "stills" | "stream"}` and reconfigures the sensor in-place. This is the cleanest fit with the existing `mqtt__gphoto2_delegate` contract and would make the ESP32-S3 schedulable by the batch_request_delegate.
- New HTTP endpoint `:81/role?…` parallel to Espressif’s existing `/control`. Faster to wire up but doesn’t unify the control plane.
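For the MQTT option, the request handler mostly has to map the payload onto one of two roles and reject everything else. The following is a deliberately minimal sketch under stated assumptions: the payload is a flat object with a single string-valued `"role"` key as described above, `RequestedRole` and `parseRoleRequest` are hypothetical names, and the hand-rolled scan is a stand-in for a proper JSON library, not a general parser.

```cpp
#include <cstddef>
#include <string>

enum class RequestedRole { Stills, Stream, Invalid };

// Extracts the string value of the "role" key from a flat payload such as
// {"role": "stills"}. Anything unexpected maps to Invalid.
RequestedRole parseRoleRequest(const std::string& payload) {
    const std::string key = "\"role\"";
    std::size_t pos = payload.find(key);
    if (pos == std::string::npos) return RequestedRole::Invalid;

    pos = payload.find(':', pos + key.size());
    if (pos == std::string::npos) return RequestedRole::Invalid;

    const std::size_t open = payload.find('"', pos + 1);
    if (open == std::string::npos) return RequestedRole::Invalid;
    const std::size_t close = payload.find('"', open + 1);
    if (close == std::string::npos) return RequestedRole::Invalid;

    const std::string value = payload.substr(open + 1, close - open - 1);
    if (value == "stills") return RequestedRole::Stills;
    if (value == "stream") return RequestedRole::Stream;
    return RequestedRole::Invalid;
}
```

Returning an explicit `Invalid` value gives the handler something to report on the `response` topic instead of silently ignoring a malformed request.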
Implementation hazard: esp_camera_init cannot be re-run without
esp_camera_deinit() first, and switching frame_size at runtime is safer
through the sensor_t setters than through a full reinit. Test on a single
board before fleet rollout.
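One way to keep that hazard contained is to decide up front whether a role change can be done with setters alone. The sketch below mirrors the target-settings table in a plain struct and flags the cases that would need a deinit/init cycle. It assumes `fb_count` and `grab_mode` can only be set through `camera_config_t` at init time; `RoleSettings`, `kStills`, `kStream`, and `needsReinit` are hypothetical names, and the quality values are single picks from the ranges in the table.

```cpp
// Hypothetical mirror of the role table; not part of esp32-camera.
struct RoleSettings {
    int  fbCount;       // config.fb_count
    bool grabLatest;    // true = CAMERA_GRAB_LATEST, false = CAMERA_GRAB_WHEN_EMPTY
    int  frameWidth;    // pixels
    int  frameHeight;   // pixels
    int  jpegQuality;   // value passed to set_quality()
    bool autoExposure;  // set_exposure_ctrl on/off
    bool autoWhiteBal;  // set_whitebal on/off
};

// Role (1): 5 MP stills, locked exposure and white balance.
const RoleSettings kStills = {1, false, 2592, 1944, 6, false, false};
// Role (2): FRAMESIZE_HD (1280 x 720) stream, auto exposure and white balance.
const RoleSettings kStream = {2, true, 1280, 720, 12, true, true};

// Assumption: init-time fields differ => esp_camera_deinit() + esp_camera_init();
// otherwise the sensor_t setters are enough.
bool needsReinit(const RoleSettings& from, const RoleSettings& to) {
    return from.fbCount != to.fbCount || from.grabLatest != to.grabLatest;
}
```

Under this model a stills-to-stream switch always reports `true`, because both init-time fields change, which is a useful reminder that the full role switch from the table is not a setters-only operation.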
OTA flashing
ElegantOTA is mounted on the secondary WebServer at port 8080:

    http://172.31.1.<DEVICE_ID>:8080/update

Drag-and-drop the new .bin from PlatformIO/Arduino IDE’s build artifacts.
The board reboots into the new firmware on completion.
Quick checks at boot
The serial console prints (at 115200 baud):

    BEGIN SETUP
    ===========...
    v0.0
    FIRMWARE COMPILED: Apr 8th, 2025
    JPEG quality 6
    awb: OFF
    framebuffer - 2      # (note: subsequently overridden to 1; see informConnectionURL())
    CAMERA_GRAB_LATEST
    VFLIP!!!
    HMIRROR!!!

    Camera Stream: http://172.31.1.143:81/stream
    OTA Update: http://172.31.1.143:8080/update

    DEVICE_ID : 143
    MQTT broker : 172.31.1.252:1883
    MQTT client : esp32s3-143
    MQTT topics : esp32s3/143/log , esp32s3/143/temp , esp32s3/143/rssi
    OVERRIDING FB COUNT - 1

Use this as a checklist when bringing up a new board.
Related
- Architectural overview: `ros2_ws/edge/README.md`
- ROS 2 client side (which consumes `:81/stream`): `ros2_ws/launch/image_publisher_client/README.md`
- WiFi XIAO launcher: `ros2_ws/launch/xiao_sense_esp32s3_eyes.py`
- Wired ETH XIAO + Lepton launcher: `ros2_ws/launch/image_publisher_client/esp32s3_eth.tmuxp.yml`
- Operator browser-pane view: `INBOX/TMUXP_VIEWS/xiao_sense_eyes.tmuxp.yml`
- IMX708 sibling pipeline: `ros2_ws/edge/simple_picamera2_streamer/README.md`
- DSLR sibling pipeline: `mqtt__gphoto2_delegate`