IMX708 → Picamera2 → MJPG (HTTP)

simple_picamera2_streamer — IMX708 edge stack

A tiny, dependency-light Python service that owns one IMX708 (Raspberry Pi Camera Module 3) on a Raspberry Pi over CSI and exposes it as:

  • a continuous MJPG stream (multipart/x-mixed-replace) for the live ROS 2 image graph (role 2 — low-latency viewfinder), and
  • a single-shot JPG endpoint (/jpg) plus a control endpoint (/set).

It is one of three hardware ingestion pipelines in this system — see ros2_ws/edge/README.md for the bigger picture.

Status: runs role (2) only today. Role (1) high-fidelity still capture is a known gap — see TODO todo-imx708-fb-roles in the parent README.


[ IMX708 ]──CSI──▶[ picamera2 capture_array ]──cv2.imencode JPG──▶ shared frame_jpg
                                                                  ┌───────┴────────┐
                                                                  ▼                ▼
                                                             GET /stream       GET /jpg
                                                          (multipart MJPG)    (one frame)

config = picam2.create_video_configuration(main={"size": (2304, 1296)}, buffer_count=4)
picam2.set_controls({"AeEnable": 1, "AwbEnable": 1, "AfMode": 2})  # AfMode=2 = continuous AF
picam2.set_controls({"FrameDurationLimits": (125000, 125000)})  # 125 ms per frame = 8 Hz, hard-capped

A single background thread calls picam2.capture_array(), encodes the result with cv2.imencode(".jpg", arr, [cv2.IMWRITE_JPEG_QUALITY, 100]), and atomically swaps the encoded bytes into a shared frame_jpg under a threading.Lock(). Every HTTP handler reads from that single shared buffer; there is no per-client encoder.
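
The single-producer, shared-buffer pattern described above can be sketched as follows. This is a minimal illustration, not the code in app.py: LatestFrame, capture_loop, and the grab callable are hypothetical names, and grab stands in for the picam2.capture_array() plus cv2.imencode(...) pair so the sketch runs without a camera.

```python
import threading
import time
from typing import Callable, Optional


class LatestFrame:
    """Holds only the most recent encoded JPG; readers never see a partial write."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jpg: Optional[bytes] = None

    def swap(self, jpg: bytes) -> None:
        # Encoding happens before this call; only the reference swap is guarded.
        with self._lock:
            self._jpg = jpg

    def get(self) -> Optional[bytes]:
        with self._lock:
            return self._jpg


def capture_loop(grab: Callable[[], bytes], shared: LatestFrame,
                 stop: threading.Event, period_s: float = 0.125) -> None:
    """Producer: grab and encode one frame per period, then publish it."""
    while not stop.is_set():
        shared.swap(grab())
        stop.wait(period_s)  # interruptible stand-in for time.sleep(0.125)


def demo() -> Optional[bytes]:
    """Run the producer briefly with a fake encoder and return the last frame."""
    counter = iter(range(1_000_000))
    shared, stop = LatestFrame(), threading.Event()
    worker = threading.Thread(
        target=capture_loop,
        args=(lambda: b"jpg-%d" % next(counter), shared, stop, 0.01),
        daemon=True,
    )
    worker.start()
    time.sleep(0.1)
    stop.set()
    worker.join()
    return shared.get()
```

Because the lock only protects a reference swap, slow HTTP clients cannot stall the capture thread: each handler copies the reference out and writes to its socket after releasing the lock.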

| Endpoint | Method | Query params | Returns |
| --- | --- | --- | --- |
| /stream | GET | | multipart/x-mixed-replace; boundary=frame MJPG @ ~8 Hz |
| /jpg | GET | | The latest single JPG (image/jpeg) |
| /set | GET | ExposureTime (µs, int), AnalogueGain (float), LensPosition (float, diopters; setting it forces AfMode=0 manual) | 200 ok |
| anything else | any | | 404 |

The server is a ThreadingHTTPServer so /stream, /jpg, and /set can be served concurrently to multiple clients.
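
As a rough illustration of how such routing can be written with the standard library, the sketch below serves /jpg and /set and returns 404 for everything else. It is not the actual app.py: Handler, parse_controls, FRAME_JPG, and APPLIED are hypothetical names, /stream is omitted for brevity, and /set only records the parsed values where a real service would pass them to picam2.set_controls().

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

FRAME_LOCK = threading.Lock()
FRAME_JPG = b"\xff\xd8placeholder\xff\xd9"  # stand-in for the shared frame_jpg
APPLIED = {}  # records what /set received (assumption: real code calls set_controls)


def parse_controls(query: str) -> dict:
    """Convert /set query parameters into typed control values."""
    casts = {"ExposureTime": int, "AnalogueGain": float, "LensPosition": float}
    raw = parse_qs(query)
    controls = {k: casts[k](v[0]) for k, v in raw.items() if k in casts}
    if "LensPosition" in controls:
        controls["AfMode"] = 0  # a manual lens position implies manual focus
    return controls


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        url = urlparse(self.path)
        if url.path == "/jpg":
            with FRAME_LOCK:
                body = FRAME_JPG
            self._reply(200, "image/jpeg", body)
        elif url.path == "/set":
            APPLIED.update(parse_controls(url.query))
            self._reply(200, "text/plain", b"ok")
        else:
            self._reply(404, "text/plain", b"not found")

    def _reply(self, status: int, ctype: str, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:  # keep the demo quiet
        pass


def demo() -> tuple:
    """Serve on an ephemeral loopback port, hit /jpg and /set once, then stop."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # ignore proxies
    try:
        jpg = opener.open(base + "/jpg").read()
        ok = opener.open(base + "/set?LensPosition=2.5").read()
    finally:
        server.shutdown()
        server.server_close()
    return jpg, ok, dict(APPLIED)
```

ThreadingHTTPServer handles each request in its own thread, which is what lets a long-lived /stream response coexist with short /jpg and /set requests. A production handler would also want to reject malformed numbers instead of letting the int or float conversion raise.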

# on the Raspberry Pi (e.g. as a systemd service, or under tmux)
python3 ros2_ws/edge/simple_picamera2_streamer/app.py
# listens on 0.0.0.0:8000

Quick sanity checks from anywhere on the network (substitute the Pi’s IP):

# stream — open in browser or VLC
open http://172.31.1.97:8000/stream
# one-shot JPG into a file
curl -o frame.jpg http://172.31.1.97:8000/jpg
# bias the exposure / gain / focus
curl 'http://172.31.1.97:8000/set?ExposureTime=10000&AnalogueGain=2.0'
curl 'http://172.31.1.97:8000/set?LensPosition=2.5' # diopters → ~40 cm
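
The diopter figure in the last command is simply the reciprocal of the focus distance in metres, so 2.5 diopters corresponds to 1 / 2.5 = 0.4 m. A small sketch of that conversion, plus a helper that builds a /set URL, is shown below; the function names are hypothetical and the helper does not validate control names.

```python
from urllib.parse import urlencode


def diopters_to_metres(diopters: float) -> float:
    """Focus distance is the reciprocal of lens power; 0 diopters means infinity."""
    return float("inf") if diopters == 0 else 1.0 / diopters


def metres_to_diopters(distance_m: float) -> float:
    """Inverse conversion, e.g. for choosing a LensPosition from a measured distance."""
    return 1.0 / distance_m


def set_url(host: str, port: int = 8000, **controls) -> str:
    """Build a /set URL; the keys in `controls` are passed through unchecked."""
    return "http://%s:%d/set?%s" % (host, port, urlencode(controls))
```

For example, set_url("172.31.1.97", LensPosition=metres_to_diopters(0.4)) yields the same request as the curl command above.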

ROS 2 client side — image_publisher_node

The matching ROS 2 client is the stock image_publisher package. One node per camera URL, wrapped in a per-camera namespace.

ros2 run image_publisher image_publisher_node \
--ros-args \
-p filename:=http://172.31.1.97:8000/stream \
-p publish_rate:=8. \
-r __ns:=/cam2

This produces:

  • /cam2/image_raw (sensor_msgs/Image)
  • /cam2/image_raw/compressed (sensor_msgs/CompressedImage)
  • /cam2/camera_info (empty unless a calibration is provided)

⚠️ The single most important parameter — publish_rate

publish_rate MUST equal the edge capture rate, exactly.

image_publisher_node opens the MJPG URL via OpenCV (cv::VideoCapture), which buffers decoded frames internally. If publish_rate is lower than the rate at which the edge produces frames, OpenCV’s queue fills up and the node ends up republishing frames from seconds, sometimes tens of seconds, in the past. rqt_image_view and Foxglove will then show stale, lagging video that looks “smooth” but is actually time-shifted.

Concrete contract for this streamer:

| Edge setting | Value | Client publish_rate must be |
| --- | --- | --- |
| FrameDurationLimits=(125000, 125000) | 8 Hz cap | 8. (not 7.9, not 10) |
| time.sleep(0.125) in capture loop | 8 Hz lock | 8. |

If you change app.py’s frame rate, change every tmuxp / launch file’s publish_rate in lockstep.
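
To make the contract concrete, the sketch below derives the rate from the frame duration and estimates how stale the published frame becomes when the client runs slower than the edge. The lag model is an assumption for illustration: it treats the client-side buffer as an unbounded FIFO, whereas real OpenCV backends bound their queues, so read the result as a trend rather than a prediction. Both function names are hypothetical.

```python
def rate_hz(frame_duration_us: int) -> float:
    """Frame rate implied by a FrameDurationLimits value in microseconds."""
    return 1_000_000 / frame_duration_us


def backlog_lag_s(edge_hz: float, publish_hz: float, elapsed_s: float) -> float:
    """Age of the frame being published after elapsed_s, for an unbounded FIFO.

    Frames arrive at edge_hz and leave at publish_hz; the surplus queues up,
    and each queued frame represents 1 / edge_hz seconds of capture time.
    """
    surplus_frames = max(0.0, (edge_hz - publish_hz) * elapsed_s)
    return surplus_frames / edge_hz
```

Under this model rate_hz(125000) is 8.0, and a client at 7.9 Hz against an 8 Hz edge accumulates about 60 surplus frames in ten minutes, which is roughly 7.5 s of lag. That is why even a 0.1 Hz mismatch is treated as a bug.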

Two equivalent tmuxp variants live in ros2_ws/launch/image_publisher_client/: one for a Mac or laptop without a system ROS install (uses pixi run -e kilted ros2), and one for hosts with ros2 already on PATH. Both fan out one image_publisher_node per camera URL into separate tmux panes.

See ros2_ws/launch/image_publisher_client/README.md for the full IP-and-namespace map.
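
One way to keep publish_rate in lockstep across cameras is to generate every command from a single constant. The sketch below is illustrative only and is not how the tmuxp files are written: EDGE_RATE_HZ, CAMERAS, and image_publisher_cmd are hypothetical names, and the map contains only the cam2 URL used elsewhere on this page; the real IP-and-namespace map lives in the README referenced above.

```python
import shlex

EDGE_RATE_HZ = 8.0  # assumption: must mirror the capture rate set in app.py

# Hypothetical map; only the cam2 entry appears in this document.
CAMERAS = {"cam2": "http://172.31.1.97:8000/stream"}


def image_publisher_cmd(namespace: str, url: str,
                        rate_hz: float = EDGE_RATE_HZ) -> list:
    """Build the argv for one image_publisher_node in its own namespace."""
    return [
        "ros2", "run", "image_publisher", "image_publisher_node",
        "--ros-args",
        "-p", f"filename:={url}",
        "-p", f"publish_rate:={rate_hz}",
        "-r", f"__ns:=/{namespace}",
    ]


def all_commands() -> list:
    """One shell-quoted command line per camera, ready to paste into a pane."""
    return [shlex.join(image_publisher_cmd(ns, url)) for ns, url in CAMERAS.items()]
```

Changing the edge frame rate then means editing EDGE_RATE_HZ once instead of hunting through each pane definition.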


  • Bandwidth. At 2304×1296, JPG quality 100, and 8 Hz, expect roughly 500 KB per frame, i.e. about 4 MB/s ≈ 32 Mbit/s per camera. Plan WiFi accordingly: two active streams will saturate a 2.4 GHz AP.
  • CPU on the Pi. cv2.imencode at quality 100 is the dominant cost on a Pi 5. Drop quality to ~85 if you need headroom; the result is visually indistinguishable for monitoring purposes.
  • Latency budget. Roughly: 125 ms (sensor) + ~20 ms (encode) + ~30 ms (network) + 50–100 ms (OpenCV decode + republish) ≈ 200–300 ms glass-to-RViz.
  • AF behaviour. AfMode=2 is continuous AF. Setting LensPosition via /set switches to manual focus (AfMode=0), and the only way back to continuous AF is to restart the process: there is intentionally no “go back to auto” verb yet.
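
The bandwidth and latency figures in the notes above are plain arithmetic and can be re-derived when the resolution, quality, or rate changes. The sketch below reproduces them; the function names and the LATENCY_MS layout are hypothetical, the per-stage numbers are the rough estimates quoted above, and 1 KB is taken as 1000 bytes.

```python
def stream_mbit_s(frame_kb: float, fps: float) -> float:
    """Per-camera bitrate: KB per frame x 1000 bytes x 8 bits x frames per second."""
    return frame_kb * 1000 * 8 * fps / 1e6


# (low, high) estimate per stage in milliseconds, taken from the latency bullet.
LATENCY_MS = {
    "sensor": (125, 125),
    "encode": (20, 20),
    "network": (30, 30),
    "decode_and_republish": (50, 100),
}


def latency_range_ms(budget: dict) -> tuple:
    """Sum the low and high ends of every stage."""
    low = sum(lo for lo, _ in budget.values())
    high = sum(hi for _, hi in budget.values())
    return low, high
```

With the quoted inputs, stream_mbit_s(500, 8) gives 32.0 Mbit/s and latency_range_ms(LATENCY_MS) gives (225, 275), consistent with the 200–300 ms glass-to-RViz estimate.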