Passive acoustic 3D camera for drone detection, classification and localization.
Six identical panels in a half-dodecahedron — each an ESP32-S3 + PCMD3180 + 8× IM70D122 MEMS array. The first system to detect, classify and track drones on an embedded MCU — no FPGA, no cloud.

Each computing layer solves one question — from raw PDM to a tracked 3D target
GCC-PHAT cross-correlations, TDOA. "Where is the sound?" — DOA candidates (azimuth, elevation) over a two-level sphere search.
AM-index + Spectral Variance. "Is it a drone?" — two physics-based discriminators filter candidates (Patent Claims 1+2).
Kalman-filter tracking. "Where is it heading?" — stable 3D DOA output for confirmed targets at 10 Hz.
Per-channel programmable delay registers: hardware TDOA presteering at the PDM level, before decimation to PCM.
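To make the "Where is the sound?" step concrete, here is a minimal sketch of a GCC-PHAT delay estimate for a single microphone pair, written in Python with NumPy for readability. It is not the AkiraEar firmware. The function name `gcc_phat`, the 16 kHz sample rate (inferred from a 512-point FFT on a 32 ms frame), and the 1 ms lag limit are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Return the delay of `sig` relative to `ref` in seconds (GCC-PHAT)."""
    n = len(sig) + len(ref)                 # zero-pad so the correlation is linear
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:                 # limit the search to physically possible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

# Demo: white noise delayed by 3 samples (fs = 16 kHz is an assumption).
fs = 16_000
rng = np.random.default_rng(0)
ref = rng.standard_normal(512)
sig = np.concatenate((np.zeros(3), ref[:-3]))
tau = gcc_phat(sig, ref, fs, max_tau=1e-3)
print(f"estimated delay: {tau * fs:.0f} samples")
```

Running this for all 28 pairs of an 8-microphone panel gives the TDOA set that a sphere search can score against candidate (azimuth, elevation) directions.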
Commercial acoustic cameras can locate sound, but none of them can tell a drone apart from the wind

| | AkiraEar | Fluke ii915 | FOTRIC TD2 | SonoCam |
|---|---|---|---|---|
| Chip | ESP32-S3 | FPGA | FPGA | FPGA |
| MEMS / system | 48 (6×8) | 64 | 64 | 64 |
| Coverage | 2π (3D) | 63°×63° | 58°×45° | 66°×52° |
| Drone discrim. | AM-index + Var | No | No | No |
| System price | ≈$200 BOM | $15–25k | $10–20k | $8–15k |
| Cloud-free | Yes | Partial | Partial | No |
| Patent | 3 claims | No | No | No |

ESP32-S3 + PCMD3180 + 8 MEMS: one board, ~$33 BOM, fully repeatable ×6

Xtensa LX7 dual-core @ 240 MHz with PIE SIMD, 8 MB PSRAM, 16 MB Flash, native WiFi + BLE 5.0. Everything on one chip.
Texas Instruments converter in master mode — generates BCLK/FSYNC, per-channel hardware delays, MICBIAS up to 20 mA.
Infineon MEMS, -26 dBFS sensitivity, 30 Hz low cutoff for blade-pass frequencies. Spiral sparse, Acoular-optimized array per panel.
116.565° dihedral → ~63.4° between panel boresights. Main lobes meet at -3 dB: 2π steradian coverage, no moving parts.
Each panel uses an Acoular-optimized spiral sparse array: non-redundant baselines, large aperture, no grating lobes up to 2 kHz. Six panels at the dodecahedron's 116.565° dihedral angle place their boresights ~63.4° apart (the supplement of the dihedral angle), so neighboring main lobes meet at the -3 dB level.
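The boresight spacing follows directly from dodecahedron geometry, and the short Python check below reproduces it. It is a sketch under stated assumptions: the panel orientation (one panel facing the zenith, five in a ring around it, z axis up) and the helper names `panel_boresights` and `angle_deg` are illustrative and not taken from the AkiraEar design files.

```python
import math

# Dihedral angle of a regular dodecahedron: arccos(-1/sqrt(5)) ≈ 116.565°.
dihedral_deg = math.degrees(math.acos(-1.0 / math.sqrt(5.0)))

# Outward face normals (panel boresights) of adjacent faces are separated
# by the supplement of the dihedral angle: ≈ 63.435°.
boresight_sep_deg = 180.0 - dihedral_deg

def panel_boresights():
    """Unit normals of six faces of a half-dodecahedron (assumed: one top, five ring)."""
    theta = math.acos(1.0 / math.sqrt(5.0))   # tilt of the ring panels from zenith
    normals = [(0.0, 0.0, 1.0)]
    for k in range(5):
        phi = math.radians(72.0 * k)          # ring panels are 72° apart in azimuth
        normals.append((math.sin(theta) * math.cos(phi),
                        math.sin(theta) * math.sin(phi),
                        math.cos(theta)))
    return normals

def angle_deg(a, b):
    """Angle between two unit vectors in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

normals = panel_boresights()
print(f"dihedral: {dihedral_deg:.3f} deg, boresight spacing: {boresight_sep_deg:.3f} deg")
print(f"top to ring: {angle_deg(normals[0], normals[1]):.3f} deg")
print(f"ring to ring: {angle_deg(normals[1], normals[2]):.3f} deg")
```

If the main lobes are symmetric, lobes crossing at -3 dB midway between boresights means each panel needs a half-power beamwidth roughly equal to that spacing.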

19.8 ms of a 32 ms frame: ~62% Core 0 load per panel, with headroom on Core 1

| Task | Time / frame | Core |
|---|---|---|
| I2S TDM DMA (8ch) | < 0.1 ms | HW |
| FFT 512pt × 8ch | 6.4 ms | Core 0 |
| Mel filterbank × 8 | 2.4 ms | Core 0 |
| GCC-PHAT 28 pairs (SSL) | 8.0 ms | Core 0 |
| DOA sphere search | 3.0 ms | Core 0 |
| AM-index × N | 0.5 ms | Core 1 |
| Spec-var × N | 0.3 ms | Core 1 |
| Kalman SST update | 1.0 ms | Core 1 |
| WiFi MQTT publish | < 1 ms | Core 1 |
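
The per-core load can be re-derived from the table with a few lines of Python. This is a bookkeeping sketch only: the "< 0.1 ms" and "< 1 ms" entries are taken at their upper bounds, and the implied sample rate assumes one 512-point FFT spans exactly one 32 ms frame with no overlap, which the table does not state.

```python
FRAME_MS = 32.0      # frame length from the budget above
FFT_POINTS = 512     # FFT size from the budget above

# (task, time per frame in ms, core); "<" entries use their upper bound.
tasks = [
    ("I2S TDM DMA (8ch)",       0.1, "HW"),
    ("FFT 512pt x 8ch",         6.4, "Core 0"),
    ("Mel filterbank x 8",      2.4, "Core 0"),
    ("GCC-PHAT 28 pairs (SSL)", 8.0, "Core 0"),
    ("DOA sphere search",       3.0, "Core 0"),
    ("AM-index x N",            0.5, "Core 1"),
    ("Spec-var x N",            0.3, "Core 1"),
    ("Kalman SST update",       1.0, "Core 1"),
    ("WiFi MQTT publish",       1.0, "Core 1"),
]

# Sum the time spent on each core.
totals = {}
for _name, ms, core in tasks:
    totals[core] = totals.get(core, 0.0) + ms

# Convert to a percentage of the frame period.
load_pct = {core: 100.0 * ms / FRAME_MS for core, ms in totals.items()}

# Sample rate implied by one FFT block per frame (assumption: no overlap).
implied_fs_hz = FFT_POINTS / (FRAME_MS / 1000.0)

for core in ("Core 0", "Core 1"):
    print(f"{core}: {totals[core]:.1f} ms per frame, {load_pct[core]:.1f}% load")
print(f"implied sample rate: {implied_fs_hz:.0f} Hz")
```

Core 0 comes out at 19.8 ms (about 62% of the frame) and Core 1 at no more than 2.8 ms (under 9%), which is where the headroom on Core 1 comes from.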

AkiraEar brings FPGA-class acoustic detection to a $200 embedded platform. We're inviting partners and integrators to co-develop passive, cloud-free drone detection, from perimeter security to airspace monitoring.