Beyond conventional vision: RGB-event fusion for robust object detection in dynamic traffic scenarios
Peer-Reviewed Publication
This heterogeneity causes spatiotemporal inconsistencies in multimodal data, posing challenges for existing methods in multimodal feature extraction and alignment. First, in the temporal dimension, the microsecond-level temporal resolution of event data is significantly higher than the millisecond-level resolution of RGB data, resulting in temporal misalignment and making direct multimodal fusion infeasible. To address this issue, the researchers designed an Event Correction Module (ECM) that temporally aligns asynchronous event streams with their corresponding image frames through optical-flow-based warping. The ECM is jointly optimized with the downstream object detection network to learn task-aware event representations.
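To make the idea concrete, the sketch below shows one plausible way such a module could work; it is not the authors' implementation. It assumes (hypothetically) that events are first binned into a voxel-grid tensor, that a small CNN predicts a per-pixel optical-flow field, and that differentiable bilinear sampling warps the event representation toward the RGB frame's timestamp. All names (`EventCorrectionModule`, `flow_net`, `event_bins`) are illustrative.

```python
# Minimal PyTorch sketch of an optical-flow-based event correction step.
# Hypothetical reconstruction, not the paper's actual ECM architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F


class EventCorrectionModule(nn.Module):
    def __init__(self, event_bins: int = 5, hidden: int = 32):
        super().__init__()
        # Small CNN predicting a 2-channel flow field (dx, dy) per pixel
        # from the stacked event voxel grid; a real system might also
        # condition on the RGB frame.
        self.flow_net = nn.Sequential(
            nn.Conv2d(event_bins, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2, 3, padding=1),
        )

    def forward(self, event_voxels: torch.Tensor) -> torch.Tensor:
        """event_voxels: (B, event_bins, H, W) voxel grid of the event stream."""
        b, _, h, w = event_voxels.shape
        flow = self.flow_net(event_voxels)  # (B, 2, H, W), in pixel units

        # Build a base sampling grid, displace it by the predicted flow,
        # and normalize to [-1, 1] as required by F.grid_sample.
        ys, xs = torch.meshgrid(
            torch.arange(h, device=event_voxels.device, dtype=torch.float32),
            torch.arange(w, device=event_voxels.device, dtype=torch.float32),
            indexing="ij",
        )
        grid_x = (xs + flow[:, 0]) / max(w - 1, 1) * 2 - 1
        grid_y = (ys + flow[:, 1]) / max(h - 1, 1) * 2 - 1
        grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)

        # Warp the event representation toward the RGB frame's timestamp.
        # Because grid_sample is differentiable, the module can be trained
        # end-to-end with the downstream detector's loss, which is how the
        # joint optimization described above becomes possible.
        return F.grid_sample(event_voxels, grid, align_corners=True)
```

Since every operation in the warp is differentiable, gradients from the detection loss flow back into `flow_net`, so the alignment is learned for the detection task rather than fixed in advance.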