Article Highlight | 21-Nov-2023

Towards a new paradigm for brain-inspired computer vision

Beijing Zhongke Journal Publishing Co. Ltd.

Computer vision, or machine vision, represented especially by deep convolutional neural networks (DCNNs), has achieved great success in many vision tasks. Compared with biological vision, however, computer vision still lags far behind in both performance and variety of capabilities. For instance, DCNNs, which mainly mimic the feedforward, hierarchical structure of the ventral pathway of biological vision, have achieved extremely high accuracy in image classification, but in other tasks, such as video analysis and image understanding, they are still far from satisfactory. Thus, learning from biological vision, the approach known as brain-inspired computer vision, remains a promising and efficient way to speed up the development of computer vision.

 

Although the importance of developing brain-inspired computer vision has been widely recognized, researchers have so far not achieved any real breakthrough in the field that can match what AlphaGo achieved in the game of Go or AlphaFold in protein structure prediction. So, what is the obstacle? The researchers identify an important issue that is missed in the current practice of brain-inspired computer vision: the neglect of a key property of biological vision, namely that it is geared toward processing spatio-temporal patterns. This is fundamentally different from processing static images, which DCNNs are good at. In short, having both spatial and temporal structure is characteristic of neural signals in every part of the brain. At the initial stage of acquiring visual information from the external world, the signals received by the retina are in the form of continuous optical flow; these signals are converted into spike trains by retinal ganglion cells, which are subsequently transmitted layer by layer to the visual cortex, where the visual input is integrated with spikes from other cortical regions conveying prior knowledge or memory; eventually, the visual information is extracted in the form of continuous neuronal responses. The whole process is very complicated, with many fine details remaining unknown. Nevertheless, the fact that the visual system computes spatio-temporal patterns in the form of spike trains is fully validated by experiments.

 

In recognition of this difference between machine vision and biological vision, a new paradigm that captures it is emerging for developing brain-inspired computer vision. Specifically, in this paradigm, the visual information in the external world is expressed from the very beginning in the form of spike trains, which are subsequently processed by computational models inspired by the biological system. Notably, the spiking neural networks (SNNs) currently popular in the field do not fulfill this goal: they normally miss many features of biological systems that are crucial for processing spatio-temporal patterns efficiently, such as the differentiation of excitatory and inhibitory neurons, stochastic neuronal firing, recurrent and feedback interactions between neurons, and the short-term plasticity of synapses.
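
As a rough illustration of one of the features listed above, the differentiation of excitatory and inhibitory inputs, the following minimal Python sketch simulates a single leaky integrate-and-fire neuron. It is not taken from the article: the function name `simulate_lif`, the discrete-time update rule, and all parameter values are assumptions chosen only to keep the example small.

```python
from typing import List, Sequence


def simulate_lif(
    exc_spikes: Sequence[int],
    inh_spikes: Sequence[int],
    w_exc: float = 0.75,    # assumed weight of each excitatory input spike
    w_inh: float = 0.75,    # assumed weight of each inhibitory input spike
    leak: float = 0.5,      # assumed per-step decay factor of the potential
    v_thresh: float = 1.0,  # assumed firing threshold
) -> List[int]:
    """Hypothetical discrete-time leaky integrate-and-fire neuron.

    Excitatory spikes raise the membrane potential, inhibitory spikes
    lower it. The neuron emits a spike (1) and resets to zero whenever
    the potential reaches the threshold.
    """
    v = 0.0
    output = []
    for exc, inh in zip(exc_spikes, inh_spikes):
        # Leak first, then add the signed synaptic input for this step.
        v = leak * v + w_exc * exc - w_inh * inh
        if v >= v_thresh:
            output.append(1)
            v = 0.0  # reset after firing
        else:
            output.append(0)
    return output
```

With a steady excitatory input and no inhibition, this toy neuron fires on every second step; adding inhibitory spikes on alternate steps keeps the potential below threshold, so the same excitatory drive produces no output. Stochastic firing, recurrence, and short-term plasticity are deliberately left out of this sketch.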

 

To develop brain-inspired computer vision, it is critical to represent visual inputs in a way analogous to biological systems, starting from the sensory level. In recent years, brain-like sensing devices have developed rapidly. They are able to sense visual scenes with high temporal and spatial resolution, and they convert light signals directly into spike trains. In Section 2, the researchers introduce two of them. One is the dynamic vision sensor (DVS), which asynchronously senses the luminance change at each pixel in the image and outputs a stream of spike events. The other is the spike camera, e.g., Vidar, which provides a way to represent visual inputs by spike trains. Both were inspired by the retina of the biological brain.
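
To make the two sensing schemes more concrete, the sketch below emulates them in plain Python on a short sequence of frames. It is an illustration under stated assumptions, not the devices' actual circuitry: real sensors work asynchronously in hardware rather than on frames, the change detection on log intensity and the integrate-and-fire rule for the spike camera are simplifications, and the names `dvs_events` and `spike_camera` and their parameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]   # 2D grid of non-negative intensities
Event = Tuple[int, int, int, int]   # (time step, x, y, polarity)


def dvs_events(frames: Sequence[Frame], threshold: float = 0.2,
               eps: float = 1e-3) -> List[Event]:
    """Emit an event when a pixel's log intensity changes by >= threshold.

    Polarity is +1 for brightening and -1 for darkening. The reference
    level of a pixel is updated only when that pixel emits an event.
    """
    height, width = len(frames[0]), len(frames[0][0])
    ref = [[math.log(frames[0][y][x] + eps) for x in range(width)]
           for y in range(height)]
    events = []
    for t in range(1, len(frames)):
        for y in range(height):
            for x in range(width):
                log_i = math.log(frames[t][y][x] + eps)
                diff = log_i - ref[y][x]
                if abs(diff) >= threshold:
                    events.append((t, x, y, 1 if diff > 0 else -1))
                    ref[y][x] = log_i
    return events


def spike_camera(frames: Sequence[Frame],
                 threshold: float = 1.0) -> List[List[List[int]]]:
    """Assumed integrate-and-fire pixel model for a spike camera.

    Each pixel accumulates intensity over time and emits a spike (1)
    when the accumulator reaches the threshold, which is then subtracted.
    Brighter pixels therefore fire more often.
    """
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * width for _ in range(height)]
    spikes = []
    for frame in frames:
        out = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                acc[y][x] += frame[y][x]
                if acc[y][x] >= threshold:
                    out[y][x] = 1
                    acc[y][x] -= threshold
        spikes.append(out)
    return spikes
```

In this sketch, a pixel whose brightness never changes produces no DVS events at all, whereas the spike-camera model keeps firing at a rate proportional to its brightness. That contrast is the main design difference the example is meant to show: one representation encodes change, the other encodes absolute intensity over time.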

 

In Section 3, the researchers further discuss models for extracting information efficiently from these spike trains. Many image processing algorithms have been proposed in AI; however, these algorithms mainly focus on processing static images rather than spatio-temporal patterns. Simply transforming the artificial neurons in a neural network into spiking neurons does not help much. To develop efficient computational models for processing spatio-temporal patterns, researchers should learn from biological vision, which has evolved over millions of years to perform this task efficiently. In essence, computer vision needs to perform three fundamental functions: object detection, object tracking, and object recognition. In this part, the researchers review three brain-inspired computational models which, based on spike trains, can implement rapid signal detection, anticipative object tracking, and spatio-temporal pattern recognition, respectively.

 

Finally, in Section 4, the researchers discuss the future development of brain-inspired computer vision. They look ahead to some key issues that need to be solved for the new brain-inspired computer vision paradigm to succeed. The first is developing more biologically plausible brain-like sensing devices. The second is designing much smarter brain-inspired computational vision models. The third is exploring more suitable application scenarios for brain-inspired vision models. The last is developing more convenient and efficient programming tools for brain-inspired vision models.

 

 

See the article:

Towards a New Paradigm for Brain-inspired Computer Vision

http://doi.org/10.1007/s11633-022-1370-z
