Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity. Recent studies have shown that human languages are optimally balanced between accuracy and complexity. For example, many languages have a word that denotes the colour red, but no language has individual words to distinguish ten different shades of the colour. Such additional words would complicate the vocabulary while rarely helping to make communication more precise.
A study published on 23 March in the journal Proceedings of the National Academy of Sciences of the United States of America analysed how artificial neural networks spontaneously develop systems for naming colours. The study was carried out by Marco Baroni, ICREA research professor at the UPF Department of Translation and Language Sciences (DTCL), together with members of Facebook AI Research (France).
For this study, the researchers set up two artificial neural networks, trained with two generic deep learning methods. As Baroni explains: "We made the networks play a colour-naming game in which they had to communicate about colour chips from a continuous colour space. We did not limit the 'language' they could use. However, when they learned to play the game successfully, we observed the colour-naming terms these artificial neural networks had developed spontaneously."
The authors found that the emerging colour vocabulary has exactly the same property of optimizing the complexity/accuracy trade-off found in human languages. Furthermore, this result holds only while the systems communicate via a discrete channel: when they are allowed to use continuous signals (such as whistles or non-linguistic hand gestures), their language loses efficiency.
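To make the trade-off concrete, here is a toy sketch in Python. It is not the measure used in the study: the one-dimensional hue scale, the equal-width colour categories, the use of word entropy as "complexity" and the listener's mean guessing error as "inaccuracy" are all simplifying assumptions made for illustration, and the names `NUM_HUES` and `evaluate` are hypothetical.

```python
import math
from collections import Counter

# Assumption: colours lie on a 1-D hue scale in [0, 1), sampled on a fine grid.
NUM_HUES = 1200


def evaluate(num_words):
    """Return (complexity in bits, mean guessing error) for a toy naming system.

    The speaker splits the hue scale into num_words equal-width categories and
    names the category of each hue; the listener guesses the category centre.
    """
    # Word chosen by the speaker for each hue on the grid (integer arithmetic).
    words = [i * num_words // NUM_HUES for i in range(NUM_HUES)]

    # Complexity: entropy of word usage, in bits.
    counts = Counter(words)
    complexity = sum(
        (c / NUM_HUES) * math.log2(NUM_HUES / c) for c in counts.values()
    )

    # Inaccuracy: average distance between the true hue and the listener's guess.
    total_error = 0.0
    for i, word in enumerate(words):
        hue = (i + 0.5) / NUM_HUES
        guess = (word + 0.5) / num_words
        total_error += abs(hue - guess)
    return complexity, total_error / NUM_HUES


if __name__ == "__main__":
    for k in (1, 2, 4, 10):
        bits, err = evaluate(k)
        print(f"{k:2d} words: complexity {bits:.2f} bits, mean error {err:.4f}")
```

In this toy setting each extra word adds complexity, but the accuracy gained shrinks as the vocabulary grows: the first split of the scale reduces the error far more than going from six words to ten. This is the kind of diminishing return that makes ten separate words for shades of red a poor bargain.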
From the point of view of cognitive science, the results suggest that optimal trade-offs between complexity and accuracy may be a universal property that arises in discrete communication systems, not related to specific features of human biology. Baroni adds: "the results show that modern AI systems naturally adopt similar behaviours to humans, which is nonetheless surprising".
This suggests that an efficient categorization of colours (and possibly other semantic domains) in natural languages is not dependent on specific human biological constraints, but is a general property of discrete communication systems.
Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, Marco Baroni (2021), "Communicating artificial neural networks develop efficient color-naming systems", PNAS, 23 March. https://doi.org/10.1073/pnas.2016569118