News Release

Black holes are missing in the early Universe, and computers are after them

Peer-Reviewed Publication

Faculty of Sciences of the University of Lisbon


Artist’s concept of a black hole. This representation includes a disk of overheated material that is being pulled by the gravitational field, and also the jets of material being spewed perpendicularly to the disk. These jets shine brightly in radio frequencies, a signal the authors of this study are able to predict from the automatic analysis of astronomical images using machine learning techniques.

Credit: S. Dagnello (NRAO/AUI/NSF)

Forthcoming sky surveys with radio telescopes will capture millions of galaxies in the early Universe, but only automatic tools, like the algorithm created by a team led by the Institute of Astrophysics and Space Sciences (IA), at the Faculty of Sciences of the University of Lisbon (Portugal), will be able to sift through this deluge of data and find the galaxies with massive black holes at their core.

As far as the eye can see, galaxies fill the images of the deep Universe. What processes determined their shapes, colors and populations of stars? Astronomers think that primordial black holes were the engines of galaxies’ growth and transformation and can explain the cosmic landscape we see now.

In an article [1] published today in the journal Astronomy & Astrophysics, an international team led by Rodrigo Carvajal, of the Institute of Astrophysics and Space Sciences (IA) [2] and the Faculty of Sciences of the University of Lisbon (Ciências ULisboa), presents a machine learning [3] technique that recognises superluminous galaxies in the early Universe. These are galaxies thought to be dominated by the activity of a voracious black hole at their core [4]. According to the authors, this should be the first algorithm that predicts when this activity also radiates an intense signal at radio frequencies. Radio emissions are often distinct from the other light of the galaxy, and it is sometimes difficult to link the two. This artificial intelligence technique will enable astronomers to be more effective in the search for so-called radio galaxies [5].

The algorithm, developed in collaboration with Closer, a company working in technological solutions for data science, was trained with images of galaxies obtained at several wavelengths of the electromagnetic spectrum [6]. When tested with other images, it was able to predict four times more radio galaxies than conventional methods that use explicit instructions. Because machine learning develops its own algorithms, trying to understand its success may help clarify the physical phenomena at work in these galaxies 1.5 billion years after the Big Bang, that is, when the Universe was a tenth of its current age.
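As a rough illustration of how a "four times more" comparison can be computed, the sketch below measures the fraction of already-known radio galaxies that each selection method recovers in a test field. The release does not state which metric the authors used, so treating the comparison as a ratio of recovered known sources is an assumption. The source identifiers and both selections are invented, and the numbers are chosen only to make the arithmetic easy to follow; they are not data from the paper.

```python
def recall(selected, truth):
    """Fraction of known radio galaxies (truth) that a method selected."""
    truth = set(truth)
    if not truth:
        raise ValueError("no known radio galaxies in the test field")
    return len(truth & set(selected)) / len(truth)


# Hypothetical source IDs, invented purely for illustration.
known_radio_galaxies = {
    "src01", "src02", "src03", "src04",
    "src05", "src06", "src07", "src08",
}

# Hypothetical output of a method with explicit instructions (e.g. a fixed colour cut).
rule_based_selection = {"src01", "src09"}

# Hypothetical output of a trained classifier on the same test field.
ml_selection = {"src01", "src02", "src03", "src05", "src10"}

rule_recall = recall(rule_based_selection, known_radio_galaxies)
ml_recall = recall(ml_selection, known_radio_galaxies)

print(f"rule-based recall:       {rule_recall:.3f}")
print(f"machine-learning recall: {ml_recall:.3f}")
print(f"ratio:                   {ml_recall / rule_recall:.1f}x")
```

Sources that a method selects but that are not in the known list (here `src09` and `src10`) do not affect recall; judging them would need a separate measure such as precision.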

“We have to find more active galaxies in the sky, because there are predictions that there should exist many more in the early history of the Universe. With the current observations we don’t have that number,” says Rodrigo Carvajal. According to this researcher, more observations are needed to verify whether the current understanding of how active galaxies evolve is correct or has to be modified.

“It’s also important to analyze the machine learning models themselves and to understand what’s happening inside them,” Carvajal adds. “Which features are the most relevant to the decision? For example, we want to know if the most important feature for the model to have stated that it is an active galaxy is the light the galaxy emits in the infrared, possibly an indication of rapid formation of new stars. With this, we are able to produce a new law to distinguish between a normal galaxy and an active galaxy.”
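One common way to ask which features matter most to a trained model is permutation importance: shuffle one feature at a time in a held-out sample and see how much the accuracy drops. The sketch below is a minimal, self-contained illustration on synthetic data, not the authors' method. The feature names, the toy catalogue (in which the label is deliberately driven mostly by the infrared feature) and the simple logistic classifier are all assumptions made for the example.

```python
import math
import random

# Hypothetical feature names, chosen only for this illustration.
FEATURES = ["infrared_colour", "optical_colour", "unrelated_noise"]


def make_synthetic_catalogue(n, rng):
    """Toy data: the 'active' label is driven mostly by the infrared feature (an assumption)."""
    rows, labels = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in FEATURES]
        score = 1.0 * row[0] + 0.3 * row[1] + rng.gauss(0, 0.2)
        rows.append(row)
        labels.append(1 if score > 0 else 0)
    return rows, labels


def train_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit a logistic classifier with plain batch gradient descent."""
    w, b, n = [0.0] * len(FEATURES), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, rows, labels):
    """Fraction of objects whose predicted class matches the label."""
    w, b = model
    hits = 0
    for x, y in zip(rows, labels):
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        hits += int((z > 0) == (y == 1))
    return hits / len(rows)


def permutation_importance(model, rows, labels, rng):
    """Accuracy drop when each feature column is shuffled in turn."""
    baseline = accuracy(model, rows, labels)
    drops = {}
    for j, name in enumerate(FEATURES):
        column = [x[j] for x in rows]
        rng.shuffle(column)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, column)]
        drops[name] = baseline - accuracy(model, shuffled, labels)
    return drops


rng = random.Random(0)  # fixed seed so the example is repeatable
train_rows, train_labels = make_synthetic_catalogue(500, rng)
test_rows, test_labels = make_synthetic_catalogue(500, rng)

model = train_logistic(train_rows, train_labels)
importance = permutation_importance(model, test_rows, test_labels, rng)

for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: accuracy drop {drop:.3f}")
```

Because the toy labels were built to depend mainly on `infrared_colour`, shuffling that column hurts the classifier most. On real catalogues the ranking is what one would want to learn, and correlated features can make it harder to interpret.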

The relative weight of each galaxy feature in the decision taken by the computer may point to what is at the origin of the galaxy’s intense activity, in particular in the radio band. In a study in preparation, Carvajal is exploring the implications of this apparent dependency between the radio emission and the formation of stars. Israel Matute, of IA and Ciências ULisboa, the second author of the paper, clarifies: “These models are mathematical tools that help us to look in the right direction when the complexity of the data increases. This work might provide insights into the processes that curbed the formation of new stars in the second half of the history of the Universe.”

The galaxies that seem to be lacking in the primordial Universe may be in the large mass of data that modern radio telescopes will produce in the coming years. Future surveys of extensive regions of the sky will reveal billions of galaxies. One example is the Evolutionary Map of the Universe (EMU), which will map the whole southern celestial hemisphere with the ASKAP radio telescope, in Australia. The team led by IA is already working with data from a pilot project of this survey. Once perfected, these tools will be crucial for processing the astronomical amount of data the future Square Kilometre Array Observatory (SKAO) will produce. Portugal is a member of the consortium of this observatory, which is already under construction.

“In a new age when astronomy will have access to vast amounts of data, the development of advanced techniques for their processing and analysis is increasingly important,” says José Afonso, of IA and Ciências ULisboa and co-author of this paper. “At IA we are developing and implementing these techniques, to be able to decipher the origin of galaxies and the supermassive black holes that most of them host.”

The idea for the collaboration between Closer and IA was put forward by one of the co-authors, Helena Cruz, who holds a PhD in Physics and is a data scientist at Closer. Her involvement was key to analysing and handling the impact of uncertainties and inconsistencies between the different data sources – coming from several telescopes and observation programmes – used to train the machine learning algorithm.

“I became aware that Astronomy is a field with great opportunities for the exploration and development of machine learning models, and it made sense to me to apply my professional skills to this field,” says Helena Cruz. “I shared my interest with Closer and both parties immediately showed their willingness to collaborate, which I see as an extension of my work at the company.”

“Closer thrives on the knowledge of its collaborators; this is its capital,” adds João Pires da Cruz, Closer co-founder, professor and researcher. “The more scientifically challenging and sophisticated the projects in which our team members get involved, the greater the company’s capital will be. We will have collaborators able to solve the problems of our clients that are similar to the problem of the signals from distant galaxies.”


  1. The article “Selection of powerful radio galaxies with machine learning”, by R. Carvajal et al., was published today in the journal Astronomy & Astrophysics, Vol. 679 (DOI:
  2. The Instituto de Astrofísica e Ciências do Espaço (Institute of Astrophysics and Space Sciences – IA) is the reference Portuguese research unit in this field, integrating researchers from the University of Lisbon and the University of Porto, and encompasses most of the field’s national scientific output. It was evaluated as Excellent in the last evaluation of research and development units undertaken by Fundação para a Ciência e Tecnologia (FCT). IA's activity is funded by national and international funds, including FCT/MCES (UIDB/04434/2020 and UIDP/04434/2020).
  3. Machine learning is a domain of computer science, more specifically of artificial intelligence, which develops software capable of acquiring knowledge and of making autonomous decisions without explicit human instructions. One of the applications of this domain is the development of algorithms capable of analysing and identifying regularities in data and of estimating the probabilities of events in future data. These algorithms do not receive direct deterministic instructions, but are based on statistical mathematical models and on exposure to large amounts of data.
  4. Active galaxies are unusually luminous, especially in their nucleus, and at all wavelengths of the electromagnetic spectrum. There is strong evidence that the source of this intense emission of light, from radio to X-rays, is a rapidly rotating and extremely hot disc of material around a black hole with millions of times the mass of the Sun, which is actively attracting and absorbing matter. The galaxy is said to have an active galactic nucleus.
  5. Radio galaxies are a type of galaxy with an active nucleus that emits powerful jets of material, which shine at radio frequencies.
  6. The machine-learning algorithm was trained with active galaxies already identified in a set of highly sensitive, high-resolution data obtained with the LOFAR radio telescope, in the Netherlands, as well as with data in visible light from the Pan-STARRS telescope, in Hawaii, and in the infrared from NASA's WISE space telescope. Its effectiveness was then tested in another region of the sky, a stretch for which there are visible and infrared data from the Sloan Digital Sky Survey (SDSS) and radio data from the Very Large Array (VLA) radio telescope.
