News Release

Brick by brick: Making AI-based financial portfolio management modular and scalable

Researchers from the University of Tsukuba have developed a deep reinforcement learning-based framework that enables portfolio managers to reallocate multiple portfolios with a large volume of assets at scale

Peer-Reviewed Publication

University of Tsukuba

Tsukuba, Japan – The ability to predict movements in the stock market can be an extremely lucrative skill. For portfolio managers, who reallocate capital across the multiple assets of a portfolio, predicting price trends helps them maximize capital returns.

Many approaches to price prediction have been taken over the years, and the formulas and patterns that make up technical analysis are now being replaced by deep learning-based methods, especially those based on a type of learning called deep reinforcement learning. However, existing reinforcement learning-based portfolio management systems tend to have a fixed architecture and lack a modular design, so they cannot be extended with additional reinforcement learning agents or applied to multiple portfolios. Moreover, they can handle only a limited number of assets or types of market information.

In a recent paper published in PLOS ONE, researchers from the University of Tsukuba describe a deep reinforcement learning-based framework for portfolio management that overcomes these problems. “By building this framework with a modular design,” says Zhenhan Huang, lead author of the paper, “systems targeting different portfolios can share and be built with pre-trained modules, just like assembling LEGO bricks, in different configurations.”

The proposed system consists of evolving agent modules, one for each asset, and strategic agent modules, one for each portfolio. An evolving agent module uses a deep Q-network to predict price trends based on historical prices and web news sentiment. A strategic agent module uses a proximal policy optimization agent to reallocate assets according to the information generated by the evolving agent modules.

“Separating the tasks of predicting trends and making strategic decisions has several advantages,” Huang says. The evolving agent module only needs to be trained once for an asset like Alphabet Inc. before it can be used (and reused) for any portfolio that includes that asset. Moreover, the scalability of the system allows new assets with heterogeneous data or different reinforcement learning agents to be added to existing portfolios without retraining the whole system. The modules in the system can also be run in parallel, increasing efficiency and scalability.
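
The division of labor described above, with one evolving agent module per asset and one strategic agent module per portfolio, can be pictured with a small Python sketch. This is an illustration only, not the researchers' implementation: the class names, the momentum-plus-sentiment heuristic standing in for the deep Q-network, the softmax standing in for proximal policy optimization, and all input numbers are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class EvolvingAgentModule:
    """Hypothetical per-asset trend predictor (a deep Q-network in the release)."""
    asset: str

    def signal(self, prices: list[float], sentiment: float) -> float:
        # Placeholder heuristic, NOT a DQN: price change over the window plus news sentiment.
        momentum = prices[-1] / prices[0] - 1.0
        return momentum + sentiment


@dataclass
class StrategicAgentModule:
    """Hypothetical per-portfolio allocator (a PPO agent in the release)."""
    eams: list[EvolvingAgentModule]

    def reallocate(self, market: dict[str, tuple[list[float], float]]) -> dict[str, float]:
        # Placeholder policy, NOT PPO: softmax over the signals from the asset modules.
        signals = {e.asset: e.signal(*market[e.asset]) for e in self.eams}
        peak = max(signals.values())  # subtract the max for numerical stability
        exps = {a: math.exp(s - peak) for a, s in signals.items()}
        total = sum(exps.values())
        return {a: v / total for a, v in exps.items()}


# Made-up inputs: (price history, sentiment score) per ticker.
market = {
    "GOOGL": ([100.0, 102.0, 105.0], 0.2),
    "AAPL": ([150.0, 149.0, 151.0], -0.1),
    "AMZN": ([90.0, 91.0, 90.5], 0.0),
}

googl = EvolvingAgentModule("GOOGL")  # built once ...
portfolio_a = StrategicAgentModule([googl, EvolvingAgentModule("AAPL")])
portfolio_b = StrategicAgentModule([googl, EvolvingAgentModule("AMZN")])  # ... and reused here

weights_a = portfolio_a.reallocate(market)
weights_b = portfolio_b.reallocate(market)
```

The point of the sketch is the wiring: the same `googl` object serves both portfolios, and adding an asset to a portfolio means adding one more module to the list rather than rebuilding the others.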

The researchers compared the proposed system with several conventional portfolio management strategies and one cutting-edge reinforcement learning-based method. They found that the system performed best with respect to performance metrics such as the accumulated rate of return and daily rate of return, even under the extreme conditions of the US stock market during the global pandemic in 2020.
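
For readers unfamiliar with the two metrics named above, the following Python sketch shows how they are commonly computed from a series of portfolio values. The definitions are the usual textbook ones and may differ in detail from those used in the paper; the function names and the sample values are made up for illustration.

```python
def daily_returns(values: list[float]) -> list[float]:
    """Day-over-day rate of return for a series of portfolio values."""
    return [curr / prev - 1.0 for prev, curr in zip(values, values[1:])]


def accumulated_return(values: list[float]) -> float:
    """Total rate of return from the first value to the last."""
    return values[-1] / values[0] - 1.0


values = [100.0, 110.0, 99.0, 108.9]  # made-up portfolio values on four days
rets = daily_returns(values)          # three day-over-day returns
total = accumulated_return(values)    # overall gain relative to the first day

# Compounding the daily returns reproduces the accumulated return.
compounded = 1.0
for r in rets:
    compounded *= 1.0 + r
```

Under these definitions the two metrics are linked: multiplying together one plus each daily return, then subtracting one, gives the accumulated rate of return.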

The modularity of the proposed system opens up exciting opportunities for its further development. The researchers used the deep Q-network and proximal policy optimization in the current implementation, but plan to implement other algorithms. They also plan to use other, unconventional sources of data, such as satellite images, to predict asset price trends.


The paper, “MSPM: A modularized and scalable multi-agent reinforcement learning-based system for financial portfolio management,” is available from PLOS ONE with DOI: 10.1371/journal.pone.0263689
