# Scientists study a distributed satellite cluster laser networking algorithm with a double-layer Markov DRL architecture

Beijing Institute of Technology Press Co., Ltd

With the development of satellite networks, space-air-ground integrated networks, and the Internet of Things, future giant constellations, high-resolution earth observation, human-crewed spacecraft, space stations, and other space-based information systems place an increasingly urgent demand on large-capacity space networking and information transmission. The space distributed satellite cluster (DSC) overcomes the resource constraints and technical bottlenecks of single-satellite platforms by having multiple heterogeneous satellites in the same orbit cooperate through distributed payloads. This enables large-capacity, high-speed networking and information transmission and exchange in space, providing an effective solution to these needs.

The rapidly changing relative positions of multiple satellites in the same orbit, together with the visibility constraints of satellite-borne optical phased-array antennas, make the DSC topology dynamically time-varying and cause intermittent link interruptions. Rapid topology reconstruction and dynamic continuous networking must therefore be achieved under both conditions. To address these problems, in a research paper recently published in *Space: Science & Technology*, scholars from the School of Systems Science and Engineering, Sun Yat-Sen University, and the Institute of Systems Engineering, AMS, developed a multi-objective optimization model for DSC laser networking and proposed a DSC laser networking algorithm based on a double-layer Markov DRL architecture. The algorithm achieves rapid topology reconstruction and dynamic continuous networking under dynamic time-varying topology and intermittent link interruption, maximizing network connectivity and network duration while minimizing the network connection matrix perturbation.

First, the authors give the system model and problem description. It is assumed that the DSC consists of N GEO satellite nodes. Each satellite in the DSC carries two pairs of optical multibeam antennas, located on the north and south sides of the satellite, respectively. When antenna k of satellite i and antenna l of satellite j are mutually visible and meet the bit error rate constraint, an available link is considered to exist between them. By analyzing all satellite nodes, the available links of the whole DSC can be obtained and denoted as a matrix $\mathbf{L}_{ink}$ with elements 0 or 1. From the matrix $\mathbf{L}_{ink}$, the connectivity matrix $\mathbf{A}_{nt}$ of the antennas carried by each satellite can be obtained, and from it the connection matrix $\mathbf{T}_{p}$ of the entire DSC. In the DSC networking process, with network connectivity, network duration, and network connection matrix perturbation as objectives, a multi-objective optimization model for network topology reconstruction and continuous networking is constructed. The computational complexity of this multi-objective optimization problem is $O(2^{N_{sat} N_{ant}})$. It is a mixed-integer programming problem, which is a typical NP-hard problem.

Then, the authors propose DLM-DRL, a deep reinforcement learning algorithm based on a double-layer Markov decision model, to solve the problem. The optimization process continuously tracks the operational status of the DSC to obtain the position of each satellite and the status of the laser links, calculates the available links of the whole DSC, and checks whether the DSC network is connected. If it is, the system continues to track the operating status of the DSC; otherwise, the DLM-DRL algorithm is called to rebuild the laser links between satellites, and the DSC network is reconstructed according to the algorithm's result. In the DLM-DRL algorithm, the topology change events of the DSC network are modeled as decision nodes, and the overall topology optimization process across multiple topology change events is modeled as a Markov decision process. The optimization decision for each topology change event consists of a series of laser link selection actions, which can also be described by a Markov decision process. Therefore, for the topology optimization process of the DSC, a double-layer Markov decision model with inner and outer Markov decision processes is established, as shown in Figure 3. The inner layer is the selection process of available laser links in the DSC, where each state represents whether or not to connect a laser link. The outer layer covers the different network topology change events in the DSC, where each event takes the result of the inner-layer Markov decision process as its action and optimizes it. Based on this double-layer Markov process model, a hierarchical deep reinforcement learning architecture is proposed, as shown in Figure 4.
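The tracking loop described above (compute the available links, check connectivity, rebuild only when the network is disconnected) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the predicates `visible` and `ber_ok`, the tuple representation of a link, and the `rebuild` callable that stands in for DLM-DRL are all hypothetical names introduced here for illustration.

```python
from collections import deque


def available_links(n_sat, n_ant, visible, ber_ok):
    """Return the usable antenna-level links as tuples (i, k, j, l).

    visible(i, k, j, l) and ber_ok(i, k, j, l) are hypothetical predicates
    supplied by the caller. A link between antenna k of satellite i and
    antenna l of satellite j is available only if both predicates hold.
    """
    links = set()
    for i in range(n_sat):
        for j in range(i + 1, n_sat):
            for k in range(n_ant):
                for l in range(n_ant):
                    if visible(i, k, j, l) and ber_ok(i, k, j, l):
                        links.add((i, k, j, l))
    return links


def is_connected(n_sat, active_links):
    """Breadth-first search over satellites joined by active links."""
    if n_sat == 0:
        return True
    neighbours = {s: set() for s in range(n_sat)}
    for i, _k, j, _l in active_links:
        neighbours[i].add(j)
        neighbours[j].add(i)
    seen, queue = {0}, deque([0])
    while queue:
        s = queue.popleft()
        for t in neighbours[s] - seen:
            seen.add(t)
            queue.append(t)
    return len(seen) == n_sat


def monitor_step(n_sat, active_links, usable, rebuild):
    """One pass of the tracking loop.

    Links that are no longer usable are dropped. If the remaining topology
    is still connected it is kept; otherwise rebuild(n_sat, usable), a
    placeholder for the DLM-DRL call, chooses a new set of links.
    """
    still_active = active_links & usable
    if is_connected(n_sat, still_active):
        return still_active
    return rebuild(n_sat, usable)
```

In this sketch the reconstruction step is triggered only by a loss of connectivity, which mirrors the event-driven flow in the text; how the new links are actually chosen is left entirely to the `rebuild` placeholder.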

Finally, the authors simulate DLM-DRL in a typical DSC application scenario and summarize the simulation results. The simulations are divided into two parts: the first builds a space environment in STK 11.2 to simulate the operation of the DSC, and the second trains and verifies the DLM-DRL algorithm in that environment. The results show that the proposed DLM-DRL algorithm converges in a relatively short time. In terms of optimization results, the algorithm completes network topology reconstruction quickly and efficiently and ensures the connectivity of DSC networks with dynamic time-varying topology and intermittent link outages throughout the simulation cycle. Meanwhile, by setting different optimization task objectives, the DLM-DRL algorithm can provide optimization results that favor different goals, such as higher connectivity, fewer topology changes, or longer topology maintenance time, to meet different distributed constellation networking requirements.
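To make the idea of steering the result by "setting different optimization task objectives" more concrete, the following Python sketch scores a candidate topology against the three objectives named in the release. Everything specific here is an assumption made for illustration, since the release does not give the definitions: connectivity is taken as the fraction of satellite pairs joined by some path, perturbation as the number of satellite pairs whose link state changes, and the objectives are combined by a weighted sum with hypothetical weights `w_conn`, `w_dur`, and `w_pert`.

```python
def reachable_pair_fraction(adj):
    """Fraction of ordered satellite pairs (i, j), i != j, joined by a path.

    adj is a symmetric 0/1 adjacency matrix given as a list of lists.
    This definition of connectivity is an assumption for illustration.
    """
    n = len(adj)
    if n < 2:
        return 1.0
    reachable = 0
    for start in range(n):
        seen = {start}
        stack = [start]
        while stack:
            s = stack.pop()
            for t in range(n):
                if adj[s][t] and t not in seen:
                    seen.add(t)
                    stack.append(t)
        reachable += len(seen) - 1
    return reachable / (n * (n - 1))


def perturbation(prev_adj, new_adj):
    """Number of satellite pairs whose link state differs (assumed metric)."""
    n = len(prev_adj)
    return sum(
        prev_adj[i][j] != new_adj[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )


def score(prev_adj, new_adj, expected_duration,
          w_conn=1.0, w_dur=1.0, w_pert=1.0):
    """Weighted-sum score: reward connectivity and duration, penalize change.

    expected_duration is a caller-supplied estimate of how long the new
    topology can be maintained; the weighted sum is a hypothetical choice.
    """
    return (
        w_conn * reachable_pair_fraction(new_adj)
        + w_dur * expected_duration
        - w_pert * perturbation(prev_adj, new_adj)
    )
```

Raising `w_pert` favors candidates that change few links, while raising `w_conn` or `w_dur` favors better-connected or longer-lived topologies, which corresponds to the different task objectives mentioned above.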

In addition, a comparison of the DLM-DRL algorithm with the NSGA-II and PSO algorithms shows that, while achieving the same optimization results as NSGA-II and PSO, the DLM-DRL algorithm significantly shortens network topology optimization time and meets the DSC's requirements for rapid topology reconstruction and dynamic continuous networking.
