Figure 1. (IMAGE)
Caption
The architecture of FOCUS. Given offline data, FOCUS computes a matrix of $p$-values with the kernel-based conditional independence (KCI) test and obtains the causal structure by thresholding those $p$-values. After combining the learned causal structure with the neural network, FOCUS learns the policy with an offline model-based reinforcement learning (MBRL) algorithm.
Credit
Zhengmao ZHU, Honglong TIAN, Xionghui CHEN, Kun ZHANG, Yang YU
Usage Restrictions
none
License
Original content
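
The thresholding step described in the caption can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the $p$-value matrix has one row per input variable and one column per predicted variable, that an edge is kept when the $p$-value falls below the threshold (independence rejected), and that the structure is combined with the network by zeroing non-parent inputs. The function names `causal_mask` and `masked_inputs`, the example values, and the threshold of 0.05 are all hypothetical.

```python
import numpy as np


def causal_mask(p_values, threshold):
    """Boolean matrix that is True where independence is rejected (p < threshold).

    Assumption: entry (i, j) is the p-value for "input i is independent of
    target j", so a small p-value is read as a causal edge i -> j.
    """
    p_values = np.asarray(p_values, dtype=float)
    return p_values < threshold


def masked_inputs(x, mask):
    """Build one masked copy of x per target, zeroing inputs that are not parents.

    x has shape (n_inputs,), mask has shape (n_inputs, n_targets).
    The result has shape (n_targets, n_inputs).
    """
    x = np.asarray(x, dtype=float)
    return mask.T * x


# Hypothetical p-values for 3 input variables and 2 predicted variables.
p = np.array([
    [0.001, 0.40],
    [0.30, 0.002],
    [0.01, 0.03],
])
mask = causal_mask(p, threshold=0.05)

x = np.array([1.0, 2.0, 3.0])
inputs = masked_inputs(x, mask)
print(mask)
print(inputs)
```

Each row of `inputs` would be fed to the predictor for one target variable, so that the predictor only sees the variables the thresholded matrix marks as its parents.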