Training Control Policies with Multiple Nodes
Control tasks often require several nodes to be controlled together to achieve a goal. The Revive SDK supports training control policies that span multiple nodes through simple configuration.
For example, an autonomous driving task has to control the vehicle's steering direction and its power output at the same time. Both affect how the vehicle moves, so they need to be considered together. The Revive SDK allows both to be trained jointly as policy nodes.
The following example demonstrates how to enable this feature during training:
metadata:
  graph:
    action_1:
    - observation
    action_2:
    - action_1
    - observation
    next_observation:
    - action_1
    - action_2
    - observation
  columns:
  ...
The decision graph above describes a multi-node control task in which the action_1 node and the action_2 node work together to complete the control task. Note that action_2 takes action_1 as one of its inputs. During training, the -tpn parameter lists the policy nodes to train, separated by commas.
Training Command:
python train.py -df test.npz -cf test.yaml -rf test_reward.py -vm once -pm once -tpn action_1,action_2 --run_id test
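Before launching training, it can help to confirm that every node passed to -tpn is actually defined in the decision graph. The following is a minimal sketch in plain Python and is not part of the Revive SDK: the helper name check_policy_nodes is hypothetical, and the graph is copied by hand from the YAML above as a mapping from each node to its inputs. It uses the standard-library graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Decision graph from the YAML above: node -> list of its input nodes.
graph = {
    "action_1": ["observation"],
    "action_2": ["action_1", "observation"],
    "next_observation": ["action_1", "action_2", "observation"],
}


def check_policy_nodes(graph, policy_nodes):
    """Hypothetical helper: return policy_nodes in evaluation order.

    Raises ValueError if a requested node is not defined in the graph.
    """
    missing = [name for name in policy_nodes if name not in graph]
    if missing:
        raise ValueError(f"not defined as graph nodes: {missing}")
    # static_order() yields each node after all of its inputs.
    order = list(TopologicalSorter(graph).static_order())
    return [name for name in order if name in policy_nodes]


# Same comma-separated value as passed to -tpn in the training command.
tpn = "action_1,action_2"
ordered = check_policy_nodes(graph, tpn.split(","))
print(ordered)
```

The evaluation order reflects the dependency in the graph: action_1 is computed from observation, and action_2 then consumes both observation and action_1.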