Generate Decision Flow through Causal Inference
When we are unsure how to construct a decision flow from existing data, we can use the causal inference tool provided by the REVIVE SDK to build a decision flow *.yaml file automatically.
Causal inference methods also make it easy to check which dimensions of the offline data are useful for learning environment or policy models, and how the states and behaviors in those dimensions relate to each other.
The tool is purely data-driven. Given a .yaml file and an .npz file, it automatically generates a decision flow chart, which may help REVIVE learn virtual environments more accurately and infer more effective policies.
Note
Users do not need to define the graph section in the .yaml file; the REVIVE SDK generates it automatically using causal inference algorithms.
The generated decision flow may differ from what users expect, so users should check that it is reasonable before using REVIVE to learn environment models and policies.
Here, we show how to use the causal inference tool with the example introduced in Example of applying REVIVE to Lander Hover.
Generally, we construct the columns part of the .yaml file as follows:
    metadata:
      columns:
      - obs_0:
          dim: obs
          type: continuous
      - obs_1:
          dim: obs
          type: continuous
      - obs_2:
          dim: obs
          type: continuous
      - obs_3:
          dim: obs
          type: continuous
      - action:
          dim: action
          type: category
          values: [0,1,2,3]
This file does not have to provide the graph part. After we pass this .yaml file, together with the corresponding .npz file, to the causal inference tool, the tool outputs new .yaml and .npz files under names defined by the user.
Specifically, the .yaml file is rebuilt as follows:
    metadata:
      columns:
      - obs_0:
          dim: action_realated_obs
          type: continuous
      - obs_1:
          dim: action_realated_obs
          type: continuous
      - obs_2:
          dim: translation_related_obs
          type: continuous
      - obs_3:
          dim: translation_related_obs
          type: continuous
      - action:
          dim: action
          type: category
          values:
          - 0
          - 1
          - 2
          - 3
      graph:
        action:
        - action_realated_obs
        next_action_realated_obs:
        - action_realated_obs
        - translation_related_obs
        - action
        next_translation_related_obs:
        - action_realated_obs
        - translation_related_obs
        - action
In the generated .yaml file, the causal inference tool builds the graph part. In the columns part, the values of action are written as a multi-line list rather than the inline list [0,1,2,3] used in the original .yaml file. The two forms are equivalent in YAML.
The generated .yaml file shows that the causal inference tool split the four dimensions of obs into two groups. The first two dimensions are renamed action_realated_obs and are the only inputs of action in the graph. The other two dimensions are renamed translation_related_obs. Each group gets its own transition node in the graph (next_action_realated_obs and next_translation_related_obs), which can be understood as two environments for REVIVE to learn.
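To make the structure of the generated graph concrete, the sketch below mirrors it as a plain Python dictionary and separates the transition nodes from the remaining nodes. The dictionary is copied by hand from the generated file above; the helper name split_nodes is hypothetical and is not part of the REVIVE SDK, and treating the next_ prefix as the marker of a transition node is an assumption based on the names in that file.

```python
# Hand-written mirror of the `graph` section generated for Lander Hover.
# Keys are output nodes; values are the input nodes they depend on.
graph = {
    "action": ["action_realated_obs"],
    "next_action_realated_obs": [
        "action_realated_obs", "translation_related_obs", "action"],
    "next_translation_related_obs": [
        "action_realated_obs", "translation_related_obs", "action"],
}


def split_nodes(graph):
    """Split output nodes into transition nodes (`next_*`) and all others.

    Hypothetical helper: the `next_` prefix rule is an assumption taken
    from the node names in the generated file.
    """
    transition = sorted(n for n in graph if n.startswith("next_"))
    other = sorted(n for n in graph if not n.startswith("next_"))
    return transition, other


transition_nodes, other_nodes = split_nodes(graph)
print("transition nodes:", transition_nodes)
print("other nodes:", other_nodes)
print("inputs of action:", graph["action"])
```

Reading the result this way makes it easy to see that action depends on only one of the two observation groups, while both transition nodes depend on everything.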
The newly built graph reflects the real relationship between the first two dimensions of obs and action, for the following reason.
In this case, we used a rule model based on coordinate information to simulate the historical decision process and obtain the historical decision data. In other words, the data were collected by a rule model whose action depends only on the coordinates of the lander, which are the first two dimensions of obs.
The causal inference tool uncovers this relationship: it splits obs into two parts and connects only the first two dimensions of obs to action.
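The intuition can be illustrated with a toy experiment. The sketch below is not the algorithm used by the REVIVE SDK, and the rule in rule_policy is hypothetical (the actual Lander Hover rule model is not reproduced here). It generates data from a rule that looks only at the first two observation dimensions, then scores how strongly the action distribution changes with the sign of each dimension.

```python
import random


def rule_policy(obs):
    """Hypothetical rule: pick one of 4 actions from the signs of obs[0], obs[1]."""
    return int(obs[0] > 0) + 2 * int(obs[1] > 0)


def dependence_score(samples, dim):
    """Total-variation distance between the action distributions observed
    when obs[dim] > 0 and when obs[dim] <= 0 (0 = no change, 1 = disjoint)."""
    counts = {True: [0] * 4, False: [0] * 4}
    for obs, action in samples:
        counts[obs[dim] > 0][action] += 1
    pos_total = sum(counts[True])
    neg_total = sum(counts[False])
    return 0.5 * sum(
        abs(p / pos_total - n / neg_total)
        for p, n in zip(counts[True], counts[False])
    )


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = []
for _ in range(2000):
    obs = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    samples.append((obs, rule_policy(obs)))

scores = [dependence_score(samples, d) for d in range(4)]
for d, s in enumerate(scores):
    print(f"obs_{d}: dependence on action = {s:.2f}")
```

Under these assumptions, the first two dimensions score far higher than the last two, which is the kind of asymmetry a data-driven method can exploit to connect only some observation dimensions to action.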
For users who are unsure how to construct the decision flow for their experiment, the causal inference tool of REVIVE can be a good way to quickly generate a rough decision flow, but the result needs to be checked carefully.
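Part of that check can be automated. The following is a minimal sketch of a structural sanity check and is not a REVIVE SDK function: check_decision_flow is a hypothetical helper, and the rule that a node named next_x must correspond to a column group x is an assumption based on the generated files in this section. It only catches structural problems; whether the dependencies make sense for the task still requires human review.

```python
from graphlib import CycleError, TopologicalSorter


def check_decision_flow(column_dims, graph):
    """Return a list of structural problems found in a decision flow.

    column_dims: set of `dim` names used in the `columns` section.
    graph: dict mapping each output node to the list of its input nodes.
    """
    problems = []
    known = set(column_dims) | set(graph)
    for node, inputs in graph.items():
        for name in inputs:
            if name not in known:
                problems.append(f"{node}: unknown input '{name}'")
        # Assumption: a transition node `next_x` refers to a column group `x`.
        if node.startswith("next_"):
            base = node[len("next_"):]
            if base not in column_dims:
                problems.append(f"{node}: no column group named '{base}'")
    try:
        # static_order() raises CycleError if the graph is not acyclic.
        list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        problems.append(f"cycle detected: {err.args[1]}")
    return problems


# The Lander Hover graph generated above passes the check.
column_dims = {"action_realated_obs", "translation_related_obs", "action"}
graph = {
    "action": ["action_realated_obs"],
    "next_action_realated_obs": [
        "action_realated_obs", "translation_related_obs", "action"],
    "next_translation_related_obs": [
        "action_realated_obs", "translation_related_obs", "action"],
}
problems = check_decision_flow(column_dims, graph)
print(problems or "no structural problems found")

# A deliberately broken graph: two nodes that depend on each other.
bad_problems = check_decision_flow(
    {"obs", "action"}, {"action": ["obs"], "obs": ["action"]})
print(bad_problems)
```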
The causal inference tool can also identify useless dimensions in the data.
To show this, we ran the tool on the Pendulum example after appending two dimensions of unrelated random values to the obs data, so that obs contains two additional useless dimensions. The .yaml file output by the tool is shown below.
In the result, the appended dimensions obs_3 and obs_4 are flagged as useless_obs, which means these two dimensions are removed from obs.
This result meets our expectations.
    metadata:
      columns:
      - obs_0:
          dim: obs
          type: continuous
      - obs_1:
          dim: obs
          type: continuous
      - obs_2:
          dim: obs
          type: continuous
      - obs_3:
          dim: useless_obs
          type: continuous
      - obs_4:
          dim: useless_obs
          type: continuous
      - action:
          dim: action
          type: continuous
      graph:
        action:
        - obs
        next_obs:
        - obs
        - action
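One way to confirm that the flagged dimensions play no role in the decision flow is to group the columns by their dim value and compare the groups against the names that appear in the graph. The sketch below does this on a hand-written Python mirror of the Pendulum output above; the helper names group_columns and referenced_dims are hypothetical and are not part of the REVIVE SDK.

```python
from collections import defaultdict

# Hand-written mirror of the generated Pendulum `columns` and `graph` sections.
columns = [
    {"obs_0": {"dim": "obs", "type": "continuous"}},
    {"obs_1": {"dim": "obs", "type": "continuous"}},
    {"obs_2": {"dim": "obs", "type": "continuous"}},
    {"obs_3": {"dim": "useless_obs", "type": "continuous"}},
    {"obs_4": {"dim": "useless_obs", "type": "continuous"}},
    {"action": {"dim": "action", "type": "continuous"}},
]
graph = {"action": ["obs"], "next_obs": ["obs", "action"]}


def group_columns(columns):
    """Map each `dim` name to the ordered list of column names in it."""
    groups = defaultdict(list)
    for entry in columns:
        for name, spec in entry.items():
            groups[spec["dim"]].append(name)
    return dict(groups)


def referenced_dims(graph):
    """Collect every group name used in the graph, as an input or an output.

    Assumption: an output node `next_x` refers to the column group `x`.
    """
    names = set()
    for node, inputs in graph.items():
        names.add(node[len("next_"):] if node.startswith("next_") else node)
        names.update(inputs)
    return names


groups = group_columns(columns)
unused = sorted(set(groups) - referenced_dims(graph))
print("column groups:", groups)
print("groups not used by the graph:", unused)
```

Here the only group that never appears in the graph is useless_obs, which matches the description above: the two random dimensions are kept in the file but excluded from the decision flow.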