revive.conf
===========
base_config
~~~~~~~~~~~


global_seed
-----------

Set the random number seed for the experiment.

:type: ``int``

:abbreviation: ``gs``

:default: ``42``

:name: ``global_seed``
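
Every entry in this reference lists a ``type``, a ``default``, a ``name`` and, for most parameters, an ``abbreviation``. As a rough illustration of how such entries could be exposed as command-line options, the sketch below builds an ``argparse`` parser from a few of them. The ``ENTRIES`` list and ``build_parser`` helper are hypothetical and are not part of the SDK; the sketch assumes the abbreviation acts as a short flag and the name as a long flag.

```python
import argparse

# Hypothetical entries mirroring the fields documented in this reference.
ENTRIES = [
    {"name": "global_seed", "abbreviation": "gs", "type": int, "default": 42},
    {"name": "val_split_ratio", "abbreviation": "vsr", "type": float, "default": 0.5},
    {"name": "venv_metric", "type": str, "default": "mae"},  # no abbreviation listed
]


def build_parser(entries):
    """Create one option per entry: ``--<name>`` plus ``-<abbreviation>`` if present."""
    parser = argparse.ArgumentParser()
    for entry in entries:
        flags = ["--" + entry["name"]]
        if "abbreviation" in entry:
            flags.insert(0, "-" + entry["abbreviation"])
        parser.add_argument(*flags, type=entry["type"], default=entry["default"])
    return parser


# Override the seed by abbreviation and the metric by full name.
args = build_parser(ENTRIES).parse_args(["-gs", "7", "--venv_metric", "mse"])
print(args.global_seed, args.val_split_ratio, args.venv_metric)
```

Parameters that are not overridden keep the documented default, as ``val_split_ratio`` does here.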


val_split_ratio
---------------

Ratio used to split off the validation dataset if one is not explicitly given.

:type: ``float``

:abbreviation: ``vsr``

:default: ``0.5``

:name: ``val_split_ratio``


val_split_mode
--------------

Mode for automatically splitting the training and validation datasets; choose from ``outside_traj`` and ``inside_traj``. ``outside_traj`` splits across trajectories, so each trajectory belongs to exactly one dataset. ``inside_traj`` splits within each trajectory: the earlier part goes to the training set and the later part to the validation set.

:type: ``str``

:abbreviation: ``vsm``

:default: ``outside_traj``

:name: ``val_split_mode``
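
The difference between the two modes can be sketched in a few lines of plain Python. This is an illustration only, not the SDK's implementation: the ``split_dataset`` helper is hypothetical, and it assumes that ``val_split_ratio`` is the fraction of data assigned to the validation set.

```python
import random


def split_dataset(trajectories, val_split_ratio=0.5,
                  val_split_mode="outside_traj", seed=42):
    """Illustrative train/validation split over a list of trajectories."""
    if val_split_mode == "outside_traj":
        # Whole trajectories are assigned to exactly one dataset.
        indices = list(range(len(trajectories)))
        random.Random(seed).shuffle(indices)
        val_idx = set(indices[:int(len(indices) * val_split_ratio)])
        train = [t for i, t in enumerate(trajectories) if i not in val_idx]
        val = [t for i, t in enumerate(trajectories) if i in val_idx]
    elif val_split_mode == "inside_traj":
        # Each trajectory is cut: earlier steps train, later steps validate.
        train, val = [], []
        for t in trajectories:
            cut = len(t) - int(len(t) * val_split_ratio)
            train.append(t[:cut])
            val.append(t[cut:])
    else:
        raise ValueError(f"unknown val_split_mode: {val_split_mode!r}")
    return train, val


trajs = [list(range(10)) for _ in range(4)]
train_out, val_out = split_dataset(trajs, 0.5, "outside_traj")
train_in, val_in = split_dataset(trajs, 0.5, "inside_traj")
```

With four trajectories of ten steps each, ``outside_traj`` yields two whole trajectories per dataset, while ``inside_traj`` keeps all four trajectories in both datasets but with five steps each.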


ignore_check
------------

Flag to skip data-related checks and force training.

:type: ``bool``

:abbreviation: ``igc``

:default: ``False``

:name: ``ignore_check``


data_workers
------------

Number of workers for the data loader. A larger value can speed up data loading but consumes more resources.

:type: ``int``

:abbreviation: ``dw``

:default: ``2``

:name: ``data_workers``


use_time_step_embed
-------------------

Flag to use a positional embedding for the time step.

:type: ``bool``

:abbreviation: ``utse``

:default: ``True``

:name: ``use_time_step_embed``


time_step_embed_size
--------------------

Embedding size of the positional embedding for the time step.

:type: ``int``

:abbreviation: ``tses``

:default: ``64``

:name: ``time_step_embed_size``


use_traj_id_embed
-----------------

Flag to use a binary embedding for the trajectory ID.

:type: ``bool``

:abbreviation: ``utie``

:default: ``True``

:name: ``use_traj_id_embed``


pre_horzion
-----------

How many steps of data in the configuration trajectory are used for preprocessing operations.

:type: ``int``

:abbreviation: ``ph``

:default: ``0``

:name: ``pre_horzion``


venv_rollout_horizon
--------------------

Length of the sampled trajectory; valid only if the algorithm works on sequential data.

:type: ``int``

:abbreviation: ``vrh``

:default: ``100``

:name: ``venv_rollout_horizon``


venv_gpus_per_worker
--------------------

Number of GPUs per worker in venv training. A value smaller than 1 launches multiple workers on the same GPU.

:type: ``float``

:abbreviation: ``vgpw``

:default: ``1.0``

:name: ``venv_gpus_per_worker``
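
As a back-of-the-envelope illustration of fractional values, the sketch below estimates how many workers would share one GPU for a given setting. The ``workers_per_gpu`` helper is hypothetical, and the arithmetic assumes that a worker's share simply divides one whole GPU; the scheduler actually used may pack workers differently.

```python
import math


def workers_per_gpu(gpus_per_worker):
    """Estimate how many workers fit on one GPU for a given per-worker share."""
    if gpus_per_worker <= 0:
        raise ValueError("gpus_per_worker must be positive")
    if gpus_per_worker >= 1:
        return 1
    # The small epsilon guards against floating-point results such as 3.9999999.
    return math.floor(1.0 / gpus_per_worker + 1e-9)


for share in (1.0, 0.5, 0.25):
    print(share, workers_per_gpu(share))
```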


venv_train_dataset_mode
-----------------------

Can be set to ``trajectory`` mode or ``transition`` mode.

:type: ``str``

:abbreviation: ``vtdm``

:default: ``transition``

:name: ``venv_train_dataset_mode``


venv_metric
-----------

Metric used to evaluate the trained venv; choose from ``nll``, ``mae``, ``mse``, and ``wdist``.

:type: ``str``

:default: ``mae``

:name: ``venv_metric``


venv_algo
---------

Algorithm used in venv training. There are currently three algorithms to choose from: ``bc``, ``revive_p``, and ``revive_f``.

:type: ``str``

:default: ``revive_p``

:name: ``venv_algo``


rollout_plt_frequency
---------------------

Number of steps between two plots of rollout data. 0 disables plotting.

:type: ``int``

:abbreviation: ``rpf``

:default: ``50``

:name: ``rollout_plt_frequency``


venv_save_frequency
-------------------

Interval, in epochs, at which the model is saved periodically. 0 disables periodic saving.

:type: ``int``

:abbreviation: ``vsp``

:default: ``0``

:name: ``venv_save_frequency``
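
The "every N, with 0 meaning disabled" convention used by the frequency parameters can be sketched as a small predicate. The ``is_due`` helper is hypothetical, and the sketch assumes that epochs are counted from 1; it is not taken from the SDK.

```python
def is_due(epoch, frequency):
    """Return True when a periodic action should run at this epoch."""
    if frequency <= 0:
        # A frequency of 0 disables the periodic action entirely.
        return False
    return epoch % frequency == 0


save_epochs = [e for e in range(1, 11) if is_due(e, 5)]
disabled_epochs = [e for e in range(1, 11) if is_due(e, 0)]
print(save_epochs, disabled_epochs)
```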


plt_response_curve
------------------

Whether to plot response curve at the end of venv training.

:type: ``bool``

:abbreviation: ``prc``

:default: ``False``

:name: ``plt_response_curve``


rollout_dataset_mode
--------------------

Select the rollout dataset. Supports ``train`` and ``validate``.

:type: ``str``

:default: ``validate``

:name: ``rollout_dataset_mode``


venv_val_freq
-------------

Interval, in epochs, at which the model is evaluated on the validation dataset.

:type: ``int``

:abbreviation: ``vvf``

:default: ``1``

:name: ``venv_val_freq``


policy_gpus_per_worker
----------------------

Number of GPUs per worker in policy training. A value smaller than 1 launches multiple workers on the same GPU.

:type: ``float``

:abbreviation: ``pgpw``

:default: ``1.0``

:name: ``policy_gpus_per_worker``


behavioral_policy_init
----------------------

Whether to use the learned behavioral policy as the initialization for policy training.

:type: ``bool``

:abbreviation: ``bpi``

:default: ``True``

:name: ``behavioral_policy_init``


policy_algo
-----------

Algorithm used in policy training. There are currently two algorithms to choose from: ``ppo`` and ``sac``.

:type: ``str``

:default: ``ppo``

:name: ``policy_algo``


test_horizon
------------

Rollout length of the venv test.

:type: ``int``

:abbreviation: ``th``

:default: ``100``

:name: ``test_horizon``


workers_per_trial
-----------------

Number of workers per trial. Set this greater than 1 only if every GPUs-per-worker setting is 1.0.

:type: ``int``

:abbreviation: ``wpt``

:default: ``1``

:name: ``workers_per_trial``
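
The constraint above can be expressed as a simple validation step. The sketch below is hypothetical and assumes that "every GPUs-per-worker setting" refers to both ``venv_gpus_per_worker`` and ``policy_gpus_per_worker``; the ``check_workers_per_trial`` helper is not part of the SDK.

```python
def check_workers_per_trial(workers_per_trial, venv_gpus_per_worker,
                            policy_gpus_per_worker):
    """Reject multi-worker trials unless each worker owns a full GPU."""
    if workers_per_trial < 1:
        raise ValueError("workers_per_trial must be at least 1")
    full_gpus = venv_gpus_per_worker == 1.0 and policy_gpus_per_worker == 1.0
    if workers_per_trial > 1 and not full_gpus:
        raise ValueError(
            "workers_per_trial > 1 requires both gpus_per_worker settings to be 1.0"
        )


check_workers_per_trial(1, 0.5, 0.5)  # fine: a single worker per trial
check_workers_per_trial(2, 1.0, 1.0)  # fine: each worker owns a full GPU
```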


train_venv_trials
-----------------

Total number of trials searched by the search algorithm in venv training.

:type: ``int``

:abbreviation: ``tvt``

:default: ``25``

:name: ``train_venv_trials``


train_policy_trials
-------------------

Total number of trials searched by the search algorithm in policy training.

:type: ``int``

:abbreviation: ``tpt``

:default: ``10``

:name: ``train_policy_trials``

venv_algo_config
~~~~~~~~~~~~~~~~
revive_p
--------


bc_batch_size
*************

:type: ``int``

:abbreviation: ``bbs``

:default: ``256``

:name: ``bc_batch_size``


bc_epoch
********

:type: ``int``

:abbreviation: ``bep``

:default: ``0``

:name: ``bc_epoch``


revive_batch_size
*****************

Batch size of training process.

:type: ``int``

:abbreviation: ``mbs``

:default: ``1024``

:name: ``revive_batch_size``


revive_epoch
************

Number of epochs for the training process.

:type: ``int``

:abbreviation: ``mep``

:default: ``1000``

:name: ``revive_epoch``


fintune
*******

:type: ``int``

:abbreviation: ``bet``

:default: ``1``

:name: ``fintune``


finetune_fre
************

:type: ``int``

:abbreviation: ``betfre``

:default: ``1``

:name: ``finetune_fre``


policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``

:abbreviation: ``phf``

:default: ``256``

:name: ``policy_hidden_features``


policy_hidden_layers
********************

Depth of policy network.

:type: ``int``

:abbreviation: ``phl``

:default: ``4``

:name: ``policy_hidden_layers``


policy_backbone
***************

Backbone of the policy network. Choose from ``mlp``, ``res``, ``ft_transformer``, ``lstm``, and ``gru``.

:type: ``str``

:abbreviation: ``pb``

:default: ``res``

:name: ``policy_backbone``


transition_hidden_features
**************************

Number of neurons per layer of the transition network.

:type: ``int``

:abbreviation: ``thf``

:default: ``256``

:name: ``transition_hidden_features``


transition_hidden_layers
************************

:type: ``int``

:abbreviation: ``thl``

:default: ``4``

:name: ``transition_hidden_layers``


transition_backbone
*******************

Backbone of the transition network. Choose from ``mlp``, ``res``, ``ft_transformer``, ``lstm``, and ``gru``.

:type: ``str``

:abbreviation: ``tb``

:default: ``res``

:name: ``transition_backbone``


matcher_hidden_features
***********************

Number of neurons per layer of the matcher network.

:type: ``int``

:abbreviation: ``dhf``

:default: ``256``

:name: ``matcher_hidden_features``


matcher_hidden_layers
*********************

Depth of the matcher network.

:type: ``int``

:abbreviation: ``dhl``

:default: ``4``

:name: ``matcher_hidden_layers``


g_steps
*******

The number of update rounds of the generator in each epoch.

:type: ``int``

:default: ``1``

:name: ``g_steps``

:search_mode: ``grid``

:search_values: ``1``, ``3``, ``5``


d_steps
*******

Number of update rounds of matcher in each epoch.

:type: ``int``

:default: ``1``

:name: ``d_steps``

:search_mode: ``grid``

:search_values: ``1``, ``3``, ``5``


g_lr
****

Initial learning rate of the generator node networks.

:type: ``float``

:default: ``4e-05``

:name: ``g_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.0001``
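
Parameters that carry ``:search_mode:`` and ``:search_values:`` fields describe a search space for hyperparameter tuning. The sketch below shows one plausible reading of the two modes: ``grid`` enumerates the listed values, and ``continuous`` treats the two values as lower and upper bounds. The ``candidates`` helper is hypothetical, and uniform sampling is an assumption; the search algorithm actually used may sample differently (for example on a log scale).

```python
import random


def candidates(search_mode, search_values, n_samples=3, seed=42):
    """Return candidate values for one parameter of the search space."""
    if search_mode == "grid":
        # Every listed value is tried as-is.
        return list(search_values)
    if search_mode == "continuous":
        # The two values are interpreted as (low, high) bounds.
        low, high = search_values
        rng = random.Random(seed)
        return [rng.uniform(low, high) for _ in range(n_samples)]
    raise ValueError(f"unknown search_mode: {search_mode!r}")


g_steps_values = candidates("grid", [1, 3, 5])
g_lr_values = candidates("continuous", [1e-06, 0.0001])
print(g_steps_values, g_lr_values)
```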


d_lr
****

Initial learning rate of the matcher.

:type: ``float``

:default: ``0.0006``

:name: ``d_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.001``


bc_weight_decay
***************

Weight decay used in BC fine-tuning.

:type: ``float``

:default: ``0.0001``

:name: ``bc_weight_decay``

revive_f
--------


revive_batch_size
*****************

Batch size of training process.

:type: ``int``

:abbreviation: ``mbs``

:default: ``1024``

:name: ``revive_batch_size``


revive_epoch
************

Number of epochs for the MAIL training process.

:type: ``int``

:abbreviation: ``mep``

:default: ``1500``

:name: ``revive_epoch``


policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``

:abbreviation: ``phf``

:default: ``256``

:name: ``policy_hidden_features``


policy_hidden_layers
********************

Depth of policy network.

:type: ``int``

:abbreviation: ``phl``

:default: ``4``

:name: ``policy_hidden_layers``


policy_backbone
***************

Backbone of the policy network.

:type: ``str``

:abbreviation: ``pb``

:default: ``res``

:name: ``policy_backbone``


transition_hidden_features
**************************

Number of neurons per layer of the transition network.

:type: ``int``

:abbreviation: ``thf``

:default: ``256``

:name: ``transition_hidden_features``


transition_hidden_layers
************************

:type: ``int``

:abbreviation: ``thl``

:default: ``4``

:name: ``transition_hidden_layers``


transition_backbone
*******************

Backbone of the transition network.

:type: ``str``

:abbreviation: ``tb``

:default: ``res``

:name: ``transition_backbone``


matcher_hidden_features
***********************

Number of neurons per layer of the matcher network.

:type: ``int``

:abbreviation: ``dhf``

:default: ``256``

:name: ``matcher_hidden_features``


matcher_hidden_layers
*********************

Depth of the matcher network.

:type: ``int``

:abbreviation: ``dhl``

:default: ``4``

:name: ``matcher_hidden_layers``


g_steps
*******

The number of update rounds of the generator in each epoch.

:type: ``int``

:default: ``1``

:name: ``g_steps``

:search_mode: ``grid``

:search_values: ``1``, ``3``, ``5``


d_steps
*******

Number of update rounds of matcher in each epoch.

:type: ``int``

:default: ``1``

:name: ``d_steps``

:search_mode: ``grid``

:search_values: ``1``, ``3``, ``5``


g_lr
****

Initial learning rate of the generator node networks.

:type: ``float``

:default: ``4e-05``

:name: ``g_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.0001``


d_lr
****

Initial learning rate of the matcher.

:type: ``float``

:default: ``0.0006``

:name: ``d_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.001``

bc
--


bc_batch_size
*************

Batch size of training process.

:type: ``int``

:abbreviation: ``bbs``

:default: ``256``

:name: ``bc_batch_size``


bc_epoch
********

Number of epochs for the training process.

:type: ``int``

:abbreviation: ``bep``

:default: ``500``

:name: ``bc_epoch``


policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``

:abbreviation: ``phf``

:default: ``256``

:name: ``policy_hidden_features``


policy_hidden_layers
********************

Depth of policy network.

:type: ``int``

:abbreviation: ``phl``

:default: ``4``

:name: ``policy_hidden_layers``

:search_mode: ``grid``

:search_values: ``3``, ``4``, ``5``


policy_backbone
***************

Backbone of the policy network. Choose from ``mlp``, ``res``, ``ft_transformer``, ``lstm``, and ``gru``.

:type: ``str``

:abbreviation: ``pb``

:default: ``res``

:name: ``policy_backbone``


transition_hidden_features
**************************

:type: ``int``

:abbreviation: ``thf``

:default: ``256``

:name: ``transition_hidden_features``


transition_hidden_layers
************************

:type: ``int``

:abbreviation: ``thl``

:default: ``3``

:name: ``transition_hidden_layers``


transition_backbone
*******************

Backbone of the transition network. Choose from ``mlp``, ``res``, ``ft_transformer``, ``lstm``, and ``gru``.

:type: ``str``

:abbreviation: ``tb``

:default: ``res``

:name: ``transition_backbone``


g_lr
****

Initial learning rate of the training process.

:type: ``float``

:default: ``0.0001``

:name: ``g_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.001``


loss_type
*********

BC supports different loss functions: ``nll``, ``mae``, and ``mse``.

:name: ``loss_type``

:default: ``nll``

:type: ``str``
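
For orientation, the three loss names can be illustrated with textbook definitions in plain Python. This is a sketch under stated assumptions, not the SDK's code: the helper names are hypothetical, and the ``nll`` variant assumes a Gaussian prediction with a mean and a standard deviation per sample.

```python
import math


def mae(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def mse(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def gaussian_nll(mean, std, target):
    """Mean negative log-likelihood of the targets under N(mean, std^2)."""
    terms = [
        0.5 * math.log(2 * math.pi * s ** 2) + (t - m) ** 2 / (2 * s ** 2)
        for m, s, t in zip(mean, std, target)
    ]
    return sum(terms) / len(terms)


print(mae([1.0, 2.0], [1.0, 4.0]))
print(mse([1.0, 2.0], [1.0, 4.0]))
print(gaussian_nll([0.0], [1.0], [0.0]))
```

Unlike ``mae`` and ``mse``, the likelihood-based loss also penalizes a poorly calibrated predicted spread, which is why it needs a standard deviation in addition to the mean.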

policy_algo_config
~~~~~~~~~~~~~~~~~~
ppo
---


ppo_batch_size
**************

Batch size of training process.

:type: ``int``

:abbreviation: ``pbs``

:default: ``256``

:name: ``ppo_batch_size``


policy_bc_epoch
***************

Number of epochs used to pre-train the policy.

:type: ``int``

:default: ``0``

:name: ``policy_bc_epoch``


ppo_epoch
*********

Number of epochs for the training process.

:type: ``int``

:abbreviation: ``bep``

:default: ``1000``

:name: ``ppo_epoch``


ppo_rollout_horizon
*******************

Rollout length used in policy training.

:type: ``int``

:abbreviation: ``prh``

:default: ``100``

:name: ``ppo_rollout_horizon``


policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``

:abbreviation: ``phf``

:default: ``256``

:name: ``policy_hidden_features``


policy_hidden_layers
********************

Depth of policy network.

:type: ``int``

:abbreviation: ``phl``

:default: ``4``

:name: ``policy_hidden_layers``


policy_backbone
***************

Backbone of the policy network. Choose from ``mlp``, ``res``, and ``ft_transformer``.

:type: ``str``

:abbreviation: ``pb``

:default: ``res``

:name: ``policy_backbone``


g_lr
****

Initial learning rate of the training process.

:type: ``float``

:default: ``4e-05``

:name: ``g_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.001``

sac
---


sac_batch_size
**************

Batch size of training process.

:type: ``int``

:abbreviation: ``pbs``

:default: ``1024``

:name: ``sac_batch_size``


policy_bc_epoch
***************

Number of epochs used to pre-train the policy.

:type: ``int``

:default: ``0``

:name: ``policy_bc_epoch``


sac_epoch
*********

Number of epochs for the training process.

:type: ``int``

:abbreviation: ``bep``

:default: ``1000``

:name: ``sac_epoch``


sac_steps_per_epoch
*******************

Number of SAC update rounds in each epoch.

:type: ``int``

:abbreviation: ``sspe``

:default: ``200``

:name: ``sac_steps_per_epoch``


sac_rollout_horizon
*******************

:type: ``int``

:abbreviation: ``srh``

:default: ``20``

:name: ``sac_rollout_horizon``


policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``

:abbreviation: ``phf``

:default: ``256``

:name: ``policy_hidden_features``


policy_hidden_layers
********************

Depth of policy network.

:type: ``int``

:abbreviation: ``phl``

:default: ``4``

:name: ``policy_hidden_layers``


policy_backbone
***************

Backbone of the policy network. Choose from ``mlp``, ``res``, and ``ft_transformer``.

:type: ``str``

:abbreviation: ``pb``

:default: ``res``

:name: ``policy_backbone``


policy_hidden_activation
************************

Hidden activation function of the policy network.

:type: ``str``

:abbreviation: ``pha``

:default: ``leakyrelu``

:name: ``policy_hidden_activation``


buffer_size
***********

Size of the buffer to store data.

:type: ``int``

:abbreviation: ``bfs``

:default: ``1000000.0``

:name: ``buffer_size``


g_lr
****

Initial learning rate of the training process.

:type: ``float``

:default: ``4e-05``

:name: ``g_lr``

:search_mode: ``continuous``

:search_values: ``1e-06``, ``0.001``