revive.conf
===========

base_config
~~~~~~~~~~~

global_seed
-----------

Set the random number seed for the experiment.

:type: ``int``
:abbreviation: ``gs``
:default: ``42``
:name: ``global_seed``

val_split_ratio
---------------

Ratio used to split off the validation dataset if it is not explicitly given.

:type: ``float``
:abbreviation: ``vsr``
:default: ``0.5``
:name: ``val_split_ratio``

val_split_mode
--------------

Mode for automatically splitting the training and validation datasets; choose from `outside_traj` and `inside_traj`. `outside_traj` means the split happens outside the trajectories, so each trajectory belongs to exactly one dataset. `inside_traj` means the split happens inside the trajectories: the former part of each trajectory goes to the training set and the latter part goes to the validation set.

:type: ``str``
:abbreviation: ``vsm``
:default: ``outside_traj``
:name: ``val_split_mode``

ignore_check
------------

Flag to ignore data-related checks and force training.

:type: ``bool``
:abbreviation: ``igc``
:default: ``False``
:name: ``ignore_check``

data_workers
------------

Number of workers for the data loader. A larger value can accelerate data loading but consumes more resources.

:type: ``int``
:abbreviation: ``dw``
:default: ``2``
:name: ``data_workers``

use_time_step_embed
-------------------

Flag to use positional embedding for the time step.

:type: ``bool``
:abbreviation: ``utse``
:default: ``True``
:name: ``use_time_step_embed``

time_step_embed_size
--------------------

Embedding size of the positional embedding for the time step.

:type: ``int``
:abbreviation: ``tses``
:default: ``64``
:name: ``time_step_embed_size``

use_traj_id_embed
-----------------

Flag to use binary embedding for the trajectory id.

:type: ``bool``
:abbreviation: ``utie``
:default: ``True``
:name: ``use_traj_id_embed``

pre_horzion
-----------

Number of steps of data in the configuration trajectory that are used for preprocessing operations.

:type: ``int``
:abbreviation: ``ph``
:default: ``0``
:name: ``pre_horzion``

venv_rollout_horizon
--------------------

Length of the sampled trajectory. Valid only if the algorithm works on sequential data.

:type: ``int``
:abbreviation: ``vrh``
:default: ``100``
:name: ``venv_rollout_horizon``

venv_gpus_per_worker
--------------------

Number of GPUs per worker in venv training. A value smaller than 1 launches multiple workers on the same GPU.

:type: ``float``
:abbreviation: ``vgpw``
:default: ``1.0``
:name: ``venv_gpus_per_worker``

venv_train_dataset_mode
-----------------------

Can be set to `trajectory` mode or `transition` mode.

:type: ``str``
:abbreviation: ``vtdm``
:default: ``transition``
:name: ``venv_train_dataset_mode``

venv_metric
-----------

Metric used to evaluate the trained venv; choose from `nll`, `mae`, `mse`, `wdist`.

:type: ``str``
:default: ``mae``
:name: ``venv_metric``

venv_algo
---------

Algorithm used in venv training. There are currently three algorithms to choose from: `bc`, `revive_p`, and `revive_f`.

:type: ``str``
:default: ``revive_p``
:name: ``venv_algo``

rollout_plt_frequency
---------------------

Number of steps between two plots of rollout data. 0 disables plotting.

:type: ``int``
:abbreviation: ``rpf``
:default: ``50``
:name: ``rollout_plt_frequency``

venv_save_frequency
-------------------

Interval, in epochs, at which a model is saved periodically. 0 disables periodic saving.

:type: ``int``
:abbreviation: ``vsp``
:default: ``0``
:name: ``venv_save_frequency``

plt_response_curve
------------------

Whether to plot the response curve at the end of venv training.

:type: ``bool``
:abbreviation: ``prc``
:default: ``False``
:name: ``plt_response_curve``

rollout_dataset_mode
--------------------

Select the rollout dataset. Supports `train` and `validate`.

:type: ``str``
:default: ``validate``
:name: ``rollout_dataset_mode``

venv_val_freq
-------------

Interval, in epochs, at which the model is evaluated on the validation dataset.

:type: ``int``
:abbreviation: ``vvf``
:default: ``1``
:name: ``venv_val_freq``

policy_gpus_per_worker
----------------------

Number of GPUs per worker in policy training. A value smaller than 1 launches multiple workers on the same GPU.

:type: ``float``
:abbreviation: ``pgpw``
:default: ``1.0``
:name: ``policy_gpus_per_worker``

behavioral_policy_init
----------------------

Whether to use the learned behavioral policy as the initialization for policy training.

:type: ``bool``
:abbreviation: ``bpi``
:default: ``True``
:name: ``behavioral_policy_init``

policy_algo
-----------

Algorithm used in policy training. There are currently two algorithms to choose from: `ppo` and `sac`.

:type: ``str``
:default: ``ppo``
:name: ``policy_algo``

test_horizon
------------

Rollout length of the venv test.

:type: ``int``
:abbreviation: ``th``
:default: ``100``
:name: ``test_horizon``

workers_per_trial
-----------------

Number of workers per trial. Set it greater than 1 only if the GPUs-per-worker settings are all 1.0.

:type: ``int``
:abbreviation: ``wpt``
:default: ``1``
:name: ``workers_per_trial``

train_venv_trials
-----------------

Total number of trials searched by the search algorithm in venv training.

:type: ``int``
:abbreviation: ``tvt``
:default: ``25``
:name: ``train_venv_trials``

train_policy_trials
-------------------

Total number of trials searched by the search algorithm in policy training.

:type: ``int``
:abbreviation: ``tpt``
:default: ``10``
:name: ``train_policy_trials``

venv_algo_config
~~~~~~~~~~~~~~~~

revive_p
--------

bc_batch_size
*************

:type: ``int``
:abbreviation: ``bbs``
:default: ``256``
:name: ``bc_batch_size``

bc_epoch
********

:type: ``int``
:abbreviation: ``bep``
:default: ``0``
:name: ``bc_epoch``

revive_batch_size
*****************

Batch size of the training process.

:type: ``int``
:abbreviation: ``mbs``
:default: ``1024``
:name: ``revive_batch_size``

revive_epoch
************

Number of epochs for the training process.

:type: ``int``
:abbreviation: ``mep``
:default: ``1000``
:name: ``revive_epoch``

fintune
*******

:type: ``int``
:abbreviation: ``bet``
:default: ``1``
:name: ``fintune``

finetune_fre
************

:type: ``int``
:abbreviation: ``betfre``
:default: ``1``
:name: ``finetune_fre``

policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``
:abbreviation: ``phf``
:default: ``256``
:name: ``policy_hidden_features``

policy_hidden_layers
********************

Depth of the policy network.

:type: ``int``
:abbreviation: ``phl``
:default: ``4``
:name: ``policy_hidden_layers``

policy_backbone
***************

Backbone of the policy network. Supports selecting from [mlp, res, ft_transformer, lstm, gru].

:type: ``str``
:abbreviation: ``pb``
:default: ``res``
:name: ``policy_backbone``

transition_hidden_features
**************************

Number of neurons per layer of the transition network.

:type: ``int``
:abbreviation: ``thf``
:default: ``256``
:name: ``transition_hidden_features``

transition_hidden_layers
************************

:type: ``int``
:abbreviation: ``thl``
:default: ``4``
:name: ``transition_hidden_layers``

transition_backbone
*******************

Backbone of the transition network. Supports selecting from [mlp, res, ft_transformer, lstm, gru].

:type: ``str``
:abbreviation: ``tb``
:default: ``res``
:name: ``transition_backbone``

matcher_hidden_features
***********************

Number of neurons per layer of the matcher network.

:type: ``int``
:abbreviation: ``dhf``
:default: ``256``
:name: ``matcher_hidden_features``

matcher_hidden_layers
*********************

Depth of the matcher network.

:type: ``int``
:abbreviation: ``dhl``
:default: ``4``
:name: ``matcher_hidden_layers``

g_steps
*******

Number of update rounds of the generator in each epoch.

:type: ``int``
:default: ``1``
:name: ``g_steps``
:search_mode: ``grid``
:search_values: ``1``, ``3``, ``5``

d_steps
*******

Number of update rounds of the matcher in each epoch.

:type: ``int``
:default: ``1``
:name: ``d_steps``
:search_mode: ``grid``
:search_values: ``1``, ``3``, ``5``

g_lr
****

Initial learning rate of the generator node networks.

:type: ``float``
:default: ``4e-05``
:name: ``g_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.0001``

d_lr
****

Initial learning rate of the matcher.

:type: ``float``
:default: ``0.0006``
:name: ``d_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.001``

bc_weight_decay
***************

Weight decay used in bc finetuning.

:type: ``float``
:default: ``0.0001``
:name: ``bc_weight_decay``

revive_f
--------

revive_batch_size
*****************

Batch size of the training process.

:type: ``int``
:abbreviation: ``mbs``
:default: ``1024``
:name: ``revive_batch_size``

revive_epoch
************

Number of epochs for the MAIL training process.

:type: ``int``
:abbreviation: ``mep``
:default: ``1500``
:name: ``revive_epoch``

policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``
:abbreviation: ``phf``
:default: ``256``
:name: ``policy_hidden_features``

policy_hidden_layers
********************

Depth of the policy network.

:type: ``int``
:abbreviation: ``phl``
:default: ``4``
:name: ``policy_hidden_layers``

policy_backbone
***************

Backbone of the policy network.

:type: ``str``
:abbreviation: ``pb``
:default: ``res``
:name: ``policy_backbone``

transition_hidden_features
**************************

Number of neurons per layer of the transition network.

:type: ``int``
:abbreviation: ``thf``
:default: ``256``
:name: ``transition_hidden_features``

transition_hidden_layers
************************

:type: ``int``
:abbreviation: ``thl``
:default: ``4``
:name: ``transition_hidden_layers``

transition_backbone
*******************

Backbone of the transition network.

:type: ``str``
:abbreviation: ``tb``
:default: ``res``
:name: ``transition_backbone``

matcher_hidden_features
***********************

Number of neurons per layer of the matcher network.

:type: ``int``
:abbreviation: ``dhf``
:default: ``256``
:name: ``matcher_hidden_features``

matcher_hidden_layers
*********************

Depth of the matcher network.

:type: ``int``
:abbreviation: ``dhl``
:default: ``4``
:name: ``matcher_hidden_layers``

g_steps
*******

Number of update rounds of the generator in each epoch.

:type: ``int``
:default: ``1``
:name: ``g_steps``
:search_mode: ``grid``
:search_values: ``1``, ``3``, ``5``

d_steps
*******

Number of update rounds of the matcher in each epoch.

:type: ``int``
:default: ``1``
:name: ``d_steps``
:search_mode: ``grid``
:search_values: ``1``, ``3``, ``5``

g_lr
****

Initial learning rate of the generator node networks.

:type: ``float``
:default: ``4e-05``
:name: ``g_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.0001``

d_lr
****

Initial learning rate of the matcher.

:type: ``float``
:default: ``0.0006``
:name: ``d_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.001``

bc
--

bc_batch_size
*************

Batch size of the training process.

:type: ``int``
:abbreviation: ``bbs``
:default: ``256``
:name: ``bc_batch_size``

bc_epoch
********

Number of epochs for the training process.

:type: ``int``
:abbreviation: ``bep``
:default: ``500``
:name: ``bc_epoch``

policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``
:abbreviation: ``phf``
:default: ``256``
:name: ``policy_hidden_features``

policy_hidden_layers
********************

Depth of the policy network.

:type: ``int``
:abbreviation: ``phl``
:default: ``4``
:name: ``policy_hidden_layers``
:search_mode: ``grid``
:search_values: ``3``, ``4``, ``5``

policy_backbone
***************

Backbone of the policy network. Supports selecting from [mlp, res, ft_transformer, lstm, gru].

:type: ``str``
:abbreviation: ``pb``
:default: ``res``
:name: ``policy_backbone``

transition_hidden_features
**************************

:type: ``int``
:abbreviation: ``thf``
:default: ``256``
:name: ``transition_hidden_features``

transition_hidden_layers
************************

:type: ``int``
:abbreviation: ``thl``
:default: ``3``
:name: ``transition_hidden_layers``

transition_backbone
*******************

Backbone of the transition network. Supports selecting from [mlp, res, ft_transformer, lstm, gru].

:type: ``str``
:abbreviation: ``tb``
:default: ``res``
:name: ``transition_backbone``

g_lr
****

Initial learning rate of the training process.

:type: ``float``
:default: ``0.0001``
:name: ``g_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.001``

loss_type
*********

Loss function used by bc. Supported values are "nll", "mae", and "mse".

:name: ``loss_type``
:default: ``nll``
:type: ``str``

policy_algo_config
~~~~~~~~~~~~~~~~~~

ppo
---

ppo_batch_size
**************

Batch size of the training process.

:type: ``int``
:abbreviation: ``pbs``
:default: ``256``
:name: ``ppo_batch_size``

policy_bc_epoch
***************

Number of epochs used to pre-train the policy.

:type: ``int``
:default: ``0``
:name: ``policy_bc_epoch``

ppo_epoch
*********

Number of epochs for the training process.

:type: ``int``
:abbreviation: ``bep``
:default: ``1000``
:name: ``ppo_epoch``

ppo_rollout_horizon
*******************

Rollout length used in policy training.

:type: ``int``
:abbreviation: ``prh``
:default: ``100``
:name: ``ppo_rollout_horizon``

policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``
:abbreviation: ``phf``
:default: ``256``
:name: ``policy_hidden_features``

policy_hidden_layers
********************

Depth of the policy network.

:type: ``int``
:abbreviation: ``phl``
:default: ``4``
:name: ``policy_hidden_layers``

policy_backbone
***************

Backbone of the policy network. Supports selecting from [mlp, res, ft_transformer].

:type: ``str``
:abbreviation: ``pb``
:default: ``res``
:name: ``policy_backbone``

g_lr
****

Initial learning rate of the training process.

:type: ``float``
:default: ``4e-05``
:name: ``g_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.001``

sac
---

sac_batch_size
**************

Batch size of the training process.

:type: ``int``
:abbreviation: ``pbs``
:default: ``1024``
:name: ``sac_batch_size``

policy_bc_epoch
***************

Number of epochs used to pre-train the policy.

:type: ``int``
:default: ``0``
:name: ``policy_bc_epoch``

sac_epoch
*********

Number of epochs for the training process.

:type: ``int``
:abbreviation: ``bep``
:default: ``1000``
:name: ``sac_epoch``

sac_steps_per_epoch
*******************

Number of sac update rounds in each epoch.

:type: ``int``
:abbreviation: ``sspe``
:default: ``200``
:name: ``sac_steps_per_epoch``

sac_rollout_horizon
*******************

:type: ``int``
:abbreviation: ``srh``
:default: ``20``
:name: ``sac_rollout_horizon``

policy_hidden_features
**********************

Number of neurons per layer of the policy network.

:type: ``int``
:abbreviation: ``phf``
:default: ``256``
:name: ``policy_hidden_features``

policy_hidden_layers
********************

Depth of the policy network.

:type: ``int``
:abbreviation: ``phl``
:default: ``4``
:name: ``policy_hidden_layers``

policy_backbone
***************

Backbone of the policy network. Supports selecting from [mlp, res, ft_transformer].

:type: ``str``
:abbreviation: ``pb``
:default: ``res``
:name: ``policy_backbone``

policy_hidden_activation
************************

Hidden activation of the policy network.

:type: ``str``
:abbreviation: ``pha``
:default: ``leakyrelu``
:name: ``policy_hidden_activation``

buffer_size
***********

Size of the buffer used to store data.

:type: ``int``
:abbreviation: ``bfs``
:default: ``1000000.0``
:name: ``buffer_size``

g_lr
****

Initial learning rate of the training process.

:type: ``float``
:default: ``4e-05``
:name: ``g_lr``
:search_mode: ``continuous``
:search_values: ``1e-06``, ``0.001``
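
Several parameters in this reference carry ``:search_mode:`` and ``:search_values:`` fields. The sketch below is one possible reading of those fields, not the library's implementation: it assumes ``grid`` means choosing among the listed candidates and ``continuous`` means drawing uniformly between the two listed bounds (a log-uniform draw would be an equally plausible reading for learning rates). The ``SEARCH_SPACE`` dictionary layout and the helper names ``sample_value`` and ``sample_trial`` are hypothetical and exist only for this illustration.

```python
import random

# Hypothetical mirror of two entries from this reference. The dict layout is an
# assumption made for this sketch, not the library's internal format.
SEARCH_SPACE = {
    "g_steps": {"search_mode": "grid", "search_values": [1, 3, 5]},
    "g_lr": {"search_mode": "continuous", "search_values": [1e-06, 0.001]},
}


def sample_value(spec, rng):
    """Draw one candidate value for a single parameter."""
    values = spec["search_values"]
    if spec["search_mode"] == "grid":
        # grid: pick one of the enumerated candidates
        return rng.choice(values)
    if spec["search_mode"] == "continuous":
        # continuous: assumed here to be a uniform draw between two bounds
        low, high = values
        return rng.uniform(low, high)
    raise ValueError(f"unknown search_mode: {spec['search_mode']}")


def sample_trial(space, seed):
    """Build one trial configuration from the whole search space."""
    rng = random.Random(seed)
    return {name: sample_value(spec, rng) for name, spec in space.items()}


# Seeding with the documented default of global_seed keeps the draw repeatable.
trial = sample_trial(SEARCH_SPACE, seed=42)
print(trial)
```

Calling ``sample_trial`` once per trial with a different seed yields one candidate configuration each time, for example once for each of the ``train_venv_trials`` trials.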