What is REVIVE SDK?

REVIVE SDK is a data-driven Reinforcement Learning (RL) toolkit that focuses on solving offline RL problems. The toolkit learns optimal policies from historical data without the need for additional interactions with a real environment, thus automating the decision-making process.

REVIVE SDK turns historical data into a decision-making engine: it extracts policies from limited data and automates decisions in areas such as mechanical system control and energy efficiency improvement. Unlike RL toolkits built around online training, REVIVE SDK focuses on offline RL, so it learns directly from historical data and avoids the risks of trial-and-error training on a live system.

REVIVE SDK is a general-purpose toolkit that can be applied flexibly to a variety of task scenarios. It works in two stages: virtual environment training and policy training.

Virtual Environment Training (Venv Training): Uses historical data to build a virtual environment model that simulates the state transitions among the data components of a business scenario.

Policy Training: Optimizes policies using the virtual environment. RL methods train policies against the trained virtual environment to reach the desired control performance.
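The two stages can be illustrated with a minimal, self-contained sketch. It does not use the REVIVE SDK API: every name below (`collect_history`, `train_venv`, `train_policy`, and so on) is hypothetical, the "real" system is an assumed toy linear system, the venv is a least-squares fit rather than a neural network, and the policy search is a simple grid search rather than an RL algorithm. The point is only the data flow: logged data, then a learned environment model, then a policy optimized inside that model.

```python
import numpy as np

rng = np.random.default_rng(0)

def collect_history(n=500):
    """Hypothetical logged data from an unknown system s' = 0.9*s + 0.5*a + noise."""
    s = rng.uniform(-1.0, 1.0, size=n)
    a = rng.uniform(-1.0, 1.0, size=n)  # actions taken by the logging policy
    s_next = 0.9 * s + 0.5 * a + rng.normal(0.0, 0.01, size=n)
    return s, a, s_next

def train_venv(s, a, s_next):
    """Stage 1 (venv training): fit a transition model to the historical data."""
    features = np.column_stack([s, a])
    coef, *_ = np.linalg.lstsq(features, s_next, rcond=None)
    return coef  # estimated [state coefficient, action coefficient]

def venv_step(coef, s, a):
    """Predict the next state with the learned model (no real system involved)."""
    return coef[0] * s + coef[1] * a

def rollout_return(coef, gain, s0=1.0, horizon=20):
    """Total reward of the policy a = -gain * s, simulated inside the venv."""
    s, total = s0, 0.0
    for _ in range(horizon):
        a = float(np.clip(-gain * s, -1.0, 1.0))
        s = venv_step(coef, s, a)
        total += -(s ** 2) - 0.01 * a ** 2  # stay near zero, use little effort
    return total

def train_policy(coef):
    """Stage 2 (policy training): pick the best gain using only the venv."""
    candidates = np.linspace(0.0, 2.0, 41)
    returns = [rollout_return(coef, g) for g in candidates]
    return float(candidates[int(np.argmax(returns))])

s, a, s_next = collect_history()
coef = train_venv(s, a, s_next)
best_gain = train_policy(coef)
print("learned model coefficients:", coef)
print("selected policy gain:", best_gain)
```

The policy is never run against the true system during training; it only ever sees the model fitted from history, which is what makes the procedure offline.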

When using the REVIVE SDK, it is necessary to understand three core concepts: virtual environment, policy, and reward.

Virtual Environment (Venv): A virtual environment is a model of a real business scenario, typically a neural network trained on historical data. For example, in manufacturing, a virtual environment can simulate all the machines and material transportation processes on a production line.
Policy: A policy is the decision-making process an agent follows based on its observed state. The agent is the decision maker and should make different decisions in different situations to maximize the predefined reward. For example, in autonomous driving, the agent decides whether to turn or accelerate based on the current road conditions and traffic signals.
Reward: A reward measures how good or bad a policy's behavior is within a time step; a better policy receives a higher reward. For example, in a mechanical system, a good policy might be a control strategy that achieves the task goal with the lowest possible energy consumption, and so earns a higher reward.
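As a rough illustration of the reward concept, the mechanical-system example above can be written as a per-step function. This is a sketch under stated assumptions, not the SDK's reward interface: the function name `step_reward`, its arguments, and the weight `energy_weight` are all hypothetical.

```python
def step_reward(position_error: float, energy_used: float,
                energy_weight: float = 0.1) -> float:
    """Hypothetical per-step reward: penalize missing the goal and spending energy."""
    return -abs(position_error) - energy_weight * energy_used

# Two control strategies that both reach the goal in this step,
# but with different energy consumption (illustrative numbers).
efficient = step_reward(position_error=0.0, energy_used=1.0)
wasteful = step_reward(position_error=0.0, energy_used=5.0)
print(efficient > wasteful)
```

Both strategies achieve the task goal, but the one that consumes less energy receives the higher reward, so a policy trained to maximize this reward is pushed toward efficient control.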

The POLIXIR REVIVE SDK has a wide range of applications. Some examples:

Mechanical System Control: REVIVE SDK can learn the optimal strategy from historical data to achieve automated control of mechanical systems. For example, in the field of robotics, REVIVE SDK can train robots to move and avoid obstacles in different environments.
Energy Efficiency Improvement: REVIVE SDK can associate historical energy consumption data with environmental variables and propose optimal energy use strategies based on the analysis results. For example, in the field of building design, REVIVE SDK can help designers determine the best heating, ventilation, and lighting plans to improve energy efficiency.
Medical Diagnosis: REVIVE SDK can build a virtual environment model based on historical medical data to assist doctors in disease diagnosis and treatment plan development. For example, in cancer diagnosis, REVIVE SDK can train a model to detect tumors more accurately and propose the best treatment plan based on the detection results.
Logistics Management: REVIVE SDK can learn the optimal scheduling strategy from historical transportation data to optimize logistics processes. For example, in the field of air cargo transportation, REVIVE SDK can help airlines determine the best flight routes, altitudes, and speeds to improve cargo efficiency and reduce costs.

In summary, REVIVE SDK can be applied to various fields and can build corresponding virtual environment models and optimization strategies for different business scenarios.

REVIVE SDK ships with sample code and runnable cases, so you can try the SDK directly and see its functionality and application scenarios for yourself. The cases cover multiple fields, including gaming, smart homes, and industrial control.