Minor Project Synopsis

Design of Learning Algorithms based on Deep Learning Techniques for a class of Non-Linear Systems

Abstract
The control of non-linear systems is a significant challenge due to model parameter uncertainties and external disturbances. Learning algorithms, particularly Deep Reinforcement Learning (DRL), offer a promising data-driven solution. However, their application is often hindered by the manual, labour-intensive processes of designing reward functions and hand-crafting state features. This project addresses these challenges by proposing a framework to automate both processes. The core methodology involves first extracting salient features from raw system data using deep learning models. These features are then used to compute a reward function via Inverse Reinforcement Learning (IRL) from expert demonstrations. Finally, this learned reward function is used to train a robust DRL agent. The main objective is to ensure fast convergence and robust performance, with the goal of creating algorithms that are generalizable to different classes of non-linear systems.

Objective
The primary objective of this project is to design, implement, and validate a novel learning
algorithm that overcomes key limitations in current DRL-based control methods. The
specific goals are:

1. To develop an integrated framework that automates both feature extraction and reward function discovery for controlling non-linear systems.

2. To investigate deep learning techniques for unsupervised state representation learning, eliminating the need for manual feature engineering.

3. To employ Inverse Reinforcement Learning (IRL) to learn a reward function from expert demonstrations, circumventing the reward-engineering bottleneck.

4. To design the algorithm for online adaptation, allowing the agent to continuously improve its policy and reward model from new interactions.

5. To ensure the final control policy exhibits fast convergence, robust performance against uncertainties and disturbances, and the ability to generalize across different non-linear systems.

Project Description and Methodology


1. Introduction to the Problem
Real-world control systems are predominantly non-linear and are affected by model parameter uncertainties and external disturbances. Traditional control methods often struggle under these conditions. Learning algorithms, specifically Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL), provide a powerful paradigm for developing adaptive controllers that learn optimal behaviour directly from interaction with the environment.

2. Current Challenges and Proposed Approach
A major challenge in applying DRL is the need to manually specify a reward function and a set of features that effectively represent the system's state. This project aims to automate both. The proposed methodology is as follows:

• Automated Feature Extraction: The project will explore deep learning models, such as encoder-decoder architectures, to automatically extract a low-dimensional representation of salient features from high-dimensional state data (a minimal code sketch follows this list).

• Reward Function Discovery via IRL: Using these automatically extracted features, an Inverse Reinforcement Learning (IRL) algorithm will be used to compute a reward function that explains the behaviour observed in expert demonstrations (illustrated in the second sketch below).

• Policy Optimisation: The reward function learned through IRL will then be used to train a control policy with a suitable DRL algorithm.

• Online Adaptation: The framework will be extended to an online setting. As the agent interacts with the environment, new state samples will be added to a replay buffer, and the feature representation and reward function will be periodically re-computed, allowing the agent to continuously refine its policy. This is expected to improve sample efficiency and adaptability.
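To illustrate the feature-extraction step, the following is a minimal sketch of an encoder-decoder (autoencoder) model that compresses raw Cart-Pole states into a low-dimensional feature vector. The layer sizes, training loop, and choice of PyTorch are illustrative assumptions, not the project's final design.

# Minimal autoencoder sketch for the feature-extraction step.
# All sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateAutoencoder(nn.Module):
    """Encoder-decoder that maps raw states to low-dimensional features."""
    def __init__(self, state_dim: int = 4, feature_dim: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 16), nn.ReLU(),
            nn.Linear(16, feature_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(feature_dim, 16), nn.ReLU(),
            nn.Linear(16, state_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = StateAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
states = torch.randn(256, 4)  # placeholder for logged Cart-Pole states

# Train by reconstruction: the bottleneck is forced to retain salient features.
for epoch in range(100):
    loss = F.mse_loss(model(states), states)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    features = model.encoder(states)  # low-dimensional features passed to IRL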

The main novelty lies in creating a synergistic pipeline that automates both feature and
reward engineering, with a primary objective of achieving fast convergence and robust
performance.
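To make the pipeline concrete, the following is a minimal sketch of the reward-discovery and policy-optimisation steps, assuming a reward that is linear in the learned features, r(s) = w · phi(s), fitted by matching expert feature expectations (in the spirit of apprenticeship-learning IRL). The `encode` stand-in, the feature dimension, and the gymnasium wrapper are illustrative assumptions; the project's chosen IRL and DRL algorithms may differ.

# Sketch of reward discovery via feature matching, plus a wrapper that
# exposes the learned reward to a DRL algorithm. Illustrative only.
import gymnasium as gym
import numpy as np

def fit_linear_reward(expert_feats, policy_feats):
    """Feature matching: point the reward weights from the current
    policy's feature expectations towards the expert's."""
    w = expert_feats.mean(axis=0) - policy_feats.mean(axis=0)
    return w / (np.linalg.norm(w) + 1e-8)

class LearnedRewardWrapper(gym.Wrapper):
    """Replaces the native environment reward with the IRL reward w . phi(obs)."""
    def __init__(self, env, w, encode):
        super().__init__(env)
        self.w, self.encode = w, encode

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        reward = float(self.w @ self.encode(obs))  # learned reward signal
        return obs, reward, terminated, truncated, info

# Placeholder feature arrays; in the project these come from the trained
# encoder applied to expert demonstrations and to current-policy rollouts.
expert_feats = np.random.randn(200, 2) + 1.0
policy_feats = np.random.randn(200, 2)
w = fit_linear_reward(expert_feats, policy_feats)

encode = lambda obs: obs[:2]  # hypothetical stand-in for the trained encoder
env = LearnedRewardWrapper(gym.make("CartPole-v1"), w, encode)
# A DRL agent (e.g. DQN or PPO) would now be trained on `env`.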

3. Testing and Evaluation Procedure

The developed algorithms will be implemented and tested on the non-linear Cart-Pole control problem. Performance will be evaluated using metrics such as convergence speed and task success rate, with particular emphasis on fast convergence and robust performance. The robustness of the final controller will be assessed by introducing perturbations to the system's physical parameters, measuring the resulting degradation in performance, and comparing the results against existing methods (a minimal evaluation sketch follows).
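The following is a minimal sketch of the robustness test, assuming the gymnasium CartPole-v1 environment and a trained policy (replaced here by a random stand-in). The pole mass is scaled and the mean episode return is recorded; refreshing the environment's cached derived quantities by hand is an assumption about gymnasium's Cart-Pole internals.

# Sketch of the parameter-perturbation robustness test on Cart-Pole.
import gymnasium as gym
import numpy as np

def episode_return(env, policy, seed=0):
    """Runs one episode and returns the accumulated reward."""
    obs, _ = env.reset(seed=seed)
    total, done = 0.0, False
    while not done:
        obs, r, terminated, truncated, _ = env.step(policy(obs))
        total += r
        done = terminated or truncated
    return total

policy = lambda obs: np.random.randint(2)  # stand-in for the trained DRL agent

for scale in [0.5, 1.0, 1.5, 2.0]:
    env = gym.make("CartPole-v1")
    u = env.unwrapped
    u.masspole *= scale                      # perturb a physical parameter
    u.total_mass = u.masspole + u.masscart   # refresh cached derived values
    u.polemass_length = u.masspole * u.length
    mean_ret = np.mean([episode_return(env, policy, seed=s) for s in range(10)])
    print(f"pole-mass x{scale}: mean return {mean_ret:.1f}")
    env.close()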

Dr. Sudhansu Kumar Mishra                    Prem Kumar Lohani
Associate Professor and Head                 BTECH/10758/22
Electrical and Electronics Engineering       Electrical and Electronics Engineering
(Guide)
