
Hyper-parameters in Deep RL with Optuna [Part 1: The Problem]

by 지제로사 2022. 12. 15.


[Part 1. The Problem: Hyper-parameters]

[Part 2. The Solution: Bayesian Search]

[Part 3. Hyper-parameter Search with Optuna]

Across three parts, this series covers why hyperparameters matter in Deep Reinforcement Learning, introduces search methods for choosing among the many possible hyperparameter values, and walks through how to use Optuna.


[1] 개요: Hyper-parameters in Deep RL

  • Goal: build an agent that solves the Cart Pole environment using Deep Q-learning.
  • The outcome depends heavily on the set of hyperparameters used.
  • To become a real PRO in Reinforcement Learning,
  • you need to learn how to tune hyperparameters
  • using the right tools.
  • Optuna: an open-source library for hyperparameter search in the Python ecosystem.
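To make "set of hyperparameters" concrete, here is a minimal sketch of the kind of settings a Deep Q agent for Cart Pole must fix before training starts. The names and values below are illustrative defaults, not the exact settings used in this series.

```python
# Hypothetical hyperparameter set for a DQN agent on Cart Pole.
# Every one of these numbers must be chosen BEFORE training begins.
dqn_hyperparameters = {
    "learning_rate": 1e-3,     # step size for the optimizer
    "gamma": 0.99,             # discount factor for future rewards
    "batch_size": 64,          # transitions sampled per gradient update
    "buffer_size": 50_000,     # replay-buffer capacity
    "epsilon_start": 1.0,      # initial exploration rate
    "epsilon_end": 0.05,       # final exploration rate
    "target_update_freq": 500, # steps between target-network syncs
}
```

Even this small agent already has seven knobs to turn, which is the core of the problem this post describes.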

[2] The Problem: Hyper-parameters

Machine Learning models involve two kinds of numbers:

1. parameters

  • numbers found AFTER training your model

2. hyperparameters

  • numbers you need to set BEFORE training the model
  • they exist all across ML

e.g. the learning rate in supervised machine learning problems:

  • too low a value, and training gets stuck in a local minimum
  • too large a value, and training oscillates too much and never converges to the optimal parameters
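The two failure modes above can be seen on even the simplest objective. The sketch below runs plain gradient descent on f(x) = x² (gradient 2x) with three learning rates; the function and values are illustrative, not from the post.

```python
def gradient_descent(lr, steps=50, x0=5.0):
    """Minimize f(x) = x**2 (gradient 2*x) with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

x_small = gradient_descent(lr=0.001)  # too low: barely moves toward the minimum at 0
x_good = gradient_descent(lr=0.1)     # reasonable: converges very close to 0
x_large = gradient_descent(lr=1.1)    # too large: each update overshoots, |x| blows up
```

After 50 steps, `x_small` is still near the starting point, `x_good` is essentially at the optimum, and `x_large` has diverged, mirroring the two bullets above.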
  • Deep RL is even more challenging.

  • Deep RL problems
    • have more hyperparameters than supervised ML models, and
    • those hyperparameters have a huge impact on the final training outcome

How can we find good hyperparameters?

  • To find good hyperparameters we follow a trial-and-error approach:
  1. choose a set of hyperparameters
  2. train the agent
  3. evaluate the agent
  4. if we are happy with the result, we are done;
    otherwise, we choose a new set of hyperparameters and repeat the whole process.

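The four-step loop above can be sketched as follows. The sampling and scoring functions are hypothetical stand-ins; a real run would actually train and evaluate a DQN agent in step 2 and 3.

```python
import random

random.seed(42)  # deterministic for illustration

def sample_hyperparameters():
    """Step 1: choose a set of hyperparameters (randomly, for illustration)."""
    return {"learning_rate": random.choice([1e-4, 1e-3, 1e-2]),
            "gamma": random.choice([0.95, 0.99])}

def train_and_evaluate(hp):
    """Steps 2 and 3: train the agent with hp, then return its score.
    Hypothetical scoring function standing in for a real training run."""
    return 500 - 1000 * abs(hp["learning_rate"] - 1e-3) - 100 * (0.99 - hp["gamma"])

# Step 4: repeat until we are happy (here: a fixed budget of 20 tries).
best_hp, best_score = None, float("-inf")
for _ in range(20):
    hp = sample_hyperparameters()
    score = train_and_evaluate(hp)
    if score > best_score:
        best_hp, best_score = hp, score
```

The whole question of hyperparameter search is how to make step 1 smarter than this blind loop.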


[3] Grid Search

  • Trying all possible combinations and selecting the one that works best
  • is a method called grid search.
  • It works well for many supervised ML problems.
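Grid search in a nutshell: enumerate every combination and keep the best. A toy sketch over two hyperparameters, with a hypothetical scoring function standing in for a full train-then-evaluate run:

```python
from itertools import product

learning_rates = [1e-4, 1e-3, 1e-2]
gammas = [0.95, 0.99]

def evaluate(lr, gamma):
    """Hypothetical score standing in for training and evaluating the agent."""
    return -abs(lr - 1e-3) - abs(gamma - 0.99)

# Grid search: try every combination, keep the best-scoring one.
grid = list(product(learning_rates, gammas))  # 3 x 2 = 6 combinations
best = max(grid, key=lambda combo: evaluate(*combo))
```

With only 6 combinations this is perfectly fine; the next section shows why it breaks down for Deep RL.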

[4] The Problem with Grid Search

  • Deep RL problems
    • have many more hyperparameters (e.g. 10-20)
    • each of which can take many possible values
    • this creates a massive grid in which to search for the best combination
      • the number of combinations grows exponentially
        • ex. 10 hyperparameters, each taking values 1 to 10? grid size = 10,000,000,000
        • even if one training loop takes only 10 minutes? 10 minutes × 10,000,000,000 = 190,258 years
        • practically impossible!
        • even with 100,000 parallel processes, it would still take about 2 years...
    • Grid search is therefore a very inefficient method for Deep RL problems.
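A quick back-of-the-envelope check of the numbers above:

```python
# 10 hyperparameters, 10 candidate values each, 10 minutes per training loop.
combinations = 10 ** 10                     # 10,000,000,000 grid points
total_minutes = combinations * 10           # one 10-minute run per grid point
years = total_minutes / (60 * 24 * 365)     # sequential: about 190,258 years
years_parallel = years / 100_000            # about 1.9 years on 100,000 workers
```

Even massive parallelism only brings the wall-clock time down to roughly two years, which is why exhaustive search is a non-starter here.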

[5] Random Search

  • instead of checking each of the N possible hyperparameter combinations,
  • we randomly try a subset of them of size T
    • where T is much smaller than N
  • and train and evaluate the agent T times

 

  • from these T trials
    • we select the combination of hyperparameters that worked best
  • this is a more efficient way to search, and it works better than grid search
    • in both speed and quality of the solution
    • but it is still just spinning a roulette wheel to pick hyperparameters
    • we need something smarter.
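Random search only needs a sampler over the grid, not the grid itself. A minimal sketch, again with a hypothetical scoring function in place of a real training run:

```python
import random

random.seed(0)  # deterministic for illustration

# Full grid: N combinations (10 candidate values for each of 4 hyperparameters).
values = list(range(10))
N = len(values) ** 4          # 10,000 combinations in the full grid
T = 50                        # random search only runs T trials, with T << N

def evaluate(combo):
    """Hypothetical score; a real run would train and evaluate the agent."""
    return -sum((v - 7) ** 2 for v in combo)

# Randomly sample T combinations and keep the best-scoring one.
trials = [tuple(random.choice(values) for _ in range(4)) for _ in range(T)]
best = max(trials, key=evaluate)
```

Fifty trials instead of ten thousand, at the cost of no guarantee of hitting the true optimum; Part 2 introduces Bayesian search, which uses past trials to pick the next one instead of spinning the roulette wheel.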