
Hyperparameters in Deep RL with Optuna

[Part 1. Problem: Hyper-parameters]

[Part 2. Solution: Bayesian Search]

[Part 3. Hyper-parameter Search with Optuna]

This three-part series covers why hyperparameters matter in Deep Reinforcement Learning,

introduces search methods for choosing among many hyperparameters,

and walks through how to use Optuna.


[1] Solution: Bayesian Search

  • to search well, it helps to remember what you tried in the past
  • and to use that information to decide what is best to try next

 

  • Bayesian search methods
    • keep track of past iteration results
    • and use them to decide which regions of the hyperparameter space are most promising to try next
  • explore the space with a surrogate model (of the objective function)
    • an estimate of how good each hyperparameter combination is
    • as more iterations run, the algorithm updates the surrogate model
    • and its estimates get better and better
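The loop above can be sketched with a deliberately crude surrogate, here a nearest-neighbor estimate over past trials; the toy objective, the single hyperparameter, and all names are assumptions for illustration, not from the original post:

```python
import random

def true_objective(x):
    # hypothetical expensive evaluation, e.g. training an agent and returning its loss
    return (x - 0.3) ** 2

history = []  # past (hyperparameter, score) pairs

def surrogate(x):
    # crude surrogate: reuse the score of the closest hyperparameter tried so far
    return min(history, key=lambda past: abs(past[0] - x))[1]

random.seed(0)
x0 = random.random()                       # one random trial to seed the history
history.append((x0, true_objective(x0)))

for _ in range(20):
    candidates = [random.random() for _ in range(50)]
    x = min(candidates, key=surrogate)      # pick the candidate the surrogate likes best
    history.append((x, true_objective(x)))  # evaluate for real; the surrogate's data grows

best_x, best_score = min(history, key=lambda past: past[1])
```

Real Bayesian optimizers replace the nearest-neighbor estimate with a probabilistic model, such as a Gaussian process or the Parzen estimators used by TPE.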

More information on efficient hyperparameter tuning using Bayesian optimization:

https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f

 

A Conceptual Explanation of Bayesian Hyperparameter Optimization for Machine Learning



  • Surrogate model
    • eventually gets good enough to point the search towards good hyperparameters
    • if the algorithm selects the hyperparameters that maximize the surrogate
      • they should also yield good results on the true evaluation function

(figure: the surrogate model estimating the objective function)

  • Bayesian Search
    • superior to random search
    • a strong choice for Deep RL
  • Bayesian methods differ in how they build the surrogate model
  • we will use the Tree-structured Parzen Estimator (TPE) method
    • it applies Bayes' rule
    • to model the probability of the hyperparameters given a score on the objective function
(figure: Bayes' rule in action)

TPE makes two different distributions for the hyperparameters, split by a threshold on the objective function:

  • one from the trials where the value of the objective function is less than the threshold (the good trials)
  • one from the trials where the value of the objective function is greater than the threshold (the bad trials)

The next hyperparameters to try are those that are likely under the good distribution and unlikely under the bad one.

 



[1] Overview: Hyper-parameters in Deep RL

  • we built an agent to solve the Cart Pole environment using Deep Q-learning
  • with a given set of hyperparameters
  • to become a real PRO in Reinforcement Learning
  • you need to learn how to tune hyperparameters
  • using the right tools
  • Optuna: an open-source library for hyperparameter search in the Python ecosystem

[2] Problem: Hyper-parameters

Machine Learning models

1. parameters

  • numbers found AFTER training your model

2. hyperparameters

  • numbers you need to set BEFORE training the model
  • they exist all over ML

ex. the learning rate in supervised machine learning problems

  • too low a value: you get stuck in local minima
  • too large a value: the parameters oscillate too much and never converge to the optimum
  • Deep RL is even more challenging.
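The effect of the step size is easy to see on a toy convex function; this is a sketch with assumed values, not from the original post:

```python
def gradient_descent(lr, steps=100):
    # minimize f(x) = x**2 starting from x = 1.0; the gradient is 2*x
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

print(gradient_descent(0.001))  # too low: after 100 steps x has barely moved
print(gradient_descent(1.5))    # too large: the iterates oscillate and blow up
print(gradient_descent(0.1))    # reasonable: converges close to the minimum at 0
```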

  • Deep RL problems
    • have more hyperparameters than supervised ML models
    • and those hyperparameters have a huge impact on the final training outcome

How can we find good hyperparameters?

  • to find good hyperparameters we follow a trial-and-error approach
  1. choose a set of hyperparameters
  2. train the agent
  3. evaluate the agent
  4. if we are happy with the result, we are done.
    Otherwise, we choose a new set of hyperparameters and repeat the whole process.

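The four steps above can be sketched as a loop; train_and_evaluate and the candidate values here are hypothetical placeholders:

```python
import random

def train_and_evaluate(hp):
    # hypothetical stand-in: train the agent with hp, return its mean reward
    return -(hp["lr"] - 0.001) ** 2

random.seed(0)
best_hp, best_score = None, float("-inf")
for _ in range(10):
    hp = {"lr": random.choice([1e-4, 1e-3, 1e-2])}  # 1. choose hyperparameters
    score = train_and_evaluate(hp)                  # 2.-3. train and evaluate the agent
    if score > best_score:                          # 4. keep the best result so far
        best_hp, best_score = hp, score
print(best_hp)
```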


[3] Grid Search

  • try all possible combinations and select the one that works best
  • this method is called grid search
  • it works well for many supervised ML problems
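A minimal grid search sketch; the hyperparameter names, values, and scoring function are made-up examples:

```python
from itertools import product

learning_rates = [1e-4, 1e-3, 1e-2]
batch_sizes = [32, 64]

def train_and_evaluate(lr, bs):
    # hypothetical stand-in for training and scoring an agent
    return -((lr - 1e-3) ** 2) - ((bs - 64) ** 2) * 1e-6

# grid search: every combination is tried exactly once
results = {
    (lr, bs): train_and_evaluate(lr, bs)
    for lr, bs in product(learning_rates, batch_sizes)
}
best = max(results, key=results.get)
print(best)  # the combination with the highest score
```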

[4] Grid Search Problem

  • Deep RL problems
    • have many more hyperparameters (e.g. 10-20)
    • each can take many possible values
    • this creates a massive grid in which to search for the best combination
      • the number of combinations grows exponentially
        • ex. with 10 hyperparameters, each taking values from 1 to 10: grid size = 10,000,000,000
        • even if a single training loop takes only 10 minutes: 10 minutes X 10,000,000,000 = about 190,258 years
        • practically impossible!
        • even with 100,000 parallel processes, it would still take about 2 years
    • so grid search is a very inefficient method for Deep RL problems
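The arithmetic behind those numbers:

```python
n_combinations = 10 ** 10            # 10 hyperparameters, 10 possible values each
minutes_total = n_combinations * 10  # 10 minutes per training loop
years = minutes_total / (60 * 24 * 365)
print(int(years))                    # about 190,258 years sequentially
print(round(years / 100_000))        # about 2 years even with 100,000 parallel workers
```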

[5] Random Search

  • instead of checking each of the 𝑁 possible hyperparameter combinations
  • randomly try a subset of them of size 𝑇
    • where 𝑇 is much smaller than 𝑁
  • train and evaluate the agent 𝑇 times

  • from these 𝑇 trials
    • we select the combination of hyperparameters that worked best
  • a more efficient way to search, and it beats grid search
    • in both speed and quality of the solution
    • but it is still just spinning a roulette wheel to decide the hyperparameters
    • something smarter is needed
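Random search in a few lines; the grid values and the scoring function are hypothetical placeholders:

```python
import random

# the full grid of N combinations
grid = [
    {"lr": lr, "gamma": g}
    for lr in (1e-4, 1e-3, 1e-2)
    for g in (0.95, 0.99)
]

def train_and_evaluate(hp):
    # hypothetical stand-in for training and scoring an agent
    return -(hp["lr"] - 1e-3) ** 2 - (hp["gamma"] - 0.99) ** 2

random.seed(0)
T = 3  # T much smaller than N = len(grid)
trials = random.sample(grid, T)           # try only a random subset
best = max(trials, key=train_and_evaluate)
```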
