The ToySGD Benchmark
This artificial benchmark uses functions like polynomials to test DAC controllers’ ability to control both learning rate and momentum of SGD. At each step until the cutoff, both hyperparameters are updated and one optimization step is taken. As we know the global optimum of the function, the cost is measured as the current regret.
Because it relies on function approximation, this benchmark is computationally cheap, which likely makes it a good entry point before tackling the full-size SGD or CMA-ES step size benchmarks. It can also serve as a first test of whether a DAC method can handle multiple hyperparameters at the same time.
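The control loop described above can be sketched in plain Python. This is a minimal illustration, not the benchmark's implementation: the quadratic `f`, the heavy-ball momentum update, and the names `run_episode` and `constant_controller` are assumptions chosen for the example, and the controller here returns raw (not log-scaled) hyperparameters.

```python
def f(x):
    # Toy objective (assumption): global minimum f(1) = 0.
    return (x - 1.0) ** 2


def grad(x):
    # Analytic derivative of f.
    return 2.0 * (x - 1.0)


def run_episode(controller, x0=-2.0, cutoff=10):
    """Run SGD with momentum; the controller sets both hyperparameters each step."""
    x, velocity = x0, 0.0
    f_min = 0.0  # known global optimum of f
    regrets = []
    for step in range(cutoff):
        # 1) update both hyperparameters
        learning_rate, momentum = controller(step, cutoff)
        # 2) take one optimization step (classical momentum, assumed here)
        velocity = momentum * velocity + learning_rate * grad(x)
        x -= velocity
        # 3) cost is the current regret
        regrets.append(abs(f(x) - f_min))
    return regrets


def constant_controller(step, cutoff):
    # Hypothetical baseline: fixed learning rate, no momentum.
    return 0.1, 0.0


regrets = run_episode(constant_controller)
```

A DAC controller would replace `constant_controller` with a policy that adapts both values from the observed state.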
- class dacbench.benchmarks.toysgd_benchmark.ToySGDBenchmark(config_path=None, config=None)
Bases: AbstractBenchmark
- get_environment()
Return SGDEnv env with current configuration
- Returns
SGD environment
- Return type
- read_instance_set(test=False)
Read path of instances from config into list
- class dacbench.envs.toysgd.ToySGDEnv(config)
Bases: AbstractEnv
Optimize toy functions with SGD + Momentum.
Action: [log_learning_rate, log_momentum] (log base 10)
State: Dict with entries remaining_budget, gradient, learning_rate, momentum
Reward: negative log regret of current and true function value
An instance can look as follows:
ID              0
family          polynomial
order           2
low             -2
high            2
coefficients    [ 1.40501053 -0.59899755 1.43337392]
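To make the action and reward conventions concrete, here is a small sketch built from the example instance. It assumes the coefficients are in ascending order (constant term first) and that regret is the absolute gap to the minimum on [low, high]; the helper names `poly`, `quadratic_minimum`, `decode_action`, and `reward` are hypothetical and not part of DACBench.

```python
import math

# Values copied from the example instance above.
COEFFICIENTS = [1.40501053, -0.59899755, 1.43337392]
LOW, HIGH = -2.0, 2.0


def poly(coeffs, x):
    """Evaluate sum(c_i * x**i); ascending coefficient order is an assumption."""
    return sum(c * x**i for i, c in enumerate(coeffs))


def quadratic_minimum(coeffs, low, high):
    """Minimum value of an order-2 polynomial on the interval [low, high]."""
    _, c1, c2 = coeffs
    candidates = [low, high]
    if c2 != 0:
        vertex = -c1 / (2.0 * c2)
        if low <= vertex <= high:
            candidates.append(vertex)
    return min(poly(coeffs, x) for x in candidates)


def decode_action(action):
    """Map a (log_learning_rate, log_momentum) action to raw values (base 10)."""
    log_learning_rate, log_momentum = action
    return 10.0 ** log_learning_rate, 10.0 ** log_momentum


def reward(coeffs, x, f_min):
    """Negative log10 regret; grows without bound as x approaches the optimum."""
    regret = abs(poly(coeffs, x) - f_min)
    return math.inf if regret == 0.0 else -math.log10(regret)


f_min = quadratic_minimum(COEFFICIENTS, LOW, HIGH)
learning_rate, momentum = decode_action((-2.0, -1.0))
```

Under these assumptions, an action of (-2, -1) corresponds to a learning rate of about 0.01 and a momentum of about 0.1, and the reward increases as the iterate moves closer to the minimizer.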
- close()
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
- render(**kwargs)
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note:
Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.
Args:
mode (str): the mode to render with
Example:
    import numpy as np
    from gym import Env

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
- reset()
Reset environment
- Returns
Environment state
- Return type
np.array
- step(action: Union[float, Tuple[float, float]]) -> Tuple[Dict[str, float], float, bool, Dict]
Take one step with SGD
- Parameters
action (Union[float, Tuple[float, float]]) – If scalar, action = (log_learning_rate). If tuple, action = (log_learning_rate, log_momentum).
- Returns
- state : Dict[str, float]
State with entries “remaining_budget”, “gradient”, “learning_rate”, “momentum”
reward : float
done : bool
info : Dict
- Return type
Tuple[Dict[str, float], float, bool, Dict]
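The interaction pattern implied by reset() and step() might look like the sketch below. Several things here are assumptions for illustration only: `budget_policy` is a hypothetical hand-written controller, reset() is assumed to return the same state dict as step(), "remaining_budget" is assumed to count remaining steps, and `StubEnv` is a stand-in with the documented interface, not the real ToySGDEnv.

```python
from typing import Dict, Tuple


def budget_policy(state: Dict[str, float], total_budget: float) -> Tuple[float, float]:
    """Hypothetical controller: anneal log10(learning rate) from -1 to -3."""
    used = 1.0 - state["remaining_budget"] / total_budget
    log_learning_rate = -1.0 - 2.0 * used
    log_momentum = -1.0  # keep momentum fixed at 10**-1
    return log_learning_rate, log_momentum


def rollout(env, total_budget: float) -> float:
    """Run one episode on any object exposing the documented reset()/step()."""
    state = env.reset()  # assumed to be a dict with the same keys as step()
    done, total_reward = False, 0.0
    while not done:
        action = budget_policy(state, total_budget)
        state, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


class StubEnv:
    """Stand-in with the documented interface; NOT the real ToySGDEnv."""

    def __init__(self, budget: int = 5):
        self.budget = budget
        self.remaining = budget

    def _state(self) -> Dict[str, float]:
        return {
            "remaining_budget": float(self.remaining),
            "gradient": 0.0,
            "learning_rate": 0.0,
            "momentum": 0.0,
        }

    def reset(self) -> Dict[str, float]:
        self.remaining = self.budget
        return self._state()

    def step(self, action):
        self.remaining -= 1
        # Constant reward of 1.0 per step, purely for demonstration.
        return self._state(), 1.0, self.remaining <= 0, {}


total = rollout(StubEnv(budget=5), total_budget=5.0)
```

With the real benchmark, the environment returned by ToySGDBenchmark.get_environment() would presumably take the place of `StubEnv`, provided its reset() output matches the assumption above.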