dacbench.envs.toysgd
Module Contents
Functions
- dacbench.envs.toysgd.create_polynomial_instance_set(out_fname: str, n_samples: int = 100, order: int = 2, low: float = -10, high: float = 10)
- dacbench.envs.toysgd.sample_coefficients(order: int = 2, low: float = -10, high: float = 10)
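The two functions above are documented only by their signatures. As a rough illustration of what sampling polynomial instances could look like, here is a minimal sketch. It assumes that a polynomial of a given order has order + 1 coefficients drawn uniformly from [low, high); the library's actual sampling scheme may differ. The names `sample_coefficients_sketch` and `make_instances_sketch` are hypothetical, and writing to `out_fname` is left out because the file format is not described here.

```python
import numpy as np


def sample_coefficients_sketch(order=2, low=-10.0, high=10.0, rng=None):
    """Hypothetical stand-in: draw order + 1 coefficients uniformly from [low, high)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(low, high, size=order + 1)


def make_instances_sketch(n_samples=100, order=2, low=-10.0, high=10.0, seed=0):
    """Hypothetical stand-in: build instance records in memory (no file is written)."""
    rng = np.random.default_rng(seed)
    return [
        {
            "ID": i,
            "family": "polynomial",
            "order": order,
            "low": low,
            "high": high,
            "coefficients": sample_coefficients_sketch(order, low, high, rng),
        }
        for i in range(n_samples)
    ]


instances = make_instances_sketch(n_samples=3, order=2, low=-2.0, high=2.0)
```

Passing a seeded generator keeps the sampled instance set reproducible, which is usually desirable for benchmark instances.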
- class dacbench.envs.toysgd.ToySGDEnv(config)
Bases: dacbench.AbstractEnv
Optimize toy functions with SGD + Momentum.
Action: [log_learning_rate, log_momentum] (log base 10)
State: Dict with entries remaining_budget, gradient, learning_rate, momentum
Reward: negative log regret of current and true function value
An instance can look as follows:
ID  family      order  low  high  coefficients
0   polynomial  2      -2   2     [ 1.40501053 -0.59899755 1.43337392]
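To make the instance concrete, the following sketch turns the example coefficients into an objective function and its gradient with NumPy. It assumes the coefficients are listed in ascending order of power (constant term first); this page does not state the ordering, so treat it as an illustration rather than what build_objective_function actually does.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Coefficients from the example instance; ascending-power order is an assumption.
coefficients = np.array([1.40501053, -0.59899755, 1.43337392])

objective = Polynomial(coefficients)  # f(x) = c0 + c1*x + c2*x^2
gradient = objective.deriv()          # f'(x) = c1 + 2*c2*x

# Under this ordering c2 > 0, so f is convex and its minimizer is the root of f'.
x_min = gradient.roots()[0]
f_min = objective(x_min)
```

Under this assumption, `f_min` plays the role of the "true function value" against which regret is measured.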
- build_objective_function(self)
- get_initial_position(self)
- step(self, action: Union[float, Tuple[float, float]]) -> Tuple[Dict[str, float], float, bool, Dict]
Take one step with SGD
- Parameters
action (Union[float, Tuple[float, float]]) – If scalar, action = log_learning_rate. If tuple, action = (log_learning_rate, log_momentum).
- Returns
state : Dict[str, float]
State with entries “remaining_budget”, “gradient”, “learning_rate”, “momentum”
reward : float
done : bool
info : Dict
- Return type
Tuple[Dict[str, float], float, bool, Dict]
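The update behind step can be pictured with the classic SGD-with-momentum (heavy-ball) rule. The sketch below is an assumption about the general technique, not the environment's verified implementation: the helper names are hypothetical, a scalar action is assumed to mean zero momentum, and the reward is assumed to use a base-10 logarithm with a small epsilon to avoid log(0).

```python
import numpy as np


def sgd_momentum_step(x, velocity, grad_fn, log_learning_rate, log_momentum=None):
    """One heavy-ball update; both hyperparameters arrive as base-10 logarithms."""
    learning_rate = 10.0 ** log_learning_rate
    # Assumption: a missing momentum entry (scalar action) means no momentum.
    momentum = 0.0 if log_momentum is None else 10.0 ** log_momentum
    grad = grad_fn(x)
    velocity = momentum * velocity + learning_rate * grad
    return x - velocity, velocity, grad


def negative_log_regret(f_cur, f_min, eps=1e-12):
    """Reward sketch: -log10 of the gap between current and optimal value."""
    return -np.log10(abs(f_cur - f_min) + eps)


def f(x):
    return x ** 2  # toy objective with minimum value 0 at x = 0


def grad_f(x):
    return 2 * x


# Action (log_learning_rate, log_momentum) = (-1, -1), i.e. both equal to 0.1.
x_new, v_new, g = sgd_momentum_step(1.0, 0.0, grad_f, -1.0, -1.0)
reward = negative_log_regret(f(x_new), 0.0)
```

Expressing the action in log space lets an agent move the learning rate across several orders of magnitude with small changes in the action.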
- reset(self)
Reset environment
- Returns
Environment state
- Return type
np.array
- render(self, **kwargs)
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
- Note:
Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.
- Args:
mode (str): the mode to render with
Example:
    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
- close(self)
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.