`dacbench.envs.toysgd`

Module Contents

Classes

ToySGDEnv

Optimize toy functions with SGD + Momentum.

Functions

`create_polynomial_instance_set`(out_fname: str, n_samples: int = 100, order: int = 2, low: float = -10, high: float = 10)
`sample_coefficients`(order: int = 2, low: float = -10, high: float = 10)

dacbench.envs.toysgd.create_polynomial_instance_set(out_fname: str, n_samples: int = 100, order: int = 2, low: float = -10, high: float = 10)

dacbench.envs.toysgd.sample_coefficients(order: int = 2, low: float = -10, high: float = 10)
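
The signature above carries no description, so the following is only a minimal sketch of what sampling polynomial coefficients could look like. It assumes that a polynomial of degree `order` has `order + 1` coefficients drawn uniformly from `[low, high]`. The name `sample_coefficients_sketch` and the `rng` parameter are hypothetical and not part of dacbench; the library's own sampling scheme may differ.

```python
import random
from typing import List, Optional


def sample_coefficients_sketch(
    order: int = 2,
    low: float = -10,
    high: float = 10,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Hypothetical stand-in, NOT the dacbench implementation."""
    rng = rng or random.Random()
    # Assumption: a polynomial of degree `order` has order + 1 coefficients,
    # each drawn uniformly from [low, high].
    return [rng.uniform(low, high) for _ in range(order + 1)]


# Seeded generator so the draw is reproducible.
coeffs = sample_coefficients_sketch(order=2, low=-2, high=2, rng=random.Random(0))
print(len(coeffs))
```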

class dacbench.envs.toysgd.ToySGDEnv(config)

Bases: dacbench.AbstractEnv

Optimize toy functions with SGD + Momentum.

Action: [log_learning_rate, log_momentum] (log base 10)
State: Dict with entries remaining_budget, gradient, learning_rate, momentum
Reward: negative log regret between the current and the true function value

An instance can look as follows:

ID              0
family          polynomial
order           2
low             -2
high            2
coefficients    [ 1.40501053 -0.59899755 1.43337392]
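
To make the example instance concrete, here is a small sketch that turns its three coefficients into a function and its derivative. It assumes the coefficients are ordered by ascending power (constant term first); the page does not state the ordering, and `make_polynomial` is a hypothetical helper, not a dacbench function.

```python
from typing import Callable, Sequence, Tuple


def make_polynomial(
    coefficients: Sequence[float],
) -> Tuple[Callable[[float], float], Callable[[float], float]]:
    """Return (f, df) for f(x) = sum_i coefficients[i] * x**i.

    Assumption: ascending-power ordering (constant term first).
    """

    def f(x: float) -> float:
        return sum(c * x**i for i, c in enumerate(coefficients))

    def df(x: float) -> float:
        # Derivative term by term; the constant term (i == 0) drops out.
        return sum(i * c * x ** (i - 1) for i, c in enumerate(coefficients) if i > 0)

    return f, df


# Coefficients copied from the example instance.
f, df = make_polynomial([1.40501053, -0.59899755, 1.43337392])

# Under the assumed ordering this is a quadratic with a positive leading
# coefficient, so its minimum sits where the derivative vanishes.
x_min = 0.59899755 / (2 * 1.43337392)
print(f(x_min), df(x_min))
```

Under this reading, the regret that the reward refers to would be measured against `f(x_min)`.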

build_objective_function(self)

get_initial_position(self)

step(self, action: Union[float, Tuple[float, float]]) → Tuple[Dict[str, float], float, bool, Dict]

Take one step with SGD

Parameters

action (Union[float, Tuple[float, float]]) – If scalar, action = log_learning_rate. If tuple, action = (log_learning_rate, log_momentum).

Returns

state : Dict[str, float]
State with entries “remaining_budget”, “gradient”, “learning_rate”, “momentum”
reward : float
done : bool
info : Dict

Return type

Tuple[Dict[str, float], float, bool, Dict]
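
The step described above can be sketched in plain Python. This is an illustration under stated assumptions, not the ToySGDEnv source: the velocity-based momentum update, the choice of zero momentum for a scalar action, and the helper names `decode_action`, `sgd_momentum_step` and `negative_log_regret` are all hypothetical.

```python
import math
from typing import Callable, Tuple, Union

Action = Union[float, Tuple[float, float]]


def decode_action(action: Action) -> Tuple[float, float]:
    """Map a log10 action to (learning_rate, momentum)."""
    if isinstance(action, (int, float)):
        # Assumption: a scalar action sets only the learning rate,
        # and momentum is taken to be zero in this sketch.
        return 10.0 ** action, 0.0
    log_lr, log_momentum = action
    return 10.0 ** log_lr, 10.0 ** log_momentum


def sgd_momentum_step(
    x: float, velocity: float, grad_fn: Callable[[float], float], action: Action
) -> Tuple[float, float]:
    """One SGD step with (assumed) classical momentum."""
    learning_rate, momentum = decode_action(action)
    velocity = momentum * velocity + learning_rate * grad_fn(x)
    return x - velocity, velocity


def negative_log_regret(f_cur: float, f_min: float) -> float:
    """Reward as the negative log10 of the gap to the optimum (assumed form)."""
    return -math.log10(abs(f_cur - f_min))


# Toy objective f(x) = x**2 with gradient 2x and minimum value 0.
x, velocity = sgd_momentum_step(1.0, 0.0, lambda x: 2.0 * x, action=(-1.0, -1.0))
reward = negative_log_regret(x * x, 0.0)
print(x, velocity, reward)
```

Working in log base 10 lets an agent choose between, say, a learning rate of 0.1 and 0.001 with actions of -1 and -3, which keeps the action scale roughly uniform across orders of magnitude.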

reset(self)

Reset environment

Returns: Environment state
Return type: np.array

render(self, **kwargs)

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note:

Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.

Args:

mode (str): the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

close(self)

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.
