MMLF skill discovery interface
This module defines the interface for skill discovery methods.
Inform skill discovery of a new transition and search for new skills.
Inform skill discovery that an episode has terminated.
The skill discovery interface.
New in version 0.9.9.
Factory method that creates skill discovery based on spec-dictionary.
Inform skill discovery method that the current episode terminated.
Returns dict that contains a mapping from skill discovery name to class.
Inform skill discovery method about a new state transition.
The agent has transitioned from state to succState after executing action and obtaining the reward reward.
Returns a pair consisting of the list of discovered options and whether a new option was discovered based on this state transition.
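The contract described above can be sketched roughly as follows. This is a minimal illustration, not MMLF's actual code: the class names, the method names, the `"name"` key of the spec dictionary, and the toy reward-threshold rule are all assumptions made for this example.

```python
from abc import ABC, abstractmethod


class SkillDiscoverySketch(ABC):
    """Hypothetical stand-in for the skill discovery interface."""

    @abstractmethod
    def tell_state_transition(self, state, action, reward, succ_state):
        """Return the pair (discovered_options, new_option_found)."""

    @abstractmethod
    def episode_finished(self):
        """Called once when the current episode has terminated."""


class RewardPeakDiscovery(SkillDiscoverySketch):
    """Toy method (an assumption): a successor state becomes a subgoal
    when the transition into it yields a reward above a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.options = []   # subgoal states discovered so far
        self.episodes = 0   # number of terminated episodes seen

    def tell_state_transition(self, state, action, reward, succ_state):
        is_new = reward > self.threshold and succ_state not in self.options
        if is_new:
            self.options.append(succ_state)
        return list(self.options), is_new

    def episode_finished(self):
        self.episodes += 1


def get_skill_discovery_dict():
    """Mapping from skill discovery name to class (names are illustrative)."""
    return {"RewardPeak": RewardPeakDiscovery}


def create(spec):
    """Factory: build a skill discovery object from a spec dictionary.

    The "name" key and the use of the remaining entries as constructor
    arguments are assumptions of this sketch.
    """
    params = dict(spec)
    cls = get_skill_discovery_dict()[params.pop("name")]
    return cls(**params)


discovery = create({"name": "RewardPeak", "threshold": 0.5})
print(discovery.tell_state_transition(0, "right", 0.0, 1))
print(discovery.tell_state_transition(1, "right", 1.0, 2))
```

Returning the full option list together with a flag lets the caller react only when something new was found, without having to compare lists between steps.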
Skill Discovery that creates predefined skills at specified timepoints.
This approach does not really ‘discover’ new skills; it merely implements the SkillDiscovery interface. This is useful for studying the effect of adding options at runtime, since one can control exactly which option is added at what time. Note that only ReachSubgoalOptions can be specified.
predefinedOptions: A list with specifications of the predefined options. Not possible via GUI. This specification consists of:
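The idea of releasing predefined options at fixed points in time might look like the sketch below. The contents of an option specification are not listed here, so the `(step, option)` pairs, the class name, and the method name are purely assumptions for illustration and do not reflect MMLF's actual specification format.

```python
class PredefinedScheduleSketch:
    """Releases predefined options once a given number of steps has passed.

    Hypothetical: each schedule entry is a (step, option) pair, where
    `option` is any object standing in for a subgoal-reaching option.
    """

    def __init__(self, schedule):
        # Process entries in order of their release step.
        self.pending = sorted(schedule, key=lambda pair: pair[0])
        self.options = []
        self.step = 0

    def tell_state_transition(self, state, action, reward, succ_state):
        self.step += 1
        found = False
        # Release every option whose time point has been reached.
        while self.pending and self.pending[0][0] <= self.step:
            self.options.append(self.pending.pop(0)[1])
            found = True
        return list(self.options), found


schedule = PredefinedScheduleSketch([(2, "reach-door"), (3, "reach-key")])
for _ in range(4):
    print(schedule.tell_state_transition(None, None, 0.0, None))
```

Because the release times are fixed in advance, two runs with the same schedule add the same options at the same steps, which is what makes controlled comparisons possible.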
Skill Discovery based on Graph Partitioning.
Skill Discovery based on Local Graph Partitioning.
Local Graph Partitioning creates local state space transition graphs based on subsequences of the agent’s trajectories. For each of these local graphs, the graph cut that minimizes the “Normalized Cut” metric is computed. States that are often chosen as cut states (hits) are considered to be useful subgoals for the agent.
See also
“Identifying useful subgoals in reinforcement learning by local graph partitioning” by Simsek, Wolfe, and Barto (ICML 2005)
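To make the metric concrete, the sketch below evaluates a common definition of the normalized cut for a weighted directed graph: NCut(A, B) = cut(A, B)/vol(A) + cut(B, A)/vol(B), where cut(A, B) sums the weights of edges leading from A to B and vol(A) sums the weights of all edges that start in A. It only scores a given partition; finding the minimizing cut is a separate step that is not shown. The function name and the edge-dictionary representation are assumptions of this example, and MMLF's own computation may differ.

```python
def normalized_cut(weights, part_a):
    """Score the partition (part_a, rest) of a weighted directed graph.

    weights: dict mapping (source, target) -> non-negative edge weight.
    part_a:  iterable of nodes on one side of the cut.
    """
    part_a = set(part_a)
    cut_ab = cut_ba = vol_a = vol_b = 0.0
    for (source, target), weight in weights.items():
        if source in part_a:
            vol_a += weight
            if target not in part_a:
                cut_ab += weight
        else:
            vol_b += weight
            if target in part_a:
                cut_ba += weight
    if vol_a == 0 or vol_b == 0:
        # One side has no outgoing edges: treat the cut as unusable.
        return float("inf")
    return cut_ab / vol_a + cut_ba / vol_b


# Two tightly connected pairs of states joined by a weak bridge (1 <-> 2).
graph = {
    (0, 1): 4, (1, 0): 4,
    (2, 3): 4, (3, 2): 4,
    (1, 2): 1, (2, 1): 1,
}
print(normalized_cut(graph, {0, 1}))  # cut across the weak bridge
print(normalized_cut(graph, {0}))     # cut through a strong connection
```

Cutting across the weak bridge gives a much lower score than cutting through a strongly connected pair, which is why states bordering a low-NCut partition are plausible bottlenecks.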
requiredHitRatio: The minimal required ratio of the number of times a state is a hit to the number of times it is visited.
minimumVisits: The minimal number of times a state needs to be visited before it can be chosen as a subgoal.
maximalNCut: The maximal allowed N-Cut value of a local graph. Local graphs with a larger N-Cut are ignored since they do not reveal reliable information.
windowLength: The length of the trajectory window that is used for constructing a local graph.
updateFrequency: The number of steps between the creation of two successive local graphs.
projectionStateDimensions: The state space dimensions that are considered to be relevant for skill discovery.
gridResolution: The resolution of the grid that is used for discretizing states of a continuous state space.
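How several of these parameters might interact can be sketched as follows: a continuous state is projected onto the relevant dimensions and mapped to a grid cell, and a cell qualifies as a subgoal once it has enough visits and a high enough hit ratio. The function names, the explicit `lower`/`upper` bounds, and the uniform grid are assumptions of this example, not MMLF's implementation.

```python
def discretize(state, projection_dims, grid_resolution, lower, upper):
    """Map a continuous state to a grid cell over the projected dimensions.

    lower/upper give per-dimension bounds of the state space (hypothetical
    inputs; how bounds are obtained in practice is not specified here).
    """
    cell = []
    for dim in projection_dims:
        fraction = (state[dim] - lower[dim]) / (upper[dim] - lower[dim])
        index = int(fraction * grid_resolution)
        # Clamp so that values on or beyond the bounds fall into edge cells.
        cell.append(max(0, min(index, grid_resolution - 1)))
    return tuple(cell)


def select_subgoals(visits, hits, minimum_visits, required_hit_ratio):
    """Return the cells that satisfy both the visit and hit-ratio criteria."""
    return sorted(
        cell
        for cell, count in visits.items()
        if count > 0
        and count >= minimum_visits
        and hits.get(cell, 0) / count >= required_hit_ratio
    )


cell = discretize((0.5, 0.99, 3.0), [0, 1], 10, [0, 0, 0], [1, 1, 10])
visits = {cell: 20, (0, 0): 3, (1, 1): 50}
hits = {cell: 6, (0, 0): 3, (1, 1): 2}
print(cell)
print(select_subgoals(visits, hits, minimum_visits=10, required_hit_ratio=0.25))
```

In this example the cell `(0, 0)` is rejected despite being a hit on every visit, because three visits are too few to trust the ratio; that is the role of minimumVisits.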