This tutorial explains how to use the MMLF’s graphical user interface. It assumes that you have already installed the MMLF successfully. It might be helpful to read the Quick start (command line interface) tutorial first in order to have some understanding of what is going on “under the hood”.
The MMLF’s GUI can be started from the command line with the command
mmlf_gui
Note
If you installed the MMLF locally, you might have to write “./mmlf_gui” instead on Unix-based operating systems.
This should create a window that looks like this:
The main window consists of three tabs: the “Explorer”, the “Experimenter”, and the “Documentation”. The last one shows this documentation. The other two are explained in more detail in this tutorial.
The explorer’s main purpose is to investigate the behaviour of a specific agent in a specific environment. It provides different kinds of visualizations (depending on the world) of what is going on. In the explorer tab, the environment and the agent that should be used in the world can be selected from combo boxes. The selected agent and environment can be configured by pressing the configure button. This creates a popup window like the following:
In this popup, the agent’s and environment’s parameters can be modified and stored by pressing “Save”. For some agents and environments, pressing “Info” yields further information about the agent or environment. Furthermore, help on a specific parameter is provided as a tooltip of the respective edit field. As an alternative to configuring agent and environment manually, a world with predefined agent and environment can be loaded using the “Load Config” button. Accordingly, “Store Config” allows storing a manually configured world such that it can easily be reloaded later on.
Back in the explorer tab, the selected agent and environment can be loaded by pressing “Init World”. Now the Monitor, which controls which information is automatically stored while a world runs in the MMLF, can be configured by pressing “Configure Monitor”. Once this is done, we can start the configured world. One step in this world can be performed by pressing “Step”, one single episode by pressing “Episode”, and infinitely many by pressing “Run World”. This infinite running can be stopped by pressing “Stop World”.
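The “Step”, “Episode”, and “Run World” buttons correspond to the usual agent-environment interaction loop of reinforcement learning. The following sketch illustrates this with made-up stand-ins (`ToyEnv`, `ToyAgent`, and `run_episode` are illustrative only, not part of the MMLF API): one “Step” is a single loop iteration, and one “Episode” runs the loop until the environment signals termination.

```python
class ToyEnv:
    """Minimal stand-in environment: the episode ends after 3 steps."""
    def reset(self):
        self.t = 0
        return 0                          # initial state
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3   # next state, reward, done flag

class ToyAgent:
    """Minimal stand-in agent that always picks the same action."""
    def choose_action(self, state):
        return 0

def run_episode(env, agent):
    # One "Step" in the GUI corresponds to a single loop iteration;
    # one "Episode" runs the loop until the environment reports done.
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = agent.choose_action(state)
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward

print(run_episode(ToyEnv(), ToyAgent()))  # → 3.0
```

“Run World” simply repeats such episodes until “Stop World” is pressed.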
The tab “StdOut/Err” shows the program’s output to standard output and standard error. More detailed information about the currently running experiment can be obtained via the text output in the “MMLF Log” tab. Furthermore, so-called viewers can be added that visualize the progress graphically. In all environments, the so-called “FloatStreamViewer” is available, which allows monitoring the development of a real-valued metric over time. This viewer looks as follows:
This viewer shows the change of the metric over time as well as its moving-window average (in red). The metric can be selected via the combo box, and both the displayed time range and the length of the moving-average window can be specified.
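The red curve is simply a sliding-window mean of the metric. A minimal sketch of that computation (not the viewer’s actual code) could look like this:

```python
from collections import deque

def moving_average(values, window):
    """Sliding-window mean, like the red curve in the FloatStreamViewer.

    For the first few values, the window is only partially filled, so the
    average is taken over the values seen so far.
    """
    buf = deque(maxlen=window)   # deque drops the oldest value automatically
    result = []
    for v in values:
        buf.append(v)
        result.append(sum(buf) / len(buf))
    return result

print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3))
# → [1.0, 1.5, 2.0, 3.0, 4.0]
```

A longer window smooths out noise more strongly but reacts more slowly to actual changes in the metric.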
For several worlds, additional viewers become available after loading a world using “Init World”. For instance, in the maze2d_environment, an additional viewer is available that shows the current policy and value function (see below):
For an overview of all available viewers, please take a look at Viewers.
The “Experimenter” is meant to be used when one wants to compare the learning performance in different settings (for example different agents/agent parametrizations in the same environment or the same agent in slightly different environments etc.). It looks as follows:
The “Create World” button launches a window in which the agent and environment of a world can be configured in the same way as in the Explorer. Once this is done, “Save” adds this world to the list of worlds in the upper left part of the Experimenter tab. Alternatively, one can also load a stored world configuration using “Load World”. A world can later be modified by selecting it in the world list and pressing “Edit World”, removed from the world list by pressing “Remove world”, and given a more meaningful name by selecting it and editing the name in the text field to the right of the “Remove world” button. Alternatively, instead of manually adding and editing worlds, one can load a whole experiment configuration using “Load Experiment”. Accordingly, configured experiments can be stored using “Store experiment”.
Furthermore, one can specify how many independent runs of each world are conducted (the more runs, the more reliable the performance estimates become) by editing the text field to the right of “Runs”, and how many episodes each run should take (text field “Episodes”). In addition, one can select whether the independent runs of a world should be conducted sequentially (one after the other) or concurrently (each in a separate OS process). By editing the text field “Parallel running processes”, one can choose how many runs are conducted in parallel (this is fixed to 1 for sequential execution). The maximum allowed number of parallel processes is the number of (virtual) cores of the machine.
Note
A word of warning: the concurrent execution of world processes is still in an experimental state and may behave strangely under certain conditions (for instance, it might not shut down correctly and leave some zombie processes behind). Furthermore, Windows does not support concurrent execution of worlds.
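The cap on “Parallel running processes” mentioned above is the machine’s number of (virtual) cores. In Python, this value can be queried with the standard library (a generic snippet, not part of the MMLF API):

```python
import multiprocessing

# Number of (virtual) cores of this machine; the Experimenter uses this
# value as the upper bound for parallel world processes.
n_cores = multiprocessing.cpu_count()
print(n_cores)
```

Setting the number of parallel processes higher than this would make the runs compete for the same cores without any speedup.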
By pressing “Start Experiment”, the experiment is started and a new tab “Experiment Statistics” is added in which the progress of the experiment can be monitored in real time:
In the top line, the metric that should be displayed can be selected. The metric “Episode Return” is always available; it shows the accumulated reward per episode. Below this, a table presents some statistics (min, max, mean, median, etc.) of the chosen metric for the different runs of the worlds. The results of the experiment can be analyzed for statistical significance by pressing the “Statistics” button (see Evaluate experiments). By pressing “Visualize”, these results can also be displayed graphically:
In this visualization, one can see the development of the selected metric over time for the two agents. One can select whether to see the average over all runs conducted for one agent or each of these runs separately. Furthermore, one can specify the length of the moving-average window. The plot is not updated automatically, but only when one presses “Update” or changes one of the selections. The generated plot can be saved to a file by pressing “Save”.
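The “Episode Return” metric described above is just the sum of all rewards collected within one episode, and the statistics in the table are computed across runs. A minimal sketch (the reward sequences below are made up for illustration):

```python
from statistics import mean, median

def episode_return(rewards):
    """The "Episode Return" metric: accumulated reward over one episode."""
    return sum(rewards)

# Hypothetical per-step reward sequences from three independent runs:
runs = [[1, 0, 2], [0, 1], [3, -1, 1]]
returns = [episode_return(r) for r in runs]   # one return per run

# The kind of per-metric statistics shown in the table:
print(min(returns), max(returns), mean(returns), median(returns))
```

With more independent runs, these statistics become more reliable estimates of the agent’s learning performance, which is why the “Runs” setting matters.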
The “Experiment Statistics” tab can also be restored for an experiment conducted earlier by loading the results of this experiment into the Experimenter. This can be done by pressing “Load Experiment Results”. This opens a file selection dialog in which the root directory of the particular experiment in the RW area must be selected.
Note
It may happen that different experiments share the same root directory. In this case, the Experimenter cannot distinguish these experiments and interprets them as a single experiment. To avoid this, please copy the results of each experiment to a unique directory manually.
See also