Authors: Kevin Stratford, Tom Underwood
Normally solving a given problem using molecular simulation is more complex than simply performing a single simulation and analysing its output. Typically workflows must be employed which involve cycles of running one or more simulations, analysing their output, and then using the results of this analysis to inform input parameters for further simulations. A 'toolkit' of helper utilities for performing such tasks is thus desirable. With this in mind, we have developed a Python toolkit to support DL_MONTE (with the intention of eventually extending its scope beyond DL_MONTE).
This tutorial describes some utilities which help to read, manipulate, and write the inputs and outputs associated with DL_MONTE. Moreover, it describes how to execute DL_MONTE from Python. The toolkit is named htk (Histogram reweighting toolkit - 'histogram reweighting' being one of the key functionalities of the toolkit). To elaborate, this tutorial covers:
input files (CONFIG, CONTROL and FIELD files)
output files (PTFILE and YAMLDATA files)
import sys
print(sys.version_info)
PYTHONPATH
PYTHONPATH is the shell variable which contains a list of directories in which Python looks for packages to import. To use the toolkit, PYTHONPATH should include the directory containing the toolkit.
One can add the directory containing the toolkit (which will depend on your local system!) to the module search path from within Python, via sys.path, as follows:
# Import standard os module for file manipulations
import os
# You will need to set DL_MONTE_HOME appropriately for
# the local system
DL_MONTE_HOME = "/home/tom/Work/Code_workspace/DL_MONTE_2_dev/dlmonte2"
# Add the 'htk' directory within DL_MONTE_HOME to the module search path (sys.path)
sys.path.append( os.path.join(DL_MONTE_HOME,"htk") )
Alternatively one can modify PYTHONPATH directly in the shell via:
bash> cd $DL_MONTE_HOME/htk
bash> export PYTHONPATH=$PYTHONPATH:`pwd`
where $DL_MONTE_HOME is the main directory containing DL_MONTE. This is the preferred method because it does not rely on the location of the toolkit being known by the script importing the toolkit.
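When the search path is extended from within a script or notebook instead, it can help to avoid appending the same directory repeatedly (for instance when a cell is re-run). The following is a minimal sketch of that idea; the helper name add_to_search_path is hypothetical and is not part of the toolkit.

```python
import os
import sys


def add_to_search_path(directory):
    """Append `directory` to sys.path unless it is already present.

    Hypothetical helper for illustration only; not part of htk.
    Returns the absolute path that was considered.
    """
    path = os.path.abspath(directory)
    if path not in sys.path:
        sys.path.append(path)
    return path


# Example: calling the helper twice adds the directory only once.
# The directory name used here is a placeholder for your local system.
example_dir = add_to_search_path(os.path.join(os.getcwd(), "htk"))
add_to_search_path(example_dir)
print(sys.path.count(example_dir))
```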
Within Python the toolkit is imported as follows:
dlmonte module
The functionality we will examine here is contained in the dlmonte module of the toolkit. This is imported as follows:
# Import the dlmonte module with alias "dlmonte"
import htk.sources.dlmonte as dlmonte
In this tutorial we will use some pre-existing DL_MONTE input files in order to demonstrate the functionality of the dlmonte module. These files are distributed with the tutorial, and pertain to a short grand-canonical Monte Carlo simulation of a Lennard-Jones fluid near the critical point. The directory containing these files is util-dlmonte_files. We will store this directory in the variable input_dir, which may need to be set appropriately for your local system:
# You may need to set input_dir appropriately for the local system
input_dir = "util-dlmonte_files"
We will now examine the dlmonte module's functionality regarding importing and exporting the three key DL_MONTE input files:
FIELD
CONFIG
CONTROL
The input files we will use as an example are contained in input_dir.
FIELD file
The dlmonte.dlfield module provides a method from_file() which loads a FIELD file into an internal Python structure (an instance of class FIELD). The method takes the path of the FIELD file as the argument.
In this case, as mentioned above, the input file corresponds to a small grand canonical simulation using Lennard-Jones particles (taken from the DL_MONTE test suite).
filename = os.path.join(input_dir, "FIELD")
field = dlmonte.dlfield.from_file(filename)
We can now examine the contents of the dlfield structure. Using the builtin repr() function shows the internal representation of the whole FIELD structure:
repr(field)
The description, cutoff, and units attributes are of type string, integer, and string, respectively.
print("Description: ", field.description)
print("Cutoff: ", field.cutoff)
print("Units: ", field.units)
Other attributes, such as atomtypes, are more complex; atomtypes is a list of entries of class AtomType. In this example, there is only one atom type present in the simulation:
print ("Atomtypes: ", field.atomtypes)
Non-bonded interactions are stored in the vdw attribute (again a list), which provides a full description of the interaction:
print (repr(field.vdw[0].atom1))
print (repr(field.vdw[0].atom2))
print (repr(field.vdw[0].interaction))
Direct access to numerical values for computation is available, e.g.,
ljinteraction = field.vdw[0].interaction
print ("Twice epsilon is ", 2.0*ljinteraction.epsilon)
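As an illustration of such a computation, the sketch below evaluates the standard 12-6 Lennard-Jones potential, U(r) = 4*epsilon*[(sigma/r)^12 - (sigma/r)^6], from plain floating-point values. It does not use the toolkit: in practice epsilon could be taken from ljinteraction.epsilon as above, while the name sigma and the numerical values used here are assumptions made purely for this example.

```python
def lennard_jones(r, epsilon, sigma):
    """Return the 12-6 Lennard-Jones energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative parameters (assumed values, not read from a FIELD file)
epsilon = 1.0
sigma = 1.0

# The potential crosses zero at r = sigma and has its minimum,
# of depth epsilon, at r = 2^(1/6) * sigma
r_min = 2.0 ** (1.0 / 6.0) * sigma
print("U(sigma) =", lennard_jones(sigma, epsilon, sigma))
print("U(r_min) =", lennard_jones(r_min, epsilon, sigma))
```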
FIELD file output
The dlfield module also allows you to write a FIELD file in a well-formed format for use by the main DL_MONTE executable. This is done via the str() method of the FIELD object (i.e., the output of print or of a format statement "{!s}".format(...)):
print(field)
This allows us, if required, to manipulate and write a new FIELD file with updated parameters. For example, to adjust the potential interaction, we could write:
field.vdw[0].interaction.epsilon = 2.0
print(field)
Note that the 'epsilon' parameter for the Van der Waals Lennard-Jones interaction has been modified.
The internal python representation is flexible enough that additional output formats are constructed relatively easily. For example, JSON output is available via:
print(field.to_json())
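The distinction between repr(), str() and JSON output can be illustrated with a small stand-alone class. This is only a sketch of the general pattern and not the toolkit's implementation: the class ToyInteraction, its attributes, and the text format it prints are all hypothetical.

```python
import json


class ToyInteraction:
    """Hypothetical stand-in for an interaction object (not an htk class)."""

    def __init__(self, key, epsilon, sigma):
        self.key = key
        self.epsilon = epsilon
        self.sigma = sigma

    def __repr__(self):
        # Internal representation, intended for inspection and debugging
        return "ToyInteraction(key={!r}, epsilon={!r}, sigma={!r})".format(
            self.key, self.epsilon, self.sigma)

    def __str__(self):
        # File-style representation, as used by print() and "{!s}".format()
        return "{} {} {}".format(self.key, self.epsilon, self.sigma)

    def to_json(self):
        # Alternative output format built from the same attributes
        return json.dumps(
            {"key": self.key, "epsilon": self.epsilon, "sigma": self.sigma})


toy = ToyInteraction("lj", 1.0, 1.0)
toy.epsilon = 2.0  # modify a parameter, then regenerate every output format
print(repr(toy))
print(toy)
print(toy.to_json())
```

Because every format is generated from the same attributes, a change to a parameter is reflected in all of them.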
CONFIG file
Likewise, a dlmonte.dlconfig.from_file() method is available which reads the contents of the CONFIG file into an internal representation. As this is related to the FIELD description, the FIELD object can, optionally, be passed as an argument. If the FIELD reference is provided, the two files can be checked for consistency. However, the CONFIG file can be read independently, e.g.,
# The CONFIG will be from `input_dir`
filename = os.path.join(input_dir, "CONFIG")
config = dlmonte.dlconfig.from_file(filename)
The internal representation of the CONFIG object is:
repr(config)
Again, the config structure has a number of attributes. Of particular interest (especially for NVT and muVT ensembles) is the vcell attribute which specifies, indirectly, the volume of the system. A utility method is provided to return the volume of the cell.
print ("Cell vectors: ", config.vcell)
print ("Cell volume: ", config.volume())
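For reference, the volume of a cell defined by three cell vectors a, b and c is |a . (b x c)|. The sketch below computes this in plain Python, assuming the cell vectors are available as three sequences of three numbers. Whether config.volume() is implemented this way, and the exact layout of config.vcell, are not confirmed here; the function name cell_volume is hypothetical.

```python
def cell_volume(a, b, c):
    """Return |a . (b x c)| for three cell vectors of length 3."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return abs(a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2])


# Illustrative cubic cell of side 10 (assumed values, not read from CONFIG)
cubic = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
print("Cubic cell volume:", cell_volume(*cubic))

# The same formula applies to a non-orthogonal (sheared) cell
sheared = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
print("Sheared cell volume:", cell_volume(*sheared))
```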
Note that the internal representation produced via repr() does not show all the molecule information, as this may be very long. However, the string formatting option can be used to generate output which is a well-formed CONFIG file with full information.
print (config)
CONTROL file
Simulation execution and additional parameters are determined by the CONTROL file.
# Again using the input_dir location as defined above
filename = os.path.join(input_dir, "CONTROL")
ctrl = dlmonte.dlcontrol.from_file(filename)
The CONTROL file has a potentially complex structure which is split into three basic parts: a title, a 'use' block, and a 'main' block. The first is just the title string:
repr(ctrl.title)
The second part is the use block, which contains relevant 'use' statements, and any FED block (see the manual for details). In this case, these are both empty:
repr(ctrl.use_block)
The third part is the main section, which contains a series of statements controlling various simulation behaviour:
repr(ctrl.main_block)
Again, the string output is designed to provide a well-formed CONTROL file suitable for use with the DL_MONTE executable:
print (ctrl)
FIELD, CONFIG and CONTROL in one go
It is convenient to read and store the three mandatory input files in a single step. A container class DLMonteInput is provided to do this:
# Read all input files in input_dir into a DLMonteInput object
inputs = dlmonte.DLMonteInput.from_directory(input_dir)
This reads the three files with standard filenames FIELD, CONFIG and CONTROL from the named directory. The result contains field, config, and control objects as attributes.
print (repr(inputs.field))
print ()
print (repr(inputs.config))
print ()
print (repr(inputs.control))
We now have Python representations of the FIELD, CONFIG, and CONTROL files required for a DL_MONTE simulation.
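Since the three files are expected under their standard filenames, it may be useful to check that a directory actually contains them before attempting to load it. The following is a minimal sketch using only the standard library; the helper missing_input_files is hypothetical and not part of the toolkit, and the demonstration uses a temporary directory with empty placeholder files.

```python
import os
import tempfile

# Standard filenames of the three mandatory DL_MONTE input files
STANDARD_INPUT_FILES = ("FIELD", "CONFIG", "CONTROL")


def missing_input_files(directory):
    """Return the standard input filenames not found in `directory`."""
    return [
        name for name in STANDARD_INPUT_FILES
        if not os.path.isfile(os.path.join(directory, name))
    ]


# Demonstration: a temporary directory containing only an empty FIELD file
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "FIELD"), "w").close()
    missing = missing_input_files(tmp)

print("Missing input files:", missing)
```

In the tutorial setting one would call missing_input_files(input_dir) and proceed only if the returned list is empty.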
This section discusses running DL_MONTE via python.
# We need an executable.
# We assume here it is in the DL_MONTE_HOME directory (see above), and is from a serial compilation
dlx = os.path.join(DL_MONTE_HOME, "bin", "DLMONTE-SRL.X")
# For input again use that from input_dir defined above
myinput = dlmonte.DLMonteInput.from_directory(input_dir)
# We also amend the CONTROL file here...
# Update the main block of the CONTROL file to include YAML output directive (see manual)
yamltag={"yamldata": 1000}
myinput.control.main_block.statements.update(yamltag)
We assume we need to copy the input, without further manipulation, to a working directory where the run will take place (and where output will be produced).
# Set an appropriate working directory
# THIS MUST BE CREATED ON YOUR LOCAL SYSTEM: E.G. mkdir util-dlmonte_workspace
work_dir = "util-dlmonte_workspace"
# Copy the input to the working directory
myinput.to_directory(work_dir)
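The working directory used above must exist before the input is copied into it. As an alternative to running mkdir in the shell, it can be created from Python with the standard library. The sketch below shows one way of doing so; the helper name ensure_directory is hypothetical, and the demonstration creates the directory inside a temporary location so that it leaves nothing behind.

```python
import os
import tempfile


def ensure_directory(path):
    """Create `path` (and any missing parents) if it does not yet exist."""
    os.makedirs(path, exist_ok=True)
    return os.path.abspath(path)


# Demonstration inside a temporary base directory
with tempfile.TemporaryDirectory() as base:
    demo_dir = ensure_directory(os.path.join(base, "util-dlmonte_workspace"))
    ensure_directory(demo_dir)  # calling it again is harmless
    created = os.path.isdir(demo_dir)

print("Working directory created:", created)
```

In the tutorial setting this would be called as ensure_directory(work_dir) before myinput.to_directory(work_dir).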
Set up a DLMonteRunner object, and execute DL_MONTE
This is a utility to help run the DL_MONTE executable in the working directory via a sub-process.
# Set up a DLMonteRunner object linked to the directory work_dir
# and the executable dlx
myrun = dlmonte.DLMonteRunner(dlx, work_dir)
# Execute the runner - the output files from the simulation will be in work_dir
myrun.execute()
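To make the idea of running an executable in a working directory via a sub-process concrete, the sketch below uses the standard subprocess module. It is not the DLMonteRunner implementation: the function run_in_directory is hypothetical, and the Python interpreter itself is used as a stand-in for the DL_MONTE executable so that the example runs anywhere.

```python
import subprocess
import sys
import tempfile


def run_in_directory(command, work_dir):
    """Run `command` (a list of strings) with `work_dir` as the current
    directory, capturing standard output and standard error as text."""
    return subprocess.run(
        command,
        cwd=work_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# Stand-in command: ask a child Python process to report its directory
with tempfile.TemporaryDirectory() as tmp:
    result = run_in_directory(
        [sys.executable, "-c", "import os; print(os.getcwd())"], tmp)

print("Return code:", result.returncode)
print("Child process ran in:", result.stdout.strip())
```

A non-zero return code would indicate that the child process failed, in which case result.stderr is the first place to look.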
On successful execution, the DLMonteRunner creates a DLMonteOutput object which contains any PTFILE and/or YAMLDATA output.
# We will look at the YAML-format data output by the simulation, which is stored in the
# output file YAMLDATA, and within the DLMonteRunner object as follows:
data = myrun.output.yamldata.data
# Print each frame of YAML-format data, where each frame corresponds to a certain timestep
# ('timestamp') in the simulation. Note that data was only output every 1000 moves
# (the 'yamldata' frequency set in the CONTROL file above)
for step in data:
    print (step)
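Once the frames are available as a Python list, simple post-processing is straightforward. The sketch below averages one observable over the frames, optionally skipping some initial frames as equilibration. It assumes each frame behaves like a dictionary with a 'timestamp' entry; the key 'energy', the helper mean_observable, and the numbers in example_frames are made up for illustration and are not output from DL_MONTE.

```python
def mean_observable(frames, key, skip=0):
    """Return the mean of frame[key] over `frames`, ignoring the first
    `skip` frames (e.g. to discard an equilibration period)."""
    values = [frame[key] for frame in frames[skip:]]
    if not values:
        raise ValueError("no frames left after skipping")
    return sum(values) / len(values)


# Made-up frames standing in for the contents of `data`
example_frames = [
    {"timestamp": 1000, "energy": -10.0},
    {"timestamp": 2000, "energy": -12.0},
    {"timestamp": 3000, "energy": -14.0},
]

print("Mean over all frames:", mean_observable(example_frames, "energy"))
print("Mean after skipping one frame:",
      mean_observable(example_frames, "energy", skip=1))
```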
To remove input, output, or both, methods on the DLMonteRunner class are provided:
# Remove input or output
myrun.remove_input()
myrun.remove_output()
# Or, remove both input and output
myrun.cleanup()