Configuration¶
Introduction¶
Many aspects of NOMAD and its operation can be modified through configuration. Most configuration items have reasonable defaults and typically only a small subset has to be overwritten.
Configuration items get their value in the following order:
- The item is read from the environment. This has the highest priority and will overwrite values in a nomad.yaml file or the NOMAD source-code.
- The value is given in a nomad.yaml configuration file.
- There is no custom value, and the value hard-coded in the NOMAD sources will be used.
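The lookup order above can be sketched in a few lines of Python. This is only an illustration of the precedence, not NOMAD's actual implementation: the function `resolve`, the `HARD_CODED_DEFAULTS` dictionary, and the `yaml_values` argument are hypothetical names, and values taken from the environment are returned as plain strings without type conversion.

```python
import os

# Hypothetical stand-in for the values hard-coded in the NOMAD sources.
HARD_CODED_DEFAULTS = {"services": {"api_host": "localhost", "api_port": 8000}}


def resolve(section, attribute, yaml_values, environ=None):
    """Return the effective value of section.attribute (illustrative only)."""
    environ = os.environ if environ is None else environ
    env_name = f"NOMAD_{section}_{attribute}".upper()
    if env_name in environ:
        # 1. The environment has the highest priority.
        return environ[env_name]
    if attribute in yaml_values.get(section, {}):
        # 2. Next comes the value from nomad.yaml (passed in as a dict here).
        return yaml_values[section][attribute]
    # 3. Otherwise the hard-coded default is used.
    return HARD_CODED_DEFAULTS[section][attribute]
```

Passing `environ` explicitly keeps the sketch easy to try out without touching the real process environment.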
Configuration items are structured. The configuration is hierarchical and items are aggregated in potentially nested sections. For example, the configuration item services.api_host denotes the attribute api_host in the configuration section services.
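As a rough mental model, the configuration can be pictured as nested dictionaries in which a dotted name selects one value. The helper `get_item` below is a hypothetical name used only for this sketch; it is not part of NOMAD.

```python
from functools import reduce


def get_item(config, dotted_name):
    """Follow a dotted name such as 'services.api_host' through nested dicts."""
    return reduce(lambda section, key: section[key], dotted_name.split("."), config)


# A tiny hand-written configuration with one section and one attribute.
config = {"services": {"api_host": "localhost"}}
api_host = get_item(config, "services.api_host")
```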
Setting values from the environment¶
NOMAD services will look at the environment. All environment variables starting with NOMAD_ are considered. The rest of the name is interpreted as a configuration item. Sections and attributes are concatenated with a _.
For example, the environment variable NOMAD_SERVICES_API_HOST will set the value for the api_host attribute in the services section.
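Because both section and attribute names may themselves contain underscores, a variable name can only be split unambiguously when the section names are known. The following sketch shows one way to do that; `KNOWN_SECTIONS` and `parse_env_name` are assumptions made for illustration, nested sections are ignored, and NOMAD's own parsing may differ.

```python
# Hypothetical subset of top-level section names taken from this reference.
KNOWN_SECTIONS = ("services", "meta", "oasis", "north", "mongo", "elastic")


def parse_env_name(name):
    """Split e.g. NOMAD_SERVICES_API_HOST into ('services', 'api_host').

    Returns None for variables that do not address a known section.
    """
    prefix = "NOMAD_"
    if not name.startswith(prefix):
        return None
    rest = name[len(prefix):].lower()
    for section in KNOWN_SECTIONS:
        if rest.startswith(section + "_"):
            return section, rest[len(section) + 1:]
    return None
```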
Setting values from a nomad.yaml¶
NOMAD services will look for a nomad.yaml file. By default, they will look in the current working directory. This location can be overwritten with the NOMAD_CONFIG environment variable.
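The file lookup described above could be sketched as follows. The function name `config_path` is hypothetical, and the sketch assumes that NOMAD_CONFIG holds the path of the file itself.

```python
import os
from pathlib import Path


def config_path(environ=None, cwd=None):
    """Return where the configuration file would be looked up (illustrative)."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    override = environ.get("NOMAD_CONFIG")
    # Assumption: NOMAD_CONFIG points at the file, not at a directory.
    return Path(override) if override else cwd / "nomad.yaml"
```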
The configuration sections and attributes can be denoted with YAML objects and attributes. Here is an example nomad.yaml file:
services:
  api_host: 'localhost'
  api_base_path: '/nomad-oasis'
oasis:
  is_oasis: true
  uses_central_user_management: true
north:
  jupyterhub_crypt_key: '978bfb2e13a8448a253c629d8dd84ff89587f30e635b753153960930cad9d36d'
meta:
  deployment: 'oasis'
  deployment_url: 'https://my-oasis.org/api'
  maintainer_email: 'me@my-oasis.org'
mongo:
  db_name: nomad_oasis_v1
elastic:
  entries_index: nomad_oasis_entries_v1
  materials_index: nomad_oasis_materials_v1
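Since a nomad.yaml file only needs to list the items that differ from the defaults, the effective configuration can be thought of as the defaults with the file's values merged on top, section by section. The sketch below illustrates that idea with plain dictionaries. `deep_merge` is a hypothetical helper, and the `defaults` shown are a small hand-copied subset of this reference.

```python
import copy


def deep_merge(defaults, overrides):
    """Return defaults with overrides applied recursively (inputs untouched)."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Subset of the documented defaults.
defaults = {
    "services": {"api_host": "localhost", "api_port": 8000,
                 "api_base_path": "/fairdi/nomad/latest"},
    "mongo": {"host": "localhost", "port": 27017, "db_name": "nomad_v1"},
}
# Values as they might be given in a nomad.yaml file.
overrides = {
    "services": {"api_base_path": "/nomad-oasis"},
    "mongo": {"db_name": "nomad_oasis_v1"},
}
effective = deep_merge(defaults, overrides)
```

Items that the file does not mention, such as services.api_port, keep their default value.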
The following is a reference of all configuration sections and attributes.
Services¶
services¶
Contains basic configuration of the NOMAD services (app, worker, north).
name | type | |
---|---|---|
api_host | str | The external hostname that clients can use to reach this NOMAD installation. default: localhost |
api_port | int | The port used to expose the NOMAD app and api to clients. default: 8000 |
api_base_path | str | The base path prefix for the NOMAD app and api. default: /fairdi/nomad/latest |
api_secret | str | A secret that is used to issue download and other tokens. default: defaultApiSecret |
https | bool | Set to True if external clients use SSL to connect to this installation. This requires setting up a reverse proxy (e.g. the one used in the docker-compose based installation) that handles the SSL encryption. default: False |
https_upload | bool | Set to True if upload curl commands should suggest the use of SSL for file uploads. This can be configured independently of https, e.g. to suggest uploading large files via regular HTTP. default: False |
admin_user_id | str | The user_id of the admin user. All users are treated the same; no particular authorization information is attached to user accounts. However, the API will grant the user with the given user_id more rights, e.g. using the admin owner setting when accessing data. default: 00000000-0000-0000-0000-000000000000 |
encyclopedia_base | str | This enables links to the given encyclopedia installation in the UI. default: https://nomad-lab.eu/prod/rae/encyclopedia/# |
aitoolkit_enabled | bool | If true, the UI will show a menu with links to the AI Toolkit notebooks on nomad-lab.eu. default: False |
console_log_level | int | The log level that controls console logging for all NOMAD services (app, worker, north). The level is given as a Python logging log level number. default: 30 |
upload_limit | int | The maximum allowed unpublished uploads per user. If a user exceeds this amount, the user cannot add more uploads. default: 10 |
force_raw_file_decoding | bool | By default, text raw-files are interpreted with utf-8 encoding. If this fails, the actual encoding is guessed. With this setting, iso-8859-1 encoding is assumed if a file is not decodable with utf-8. default: False |
max_entry_download | int | There is an inherent limit in page-based pagination with Elasticsearch. If you increased this limit in your Elasticsearch, you can adapt this setting accordingly, changing the maximum number of entries that can be paginated with page-based pagination. Page-after-value-based pagination is independent and can be used without limitations. default: 50000 |
unavailable_value | str | Value that is used in results section Enum fields (e.g. system type, spacegroup, etc.) to indicate that the value could not be determined. default: unavailable |
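The console_log_level attribute uses the numeric levels of Python's standard logging module, so the default of 30 corresponds to WARNING. The short sketch below shows how a number maps to a level name; `level_name` is a hypothetical helper, while `logging.getLevelName` is part of the standard library.

```python
import logging

DEFAULT_CONSOLE_LOG_LEVEL = 30  # default documented for services.console_log_level


def level_name(console_log_level):
    """Translate a numeric Python logging level into its name."""
    return logging.getLevelName(console_log_level)


default_name = level_name(DEFAULT_CONSOLE_LOG_LEVEL)
```

Lower numbers are more verbose: `logging.DEBUG` is 10 and `logging.INFO` is 20.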
meta¶
Metadata about the deployment and how it is presented to clients.
name | type | |
---|---|---|
label | str | An additional log-stash data key-value pair added to all logs. Can be used to differentiate deployments when analyzing logs. |
beta | dict | Additional data that describes how the deployment is labeled as a beta-version in the UI. |
version | str | The NOMAD version string. default: 1.1.8.dev0+ge215aebcf.d20230228 |
commit | str | The source-code commit that this installation's NOMAD version is built from. default: `` |
deployment | str | Human-friendly name of this nomad deployment. default: devel |
deployment_url | str | The NOMAD deployment's url (api url). default: http://localhost:8000/fairdi/nomad/latest/api |
service | str | Name for the service that is added to all logs. Depending on how NOMAD is installed, services get a name (app, worker, north) automatically. default: unknown nomad service |
name | str | Web-site title for the NOMAD UI. default: NOMAD (deprecated) |
homepage | str | Provider homepage. default: https://nomad-lab.eu (deprecated) |
source_url | str | URL of the NOMAD source-code repository. default: https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR (deprecated) |
maintainer_email | str | Email of the NOMAD deployment maintainer. default: markus.scheidgen@physik.hu-berlin.de |
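The default of meta.deployment_url shown above matches what one gets by combining the services defaults for api_host, api_port, and api_base_path with an /api suffix. The sketch below reproduces that default. The composition rule and the function name `api_url` are assumptions made for illustration, not a statement about how NOMAD derives the value.

```python
def api_url(api_host="localhost", api_port=8000,
            api_base_path="/fairdi/nomad/latest", https=False):
    """Compose an API URL from the services settings (assumed rule)."""
    scheme = "https" if https else "http"
    # Drop surrounding slashes so the parts can be joined cleanly.
    parts = [part for part in (api_base_path.strip("/"), "api") if part]
    return f"{scheme}://{api_host}:{api_port}/" + "/".join(parts)


default_deployment_url = api_url()
```

A deployment behind a reverse proxy will usually set deployment_url explicitly, as the example nomad.yaml does.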
oasis¶
Settings related to the configuration of a NOMAD Oasis deployment.
name | type | |
---|---|---|
allowed_users | str | A list of usernames or user account emails. These represent a white-list of allowed users. With this, users need to log in right away and only the listed users may use this deployment. All API requests must have authentication information as well. |
is_oasis | bool | Set to True to indicate that this deployment is a NOMAD Oasis. default: False |
uses_central_user_management | bool | Set to True to use the central user-management. Typically the NOMAD backend is using the configured keycloak to access user data. With this, the backend will use the API of the central NOMAD (central_nomad_deployment_url) instead. default: False |
central_nomad_deployment_url | str | The URL of the API of the NOMAD deployment that is considered the central NOMAD. default: https://nomad-lab.eu/prod/v1/api |
north¶
Settings related to the operation of the NOMAD remote tools hub service north.
name | type | |
---|---|---|
enabled | str | Enables or disables the NORTH API and UI views. This is independent of whether you run a jupyter hub or not. default: True |
hub_connect_ip | str | Overwrites the default hostname that can be used from within a north container to reach the host system. Typically has to be set for non-Linux hosts. Set this to host.docker.internal on Windows/macOS. |
hub_connect_url | str | This setting is forwarded to jupyterhub; refer to the jupyterhub documentation. |
docker_network | str | This setting is forwarded to jupyterhub; refer to the jupyterhub documentation. |
jupyterhub_crypt_key | str | This setting is forwarded to jupyterhub; refer to the jupyterhub documentation. |
nomad_host | str | The NOMAD app host name that spawned containers use. |
tools | typing.Union[str, typing.Dict[str, nomad.config.NorthTool]] | The available north tools. Either the tools definitions as dict or a path to a .json file. default: dependencies/nomad-remote-tools-hub/tools.json |
hub_service_api_token | str | A secret token shared between NOMAD and the NORTH jupyterhub. This needs to be the token of an admin service. default: secret-token |
hub_ip | str | This setting is forwarded to jupyterhub; refer to the jupyterhub documentation. default: 0.0.0.0 |
hub_host | str | The internal host name that NOMAD services use to connect to the jupyterhub API. default: localhost |
hub_port | int | The internal port that NOMAD services use to connect to the jupyterhub API. default: 9000 |
windows | bool | Enable windows OS hacks. default: True |
Files, databases, external services¶
rabbitmq¶
Configures how NOMAD is connecting to RabbitMQ.
name | type | |
---|---|---|
host | str | The name of the host that runs RabbitMQ. default: localhost |
user | str | The RabbitMQ user that is used to connect. default: rabbitmq |
password | str | The password that is used to connect. default: rabbitmq |
fs¶
name | type | |
---|---|---|
staging_external | str | |
public_external | str | |
north_home_external | str | |
external_working_directory | str | |
tmp | str | default: .volumes/fs/tmp |
staging | str | default: .volumes/fs/staging |
public | str | default: .volumes/fs/public |
north_home | str | default: .volumes/fs/north/users |
local_tmp | str | default: /tmp |
prefix_size | int | default: 2 |
archive_version_suffix | str | default: v1 |
working_directory | str | default: /app |
elastic¶
name | type | |
---|---|---|
host | str | default: localhost |
port | int | default: 9200 |
timeout | int | default: 60 |
bulk_timeout | int | default: 600 |
bulk_size | int | default: 1000 |
entries_per_material_cap | int | default: 1000 |
entries_index | str | default: nomad_entries_v1 |
materials_index | str | default: nomad_materials_v1 |
keycloak¶
name | type | |
---|---|---|
public_server_url | str | |
client_secret | str | |
server_url | str | default: https://nomad-lab.eu/fairdi/keycloak/auth/ |
realm_name | str | default: fairdi_nomad_prod |
username | str | default: admin |
password | str | default: password |
client_id | str | default: nomad_public |
mongo¶
Connection and usage settings for MongoDB.
name | type | |
---|---|---|
host | str | The name of the host that runs mongodb. default: localhost |
port | int | The port to connect with mongodb. default: 27017 |
db_name | str | The used mongodb database name. default: nomad_v1 |
logstash¶
name | type | |
---|---|---|
level | int | default: 10 |
enabled | bool | default: False |
host | str | default: localhost |
tcp_port | str | default: 5000 |
mail¶
name | type | |
---|---|---|
enabled | bool | default: False |
with_login | bool | default: False |
host | str | default: `` |
port | int | default: 8995 |
user | str | default: `` |
password | str | default: `` |
from_address | str | default: support@nomad-lab.eu |
cc_address | str | default: support@nomad-lab.eu |
datacite¶
name | type | |
---|---|---|
mds_host | str | default: https://mds.datacite.org |
enabled | bool | default: False |
prefix | str | default: 10.17172 |
user | str | default: * |
password | str | default: * |
rfc3161_timestamp¶
name | type | |
---|---|---|
cert | str | Path to the optional rfc3161ng timestamping server certificate. |
username | str | |
password | str | |
server | str | The rfc3161ng timestamping host. default: http://time.certum.pl/ |
hash_algorithm | str | Hash algorithm used by the rfc3161ng timestamping server. default: sha256 |
Processing¶
celery¶
name | type | |
---|---|---|
max_memory | float | default: 64000000.0 |
timeout | int | default: 1800 |
acks_late | bool | default: False |
routing | str | default: queue |
priorities | dict | default: |
normalize¶
name | type | |
---|---|---|
system_classification_with_clusters_threshold | int | The system size limit for running the dimensionality analysis. For very large systems the dimensionality analysis will get too expensive. default: 64 |
symmetry_tolerance | float | Symmetry tolerance controls the precision used by spglib in order to find symmetries. The atoms are allowed to move 1/2 * symmetry_tolerance from their symmetry positions in order for spglib to still detect symmetries. The unit is angstroms. The value of 0.1 is used e.g. by Materials Project according to https://pymatgen.org/pymatgen.symmetry.analyzer.html#pymatgen.symmetry.analyzer.SpacegroupAnalyzer default: 0.1 |
prototype_symmetry_tolerance | float | The symmetry tolerance used in aflow prototype matching. Should only be changed before re-running the prototype detection. default: 0.1 |
max_2d_single_cell_size | int | Maximum number of atoms in the single cell of a 2D material for it to be considered valid. default: 7 |
cluster_threshold | float | The distance tolerance between atoms for grouping them into the same cluster. Used in detecting system type. default: 2.5 |
angle_rounding | float | Defines the "bin size" for rounding cell angles for the material hash in degree. default: 10.0 |
flat_dim_threshold | float | The threshold for a system to be considered "flat". Used e.g. when determining if a 2D structure is purely 2-dimensional to allow extra rigid transformations that are improper in 3D but proper in 2D. default: 0.1 |
k_space_precision | float | The threshold for point equality in k-space. Unit: 1/m. default: 150000000.0 |
band_structure_energy_tolerance | float | The energy threshold for how much a band can be on top or below the fermi level in order to still detect a gap. Unit: Joule. default: 8.01088e-21 |
springer_db_path | str | default: /usr/local/lib/python3.7/site-packages/nomad/normalizing/data/springer.msg |
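Two of the defaults above are given in SI units that are less common in materials science. The following sketch converts them into more familiar units: the band_structure_energy_tolerance default corresponds to roughly 0.05 eV, and the k_space_precision default to 0.015 1/Å. The function names are hypothetical; the conversion factors are the standard ones.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # joule per electronvolt (exact SI value)


def joule_to_ev(energy_joule):
    """Convert an energy from joule to electronvolt."""
    return energy_joule / ELEMENTARY_CHARGE


def per_meter_to_per_angstrom(value_per_meter):
    """Convert a reciprocal length from 1/m to 1/angstrom (1 m = 1e10 angstrom)."""
    return value_per_meter * 1e-10


energy_tolerance_ev = joule_to_ev(8.01088e-21)                 # documented default
k_space_precision_per_angstrom = per_meter_to_per_angstrom(150000000.0)
```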
process¶
name | type | |
---|---|---|
redirect_stdouts | bool | If True, lines written to stdout (e.g. print output) during processing (e.g. by parsers or normalizers) are redirected as log entries. default: False |
store_package_definition_in_mongo | bool | Configures whether to store the corresponding package definition in mongodb. default: False |
add_definition_id_to_reference | bool | Configures whether to attach the definition id to m_def. Note that it is different from m_def_id. The m_def_id will be exported with with_def_id=True via m_to_dict. default: False |
write_definition_id_to_archive | bool | Write m_def_id to the archive. default: False |
index_materials | bool | default: True |
reuse_parser | bool | default: True |
metadata_file_name | str | default: nomad |
metadata_file_extensions | tuple | default: ('json', 'yaml', 'yml') |
auxfile_cutoff | int | default: 100 |
parser_matching_size | int | default: 12000 |
max_upload_size | int | default: 34359738368 |
use_empty_parsers | bool | default: False |
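The max_upload_size default is easier to read once it is expressed in binary units. Assuming the value is a number of bytes (the table does not state the unit), it equals 32 GiB. `format_binary_size` below is a hypothetical helper written for this illustration.

```python
def format_binary_size(n_bytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(n_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024.
        if size < 1024 or unit == units[-1]:
            return f"{size:g} {unit}"
        size /= 1024


max_upload_size = 34359738368  # documented default
readable = format_binary_size(max_upload_size)
```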
reprocess¶
Configures standard behaviour when reprocessing. Note, the settings only matter for published uploads and entries. For uploads in staging, we always reparse, add newfound entries, and delete unmatched entries.
name | type | |
---|---|---|
rematch_published | bool | default: True |
reprocess_existing_entries | bool | default: True |
use_original_parser | bool | default: False |
add_matched_entries_to_published | bool | default: True |
delete_unmatched_published_entries | bool | default: False |
index_individual_entries | bool | default: False |
bundle_export¶
Controls behaviour related to exporting bundles.
name | type | |
---|---|---|
default_cli_bundle_export_path | str | Default path used when exporting bundles using the CLI command. default: ./bundles |
default_settings | BundleExportSettings | General default settings. default: include_raw_files=True include_archive_files=True include_datasets=True |
default_settings_cli | BundleExportSettings | Additional default settings, applied when exporting using the CLI. This allows overriding some of the settings specified in the general default settings above. |
BundleExportSettings¶
name | type | |
---|---|---|
include_raw_files | bool | If the raw files should be included in the export. default: True |
include_archive_files | bool | If the parsed archive files should be included in the export. default: True |
include_datasets | bool | If the datasets should be included in the export. default: True |
bundle_import¶
Controls behaviour related to importing bundles.
name | type | |
---|---|---|
required_nomad_version | str | Minimum NOMAD version of bundles required for import. default: 1.1.2 |
default_cli_bundle_import_path | str | Default path used when importing bundles using the CLI command. default: ./bundles |
allow_bundles_from_oasis | bool | If oasis admins can "push" bundles to this NOMAD deployment. default: False |
allow_unpublished_bundles_from_oasis | bool | If oasis admins can "push" bundles of unpublished uploads. default: False |
default_settings | BundleImportSettings | General default settings. default: include_raw_files=True include_archive_files=True include_datasets=True include_bundle_info=True keep_original_timestamps=False set_from_oasis=True delete_upload_on_fail=False delete_bundle_on_fail=True delete_bundle_on_success=True delete_bundle_include_parent_folder=True trigger_processing=False process_settings=Reprocess(rematch_published=True, reprocess_existing_entries=True, use_original_parser=False, add_matched_entries_to_published=True, delete_unmatched_published_entries=False, index_individual_entries=False) |
default_settings_cli | BundleImportSettings | Additional default settings, applied when importing using the CLI. This allows overriding some of the settings specified in the general default settings above. default: include_raw_files=True include_archive_files=True include_datasets=True include_bundle_info=True keep_original_timestamps=False set_from_oasis=True delete_upload_on_fail=False delete_bundle_on_fail=False delete_bundle_on_success=False delete_bundle_include_parent_folder=True trigger_processing=False process_settings=Reprocess(rematch_published=True, reprocess_existing_entries=True, use_original_parser=False, add_matched_entries_to_published=True, delete_unmatched_published_entries=False, index_individual_entries=False) |
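Comparing the two defaults above, the CLI settings differ from the general ones only in delete_bundle_on_fail and delete_bundle_on_success, so a CLI import keeps the source bundle by default. The relationship can be sketched with a small dataclass. `ImportSettings` is a hypothetical stand-in that models only a few of the fields; it is not NOMAD's BundleImportSettings class.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImportSettings:
    """Hypothetical subset of the bundle import settings."""
    delete_upload_on_fail: bool = False
    delete_bundle_on_fail: bool = True
    delete_bundle_on_success: bool = True
    trigger_processing: bool = False


# General defaults, as documented for default_settings.
default_settings = ImportSettings()

# The CLI defaults override only the fields that differ.
cli_overrides = {"delete_bundle_on_fail": False, "delete_bundle_on_success": False}
default_settings_cli = replace(default_settings, **cli_overrides)
```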
BundleImportSettings¶
name | type | |
---|---|---|
include_raw_files | bool | If the raw files should be included in the import. default: True |
include_archive_files | bool | If the parsed archive files should be included in the import. default: True |
include_datasets | bool | If the datasets should be included in the import. default: True |
include_bundle_info | bool | If the bundle_info.json file should be kept (not necessary but may be nice to have). default: True |
keep_original_timestamps | bool | If all timestamps (create time, publish time etc) should be imported from the bundle. default: False |
set_from_oasis | bool | If the from_oasis flag and oasis_deployment_url should be set. default: True |
delete_upload_on_fail | bool | If False, it is just removed from the ES index on failure. default: False |
delete_bundle_on_fail | bool | Deletes the source bundle if the import fails. default: True |
delete_bundle_on_success | bool | Deletes the source bundle if the import succeeds. default: True |
delete_bundle_include_parent_folder | bool | When deleting the bundle, also include parent folder, if empty. default: True |
trigger_processing | bool | If the upload should be processed when the import is done (not recommended). default: False |
process_settings | Reprocess | When trigger_processing is set to True, these settings control the reprocessing behaviour (see the config for reprocess for more info). NOTE: reprocessing is no longer the recommended method to import bundles. default: rematch_published=True reprocess_existing_entries=True use_original_parser=False add_matched_entries_to_published=True delete_unmatched_published_entries=False index_individual_entries=False |
Reprocess¶
Configures standard behaviour when reprocessing. Note, the settings only matter for published uploads and entries. For uploads in staging, we always reparse, add newfound entries, and delete unmatched entries.
name | type | |
---|---|---|
rematch_published | bool | default: True |
reprocess_existing_entries | bool | default: True |
use_original_parser | bool | default: False |
add_matched_entries_to_published | bool | default: True |
delete_unmatched_published_entries | bool | default: False |
index_individual_entries | bool | default: False |
archive¶
name | type | |
---|---|---|
block_size | int | default: 262144 |
read_buffer_size | int | GPFS needs at least 256K to achieve decent performance. default: 262144 |
max_process_number | int | Maximum number of processes that can be assigned to process an archive query. default: 20 |
min_entries_per_process | int | Minimum number of entries per process. default: 20 |
User Interface¶
ui¶
name | type | |
---|---|---|
entry_context | dict | default: |
search_contexts | dict | default: |
example_uploads | dict | default: |
default_unit_system | str | default: Custom |
north_enabled | bool | This is a derived value filled with north.enabled. default: True |
theme | dict | default: |
Others¶
tests¶
name | type | |
---|---|---|
default_timeout | int | default: 60 |
resources¶
name | type | |
---|---|---|
enabled | bool | default: False |
db_name | str | default: nomad_v1_resources |
max_time_in_mongo | float | Maximum time a resource is stored in mongodb before being updated. default: 31536000.0 |
download_retries | int | Number of retries when downloading resources. default: 2 |
download_retry_delay | int | Delay between retries in seconds. default: 10 |
max_connections | int | Maximum simultaneous connections used to download resources. default: 10 |
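To illustrate how download_retries and download_retry_delay might interact, here is a generic retry loop. It is a sketch under stated assumptions, not NOMAD's download code: `download_with_retries` and `fetch` are hypothetical names, a retry is taken to mean an additional attempt after the first one, and only `OSError` is treated as retryable.

```python
import time


def download_with_retries(fetch, download_retries=2, download_retry_delay=10,
                          sleep=time.sleep):
    """Call fetch() and retry after a delay if it raises OSError."""
    attempts = download_retries + 1  # assumption: first try plus the retries
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries, let the error propagate
            sleep(download_retry_delay)
```

The `sleep` parameter exists only so the delay can be replaced when trying the function out; with the defaults, a download that keeps failing is attempted three times with 10 seconds between attempts.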
client¶
name | type | |
---|---|---|
user | str | |
password | str | |
access_token | str | |
url | str | default: http://nomad-lab.eu/prod/v1/api |
gitlab¶
name | type | |
---|---|---|
private_token | str | default: not set |