RIPER ASE Calculator - Current Implementation Notes
=====================================================

Location
--------
Directory inspected: /home/manas/tm_master/turbomole_vqe_2/riper_ase_calculator
Package name/version: riper 0.2.7
Main public import: from riper import RIPERCalculator
Dependencies declared in pyproject.toml: ase>=3.22, numpy>=1.20

High-Level Purpose
------------------
This package provides an ASE-compatible calculator for TURBOMOLE's RIPER module.
It writes TURBOMOLE coord/control files from ASE Atoms objects, launches a RIPER
binary/command, and parses energy, forces, and stress back into ASE units.

Public API
----------
1. RIPERCalculator
   - Defined in src/riper/calculator.py.
   - Exported from src/riper/__init__.py.
   - Subclasses ase.calculators.calculator.Calculator.
   - implemented_properties = ["energy", "forces", "stress"].
   - Intended ASE usage:
       atoms.calc = RIPERCalculator(...)
       atoms.get_potential_energy()
       atoms.get_forces()
       atoms.get_stress()

2. RIPERConvergenceError
   - Raised when convergence_check="error" and the RIPER output contains a
     recognized non-convergence pattern.

3. RIPERConvergenceWarning
   - Issued when convergence_check="warning" and non-convergence is detected.

Calculator Parameters
---------------------
RIPERCalculator accepts these explicit parameters:

- basis: orbital basis set. If None, it is selected automatically:
  - non-periodic/0D systems: def2-SVP
  - periodic systems: pob-DZVP-rev2
- auxbasis: auxiliary basis for density fitting. Default: universal.
- functional: exchange-correlation functional. Default: pbe.
- nkpoints: k-point grid. Default: [2, 2, 2]. Only the first npdir entries are
  written, where npdir is the number of periodic directions.
- charge: total system charge. Default: 0. Nonzero values write an $atomdens
  block with charge.
- uks: enables unrestricted Kohn-Sham through $uhf. Default: False.
- smear: if True, writes sigma inside $riper. Default: False.
- sigma: Gaussian smearing width in Hartree. Default: 0.01.
- riper_command: shell command used to run RIPER. Default: riper. Examples use
  riper_smp; the docstring also mentions riper_omp and full paths.
- directory: working directory for input/output files. Default: current dir.
- extra_keywords: dictionary of additional $riper keywords. Each key/value is
  written as "key value" under $riper.
- convergence_check: one of error, warning, ignore. Default: error.
- scfiterlimit: writes $scfiterlimit. Default: 50.
- scfconv: writes $scfconv. Default: 7.
- denconv: optional $denconv value. Default: None, so no $denconv line.
- ricore: memory in MB for holding RI integrals in memory through $ricore. Default: 5000.
- ncore: optional RIPER subprocess thread count. Default: None, which leaves
  the inherited environment untouched. If set, subprocess environment
  variables for SMP/OpenMP/BLAS thread counts are set to this value.
- disp3: if True, writes $disp3 for DFT-D3 dispersion. Default: False.
- plot_dens: if True, writes $pointvalper fmt=cub plus dens to request td.cub.
  Default: False.

The constructor also accepts **kwargs that are passed to ASE Calculator. The
examples currently use calculate_stress=True this way; calculate() later reads
self.parameters.get("calculate_stress", False).

Calculation Workflow
--------------------
When ASE calls calculate():

1. ASE Calculator state is updated via super().calculate(...).
2. The internal iteration counter is incremented and self.converged is reset.
3. Periodicity is determined from atoms.pbc using get_periodicity(atoms), which
   returns int(sum(atoms.pbc)).
4. The default basis is selected if basis=None.
5. Stress calculation is enabled if either:
   - ASE requested the "stress" property and the system is periodic, or
   - calculate_stress=True was supplied through the ASE kwargs path.
6. The working directory is created.
7. coord is written with write_coord().
8. control is written with write_control().
9. riper_command is run with subprocess.run(shell=True, cwd=workdir), with both
   stdout and stderr redirected to riper.out.
10. A nonzero process return code raises RuntimeError.
11. Unless convergence_check="ignore", riper.out is scanned for known failure
    text. Failure can raise RIPERConvergenceError or emit RIPERConvergenceWarning.
12. Total energy is parsed from riper.out and stored as results["energy"].
13. Forces are parsed from the control file's $grad block and stored as
    results["forces"] when available.
14. If stress was requested/enabled, stress is parsed from riper.out and stored
    as results["stress"].

Files Written For A Calculation
-------------------------------
In the selected calculation directory:

- coord
  Written by write_coord(). Coordinates are converted from Angstrom to Bohr and
  atom symbols are lowercased. Format:
    $coord
      x y z symbol
    $end

- control
  Written by write_control(). It can include:
    $coord file=coord
    $atoms with basis and jbas
    $periodic and $lattice for periodic systems
    $dft with functional, gridsize m5, and weight derivatives
    $kpoints for periodic systems
    $atomdens charge for nonzero charge
    $uhf for UKS
    $scfiterlimit
    $scfconv
    $denconv when requested
    $disp3 when requested
    $riper when smear or extra_keywords are present
    $optcell when stress/cell optimization is requested for periodic systems
    $pointvalper fmt=cub / dens when plot_dens=True
    $end

- riper.out
  Captures RIPER stdout/stderr.

RIPER/TURBOMOLE may additionally create/update files such as statistics, ddens,
errvec, oldfock, control gradient sections, td.cub, and band/orbital files.

Periodicity And Lattice Handling
--------------------------------
- get_periodicity(atoms) simply counts True entries in atoms.pbc.
- The implementation assumes RIPER-style periodic directions by count:
  - 0D: no $periodic/$lattice
  - 1D: uses cell[0][0] only as a single lattice scalar
  - 2D: uses xy components of cell vectors a and b
  - 3D: uses all three full cell vectors
- Lattice values are converted from Angstrom to Bohr.
- kpoints are truncated to the number of periodic directions.

Output Parsers
--------------
1. read_energy(output_file)
   - Looks for the FINAL ENERGIES section, then TOTAL ENERGY.
   - Parses the first numeric value after the word ENERGY.
   - Converts Hartree to eV.
   - Raises RuntimeError if the file or energy is missing.

2. read_forces(gradient_file, natoms)
   - Despite the parameter name, the calculator passes the control file path.
   - Looks for a $grad block and the final $end.
   - Reads the last natoms lines before $end as gradients.
   - Supports Fortran D exponents.
   - Converts gradients to forces by negating and converting Hartree/Bohr to
     eV/Angstrom.

3. read_stress(output_file, atoms)
   - Looks for the last "stress tensor, raw:" section in riper.out.
   - Expects 9 scalar values in RIPER order: xx, xy, yy, xz, yz, zz, yx, zx, zy.
   - Uses xx, yy, zz, yz, xz, xy for ASE Voigt order.
   - Treats RIPER values as dE/d(strain) in Hartree and divides by cell volume.
   - Converts to eV/Angstrom^3.
   - Raises RuntimeError if no stress section exists, parsing fails, or volume is
     near zero.

Convergence Handling
--------------------
check_convergence(output_file) scans riper.out for failure patterns including:
- did not converge
- NOT CONVERGED
- SCF NOT CONVERGED
- CONVERGENCE NOT REACHED
- convergence not achieved
- exceeded maximum number of iterations
- SCF FAILED

The function also defines success patterns, but the current implementation does
not require a success pattern to be present. If no failure pattern is found, the
calculation is treated as converged.

Cleanup Functionality
---------------------
RIPERCalculator.clean(keep_output=False) removes selected temporary files from
its working directory:
- statistics
- ddens
- errvec
- oldfock
- riper.out unless keep_output=True

Several other possible TURBOMOLE files are listed in comments but are not removed
by the current implementation.

Examples Present
----------------
1. examples/example01_0D_water.py
   - Single-point molecular water calculation.
   - Demonstrates energy and forces for 0D/non-periodic systems.

2. examples/example02_0D_water_geom_opt.py
   - Water geometry optimization from water.xyz using ASE LBFGS.
   - Writes optimized extxyz and converted trajectory.

3. examples/example03_1D_LiH.py
   - 1D periodic LiH chain with pbc=[True, False, False].
   - Demonstrates k-point truncation to the 1D periodic direction.
   - Includes a lattice-parameter scan.

4. examples/example04_adjust_conv_criteria.py
   - Water calculation with tighter SCF settings.
   - Demonstrates scfconv, denconv, and scfiterlimit.

5. examples/example05_dispersion_correction.py
   - Water dimer calculation with and without D3.
   - Demonstrates disp3=True and compares energy difference.

6. examples/example06_gen_density_cube.py
   - Water density cube generation.
   - Demonstrates plot_dens=True and checks for td.cub.

7. examples/example07_calculate_stress.py
   - Bulk Si stress calculation.
   - Demonstrates calculate_stress=True, get_stress(), Voigt labels, conversion
     to GPa, hydrostatic pressure, and extxyz output.

8. examples/example08_bulk_Si_unit_cell+geom_opt.py
   - Bulk Si cell and geometry optimization using ASE FrechetCellFilter and BFGS.
   - Uses a tighter scfconv and dense k-point grid.
   - Writes optimized CIF/extxyz and trajectory extxyz.

9. examples/example09_neb_ethane_torsional_barrier_staggerd_eclipsed.py
   - CI-NEB example for ethane internal rotation.
   - Builds staggered/reactant and rotated/product endpoints, optimizes both,
     interpolates with IDPP, assigns separate calculator directories per image,
     runs climbing-image NEB, reports barrier, and writes path extxyz.

Data Files Present
------------------
- examples/water.xyz: 3-atom water geometry used by example02.

Implementation Observations / Current Limitations
-------------------------------------------------
- README.md is minimal; most usage documentation currently lives in examples and
  docstrings.
- calculate_stress is used by examples but is not an explicit __init__ parameter;
  it currently relies on ASE Calculator **kwargs storage.
- get_periodicity counts periodic axes but does not validate that periodic axes
  are the leading x/y/z axes required by cell_to_lattice_vectors. For example,
  pbc=[False, True, False] would be counted as 1D but the code would still use
  cell[0][0].
- The convergence success patterns list is unused; absence of known failure text
  means success.
- riper_command is executed with shell=True.
- Forces are read from the control file's $grad block, not a separate gradient
  file, because RIPER appears to update control.
- If the control file exists but has no $grad block, force parsing raises a
  RuntimeError.
- Stress parsing ignores the asymmetric duplicate raw components yx, zx, zy.
- plot_dens requests density output but does not parse or return the cube file;
  examples check td.cub manually.
- extra_keywords are written only inside $riper and support simple key/value text.
- No tests were present in this directory during inspection; only examples were
  found.

Update - Band Gap And New Examples
----------------------------------
Changes added after the initial implementation survey:

- Added read_band_gap(output_file, eigs_file=None) in src/riper/io.py.
  It parses the final "Fermi Level Statistics" block in riper.out. The
  preferred value is "Band gap" from the eV block; if only an au/unlabelled
  block is present, the value is converted from Hartree to eV. If no final
  Fermi-level statistics are present, it falls back to an EIGS GAP line.

- Added RIPERCalculator.get_band_gap(). This reads riper.out from the
  calculator directory and returns the HOMO-LUMO/band gap in eV after a prior
  get_potential_energy()/RIPER run. Successful calculations also store the
  parsed value in calc.results["band_gap"] and atoms.info["band_gap"], so ASE
  extXYZ exports include a band_gap field.

- No pre-existing calculator-side block against hybrid-functional gradients was
  found. Hybrid functionals are passed through as the functional string in the
  $dft section.

New examples/data added:

10. examples/example10_0D_water_b3lyp_geom_opt.py
    - Water geometry optimization with B3LYP (functional="b3-lyp").
    - Demonstrates hybrid-functional gradients and get_band_gap().

11. examples/example11_graphene_scan0.py
    - Graphene single-point calculation using SCAN0 and pob-TZVP-rev2.
    - Reads examples/graphene.extxyz.
    - Reports energy, forces, and get_band_gap().

12. examples/example12_Fe_desnue_scan.py
    - Primitive Fe UKS/PBE calculation over several desnue values.
    - Uses sigma=0.001, pob-DZVP-rev2, a dense 18x18x18 k-grid, and reports
      the lowest-energy target magnetic moment.
    - Reads examples/Fe_primitive.cif.

13. examples/example13_graphene_hbn_interlayer_scan_scan0_disp3.py
    - Manual graphene/hBN interlayer-spacing scan using SCAN0 + D3.
    - Reads examples/graphene_hbn_heterostructure.extxyz.
    - Writes an extxyz scan trajectory, a PNG potential-energy curve, and
      reports the sampled optimum spacing.

Additional data files:
- examples/graphene.extxyz
- examples/Fe_primitive.cif
- examples/graphene_hbn_heterostructure.extxyz

Update - Dipole Moment Support
------------------------------
- Added `dipole=True` to RIPERCalculator. This writes `$fields` / `electric on`
  into the control file so the SCF electric dipole moment is calculated during
  the normal RIPER run.
- Added `read_dipole(output_file)` to parse the `SCF electric dipole moment
  (a.u.)` block from riper.out and convert the vector from e*bohr to
  e*Angstrom for ASE.
- Added `dipole` to implemented ASE properties, so `atoms.get_dipole_moment()`
  works. With `dipole=True`, calling it after `get_potential_energy()` uses the
  cached result and does not launch a separate SCF.
- The vector is stored as `calc.results["dipole"]` and mirrored as
  `atoms.info["dipole_moment"]`. The latter avoids ASE extXYZ duplicate-key
  conflicts because `dipole` is a standard calculator result key.
- Added `examples/example14_0D_water_dipole_geom_opt.py`.

Update - RI Integral Memory (`$ricore`)
--------------------------------------
- Added `ricore` argument to RIPERCalculator. Default is `5000`.
- `ricore` is written as `$ricore <value>` in the TURBOMOLE control file.
- Units are MB. Example: `ricore=50000` writes `$ricore 50000`.
- Set `ricore=None` to omit the `$ricore` keyword.


Update - RIPER Subprocess Core Count (`ncore`)
----------------------------------------------
- RIPER is an SMP/OpenMP parallel TURBOMOLE module. The traditional TURBOMOLE
  workflow is to set variables such as `PARNODES`/`OMP_NUM_THREADS` before
  running the program.
- Added `ncore=None` to RIPERCalculator. The default does not set any
  environment variables and lets the user shell/environment control threading.
- If `ncore` is set to a positive integer, the RIPER subprocess receives
  `PARNODES`, `OMP_NUM_THREADS`, `MKL_NUM_THREADS`, `OPENBLAS_NUM_THREADS`, and
  `NUMEXPR_NUM_THREADS` with that value. On macOS, `VECLIB_MAXIMUM_THREADS` and
  `ACCELERATE_MAXIMUM_THREADS` are also set.
- The environment override is passed only to `subprocess.run(..., env=...)`; it
  does not mutate `os.environ` globally.
- Added `examples/example15_graphene_ricore_ncore.py`, a minimal 3x3x1 graphene
  supercell single-point calculation showing `ricore=50000` MB and `ncore=32`.


Update - Additional Example Scripts
-----------------------------------
- Added `examples/example16_0D_water_scan0_vibrations.py` for SCAN0 water
  optimization followed by ASE finite-displacement vibrational frequencies.
- Added `examples/example17_bulk_Si_pbe_hse06_band_gap.py` for comparing the
  silicon unit-cell band gap from PBE and HSE06 using `get_band_gap()`.
- Added `examples/example18_graphene_phonon_band_structure.py` for an ASE
  Phonons finite-displacement graphene phonon band structure along
  Gamma-M-K-Gamma, writing both CSV data and a PNG plot.


Update - Graphene Phonon Cache Handling
---------------------------------------
- Updated `examples/example18_graphene_phonon_band_structure.py` to remove empty or `null`
  ASE phonon cache JSON files before running. ASE leaves zero-byte JSON cache
  files behind if a previous finite-displacement run is interrupted, and those
  stale files make `phonons.read()` fail with a `NoneType` cache entry.
