v1.6.0 (2026-03-10)
-------------------
* New Features (PBC systems)
  - Analytical nuclear gradients for periodic density-fitting methods.
  - Constrained DFT (CDFT) SCF and nuclear gradient support.
* New Features (molecular systems)
  - C6 coefficient calculation via Casimir-Polder integration (#619)
  - Unrestricted KS vv10 gradient and hessian (#640)
  - Constrained DFT (CDFT) SCF and nuclear gradients
  - Function to compute grid density with density gradient and tau (#659)
  - Density-Fitting MP2 (#662)
  - Fewest-switches surface hopping (FSSH) non-adiabatic dynamics with TDDFT.
* Improvements (PBC systems)
  - Performance optimizations for 2c2e integrals and their derivatives.
  - Improved default numerical precision for one-center overlap and kinetic integrals under PBC.
  - Accelerated Becke weight evaluation.
  - Added support for disabling radii adjustment in Becke derivative kernels.
  - Support for Becke grid derivatives under PBC.
  - GPU counterparts of the PySCF `Mole` and `Cell` classes, with a new `from_cpu` method for CPU-to-GPU conversion.
  - `multigrid_v2` is now the default integrator (#656).
* Improvements (molecular systems)
  - Refactored derivative integrals of the J and K terms in density-fitting methods, improving performance.
  - Improved efficiency of derivative integral evaluation for TDDFT gradients and NACVs.
  - Optimized CUDA kernels for derivative integrals (#627).
  - Refactored the libxc interface and removed rarely used XC functionals (#635).
  - Upgraded pyscf/dispersion and made gpu4pyscf self-contained with respect to dispersion implementations (#657).
  - Canonical orthogonalization added to the SCF eigenvalue solver.
  - Basis-set linear dependency detection and removal enabled by default.
* Fixes
  - Fixed an issue where the `lebedev_order` attribute in the SMD module was not recognized.
  - Handle zero strides for cuTENSOR (fixes #614)
  - Fixed a grid dimension overflow bug in ft_ao during FFTDF pseudopotential evaluation (#626)
  - Fixed a bug where `get_rho()` returned wrong results when dm has no attached mo_coeff field (#654)
* API updates
  - Removed the `get_veff` method from gradient classes.


v1.5.2 (2025-12-29)
-------------------
* Improvements
  - Accelerated grid response computation in DFT gradient calculations
  - PCM low-memory mode that avoids storing ngrids^2 intermediates
  - Introduced SortedMole and SortedCell classes to support generally contracted basis sets
  - Refactored SMD classes to improve compatibility with the PySCF SMD implementation across multiple PySCF versions
* Fixes
  - int32 overflow error for empty_mapped function
  - Internal state gets overwritten during int3c2e batch evaluation


v1.5.1 (2025-12-17)
-------------------
* New Features
  - RIS-approximated Z-vector solver for TD-RKS gradients and NACs
* Improvements
  - Save cycle counts for SCF methods.
  - Reduce memory usage when initializing JK builder
* Fixes
  - The ECP bug on Blackwell cards is fixed in CUDA 13.1
  - Illegal memory addresses in 3c2e generator caused by integer overflow
  - PCM hessian bug caused by atoms without PCM grids
  - Fix UHF.eigh bug caused by overwriting overlap matrix


v1.5.0 (2025-11-24)
-------------------
* New Features (PBC systems)
  - PBC GDF extended to k-mesh computations; k-point GDF integrals stored in host memory with compression.
  - Multigrid algorithm supports PBC k-point SCF and band structure calculations.
  - Add the .analyze() method for PBC gamma-point and k-mesh DFT to summarize results and charge populations.
  - Fermi and Gaussian smearing for PBC and molecular DFT.
  - PBC RSJK algorithm for J/K matrix evaluation (J via MD J-engine; K via Rys quadrature).
  - Analytical nuclear gradients using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
  - Stress tensor evaluation using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
  - Geometry optimizer for PBC DFT.
* New Features (molecular systems)
  - Support for QMMM point charges and external electric fields.
  - 3c2e integrals contracted with density matrices and auxiliary vectors for memory-efficient DF Coulomb matrix evaluation.
  - DFT Hessian second-derivative grid response.
  - Minimum energy crossing point (MECP) search functionality.
  - PCM support for TDDFT derivative coupling calculations.
  - Basic GKS and two-component numerical integration, including GPU-accelerated multi-collinear functionals.
  - Multi-collinear spin-flip TDA/TDDFT excitation energies and analytical gradients.
* Improvements (PBC systems)
  - Linear-dependency handling for basis functions in molecular and PBC DFT calculations.
  - Refactored PBC nuclear gradients for more efficient GTH pseudopotential evaluation.
  - Faster GTH pseudopotential evaluation using the multigrid algorithm on large systems.
* Improvements (molecular systems)
  - Optimized DFT numerical integration memory usage, achieving ~20% performance gains.
  - Refactored and optimized molecular four-center-integral J/K builder, achieving 50-100% speed-up.
  - Improved phase determination method for NACV.
  - More numerically stable Hessian integrals for large-exponent GTOs.
  - MD J-engine optimized with reduced CUDA register pressure.
  - Third-order XC derivatives can be evaluated on GPU (requires gpu4pyscf-libxc 0.7).
  - Default auxbasis_response level increased to 2 for Hessian calculations with DF integrals.
  - Dimension checks for eigh, enabling scipy fallback for large arrays (size > 21350).
* Fixes
  - Handle eps=inf in solvent models.
  - Fixed an edge case in EDA electrostatics when cross-fragment nocc is 1.
  - Fixed EDA crash caused by fragments accessing JK matrices after DF 3-index tensors were freed.
  - Workaround for CUDA 13 compiler bugs affecting ECP kernels (disabling compiler optimizations).
  - Molecular and PBC 3c2e integral dimension issues for generally contracted basis sets.
  - Removed pre-allocated streams that caused inconsistent synchronization.
  - SMD Hessian.
  - UHF crash when level_shift is enabled.
* API updates
  - Fix to_gpu/to_cpu interface in SMD, TDDFT, and PCM-TDDFT
  - Added from_cpu hook on the GPU side, allowing pyscf to invoke this hook in its to_gpu method.


v1.4.3 (2025-08-20)
-------------------
* New Features
  - Geometry optimization for excited states using TDDFT-ris methods.
  - Non-adiabatic coupling vectors for TDDFT-ris methods.
  - Analytical gradients for DFT+U with k-point sampling.
  - Stress tensor calculations for semi-local DFT with k-point sampling and at the gamma-point.
  - ASE interface to support crystal lattice optimization.
  - Multigrid v2 algorithm for meta-GGA functionals.
* Improvements
  - GPU kernels for PBC overlap and kinetic integrals.
  - Reduced GPU memory usage for TDDFT-ris by storing tensors in host memory.
  - In PBC methods, scaled k-points (fractional coordinates) are stored to
    simplify lattice optimization calculations.
  - A preconditioned Krylov solver to accelerate convergence in TDDFT and
    dynamic polarizability calculations.
* Fixes
  - Basis decontraction issue for d and f orbitals.
  - A bug in Multigrid v2 algorithm related to non-orthogonal lattices.
  - Incorrect virtual orbital energies when level_shift was enabled.

v1.4.2 (2025-07-20)
-------------------
* New Features
  - Raman spectrum calculations.
  - Non-adiabatic coupling vector for time-dependent RKS, including the coupling
    between the ground state and excited states as well as among excited states.
  - DFT+U for molecule and PBC systems.
  - ALMO EDA 2 method.
  - Analytical gradients for TDDFT-ris method.
  - Analytical gradients for PBC k-point DFT.
  - Efficient analytical gradients for PBC Gamma-point DFT using the multigrid algorithm.
  - A custom CuPy memory pool to reduce GPU memory usage.
* Improvements
  - Improved PBC GDF integral computation at the Gamma point, including reduced
    GPU memory usage and enhanced computational efficiency.
  - Set the J engine as the default Coulomb matrix algorithm in the direct SCF driver.
  - Efficient Multigrid integral algorithm for various functions in PBC DFT
    Gamma-point computation, such as get_nuc, get_pp, and GGA functionals.
  - Support for the xc='HF' setting in DFT.
* Fixes
  - Ensured compatibility with CUDA 12.3.
  - Issues related to the combination of density fitting, PCM solvent, and TDDFT.

v1.4.1 (2025-05-20)
-------------------
* New Features
  - Analytical hessian for VV10 functionals
  - DFT polarizability with VV10 functionals
  - TDDFT for VV10 functionals
  - Non-adiabatic coupling constants for TDDFT states
  - TDDFT gradients and geometry optimization solver for excited states
  - LR-PCM for TDDFT and TDDFT gradients
  - TDDFT-ris method
* Improvements
  - Optimized CUDA kernel and integral screening for the MD J-engine. The MD
    J-engine is used by default for large-system HF and DFT computations.
  - Optimization for PBC Gaussian density fitting at the gamma point.
  - ECP gradients CUDA kernel
  - Reduced atomicAdd overhead in Rys JK kernel
* Fixes
  - MINAO initial guess for ghost atoms

v1.4.0 (2025-03-27)
-------------------
* New Features
  - RKS and UKS TDDFT Gradients for density fitting and direct-SCF methods.
  - ECP integrals and their first and second derivatives accelerated on GPU.
  - Multigrid algorithm for Coulomb matrix and LDA, GGA, MGGA functionals computation.
  - PBC Gaussian density fitting integrals.
  - ASE interface for molecular systems.
* Improvements
  - Reduce memory footprint in SCF driver.
  - Reduce memory requirements for PCM energy and gradients.
  - Reduce memory requirements for DFT gradients.
  - Exploit the sparsity of cart2sph coefficients in the cart2sph transformation of the scf.jk kernel.
  - Molecular 3c2e integrals generated using the block-divergent algorithm.
  - Support I orbitals in DFT.
* Fixes
  - LRU cached cart2sph under the multiple GPU environment.
  - A maxDynamicSharedMemorySize setting bug in gradient and hessian calculations under the multiple GPU environment.
  - Removed the limit of 6000 GTO shells in the DFT numerical integration module.

v1.3.2 (2025-03-10)
-------------------
* Improvements
  - Dump xc info and grids into the log file
  - Optimize 4-center integral evaluation CUDA kernels using warp divergent algorithm
  - Support up to I orbitals in DFT
  - Fix out-of-bounds issue in DFT hessian for heavy atoms (>=19)
* Deprecation
  - SM60 is not supported in the PyPI package

v1.3.1 (2025-02-04)
-------------------
* New Features
  - Analytical Hessian for PCM solvent model
  - Driver for 3c methods (wB97x-3c, R2Scan-3c, B97-3c, etc.)
* Improvements
  - Preconditioner and computation efficiency of Davidson iterations for TDDFT


v1.3.0 (2025-01-07)
-------------------
* New Features
  - PBC analytical Fourier transform on GPU
* Improvements
  - Optimized computation efficiency and memory footprint for density fitting Hessian
  - Support pickle serialization for most classes (SCF, DF, PCM, etc.)
  - Efficiency of moving CuPy arrays between GPU cards


v1.2.1 (2024-12-20)
-------------------
* New Features
  - Change the license from GPL v3.0 to Apache 2.0
  - Multi-GPU support for SCF, Gradients, and Hessian computation using AO-direct algorithm
  - Add PBC HF and DFT with k-points, UHF/UKS, and density fitting
* Improvements
  - Change the default conv_tol_cpscf = 1e-3 / batch of atoms to conv_tol_cpscf = 1e-6 / atom
  - Fix numerical instability in complex-valued TDHF diagonalization
  - Improve PCM and QMMM with int1e_grids kernel
  - Support non-symmetric int3c2e integral
  - Optimize Hessian calculation with direct SCF
  - Improve the numerical stability of int3c2e for point charge
  - Add CI workflow for multi-GPU
* Fixes
  - Fix non-contiguous array error in p2p transfer between GPUs.
  - Fix bugs in NMR calculations


v1.2.0 (2024-12-09)
-------------------
* New Features
  - Spin-conserved TDA and TDDFT methods
  - Spin-flip TDA method.
  - J-engine using McMurchie-Davidson integral algorithm
  - Support multi-GPU density fitting energy, gradients and Hessian computation.
  - Second order SCF solver
* Improvements
  - Support non-hermitian density matrix in J/K builder
  - Secondary grids for CPHF solver
  - 3-center integral computation efficiency for gradients and hessian
  - One-electron Coulomb integrals against point charges and Gaussian charge distributions on grids.
  - Automatically apply SCF initial guess from existing wavefunction


v1.1.0 (2024-10-29)
-------------------
* New Features
  - Add esp charge and resp charge by @wxj6000 in #208
  - New Rys kernel by @sunqm in #221
  - Optimize nuclear gradients using new Rys kernel by @sunqm in #224
  - GPU kernel for analytical hessian by @sunqm in #227
  - Add QM/MM by @MoleOrbitalHybridAnalyst in #218
* Improvements
  - Improved compatibility with pyscf 2.7.0 by @wxj6000 in #216
  - Add skipping SCF cycles by @kvkarandashev in #229
  - Skip building gint, gvhf, ... when building libxc by @wxj6000 in #210
* Bugfix
  - Typo in build_wheels.sh by @wxj6000 in #209
  - Typo in dft_driver.py by @wxj6000 in #220
  - Bugfix: cusolver error when specifying gpu by @wxj6000 in #213
  - Bugfix: error in int2c2e by @wxj6000 in #212
  - Bugfix: inconsistent gradient with CPU. Improved to_cpu, uks gradient, and grid_response by @wxj6000 in #230
  - Bugfix: recompute int3c2e in DF UHF by @wxj6000 in #226
* New Contributors
  - @MoleOrbitalHybridAnalyst made their first contribution in #218
  - @kvkarandashev made their first contribution in #229


v1.0.2 (2024-09-03)
-------------------
* Bugfix: append data in h5 file by @wxj6000 in #200
* Support customized CHELPG radii by @wxj6000 in #202
* Add cupy installation guide for developer installation instructions by @henryw7 in #204
* Bugfix: save density when spin unrestricted by @wxj6000 in #205
* Add chkfile support for pysisyphus by @henryw7 in #203


v1.0.1 (2024-08-24)
-------------------
* Bugfix in rks.reset by @wxj6000 in #191. The bug caused geometry optimization with direct SCF to fail (#190)
* Bugfix when CUDA unified memory is disabled. Removed CUDA unified memory in libxc, and reduced the overhead in calling libxc by @wxj6000 in #180, #189
* Bugfix and Improvement in opt_driver by @wxj6000 in #187 #197
* Support SMD in opt_driver and dft_driver by @liuyu-chem1996 in #196
* Support thermo calculation in dft_driver by @liuyu-chem1996 in #192


v1.0.0 (2024-07-23)
-------------------
Released features:
* Density fitting scheme and direct SCF scheme
* SCF, analytical gradient, and analytical Hessian calculations for Hartree-Fock and DFT
* Spin-conserved and spin-flip TDA and TDDFT for excited states
* Nonlocal functional correction (vv10) for SCF and gradient
* PCM models, SMD model, their analytical gradients, and semi-analytical Hessian matrix
* Unrestricted Hartree-Fock and unrestricted DFT, gradient, and Hessian
* MP2/DF-MP2 and CCSD (experimental)
* Polarizability, IR, and NMR shielding (experimental)
