.. _file_formats:

Supported File Formats
######################

.. _format_charmm:

CHARMM crd file format (``charmm``)
===================================

CHARMM coordinate files contain information about the location of each atom in
Cartesian space. The format of the ASCII (CARD) CHARMM coordinate files is:
title line(s), the number of atoms in the file and the coordinate lines (one for
each atom in the file). The coordinate lines contain specific information about
each atom and have the following structure: atom number (sequential), residue
number (specified relative to the first residue in the PSF), residue name, atom
type, x-coordinate, y-coordinate, z-coordinate, segment identifier, residue
identifier and a weighting array value.

Filename patterns: ``*.crd``

:py:func:`iodata.formats.charmm.load_one`
-----------------------------------------

- Always loads ``atcoords``, ``atffparams``, ``atmasses``, ``extra``
- May load ``title``

.. _format_chgcar:

VASP 5 CHGCAR file format (``chgcar``)
======================================

This format is used by VASP 5.X and VESTA.

Note that even though the ``CHGCAR`` and ``LOCPOT`` files look very similar, they
require different conversions to atomic units.

Filename patterns: ``CHGCAR*``, ``AECCAR*``

:py:func:`iodata.formats.chgcar.load_one`
-----------------------------------------

- Always loads ``atcoords``, ``atnums``, ``cellvecs``, ``cube``, ``title``

.. _format_cp2klog:

CP2K ATOM output file format (``cp2klog``)
==========================================

Filename patterns: ``*.cp2k.out``

:py:func:`iodata.formats.cp2klog.load_one`
------------------------------------------

- Always loads ``atcoords``, ``atcorenums``, ``atnums``, ``energy``, ``mo``, ``obasis``

This function assumes that the following subsections are present in the CP2K
ATOM input file, in the section ``ATOM%PRINT``:

.. code-block:: text

    &PRINT
      &POTENTIAL
      &END POTENTIAL
      &BASIS_SET
      &END BASIS_SET
      &ORBITALS
      &END ORBITALS
    &END PRINT

.. _format_cube:

Gaussian Cube file format (``cube``)
====================================

Cube files are generated by various QC codes, including Gaussian, CP2K, GPAW
and Q-Chem.

Note that the second column in the geometry specification of the cube file is
interpreted as the effective core charges.

Filename patterns: ``*.cube``, ``*.cub``

:py:func:`iodata.formats.cube.load_one`
---------------------------------------

- Always loads ``atcoords``, ``atcorenums``, ``atnums``, ``cellvecs``, ``cube``

:py:func:`iodata.formats.cube.dump_one`
---------------------------------------

- Requires ``atcoords``, ``atnums``, ``cube``
- May dump ``title``, ``atcorenums``

.. _format_extxyz:

Extended XYZ file format (``extxyz``)
=====================================

The extended XYZ file format is defined in the ASE documentation.

Usually, the different frames in a trajectory describe different geometries of
the same molecule, with atoms in the same order. The ``load_many`` function
below can also handle an XYZ file with different molecules, e.g. a molecular
database.

Filename patterns: ``*.extxyz``

:py:func:`iodata.formats.extxyz.load_one`
-----------------------------------------

- Always loads ``title``
- May load ``atcoords``, ``atgradient``, ``atmasses``, ``atnums``, ``cellvecs``, ``charge``, ``energy``, ``extra``

:py:func:`iodata.formats.extxyz.load_many`
------------------------------------------

- Always loads ``title``
- May load ``atcoords``, ``atgradient``, ``atmasses``, ``atnums``, ``cellvecs``, ``charge``, ``energy``, ``extra``

.. _format_fchk:

Gaussian FCHK file format (``fchk``)
====================================

Filename patterns: ``*.fchk``, ``*.fch``

:py:func:`iodata.formats.fchk.load_one`
---------------------------------------

- Always loads ``atcharges``, ``atcoords``, ``atnums``, ``atcorenums``, ``energy``, ``lot``, ``mo``, ``obasis``, ``obasis_name``, ``run_type``, ``title``
- May load ``atfrozen``, ``atgradient``, ``athessian``, ``atmasses``, ``one_rdms``, ``extra``, ``moments``

:py:func:`iodata.formats.fchk.dump_one`
---------------------------------------

- Requires ``atnums``, ``atcorenums``
- May dump ``atcharges``, ``atcoords``, ``atfrozen``, ``atgradient``, ``athessian``, ``atmasses``, ``charge``, ``energy``, ``lot``, ``mo``, ``one_rdms``, ``obasis_name``, ``extra``, ``moments``

:py:func:`iodata.formats.fchk.load_many`
----------------------------------------

- Always loads ``atcoords``, ``atgradient``, ``atnums``, ``atcorenums``, ``energy``, ``extra``, ``title``

Trajectories from a Gaussian optimization, relaxed scan or IRC calculation are
written in groups of frames, called "points" in the Gaussian world, e.g. to
discriminate between different values of the constraint in a relaxed geometry
scan. In most cases, e.g. IRC or conventional optimization, there is only one
"point". Within one "point", one can have multiple geometries and their
properties. This information is stored in the ``extra`` attribute:

- ``ipoint`` is the counter for a point.
- ``npoint`` is the total number of points.
- ``istep`` is the counter within one "point".
- ``nstep`` is the total number of geometries within a "point".
- ``reaction_coordinate`` is only present in case of an IRC calculation.

.. _format_fcidump:

Molpro 2012 FCIDUMP file format (``fcidump``)
=============================================

Notes
-----

1. This function works only for restricted wave-functions.
2. One- and two-electron integrals are stored in chemists' notation in an
   FCIDUMP file, while IOData internally uses physicists' notation.
3. Keep in mind that the FCIDUMP format changed in MOLPRO 2012, so files
   generated with older versions are not supported.

Filename patterns: ``*FCIDUMP*``

:py:func:`iodata.formats.fcidump.load_one`
------------------------------------------

- Always loads ``core_energy``, ``one_ints``, ``nelec``, ``spinpol``, ``two_ints``

:py:func:`iodata.formats.fcidump.dump_one`
------------------------------------------

- Requires ``one_ints``, ``two_ints``
- May dump ``core_energy``, ``nelec``, ``spinpol``

The dictionary ``one_ints`` must contain a field ``core_mo``. Similarly,
``two_ints`` must contain ``two_mo``.

.. _format_gamess:

GAMESS punch file format (``gamess``)
=====================================

Filename patterns: ``*.dat``

:py:func:`iodata.formats.gamess.load_one`
-----------------------------------------

- Always loads ``title``, ``energy``, ``g_rot``, ``atgradient``, ``athessian``, ``atmasses``, ``atnums``, ``atcoords``

.. _format_gaussianlog:

Gaussian Log file format (``gaussianlog``)
==========================================

To write out the integrals in a Gaussian log file, which can be loaded with this
module, you need to use the following Gaussian command line:

.. code-block ::

    scf(conventional) iop(3/33=5) extralinks=l316 iop(3/27=999)

Filename patterns: ``*.log``

:py:func:`iodata.formats.gaussianlog.load_one`
----------------------------------------------

- Always loads
- May load ``one_ints``, ``two_ints``

.. _format_gromacs:

GROMACS gro file format (``gromacs``)
=====================================

Files with the gro file extension contain a molecular structure in Gromos87
format. GROMACS gro files can be used as a trajectory by simply concatenating
files.

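
Concatenation works because every gro frame is self-contained: a title line, a
line with the number of atoms, one line per atom and a final line with the box
vectors. The following is a minimal sketch, using only the Python standard
library, of how a concatenated file could be cut back into frames by reading
the atom count. The helper ``split_gro_frames`` and the water frame are
hypothetical and only for illustration; they are not part of IOData, whose
``load_many`` function reads such trajectories directly.

.. code-block:: python

    def split_gro_frames(text):
        """Split the text of a concatenated gro file into per-frame strings."""
        lines = text.splitlines()
        frames = []
        start = 0
        while start < len(lines):
            # The first line of a frame is the title, the second the atom count.
            natom = int(lines[start + 1])
            # One frame = title line + count line + natom atom lines + box line.
            stop = start + natom + 3
            frames.append("\n".join(lines[start:stop]) + "\n")
            start = stop
        return frames


    # A hypothetical single-frame gro file with one water molecule.
    frame = (
        "Water, t= 0.0\n"
        "    3\n"
        "    1SOL     OW    1   0.126   1.624   1.679\n"
        "    1SOL    HW1    2   0.190   1.661   1.747\n"
        "    1SOL    HW2    3   0.177   1.568   1.613\n"
        "   1.86206   1.86206   1.86206\n"
    )
    # Concatenating the frame with itself gives a two-frame trajectory.
    frames = split_gro_frames(frame + frame)
    print(len(frames))

The sketch assumes that the file contains no blank lines between or after the
frames.
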
http://manual.gromacs.org/current/reference-manual/file-formats.html#gro

Filename patterns: ``*.gro``

:py:func:`iodata.formats.gromacs.load_one`
------------------------------------------

- Always loads ``atcoords``, ``atffparams``, ``cellvecs``, ``extra``, ``title``

:py:func:`iodata.formats.gromacs.load_many`
-------------------------------------------

- Always loads ``atcoords``, ``atffparams``, ``cellvecs``, ``extra``, ``title``

.. _format_locpot:

VASP 5 LOCPOT file format (``locpot``)
======================================

This format is used by VASP 5.X and VESTA.

Note that even though the ``CHGCAR`` and ``LOCPOT`` files look very similar, they
require different conversions to atomic units.

Filename patterns: ``LOCPOT*``

:py:func:`iodata.formats.locpot.load_one`
-----------------------------------------

- Always loads ``atcoords``, ``atnums``, ``cellvecs``, ``cube``, ``title``

.. _format_mol2:

MOL2 file format (``mol2``)
===========================

There are different formats of mol2 files. The main objective here is
compatibility with the AMBER software: files are written with atomic charges
in the form used by antechamber.

Filename patterns: ``*.mol2``

:py:func:`iodata.formats.mol2.load_one`
---------------------------------------

- Always loads ``atcoords``, ``atnums``, ``atcharges``, ``atffparams``
- May load ``title``

:py:func:`iodata.formats.mol2.dump_one`
---------------------------------------

- Requires ``atcoords``, ``atnums``
- May dump ``atcharges``, ``atffparams``, ``title``

:py:func:`iodata.formats.mol2.load_many`
----------------------------------------

- Always loads ``atcoords``, ``atnums``, ``atcharges``, ``atffparams``
- May load ``title``

:py:func:`iodata.formats.mol2.dump_many`
----------------------------------------

- Requires ``atcoords``, ``atnums``, ``atcharges``
- May dump ``title``

.. _format_molden:

Molden file format (``molden``)
===============================

Many QC codes can write out Molden files, e.g. Molpro, Orca, PSI4, Molden and
Turbomole. Keep in mind that several of these write incorrect versions of the
file format, but these errors are corrected when loading them with IOData.

Filename patterns: ``*.molden.input``, ``*.molden``

:py:func:`iodata.formats.molden.load_one`
-----------------------------------------

- Always loads ``atcoords``, ``atnums``, ``atcorenums``, ``mo``, ``obasis``
- May load ``title``

:py:func:`iodata.formats.molden.dump_one`
-----------------------------------------

- Requires ``atcoords``, ``atnums``, ``mo``, ``obasis``
- May dump ``atcorenums``, ``title``

.. _format_molekel:

Molekel file format (``molekel``)
=================================

This format is used by two programs: Molekel and Orca.

Filename patterns: ``*.mkl``

:py:func:`iodata.formats.molekel.load_one`
------------------------------------------

- Always loads ``atcoords``, ``atnums``, ``mo``, ``obasis``
- May load ``atcharges``

:py:func:`iodata.formats.molekel.dump_one`
------------------------------------------

- Requires ``atcoords``, ``atnums``, ``mo``, ``obasis``
- May dump ``atcharges``

.. _format_orcalog:

Orca output file format (``orcalog``)
=====================================

Filename patterns: ``*.out``

:py:func:`iodata.formats.orcalog.load_one`
------------------------------------------

- Always loads ``atcoords``, ``atnums``, ``energy``, ``moments``, ``extra``

.. _format_pdb:

PDB file format (``pdb``)
=========================

There are different formats of pdb files.
The convention used here is the most recent one, which is described at:

http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html

Filename patterns: ``*.pdb``

:py:func:`iodata.formats.pdb.load_one`
--------------------------------------

- Always loads ``atcoords``, ``atnums``, ``atffparams``, ``extra``
- May load ``title``

:py:func:`iodata.formats.pdb.dump_one`
--------------------------------------

- Requires ``atcoords``, ``atnums``, ``extra``
- May dump ``atffparams``, ``title``

:py:func:`iodata.formats.pdb.load_many`
---------------------------------------

- Always loads ``atcoords``, ``atnums``, ``atffparams``, ``extra``
- May load ``title``

:py:func:`iodata.formats.pdb.dump_many`
---------------------------------------

- Requires ``atcoords``, ``atnums``, ``extra``
- May dump ``atffparams``, ``title``

.. _format_poscar:

VASP 5 POSCAR file format (``poscar``)
======================================

This format is used by VASP 5.X and VESTA.

Filename patterns: ``POSCAR*``

:py:func:`iodata.formats.poscar.load_one`
-----------------------------------------

- Always loads ``atcoords``, ``atnums``, ``cellvecs``, ``title``

:py:func:`iodata.formats.poscar.dump_one`
-----------------------------------------

- Requires ``atcoords``, ``atnums``, ``cellvecs``
- May dump ``title``

.. _format_qchemlog:

Q-Chem Log file format (``qchemlog``)
=====================================

This module loads a Q-Chem log file into IOData.

Filename patterns: ``*.qchemlog``

:py:func:`iodata.formats.qchemlog.load_one`
-------------------------------------------

- Always loads ``atcoords``, ``atmasses``, ``atnums``, ``energy``, ``g_rot``, ``mo``, ``lot``, ``obasis_name``, ``run_type``, ``extra``
- May load ``athessian``

.. _format_sdf:

SDF file format (``sdf``)
=========================

Usually, the different frames in a trajectory describe different geometries of
the same molecule, with atoms in the same order.
The ``load_many`` and ``dump_many`` functions below can also handle an SDF file
with different molecules, e.g. a molecular database.

Filename patterns: ``*.sdf``

:py:func:`iodata.formats.sdf.load_one`
--------------------------------------

- Always loads ``atcoords``, ``atnums``, ``title``

:py:func:`iodata.formats.sdf.dump_one`
--------------------------------------

- Requires ``atcoords``, ``atnums``
- May dump ``title``

:py:func:`iodata.formats.sdf.load_many`
---------------------------------------

- Always loads ``atcoords``, ``atnums``, ``title``

:py:func:`iodata.formats.sdf.dump_many`
---------------------------------------

- Requires ``atcoords``, ``atnums``
- May dump ``title``

.. _format_wfn:

Gaussian/GAMESS-US WFN file format (``wfn``)
============================================

Only use this format if the program that generated it does not offer any
alternatives that HORTON can load. The WFN format has the disadvantage that it
cannot represent contractions and therefore expands all orbitals into a
decontracted basis. This makes the post-processing less efficient compared to
formats that do support contractions of Gaussian functions.

Filename patterns: ``*.wfn``

:py:func:`iodata.formats.wfn.load_one`
--------------------------------------

- Always loads ``atcoords``, ``atnums``, ``energy``, ``mo``, ``obasis``, ``title``, ``extra``

:py:func:`iodata.formats.wfn.dump_one`
--------------------------------------

- Requires ``atcoords``, ``atnums``, ``energy``, ``mo``, ``obasis``, ``title``, ``extra``

.. _format_wfx:

AIM/AIMAll WFX file format (``wfx``)
====================================

See http://aim.tkgristmill.com/wfxformat.html

Filename patterns: ``*.wfx``

:py:func:`iodata.formats.wfx.load_one`
--------------------------------------

- Always loads ``atcoords``, ``atgradient``, ``atnums``, ``energy``, ``extra``, ``mo``, ``obasis``, ``title``

:py:func:`iodata.formats.wfx.dump_one`
--------------------------------------

- Requires ``atcoords``, ``atnums``, ``atcorenums``, ``mo``, ``obasis``, ``charge``
- May dump ``title``, ``energy``, ``spinpol``, ``lot``, ``atgradient``, ``extra``

.. _format_xyz:

XYZ file format (``xyz``)
=========================

Usually, the different frames in a trajectory describe different geometries of
the same molecule, with atoms in the same order. The ``load_many`` and
``dump_many`` functions below can also handle an XYZ file with different
molecules, e.g. a molecular database.

The ``load_*`` and ``dump_*`` functions all accept the optional argument
``atom_columns``. This argument fixes the meaning of the columns to be loaded
from or dumped to an XYZ file. The following example defines, in addition to the
conventional columns, a column with atomic charges and three columns with
atomic forces.

.. code-block :: python

    import iodata.formats.xyz
    from iodata import load_one

    atom_columns = iodata.formats.xyz.DEFAULT_ATOM_COLUMNS + [
        # Atomic charges are stored in a dictionary atcharges and the key
        # refers to the name of the partitioning method.
        ("atcharges", "mulliken", (), float, float, "{:10.5f}".format),
        # Note that in IOData, the energy gradient is stored, which contains the
        # negative forces.
        ("atgradient", None, (3,), float,
         (lambda word: -float(word)),
         (lambda value: "{:15.10f}".format(-value)))
    ]

    mol = load_one("test.xyz", atom_columns=atom_columns)
    # The following attributes are present:
    print(mol.atnums)
    print(mol.atcoords)
    print(mol.atcharges["mulliken"])
    print(mol.atgradient)

When defining ``atom_columns``, no columns can be skipped, so that all
information loaded from a file can also be written back out when dumping it.

Filename patterns: ``*.xyz``

:py:func:`iodata.formats.xyz.load_one`
--------------------------------------

- Always loads ``atcoords``, ``atnums``, ``title``
- Keyword arguments ``atom_columns``

:py:func:`iodata.formats.xyz.dump_one`
--------------------------------------

- Requires ``atcoords``, ``atnums``
- May dump ``title``
- Keyword arguments ``atom_columns``

:py:func:`iodata.formats.xyz.load_many`
---------------------------------------

- Always loads ``atcoords``, ``atnums``, ``title``
- Keyword arguments ``atom_columns``

:py:func:`iodata.formats.xyz.dump_many`
---------------------------------------

- Requires ``atcoords``, ``atnums``
- May dump ``title``
- Keyword arguments ``atom_columns``
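
The sign convention of the ``atgradient`` column in the ``atom_columns`` example
of this section can be checked without IOData. The sketch below copies the two
converters of that column into plain functions; the names ``load_word`` and
``dump_value`` and the numerical value are hypothetical and only serve as an
illustration. It shows that a force written in the file is stored as a gradient
with the opposite sign, and that dumping restores the original force.

.. code-block:: python

    def load_word(word):
        """Convert a force component read from the file into a gradient."""
        return -float(word)


    def dump_value(value):
        """Format a gradient component as the force written to the file."""
        return "{:15.10f}".format(-value)


    # A force component as it could appear in one column of the XYZ file.
    force_word = "0.0123456789"
    # The stored gradient has the opposite sign of the force.
    gradient = load_word(force_word)
    # Dumping flips the sign again, so the file keeps containing forces.
    dumped_word = dump_value(gradient)
    print(gradient)
    print(dumped_word)

Because the load and dump converters are each other's inverse, a file that is
loaded and dumped with the same ``atom_columns`` keeps its force columns
unchanged, up to the number of printed digits.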