6
\$\begingroup\$

I am trying to write a basic Hartee-Fock program that computes the total energies and some properties of a molecule given as an input. The code uses Cython to statically compile the computationally expensive parts, and there is a basic parallelization implemented using mpi4py. However, the code is still very slow for all practical applications.
The code is currently hosted at GitHub, including a sample input for a water molecule.
Any suggestions on how this can be improved, especially execution time, are much appreciated.
Note: I am largely self-taught in Python. So any pointers to where I can learn better coding practices are also appreciated.

driver code

from numpy                      import array, float32, int32, zeros, pi, exp, dot, savetxt, diag, trace
from numpy.linalg               import norm, eigh, solve
from math                       import ceil, sqrt, log
from scipy.special              import factorial2, comb
from sys                        import argv
from itertools                  import combinations_with_replacement, combinations
from mpi4py                     import MPI
from plank.routines.oneelectron import overlapcgtos, kineticcgtos
from plank.routines.twoelectron import nuclearcgtos, ericgtos
from plank.routines.hatreefock  import computecorehamiltonian, canonicalorthogonalization, computehamiltonian
from plank.routines.hatreefock  import restrictedhatreefock, restrictedhatreefockDIIS, totalenergy
from plank.routines.hatreefock  import diisorthogonalization
from yaml                       import safe_load


with open("periodicTable.yaml") as periodic:
    periodictable   =   safe_load(periodic)
    periodicTable   =   {y: x[y] for x in periodictable for y in x}


def getatom(atomname):
    return periodicTable[atomname]


def nuclearenergy(atomobjects):
    atompairs       =   list(combinations(atomobjects,2))
    nuclearenergy   =   0.0
    for atompair in atompairs:
        center1         =   atompair[0].center
        charge1         =   atompair[0].charge
        center2         =   atompair[1].center
        charge2         =   atompair[1].charge
        distance        =   norm(center1-center2)
        energy          =   charge1*charge2/distance
        nuclearenergy   =   nuclearenergy+energy
    return nuclearenergy


def overlap(basisobjects):
    overlaps    =   []
    for basisobjecttuple in basisobjects:
        result  =   overlapcgtos(basisobjecttuple[0], basisobjecttuple[1])
        overlaps.append((result, basisobjecttuple[0].basisindex, basisobjecttuple[1].basisindex))
    return overlaps


def kinetic(basisobjects):
    kineticintegrals =   []
    for basisobjecttuple in basisobjects:
        result  =   kineticcgtos(basisobjecttuple[0], basisobjecttuple[1])
        kineticintegrals.append((result, basisobjecttuple[0].basisindex, basisobjecttuple[1].basisindex))
    return kineticintegrals


def nuclear(basisobjects, atomobjects):
    eniintegrals    =   []
    for basisobjecttuple in basisobjects:
        result      =   nuclearcgtos(basisobjecttuple[0], basisobjecttuple[1], atomobjects)
        eniintegrals.append((result, basisobjecttuple[0].basisindex, basisobjecttuple[1].basisindex))
    return eniintegrals


def electronic(basisobjects):
    eriintegrals    =   []
    for basisobjectquartet in basisobjects:
        result  =   ericgtos(basisobjectquartet[0], basisobjectquartet[1], basisobjectquartet[2], basisobjectquartet[3])
        eriintegrals.append((result, basisobjectquartet[0].basisindex, basisobjectquartet[1].basisindex, basisobjectquartet[2].basisindex, basisobjectquartet[3].basisindex))
    return eriintegrals


def integralEngine(molecule, basisobjects, dimensions, calctype="overlap"):
    results     =   []
    if calctype == "overlap":
        rank            =   comm.Get_rank()
        nprocs          =   comm.Get_size()
        shellpairs      =   list(combinations_with_replacement(basisobjects, 2))
        nshellpairs     =   len(shellpairs)
        nshellpair      =   ceil(nshellpairs/nprocs)
        shellpairi      =   rank*nshellpair
        shellpairf      =   (rank+1)*nshellpair
        overlapresults  =   overlap(shellpairs[shellpairi:shellpairf])

        if rank != 0:
            comm.send(overlapresults, dest=0)
        else:
            for result in overlapresults:
                results.append(result)
            for proc in range(1, nprocs):
                overlapresults  =   comm.recv(source=proc)
                for result in overlapresults:
                    results.append(result)

        if rank == 0:
            molecule.overlapmat =   zeros((dimensions, dimensions))
            for data in results:
                molecule.overlapmat[data[1], data[2]]   =   data[0]
                molecule.overlapmat[data[2], data[1]]   =   data[0]

        molecule.overlapmat =   comm.bcast(molecule.overlapmat, root=0)

    if calctype == "kinetic":
        rank            =   comm.Get_rank()
        nprocs          =   comm.Get_size()
        shellpairs      =   list(combinations_with_replacement(basisobjects, 2))
        nshellpairs     =   len(shellpairs)
        nshellpair      =   ceil(nshellpairs/nprocs)
        shellpairi      =   rank*nshellpair
        shellpairf      =   (rank+1)*nshellpair
        kineticresults  =   kinetic(shellpairs[shellpairi:shellpairf])

        if rank != 0:
            comm.send(kineticresults, dest=0)
        else:
            for result in kineticresults:
                results.append(result)
            for proc in range(1, nprocs):
                kineticresults  =   comm.recv(source=proc)
                for result in kineticresults:
                    results.append(result)

        if rank == 0:
            molecule.kineticmat =   zeros((dimensions, dimensions))
            for data in results:
                molecule.kineticmat[data[1], data[2]]   =   data[0]
                molecule.kineticmat[data[2], data[1]]   =   data[0]

        molecule.kineticmat =   comm.bcast(molecule.kineticmat, root=0)

    if calctype == "eni":
        rank            =   comm.Get_rank()
        nprocs          =   comm.Get_size()
        shellpairs      =   list(combinations_with_replacement(basisobjects, 2))
        atomobjects     =   [x for x in molecule.geometry]
        nshellpairs     =   len(shellpairs)
        nshellpair      =   ceil(nshellpairs/nprocs)
        shellpairi      =   rank*nshellpair
        shellpairf      =   (rank+1)*nshellpair
        eniresults      =   nuclear(shellpairs[shellpairi:shellpairf], atomobjects)

        if rank != 0:
            comm.send(eniresults, dest=0)
        else:
            for result in eniresults:
                results.append(result)
            for proc in range(1, nprocs):
                eniresults  =   comm.recv(source=proc)
                for result in eniresults:
                    results.append(result)

        if rank == 0:
            molecule.enimat =   zeros((dimensions, dimensions))
            for data in results:
                molecule.enimat[data[1], data[2]]   =   data[0]
                molecule.enimat[data[2], data[1]]   =   data[0]

        molecule.enimat =   comm.bcast(molecule.enimat, root=0)

    if calctype == "eri":
        rank            =   comm.Get_rank()
        nprocs          =   comm.Get_size()
        shellquartets  =   []
        for basisobjectA in basisObjects:
            for basisobjectB in basisObjects:
                for basisobjectC in basisObjects:
                    for basisobjectD in basisObjects:
                        index1  =   basisobjectA.basisindex
                        index2  =   basisobjectB.basisindex
                        index3  =   basisobjectC.basisindex
                        index4  =   basisobjectD.basisindex
                        index12 =   (index1*(index1+1)//2)+index2
                        index34 =   (index3*(index3+1)//2)+index4
                        if index12 >= index34:
                            shellquartets.append((basisobjectA, basisobjectB, basisobjectC, basisobjectD))
        nshellquartets  =   len(shellquartets)
        nshellquartet   =   ceil(nshellquartets/nprocs)
        shellpairi      =   rank*nshellquartet
        shellpairf      =   (rank+1)*nshellquartet
        eriresults      =   electronic(shellquartets[shellpairi:shellpairf])

        if rank != 0:
            comm.send(eriresults, dest=0)
        else:
            for result in eriresults:
                results.append(result)
            for proc in range(1, nprocs):
                eriresults  =   comm.recv(source=proc)
                for result in eriresults:
                    results.append(result)

        if rank == 0:
            molecule.erimat =   zeros((dimensions, dimensions, dimensions, dimensions))
            for result in results:
                molecule.erimat[result[1], result[2], result[3], result[4]]  =   result[0]
                molecule.erimat[result[3], result[4], result[1], result[2]]  =   result[0]
                molecule.erimat[result[2], result[1], result[4], result[3]]  =   result[0]
                molecule.erimat[result[4], result[3], result[2], result[1]]  =   result[0]
                molecule.erimat[result[2], result[1], result[3], result[4]]  =   result[0]
                molecule.erimat[result[4], result[3], result[1], result[2]]  =   result[0]
                molecule.erimat[result[1], result[2], result[4], result[3]]  =   result[0]
                molecule.erimat[result[3], result[4], result[2], result[1]]  =   result[0]

        molecule.erimat =   comm.bcast(molecule.erimat, root=0)


def scfEngine(molecule, dimensions):
    density     =   zeros((dimensions, dimensions))
    orthomat    =   canonicalorthogonalization(molecule.overlapmat)
    diisortho   =   diisorthogonalization(molecule.overlapmat)
    core        =   computecorehamiltonian(molecule.kineticmat, molecule.enimat) 
    for iteration in range(1, molecule.maxiterations):
        if iteration <= 2 or molecule.DIIS == False:
            orthofock, fock, hamiltonian, orbitals, coeffs, density, rmsdensity, maxdensity =   restrictedhatreefock(molecule.nelectrons, core, diisortho, density, molecule.erimat)
            molenergy   =   totalenergy(molecule.nuclearenergy, fock, density, core) 
            print("{:10.0f}{:20.5f}{:20.5f}{:20.5f}".format(iteration, molenergy, log(rmsdensity), log(maxdensity)))
        if iteration > 2 and molecule.DIIS == True:
            orthofock, fock, hamiltonian, orbitals, coeffs, density, rmsdensity, maxdensity =   restrictedhatreefockDIIS(molecule, molecule.nelectrons, core, diisortho, density, molecule.erimat)
            molenergy   =   totalenergy(molecule.nuclearenergy, fock, density, core) 
            print("{:10.0f}{:20.5f}{:20.5f}{:20.5f}".format(iteration, molenergy, log(rmsdensity), log(maxdensity)))            
        if log(rmsdensity) <= -20.0 and log(maxdensity) <= -20.0:
            molecule.scfstatus  =   True
            molecule.density    =   density
            molecule.fock       =   fock
            molecule.orbitals   =   orbitals
            molecule.coeffs     =   coeffs
            print("SCF CONVERGENCE ACHIEVED IN {:10.0f} ITERATIONS".format(iteration))
            break



class Basis(object):
    """ docstring for Basis

    Basis class constructs one basis object of type (s,p,d or f) depending on the basis set
    passed to the code. The generation of basis object happens during the construction of the
    atom object in the class Atom, with the list of basis objects tied to the atom object.
    """
    def __init__(self, shell, exponents, coefficients, center):
        super(Basis, self).__init__()
        self.shell          =   array(shell)
        self.exponents      =   array(exponents)
        self.coefficients   =   array(coefficients)
        self.normcoeffs     =   zeros(self.coefficients.size)
        self.center         =   array(center)
        self.basisindex     =   0

        self.normalizepGTO()

    def normalizepGTO(self):
        ll, mm, nn          =   self.shell
        totalangmomentum    =   sum(self.shell)
        prefactorpGTO       =   pow(2, 2*totalangmomentum)*pow(2, 1.5)/factorial2(2*ll-1)/factorial2(2*mm-1)/factorial2(2*nn-1)/pow(pi, 1.5)

        for index, exponent in enumerate(self.exponents):
            self.normcoeffs[index]  =   sqrt(pow(exponent, totalangmomentum)*pow(exponent, 1.5)*prefactorpGTO)

        prefactorcGTO   =   pow(pi, 1.5)*factorial2(2*ll-1)*factorial2(2*mm-1)*factorial2(2*nn-1)/pow(2.0, totalangmomentum)
        normalfactor    =   0.0

        for index1, coefficient1 in enumerate(self.coefficients):
            for index2, coefficient2 in enumerate(self.coefficients):
                t_normalfactor  =   (self.normcoeffs[index1]*self.normcoeffs[index2]*self.coefficients[index1]*self.coefficients[index2])/(pow(self.exponents[index1]+self.exponents[index2], totalangmomentum+1.5))
                normalfactor    =   normalfactor + t_normalfactor

        normalfactor    =   prefactorcGTO*normalfactor
        normalfactor    =   pow(normalfactor, -0.5)

        for index, coefficient in enumerate(self.coefficients):
            self.coefficients[index]  =   self.coefficients[index]*normalfactor


class Atom(object):
    """docstring for Atom."""

    def __init__(self, atomname, center, charge, basisset='sto-3g'):
        super(Atom, self).__init__()
        self.atomname   =   atomname
        self.center     =   array(center)*1.8897259886
        self.basisset   =   basisset
        self.shells     =   []
        self.charge     =   charge
        self.readBasis()

    def readBasis(self):
        self.basisfile  =   "./basis/"+self.atomname+"."+self.basisset+".gbs"
        with open(self.basisfile) as target:
            basisdata   =   target.readlines()
            headerline  =   basisdata[0].split()[0]
            _atomname   =   headerline
            assert (_atomname == self.atomname), "Wrong basis set"
            for lnumber, line in enumerate(basisdata[1:]):
                if "S" in line and "P" not in line:
                    nprims          =   int(line.split()[1])
                    pgtodata        =   [x.replace('D', 'E').split() for x in basisdata[lnumber+2:lnumber+2+nprims]]
                    exponents       =   [float(x[0]) for x in pgtodata]
                    coefficients    =   [float(x[1]) for x in pgtodata]
                    shell00         =   [0, 0, 0]
                    self.shells.append(Basis(shell00, exponents, coefficients, self.center))

                if "P" in line:
                    nprims          =   int(line.split()[1])
                    pgtodata        =   [x.replace('D', 'E').split() for x in basisdata[lnumber+2:lnumber+2+nprims]]
                    coefficient1    =   [float(x[1]) for x in pgtodata]
                    coefficient2    =   [float(x[2]) for x in pgtodata]
                    exponents       =   [float(x[0]) for x in pgtodata]
                    shell00         =   [0, 0, 0]
                    self.shells.append(Basis(shell00, exponents, coefficient1, self.center))
                    shell11         =   [1, 0, 0]
                    self.shells.append(Basis(shell11, exponents, coefficient2, self.center))
                    shell12         =   [0, 1, 0]
                    self.shells.append(Basis(shell12, exponents, coefficient2, self.center))
                    shell13         =   [0, 0, 1]
                    self.shells.append(Basis(shell13, exponents, coefficient2, self.center))

                if "D" in line and "+" not in line:
                    nprims          =   int(line.split()[1])
                    pgtodata        =   [x.replace('D', 'E').split() for x in basisdata[lnumber+2:lnumber+2+nprims]]
                    exponents       =   [float(x[0]) for x in pgtodata]
                    coefficients    =   [float(x[1]) for x in pgtodata]
                    shell20         =   [2, 0, 0]
                    self.shells.append(Basis(shell20, exponents, coefficients, self.center))
                    shell21         =   [1, 1, 0]
                    self.shells.append(Basis(shell21, exponents, coefficients, self.center))
                    shell22         =   [1, 0, 1]
                    self.shells.append(Basis(shell22, exponents, coefficients, self.center))
                    shell23         =   [0, 2, 0]
                    self.shells.append(Basis(shell23, exponents, coefficients, self.center))
                    shell24         =   [0, 1, 1]
                    self.shells.append(Basis(shell24, exponents, coefficients, self.center))
                    shell25         =   [0, 0, 2]
                    self.shells.append(Basis(shell25, exponents, coefficients, self.center))


class Geometry(object):
    """docstring for Geometry."""

    def __init__(self, inputfile):
        super(Geometry, self).__init__()
        self.inputfile      =   inputfile
        self.charge         =   0
        self.natoms         =   0
        self.nelectrons     =   0
        self.calctype       =   'energy'
        self.basisset       =   'sto-3g'
        self.geometry       =   []
        self.atomobjects    =   []
        self.density        =   []
        self.scfstatus      =   False
        self.maxiterations  =   100
        self.readgeometry()
        self.overlapmat     =   []
        self.kineticmat     =   []
        self.enimat         =   []
        self.erimat         =   []
        self.nuclearenergy  =   0.0
        self.totalenergy    =   0.0
        self.diisfock       =   []
        self.diiserror      =   []
        self.DIIS

    def readgeometry(self):
        with open(self.inputfile) as file:
            inputdata       =   file.readlines()
            calcbasis       =   inputdata[0].split()
            self.calctype   =   calcbasis[0]
            self.basisset   =   calcbasis[1]
            chargemul       =   inputdata[1].split()
            self.charge     =   int(chargemul[0])
            multiplicity    =   int(chargemul[1])
            try:
                assert multiplicity == 1, "[ERROR] : Only closed-shell singlets are currently supported."
            except AssertionError as Message:
                exit(Message)
            natoms          =   int(inputdata[2].split()[0])
            coorddata       =   [x for x in inputdata[3:3+natoms]]
            for atom in coorddata:
                atomdata        =   atom.split()
                atomname        =   atomdata[0]
                charge          =   getatom(atomname)
                coords          =   [float(x) for x in atomdata[1:]]
                self.nelectrons +=  charge
                self.geometry.append(Atom(atomname, coords, charge, self.basisset))
            iterdiisline     =   inputdata[-1].split()
            if iterdiisline  != []:
                self.maxiterations  =   int(iterdiisline[0])   
                self.DIIS           =   True if (iterdiisline[1] == 'T') else False


inputfilename       =   argv[1]
logfilename         =   inputfilename[:-4]+".log"
molecule            =   Geometry(inputfilename)
comm                =   MPI.COMM_WORLD
atomObjects         =   [x for x in molecule.geometry]
basisObjects        =   [y for x in atomObjects for y in x.shells]
nbasisObjects       =   len(basisObjects)
for counter, x in enumerate(basisObjects):
    x.basisindex    =   counter

integralEngine(molecule, basisObjects, nbasisObjects, "overlap")
integralEngine(molecule, basisObjects, nbasisObjects, "kinetic")
integralEngine(molecule, basisObjects, nbasisObjects, "eni")
integralEngine(molecule, basisObjects, nbasisObjects, "eri")

rank    =   comm.Get_rank()
if rank == 0:
    with open("overlap."+inputfilename[:-4]+".txt", "w") as t:
        for row in molecule.overlapmat:
            for column in row:
                t.write("{:10.3f}".format(column))
            t.write("\n")

    with open("kinetic."+inputfilename[:-4]+".txt", "w") as t:
        for row in molecule.kineticmat:
            for column in row:
                t.write("{:10.3f}".format(column))
            t.write("\n")

    with open("nuclear."+inputfilename[:-4]+".txt", "w") as t:
        for row in molecule.enimat:
            for column in row:
                t.write("{:10.3f}".format(column))
            t.write("\n")

    with open("electronic."+inputfilename[:-4]+".txt", "w") as t:
        for axis1 in molecule.erimat:
            for axis2 in axis1:
                for axis3 in axis2:
                    for axis4 in axis3:
                        t.write("{:10.3f}".format(axis4))
                    t.write("\n")
                t.write("\n")
            t.write("\n")

    molecule.nuclearenergy  =   nuclearenergy(atomObjects)
    scfEngine(molecule, nbasisObjects)

    with open("density."+inputfilename[:-4]+".txt", "w") as d:
        savetxt(d, molecule.density, "%10.3f")

MPI.Finalize()

cython sections

oneelectron.pyx

import cython   as  cython
import numpy    as  np
cimport numpy   as  np

from scipy.special  import  comb, factorial2
from libc.math      cimport exp,  pow
from numpy          import  dot,  pi

@cython.boundscheck(False)
@cython.wraparound(False)
def gaussiantheorem(np.ndarray[np.float64_t] center1, double exponent1, np.ndarray[np.float64_t] center2, double exponent2):
    cdef np.ndarray gaussiancenter  =   np.zeros(3)
    cdef double gaussianexponent    =   0.0
    cdef double gaussianintegral    =   0.0
    gaussiancenter      =   ((exponent1*center1)+(exponent2*center2))/(exponent1+exponent2)
    gaussianexponent    =   (exponent1*exponent2)/(exponent1+exponent2)
    gaussianintegral    =   exp(-1*gaussianexponent*dot(center1-center2, center1-center2))
    return gaussiancenter, gaussianintegral

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double overlappgtos(double center1, double exponent1, int shell1, double center2, double exponent2, int shell2, double gaussiancenter):
    cdef double t_overlap   =   0.0
    cdef double auxiliary   =   0.0
    cdef int counter1, counter2
    for counter1 in range(0, shell1+1):
        for counter2 in range(0, shell2+1):
            if (counter1+counter2)%2 == 0:
                auxiliary   =   comb(shell1, counter1)
                auxiliary   =   auxiliary*comb(shell2, counter2)
                auxiliary   =   auxiliary*factorial2(counter1+counter2-1)
                auxiliary   =   auxiliary*pow(gaussiancenter-center1, shell1-counter1)
                auxiliary   =   auxiliary*pow(gaussiancenter-center2, shell2-counter2)
                auxiliary   =   auxiliary/pow(2*(exponent1+exponent2), 0.5*(counter1+counter2))
                t_overlap   =   t_overlap+auxiliary
    return t_overlap

@cython.boundscheck(False)
@cython.wraparound(False)
def overlapcgtos(basisobject1, basisobject2):
    cdef np.ndarray center1         =   basisobject1.center
    cdef np.ndarray exponents1      =   basisobject1.exponents
    cdef np.ndarray coefficients1   =   basisobject1.coefficients
    cdef np.ndarray shell1          =   basisobject1.shell
    cdef np.ndarray normcoeffs1     =   basisobject1.normcoeffs

    cdef np.ndarray center2         =   basisobject2.center
    cdef np.ndarray exponents2      =   basisobject2.exponents
    cdef np.ndarray coefficients2   =   basisobject2.coefficients
    cdef np.ndarray shell2          =   basisobject2.shell
    cdef np.ndarray normcoeffs2     =   basisobject2.normcoeffs

    cdef double overlaptotal        =   0.0
    cdef index1                     =   0
    cdef index2                     =   0
    cdef exponent1                  =   0
    cdef exponent2                  =   0
    cdef np.ndarray gaussiancenter  =   np.zeros(3)
    cdef double gaussianintegral    =   0.0
    for index1, exponent1 in enumerate(exponents1):
        for index2, exponent2 in enumerate(exponents2):
            gaussiancenter, gaussianintegral    =   gaussiantheorem(center1, exponent1, center2, exponent2)
            overlapx        =   overlappgtos(center1[0], exponent1, shell1[0], center2[0], exponent2, shell2[0], gaussiancenter[0])
            overlapy        =   overlappgtos(center1[1], exponent1, shell1[1], center2[1], exponent2, shell2[1], gaussiancenter[1])
            overlapz        =   overlappgtos(center1[2], exponent1, shell1[2], center2[2], exponent2, shell2[2], gaussiancenter[2])
            overlap         =   overlapx*overlapy*overlapz*gaussianintegral*pow(pi/(exponent1+exponent2), 1.5)
            overlaptotal    =   overlaptotal+(normcoeffs1[index1]*normcoeffs2[index2]*coefficients1[index1]*coefficients2[index2])*overlap
    return overlaptotal

@cython.boundscheck(False)
@cython.wraparound(False)
def kineticcgtos(basisobject1, basisobject2):
    cdef np.ndarray center1         =   basisobject1.center
    cdef np.ndarray exponents1      =   basisobject1.exponents
    cdef np.ndarray coefficients1   =   basisobject1.coefficients
    cdef np.ndarray shell1          =   basisobject1.shell
    cdef np.ndarray normcoeffs1     =   basisobject1.normcoeffs

    cdef np.ndarray center2         =   basisobject2.center
    cdef np.ndarray exponents2      =   basisobject2.exponents
    cdef np.ndarray coefficients2   =   basisobject2.coefficients
    cdef np.ndarray shell2          =   basisobject2.shell
    cdef np.ndarray normcoeffs2     =   basisobject2.normcoeffs

    cdef double Kx                  =   0.0
    cdef double Ky                  =   0.0
    cdef double Kz                  =   0.0
    cdef np.ndarray gaussiancenter  =   np.zeros(3)
    cdef double gaussianintegral    =   0.0
    for index1, exponent1 in enumerate(exponents1):
        for index2, exponent2 in enumerate(exponents2):
            gaussiancenter, gaussianintegral    =   gaussiantheorem(center1, exponent1, center2, exponent2)
            overlapx    =   overlappgtos(center1[0], exponent1, shell1[0], center2[0], exponent2, shell2[0], gaussiancenter[0])
            overlapy    =   overlappgtos(center1[1], exponent1, shell1[1], center2[1], exponent2, shell2[1], gaussiancenter[1])
            overlapz    =   overlappgtos(center1[2], exponent1, shell1[2], center2[2], exponent2, shell2[2], gaussiancenter[2])

            overlapx11  =   overlappgtos(center1[0], exponent1, shell1[0]-1, center2[0], exponent2, shell2[0]-1, gaussiancenter[0])
            overlapx12  =   overlappgtos(center1[0], exponent1, shell1[0]+1, center2[0], exponent2, shell2[0]-1, gaussiancenter[0])
            overlapx13  =   overlappgtos(center1[0], exponent1, shell1[0]-1, center2[0], exponent2, shell2[0]+1, gaussiancenter[0])
            overlapx14  =   overlappgtos(center1[0], exponent1, shell1[0]+1, center2[0], exponent2, shell2[0]+1, gaussiancenter[0])

            overlapy11  =   overlappgtos(center1[1], exponent1, shell1[1]-1, center2[1], exponent2, shell2[1]-1, gaussiancenter[1])
            overlapy12  =   overlappgtos(center1[1], exponent1, shell1[1]+1, center2[1], exponent2, shell2[1]-1, gaussiancenter[1])
            overlapy13  =   overlappgtos(center1[1], exponent1, shell1[1]-1, center2[1], exponent2, shell2[1]+1, gaussiancenter[1])
            overlapy14  =   overlappgtos(center1[1], exponent1, shell1[1]+1, center2[1], exponent2, shell2[1]+1, gaussiancenter[1])

            overlapz11  =   overlappgtos(center1[2], exponent1, shell1[2]-1, center2[2], exponent2, shell2[2]-1, gaussiancenter[2])
            overlapz12  =   overlappgtos(center1[2], exponent1, shell1[2]+1, center2[2], exponent2, shell2[2]-1, gaussiancenter[2])
            overlapz13  =   overlappgtos(center1[2], exponent1, shell1[2]-1, center2[2], exponent2, shell2[2]+1, gaussiancenter[2])
            overlapz14  =   overlappgtos(center1[2], exponent1, shell1[2]+1, center2[2], exponent2, shell2[2]+1, gaussiancenter[2])

            kx          =   shell1[0]*shell2[0]*overlapx11
            kx          +=  -2*exponent1*shell2[0]*overlapx12
            kx          +=  -2*exponent2*shell1[0]*overlapx13
            kx          +=  4*exponent1*exponent2*overlapx14

            ky          =   shell1[1]*shell2[1]*overlapy11
            ky          +=  -2*exponent1*shell2[1]*overlapy12
            ky          +=  -2*exponent2*shell1[1]*overlapy13
            ky          +=  4*exponent1*exponent2*overlapy14

            kz          =   shell1[2]*shell2[2]*overlapz11
            kz          +=  -2*exponent1*shell2[2]*overlapz12
            kz          +=  -2*exponent2*shell1[2]*overlapz13
            kz          +=  4*exponent1*exponent2*overlapz14

            Kx          +=  0.5*kx*overlapy*overlapz*gaussianintegral*pow(pi/(exponent1+exponent2), 1.5)*normcoeffs1[index1]*coefficients1[index1]*normcoeffs2[index2]*coefficients2[index2]
            Ky          +=  0.5*ky*overlapx*overlapz*gaussianintegral*pow(pi/(exponent1+exponent2), 1.5)*normcoeffs1[index1]*coefficients1[index1]*normcoeffs2[index2]*coefficients2[index2]
            Kz          +=  0.5*kz*overlapx*overlapy*gaussianintegral*pow(pi/(exponent1+exponent2), 1.5)*normcoeffs1[index1]*coefficients1[index1]*normcoeffs2[index2]*coefficients2[index2]
    return Kx+Ky+Kz

twoelectron.pyx

import cython   as  cython
import numpy    as  np
cimport numpy   as  np

from scipy.special  import  comb, factorial2, factorial, hyp1f1
from libc.math      cimport exp,  pow, sqrt
from numpy          import  dot,  pi
from numpy.linalg   import norm

@cython.boundscheck(False)
@cython.wraparound(False)
def gaussiantheorem(np.ndarray[np.float64_t] center1, double exponent1, np.ndarray[np.float64_t] center2, double exponent2):
    cdef np.ndarray gaussiancenter  =   np.zeros(3)
    cdef double gaussianexponent    =   0.0
    cdef double gaussianintegral    =   0.0
    gaussiancenter      =   ((exponent1*center1)+(exponent2*center2))/(exponent1+exponent2)
    gaussianexponent    =   (exponent1*exponent2)/(exponent1+exponent2)
    gaussianintegral    =   exp(-1*gaussianexponent*dot(center1-center2, center1-center2))
    return gaussiancenter, gaussianintegral

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double expansioncoeff1(int expansionindex, int shell1, float center1, int shell2, float center2, float gaussiancenter):
    cdef double t_expansioncoeff    =   0.0
    cdef int counter                =   0
    cdef double auxiliary           =   0.0
    for counter in range(max(0, expansionindex-shell2), min(expansionindex, shell1)+1):
        auxiliary           =   comb(shell1, counter)
        auxiliary           =   auxiliary*comb(shell2, expansionindex-counter)
        auxiliary           =   auxiliary*pow(gaussiancenter-center1, shell1-counter)
        auxiliary           =   auxiliary*pow(gaussiancenter-center2, shell2+counter-expansionindex)
        t_expansioncoeff    =   t_expansioncoeff+auxiliary
    return t_expansioncoeff

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double expansioncoeff2(int expansionindex1, int expansionindex2, int expansionindex3, int shell1, float center1, int shell2, float center2, float atomcenter, float gaussiancenter, float gamma):
    cdef double t_expansioncoeff    =   0.0
    cdef double epsilon             =   1.0/(4*gamma)
    t_expansioncoeff                =   expansioncoeff1(expansionindex1, shell1, center1, shell2, center2, gaussiancenter)
    t_expansioncoeff                =   t_expansioncoeff*factorial(expansionindex1)
    t_expansioncoeff                =   t_expansioncoeff*pow(-1, expansionindex1+expansionindex3)
    t_expansioncoeff                =   t_expansioncoeff*pow(gaussiancenter-atomcenter, expansionindex1-(2*expansionindex2)-(2*expansionindex3))
    t_expansioncoeff                =   t_expansioncoeff*pow(epsilon, expansionindex2+expansionindex3)
    t_expansioncoeff                =   t_expansioncoeff/factorial(expansionindex3, exact=True)
    t_expansioncoeff                =   t_expansioncoeff/factorial(expansionindex2, exact=True)
    t_expansioncoeff                =   t_expansioncoeff/factorial(expansionindex1-(2*expansionindex2)-(2*expansionindex3), exact=True)
    return t_expansioncoeff

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double boysfunction(int boysindex, double boysparameter):
    return hyp1f1(0.5+boysindex, 1.5+boysindex, -1*boysparameter)/((2*boysindex)+1)

@cython.boundscheck(False)
@cython.wraparound(False)
def nuclearcgtos(basisobject1, basisobject2, atomobjects):
    cdef double t_eniintegral   =   0.0
    cdef np.ndarray center1         =   basisobject1.center
    cdef np.ndarray exponents1      =   basisobject1.exponents
    cdef np.ndarray coefficients1   =   basisobject1.coefficients
    cdef np.ndarray shell1          =   basisobject1.shell
    cdef np.ndarray normcoeffs1     =   basisobject1.normcoeffs

    cdef np.ndarray center2         =   basisobject2.center
    cdef np.ndarray exponents2      =   basisobject2.exponents
    cdef np.ndarray coefficients2   =   basisobject2.coefficients
    cdef np.ndarray shell2          =   basisobject2.shell
    cdef np.ndarray normcoeffs2     =   basisobject2.normcoeffs
    for index1, exponent1 in enumerate(exponents1):
        for index2, exponent2 in enumerate(exponents2):
            t_nuclearintegral   =   0.0
            gamma               =   exponent1+exponent2
            gaussiancenter, gaussianintegral    =   gaussiantheorem(center1, exponent1, center2, exponent2)
            for atom in atomobjects:
                for expansionindex1A in range(0, shell1[0]+shell2[0]+1):
                    for expansionindex1B in range(0, int(expansionindex1A//2)+1):
                        for expansionindex1C in range(0, int((expansionindex1A-(2*expansionindex1B))//2)+1):
                            auxiliary1A =   expansioncoeff2(expansionindex1A, expansionindex1B, expansionindex1C, shell1[0], center1[0], shell2[0], center2[0], atom.center[0], gaussiancenter[0], gamma)
                            
                            for expansionindex2A in range(0, shell1[1]+shell2[1]+1):
                                for expansionindex2B in range(0, int(expansionindex2A//2)+1):
                                    for expansionindex2C in range(0, int((expansionindex2A-(2*expansionindex2B))//2)+1):
                                        auxiliary2A =   expansioncoeff2(expansionindex2A, expansionindex2B, expansionindex2C, shell1[1], center1[1], shell2[1], center2[1], atom.center[1], gaussiancenter[1], gamma)

                                        for expansionindex3A in range(0, shell1[2]+shell2[2]+1):
                                            for expansionindex3B in range(0, int(expansionindex3A//2)+1):
                                                for expansionindex3C in range(0, int((expansionindex3A-(2*expansionindex3B))//2)+1):
                                                    auxiliary3A =   expansioncoeff2(expansionindex3A, expansionindex3B, expansionindex3C, shell1[2], center1[2], shell2[2], center2[2], atom.center[2], gaussiancenter[2], gamma)
                                                    boysindex   =   (expansionindex1A+expansionindex2A+expansionindex3A)-2*(expansionindex1B+expansionindex2B+expansionindex3B)-(expansionindex1C+expansionindex2C+expansionindex3C)
                                                    boysparam   =   dot(atom.center-gaussiancenter, atom.center-gaussiancenter)*gamma
                                                    auxiliary4A =   boysfunction(boysindex, boysparam)
                                                    t_nuclearintegral   =   t_nuclearintegral+(auxiliary1A*auxiliary2A*auxiliary3A*auxiliary4A)*atom.charge*-1.0
            t_eniintegral   =   t_eniintegral+(t_nuclearintegral*coefficients1[index1]*coefficients2[index2]*normcoeffs1[index1]*normcoeffs2[index2]*gaussianintegral*(2*pi/gamma))
    return t_eniintegral

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double enigaussian(int shell1, int shell2, int nodes, float center1, float center2, float exponent1, float exponent2):
    cdef double gamma       =   exponent1+exponent2
    cdef double exponent    =   exponent1*(exponent2/gamma)

    if (nodes < 0) or (nodes > (shell1+shell2)):
        return 0
    elif nodes == shell1 == shell2 == 0:
        return exp(-1*exponent*pow(center1-center2, 2))
    elif shell2 == 0:
        return (1/(2*gamma))*enigaussian(shell1-1, shell2, nodes-1, center1, center2, exponent1, exponent2) - (exponent*(center1-center2)/exponent1)*enigaussian(shell1-1, shell2, nodes, center1, center2, exponent1, exponent2) + (nodes+1)*enigaussian(shell1-1, shell2, nodes+1, center1, center2, exponent1, exponent2)
    else:
        return (1/(2*gamma))*enigaussian(shell1, shell2-1, nodes-1, center1, center2, exponent1, exponent2) + (exponent*(center1-center2)/exponent2)*enigaussian(shell1, shell2-1, nodes, center1, center2, exponent1, exponent2) + (nodes+1)*enigaussian(shell1, shell2-1, nodes+1, center1, center2, exponent1, exponent2)

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double auxiliaryhermiteeri(int nodes1, int nodes2, int nodes3, int boysorder, float exponent, float xdist, float ydist, float zdist, float dist):
    cdef double boysparam   =   exponent*pow(dist, 2)
    cdef double auxhermite  =   0.0

    if nodes1 == nodes2 == nodes3 == 0:
        auxhermite  +=  pow(-2*exponent, boysorder)*boysfunction(boysorder, boysparam)
    elif nodes1 == nodes2 == 0:
        if nodes3 > 1:
            auxhermite  +=  (nodes3-1)*auxiliaryhermiteeri(nodes1, nodes2, nodes3-2, boysorder+1, exponent, xdist, ydist, zdist, dist)
        auxhermite  +=  zdist*auxiliaryhermiteeri(nodes1, nodes2, nodes3-1, boysorder+1, exponent, xdist, ydist, zdist, dist)
    elif nodes1 == 0:
        if nodes2 > 1:
            auxhermite  +=  (nodes2-1)*auxiliaryhermiteeri(nodes1, nodes2-2, nodes3, boysorder+1, exponent, xdist, ydist, zdist, dist)
        auxhermite  +=  ydist*auxiliaryhermiteeri(nodes1, nodes2-1, nodes3, boysorder+1, exponent, xdist, ydist, zdist, dist)
    else:
        if nodes1 > 1:
            auxhermite  +=  (nodes1-1)*auxiliaryhermiteeri(nodes1-2, nodes2, nodes3, boysorder+1, exponent, xdist, ydist, zdist, dist)
        auxhermite  +=  xdist*auxiliaryhermiteeri(nodes1-1, nodes2, nodes3, boysorder+1, exponent, xdist, ydist, zdist, dist)
    return auxhermite

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double eripgtos(exponent1, shell1, center1, exponent2, shell2, center2, exponent3, shell3, center3, exponent4, shell4, center4):
    cdef double     eripgto             =   0.0
    cdef np.ndarray gaussiancenterA     =   np.zeros(3)
    cdef double     gaussianintegralA   =   0.0
    cdef np.ndarray gaussiancenterB     =   np.zeros(3)
    cdef double     gaussianintegralB   =   0.0
    gaussiancenterA, gaussianintegralA  =   gaussiantheorem(center1, exponent1, center2, exponent2)
    gaussiancenterB, gaussianintegralB  =   gaussiantheorem(center3, exponent3, center4, exponent4)
    cdef double xdist                   =   gaussiancenterA[0]-gaussiancenterB[0]
    cdef double ydist                   =   gaussiancenterA[1]-gaussiancenterB[1]
    cdef double zdist                   =   gaussiancenterA[2]-gaussiancenterB[2]
    cdef double distance                =   norm(gaussiancenterA-gaussiancenterB)
    cdef double combexponentA           =   exponent1+exponent2
    cdef double combexponentB           =   exponent3+exponent4
    cdef double combexponent            =   combexponentA*(combexponentB/(combexponentA+combexponentB))

    cdef int index1                     =   0
    cdef int index2                     =   0
    cdef int index3                     =   0
    cdef int index4                     =   0
    cdef int index5                     =   0
    cdef int index6                     =   0
    for index1 in range(0, shell1[0]+shell2[0]+1):
        for index2 in range(0, shell1[1]+shell2[1]+1):
            for index3 in range(0, shell1[2]+shell2[2]+1):
                for index4 in range(0, shell3[0]+shell4[0]+1):
                    for index5 in range(0, shell3[1]+shell4[1]+1):
                        for index6 in range(0, shell3[2]+shell4[2]+1):
                            erigaussianA    =   enigaussian(shell1[0], shell2[0], index1, center1[0], center2[0], exponent1, exponent2)
                            erigaussianB    =   enigaussian(shell1[1], shell2[1], index2, center1[1], center2[1], exponent1, exponent2)
                            erigaussianC    =   enigaussian(shell1[2], shell2[2], index3, center1[2], center2[2], exponent1, exponent2)
                            erigaussianD    =   enigaussian(shell3[0], shell4[0], index4, center3[0], center4[0], exponent3, exponent4)
                            erigaussianE    =   enigaussian(shell3[1], shell4[1], index5, center3[1], center4[1], exponent3, exponent4)
                            erigaussianF    =   enigaussian(shell3[2], shell4[2], index6, center3[2], center4[2], exponent3, exponent4)
                            erigaussianG    =   auxiliaryhermiteeri(index1+index4, index2+index5, index3+index6, 0, combexponent, xdist, ydist, zdist, distance)
                            result          =   erigaussianA*erigaussianB*erigaussianC*erigaussianD*erigaussianE*erigaussianF*erigaussianG*pow(-1, index4+index5+index6)
                            eripgto         =   eripgto+result
    eripgto *=  2*pow(pi, 2.5)/(combexponentA*combexponentB*sqrt(combexponentA+combexponentB))
    return eripgto

@cython.boundscheck(False)
@cython.wraparound(False)
def ericgtos(basisobject1, basisobject2, basisobject3, basisobject4):
    cdef np.ndarray center1         =   basisobject1.center
    cdef np.ndarray exponents1      =   basisobject1.exponents
    cdef np.ndarray coefficients1   =   basisobject1.coefficients
    cdef np.ndarray shell1          =   basisobject1.shell
    cdef np.ndarray normcoeffs1     =   basisobject1.normcoeffs

    cdef np.ndarray center2         =   basisobject2.center
    cdef np.ndarray exponents2      =   basisobject2.exponents
    cdef np.ndarray coefficients2   =   basisobject2.coefficients
    cdef np.ndarray shell2          =   basisobject2.shell
    cdef np.ndarray normcoeffs2     =   basisobject2.normcoeffs

    cdef np.ndarray center3         =   basisobject3.center
    cdef np.ndarray exponents3      =   basisobject3.exponents
    cdef np.ndarray coefficients3   =   basisobject3.coefficients
    cdef np.ndarray shell3          =   basisobject3.shell
    cdef np.ndarray normcoeffs3     =   basisobject3.normcoeffs

    cdef np.ndarray center4         =   basisobject4.center
    cdef np.ndarray exponents4      =   basisobject4.exponents
    cdef np.ndarray coefficients4   =   basisobject4.coefficients
    cdef np.ndarray shell4          =   basisobject4.shell
    cdef np.ndarray normcoeffs4     =   basisobject4.normcoeffs

    cdef double ericgto =   0.0
    for index1, exponent1 in enumerate(exponents1):
        for index2, exponent2 in enumerate(exponents2):
            for index3, exponent3 in enumerate(exponents3):
                for index4, exponent4 in enumerate(exponents4):
                    result  =   eripgtos(exponent1, shell1, center1, exponent2, shell2, center2, exponent3, shell3, center3, exponent4, shell4, center4)
                    result  *=  coefficients1[index1]*normcoeffs1[index1]*coefficients2[index2]*normcoeffs2[index2]*coefficients3[index3]*normcoeffs3[index3]*coefficients4[index4]*normcoeffs4[index4]
                    ericgto =   ericgto+result
    return ericgto

hartreefock.pyx

import cython   as  cython
import numpy    as  np
cimport numpy   as  np

from scipy.special  import  comb, factorial2
from libc.math      cimport exp,  pow
from numpy          import  dot,  pi, diag, amax, trace, isclose
from numpy.linalg   import  eigh, solve

@cython.boundscheck(False)
@cython.wraparound(False)
def computecorehamiltonian(np.ndarray[np.float64_t, ndim=2]kineticmat, np.ndarray[np.float64_t, ndim=2]enimat):
    cdef int nbasis =   kineticmat[0].size
    cdef np.ndarray[np.float64_t, ndim=2] corehamiltonian   =   np.zeros((nbasis, nbasis))
    corehamiltonian =   kineticmat+enimat
    return corehamiltonian

@cython.boundscheck(False)
@cython.wraparound(False)
def canonicalorthogonalization(np.ndarray[np.float64_t, ndim=2] overlapmat):
    cdef int nbasis =   overlapmat[0].size
    cdef np.ndarray[np.float64_t, ndim=1]   eigenvalues     =   np.zeros(nbasis)
    cdef np.ndarray[np.float64_t, ndim=2]   t_eigenvalues   =   np.zeros((nbasis, nbasis))
    cdef np.ndarray[np.float64_t, ndim=2]   eigenvectors    =   np.zeros((nbasis, nbasis))
    cdef np.ndarray[np.float64_t, ndim=2]   orthogonalmat   =   np.zeros((nbasis, nbasis))
    eigenvalues, eigenvectors   =   eigh(overlapmat)
    t_eigenvalues               =   diag((eigenvalues)**-0.5)
    orthogonalmat               =   dot(eigenvectors, t_eigenvalues)
    return orthogonalmat

@cython.boundscheck(False)
@cython.wraparound(False)
def computehamiltonian(np.ndarray[np.float64_t, ndim=2] densitymat, np.ndarray[np.float64_t, ndim=4] erimat):
    cdef int nbasis =   densitymat[0].size
    cdef int index1
    cdef int index2
    cdef int index3
    cdef int index4
    cdef np.ndarray[np.float64_t, ndim=2]   hamiltonian =   np.zeros((nbasis, nbasis))
    for index1 in range(0, nbasis):
        for index2 in range(0, nbasis):
            for index3 in range(0, nbasis):
                for index4 in range(0, nbasis):
                    hamiltonian[index1, index2] =   hamiltonian[index1, index2]+(densitymat[index3, index4]*(erimat[index1, index2, index3, index4]-0.5*(erimat[index1, index4, index3, index2])))
    return hamiltonian

@cython.boundscheck(False)
@cython.wraparound(False)
def computedensity(int nelectrons, np.ndarray[np.float64_t, ndim=2] overlapmat):
    cdef int nbasis =   overlapmat[0].shape[0]
    cdef np.ndarray[np.float64_t, ndim=2]   densitymat  =   np.zeros((nbasis, nbasis))
    cdef int index1
    cdef int index2
    cdef int index3
    for index1 in range(0, nbasis):
        for index2 in range(0, nbasis):
            for index3 in range(0, int(nelectrons//2)):
                densitymat[index1, index2]  +=  2.0*overlapmat[index1, index3]*overlapmat[index2, index3]
    return densitymat

@cython.boundscheck(False)
@cython.wraparound(False)
def diisorthogonalization(np.ndarray[np.float64_t, ndim=2] overlapmat):
    eigenvalues, eigenvectors   =   eigh(overlapmat)
    halfeigmat                  =   diag((eigenvalues)**-0.5)
    rightmat                    =   dot(eigenvectors, halfeigmat)
    orthomat                    =   dot(rightmat, eigenvectors.T)
    return orthomat

@cython.boundscheck(False)
def diisEngine(molecule, np.ndarray[np.float64_t, ndim=2] fock, np.ndarray[np.float64_t, ndim=2] density):
    cdef int dimensions         =   fock[0].size
    eigenvalues, eigenvectors   =   eigh(molecule.overlapmat)
    halfeigmat                  =   diag((eigenvalues)**-0.5)
    rightmat                    =   dot(eigenvectors, halfeigmat)
    orthomat                    =   dot(rightmat, eigenvectors.T)
    t_errorvector               =   dot(fock, dot(density, molecule.overlapmat))-dot(molecule.overlapmat, dot(density, fock))
    leftmat                     =   dot(orthomat.T, t_errorvector)
    errorvector                 =   dot(leftmat, orthomat)
    molecule.diisfock.append(fock)
    molecule.diiserror.append(errorvector)
    cdef int fockdimension               =   len(molecule.diisfock)
    if fockdimension > 6:
        del molecule.diisfock[0]
        del molecule.diiserror[0]
        fockdimension   =   fockdimension-1
    cdef np.ndarray[np.float64_t, ndim=2] bmatrix   =   np.zeros((fockdimension+1, fockdimension+1))
    bmatrix[-1,:]               =   -1.0
    bmatrix[:,-1]               =   -1.0
    bmatrix[-1,-1]              =   0.0
    for index1 in range(0, fockdimension):
        for index2 in range(0, index1+1):
            bmatrix[index1, index2] =   trace(dot(molecule.diiserror[index1].T, molecule.diiserror[index2]))
            bmatrix[index2, index1] =   trace(dot(molecule.diiserror[index1].T, molecule.diiserror[index2]))
    cdef np.ndarray[np.float64_t, ndim=1] residue    =   np.zeros(fockdimension+1)
    residue[-1]                 =   -1.0
    cdef np.ndarray[np.float64_t, ndim=1] weights    =   solve(bmatrix, residue)
    assert isclose(sum(weights[:-1]),1.0)
    cdef np.ndarray[np.float64_t, ndim=2] newfock    =   np.zeros((dimensions, dimensions))
    for index, fockelement in enumerate(molecule.diisfock):
        newfock =   newfock+(weights[index]*fockelement)
    return newfock

@cython.boundscheck(False)
@cython.wraparound(False)
def restrictedhatreefock(int nelectrons, np.ndarray[np.float64_t, ndim=2] corehamiltonian, np.ndarray[np.float64_t, ndim=2] orthogonalmat, np.ndarray[np.float64_t, ndim=2] densitymat, np.ndarray[np.float64_t, ndim=4] erimat):
    cdef int nbasis =   corehamiltonian[0].size
    cdef np.ndarray[np.float64_t, ndim=2]   hamiltonianmat  =   computehamiltonian(densitymat, erimat)
    cdef np.ndarray[np.float64_t, ndim=2]   fockmatrix      =   hamiltonianmat+corehamiltonian
    cdef np.ndarray[np.float64_t, ndim=2]   orthofockmatrix =   dot(orthogonalmat.conj().T, dot(fockmatrix, orthogonalmat))
    cdef np.ndarray[np.float64_t, ndim=1]   orbitalenergies =   np.zeros(nbasis)
    cdef np.ndarray[np.float64_t, ndim=2]   orthocoeffs     =   np.zeros((nbasis, nbasis))
    orbitalenergies, orthocoeffs                            =   eigh(orthofockmatrix)
    cdef np.ndarray[np.float64_t, ndim=2]   canonicalcoeffs =   dot(orthogonalmat, orthocoeffs)
    cdef np.ndarray[np.float64_t, ndim=2]   t_densitymat    =   computedensity(nelectrons, canonicalcoeffs)
    cdef double maxdensitydiff
    cdef double rmsdensitydiff
    maxdensitydiff, rmsdensitydiff                          =   checkconvergence(densitymat, t_densitymat)
    return orthofockmatrix, fockmatrix, hamiltonianmat, orbitalenergies, canonicalcoeffs, t_densitymat, maxdensitydiff, rmsdensitydiff

@cython.boundscheck(False)
@cython.wraparound(False)
def restrictedhatreefockDIIS(molecule, int nelectrons, np.ndarray[np.float64_t, ndim=2] corehamiltonian, np.ndarray[np.float64_t, ndim=2] orthogonalmat, np.ndarray[np.float64_t, ndim=2] densitymat, np.ndarray[np.float64_t, ndim=4] erimat):
    cdef int nbasis =   corehamiltonian[0].size
    cdef np.ndarray[np.float64_t, ndim=2]   hamiltonianmat  =   computehamiltonian(densitymat, erimat)
    cdef np.ndarray[np.float64_t, ndim=2]   fockmatrix      =   hamiltonianmat+corehamiltonian
    cdef np.ndarray[np.float64_t, ndim=2]   diisfock        =   diisEngine(molecule, fockmatrix, densitymat)
    cdef np.ndarray[np.float64_t, ndim=2]   orthofockmatrix =   dot(orthogonalmat.conj().T, dot(diisfock, orthogonalmat))
    cdef np.ndarray[np.float64_t, ndim=1]   orbitalenergies =   np.zeros(nbasis)
    cdef np.ndarray[np.float64_t, ndim=2]   orthocoeffs     =   np.zeros((nbasis, nbasis))
    orbitalenergies, orthocoeffs                            =   eigh(orthofockmatrix)
    cdef np.ndarray[np.float64_t, ndim=2]   canonicalcoeffs =   dot(orthogonalmat, orthocoeffs)
    cdef np.ndarray[np.float64_t, ndim=2]   t_densitymat    =   computedensity(nelectrons, canonicalcoeffs)
    cdef double maxdensitydiff
    cdef double rmsdensitydiff
    maxdensitydiff, rmsdensitydiff                          =   checkconvergence(densitymat, t_densitymat)
    return orthofockmatrix, fockmatrix, hamiltonianmat, orbitalenergies, canonicalcoeffs, t_densitymat, maxdensitydiff, rmsdensitydiff

@cython.boundscheck(False)
@cython.wraparound(False)
def checkconvergence(np.ndarray[np.float64_t, ndim=2] densitymat, np.ndarray[np.float64_t, ndim=2] t_densitymat):
    cdef np.ndarray[np.float64_t, ndim=2]   densitydiff     =   t_densitymat-densitymat
    cdef double maxdensitydiff  =   amax(densitydiff)
    cdef double rmsdensitydiff  =   pow(sum(sum((densitydiff)**2))/4.0, 0.5)
    return maxdensitydiff, rmsdensitydiff

@cython.boundscheck(False)
@cython.wraparound(False)
def totalenergy(float nucenergy, np.ndarray[np.float64_t, ndim=2] fockmatrix, np.ndarray[np.float64_t, ndim=2] densitymatrix, np.ndarray[np.float64_t, ndim=2] corehamiltonian):
    cdef double t_energy    =   0.0
    cdef int nbasis         =   fockmatrix[0].size
    for index1 in range(0, nbasis):
        for index2 in range(0, nbasis):
            t_energy    =   t_energy+(0.5*densitymatrix[index1,index2]*(fockmatrix[index1,index2]+corehamiltonian[index1,index2]))
    return t_energy+nucenergy
```
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Did you consider using Numba? \$\endgroup\$
    – greybeard
    Commented Jun 21, 2023 at 13:59
  • \$\begingroup\$ I did not. I was under the impression that static compilation was a better way to go. Because many programs have already implemented routines using Cython; PyQuante for example. \$\endgroup\$ Commented Jun 21, 2023 at 14:37

1 Answer 1

2
\$\begingroup\$

Let's talk about structure and readability.


Use snake_case for naming, it's a standard practice in Python. nuclear_energy instead of nuclearenergy, get_atom instead of getatom, etc.. Much easier to read.


Don't use camelCase and, more importantly, don't mix naming conventions together:

periodictable   =   safe_load(periodic)
periodicTable   =   {y: x[y] for x in periodictable for y in x}

At first I thought you're assigning periodictable twice.


Don't pollute the global scope by creating names here. The piece above creates 2 names periodictable and periodicTable that will be accessible from anywhere in the code. Instead create a PeriodicTable class with a get_atom member method.


Let's look at this function:

def nuclearenergy(atomobjects):
    atompairs       =   list(combinations(atomobjects,2))
    nuclearenergy   =   0.0
    for atompair in atompairs:
        center1         =   atompair[0].center
        charge1         =   atompair[0].charge
        center2         =   atompair[1].center
        charge2         =   atompair[1].charge
        distance        =   norm(center1-center2)
        energy          =   charge1*charge2/distance
        nuclearenergy   =   nuclearenergy+energy
    return nuclearenergy

It can be rewritten as:

def get_nuclear_energy(atoms):
    return sum([pair[0].charge * pair[1].charge / norm(pair[0].center - pair[1].center) for pair in combinations(atoms, 2)])

This is shorter and more readable while also being about 6% faster (according to a simple test via PyCharm profiling). The only information we lost is that norm(center1-center2) is distance, but I assume people who are going to contribute to this code are already familiar with the formulas.


Now this function:

def overlap(basisobjects):
    overlaps    =   []
    for basisobjecttuple in basisobjects:
        result  =   overlapcgtos(basisobjecttuple[0], basisobjecttuple[1])
        overlaps.append((result, basisobjecttuple[0].basisindex, basisobjecttuple[1].basisindex))
    return overlaps

We see a similar structure: create an empty container-accumulator and fill it with stuff in a cycle. This is verbose and slow, since you have to call append() every time you're adding to the container. It can be rewritten like this:

def get_overlaps(bases):
    return [(overlapcgtos(basis[0], basis[1]), basis[0].basis_index, basis[1].basis_index) for basis in bases]

Can't check the performance without compiling pyx, but should be about 44% faster (tested append in a cycle vs list comprehension via PyCharm profiler).

The same goes for the following 3 functions.


def integralEngine ...

def scfEngine ...

Can't tell what those functions are for by reading their name. Functions should be named with verbs that represent the action the function is doing, e.g. start_scf_engine.


if calctype == "overlap":
...
if calctype == "kinetic":
...
if calctype == "eni":
...
if calctype == "eri":
...

Consider defining an enum for that, so things don't break when you decide to change the string in one place and forget to change it in another. Also use elif to get rid of unnecessary checks.


pow(2, 2*totalangmomentum)*pow(2, 1.5)

Python has a pow operator: **

2 ** (2 * totalangmomentum) * 2 ** 2.5

[x for x in molecule.geometry]

This statement does nothing with molecule.geometry. If you wanted to copy it, there is a function for that in the copy package: copy.copy(molecule.geometry)


Consider using if __name__ == '__main__' construction to run the script.


In general, try running your code with a profiler and see what parts are the slowest. Try to speed them up first and don't bother with stuff that is already fast. And consider trying Numba, as @greybeard suggested in the comments.

\$\endgroup\$
2
  • \$\begingroup\$ @greybeard, no. Is there a point? \$\endgroup\$ Commented Jun 24, 2023 at 16:10
  • \$\begingroup\$ @greybeard haven't noticed, ma bad \$\endgroup\$ Commented Jun 24, 2023 at 19:43

Not the answer you're looking for? Browse other questions tagged or ask your own question.