$\begingroup$

In the overall good paper by Beruski and Vidal,[1] an algorithm for symmetrizing an "almost symmetric" molecule by symmetrizing its distance matrix is given, using methane as the example.

The main idea is that symmetrically equivalent atoms have the same list of distances, up to a permutation, in their respective rows and columns of the distance matrix. Thus, by clustering and averaging these distances, one can get the distance matrix of the closest symmetric molecule. Depending on the "amount of averaging" (the tolerance), the molecule is forced into higher and higher symmetries.
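To make the idea concrete, here is a minimal sketch of how symmetrically equivalent atoms could be detected from sorted distance lists. It is not the procedure from the paper: the function names `distance_matrix` and `sea_groups`, the greedy comparison against the first member of each group, and the made-up methane coordinates are all assumptions for illustration, and element types are ignored.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances for a list of (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def sea_groups(dist, tol):
    """Greedily group atoms whose sorted distance lists agree within tol.

    tol is in the same units as dist. Each atom is compared with the
    first member of every existing group (an assumption of this sketch).
    """
    profiles = [sorted(row) for row in dist]
    groups = []
    for i, prof in enumerate(profiles):
        for group in groups:
            ref = profiles[group[0]]
            if all(abs(p - r) <= tol for p, r in zip(prof, ref)):
                group.append(i)
                break
        else:
            # No existing group matched, so atom i starts a new one.
            groups.append([i])
    return groups

# Slightly distorted methane; coordinates in Angstrom, invented for illustration.
methane = [
    (0.000, 0.000, 0.000),    # C
    (0.629, 0.629, 0.629),    # H
    (-0.629, -0.629, 0.640),  # H (perturbed)
    (-0.629, 0.629, -0.629),  # H
    (0.629, -0.629, -0.629),  # H
]
groups = sea_groups(distance_matrix(methane), tol=0.05)
print(groups)  # [[0], [1, 2, 3, 4]]
```

With a tolerance of 0.05 Å the four hydrogens end up in one group even though one of them is displaced; a larger tolerance merges more atoms and therefore forces a higher symmetry.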

However, when I try to apply this algorithm to a more complex molecule (say, hexanitrocuprate(II)), the new distance matrix, averaged over its symmetrically equivalent atoms (SEA) with a tolerance of 25.0 pm, becomes non-Euclidean, whereas the original distance matrix is Euclidean. To build the distance matrices, the atomic positions were converted from Ångström to pm and re-sorted by element into groups of O, N and Cu.

For an explanation of this problem, which is called the Euclidean distance problem, see e.g. the dissertation of Al-Homidan.[2] Digging a bit into the theory, one can apply a projection method that transforms the non-Euclidean distance matrix back into a Euclidean one. The algorithm of Beruski and Vidal, by contrast, reconstructs the atom positions in the molecule directly through a quasi-Newton optimization of a score function on the distance matrix.
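For reference, a minimal sketch of one common way to test whether a distance matrix is Euclidean in three dimensions: build the double-centred Gram matrix used in classical multidimensional scaling and inspect its eigenvalues. This is a textbook criterion, not the method of either reference; the function names, the tolerance `rel_tol` and the five test points are assumptions for illustration.

```python
import numpy as np

def gram_from_distances(dist):
    """Double-centred Gram matrix B = -1/2 * J * D^2 * J (classical MDS)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n  # centring matrix
    return -0.5 * j @ d2 @ j

def is_euclidean_3d(dist, rel_tol=1e-8):
    """True if dist can be realised by points in at most 3 dimensions.

    Requires B to have no significantly negative eigenvalue and at most
    three significantly non-zero ones.
    """
    eig = np.linalg.eigvalsh(gram_from_distances(dist))  # ascending order
    scale = max(np.abs(eig).max(), 1.0)
    no_negative = eig[0] >= -rel_tol * scale
    at_most_rank3 = np.all(np.abs(eig[:-3]) <= rel_tol * scale)
    return bool(no_negative and at_most_rank3)

# Five points in 3D give a Euclidean distance matrix by construction.
pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

# Changing a single distance (here so that it violates the triangle
# inequality) makes the matrix non-Euclidean.
bad = dist.copy()
bad[0, 1] = bad[1, 0] = 3.0

print(is_euclidean_3d(dist), is_euclidean_3d(bad))
```

Averaging individual entries of the matrix, as in the symmetrization step, is exactly the kind of entry-wise change that can break this condition.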

However, both approaches "de-symmetrize" the distance matrix back to a state close to the distance matrix of the original molecule (at least for the hexanitrocuprate(II) example). I assume the main problem is that the constraints on the distances pull the molecule back to its original shape.

Of course, there are algorithms that first determine the symmetry (the point group) of the molecule (see e.g. the answers to this and this question from computational chemistry) and then symmetrize the molecule with this information. However, I would like to use just the local symmetry of the molecule, without a rather complex point-group recognition algorithm.

Any idea how to solve this problem?

References

  1. O. Beruski, L. N. Vidal. J. Comput. Chem. 2014, 35, 290–299.
  2. Suliman Saleh Al-Homidan. Hybrid Methods for Optimization Problems with Positive Semi-Definite Matrix Constraints. PhD Thesis, University of Dundee, Dundee, December 2003.

Referenced File contents

Hexanitrocuprate(II) (mol file)

 OpenBabel10241500433D  

 19 18  0  0  0  0  0  0  0  0999 V2000  
    6.2190   -0.6777   -1.0333 O   0  5  0  0  0  0  0  0  0  0  0  0  
    5.5672   -1.4099   -0.2754 N   0  3  0  0  0  0  0  0  0  0  0  0  
    6.0808   -2.3487    0.3568 O   0  0  0  0  0  0  0  0  0  0  0  0    
    3.7257   -1.2558    0.0053 Cu  0  7  0  0  0  0  0  0  0  0  0  0  
    3.5134   -2.5214    1.3633 N   0  3  0  0  0  0  0  0  0  0  0  0  
    4.2665   -2.2644    2.3200 O   0  5  0  0  0  0  0  0  0  0  0  0  
    2.7186   -3.4711    1.3724 O   0  0  0  0  0  0  0  0  0  0  0  0  
    2.1847   -0.7314    0.9234 N   0  3  0  0  0  0  0  0  0  0  0  0  
    2.1383   -0.1356    2.0089 O   0  5  0  0  0  0  0  0  0  0  0  0  
    1.1629   -1.0996    0.3148 O   0  0  0  0  0  0  0  0  0  0  0  0  
    4.4060    0.3341    0.7046 N   0  3  0  0  0  0  0  0  0  0  0  0  
    4.2381    1.3976    0.0898 O   0  5  0  0  0  0  0  0  0  0  0  0  
    4.9585    0.2682    1.8160 O   0  0  0  0  0  0  0  0  0  0  0  0  
    3.4004   -0.4833   -1.6643 N   0  3  0  0  0  0  0  0  0  0  0  0  
    2.3776    0.2229   -1.6723 O   0  5  0  0  0  0  0  0  0  0  0  0  
    4.1301   -0.6158   -2.6565 O   0  0  0  0  0  0  0  0  0  0  0  0  
    3.3488   -2.7126   -1.0984 N   0  3  0  0  0  0  0  0  0  0  0  0  
    2.1566   -2.9077   -1.3889 O   0  5  0  0  0  0  0  0  0  0  0  0  
    4.2641   -3.4679   -1.4571 O   0  0  0  0  0  0  0  0  0  0  0  0  
  1  2  1  0  0  0  0  
  2  3  2  0  0  0  0  
  2  4  1  0  0  0  0  
  4  5  1  0  0  0  0  
  4  8  1  0  0  0  0  
  4 11  1  0  0  0  0  
  4 14  1  0  0  0  0  
  4 17  1  0  0  0  0  
  5  6  1  0  0  0  0  
  5  7  2  0  0  0  0  
  8  9  1  0  0  0  0  
  8 10  2  0  0  0  0  
 11 12  1  0  0  0  0  
 11 13  2  0  0  0  0  
 14 15  1  0  0  0  0  
 14 16  2  0  0  0  0  
 17 18  1  0  0  0  0  
 17 19  2  0  0  0  0  
M  CHG  8   1  -1   2   1   4  -3   5   1   6  -1   8   1   9  -1  11   1  
M  CHG  5  12  -1  14   1  15  -1  17   1  18  -1  
M  END  

Original distance matrix (csv file)

0.,285.286949929365,365.2230816911768,231.30761855157303,273.544384149995,270.1046991446095,123.91014849478631,124.29027476033676,302.6233946673654,388.3080251552883,431.26815544391866,445.4489939375776,312.76083674270984,274.96343866048807,384.58956837127033,289.23154098403586,388.7863095583486,270.7369295090716,186.91339037104862  
285.286949929365,0.,227.223583723169,306.3413243426358,365.1435658751226,247.45944091911304,405.7485996278977,276.3043665959697,124.43807697003355,123.8434471419461,282.8381340979324,294.0360137806251,418.39638311056177,317.41328674143426,424.68919800249216,449.11493284013613,309.2691185682787,306.8235470103297,186.84160805345257  
365.2230816911768,227.223583723169,0.,247.33251666531837,286.97859484637524,306.0711925353316,448.4098021899164,425.63235379843957,293.8447174954826,282.7119788760285,123.91290691449379,124.5006200787771,307.3106168032598,308.05683501587816,277.2283785978629,407.5981909920602,317.55003684458933,418.0803374950801,186.88456249781575  
231.30761855157303,306.3413243426358,247.33251666531837,0.,270.01965354395963,369.4707259039612,270.7582805751285,318.1718988220048,306.2871146489842,421.57825655505525,265.7871304258353,356.7231046624258,123.9839384759171,124.29034636688402,312.6727620052632,350.3630121745159,446.674488302164,437.58755123974913,186.53714348622367  
273.544384149995,365.1435658751226,286.97859484637524,270.01965354395963,0.,230.05833738423826,289.4902126152109,384.045655228646,444.93924203198793,431.431883036013,389.95115001753743,305.0093236279835,270.4883160138345,388.6504335517973,124.29409800951929,123.87409656582768,273.8722139976964,311.3959969235315,186.81917728113459  
270.1046991446095,247.45944091911304,306.0711925353316,369.4707259039612,230.05833738423826,0.,351.89634922800775,311.670567266144,356.77044566499615,266.0324591097861,421.4445697835007,306.2198891319765,436.99448245944706,446.8783960989835,314.4797433858022,272.66601438389785,124.24955130703692,123.9724917068299,186.61386711603188  
123.91014849478631,405.7485996278977,448.4098021899164,270.7582805751285,289.4902126152109,351.89634922800775,0.,217.80078627038978,419.21944170088295,508.3628252537749,511.8687892419306,524.9716223759146,308.0939549877602,325.6086231966224,399.6968266073675,264.61506117377365,464.78430610337955,343.31409073907815,276.2143091876306  
124.29027476033676,276.3043665959697,425.63235379843957,318.1718988220048,384.045655228646,311.670567266144,217.80078627038978,0.,267.449905963715,368.7223068923279,481.3579984793023,507.42245929797,418.3490884416984,319.9529268501853,494.40938108009243,398.5994203457903,433.1200195096044,280.05786437806034,262.00150133157643  
302.6233946673654,124.43807697003355,293.8447174954826,306.2871146489842,444.93924203198793,356.77044566499615,419.21944170088295,267.449905963715,0.,217.94643975068732,302.61870216495214,387.42636771391795,428.77549603492974,267.33766588342917,506.8832586503523,524.4238664477429,431.52580583320855,396.42026209567035,258.21647294469807  
388.3080251552883,123.8434471419461,282.7119788760285,421.57825655505525,431.431883036013,266.0324591097861,508.3628252537749,368.7223068923279,217.94643975068732,0.,344.49151789267614,302.7000578130107,525.9104743204873,438.13579470296645,479.9178584924716,513.5864635093102,287.36807842904193,322.40736250898493,279.11945310207244  
431.26815544391866,282.8381340979324,123.91290691449379,265.7871304258353,389.95115001753743,421.4445697835007,511.8687892419306,481.3579984793023,302.61870216495214,344.49151789267614,0.,217.96045444070813,323.15332413577306,285.548470316337,370.6348631739869,509.54701883143224,438.51930561835024,525.7033852848962,279.08958705046666  
445.4489939375776,294.0360137806251,124.5006200787771,356.7231046624258,305.0093236279835,306.2198891319765,524.9716223759146,507.42245929797,387.42636771391795,302.7000578130107,217.96045444070813,0.,396.7806431770582,430.47714968392916,267.8258529343275,422.69329270760846,267.5679164249705,428.5547332605253,258.61424032717144  
312.76083674270984,418.39638311056177,307.3106168032598,123.9839384759171,270.4883160138345,436.99448245944706,308.0939549877602,418.3490884416984,428.77549603492974,525.9104743204873,323.15332413577306,396.7806431770582,0.,218.50141784437005,281.8931845575554,340.6993872903208,500.54774028058506,510.5552453946586,270.3742881636491  
274.96343866048807,317.41328674143426,308.05683501587816,124.29034636688402,388.6504335517973,446.8783960989835,325.6086231966224,319.9529268501853,267.33766588342917,438.13579470296645,285.548470316337,430.47714968392916,218.50141784437005,0.,433.95084733181466,463.36765974763495,531.1155093762561,501.53582304756657,266.85213752188685  
384.58956837127033,424.68919800249216,277.2283785978629,312.6727620052632,124.29409800951929,314.4797433858022,399.6968266073675,494.40938108009243,506.8832586503523,479.9178584924716,370.6348631739869,267.8258529343275,281.8931845575554,433.95084733181466,0.,217.79172573814643,315.11605671561716,415.056597706867,261.1181544818361  
289.23154098403586,449.11493284013613,407.5981909920602,350.3630121745159,123.87409656582768,272.66601438389785,264.61506117377365,398.5994203457903,524.4238664477429,513.5864635093102,509.54701883143224,422.69329270760846,340.6993872903208,463.36765974763495,217.79172573814643,0.,327.93776269286224,309.6932477468632,276.7366726691639  
388.7863095583486,309.2691185682787,317.55003684458933,446.674488302164,273.8722139976964,124.24955130703692,464.78430610337955,433.1200195096044,431.52580583320855,287.36807842904193,438.51930561835024,267.5679164249705,500.54774028058506,531.1155093762561,315.11605671561716,327.93776269286224,0.,218.174964879108,267.10750756951785  
270.7369295090716,306.8235470103297,418.0803374950801,437.58755123974913,311.3959969235315,123.9724917068299,343.31409073907815,280.05786437806034,396.42026209567035,322.40736250898493,525.7033852848962,428.5547332605253,510.5552453946586,501.53582304756657,415.056597706867,309.6932477468632,218.174964879108,0.,270.58962895868723  
186.91339037104862,186.84160805345257,186.88456249781575,186.53714348622367,186.81917728113459,186.61386711603188,276.2143091876306,262.00150133157643,258.21647294469807,279.11945310207244,279.08958705046666,258.61424032717144,270.3742881636491,266.85213752188685,261.1181544818361,276.7366726691639,267.10750756951785,270.58962895868723,0.

new distance matrix (csv file)

0.,284.90364177745516,365.1833237831497,239.03947838006067,277.35510979538634,269.22803953053614,123.88514977926349,124.29218638492802,301.66740166629756,404.11562988556216,434.849076029974,431.45426939883123,312.7048608574015,273.827178600421,376.9980984165578,277.0514027606322,388.71837155507296,274.0233685968097,186.76829146761784  
284.90364177745516,0.,237.30978125769235,310.23931151726777,381.83722368114843,239.03947838006067,404.11562988556216,271.73490110511295,124.46934852440532,123.88514977926349,277.0514027606322,301.66740166629756,410.5138853378226,317.61308082818346,429.3481046605877,434.849076029974,317.61308082818346,312.7048608574015,186.76829146761784  
365.1833237831497,237.30978125769235,0.,239.03947838006067,277.35510979538634,310.23931151726777,434.849076029974,429.3481046605877,301.66740166629756,277.0514027606322,123.88514977926349,124.46934852440532,312.7048608574015,317.61308082818346,271.73490110511295,404.11562988556216,317.61308082818346,410.5138853378226,186.76829146761784  
239.03947838006067,310.23931151726777,239.03947838006067,0.,277.35510979538634,359.11572725631135,277.0514027606322,315.3439925075028,301.66740166629756,404.11562988556216,277.0514027606322,356.74677516371094,123.9782150913735,124.26994883696048,315.3439925075028,351.8266306604487,437.4102258851276,432.97806574866286,186.76829146761784  
277.35510979538634,381.83722368114843,277.35510979538634,277.35510979538634,0.,239.03947838006067,277.0514027606322,376.9980984165578,431.45426939883123,434.849076029974,404.11562988556216,301.66740166629756,274.0233685968097,388.71837155507296,124.29218638492802,123.88514977926349,273.827178600421,312.7048608574015,186.76829146761784  
269.22803953053614,239.03947838006067,310.23931151726777,359.11572725631135,239.03947838006067,0.,351.8266306604487,315.3439925075028,356.74677516371094,277.0514027606322,404.11562988556216,301.66740166629756,432.97806574866286,437.4102258851276,315.3439925075028,277.0514027606322,124.26994883696048,123.9782150913735,186.76829146761784  
123.88514977926349,404.11562988556216,434.849076029974,277.0514027606322,277.0514027606322,351.8266306604487,0.,217.79625600426812,431.45426939883123,515.6449618418281,515.6449618418281,515.9253016929949,312.7048608574015,317.61308082818346,407.9254832754308,277.0514027606322,464.0759829255072,342.00673901469946,268.8361627756964  
124.29218638492802,271.73490110511295,429.3481046605877,315.3439925075028,376.9980984165578,315.3439925075028,217.79625600426812,0.,264.5020090797186,351.8266306604487,472.3569557056971,515.9253016929949,410.5138853378226,317.61308082818346,487.5236547829897,404.11562988556216,437.4102258851276,274.0233685968097,268.8361627756964  
301.66740166629756,124.46934852440532,301.66740166629756,301.66740166629756,431.45426939883123,356.74677516371094,431.45426939883123,264.5020090797186,0.,217.8748515499829,315.27662940585355,392.0134101751411,432.97806574866286,273.827178600421,507.1528589741612,515.6449618418281,437.4102258851276,410.5138853378226,268.8361627756964  
404.11562988556216,123.88514977926349,277.0514027606322,404.11562988556216,434.849076029974,277.0514027606322,515.6449618418281,351.8266306604487,217.8748515499829,0.,351.8266306604487,301.66740166629756,512.4679856204754,437.4102258851276,487.5236547829897,515.6449618418281,273.827178600421,312.7048608574015,268.8361627756964  
434.849076029974,277.0514027606322,123.88514977926349,277.0514027606322,404.11562988556216,404.11562988556216,515.6449618418281,472.3569557056971,315.27662940585355,351.8266306604487,0.,217.95344709569773,312.7048608574015,273.827178600421,376.9980984165578,515.6449618418281,437.4102258851276,512.4679856204754,268.8361627756964  
431.45426939883123,301.66740166629756,124.46934852440532,356.74677516371094,301.66740166629756,301.66740166629756,515.9253016929949,515.9253016929949,392.0134101751411,301.66740166629756,217.95344709569773,0.,410.5138853378226,437.4102258851276,271.73490110511295,434.849076029974,273.827178600421,432.97806574866286,268.8361627756964  
312.7048608574015,410.5138853378226,312.7048608574015,123.9782150913735,274.0233685968097,432.97806574866286,312.7048608574015,410.5138853378226,432.97806574866286,512.4679856204754,312.7048608574015,410.5138853378226,0.,218.338191361739,271.73490110511295,351.8266306604487,501.0417816640758,512.4679856204754,268.8361627756964  
273.827178600421,317.61308082818346,317.61308082818346,124.26994883696048,388.71837155507296,437.4102258851276,317.61308082818346,317.61308082818346,273.827178600421,437.4102258851276,273.827178600421,437.4102258851276,218.338191361739,0.,429.3481046605877,472.3569557056971,531.1155093762561,512.4679856204754,268.8361627756964  
376.9980984165578,429.3481046605877,271.73490110511295,315.3439925075028,124.29218638492802,315.3439925075028,407.9254832754308,487.5236547829897,507.1528589741612,487.5236547829897,376.9980984165578,271.73490110511295,271.73490110511295,429.3481046605877,0.,217.8748515499829,317.61308082818346,410.5138853378226,268.8361627756964  
277.0514027606322,434.849076029974,404.11562988556216,351.8266306604487,123.88514977926349,277.0514027606322,277.0514027606322,404.11562988556216,515.6449618418281,515.6449618418281,515.6449618418281,434.849076029974,351.8266306604487,472.3569557056971,217.8748515499829,0.,317.61308082818346,312.7048608574015,268.8361627756964  
388.71837155507296,317.61308082818346,317.61308082818346,437.4102258851276,273.827178600421,124.26994883696048,464.0759829255072,437.4102258851276,437.4102258851276,273.827178600421,437.4102258851276,273.827178600421,501.0417816640758,531.1155093762561,317.61308082818346,317.61308082818346,0.,218.338191361739,268.8361627756964  
274.0233685968097,312.7048608574015,410.5138853378226,432.97806574866286,312.7048608574015,123.9782150913735,342.00673901469946,274.0233685968097,410.5138853378226,312.7048608574015,512.4679856204754,432.97806574866286,512.4679856204754,512.4679856204754,410.5138853378226,312.7048608574015,218.338191361739,0.,268.8361627756964  
186.76829146761784,186.76829146761784,186.76829146761784,186.76829146761784,186.76829146761784,186.76829146761784,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,268.8361627756964,0.  
$\endgroup$
  • $\begingroup$ Without digging too deep into the theory, Cartesian coordinates are the least suitable for working with molecular systems. I found internal coordinates much better for this purpose. They might also be usable for you... $\endgroup$
    – ssavec
    Commented Nov 4, 2015 at 10:08
  • $\begingroup$ @ssavec: I don't clearly understand how internal coordinates would help here. Since the symmetrization works on the distances between the atoms, using internal coordinates would not change this, or am I missing something? $\endgroup$
    – Rainer
    Commented Nov 4, 2015 at 16:29
  • $\begingroup$ @Rainer: I had time to crunch through your question and the article. I had a hard time reading the "mol" file, but I succeeded in converting it manually into an .xyz file. It is not a cuprate, there is cobalt inside, but it does not matter. What is worse, your original matrix seems quite wrong; in what units is it? The "new" matrix is even more wrong: the last column has the same distance to all other atoms, which is super-weird. $\endgroup$
    – ssavec
    Commented Nov 4, 2015 at 18:41
  • $\begingroup$ @ssavec: Sorry for the confusion. The mol file positions are in units of Angstrom and the distance matrix is in units of pm. Furthermore, the distance matrix is indexed with the clustered order of the atoms in the molecule, in groups of O, N and Cu. I also corrected Co to Cu, as you rightly pointed out. The new distance matrix is heavily averaged, since the tolerance for the averaging is 95 pm. However, if one chooses a smaller tolerance, the result is similar with respect to breaking the symmetry. $\endgroup$
    – Rainer
    Commented Nov 4, 2015 at 21:12
  • $\begingroup$ Consider this option: each distance can be associated with an 'interaction'. Let each 'interaction' produce a force. If the distance is below the average in its cluster, then the force should be a repulsion, and if it is above the average, an attraction. Shift the atoms according to the summed forces, scaled by some value so that the shift stays below, say, 0.1 Å. Run several iterations. || That said, I highly recommend dividing the atoms into classes based on their neighborhood before the procedure, otherwise you'll have trouble in some cases. $\endgroup$
    – permeakra
    Commented Dec 9, 2015 at 16:11

1 Answer

$\begingroup$

I've done some work in both symmetry detection and in distance matrix methods. I think it's a great idea in concept, but the devil will be in the details for large, more complex molecules.

The first problem is that distance geometry methods are over-determined. For each atom, there are range constraints to the other $N-1$ atoms (a lower bound and an upper bound), so that's a total of $N(N-1)/2$ constraints but only $3N-6$ degrees of freedom for the molecule. For a small molecule like methane, this isn't a huge gap:

  • 5 atoms = 10 constraints
  • 3N-6 = 9 degrees of freedom

In your case, you have 19 atoms, so:

  • 19x18/2 = 171 constraints
  • 3N-6 = 51 degrees of freedom

That's a gap of 120 constraints. My experience with distance geometry methods is that you must be very careful to smooth out all the constraints. This is algorithmically challenging and numerically unstable. For methane, it's trivial - it's a $5\times5$ matrix and you only need to consider bonding pairs and bond-angle pairs (i.e. H-H interactions).
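The counting above can be reproduced for any atom count with a few lines. This is only a sketch of the arithmetic; the function name `constraint_gap` is made up, and it assumes a non-linear molecule (hence $3N-6$).

```python
def constraint_gap(n_atoms):
    """Return (pair constraints, internal degrees of freedom, gap).

    Assumes a non-linear molecule, so there are 3N - 6 internal
    degrees of freedom.
    """
    pairs = n_atoms * (n_atoms - 1) // 2
    dof = 3 * n_atoms - 6
    return pairs, dof, pairs - dof

print(constraint_gap(5))   # (10, 9, 1)
print(constraint_gap(19))  # (171, 51, 120)
```

The gap grows quadratically with the number of atoms, which is why a procedure that is harmless for methane becomes delicate for a 19-atom complex.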

Consider your molecule: say you make sure all the Cu-N constraints are averaged out to be symmetric. If the Cu-O, O-N or O-O distances are still slightly off, there will be a "pull" back to the asymmetric geometry you mention.

I would actually take the advice in a comment - convert first to a z-matrix representation and average out all symmetric bond lengths and bond angles. Then I'd generate the distance matrix and start your procedure. The starting point should be closer than what you're using right now.

The second problem is in the optimization after you create the initial atomic coordinates from the distance matrix. As you said, it's easy for the distance matrix to become non-Euclidean.

I'd first treat this as a minimization problem on the distance matrix. You want the Euclidean matrix closest to the symmetric form. By definition, there must be a Euclidean, symmetric distance matrix (i.e., that of the perfectly symmetric molecule). Finding it is tricky.

So you might iteratively smooth the matrix:

  • Perform the Euclidean projection
  • Re-symmetrize the new projected matrix
  • Repeat, checking the size of the difference $\Delta$ between the Euclidean and symmetrized matrices (e.g., the trace of $\Delta^T\Delta$, which is the squared Frobenius norm) - hopefully the magnitude will decrease
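A rough sketch of that loop, under several assumptions: the "Euclidean projection" here is the simple classical-MDS truncation to three dimensions (not necessarily the projection method from the thesis cited in the question), the equivalence classes of matrix entries are taken as given in a hypothetical integer array `classes`, and convergence is not guaranteed in general. The distorted methane coordinates are invented for illustration.

```python
import numpy as np

def project_to_3d_edm(dist):
    """Distance matrix of the 3D classical-MDS embedding of dist."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    w, v = np.linalg.eigh(-0.5 * j @ d2 @ j)        # ascending eigenvalues
    w, v = np.clip(w[-3:], 0.0, None), v[:, -3:]    # keep 3 largest, drop negatives
    x = v * np.sqrt(w)                              # embedded coordinates, (n, 3)
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def symmetrize(dist, classes):
    """Replace every entry by the mean over its class (classes[i, j] is a label)."""
    out = np.array(dist, dtype=float)
    for label in np.unique(classes):
        mask = classes == label
        out[mask] = out[mask].mean()
    return out

def smooth(dist, classes, max_iter=200, tol=1e-6):
    """Alternate projection and re-symmetrization until they agree."""
    d = symmetrize(dist, classes)
    gap = np.inf
    for _ in range(max_iter):
        e = project_to_3d_edm(d)       # Euclidean, maybe not symmetric
        d = symmetrize(e, classes)     # symmetric, maybe not Euclidean
        gap = np.linalg.norm(e - d)    # Frobenius norm of the difference
        if gap < tol:
            break
    return d, gap

# Distorted methane (Angstrom): atom 0 is C, atoms 1-4 are H.
pts = np.array([[0, 0, 0], [0.629, 0.629, 0.629], [-0.629, -0.629, 0.640],
                [-0.629, 0.629, -0.629], [0.629, -0.629, -0.629]])
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

# Entry classes: 0 = diagonal, 1 = C-H, 2 = H-H.
classes = np.full((5, 5), 2)
classes[0, :] = classes[:, 0] = 1
np.fill_diagonal(classes, 0)

d, gap = smooth(dist, classes)
print(gap, d[1, 2] / d[0, 1])
```

For this small example the loop settles almost immediately on a tetrahedral geometry (H-H to C-H ratio of $\sqrt{8/3}$). For a case like yours, with many classes, I would watch `gap` closely, since nothing guarantees it shrinks monotonically.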

Ok, so now you have a distance matrix that's nearly Euclidean and symmetric. Once you generate the coordinates, you'll need to ensure that the optimization in Cartesian space retains the symmetry (e.g., for any perturbation of atom X, its symmetric partner X' must also move).

$\endgroup$
  • $\begingroup$ By the way, for anyone finding this old question, for the general question of “I just want a symmetric molecule,” I’d suggest using Avogadro (avogadro.cc). For v1.x on Linux or Mac, there’s View => Properties => Symmetry. For 2.x, there’s an even better symmetry tool that will clean up a molecule and visualize the symmetry elements. $\endgroup$ Commented Sep 18, 2018 at 20:58
