Bandgaps of Atomically Precise Graphene Nanoribbons and Occam’s Razor

Rationalization of the “bulk” (ΔEac) or “zigzag-end” (ΔEzz) energy gaps of atomically precise AGNRs, which are directly related to fundamental applications in nanoelectronics, can be challenging and remains controversial with respect to their magnitude, origin, substrate influence (ΔEsb), and spin polarization, among other issues. Here a simple, self-consistent, “economical”, and highly successful interpretation is presented, based on “appropriate” DFT (TDDFT) calculations, general symmetry principles, and plausibility arguments. It is fully consistent with current experimental measurements for 5-, 7-, and 9-AGNRs to within less than 1%, although at variance with some prevailing views or interpretations of ΔEac, ΔEzz, and ΔEsb. The excellent agreement with experiment and the new insight gained are achieved by invoking the approximate equivalence of the Coulomb correlation energy with the staggered sublattice potential. Breaking with established stereotypes, we suggest that the measured STS gaps are virtually independent of the substrate, essentially equal to their free-standing values, and that the “true” lowest-energy state is a closed singlet with no conventional magnetism. The primary source of discrepancies is the finite length of AGNRs, together with the inversion/reflection symmetry conflict and the resulting topological end/edge states. Such states invariably mix with other “bulk” states, making their unambiguous detection and distinction difficult. This can be further tested by eliminating the end states (and ΔEzz) through removal of the “empty” zigzag rings.

Graphical Abstract: Conflicting results and open questions on various “bulk” (ΔEac) and “surface” (ΔEzz, Δεζζ) energy gaps of AGNRs have very simple answers, in accord with Occam’s razor. Based on “inexpensive” DFT calculations and inversion-sublattice symmetry arguments, we achieve very high accuracy.


Zdetsis A. D., Bandgaps of Atomically precise AGNRs and Occam's Razor

1. Introduction.
Edge or end states in graphene nanoribbons (GNRs), and in particular in armchair GNRs (AGNRs), have attracted a great deal of interest lately [1-8] due to their anticipated magnetic properties [9,10], although their presence in finite nanographenes (NGRs) was predicted long ago [11]. However, the significance of end states for AGNRs was recognized only recently [1-8], after the pioneering bottom-up synthesis of atomically precise AGNRs of finite length L with short zigzag ends, and their characterization by scanning tunnelling microscopy (STM) and spectroscopy (STS) [3-16]. Clearly, no end states appear in the common infinite AGNRs fabricated by the usual top-down techniques, which are described theoretically by periodic boundary conditions at their two ends [1]. The new developments have brought to the forefront new concepts and properties such as the "bulk band gaps" ΔEac (or Δac [1,6]), i.e., the energy gaps between delocalized states, and the energy separation of the zigzag-end-localized "end states", denoted here by ΔEzz (or Δzz [6-8]), thus increasing both the quantity and the quality of key properties to be rationalized, understood, or interrelated at the atomic scale. At the same time, despite the increased complexity, such advances have also allowed the study of the L-dependence of key quantities such as the bandgaps [4-7] (both ΔEac and ΔEzz), conductivity, aromaticity [1,3,4], and even Raman spectra [17]. The L-dependence studies [4,5] revealed that the changes in such properties with length are not always gradual (or smooth). The presence of a metal-insulator-like phase transition at a critical length Lc was advocated almost simultaneously by two recent works, Lawrence et al. [8] and Zdetsis et al. [4]. However, the two works offered different assessments and interpretations of the nature of the transition and of the magnetism, as well as of the value of Lc [4,8]. This is not new or unusual in a rapidly growing pioneering field like this [1], and it is not the only existing "discrepancy". Other conflicting (or seemingly conflicting) results, experimental and theoretical, concern the magnitude and nature of the bandgaps [1,6,7,12,13], the existence and nature of magnetism in the edge states [1,3-7], and the magnitude of the substrate influence on these properties [1,5-7]. For example, the bandgap of the 5-AGNRs has been measured by (at least) three different groups [8,12,13] to be 0.85 eV [8], 2.8 eV [12], and 0.1 eV [13], respectively, while the theoretical values vary from 0.1 eV [1] to 1.7 eV [14]. For the 7-AGNRs the measured values of ΔEzz vary between 1.9 eV [6] and 2.5 eV [7], whereas the measured ΔEac values range from 2.3 eV to 3.2 eV [6,7,9,15], overlapping significantly with the range of ΔEzz. Thus, the unambiguous distinction between ΔEac and ΔEzz is another subtle point, together with the reconciliation of the measured and calculated ΔEac values, which also vary widely, from 2.3 eV to 3.7 eV [1,6,7,14]. Some of the different measured or calculated values correspond to AGNRs of different length, but in the literature the quoted values are usually given without reference to the actual length, which is thus treated as a hidden variable. However, the biggest problem seems to be the large difference between the measured values of the gap(s) and the "official" theoretical values obtained by the GW method [14], which are widely recognized as an almost universal point of reference.
Such large differences (almost 1.5 eV for the 7-AGNRs) between experimental and theoretical GW gaps (ΔEac) are usually attributed to screening by the metallic (Au) substrate, ΔEsb, even though identical gap values (within the experimental uncertainties) have been obtained for AGNRs grown on non-metallic substrates, such as NaCl [6] and MgO [7]. This is problematic, to say the least. As a result, there appear to be several conflicting results and interpretations, or highbrow "solutions", for the STS gaps, although the real solution could be much simpler (if not always obvious), as could be argued on the basis of Occam's principle. Along these lines, the present work aims at clarifying all these subtle points, including the confusion in distinguishing between the ΔEac and ΔEzz gaps.
Thus, the present work can be considered a positive synthesis of various conflicting views. Based on previous experience [1,3,19], such a synthesis is expected to prove successful and constructive, facilitating the accurate functionalization of AGNRs for realistic applications. As is demonstrated below, we can fully rationalize all known experimental data for the 5-, 7-, and 9-AGNRs to within less than 1%, predict gaps not yet measured, and at the same time pinpoint the sources of the discrepancies.

2. Theoretical framework.
For a consistent and transparent understanding and interpretation of the origin and magnitude of ΔEac and ΔEzz, as well as of the factors that influence their size, it is important to realize that practically all these quantities are dominated by the ("many-body") Coulomb correlation energy combined with sublattice frustration, which gives rise to the staggered sublattice potential [20] across the zigzag ends of finite-length AGNRs. In fact, the sublattice frustration, which is the driving force for the generation of the end/edge states, as we have illustrated earlier [2-4], constitutes the largest (or even the full) contribution to the Coulomb correlation energy. The understanding that most (or all) of the Coulomb correlation energy is devoted to counterbalancing the topological frustration between the sublattice and molecular symmetry groups is the starting (and key) point of the present investigation. This principle, together with the established (hidden) strong contributions of aromaticity and shell structure [2-5], constitutes the basis for a deeper understanding of all these quantities (ΔEac, ΔEzz, Δεζζ, and ΔEsb). Thus, if we can properly alleviate the sublattice-molecular group symmetry frustration (which is equivalent to the inversion/reflection symmetry conflict), under the natural constraints of shell structure and aromaticity, we can effectively account for (the largest part of) the Coulomb correlation energy. At the same time, this would explain why the open-shell states (singlet or triplet) are not the real lowest-energy states, but rather "pseudo-states" [21].

Calculation of ΔΕzz and ΔΕac from the one-body DFT calculations.
Within the one-electron approximation underlying the DFT and Hartree-Fock (HF) self-consistent fields, the symmetry frustration between the molecular (D2h) and sublattice (C2v) symmetry groups can be alleviated by effectively breaking (or redefining) the symmetry of the additional degrees of freedom (besides the spatial coordinates), i.e., the spin and/or the pseudospin (for real-space calculations). In the first case we can introduce non-zero spin values, preserving the molecular symmetry [3], whereas in the second case we are forced to break the molecular symmetry by introducing open-shell singlet states, which, when optimized geometrically, normally converge to C2v-symmetric geometries compatible with the sublattice symmetry. This occurs because the HOMO (and LUMO) orbitals of the open singlet are obtained, by construction, by mixing the HOMO and LUMO orbitals of the closed singlet. These orbitals have opposite parities, u or g (and opposite behaviour under the σy or σxz reflection plane). Thus, by losing the σy (or σxz) reflection plane of the D2h symmetry group, as is illustrated in Fig. 1(a), we can get a sublattice distribution with opposite sublattice points at the two ends (i.e., antisymmetric with respect to the y-axis, but symmetric with respect to the x-axis, which is the axis of the AGNR). This facilitates frontier molecular orbitals (HOMO, LUMO) localized at only one end (left or right) of the AGNR, producing an antisymmetric (pseudo)spin density, as is shown in Fig. 1(b), reflecting the sublattice symmetry and structure. Obviously, the reverse picture with the pseudospins interchanged is equally valid. On the other hand, the molecular D2h symmetry demands atoms of the same type (same sublattice) at the two ends, as shown in the lower part of Fig. 1(a), thus producing a (pseudo)spin distribution that is symmetric with respect to the y-axis (bottom of Fig. 1(b)).
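The localization argument above (mixing two frontier orbitals of opposite parity yields orbitals confined to one end) can be illustrated with a minimal toy sketch. This is not the DFT calculation of the present work: the one-dimensional chain, the exponential end tails, the equal-weight mixing, and the names `end_state_pair` and `left_weight` are all assumptions introduced purely for illustration.

```python
import math

def end_state_pair(n_sites, decay_length):
    """Toy even (g) and odd (u) orbitals built from two end-localized tails.

    Hypothetical model: each tail decays exponentially from one end of a
    1D chain of n_sites points.
    """
    left = [math.exp(-i / decay_length) for i in range(n_sites)]
    right = left[::-1]  # mirror image, localized at the opposite end
    g = [a + b for a, b in zip(left, right)]  # symmetric under reflection
    u = [a - b for a, b in zip(left, right)]  # antisymmetric under reflection
    return g, u

def normalize(orbital):
    """Scale the orbital so that the sum of squared amplitudes is 1."""
    norm = math.sqrt(sum(c * c for c in orbital))
    return [c / norm for c in orbital]

def left_weight(orbital):
    """Fraction of the orbital density on the left half of the chain."""
    density = [c * c for c in orbital]
    half = len(density) // 2
    return sum(density[:half]) / sum(density)

g, u = end_state_pair(n_sites=40, decay_length=3.0)
g, u = normalize(g), normalize(u)

# Equal-weight mixing of the opposite-parity orbitals (an assumption).
plus = normalize([a + b for a, b in zip(g, u)])
minus = normalize([a - b for a, b in zip(g, u)])

print("g left weight:    ", left_weight(g))
print("g+u left weight:  ", left_weight(plus))
print("g-u left weight:  ", left_weight(minus))
```

In this toy picture the parity-adapted orbitals share their density equally between the two ends, whereas each mixed orbital sits almost entirely at one end, which is the qualitative behaviour attributed above to the open-singlet HOMO and LUMO.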
In both cases the (pseudo)spin distribution is almost zero in the middle part of the AGNR. This is reproduced in the corresponding "spin" densities (b). Such spin densities, in Figs. [...] This is also responsible for the well-known 3n, 3n±1 width rule for AGNRs [3]. Note that the (pseudo)spin densities invariably reflect the sublattice (pseudospin) structure within the frustrated molecular (D2h) symmetry [3] in the first case, or the sublattice symmetry (C2v) in the latter (see also [...] wider AGNRs (where n>1 in the above rule for the width [3]), higher spin states are required [3] to lower the total energy (within the molecular D2h symmetry group). Such larger (pseudo)spin-polarized states better optimize the sublattice distribution (within the D2h molecular group) [3], whereas the open-shell singlets lie higher in energy and revert to the closed-singlet state. This illustrates emphatically that the open-singlet state is not the true ground (lowest-energy) state of AGNRs (and, consequently, no conventional magnetism is truly present). Nevertheless, the open-singlet state is still a very useful and efficient concept for the description of end states, as is illustrated below. It should be emphasized that in both cases of Fig. 1, when correlation is introduced, even at the MP2 level, the energetic ordering is reversed and the lowest-energy structure is a closed singlet [4]. In addition, the MP2 correlated "spin" density of the triplet, as we can see in Fig. [...] contributions through time-dependent DFT (TDDFT), which has been shown [1] to provide very good ("many-body") estimates of the gaps, so that the STS spectrum overall looks very much like the (luminous) optical spectrum, because both are dominated by molecular overlaps between transition states. This is further illustrated and "verified" by the results below.
Furthermore, the use of TDDFT allows the clear and unambiguous identification of the energy separation of the end/edge states, which according to the present investigation is not given by ΔEzz, as Wang et al. [6] have suggested, but by another type of gap, denoted here by Δεζζ. In the usual one-body approximation, Δεζζ corresponds to the HOMO-LUMO separation of the closed-singlet true ground state for L≥Lc, which is always only a few tenths of an eV (~0.1 eV for L→∞), in accord with the association of the end states with the Dirac points [3,4] (and charge neutrality points [4]) located "very close" to the Fermi level. TDDFT indeed verifies that, in contrast to Δεζζ, which involves a transition from one purely end-localized HOMO state to an opposite-parity end-localized LUMO state, the ΔEzz gap always involves transitions from a mixture (~60%/~40%) of "surface" and "bulk" states to another state with about the same amount of mixing. Thus, although ΔEzz involves a large contribution from localized end states, it should not be associated with the energy separation of the end states. Another way, besides TDDFT, to distinguish between "bulk" and "surface" energy gaps is by comparison with the corresponding "edge-modified" AGNRs [5], obtained by eliminating the "empty" (i.e., non-aromatic) end rings, which also eliminates the topological end states (and, therefore, ΔEzz and Δεζζ). [...] TDDFT. This can also help the correct identification (and nature) of the gaps.

Figure 2 summarizes the present results for the 5-AGNRs (or 2x AGNRs), which, as mentioned earlier, have also been studied by several groups [1,4,8,12,13,16]. [...] ground state both definitions are equivalent, but this is not the case. As we can see in Fig. 2(a), the "one-body" ΔEac=|(HOMO-1)-(LUMO+1)| gap, after the discontinuity (or transition) at L≈100 Å, which we have discussed in detail in a previous work [4], starts opening up at Lc, contrary to the "one-body" ΔEzz (i.e., the open-singlet HOMO-LUMO gap), which varies slowly and smoothly over the entire range of lengths. This is very strange indeed if ΔEzz is to represent the real separation of the edge states, since ΔEzz first appears at and after the transition at Lc. Such behaviour (smooth variation) would be better suited to ΔEac. This is indeed verified in Fig. 2(b), which shows the HOMO-LUMO gap of the "edge-modified" AGNRs, which seems to saturate at 1.22 eV, very close to the value of 1.25 eV suggested by the behaviour of the "normal" AGNRs in Fig. 2(a). The edge-modified AGNRs by construction have no edge states, and their HOMOs and LUMOs are delocalized over their entire length [4]; therefore their fundamental gap corresponds to ΔEac. Such edge-modified AGNRs are obtained by eliminating the empty (non-aromatic) end rings [5] of the standard AGNRs, which also eliminates the end states and the zigzag end bonds [5]. This is a clear manifestation of the importance of aromaticity for AGNRs (and graphene itself) [2,3]. Comparing the behaviour of the "bulk gap" in Figs. 2(a) and 2(b), we can see that, due to quantum confinement (both lateral and longitudinal), the (HOMO-1) and (LUMO+1) states defining the "one-body" ΔEac are also affected by the abrupt appearance of the edge states, in sharp contrast to the ("one-body") open-singlet gap, which seems to be practically insensitive to the appearance of the end states, contrary to what is expected from its original definition. This in fact emphasizes the "many-body" nature of the end states through their connection with the inversion symmetry conflict, which is further supported by Figs. 2(c) and S1. The "correct" behaviour (with length variation) of the "one-body" ΔEac is given by the (delocalized) HOMO-LUMO gap of the edge-modified AGNRs in Fig. 2(b). As we can see in Fig. 1(b) [...] [25] to fit the calculated ΔEac as a function of L efficiently and transparently [1,25] to a function of the form ΔEac(L) = A + B×L^(-C), where A corresponds to the gap at infinity, ΔEac(∞)=A, and the constant C to some sort of effective ("fractal") dimensionality (here equal to 1.20) [1,25]. As we can see in the inset of Fig. 2(b), the projected ΔEac value is 1.07 eV, which is also verified by the TDDFT result ΔEac=1.01 eV (see Fig. S1). The [...] However, further correct information is given in Fig. 2(c), showing the spectra of the 2x22 and 2x23 AGNRs immediately before and after the transition, respectively (see also Fig. 2(d)).

As is illustrated in Fig. 2(c), in the 2x23 AGNR (immediately after the transition) there is a strong peak at 0.87 eV, very close to the recently measured [8] STS gap of 0.85 eV. Detailed analysis of the TDDFT results shows that this peak consists to a large extent (about 60%) of transitions involving end states. Thus, the calculated value of 0.87 eV and the measured [8] gap should be assigned to ΔEzz. This, contrary to the "one-body" gap, restores the expected correct behaviour of ΔEzz at (and after) Lc. Even more interesting is the fact that extrapolation to longer AGNRs gives a gap of exactly 0.85 eV, in unexpectedly full agreement with experiment, as is shown in Fig. 2(d). Fig. 2(d) also shows that, contrary to the "one-body" (open-singlet) ΔEzz gap of Fig. 2(a), both "many-body" gaps, ΔEzz and Δεζζ (the latter corresponding to the "real" energetic separation of the end states), and the one-body HOMO-LUMO gap, all of which involve end states, change discontinuously at the critical length (~100 Å), where the Δεζζ and HOMO-LUMO gaps drop, while ΔEzz increases. Thus, the observed [8] gap opening (of about 0.30 eV) is due to the increased aromaticity at the critical length and the mixing of bulk and end states in almost equal amounts. Lawrence et al. [8] [...] interpretation [4]. Our present work reveals that the gap opening is a many-body effect related to the aromatic transition and the change from bulk-like (ΔEac) to coupled "surface-bulk" end states (ΔEzz). On the other hand, the calculated Δεζζ gap of 0.1 eV in Fig. 2(d) is in full agreement with the results of Kimouche et al. [13]. Thus, Kimouche et al. [13] and Lawrence et al. [8] have apparently ("correctly") measured different kinds of gaps. Moreover, the same could be true for the value of 2.8 eV measured by Zhang et al. [12], which could be tentatively assigned to ΔEac, either for very short AGNRs (without end states) or for longer AGNRs with a strong "bulk" transition from deep occupied states (well below the HOMO-1 orbital) to higher unoccupied states (well above the LUMO+1), and thus much larger than the real ΔEac (which is technically determined by the HOMO-1, LUMO+1 difference). We can also observe that in the 5-AGNRs the differences between the "one-body" and "many-body" (TDDFT) methods for assigning ΔEac, ΔEzz, and Δεζζ are relatively large (or even unusual) compared to the 7- and 9-AGNRs, discussed below, where the corresponding differences are of the order of 0.1-0.2 eV. This could be related to the fact that the 5-AGNRs (contrary to the 7- and 9-AGNRs) are topological and aromatic mixtures [3]. Thus, the three seemingly conflicting measurements [8,12,13] for the 5-AGNRs could be attributed to samples of different length (and/or different positions of the STS tip). Alternatively, one could claim, based on the GW results [14], that there is a substrate interaction of equal magnitude (0.85 eV) and that the "real gap" is 1.7 eV. Such a conclusion is considered here highly improbable, in view of the equally good (in fact better) agreement for the 7- and 9-AGNRs, not to mention Occam's principle. Moreover, if this is indeed a general trend, it clearly illustrates that elaborate correlation calculations (e.g., GW) could be avoided (see also ref. 23) if topological frustration is taken into account appropriately by simple DFT (one-particle) calculations, provided that the DFT functionals include "exact" exchange, which is sensitive to the inversion symmetry conflict [4].

Figure 3 summarizes the results for the three spin states (closed singlet, open singlet, and triplet) of the 7-AGNRs, and in particular the (7,12) or 3x6 AGNR. First of all, we can comment on the significance of the exact exchange in the DFT functional, which was discussed above.
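The length extrapolation ΔEac(L) = A + B×L^(-C) used above for the 5-AGNRs can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting procedure of the present work: the exponent is held fixed at the quoted value C = 1.20 so that the fit reduces to ordinary linear least squares in the variable L^(-C), the lengths are assumed to be in Å, and the data points are synthetic, generated from A = 1.07 eV (the projected value quoted in the text) and a hypothetical prefactor B = 30.

```python
def fit_gap_vs_length(lengths, gaps, exponent=1.20):
    """Least-squares fit of gap(L) = A + B * L**(-exponent).

    The exponent is kept fixed (a simplification), so the model is linear
    in x = L**(-exponent). Returns (A, B), where A is the gap for L -> infinity.
    """
    xs = [length ** (-exponent) for length in lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(gaps) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, gaps))
    b = sxy / sxx              # slope: prefactor B
    a = mean_y - b * mean_x    # intercept: gap at infinite length, A
    return a, b

# Synthetic, purely illustrative data (NOT values from the paper's figures).
A_TRUE, B_TRUE = 1.07, 30.0            # B_TRUE is a hypothetical prefactor
lengths = [20.0, 40.0, 60.0, 80.0, 120.0]   # assumed to be in angstrom
gaps = [A_TRUE + B_TRUE * L ** (-1.20) for L in lengths]

a_fit, b_fit = fit_gap_vs_length(lengths, gaps)
print("A (gap at infinite length, eV):", a_fit)
print("B (prefactor):", b_fit)
```

If C is to be treated as a free parameter instead, a nonlinear least-squares routine would be needed; the fixed-exponent form is used here only because it keeps the sketch dependency-free.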
The calculated DFT/PBE0 open-singlet ΔEzz gap is 1.9 eV, in full agreement with the measured [6] ΔEzz gap for the 3x6 (7,12) AGNR. In contrast, the ΔEzz gap calculated with the PBE functional, which does not include "exact" exchange, is less than half this value (~0.5 eV, in agreement with the PBE calculations of Wang et al. [6]). As we can see in Figs. [...] whereas ΔEzz is due to mixed transitions involving both "end" and "bulk" states. This is verified by Fig. 3(d), which shows the excitation spectrum of the closed-singlet state of the normal 3x6 (7,12) AGNR, in which there are two characteristic maxima at 1.9 eV and 3.2 eV, which practically coincide with the measured ΔEzz and ΔEac values, respectively, for this AGNR [6]. As we can see in the left part of Fig. 3(d), ΔEac involves transitions between (mixtures of) "bulk states" (from [...]). This is also supported by the TDDFT results in Fig. 3(d) showing the spectrum of the edge-modified closed singlet, in which the 1.9 eV peak is totally absent, whereas the peak of the "bulk" gap ΔEac is present and identical to the 3.2 eV peak of the normal (7,12) AGNR. The position of the ΔEac peak, contrary to that of ΔEzz, decreases as the length increases. Thus, for the 3x14 (7,28) AGNR we found a ΔEac value of 2.8 eV, as is shown in Fig. S2(a). This value of 2.8 eV, as could be expected, is in perfect agreement with the calculated GW value [6] and the experimental measurements for the (7, 24-28) AGNR(s) on an insulating NaCl substrate [6].

3.3 9-AGNRs.
We can observe in Fig. S2(c) that the overall spectrum of the 4x6 AGNR, which has the same length as the 3x6 AGNR, looks at first sight very much like that of the 3x6 AGNR, except for a suppression of the Δεζζ peak. Clearly, a (deep) "bulk" gap could be expected not to vary much with, or be very sensitive to, the exact width of the AGNR; but for the peak around 2.0 eV, which up to now was associated with the ΔEzz gap of the 3x-AGNRs, further investigation is needed, which is described in Fig. 4.
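Two labelling schemes for the same ribbon appear side by side in the text, for example 3x6 and (7,12), 3x14 and (7,28), or 4x18 and (9,36), with the 5-AGNRs written as 2x. A small helper can make the correspondence explicit. The conversion rule below (width index N = 2×Z + 1 and length index 2×M for a ZxM ribbon) is inferred only from these examples and should be read as an assumption, as should the function names.

```python
def rings_to_agnr(rings_wide, rings_long):
    """Convert a 'ZxM' label (e.g. 3x6) to an '(N, length)' label (e.g. (7, 12)).

    Rule inferred from the examples quoted in the text (an assumption):
    N = 2 * Z + 1 and length index = 2 * M.
    """
    if rings_wide < 1 or rings_long < 1:
        raise ValueError("both ring counts must be positive")
    return 2 * rings_wide + 1, 2 * rings_long

def agnr_to_rings(n_width, n_length):
    """Inverse conversion, e.g. (7, 12) -> (3, 6)."""
    if n_width < 3 or n_width % 2 == 0:
        raise ValueError("width index must be an odd number >= 3 under this rule")
    if n_length < 2 or n_length % 2 != 0:
        raise ValueError("length index must be a positive even number under this rule")
    return (n_width - 1) // 2, n_length // 2

for label in [(3, 6), (3, 14), (4, 18)]:
    print(f"{label[0]}x{label[1]} -> {rings_to_agnr(*label)}")
```

Under this inferred rule the 2x, 3x, and 4x families correspond to the 5-, 7-, and 9-AGNRs, respectively, which matches the usage throughout the text.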
This is verified in Fig. 4(d), which shows that the 2.1 eV "bulk" peak (together with the "deeper" 3.2 eV "bulk" peak) survives the elimination (total and partial) of the empty (non-aromatic) end rings, which generates the edge-modified AGNRs (without end states and ΔEzz). As is well known, this "bulk" peak value decreases as the length of the AGNR increases. For the 4x13 AGNR we find ΔEac=1.6 eV, but for the longer 4x18 (9,36) and 4x24 (

4. Conclusions.
We have achieved excellent agreement (within 1% or less) with the measured STS gaps ("bulk" and "surface") for the known 5-, 7-, and 9-AGNRs, although the "surface" gaps, as is illustrated in Table 1 [...]
e) The measured [6] ΔEzz gap of 1.9 eV for the 7-AGNRs, (7,12) and longer, is clearly identical to the ΔEzz gap of 1.9 eV calculated here (with both DFT and TDDFT/PBE0).
f) For the 9-AGNRs the only measurement of the gap known to the present author [16] is 1.4 eV. The present calculations (TDDFT/PBE0) yield a ΔEac value of 1.45 eV, and also predict ΔEzz=2.1±0.1 eV, quite close to the corresponding gap for the 7-AGNRs.
At the same time, the present work has provided a simple physical understanding/rationalization of the origin and properties of these gaps. We have shown that such excellent agreement can be obtained by a transparent approach, using a minimum of computational resources and avoiding high-level many-body methods such as the advanced GW approach [14]. This is accomplished by [...] give accurate results, especially when augmented by TDDFT calculations, which can further refine the results, provided that the chosen DFT functional includes the "exact" exchange (such as the PBE0 functional [23] used here, proven to provide excellent results [1-5,24]), and that the finite length of the AGNRs is taken into account (recall the synopsis of the theoretical approach in section 2.5). Under the same provisions (finite size of the AGNRs and "exact" exchange), the GW approach would also give the correct results, as is illustrated in ref. 6, where taking into account the finite size of the 7-AGNRs lowered the GW gap by about 1 eV, bringing it into very good agreement with the measured STS value. As a result, a similarly large overestimation of the expected substrate screening would be avoided, since the GW results [14] are widely used as reference values for the free-standing AGNRs. This is corroborated by STS measurements of AGNRs on non-metallic substrates [6,7]. Thus, the measured STS gaps are practically independent of the substrate and virtually equal to the free-standing values obtained by any of the three computational methods, DFT, TDDFT, and GW (from the simplest to the most complex), provided that the finite size and the "exact" exchange are taken into account. Obviously, the simplest (and computationally most economical) approach should normally be preferred, in accord also with Occam's principle. A combination of DFT and TDDFT, as used here, should be considered ideal.
Additional supplementary material with more details and comparisons for the spectra of 5-, 7-, and 9-AGNRs is given in the Supplementary Information.