Duality and Mock Modularity

We derive a holomorphic anomaly equation for the Vafa-Witten partition function for twisted four-dimensional $\mathcal{N} =4$ super Yang-Mills theory on $\mathbb{CP}^{2}$ for the gauge group $SO(3)$ from the path integral of the effective theory on the Coulomb branch. The holomorphic kernel of this equation, which receives contributions only from the instantons, is not modular but `mock modular'. The partition function has correct modular properties expected from $S$-duality only after including the anomalous nonholomorphic boundary contributions from anti-instantons. Using M-theory duality, we relate this phenomenon to the holomorphic anomaly of the elliptic genus of a two-dimensional noncompact sigma model and compute it independently in two dimensions. The anomaly both in four and in two dimensions can be traced to a topological term in the effective action of six-dimensional (2,0) theory on the tensor branch. We consider generalizations to other manifolds and other gauge groups to show that mock modularity is generic and essential for exhibiting duality when the relevant field space is noncompact.


Introduction
The hypothesis of S-duality asserts that $\mathcal{N}=4$ super Yang-Mills theory is invariant under the action of a large duality group ($SL(2,\mathbb{Z})$ or a close relative, depending on the four-dimensional gauge group $G$) acting on $\tau \equiv \tau_1 + i\tau_2 = \theta/2\pi + 4\pi i/g^2$; here $g$ and $\theta$ are the gauge coupling and theta angle, and $\tau_1$ and $\tau_2$ denote the real and imaginary parts of $\tau$ throughout this paper. But S-duality is hard to test, because computations at strong coupling are difficult. One way to circumvent this difficulty is to consider a topologically twisted version of the theory in which localization can be used to perform computations at strong coupling.
In this paper, we will consider one particular twisting, originally studied in the present context in [1]. With this twisting, a formal argument shows that the partition function on a compact four-manifold $X$ is holomorphic in $\tau$ or equivalently in $q = \exp(2\pi i\tau)$. Furthermore, if a certain curvature condition (eqn. (2.58) in [1]) is satisfied, the evaluation of the path integral can formally be argued to localize on the contribution of ordinary Yang-Mills instantons. (Without this curvature condition, one localizes on the solutions of a more complicated system of equations.) The contribution to the path integral from the component of field space with instanton number$^1$ $n$ is then $a_n q^n$, where $a_n$ is the Euler characteristic of the instanton number $n$ moduli space $\mathcal{M}_n$. Thus the partition function after summing over bundles of all values of the instanton number is expected to be
$$Z = \sum_n a_n q^n. \tag{1.1}$$
The relevant curvature condition is highly restrictive, but there are a number of four-manifolds that satisfy this condition and for which computations of the $a_n$ were available in the mathematical literature [2,3]. In particular, two important examples are a K3 surface and $\mathbb{CP}^2$. For K3, the expectations were borne out; the function $\sum_n a_n q^n$ is a holomorphic modular function. What happens for $\mathbb{CP}^2$ is more complicated. There is a natural modular function $Z(\tau,\bar\tau)$ whose holomorphic part is $\sum_n a_n q^n$, but this function is not holomorphic. It has a "holomorphic anomaly" which for $SO(3)$ bundles with $w_2 = 0$ reads$^2$
$$\frac{\partial Z}{\partial\bar\tau} = \frac{3}{16\pi i}\,\tau_2^{-3/2}\,\eta(\tau)^{-3}\sum_{n\in\mathbb{Z}} \bar q^{\,n^2}. \tag{1.2}$$
(There is a second such formula, which we consider later, for bundles with nonzero $w_2$.) A microscopic explanation of the failure of holomorphy was not provided in [1]. However, it was noted that the right hand side of eqn. (1.2), and also its analog with $w_2 \neq 0$, looks like it could come from a sum over abelian anti-instantons on the Coulomb branch. On the Coulomb branch, the gauge group is broken from $SO(3)$ (or $SU(2)$) to $U(1)$, and $\bar q^{\,n^2}$ can be interpreted as the exponential of the classical action for a $U(1)$ anti-instanton of flux $n$. On this interpretation, the origin of the factor $\frac{3}{16\pi i}\tau_2^{-3/2}$ is not immediately apparent. It is anyway not clear why we should be summing over anti-instantons in a theory that formally can be argued to localize on instantons.
$^1$ Here $n$ is an integer for a simply-connected gauge group such as $G = SU(2)$, but may have a fractional part if $G$ is not simply-connected. The fractional part is determined by a two-dimensional cohomology class (for example, by the second Stiefel-Whitney class $w_2$ if $G = SO(3)$), and in the partition function $\sum_n a_n q^n$, it is natural to sum over all bundles keeping this class fixed. The values of $n$ in the sum are then congruent to each other mod $\mathbb{Z}$. A restriction on $w_2$ (and its analog for other groups) is assumed in eqn. (1.1) and other similar formulas in this paper.
$^2$ Up to a factor of 2, this formula also holds for gauge group $SU(2)$.
Subsequent developments have made it clear that the holomorphic anomaly must indeed come from the Coulomb branch and more specifically from a surface term at infinity on the Coulomb branch. One development involves Donaldson theory of four-manifolds or more precisely its interpretation in terms of $\mathcal{N}=2$ super Yang-Mills theory. A formal argument shows that certain correlation functions in a twisted version of the $\mathcal{N}=2$ theory depend only on the smooth structure of a four-manifold $X$ and not on its Riemannian metric $g$. These correlators are expected to coincide with the Donaldson invariants. From a mathematical point of view [4], the Donaldson invariants are true invariants for $b_2^+ > 1$ but for $b_2^+ = 1$, they instead have a chamber structure: they are generically invariant under a small change in the metric $g$, but they jump when one crosses certain "walls" in the space of metrics. This phenomenon is analogous to wall-crossing for BPS states in various supersymmetric models. The wall-crossing phenomenon was studied in [5] from a gauge theory point of view and was found to originate from a surface term at infinity on the Coulomb branch.$^3$ In other words, the formal proof that certain correlation functions are independent of $g$ involves integration by parts in field space. Upon "localization," the proof requires integration by parts on the Coulomb branch of the theory, and there is a possibility of a surface term at infinity. Such a surface term arises for $b_2^+ = 1$ and accounts for wall crossing.
Going back to $\mathcal{N}=4$, the formal proof of holomorphy of the twisted theory again involves integration by parts, so it is reasonable to ask if again there may be a surface term at infinity on the Coulomb branch that accounts for the holomorphic anomaly. Indeed, it was pointed out in [1] that $\mathbb{CP}^2$ has $b_2^+ = 1$ (while K3 has $b_2^+ > 1$) and it was suggested that a holomorphic anomaly would arise on any four-manifold with $b_2^+ = 1$. The goal of the present paper is to demonstrate this, by performing the appropriate analog of the computation in [5].
Before saying more about this, we pause to explain a dual version of the problem. A two-dimensional supersymmetric field theory of a rather general type has a natural invariant, the elliptic genus. It is defined by a path integral on a torus $T^2$. Fields in the sigma-model are taken to be periodic functions on the torus up to possible twists by symmetries; the twists are chosen so that the supercurrent associated to one of the supersymmetries is invariant, but otherwise one allows arbitrary twists. For a sigma-model with a compact target space (or for any supersymmetric field theory with a discrete spectrum), the elliptic genus is a holomorphic function of the modular parameter $\tau$. However, for a sigma-model with a noncompact target space, the elliptic genus can have a holomorphic anomaly [6][7][8][9][10][11][12][13][14][15]. The elliptic genus defined by a path integral on the torus is then still modular invariant, but it is no longer holomorphic. In this situation, the elliptic genus becomes, in modern language, a mock modular function rather than an ordinary holomorphic modular function.
$^3$ For $b_2^+ > 1$, there are enough fermion zero-modes on the Coulomb branch to prevent wall crossing behavior, so the Donaldson invariants are true topological invariants. There has been very little study of the case $b_2^+ = 0$, partly because nonzero Donaldson invariants for gauge group $SU(2)$ or $SO(3)$ (the groups most studied) can only arise if $b_1 + b_2^+$ is odd, so that if $b_2^+ = 0$, $b_1$ must be nonzero. But most of the interest in four-manifold theory is in simply-connected four-manifolds, which necessarily have $b_1 = 0$. However, the case $b_2^+ = 0$ certainly merits more study.
In a sigma-model with target W , supersymmetric localization reduces the computation of the elliptic genus to an integral over the space of constant maps from T 2 to W . This space of constant maps plays the role of the Coulomb branch in the gauge theory. The space of such constant maps is a copy of W . The proof of holomorphy involves an integration by parts on W , and the anomaly in holomorphy comes from a surface term at infinity. For sigma-models, this has been studied in a variety of ways in the literature. The derivation in [14], with a direct calculation of the holomorphic anomaly in terms of the behavior at infinity in W , will be particularly useful as background to our computation.
The two arenas for a holomorphic anomaly that we have mentioned -gauge theory in four dimensions and supersymmetric field theory in two dimensions -can be dual, for the following reason. S-duality in four dimensions is believed to be intimately connected with the existence and properties of a certain superconformal field theory in six dimensions, the (2, 0) model. In particular, the (2, 0) theory on Euclidean six manifold M = T 2 × X, in the limit that the area of the T 2 is very small, keeping fixed its complex structure, is expected to reduce to N = 4 super Yang-Mills theory on X, with the τ parameter of the gauge theory simply equal to the τ parameter that determines the complex structure of T 2 . The twisting of the N = 4 theory on X that is under discussion here can be "lifted" to a twisting of the (2, 0) model on T 2 × X. Formally, the partition function of this twisted version of the theory should not depend on the metric of X and should depend holomorphically on τ . But we will be exploring a possible anomaly in this holomorphy.
We will study the (2, 0) theory on T 2 × X in either of two limits. If the T 2 is very small compared to X, then as already stated, we reduce to gauge theory on X. We will call this the gauge theory region. In the opposite limit that X is very small compared to T 2 , we reduce to a supersymmetric (possibly superconformal) field theory on T 2 . We will call this the sigma-model region, since the supersymmetric model in question can be described as a sigma-model in the asymptotic region of field space that is important for the holomorphic anomaly (something as simple as this is not expected in the interior). Because the area of the T 2 does not matter in the topologically twisted theory, we must get the same holomorphic anomaly whether we compute in the sigma-model region or the gauge theory region.
In the gauge theory region, we will exhibit the holomorphic anomaly by a computation somewhat analogous to that in [5], and in the sigma-model region, we will exhibit the same anomaly by a calculation somewhat along the lines of [14]. However, a computation using only the lowest order terms in the effective action on the Coulomb branch of the gauge theory or the target space of the sigma-model will not show the holomorphic anomaly. In that approximation, the holomorphic anomaly vanishes. It is necessary to include a certain correction in the effective action. In understanding wall crossing in $\mathcal{N}=2$ super Yang-Mills theory, the 1-loop quantum correction to the classical metric on the Coulomb branch plays an important role. For $\mathcal{N}=4$, there is no such quantum correction to the metric of the Coulomb branch, but at the 1-loop level, a half-BPS correction to the effective action on the Coulomb branch is generated [16]. By exploiting holomorphy or a relation to anomalies, it can be shown that the coefficient of this interaction is 1-loop exact. It turns out that this interaction has the right properties to generate the expected holomorphic anomaly in the gauge theory approach.
The six-dimensional (2, 0) model on its Coulomb branch likewise has a half-BPS coupling, first described in [17,18], that reduces after $T^2$ compactification to the half-BPS interaction of the $\mathcal{N}=4$ super Yang-Mills theory that was already mentioned. This interaction also has a precisely known coefficient. We will show that this 6d coupling, after twisted compactification on $X$, has just the right properties to generate the expected holomorphic anomaly in the sigma-model approach.
The computations of the holomorphic anomaly both in the gauge theory region and the sigma model region yield the same result on the right hand side of (1.2), but the various factors have different origins in the two regions.
• The factor of 3 is related to the first Chern class of the canonical line bundle of $\mathbb{CP}^2$ in the gauge theory region and to the quantum of $H$-flux in the sigma-model region.
• The factor of $\tau_2^{-3/2}$ comes from the integral over the constant mode of the auxiliary field in the gauge theory region and from the integral over three non-compact bosonic zero-modes in the sigma model region.
• The factor of $\eta(\tau)^{-3} = \eta(\tau)^{-\chi(\mathbb{CP}^2)}$ is the contribution of point-like instantons in the gauge theory region and of the left-moving oscillators in the sigma model region.
• Finally, the anti-holomorphic theta-function $\sum_{n\in\mathbb{Z}} \bar q^{\,n^2}$ is a contribution of abelian anti-instantons in the gauge theory region and of right-moving momenta of a compact chiral boson in the sigma model region.
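The two $q$-series appearing among these factors can be expanded explicitly. The following is a minimal sketch, not part of the paper's computation: the function names are hypothetical, the overall power $q^{-1/8}$ of $\eta(\tau)^{-3}$ is stripped off, and the theta series is expanded in a formal variable standing for $\bar q$.

```python
def eta_inverse_cubed_coeffs(order):
    """Coefficients c_0..c_order of prod_{n>=1} (1 - q^n)^(-3).

    eta(tau)^(-3) equals q^(-1/8) times this power series.
    """
    coeffs = [0] * (order + 1)
    coeffs[0] = 1
    for n in range(1, order + 1):
        # Multiply three times by the geometric series 1/(1 - q^n).
        for _ in range(3):
            for k in range(n, order + 1):
                coeffs[k] += coeffs[k - n]
    return coeffs


def theta_coeffs(order):
    """Coefficients of sum_{n in Z} x^(n^2) up to x^order."""
    coeffs = [0] * (order + 1)
    n = 0
    while n * n <= order:
        # n and -n give the same power, except for n = 0.
        coeffs[n * n] += 1 if n == 0 else 2
        n += 1
    return coeffs


print(eta_inverse_cubed_coeffs(5))  # [1, 3, 9, 22, 51, 108]
print(theta_coeffs(9))              # [1, 2, 0, 0, 2, 0, 0, 0, 0, 2]
```

The first list counts point-like configurations weighted as partitions into three "colors", in line with the exponent $\chi(\mathbb{CP}^2) = 3$; the second records that each nonzero flux $n$ contributes together with $-n$.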
In one important respect, our sigma model calculation in two dimensions is more complete than our corresponding gauge theory calculation. In the sigma model, we will have to do a path integral on a two-torus. Such a path integral can be interpreted as a Hilbert space trace, and this determines its absolute normalization. By contrast, in gauge theory we will be doing a Coulomb branch calculation on a general four-manifold $X$. Such a path integral does not have a natural normalization; it can be affected, for example, by topological terms proportional to the Euler characteristic and the signature of $X$. To determine the absolute normalization of the Coulomb branch path integral, we would have to start in the ultraviolet with conventions that lead to a holomorphic expansion of the precise form (1.1), and then deduce the resulting normalizations on the Coulomb branch. We will not attempt to do that.
Now we mention some previous and current work on related problems. Mock modularity arising from Coulomb branch integrals in gauge theories with $\mathcal{N}=2$ supersymmetry has been systematically explored in [19], extending previous calculations that had been done by more special methods [5,[20][21][22][23][24]. Moreover, in forthcoming work, Manschot and Moore have analyzed the Coulomb branch integral and the associated mock modularity in the $\mathcal{N}=2^*$ theory, which is closely related to the $\mathcal{N}=4$ super Yang-Mills theory that we study in the present paper. Their calculation might lead to a way to resolve the normalization issue mentioned in the last paragraph.
We now comment on the relation between the holomorphic anomaly and mock modularity. The naive holomorphic partition function of the twisted SO(3) super Yang-Mills theory on CP 2 is the holomorphic kernel of the anomaly equation (1.2) which receives contributions only from the instantons. It is holomorphic but not modular. The presence of the holomorphic anomaly implies that the physical partition function necessarily contains a nonholomorphic piece given by an Eichler integral of the anomaly [25] which receives contributions from the anti-instantons. In modern terminology [26][27][28], the holomorphic piece is a (vector valued) 'mixed mock modular form' whereas the anomaly is governed by its 'shadow'. The physical partition function satisfying the anomaly equation is the 'modular completion' and has good modular properties, as expected from duality.
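To make this terminology concrete, here is a schematic sketch of how the anomaly equation fits the standard definition of a mock modular form. It assumes that the right hand side of (1.2) is the product of the four factors listed in the introduction. The symbols $\hat f$, $g$, $k$, $\theta$ and the constant $c_k$ are introduced here for illustration only, and the value of $c_k$ depends on conventions that vary in the literature.

```latex
% Multiply the anomaly equation by the holomorphic function \eta(\tau)^3:
\hat f(\tau,\bar\tau) := \eta(\tau)^{3}\, Z(\tau,\bar\tau), \qquad
\frac{\partial \hat f}{\partial\bar\tau}
  = \frac{3}{16\pi i}\,\tau_2^{-3/2}\,\overline{\theta(\tau)}, \qquad
\theta(\tau) := \sum_{n\in\mathbb{Z}} q^{n^2}.

% Compare with the defining property of the completion \hat f of a
% mock modular form of weight k with shadow g (c_k convention dependent):
\tau_2^{\,k}\,\frac{\partial \hat f}{\partial\bar\tau} = c_k\,\overline{g(\tau)}
\quad\Longrightarrow\quad
k = \tfrac{3}{2}, \qquad g \propto \theta .
```

In this reading the holomorphic part of $\hat f$ is mock modular of weight $3/2$ with shadow proportional to $\theta$, which has weight $1/2 = 2 - k$, and the partition function is "mixed" because it carries the additional modular factor $\eta(\tau)^{-3}$.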
These considerations extend naturally to other Kähler 4-manifolds with $b_2^+ = 1$, $b_1 = 0$ and to other groups. In general, when the configuration space of the twisted theory is noncompact, the partition function is modular but not holomorphic, and satisfies a holomorphic anomaly equation similar to (1.2). This incompatibility between holomorphy and modularity is the essence of mock modularity. The physical requirement of duality invariance of the path integral thus leads naturally to the mathematical formalism of mock modularity whenever the relevant configuration space is noncompact.
The structure of the paper is as follows. In Section 2 we review relevant facts about topological twists of $\mathcal{N}=4$ SYM theory, their M-theory realizations, and some generalities about the holomorphic anomaly. In Section 3 we derive the holomorphic anomaly equation for the $SO(3)$ super Yang-Mills theory on $\mathbb{CP}^2$ by a computation in the gauge theory region. In Section 4 we rederive the anomaly in the corresponding sigma model region. The nonholomorphic contributions in both regions can be seen to originate from a topological term in the six-dimensional world volume theory of the Euclidean M5-brane described in Section 4.1. In Section 5 we present generalizations to other manifolds and other gauge groups.

Twisting and Topological Field Theory
The contents of this section are as follows. In Section 2.1, we review the general notion of a topologically twisted theory. In Section 2.2 we review all three possible twists of N = 4 4d super Yang-Mills theory. In Section 2.3 and Section 2.4 we comment on the geometric interpretation of the twists, including their realization in M-theory. In Section 2.5 we give a general discussion on the origin of the holomorphic anomaly in topologically twisted theories.

Generalities
Here we briefly recall how N = 4 super Yang-Mills theory can be "topologically twisted" to make what formally is a topological field theory. We say "formally" because, as we will discuss, the proof of topological invariance always relies on integration by parts in field space, which can generate a surface term under some circumstances. For more details on some of the following, see for example [1].
Twisting is specified by a choice of homomorphism $\rho: Spin(4) \to SU(4)_R$; we write $Spin(4)'$ for the image of $Spin(4)$ under the embedding $(1\times\rho)$ into $Spin(4)\times SU(4)_R$. One picks $\rho$ so that the representation $(\mathbf{2},\mathbf{1},\mathbf{4}) \oplus (\mathbf{1},\mathbf{2},\bar{\mathbf{4}})$ of $Spin(4)\times SU(4)_R$ contains at least one $Spin(4)'$ singlet. Let us denote as $Q$ a supercharge of $\mathcal{N}=4$ super Yang-Mills theory that is such a singlet. It will always obey $Q^2 = 0$. The reason is that, more generally, if $Q$ is any linear combination of the global supercharges of $\mathcal{N}=4$ super Yang-Mills theory, its square will be a linear combination of the translation generators,$^5$ and these (as they commute with $SU(4)_R$) transform the same way under $Spin(4)'$ as under $Spin(4)$. In particular, no nonzero translation generator is $Spin(4)'$-invariant, so, if $Q$ is $Spin(4)'$-invariant, $Q^2$ must vanish. If there are multiple $Spin(4)'$-invariant supercharges $Q$ and $Q'$, this argument shows that
$$Q^2 = Q'^2 = \{Q, Q'\} = 0.$$
The basic idea of making a twisted topological field theory is now to view $Q$ as a BRST-like operator: we only consider operators and states that are $Q$-invariant, and we consider an operator $\mathcal{O}$ to be trivial if it is a $Q$-commutator, $\mathcal{O} = \{Q, \mathcal{O}'\}$ for some $\mathcal{O}'$, and a state $\Psi$ to be trivial if it is $Q$-exact, $\Psi = Q\Lambda$ for some $\Lambda$. Because $Q$ generates a symmetry of the path integral and $Q^2 = 0$, adding $Q$-exact terms to the operators or states will not affect the expectation values of $Q$-closed operators in $Q$-closed states. Specializing to $Q$-closed operators and states will lead to a topological field theory because in all cases (i.e. for every choice of $\rho$), the stress tensor $T_{\mu\nu}$, which measures the response of the theory to an infinitesimal change in a background metric, is $Q$-exact,
$$T_{\mu\nu} = \{Q, \Lambda_{\mu\nu}\}, \tag{2.1}$$
where $\Lambda_{\mu\nu}$ is a linear combination of components of the supercurrent of the theory. Equation (2.1) is a special case of the usual commutation relation of the supercharges and supercurrents, written in a way that is natural in the twisted theory. So far we have considered the theory on $\mathbb{R}^4$.
It turns out that it is possible to formulate the twisted theory on a rather general four-manifold $X$, preserving the $Q$ symmetry, as long as one imposes on $X$ a mild condition that depends on $\rho$ and is detailed below. In fact, the generalization to a curved four-manifold $X$ can be made in a way that preserves all of the $Spin(4)'$-invariant supercharges, and the fact that any linear combination of them squares to zero.
With the goal of formulating the theory on a general manifold, we view $Spin(4)'$ as the rotation symmetry group, so in general fields do not have the same spin they have under the original rotation group $Spin(4)$. We use the $Spin(4)'$ quantum numbers in coupling the fields of the $\mathcal{N}=4$ theory to a background curved metric on any manifold $X$. This defines the coupling to a curved background modulo some nonminimal terms (explicit coupling to the Riemann tensor of $X$) which in some cases are needed to preserve the $Spin(4)'$-invariant supersymmetries.
The coupling to the background curved metric can be made in such a way that eqn. (2.1) still holds, that is, the stress tensor remains $Q$-exact. Formally, this means we get a topological field theory: as long as we consider only $Q$-invariant operators and their matrix elements between $Q$-invariant states, the expectation value or matrix element of $\{Q, \Lambda_{\mu\nu}\}$ will vanish because of $Q$-invariance of the path integral, and hence the response of the theory to a change in the background metric of $X$ will vanish. However, as noted in the introduction, this step needs to be treated with care. The claim that $\langle\{Q, V\}\rangle = 0$ for any operator $V$ is ultimately based on integration by parts in field space. An anomaly might come from a surface term at infinity. For example, in Donaldson theory of four-manifolds, viewed as twisted $\mathcal{N}=2$ super Yang-Mills theory, the relevant integral reduces to an integral on the Coulomb branch, and one does find an anomaly, a surface term at infinity on the Coulomb branch, that spoils topological invariance in the case $b_2^+(X) = 1$ [5]. As explained in the introduction, in the present paper we will find that a somewhat similar anomaly spoils holomorphy in twisted $\mathcal{N}=4$ super Yang-Mills, again if $b_2^+(X) = 1$.
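The formal argument can be written out schematically as follows. This is a sketch under assumed conventions rather than a formula taken from the text: the normalization $T_{\mu\nu} = \frac{2}{\sqrt{g}}\frac{\delta S}{\delta g^{\mu\nu}}$, the weight $e^{-S}$, the field-space symbols $\Phi$ and $\mathcal{D}\Phi$, and the assumption that the bosonic operator $\mathcal{O}$ is $Q$-closed and metric independent are all introduced here for illustration.

```latex
\delta_g \langle \mathcal{O} \rangle
  = -\tfrac{1}{2} \int_X d^4x \sqrt{g}\;\delta g^{\mu\nu}\,
      \langle T_{\mu\nu}\,\mathcal{O} \rangle
  = -\tfrac{1}{2} \int_X d^4x \sqrt{g}\;\delta g^{\mu\nu}\,
      \langle \{Q, \Lambda_{\mu\nu}\,\mathcal{O}\} \rangle ,
\qquad [Q, \mathcal{O}] = 0 ,
\\[4pt]
\langle \{Q, V\} \rangle
  = \int \mathcal{D}\Phi\;\delta_Q\!\left( V\, e^{-S} \right)
  = \text{(surface term at infinity in field space)} .
```

The second line is a total derivative in field space. It vanishes when field space is effectively compact, and it is the only place where a failure of topological invariance, or of holomorphy, can enter.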
(B) The choice of $\rho$ useful in applications to the geometric Langlands program [29] is such that the $\mathbf{4}$ of $SU(4)_R$ transforms under $Spin(4)$ as $(\mathbf{2},\mathbf{1})\oplus(\mathbf{1},\mathbf{2})$. Thus $Spin(4)'$ commutes with a residual $U(1)_R$ subgroup of $SU(4)_R$, which we normalize so that the $(\mathbf{2},\mathbf{1})$ and $(\mathbf{1},\mathbf{2})$ respectively have charges $1$ and $-1$. The supercharges of the theory transform under $Spin(4)'\times U(1)_R$ as
$$2\,(\mathbf{1},\mathbf{1})_{1} \oplus (\mathbf{3},\mathbf{1})_{1} \oplus (\mathbf{1},\mathbf{3})_{1} \oplus 2\,(\mathbf{2},\mathbf{2})_{-1},$$
where the subscript is the $U(1)_R$ charge, and a 2 in front means that a representation appears twice. In particular, there are two $Spin(4)'$ singlets. They transform the same way (both with charge 1) under the unbroken global symmetry $U(1)_R$. We can choose $Q$ to be a linear combination $\alpha Q_1 + \beta Q_2$ of the two $Spin(4)'$ singlets $Q_1$ and $Q_2$. The resulting family of topological field theories depends in a nontrivial fashion on the parameter $t = \alpha/\beta$, which plays an important role in the application to geometric Langlands.
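The counting of singlets in case (B) can be checked with a few lines of representation bookkeeping. The sketch below is illustrative only. It assumes that the supercharges transform as $(\mathbf{2},\mathbf{1},\mathbf{4})\oplus(\mathbf{1},\mathbf{2},\bar{\mathbf{4}})$ and that the $\bar{\mathbf{4}}$ carries $U(1)_R$ charges opposite to those of the $\mathbf{4}$; the helper names and the encoding of a representation as a `Counter` of `(dim_left, dim_right, charge)` labels are hypothetical.

```python
from collections import Counter


def su2_tensor(a, b):
    """Dimensions of the SU(2) irreps in the product of irreps of dimension a and b."""
    return list(range(abs(a - b) + 1, a + b, 2))


def tensor(rep1, rep2):
    """Tensor product of representations of SU(2) x SU(2) x U(1)."""
    out = Counter()
    for (l1, r1, c1), m1 in rep1.items():
        for (l2, r2, c2), m2 in rep2.items():
            for l in su2_tensor(l1, l2):
                for r in su2_tensor(r1, r2):
                    out[(l, r, c1 + c2)] += m1 * m2
    return out


def dimension(rep):
    """Total dimension, counting multiplicities."""
    return sum(l * r * m for (l, r, _), m in rep.items())


# Assumed conventions: how the 4 and 4-bar of SU(4)_R decompose after the twist.
FOUR = Counter({(2, 1, 1): 1, (1, 2, -1): 1})
FOUR_BAR = Counter({(2, 1, -1): 1, (1, 2, 1): 1})
# The two spinor chiralities of the untwisted rotation group, with zero charge.
LEFT = Counter({(2, 1, 0): 1})
RIGHT = Counter({(1, 2, 0): 1})

supercharges = tensor(LEFT, FOUR) + tensor(RIGHT, FOUR_BAR)
singlets = {key: m for key, m in supercharges.items() if key[:2] == (1, 1)}

print(dimension(supercharges))  # 16
print(singlets)                 # {(1, 1, 1): 2}
```

Under these assumptions the sixteen supercharges contain exactly two singlets, both of charge 1, in agreement with the statement above.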
(C) The last case is that the 4 of SU (4) R transforms under Spin(4) as (2, 1)⊕2(1, 1). Thus Spin(4) commutes with a residual U (2) R subgroup of SU (4) R , under which the two copies of (1, 1) transform as a doublet. The global supersymmetries transform under SU where the subscript is the charge under the center of U (2) R . In particular, there is up to scaling a unique Spin(4) singlet global supercharge Q that we can use to make a topological field theory.
In each of the three cases, it is straightforward to determine how the fields of $\mathcal{N}=4$ super Yang-Mills theory transform under $Spin(4)'$. In particular, let us look at the adjoint-valued scalar fields $\phi$ of the theory. Before twisting, they transform in the $\mathbf{6}$ of $SU(4)_R$. By examining how they transform in the twisted theory, we will find what condition must be placed on $X$ so that the twisted topological field theory can be defined on $X$. (No additional condition comes from the gauge field $A$, as it is an $SU(4)_R$ singlet and so is not affected by twisting; and no additional condition comes from the fermions, essentially because they are related to the bosons by the action of $Q$ and so transform the same way under $Spin(4)'$.)
(A′) With our first choice of $\rho$, the six scalars transform under $SU(2)_\ell' \times SU(2)_r \times SU(2)_R$ as $(\mathbf{3},\mathbf{1},\mathbf{1}) \oplus (\mathbf{1},\mathbf{1},\mathbf{3})$. In particular, this representation is not invariant under exchange of $SU(2)_\ell'$ and $SU(2)_r$, so the theory with this twist does not have a parity or reflection symmetry, and $X$ must be oriented. However, the twisted scalars have integer spin, as do the fermions after twisting, so there is no need for $X$ to have a spin structure. That is why we can take $X = \mathbb{CP}^2$, the case in which a holomorphic anomaly was found in [1].
(B′) With the second choice of $\rho$, the six scalars transform under $Spin(4)' \times U(1)_R$ as $(\mathbf{2},\mathbf{2})_0 \oplus (\mathbf{1},\mathbf{1})_2 \oplus (\mathbf{1},\mathbf{1})_{-2}$. This is a reflection symmetric representation, so $X$ need not be oriented. (If $X$ is unorientable, the parameter $t$ discussed earlier is no longer arbitrary, since orientation-reversal acts nontrivially on this parameter.) The twisted scalars have integer spin, so $X$ again does not require a spin structure and we can study this theory on $\mathbb{CP}^2$.

Geometrical Realization
In what follows, it will be useful to be familiar with a geometrical realization [30] of the three twisted theories. First let us consider a realization by D3-branes in Type IIB superstring theory. We can realize $\mathcal{N}=4$ super Yang-Mills, with gauge group $U(N)$, by wrapping $N$ D3-branes on $X$. Of course, Type IIB superstring theory is naturally defined on a 10-dimensional spacetime $Y$, so $X$ will have a rank 6 normal bundle $W$ in $Y$. The scalars in the super Yang-Mills multiplet describe normal oscillations of the D-branes, so they are valued in $W$ tensored with the adjoint representation of the gauge group $G$. Since we have determined how the scalars transform under the symmetries, we can read off what must be the normal bundle to $X$ in $Y$:
(A″) With our first choice of $\rho$, three scalars transform under $Spin(4)'$ as $(\mathbf{3},\mathbf{1})$. This is the appropriate representation for a selfdual second rank tensor or two-form. So one summand in $W$ is the bundle $\Omega^2_+(X)$ of selfdual two-forms on $X$. The other scalars are $Spin(4)'$ singlets in the vector representation of $SU(2)_R$. The upshot of this is that we can take $Y$ to be $\Omega^2_+(X)\times\mathbb{R}^3$, where here by $\Omega^2_+(X)$ we mean the total space of the rank three vector bundle $\Omega^2_+(X) \to X$, and $\mathbb{R}^3$ is a copy of three-dimensional Euclidean space, with $SU(2)_R$ as its group of rotations. $X$ is embedded in $\Omega^2_+(X)\times\mathbb{R}^3$ as the zero-section of $\Omega^2_+(X)$, times a point in $\mathbb{R}^3$, which we can choose to be the origin.
For some favorable choices of $X$, such as $\mathbb{CP}^2$ or $S^4$, $\Omega^2_+(X)$ admits a complete metric of $G_2$ holonomy, such that the zero-section is a "coassociative" (or supersymmetric) submanifold. This puts the supersymmetry of the twisted model in a standard framework. For more generic $X$, there presumably is no nice complete metric of $G_2$ holonomy on $\Omega^2_+(X)$, but we can think of $\Omega^2_+(X)$ as carrying a $G_2$ structure near the zero-section. That is sufficient for purposes of gauge theory.
(B″) With the second choice of $\rho$, four scalars transform under $Spin(4)'$ as $(\mathbf{2},\mathbf{2})$. This is the representation that corresponds to the tangent or cotangent bundle of $X$. Once $X$ is given a Riemannian metric, the two are equivalent; it will be more natural in what follows to think in terms of the cotangent bundle $T^*X$. The other two scalars are $Spin(4)'$ singlets but charged under $U(1)_R$. The geometrical picture is that $Y = T^*X\times\mathbb{R}^2$, where $X$ is embedded as the zero-section of $T^*X$ times a point in $\mathbb{R}^2$, and $U(1)_R$ acts as the rotation group of $\mathbb{R}^2$.
In a few favorable cases, such as $X = \mathbb{CP}^2$, $T^*X$ carries a complete Calabi-Yau metric, such that the zero-section is a Lagrangian submanifold. This puts the supersymmetry of the twisted model in a standard framework. Even when that is not so, $T^*X$ is a symplectic manifold, and a brane supported on its zero section is a Lagrangian brane in the A-model of $T^*X$, still giving a standard framework for the supersymmetry of the twisted model. (We expect that the analog of this for cases A and C is that $\Omega^2_+(X)$ or $S_+(X)$ will always carry a possibly unintegrable $G_2$ structure or $Spin(7)$ structure, and that this suffices for topological applications. This point of view has not been studied systematically.)
(C″) With the third choice of $\rho$, four scalars transform under $SU(2)_\ell' \times SU(2)_r \times U(2)_R$ as $(\mathbf{2},\mathbf{1},\mathbf{2})_0$, where the subscript refers to the charge under the $U(1)_R$ center of $U(2)_R$. The other two transform as $(\mathbf{1},\mathbf{1},\mathbf{1})_{\pm 2}$. The geometrical meaning is as follows. Let $S_+(X)$ be the positive chirality spin bundle of $X$, viewed as a real vector bundle of rank 4. Then $Y$ can be identified as $S_+(X)\times\mathbb{R}^2$, where $X$ is embedded as the zero-section of $S_+(X)$ times a point in $\mathbb{R}^2$. $SU(2)_R$ acts on the fiber of $S_+(X)\to X$, commuting with its structure group $Spin(4)'$, and $U(1)_R$ acts on $\mathbb{R}^2$ by rotations.
In a few favorable cases, $S_+(X)$ carries a complete metric of $Spin(7)$ holonomy, with the zero section as a Cayley submanifold (the $Spin(7)$ analog of a coassociative submanifold), providing a standard framework for the supersymmetry of the twisted model. Even when such a complete metric does not exist, a $Spin(7)$ structure near the zero section provides a sufficient description for our application to gauge theory.

M-Theory Variant
Instead of considering D3-branes in Type IIB superstring theory, we can consider M5-branes in M-theory. Here we use the fact that M-theory on T 2 × Z, for a two-torus T 2 and any Z, goes over, in the limit that the T 2 is small, to Type IIB on S 1 × Z. In this process, an M5-brane on T 2 × X (where X is any submanifold of Z) goes over to a D3-brane on X times a point in S 1 . Since strings or branes wrapped on S 1 will not be important in anything we say, we can here decompactify S 1 and replace it by a copy of R. The complex structure of the torus, τ , plays the role of the complex coupling constant in 4d.
The upshot of this is that the geometrical descriptions that were described earlier have M-theory variants:$^6$
(A‴) In the first example, we can consider M-theory on $T^2\times\Omega^2_+(X)\times\mathbb{R}^2$ with M5-branes wrapped on $T^2\times X$ times a point in $\mathbb{R}^2$.
(B‴) In the second example, we can consider M-theory on $T^2\times T^*X\times\mathbb{R}$ with M5-branes wrapped on $T^2\times X$ times a point in $\mathbb{R}$.
(C‴) In the third example, we can consider M-theory on $T^2\times S_+(X)\times\mathbb{R}$ with M5-branes wrapped on $T^2\times X$ times a point in $\mathbb{R}$.
In each of these cases, we have the option to make the T 2 larger or smaller than X. In the limit that T 2 is very small, we return to the Type IIB description via D3-branes wrapped on X. This in turn can be described in terms of the four-dimensional twisted versions of N = 4 super Yang-Mills theory, as described above. In the opposite limit that X is very small compared to T 2 , we get a description in terms of a conformal field theory on T 2 . We will explore both limits in this paper.
Parallel M5-branes, which we have used in this explanation, give a particular realization of the (2, 0) superconformal field theory in six dimensions. Instead of talking about M5-branes wrapped on $T^2\times X$, we could more generally talk about the (2, 0) model on $T^2\times X$. This formulation is more general as it encompasses all groups of A-D-E type.
Consider in more detail the twist of type (A). The (uncompactified) 6d theory has $Spin(5)_R\times Spin(6)$ global symmetry, where the first factor is the R-symmetry and the second factor describes (local) rotations of the 6d spacetime. When the 6d theory is put on a spacetime of the form $X\times\Sigma_2$, where $X$ is an oriented 4-manifold and $\Sigma_2$ is a Riemann surface, the second factor naturally breaks into $Spin(2)\times Spin(4)\subset Spin(6)$. The two factors correspond to local rotations on $\Sigma_2$ and $X$ respectively. The topological twist is then realized by identifying the $SU(2)_r$ factor of $Spin(4)$ with a $Spin(3)_R$ subgroup of $Spin(5)_R$.
In the M-theory setting, the 6d type $A_1$ theory describes the dynamics of a stack of 2 M5-branes, with the center of mass degrees of freedom removed. Then 6d spacetime rotations and the R-symmetry can both be embedded into the group of 11d rotations, $Spin(6)\times Spin(5)_R\subset Spin(11)$, where $Spin(5)_R$ corresponds to rotations in the directions orthogonal to the worldvolume of the 5-branes. The topological twist can then be realized by the following geometric background in M-theory: the M5-branes are wrapped on $\Sigma_2\times X$ inside $\Sigma_2\times\Omega^2_+(X)\times\mathbb{R}^2$, where $\Omega^2_+(X)$ is the total space of the rank 3 vector bundle of self-dual 2-forms over $X$. This construction follows from the fact that self-dual antisymmetric rank 2 tensors of $SO(4)\equiv Spin(4)/\mathbb{Z}_2$ transform as a triplet of $SU(2)_r\subset Spin(4)$. After the topological twist, $SU(2)_r$ is identified with the $Spin(3)_R\subset Spin(5)_R$ subgroup of the R-symmetry that corresponds to rotations of the fibers of the normal bundle to the worldvolume of the 5-branes. The total space $\Omega^2_+(X)$ is a local $G_2$-manifold and $X$ is a coassociative cycle.
When $X$ is Kähler, as in the case of $X = \mathbb{CP}^2$, its holonomy is reduced to $U(2) \subset SO(4) \equiv Spin(4)/\mathbb{Z}_2$. In particular, $SU(2)_r$ is reduced to its maximal torus $U(1)_r \subset SU(2)_r$. After the topological twist, this maximal torus is identified with the subgroup $U(1)_R \equiv Spin(2)_R \subset Spin(5)_R$ embedded in the standard way (i.e. as the subgroup corresponding to rotations among 2 out of the 5 normal directions). The three-dimensional real representation of $Spin(3)_R$ decomposes into a complex 1-dimensional representation of $U(1)_R$ of charge 2 plus a trivial 1-dimensional real representation. Geometrically, this corresponds to the splitting of the rank 3 real vector bundle into a trivial real rank 1 bundle and a rank 1 complex vector bundle: $\Omega^+_2(X) = \mathbb{R} \times K_X$, where $K_X := \Lambda^2 T^*_{\mathbb{C}} X$ is the canonical bundle of $X$, considered as a complex manifold. The total space $K_X$ of this canonical bundle is a local Calabi-Yau 3-fold.

Some Background Concerning the Anomaly
In cases A and C, one finds that if $S_{4d}$ is the action of the theory, then the antiholomorphic dependence of $S_{4d}$ on the gauge theory coupling parameter $\tau$ is $Q$-exact, meaning that $\partial S_{4d}/\partial\bar\tau = \{Q, \Lambda\}$ for some functional $\Lambda$. (The details of case B are more complicated and depend on the choice of the parameter $t$.) Formally, it follows that as long as we only discuss $Q$-invariant operators and states, all computations in cases A or C will give results holomorphic in $\tau$. In fact, as already mentioned, the proof of decoupling of $Q$-exact operators depends on integration by parts, and there is a possibility of an anomaly coming from a surface term at infinity. As explained in the introduction, explicitly known formulas indicate that there is such a holomorphic anomaly in case A provided that $b_2^+(X) = 1$. In the present paper, we will aim to elucidate this anomaly.
We will primarily aim to understand the case of gauge group $SU(2)$ or $SO(3)$, which can be realized by a system of two D3-branes or two M5-branes after removing the center of mass degree of freedom. A possible anomaly will come from the Coulomb branch, which in the geometric description is the region in which the two branes are widely separated in the untwisted directions. For example, for D3-branes on $\Omega^+_2(X) \times \mathbb{R}^3$, $T^*X \times \mathbb{R}^2$, or $S^+(X) \times \mathbb{R}^2$, the Coulomb branch is the region in which the D3-branes are widely separated in $\mathbb{R}^3$, $\mathbb{R}^2$, or $\mathbb{R}^2$, respectively. The overall center of mass motion of the D3-brane system is described by a free system. The Coulomb branch that we are interested in parametrizes the relative motion between the two D3-branes. So we can effectively describe the Coulomb branch asymptotically in terms of the degrees of freedom of a single D3-brane wrapped on $X$ but near infinity in the second factor of $\Omega^+_2(X) \times \mathbb{R}^3/\mathbb{Z}_2$, $T^*X \times \mathbb{R}^2/\mathbb{Z}_2$, or $S^+(X) \times \mathbb{R}^2/\mathbb{Z}_2$. (Here, in describing the relative motion of the two D3-branes, we divide by $\mathbb{Z}_2$ to account for the Weyl group that exchanges the two D3-branes.) In the M-theory description, the story is similar. The Coulomb branch for the relative motion of a pair of M5-branes can be described in terms of a single M5-brane wrapped on $T^2 \times X$ times a point near infinity in the last factor of $T^2 \times \Omega^+_2(X) \times \mathbb{R}^2/\mathbb{Z}_2$, $T^2 \times T^*X \times \mathbb{R}/\mathbb{Z}_2$, or $T^2 \times S^+(X) \times \mathbb{R}/\mathbb{Z}_2$. We will find the anomaly by a careful study of the effective field theory on the Coulomb branch. In doing this, as remarked in the introduction, it is necessary to include in the theory on the Coulomb branch a certain half-BPS interaction that was originally described in [16] in the context of supersymmetric Yang-Mills theory, or its M-theory analog, which was originally described in [17,18].
Formally, in any of the three twisted theories under discussion, one can calculate by a procedure of supersymmetric localization. Formally, one can show that modulo Q-exact terms, the path integral of any of the three twisted theories can be localized on configurations that satisfy {Q, χ} = 0, where χ can be any of the fermion fields of the theory. The equations {Q, χ} = 0 become a system of elliptic differential equations modulo the gauge group for bosonic fields in the theory. In theories A and B, these equations have been discussed in detail in [1] and [29], respectively. In theory C, the localization equations are similar to the "monopole" equations studied in [31]. Actually, the details of this localization procedure will not be our focus in the present paper; what we will be interested in here is precisely how this localization can fail. Since localization holds modulo Q-exact terms, the localization procedure will give an incomplete result if there is an anomaly at infinity on the Coulomb branch. The anomaly that we will explore violates the predictions of the formal localization procedure which implies holomorphy.

Holomorphic Anomaly in Four Dimensions
We begin in Section 3.1 by recalling a few facts about the holomorphic anomaly encountered in [1] in the computation of the twisted partition function of supersymmetric Yang-Mills theory with gauge group $SO(3)$ on $\mathbb{CP}^2$. Our goal is to better understand how the anomaly relates to mock modularity and eventually to the noncompactness of the Coulomb branch. In Section 3.2 we review the relevant facts about the effective action of the four-dimensional $\mathcal{N} = 4$ theory on the Coulomb branch. In Section 3.3 we derive the holomorphic anomaly from the path integral of the effective four-dimensional theory.

Mock Modularity of CP 2 Partition Function
Since $H^2(\mathbb{CP}^2, \mathbb{Z}_2) \cong \mathbb{Z}_2$, an $SO(3)$ bundle on $\mathbb{CP}^2$ has two possible values of the second Stiefel-Whitney class $w_2$. Accordingly, there are two partition functions, which we denote by $Z_0(\tau)$ and $Z_1(\tau)$. It was shown in [1] using the work of Klyachko and Yoshioka [2,3]  With these definitions, the partition functions of the topologically twisted supersymmetric $SO(3)$ Yang-Mills theory on $\mathbb{CP}^2$ [1] are given by where $\eta(\tau)$ is the Dedekind eta function and with $\tau = \tau_1 + i\tau_2$ and which equals the complementary error function up to normalization. Note that in [1,3] there is a factor of $\eta(\tau)^6$ in the denominator instead of $\eta(\tau)^3$ as in (3.1). The generating functions considered there are for compactified moduli spaces of anti-self-dual $U(2)$ connections with a fixed value of the first Chern class. This is because those formulas were obtained in an algebro-geometric setting, where the moduli spaces are realized as the moduli spaces of stable rank two sheaves on $\mathbb{CP}^2$ with fixed first and second Chern classes. The extra $1/\eta(\tau)^3 = 1/\eta(\tau)^{\chi(\mathbb{CP}^2)}$ compared to the $SU(2)$ case can be understood as the contribution of the abelian point-like instantons from the diagonal $U(1)$ subgroup of $U(2)$. The formulas (3.2) are consistent with the fact that, by lifting the theory to the 6d (2,0) theory on $\mathbb{CP}^2 \times T^2$, one can argue that the physically normalized partition functions $Z_v(\tau)$ should transform as a vector-valued modular form of weight zero, with a multiplier system determined by 't Hooft anomalies. The case of $U(2)$ gauge group will be discussed further in Appendix B.
The functions { h (τ )} are not purely holomorphic because of the second term in (3.3) and satisfy the holomorphic anomaly equation: transforms as a vector valued modular form with weight 3/2 under the modular group Γ 0 (4). In particular, Since h 0 is invariant under T and h 1 under T 4 , h 0 has the expected modular transformation law under the group Γ 0 (4) generated by T and ST 4 S, as expected from S-duality [1].
In modern terminology [26][27][28], $h(\tau)$ is a vector-valued pure mock modular form with holomorphic shadow $g(\tau)$ with components where $c = \frac{1}{16\pi i}$ is the overall normalization, for which there is no standard convention. The vector $h(\tau)$ is holomorphic but not modular. Addition of the correction terms in (3.3) constructed from the shadow vector $g(\tau)$ yields its modular completion $\widehat{h}(\tau)$. The completion is modular but not holomorphic. This incompatibility between modularity and holomorphy is the essence of mock modularity.
The connection to mock modularity can be seen more simply by combining the two components of the vector-valued mock modular form into a single mock modular form defined by $H(\tau) := h_0(4\tau) + h_1(4\tau)$. This gives the Zagier mock modular form [25], which is the generating function for the Hurwitz-Kronecker class numbers $H(N)$:

$$H(\tau) = \sum_{N \ge 0} H(N)\, q^N = -\frac{1}{12} + \frac{1}{3} q^3 + \frac{1}{2} q^4 + q^7 + q^8 + q^{11} + \dots$$

It is a 'pure mock modular form' of weight 3/2 on $\Gamma_0(4)$ with shadow the classical theta function $\vartheta(\tau) = \sum_{n \in \mathbb{Z}} q^{n^2}$. See [27,28] for the definition and further discussion. We see from the $q$-expansion of $H(\tau)$ that it has no poles at $q = 0$ and hence is strongly holomorphic. Consequently, its Fourier coefficients grow very slowly. This is exceptional. In fact, up to minor variations, the Zagier mock modular form is essentially the only known non-trivial example of a strongly holomorphic pure mock modular form.
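The class numbers entering this generating function can be checked directly. The following is a minimal sketch, assuming the standard definition of $H(N)$ as the weighted count of reduced positive definite binary quadratic forms $ax^2 + bxy + cy^2$ of discriminant $b^2 - 4ac = -N$, with forms proportional to $x^2 + y^2$ and $x^2 + xy + y^2$ weighted by 1/2 and 1/3, and with the convention $H(0) = -1/12$. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def hurwitz_class_number(N: int) -> Fraction:
    """Hurwitz-Kronecker class number H(N), with the convention H(0) = -1/12."""
    if N == 0:
        return Fraction(-1, 12)
    if N < 0 or N % 4 in (1, 2):
        return Fraction(0)  # -N is not the discriminant of any form
    total = Fraction(0)
    a = 1
    # A reduced form (a, b, c) has |b| <= a <= c, hence 3 a^2 <= 4ac - b^2 = N.
    while 3 * a * a <= N:
        for b in range(-a + 1, a + 1):  # reduced forms have -a < b <= a
            numerator = b * b + N
            if numerator % (4 * a) != 0:
                continue  # c would not be an integer
            c = numerator // (4 * a)
            if c < a or (c == a and b < 0):
                continue  # not reduced
            if b == 0 and a == c:
                total += Fraction(1, 2)  # multiple of x^2 + y^2
            elif a == b == c:
                total += Fraction(1, 3)  # multiple of x^2 + xy + y^2
            else:
                total += 1
        a += 1
    return total


def zagier_coefficients(order: int) -> list:
    """Coefficients of q^0, ..., q^(order-1) in the sum of H(N) q^N."""
    return [hurwitz_class_number(N) for N in range(order)]
```

Calling `zagier_coefficients(12)` gives the leading terms of the $q$-expansion; the nonzero entries sit only at $N \equiv 0, 3 \pmod 4$, which is what allows the series to be split into two components.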
The components $\{h_v\}$ are most naturally regarded as the vector of theta coefficients of a mock Jacobi form $H(\tau, z)$ defined by where $y = e^{2\pi i z}$ and are the level $m$ theta functions. Following the definition in [28], one can check that $H(\tau, z)$ is a mock Jacobi form of weight 2 and index 1 with holomorphic anomaly (up to normalization) Note that a similar relation between the components $\{h_v\}$ and a single mock Jacobi form appears in the decomposition of the Vafa-Witten $U(2)$ partition functions into a sum of products of $SO(3)$ and $U(1)$ partition functions. However, the latter decomposition is different because the coefficients in that case are anti-holomorphic theta functions. We will discuss it in more detail in Appendix B.

Wess-Zumino Term in Four Dimensions
As explained in the introduction, the holomorphic anomaly of interest is given by a contribution from the boundary of the space of the bosonic zero-modes. Saturation of fermionic zero-modes in this region is governed by the two-fermion superpartner of a Wess-Zumino term which we describe below and in Appendix A.
Consider a four-dimensional N = 4 super Yang-Mills theory with gauge group G spontaneously broken to H × U (1) by the vacuum expectation value of the scalar field Φ valued in R 6 .
The bosonic part of the effective action, after coupling to background gauge fields of $SO(6)_R$, contains the following Wess-Zumino term [18,32,33]: where $\Phi^I$, $I = 0, \dots, 5$ are the six scalar fields of the unbroken $U(1)$, $A$ is the background $SO(6)_R$ connection and $F$ is its curvature. As usual, the integral is over a five-dimensional manifold $\Xi_5$ which has $X$ as its boundary. The factor of $i$ is due to the fact that we work in Euclidean spacetime (here and in the rest of the paper we use the convention in which the integrand of the path integral is $e^{-S}$). This term compensates for the deficit in the 't Hooft anomaly of the $SU(4)_R$ R-symmetry. We can label fields such that in the description via M5-branes wrapped on $\mathbb{R}^4 \times T^2$, $\Phi^0$ is compact and $\Phi^I$, $1 \le I \le 5$ are noncompact fields that correspond to oscillations of the M5-branes in the transverse $\mathbb{R}^5$. Only the $Spin(5)_R$ symmetry acting on the transverse oscillations is manifest in this description; it is enhanced to $Spin(6)_R$ in the limit of small $T^2$ as $\Phi^0$ decompactifies. The relation between the topological terms in six and four dimensions will be discussed in more detail in Section 4.1.
Geometrically, $\eta_5 = \hat\Phi^*(e_5)$ is the pull-back of the global Euler angular form $e_5$ on the total space of an $S^5$ bundle $E \to \Xi_5$ to the base space $\Xi_5$ by the section $\hat\Phi : \Xi_5 \to E$. In general, for an $S^n$ sphere bundle $\pi : E \to \Xi$, the fiber can be thought of as a sphere in the fiber $\mathbb{R}^{n+1}$ of a real vector bundle $V$. One can define the global Euler angular form $e_n \in \Omega^n(E)$ with the following properties. Its restriction to the fiber gives the volume form $\omega_n$ normalized such that $\int_{S^n} \omega_n = 1$. Moreover, $de_n = -\pi^*(e(V))$, where $e(V)$ is the standard representative of the Euler class of the bundle $V \to \Xi$. For odd $n$, the form $e_n$ is not closed in general when the bundle is non-trivial, as in the case above for $n = 5$. For even $n$, the Euler density form $e(V)$ is identically zero and $e_n$ is closed; this fact will be used in Section 4 for $n = 4$. Its de Rham cohomology class satisfies $[e_n]^2 = \pi^*(p_{n/2}(V))$, where $p_{n/2}(V)$ is the $n/2$-th Pontryagin class of $V$. See, for example, [34] for details and explicit general formulas for $e_n$.
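For a trivial bundle, the normalization $\int_{S^n} \omega_n = 1$ amounts to dividing the round volume form of the unit sphere by its total volume. As a small numerical aid, the sketch below evaluates the standard formula $\mathrm{Vol}(S^n) = 2\pi^{(n+1)/2}/\Gamma((n+1)/2)$ for the spheres relevant here ($S^3$, $S^4$, $S^5$). The function name is illustrative, and only the unit round metric is assumed.

```python
import math


def sphere_volume(n: int) -> float:
    """Volume of the unit round sphere S^n embedded in R^(n+1)."""
    return 2.0 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


# Normalization factors 1 / Vol(S^n) that make the round volume form integrate to 1.
normalizations = {n: 1.0 / sphere_volume(n) for n in (3, 4, 5)}
```

In particular `sphere_volume(3)` reproduces the value $2\pi^2$ for the three-sphere at infinity in the space of four scalar zero-modes.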
As shown in [35], the Wess-Zumino term is part of the N = 4 completion of the Dine-Seiberg term [16]. Restricted to the N = 2 vector multiplet, the Dine-Seiberg term is given by a logarithmic prepotential. However, to compute the holomorphic anomaly, we will need not only the vector multiplet couplings but also the couplings involving both the N = 2 vector multiplet and the hypermultiplets including the auxiliary fields. These are related by supersymmetry to the second term in (3.17) which is linear in the R-symmetry curvature F . The supersymmetrization is therefore more elaborate than what is available in the literature and will be discussed in Appendix A.

Holomorphic Anomaly
To compute the holomorphic anomaly from the gauge theory path integral, we first write the antiholomorphic dependence of the full four-dimensional effective action $S_{4d}$ as

$$\frac{\partial S_{4d}}{\partial \bar\tau} = \{Q, \Lambda\}. \qquad (3.18)$$

A formula like (3.18) holds both microscopically and (therefore) also in the Coulomb branch effective field theory. We want this formula in the Coulomb branch effective field theory, in which we will do the localization computation, and in a formalism in which the scalar supercharges used in the localization are realized off-shell. An off-shell realization of the scalar supercharges gives a precise framework for the localization computation. The off-shell realization of the scalar supercharges in the topologically twisted $\mathcal{N} = 4$ theory was considered in [1,36,37].
The effective action on the Coulomb branch is the sum of a quadratic action, a half-BPS coupling that is the supersymmetric completion of the Wess-Zumino coupling of eqn. (3.16), and various couplings of higher order that are in large supermultiplets. These higher order couplings have no BPS properties, and we do not expect them to be relevant. The Wess-Zumino coupling does not depend on $\tau$ or $\bar\tau$, and, in a formalism in which the scalar supercharges are realized off-shell, the same is true for its completion that is invariant under those supercharges. So the half-BPS coupling on the Coulomb branch will not contribute to $\Lambda$. Thus, we can evaluate $\Lambda$ using just the quadratic part $S^{\rm free}_{4d}$ of the Coulomb branch effective action $S_{4d}$: To determine the anomaly, we will be interested in the integral over the zero-modes of the path integral near the boundary. The zero-modes can be readily determined. In the twisted theory on $\mathbb{CP}^2$, the six real scalars of the untwisted theory turn into four real scalars and a complex section of the canonical bundle. Since there are no harmonic sections of the canonical bundle over $\mathbb{CP}^2$, there are only four real bosonic zero-modes, corresponding to the four scalars. Since $b_1 = 0$, the gauge field in the Coulomb branch effective action has no zero-modes. Finally, the auxiliary fields in the twisted theory are a self-dual 2-form $H^{(2+)}$ and a 1-form $\tilde H^{(1)}$. Since $b_1 = 0$ and $b_2^+ = 1$, one obtains a single zero-mode of the auxiliary field, which we discuss in more detail in Appendix A. The fermions in the twisted theory consist of two scalars, two self-dual 2-forms and two vectors. Since $b_1 = 0$ and $b_2^+ = 1$, one obtains four fermionic zero-modes.
In summary, there are four bosonic zero-modes of the four scalars, a single zero-mode of the bosonic auxiliary field, and four fermionic zero-modes. It may seem unusual in a supersymmetric theory that we have 1 extra bosonic degree of freedom, but we have to divide by the volume of the gauge group and hence the (compact) zero-mode of the gauge parameter counts as −1 bosonic degrees of freedom.
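The counting above can be organized as simple bookkeeping in terms of topological data of $X$. The sketch below is illustrative only: it assumes a connected four-manifold (so each scalar field contributes one constant mode), takes the number of holomorphic sections of the canonical bundle as an input, and counts the compact gauge-parameter zero-mode as $-1$, as described in the text. The function and key names are hypothetical.

```python
def twisted_zero_modes(b1: int, b2_plus: int, canonical_sections: int) -> dict:
    """Count zero-modes of the twisted abelian multiplet on a connected 4-manifold.

    b1, b2_plus:        Betti numbers b_1 and b_2^+ of X
    canonical_sections: number of (complex) sections of the canonical bundle
    """
    scalars = 4 + 2 * canonical_sections  # four real scalars + complex section of K_X
    gauge_field = b1                      # flat directions of the gauge field
    auxiliary = b2_plus + b1              # self-dual 2-form and 1-form auxiliary fields
    fermions = 2 * 1 + 2 * b2_plus + 2 * b1  # two scalars, two 2-forms, two vectors
    gauge_parameter = -1                  # division by the volume of the gauge group
    return {
        "scalars": scalars,
        "gauge_field": gauge_field,
        "auxiliary": auxiliary,
        "fermions": fermions,
        "net_bosonic": scalars + gauge_field + auxiliary + gauge_parameter,
    }


# CP^2: b_1 = 0, b_2^+ = 1, and the canonical bundle has no sections.
cp2 = twisted_zero_modes(b1=0, b2_plus=1, canonical_sections=0)
```

For $\mathbb{CP}^2$ this returns four scalar zero-modes, one auxiliary zero-mode and four fermionic zero-modes, with the net bosonic count equal to the fermionic one.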
Since $\mathbb{CP}^2$ is Kähler, a $Spin(4) = SU(2) \times SU(2)$ subgroup of the $SU(4)_R$ R-symmetry remains unbroken after twisting. There are four scalar supercharges $Q_A$ and $\bar Q_{\dot A}$, transforming as $(2, 1) \oplus (1, 2)$ respectively. The four fermion zero-modes transform likewise, and we denote them by $\chi_A$ and $\bar\chi_{\dot A}$. The four bosonic zero-modes transform as $(2, 2)$ and we denote them by $u_{A\dot A}$ with the reality conditions $(u_{A\dot A})^* = \epsilon^{AB} \epsilon^{\dot A \dot B} u_{B\dot B}$. The zero-mode of the auxiliary field transforms as a scalar and we denote it by $H$.
Restricted to the zero-modes, the four off-shell supercharges described in Appendix A, in particular in (A.31), take the form where zm stands for the zero-modes. The effective action restricted to the zero-modes can have at most four fermions. The most general action consistent with the unbroken Spin(4) R symmetry takes the form where |u| 2 := u AȦ u AȦ . The coefficients depend also on H and the flux of the gauge field n = CP 1 F A /2π (which for the gauge group SO(3) is valued in 1 2 Z), but we do not show this dependence explicitly. Demanding Q A S| zm = QȦS| zm = 0, one obtains The action is thus completely determined in terms of a single function G(|u| 2 ). Since we are interested in the action at the boundary corresponding to |u| 2 large, we consider the expansion The function C 0 (H, n) is determined by the free part of the original action and hence is a homogeneous polynomial of degree two in H and n. The higher C k (H, n) for k > 0 are determined by the interacting part of the action related by supersymmetry to the Wess-Zumino term. The Wess-Zumino term is scale invariant if all scalar fields scale with weight one. Hence, the C k (H, n) are homogeneous polynomials of degree 2k in H and n. The contribution to the integral over the zero-modes from the boundary at infinity is determined by only the first two terms in (3.23) because the subsequent terms fall off rapidly for large |u|.
The function $C_0(H, n)$ is determined in Appendix A and is given by The function $C_1(H, n)$ is a homogeneous polynomial of degree two and hence there are only three possible terms. The condition that $K_{A\dot B}$ does not have $H$ in the denominator implies that $C_1(H, n)$ has no term constant in $H$. We note that according to (3.22), for $G(|u|^2) = 1/|u|^2$ we obtain $R(|u|^2) = 0$, which puts no constraint on the term linear in $H$. In summary, $C_1(H, n)$ takes the general form for some numerical constants $a$ and $b$ that need to be determined. The term linear in $H$ can be related by supersymmetry to the Wess-Zumino term using the off-shell formalism realizing four scalar supercharges in the twisted theory, as described in Appendix A. One obtains $a = 3/\pi$ from the relevant bosonic terms. Note that we have introduced the overall factor of $i$ in (3.25) for convenience because the Wess-Zumino term is imaginary in the Euclidean action. The term quadratic in $H$ cannot be related by supersymmetry to the Wess-Zumino term using only the four off-shell scalar supercharges described in the Appendix. However, the constant $b$ can be determined by imposing $Spin(6)_R$ R-symmetry and using an off-shell realization of eight supercharges in the $\mathcal{N} = 2$ supergravity formalism in the untwisted theory. One obtains $b = -1$ from the relevant bosonic terms.
The nonholomorphic variation of the action (3.19) for Q = Q 1 +Q 1 is given by S free as explained in more detail in Appendix A. The holomorphic anomaly then takes the form where the terms on the right have the following origin.
• The sum is over integral $U(1)$ fluxes, with $v = w_2(E) \in H^2(\mathbb{CP}^2, \mathbb{Z}_2) \cong \mathbb{Z}_2$ being the discrete flux.
• The contributions from nonzero modes cancel due to supersymmetry as usual, and the coefficient $C_{\rm zm}(n)$ is given by a finite-dimensional integral over the zero-modes: where as determined above and $C$ is a numerical constant that depends on the normalization of the path integral measure.
As was explained in the introduction, in the 4d setup, to determine the absolute normalization of the path integral and therefore the constant C, one needs to compare the normalization used at short distances to put the holomorphic expansion in the form (1.1) with the normalization of the path integral measure in the low energy effective field theory on the Coulomb branch. This normalization can be affected, among other things, by topological terms -linear combinations of the Euler characteristic and signature of X -which might appear in the effective action on the Coulomb branch after integrating out massive modes. By contrast, in the 2d setup of Section 4, the relevant path integral can be interpreted as a Hilbert space trace; this provides a direct way to determine its normalization.
The integrals over two out of four fermionic zero-modes in (3.28) are saturated by Q| zm and Λ| zm . The other two should be saturated by terms in the action (3.21) that are quadratic in fermionic zero-modes. Bringing down the fermionic terms from the exponential, performing the Grassmann integrals, and using Stokes's theorem, one obtains and we have divided the integral by two to take into account the quotient by the Z 2 Weyl group on S 3 . We show in the appendix that the integral over ζ 3 +ζ 3 gives 2π 2 . The Gaussian integral over H can be readily performed to obtain Note that with our choice of normalization of the auxiliary field H the corresponding quadratic term has a "wrong" sign in the exponential. To make it convergent, the integral over H is performed along the imaginary axis. This is the origin of the factor of i in the result of the integration. The overall sign is somewhat ambiguous and depends on the choice of orientation in the space of zero modes. We do not address the question of fixing it in this paper. Combining

Holomorphic Anomaly in Two Dimensions
The non-compactness of the target space of the two-dimensional sigma model obtained by dimensionally reducing the six-dimensional type A 1 theory on CP 2 can lead to a holomorphic anomaly. This anomaly can receive a contribution only from the boundary of field space, so it suffices to determine the sigma model in this region. The noncompact bosonic zero-modes of the A 1 theory parametrize the separation between a pair of M5-branes. When these fields have large expectation values, the six-dimensional theory can be approximated by a single (2, 0) tensor multiplet valued in the Cartan subalgebra u(1) ⊂ su (2). This is the regime in which we work. We start by reviewing relevant facts about the effective action of the six-dimensional (2,0) theory on the tensor branch in Section 4.1. Then, in Section 4.2 we determine the effective twodimensional theory obtained by compactification of the six-dimensional theory on CP 2 . The two-dimensional theory is similar to a heterotic sigma-model. The Wess-Zumino-like terms in the 6d action lead to the Wess-Zumino terms in the 2d action. In Section 4.3 we review relevant facts about 2d (0,1) supersymmetric nonlinear sigma models. As in four dimensions, the holomorphic anomaly is determined by the supersymmetrization of the Wess-Zumino term. In Section 4.4 we derive the holomorphic anomaly from the path integral of the two-dimensional theory.

Six-dimensional Effective Action
The (2, 0) theory in six dimensions is characterized by a choice of a Lie algebra g, which is assumed to be a direct sum of simply-laced Lie algebras and u(1) factors. For each u(1), the theory contains a (2, 0) abelian tensor multiplet which consists of a 2-form gauge field B with self-dual 3-form field strength (dB = * dB), five scalar fields Φ a (a = 1, . . . , 5), transforming as a vector of Spin(5) R , and fermionic fields in the (4, 4) representation of Spin(6) × Spin(5) R where Spin (6) is the six dimensional rotation symmetry and Spin(5) R is the R-symmetry.
It was argued in [18] that 'higgsing' $g \to h \oplus u(1)$ of the (2,0) model in six dimensions produces a Wess-Zumino-like term in the effective action on a six-dimensional space $M$, of the form where $c(g) = \dim g \cdot h^\vee_g$ and $\partial \Xi_7 = M$. This term compensates for the mismatch of the 't Hooft anomaly for the $SO(5)_R$ R-symmetry. The overall factor of $i$ is present because we consider the Euclidean action. The 3-form $\Omega_3$ is (locally) defined by descent: where $\Phi^a$, $a = 1, \dots, 5$ are the five scalar fields of the $u(1)$ tensor multiplet, $\hat\Phi^a := \Phi^a / \|\Phi\|$, $A$ is a background $SO(5)_R$ connection and $F$ is its curvature. For the term (4.1) to be well defined, it is necessary that $\eta_4$ is closed. This is true in general. We check it explicitly in Section C.2 for the case in which only a $Spin(2)$ subgroup of the $Spin(5)_R$ connection is turned on, which is the case relevant for topological twisting on a Kähler 4-manifold. Geometrically, $\eta_4 = \hat\Phi^*(e_4)$ is the pull-back of the global Euler angular form $e_4$ on the total space of an $S^4$ sphere bundle $E \to M$ to the base space $M$ by the section $\hat\Phi : M \to E$. See the discussion in Section 3.2.
In [17,18] it was argued that the effective action also contains a topological term that governs the coupling of the Skyrmionic string to the $B$-field: where $n_W = \dim g - \dim h - 1$ is the number of $W$-bosons in the five-dimensional theory obtained by dimensional reduction, the same as the $n_W$ that appeared in Section 3.2. In the formula above and later, the $B$-field is normalized such that large gauge transformations shift it by an element of $2\pi H^2(M, \mathbb{Z})$. By reducing the 6d theory to 5d, one can relate (4.4) to a term linear in the gauge field found in [33] by a one-loop calculation, with the coefficient in agreement with the formula above. By reducing the 6d theory further to 4d, the term (4.4) can be related to the Wess-Zumino term (3.16), which was shown in Section 3.3 to be responsible for the holomorphic anomaly. To see the relation between these two terms in more detail, and to verify the consistency of the coefficients in front of the integrals, consider $\Xi_7 = T^2 \times \Xi_5$. Writing $\Phi^0 = \int_{T^2} B$, one obtains a compact boson valued in a circle. In the 4d limit its radius becomes infinitely large. In a trivial R-symmetry background, the action $S^{6d}_{\rm Sk}$ reduces to which a priori appears quite different from the integral of $\hat\Phi^*(\omega_5)$ in (3.16). However, both terms can be interpreted as having $n_W/2$ units of flux at the infinity of the space where the scalar fields $\Phi^I$, $I = 0, \dots, 5$ are valued. In (4.5) this space is $\mathbb{R}^5 \times S^1$ with boundary at infinity $S^4 \times S^1$. In the 4d limit the size of the 2-torus $T^2$ is taken to zero, the radius of $S^1$ becomes infinite, and the space where the scalar fields are valued becomes $\mathbb{R}^6$ with boundary at infinity $S^5$. This deformation is shown in Figure 1.
The two six-dimensional terms above can be combined into a compact expression where ) $\Omega_3$ is the $u(1)$ 3-form flux. For the case of interest, $g = su(2)$, $h = 0$ and hence $c(g) - c(h) = 6$ and $n_W = 2$.

Figure 1. The 5-cycles $S^5$ (in blue) and $S^4 \times S^1$ (in red) are homologous in $\mathbb{R}^5 \times S^1$, the space where the scalar fields $\Phi^I$, $I = 0, \dots, 5$ are valued (axes: $\Phi^{1,2,\dots}$ and $\Phi^0$). This is equivalent to the statement that the flux through the boundary at infinity is preserved when the radius of $S^1$ is taken to be infinitely large.
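The group-theoretic coefficients quoted for $su(2)$ follow from $c(g) = \dim g \cdot h^\vee_g$ and $n_W = \dim g - \dim h - 1$. As a sketch, the code below evaluates them for the family $su(N) \to su(N-1) \oplus u(1)$, which is an illustrative choice of breaking pattern that contains the case of interest at $N = 2$ (where $su(1)$ is trivial). The function names are hypothetical.

```python
def su_dimension(N: int) -> int:
    """Dimension of su(N); su(1) is the trivial algebra."""
    return N * N - 1


def anomaly_coefficient(N: int) -> int:
    """c(su(N)) = dim su(N) * h^vee, with dual Coxeter number h^vee = N."""
    return su_dimension(N) * N


def higgsing_coefficients(N: int) -> tuple:
    """Return (c(g) - c(h), n_W) for g = su(N) broken to h + u(1), h = su(N-1)."""
    if N < 2:
        raise ValueError("need N >= 2")
    c_difference = anomaly_coefficient(N) - anomaly_coefficient(N - 1)
    n_w = su_dimension(N) - su_dimension(N - 1) - 1
    return c_difference, n_w
```

For this family the results agree with the closed forms $3N(N-1)$ and $2(N-1)$; `higgsing_coefficients(2)` reproduces the values $c(g) - c(h) = 6$ and $n_W = 2$ used in the text.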

Two-dimensional Effective Action
The effective two-dimensional theory on $T^2$ is obtained by compactifying the six-dimensional theory on a manifold of the form $M = X \times T^2$ for small $X$. The field content in two dimensions is obtained by the standard twisted Kaluza-Klein reduction of a single (2,0) $u(1)$ tensor multiplet on $X$, by counting harmonic sections of certain bundles. Compactification on a general four-manifold is described, for example, in Table 1 of [38]. The choice of the topological twist on the four-manifold $X$ that we are using leads to supersymmetry in the right-moving, or, equivalently, anti-holomorphic, sector of the effective 2d theory. This is correlated with the convention that the contribution from instantons, which preserve supersymmetry in the 4d theory, has holomorphic dependence on the modulus $\tau$ of the $T^2$.
The charge lattice in [38] is the standard charge lattice Z of a U (1) gauge theory corresponding to the tensor multiplet on the worldvolume of a single M5-brane. In our application, the tensor multiplet really describes the relative motion of a pair of M5-branes. As a result, the normalization of the charge lattice is modified, as we discuss below and in Section 5.2.
We review how the various sigma model fields arise from the KK reduction. For simplicity, consider a simply-connected four-manifold X. The 6d scalar fields that transform in the vector representation of Spin(5) R in the untwisted theory give rise in the twisted theory to b + 2 real non-chiral non-compact 2d scalars σ i plus a single complex non-chiral non-compact scalar φ 0 . The 6d fermions transforming in the (4, 4) representation of Spin(5) R × Spin(6) give rise to b + 2 right-moving Weyl fermions ψ i + and a single right-moving Weyl fermion χ + . The topological twist preserves N = (0, 2) supersymmetry in two dimensions for a generic X; this is enhanced to N = (0, 4) for a Kähler X.
The 2-form gauge field $B$ with self-dual curvature 3-form gives rise to $b_2 = b_2^+ + b_2^-$ compact chiral real bosons $X^i_{L,R}$ valued in $H^2(X, \mathbb{R})/H^2(X, 2\pi\mathbb{Z})$. The quotient arises from the large gauge transformations that change the holonomy of the 2-form gauge field along 2-cycles $C_i \subset X$ by an integer multiple of $2\pi$: $\int_{C_i} B \to \int_{C_i} B + 2\pi n_i$, $n_i \in \mathbb{Z}$. The bosons valued in the subspace of self-dual harmonic forms, $H^{2+}(X) \subset H^2(X, \mathbb{R})$, are right-moving and the bosons valued in $H^{2-}(X)$ are left-moving. In other words, the 2-form field $B$ gives rise to a Narain-like lattice CFT with the lattice $\Gamma = H_2(X, \mathbb{Z})$ being the second homology lattice equipped with the standard geometric intersection form. For a closed four-manifold, $H_2(X, \mathbb{Z}) \cong H^2(X, \mathbb{Z})$ by Poincaré duality and the lattice is self-dual.
When the tensor multiplet describes the tensor branch of the 6d type $A_1$ theory, the lattice should be rescaled by $\sqrt{2}$ compared to the case of a single 5-brane considered above because $\Gamma_{SU(2)} = \sqrt{2}\,\Gamma_{U(1)}$. One way to see this is that the root lattice of $SU(2)$ is $\Lambda = \sqrt{2}\,\mathbb{Z}$, instead of just $\mathbb{Z}$, when embedded in $\mathbb{R}$ with the standard metric, while its weight lattice is the dual $\Lambda^* = \frac{1}{\sqrt{2}}\mathbb{Z}$. For a general 6d theory on a tensor branch, with string charges living in the lattice $\Lambda^*$, the 2-form fields give rise to a 2d lattice CFT for the lattice $\Gamma = \Lambda \otimes H_2(X, \mathbb{Z})$. Note that when $\Lambda$ is not self-dual, $\Gamma$ is not self-dual even for a closed four-manifold. This implies that the effective 2d theory is not absolute but relative, meaning that instead of a single partition function it has a vector of partition functions labelled by the elements of the coset [39][40][41][42]

$$\Gamma/\Gamma^* = \Lambda/\Lambda^* \otimes H_2(X, \mathbb{Z}). \qquad (4.7)$$

This is in agreement with the fact that the 6d theory itself is also in general relative, with its relativeness measured by the defect group $\Lambda/\Lambda^*$. Consider now in more detail the case of an $A_1$ theory compactified on $\mathbb{CP}^2$. We have $b_2^- = 0$ and $b_2^+ = 1$. The second homology is generated by the class of $\mathbb{CP}^1 \subset \mathbb{CP}^2$, which has self-intersection number $+1$. The self-dual lattice $H_2(\mathbb{CP}^2, \mathbb{Z})$ can then be identified with the standard lattice $\mathbb{Z}$ and therefore $\Gamma = \sqrt{2}\,\mathbb{Z}$. The 6d 2-form gauge field gives rise to a single right-moving real compact boson $X_R$ valued in $\mathbb{R}/2\pi\sqrt{2}\,\mathbb{Z}$, that is, on a circle of radius $\sqrt{2}$. The compact boson valued in a circle of radius $\sqrt{2}$ can be equivalently thought of as a $u(1)_2$ chiral WZW theory, which in turn can be conformally embedded into $su(2)_1$. This is another way to understand the rescaling of the lattice by $\sqrt{2}$. Both affine Lie algebras have two integrable modules.
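The number of components of the vector of partition functions can be checked from the Gram matrix of $\Gamma$. The sketch below assumes that $\Lambda = \sqrt{2}\,\mathbb{Z}$, so that the Gram matrix of $\Gamma = \Lambda \otimes H_2(X, \mathbb{Z})$ is twice the intersection form $Q$, and uses the standard fact that the index of an integral lattice in its dual equals the absolute value of the determinant of its Gram matrix. The function names and the sample intersection forms are illustrative.

```python
def determinant(matrix: list) -> int:
    """Integer determinant by Laplace expansion (adequate for small matrices)."""
    size = len(matrix)
    if size == 0:
        return 1
    if size == 1:
        return matrix[0][0]
    total = 0
    for col in range(size):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def number_of_partition_functions(intersection_form: list, root_norm: int = 2) -> int:
    """Index of Gamma in its dual, for Gram matrix root_norm * Q.

    root_norm = 2 corresponds to Lambda = sqrt(2) Z (the A_1 case);
    root_norm = 1 corresponds to a single M5-brane.
    """
    gram = [[root_norm * entry for entry in row] for row in intersection_form]
    return abs(determinant(gram))
```

With `intersection_form = [[1]]` for $\mathbb{CP}^2$ the count is 2, matching the two partition functions $Z_0$ and $Z_1$; with `root_norm = 1` the lattice is self-dual and the count is 1.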
Equivalently, both WZW models have two conformal blocks on a torus with characters given by the corresponding lattice theta-functions: From the general analysis above, we also have three non-compact non-chiral real bosons and two right-moving Weyl fermions. Note that if the compact boson were non-chiral the 2d fields would form the fields of the standard (0,4) sigma model with the target space X = (S 1 ×R 3 )/Z 2 = (C * ×C)/Z 2 where Z 2 is the SU (2) Weyl symmetry that flips all four directions. There are no left-moving fermions.
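As an illustration of the two conformal blocks, the sketch below computes the leading coefficients of the two $su(2)_1$ characters, taken here in the standard form $\theta_{1,\ell}(\tau)/\eta(\tau)$ with $\theta_{1,0} = \sum_n q^{n^2}$ and $\theta_{1,1} = \sum_n q^{(n+1/2)^2}$. This normalization is an assumption of the sketch; the overall prefactors $q^{-1/24}$ and $q^{1/4 - 1/24}$ are stripped off so that only integer powers of $q$ remain, and all function names are illustrative.

```python
def partition_numbers(order: int) -> list:
    """Coefficients of prod_{k>=1} 1/(1 - q^k) up to q^(order-1)."""
    p = [1] + [0] * (order - 1)
    for part in range(1, order):
        for m in range(part, order):
            p[m] += p[m - part]
    return p


def theta_coefficients(order: int, shift: int) -> list:
    """Coefficients of sum over integers n of q^(n^2 + shift*n), up to q^(order-1).

    shift = 0 gives theta_{1,0}; shift = 1 gives q^(-1/4) * theta_{1,1}.
    """
    coeffs = [0] * order
    for n in range(-order - 1, order + 1):
        exponent = n * n + shift * n
        if 0 <= exponent < order:
            coeffs[exponent] += 1
    return coeffs


def multiply(a: list, b: list) -> list:
    """Product of two truncated power series of equal length."""
    order = min(len(a), len(b))
    out = [0] * order
    for i in range(order):
        for j in range(order - i):
            out[i + j] += a[i] * b[j]
    return out


def su2_level1_characters(order: int) -> tuple:
    """Leading coefficients of the two characters, prefactors removed."""
    p = partition_numbers(order)
    return (multiply(theta_coefficients(order, 0), p),
            multiply(theta_coefficients(order, 1), p))
```

The first nontrivial coefficient of the vacuum character is 3, the dimension of the adjoint representation of $su(2)$, while the second character starts with 2, the dimension of the doublet.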
The additional data of the sigma-model is the Kalb-Ramond 2-form field $b$ on the target space (not to be confused with the self-dual 2-form field $B$ in 6d). The $b$-field determines the 2d Wess-Zumino term depending on the bosonic fields. This term can be obtained by the reduction of the 6d Skyrmionic string coupling term (4.4) as follows.
The topological twist on a Kähler X is realized by turning on a non-trivial background for the subgroup of R-symmetry U (1) R ≡ Spin(2) R ⊂ Spin(5) R . The background corresponds to identification of U (1) R with the diagonal of U (2) holonomy group of X. One can assume that the Spin(2) R rotates the directions 4-5 so that F a 1 a 2 = 0 unless a 1 , a 2 are permutation of 4, 5 and [F 45 /(2π)] = −[F 54 /(2π)] = c 1 (KX) = −c 1 (X). The non-trivial contribution to the action of the effective 2d theory is given only by the second term in (4.4). Namely, let M = Σ 2 × CP 2 , where Σ 2 is the 2d spacetime. The KK reduction of 6d bosonic fields to massless 2d fields is explicitly realized as follows. For the B-field, we have where ω is the Kähler 2-form (which is harmonic and self-dual) normalized such that CP 1 ω = 1, and X R is the right-moving compact boson of the effective 2d theory with the identification For the scalar fields, we have: where the first equation follows from the fact that the canonical bundle of CP 2 has no harmonic sections, and φ a are the 2d non-chiral scalar fields valued in R 3 . Performing the partial integration in (4.4) over the CP 2 yields the bosonic WZ term in 2d: where ω 2 is the volume form on S 2 normalized such that S 2 ω 2 = 1 withφ a := φ a / φ understood as a map Σ 2 → S 2 . In the second equality in (4.11) we used the fact that the canonical bundle over Let us denote the coordinate on S 1 of the target space 12 of the effective theory by Y 0 and the coordinates on R 3 as Y a (a = 1, 2, 3), with norm Y . From (4.11) we deduce that the Kalb-Ramond field b on target space is given by (the choice of normalization is with corresponding 3-form flux whereŶ a := Y a / Y . The choice of normalization is such that h ∈ 8π 2 Z and will be consistent with the normalization of the action in Section 4.3. 
The compactification of the 6d Wess-Zumino term (4.1) to 2d is not relevant for our computation of the holomorphic anomaly because it produces the term where ∂Ξ 3 = Σ 2 . This term is the 2d analog of the 6d Hopf-Wess-Zumino term (4.1) introduced in [18]. It is well defined since the integral of the form Ω 1 dΩ 1 over a closed 3-manifold is an integer, the Hopf invariant of the map from this 3-manifold to S 2 . The term (4.14) does not contain the compact boson X R . The fermionic partner of this term therefore does not affect the saturation of zero-modes.
To compute the holomorphic anomaly we need not the bosonic WZ term (4.11) itself, but rather its fermionic partner. The effective two-dimensional theory is a fairly standard heterotic sigma model with target X = (S 1 × R 3 )/Z 2 , except that the bosonic field valued in S 1 is only right-moving. This does not affect supersymmetrization of the action because the supersymmetry generators act only on the right-movers. One can thus use known results about heterotic sigma models reviewed below to deduce the relevant fermionic terms.

Review of (0,1) Nonlinear Sigma Model
The goal of this section is to review basic facts about (0,1) nonlinear sigma models and to fix various normalizations that will be relevant for our calculation of the holomorphic anomaly. We can restrict ourselves to the class of theories containing only scalar multiplets (φ i , ψ i ) composed of bosonic scalars φ i and right-moving Majorana-Weyl spinors ψ i . The scalars play the role of local coordinates on the target space, while the spinors ψ i are valued in the tangent bundle. In general the theory can also have left-moving Majorana-Weyl spinors valued in a real vector bundle over the target space; however, for our purposes this generalization is not necessary.
Let g ij and b = (1/2) b ij dx i ∧ dx j be the metric and Kalb-Ramond 2-form on the target space. The theory is anomaly-free and, in particular, invariant under symmetries of the target, provided that w 1 (T X ) = w 2 (T X ) = 0 and p 1 (T X )/2 = 0; here w i denote Stiefel-Whitney classes and p 1 denotes the first integral Pontryagin class. Denote ∂ := ∂/∂z, ∂̄ := ∂/∂z̄. The action of the theory in Euclidean spacetime with a local complex coordinate z then reads [44,45] (footnote 13: our normalization corresponds to the choice α′ = 2 in the string theory literature [43]):
where d 2 z := i dz dz̄, Γ ijk are the Christoffel symbols of the Levi-Civita connection, and h = db is the 3-form flux of b. The term containing the scalar fields and the Kalb-Ramond field in the target is the Wess-Zumino term, which can be recast as where ∂Ξ 3 = Σ 2 . We normalize the supercharge so that it acts on the fields as follows: The right-moving energy-momentum tensor and the supercurrent are given by (cf. [46]) The normalization of the energy-momentum tensor T̄ is fixed by the definition T̄ = 2πδS 2d /δhzz, where hzz is the corresponding component of the metric on the 2d spacetime. The normalization of the supercurrent Ḡ is then fixed by the relation with the supercharge defined above. These formulae compactly summarize the supersymmetrization in two dimensions, which is much simpler than the supersymmetrization in four dimensions described in Appendix A. In particular, there is no need for auxiliary fields. The normalization of the path integral for the theory on a torus can now be fixed by requiring that it coincides with the corresponding trace over the Hilbert space on the circle: where we consider periodic-periodic boundary conditions on the fermions, that is, odd spin structure. Consider in particular the theory of D free (0,1) scalar multiplets described by the metric g ij = δ ij and vanishing b ij . The theory then factorizes into the theory of D free bosons and D free Majorana-Weyl fermions. With the choice of normalization of the action above, the bosons and fermions satisfy the following operator product expansions: The partition function of the bosons on the torus reads where V D is the regularized volume of the target space. The prefactor originates from the trace over the momentum space [43] and is given by The partition function of free fermions is zero due to the presence of zero-modes.
To get a nonzero answer one can consider instead a one-point function of the product of all spinor fields: where ψ i 0 are the zero-mode components of the fermionic fields ψ i (z) = Σ n ψ i n z −n−1/2 in the Ramond sector. They satisfy the following anti-commutation relation: The ground states in the Hilbert space on the circle therefore form the standard spinor representation of the Clifford algebra with identification ψ i = 2 −1/2 γ i , where γ i are the standard Euclidean gamma-matrices satisfying {γ i , γ j } = 2δ ij . When D is even the fermion parity is non-anomalous and on the ground states can be represented by Therefore the trace over the ground states reads The full Hilbert space is obtained by the action of the creation operators ψ i −n , n > 0 on the ground states. Therefore, the one-point function on the torus reads (4.30) Even though this expression is obtained for even D, it can be used to define the normalization of the torus one-point function for arbitrary D. Note that for odd D the sign of i D/2 is ambiguous, which corresponds to the mod 2 fermion parity anomaly. When an even number of fermions is separated into two groups each containing an odd number of fermions, it is necessary to choose the signs consistently. To be concrete, we choose the convention i D/2 = e πiD/4 . Consider now a more general (0,1) sigma-model, but in a limit when the target space X has large radius of curvature. There is a standard one-to-one correspondence between local observables of the form O =: f i 1 ...i k (φ)ψ i 1 . . . ψ i k : and k-forms on the target space Using this correspondence and the above results on the free fields, we obtain the following formula for the one-point function in the large radius limit: This fixes the normalization of the measure for the zero-modes in the path-integral which will be used in the next section.
Finally, note that the supercharge Q in (4.18), when restricted to the zero-modes, acts as the exterior derivative on Ω * (X ) under this correspondence.
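The zero-mode conventions of this section can be illustrated with a small sketch; it is only a consistency check under the conventions stated above, and the helper names `zero_mode_phase`, `matmul` and `anticommutator` are hypothetical. It verifies that ψ i = 2 −1/2 γ i satisfies {ψ i , ψ j } = δ ij for D = 2 (using Pauli matrices as one possible choice of Euclidean gamma-matrices), and that the convention i D/2 = e πiD/4 is multiplicative, so that splitting D = 4 zero-modes into groups of 1 and 3 is consistent with the unambiguous even-D value i 2 = −1.

```python
import cmath

def zero_mode_phase(d):
    """Phase i^(D/2) for D fermion zero-modes, with the convention exp(i*pi*D/4)."""
    return cmath.exp(1j * cmath.pi * d / 4)

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]

# Assumed representation for D = 2: gamma^1, gamma^2 are the first two Pauli
# matrices, and the zero-modes are psi^i = gamma^i / sqrt(2).
s = 2 ** -0.5
psi = [
    [[0, s], [s, 0]],
    [[0, -1j * s], [1j * s, 0]],
]

if __name__ == "__main__":
    print(anticommutator(psi[0], psi[0]))  # approximately the identity matrix
    print(anticommutator(psi[0], psi[1]))  # approximately the zero matrix
    print(zero_mode_phase(1) * zero_mode_phase(3), zero_mode_phase(4))
```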

Holomorphic Anomaly from the Sigma Model
The elliptic genus of a compact target X is given by the partition function of a (0, 1) supersymmetric sigma model with periodic boundary conditions for fermions in both directions on T 2 . When X is not compact, the partition function is in general not holomorphic and has a holomorphic anomaly. If X has a "boundary" 14 Y, then the holomorphic anomaly is governed by the one-point function of the supercurrent in a sigma model with target Y [14]. More precisely, let Z X be the partition function of a heterotic sigma model with target X on a torus with complex structure τ . Then 15 where ⟨O(z 0 )⟩ denotes a path integral with an insertion of the operator O(z 0 ) at an arbitrary point z 0 . This formula can be understood as follows. The first equality follows from the trace (4.21) or from the definition of T̄ and the fact that a change of complex structure can be related to a change of metric. The second equality in (4.32) uses the relation 2iT̄ = {Q,Ḡ} between the energy-momentum tensor T̄ and the supercurrent Ḡ via the supersymmetry transformation Q. The last equality in (4.32) is more subtle and can be argued as follows.
The partition function Z X is obtained by integrating over the space of all maps φ : T 2 → X along with fermion fields ψ ∈ Γ(S + (T 2 ) ⊗ φ * T X ). The path integral has zero-modes consisting of a constant map φ : T 2 → X together with a constant ψ field. We write (φ 0 , ψ 0 ) for such a constant pair. Supersymmetry localizes the path integral on the space of constant pairs. This means that the path integral can be evaluated by integrating out the other modes to construct a measure f (φ 0 , ψ 0 ), which must then be integrated over φ 0 , ψ 0 . In an ordinary sigma model, to compute f , it suffices to integrate out the other modes in a 1-loop approximation because Z X does not depend on the metric of X . Hence, one can scale up the metric of X so that the 1-loop computation over nonzero modes is exact. In the present situation, we have to modify this procedure slightly because there is a right-moving chiral boson with no tunable parameter such as a variable metric. However, the effects of this mode can be treated exactly, while the other nonzero modes are treated in the one-loop approximation. Now we come to the main point of the argument. We have already explained in Section 4.3 that an operator of the form O f =: f i 1 ...i k (φ)ψ i 1 . . . ψ i k : corresponds to a differential form f on X , and that the path integral ⟨O f ⟩ is the integral ∫ X f of the differential form f , times some additional factors explained in eqn. (4.31). Moreover, acting on operators that are related to differential forms in this way, Q corresponds to the exterior derivative d. To claim that a Q-exact term does not contribute to the path integral amounts to claiming that ∫ X f is invariant under f → f + dg. But in general there is the possibility of a surface term, because by Stokes's theorem ∫ X dg(φ 0 , ψ 0 ) = ∫ Y g(φ 0 , ψ 0 ), where Y = ∂X . The anomaly captures the failure of the naive argument of vanishing of Q-exact terms, and hence reduces to an integral over Y.
Since we are interested in the expectation value of {Q,Ḡ}, we have f = dg where g is a function of φ 0 , ψ 0 that is computed by integrating out all modes of the chiral boson and the nonzero modes of all other fields in the path integral for ⟨Ḡ⟩. We are only interested in evaluating g on Y. Most of the modes that appear in the evaluation of g(φ 0 , ψ 0 ) are the modes that would appear in evaluating ⟨Ḡ⟩ in a sigma-model with target Y, with one important exception. In the sigma-model with target X , there is a bosonic field φ ⊥ that describes the motion normal to Y, and a corresponding fermion partner ψ ⊥ . These modes are absent in the sigma-model with target Y, so we have to include them separately. The partition function of the left-moving modes of φ ⊥ is the factor 1/η(τ ) in eqn. (4.32). The right-moving modes of φ ⊥ cancel the nonzero modes of ψ ⊥ . The zero-modes of φ ⊥ and ψ ⊥ are eliminated when we use Stokes's theorem and replace ∫ X dg with ∫ Y g, except for a normalization factor e πi/4 √ 8π 2 τ 2 (4.33) explained in Section 4.3. Altogether, one obtains precisely the right hand side of (4.32). In our case, X = (R 3 × S 1 )/Z 2 and Y = (S 2 R × S 1 )/Z 2 , where S 2 R is a sphere of very large radius R. 16 In the limit R → ∞, the fermions and bosons appearing in a sigma model with target Y can be treated as being free. We thus obtain where we used the fact that in free field theory. When we interpret the fermion zero-modes as 1-forms on Y, the coupling h ijk ψ i ψ j ψ k just becomes a 3-form h = 1 3! h ijk dφ i dφ j dφ k that has to be integrated over Y. Therefore, the result will only depend on the cohomology class of the Wess-Zumino coupling. Using the expression (4.13) for h, and taking into account the periodicity Here we only need ∫ S 2 ω 2 = 1 and not the explicit expression for the 2-form ω 2 . The nonzero modes of the fermion contribute η(τ ) 3 .
It is also necessary to include the normalization phase e 3πi/4 for the fermionic zero-modes explained in Section 4.3. The nonzero modes of the bosons valued in S 2 contribute 1/(8π 2 τ 2 |η(τ )| 4 ) with normalization also explained in Section 4.3. The nonzero modes of the chiral boson valued in S 1 contributeχ , which differs from its partition function (4.8) by factoring out the integral over the zero-mode dY 0 = 2π √ 2.
Here v ∈ Z 2 corresponds to the discrete flux of the SO(3) gauge field on Combining all the contributions we obtain The anti-holomorphic eta-functions cancel between bosons and fermions due to right-moving supersymmetry. The anti-holomorphic theta function is the contribution of the "winding modes" of the compact chiral boson. Note that, as in 4d calculation, the overall sign in (4.37) is somewhat ambiguous and depends on the choice of orientation in the space of zero-modes. We do not address the question of fixing it in this paper. Nevertheless, combining (4.37) with (4.32) we obtain which is in agreement with the expected result (3.5)-(3.6).

Generalizations
We now turn to possible generalizations. In Section 5.1 we consider a general Kähler manifold with b + 2 = 1, b 1 = 0 and compute the holomorphic anomaly of the Vafa-Witten partition function. In Section 5.2 we consider SU (N ) gauge theory realized on multiple M5-branes. The holomorphic anomaly can again be traced to the Wess-Zumino term in the effective action on the Coulomb branch where SU (N ) is spontaneously broken to SU (N 1 ) × SU (N 2 ) × U (1) with N = N 1 + N 2 . Finally, in Section 5.3 we briefly discuss other twists.

Vafa-Witten Theory on Kähler Manifolds with
gauge theory on an arbitrary Kähler manifold X with b + 2 = 1 and b 1 = 0 obtained by wrapping two M5-branes on X in M-theory on KX × T 2 × R 3 . Unlike in the case X = CP 2 , we will in general have a nonzero b − 2 with interesting modifications. Consider first the two-dimensional point of view. The field content of the effective (0,4) sigma model can be obtained by Kaluza-Klein reduction of the 6d (2,0) tensor multiplet as in Section 4.2 following [38]. The only additional fields are b − 2 left-moving compact bosons because the relation (4.9) is replaced by the following decomposition of the six-dimensional 2-form field: where h − i are the generators of H 2− (X, R) and ω is the Kähler 2-form, normalized such that The fields X i L and X R are components of left-moving and right-moving compact bosons; (X L , X R ) is valued in the torus H 2− (X, R) ⊕ H 2+ (X, R)/H 2 (X, Z).
Thus, the compact bosons of the 2d theory form a Narain lattice CFT associated to the indefinite lattice Λ ⊗ H 2 (X, Z). As before, Λ ∼ = √ 2Z denotes the root lattice of SU (2). What is important is that the intersection form still has a single negative eigenvalue. When b − 2 > 0 the theory has non-trivial moduli depending on the conformal class of the metric of X. It has special "walls" that appear when the intersection H 2,+ (X, R) ∩ (H 2 (X, Z) ⊗ R) has nonzero elements, corresponding to abelian instantons.
The fermionic and non-compact bosonic fields will be the same as in the case X = CP 2 . The analysis of the holomorphic anomaly is thus analogous and the final result is modified to where n + := (n · ω) ω, and v ∈ H 2 (X, Z 2 ) (5.5) is the Z 2 magnetic flux of the SU (2)/Z 2 ∼ = SO(3) gauge field. Here and below we use ( · ) to denote the intersection pairing H 2 (X, R) ⊗ H 2 (X, R) → R. This is in agreement with the prediction for the holomorphic anomaly [47,48] obtained from a different perspective. There are two modifications in (5.3) compared to the CP 2 case (4.32). The first is the overall factor (c 1 (X) · [ω]), which is the straightforward generalization of the factor that appears after reduction of the 6d Skyrmionic string term to the 2d WZ term as in (4.11). The second is the contribution from the left-moving compact bosons X i L . Consider for example the case X = CP 1 × CP 1 . The H 2 (X, Z) lattice has two generators e 1 and e 2 , Poincaré dual to the 2-cycles pt × CP 1 and CP 1 × pt. Their intersection numbers are: e 1 · e 1 = e 2 · e 2 = 0, e 1 · e 2 = 1. In terms of this basis, the first Chern class of the tangent bundle reads c 1 (X) = 2e 1 + 2e 2 . The orthonormal basis in H 2,+ (X, R) ⊕ H 2,− (X, R) is given by where R 2 /2 is the ratio of the areas of the two CP 1 's. In this case the effective two-dimensional theory is closely related to a (0,1) sigma model with S 1 × R 3 target, where R is the radius of S 1 . The difference comes from rescaling of the lattice of winding numbers and momenta along S 1 by the overall √ 2 factor. The formula (5.3) then reads where v ∈ Z 2 2 . The analysis in the gauge theory region is similar. The overall factor of the 3-form integrated over the boundary S 3 in the space of bosonic zero-modes changes from 3 to (c 1 (X) · [ω]). The contribution of the abelian and point-like instantons in (3.27) is replaced with where χ(X) = 3 + b − 2 . This yields the same result (5.3) obtained in the sigma model region.
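The lattice data for X = CP 1 × CP 1 can be checked with a short sketch. The explicit basis used below, f ± = (R/2) e 1 ± (1/R) e 2 , is an assumption chosen to reproduce the combinations n 1 /R ± R n 2 /2 that appear in (5.16); the function names are hypothetical, and the first Chern class c 1 = 2e 1 + 2e 2 is the standard value for this surface. The sketch verifies orthonormality with signature (1, 1), the decomposition n · n = (n · f + ) 2 − (n · f − ) 2 , the ratio R 2 /2 of the two areas, and c 1 · c 1 = 2χ + 3σ = 8.

```python
GRAM = ((0, 1), (1, 0))  # e1.e1 = e2.e2 = 0, e1.e2 = 1

def pair(x, y):
    """Intersection pairing of classes written in the basis (e1, e2)."""
    return sum(x[i] * GRAM[i][j] * y[j] for i in range(2) for j in range(2))

def orthonormal_basis(radius):
    """Assumed basis: f_plus spans the self-dual line, f_minus the anti-self-dual one."""
    f_plus = (radius / 2, 1 / radius)
    f_minus = (radius / 2, -1 / radius)
    return f_plus, f_minus

if __name__ == "__main__":
    R = 1.7
    f_plus, f_minus = orthonormal_basis(R)
    print(pair(f_plus, f_plus), pair(f_minus, f_minus), pair(f_plus, f_minus))
    n = (3, -2)
    print(pair(n, n), pair(n, f_plus) ** 2 - pair(n, f_minus) ** 2)
    c1 = (2, 2)
    print(pair(c1, c1), pair(c1, f_plus))  # c1.c1 = 8 and c1.f_plus = 2/R + R
```

Note that n · f + = n 1 /R + R n 2 /2, and that (e 2 · f + )/(e 1 · f + ) = R 2 /2, in line with the statement about the ratio of areas.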

Holomorphic Anomaly for SU (N )/Z N Gauge Theory
Another generalization is to consider Vafa-Witten theory for the gauge group SU (N )/Z N . This theory can be obtained by compactifying on T 2 the six-dimensional (2,0) model of type A N −1 , which can be realized in M-theory by a stack of N M5-branes with the center of mass degrees of freedom removed. For simplicity, consider X = CP 2 . By arguments similar to the above, we expect that the contributions to the holomorphic anomaly come from the boundary of the non-compact space of bosonic zero-modes. The latter can originate from topologically twisted scalar bosonic fields serving as coordinates on the Coulomb branch, where the gauge group is spontaneously broken to some subgroup. Consider breaking SU (N ) by a vacuum expectation value of adjoint scalar fields proportional to the traceless matrix with the standard normalization Tr(T 2 ) = 1; here subgroup that commutes with T is left unbroken. Breaking to smaller subgroups can be realized recursively.
The analysis above for SU (2)/Z 2 can be repeated with the scalar fields Φ i in the vector multiplet of the unbroken U (1) and their superpartners. The overall coefficients in front of the 6d Skyrmionic string term (4.4) and 4d WZ term (3.16) are now given by This modifies accordingly the overall coefficient on the right hand side of the holomorphic anomaly equation. The integration over the boundary S 3 in the space of the bosonic zeromodes of the 4d theory on CP 2 (or boundary S 2 × S 1 in the effective 2d theory), together with summation over all possible partitions N = N 1 + N 2 and discrete fluxes of The theta function appearing in this formula corresponds to sum over the weight lattice of the unbroken U (1) inside the weight lattice Λ * SU (N ) of SU (N ). The normalization is fixed by the normalization of T in (5.13) above. This is in agreement with the general prediction in [47][48][49][50][51] (where the gauge group is U (N ) instead of SU (N )). By applying this formula recursively one can conclude that the holomorphic limit of Z SU (N )/Z N v is a depth N − 1 mock modular form.
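For orientation, one natural choice of the traceless generator with Tr(T 2 ) = 1 that commutes with SU (N 1 ) × SU (N 2 ) × U (1) is T = diag(N 2 , . . . , N 2 , −N 1 , . . . , −N 1 )/√(N 1 N 2 N ), with N 1 entries equal to N 2 and N 2 entries equal to −N 1 . This explicit form is an assumption made for illustration (the sign and ordering conventions of the original equation are not recoverable here), and the function name `coulomb_generator` is hypothetical. The sketch checks tracelessness and normalization, and lists the partitions N = N 1 + N 2 that enter the sum.

```python
import math

def coulomb_generator(n1, n2):
    """Diagonal entries of an assumed generator T for the breaking
    SU(N) -> SU(N1) x SU(N2) x U(1), normalized so that Tr(T^2) = 1."""
    n = n1 + n2
    norm = math.sqrt(n1 * n2 * n)
    return [n2 / norm] * n1 + [-n1 / norm] * n2

def two_block_partitions(n):
    """All ordered splittings N = N1 + N2 with N1, N2 >= 1."""
    return [(n1, n - n1) for n1 in range(1, n)]

if __name__ == "__main__":
    for n1, n2 in two_block_partitions(4):
        t = coulomb_generator(n1, n2)
        print((n1, n2), sum(t), sum(x * x for x in t))
```

For N 1 = N 2 = 1 this reduces to diag(1, −1)/√ 2, consistent with the √ 2 rescaling of the SU (2) lattice used earlier.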

Other Twists
For completeness, we include the computation of the holomorphic anomaly in the partition function for the B and the C twists. The analysis of the 2d sigma model is very similar to the one for the A twist and leads to a noncompact theory. The analysis on the Coulomb branch is also similar. However, it turns out that the holomorphic anomaly for the partition function actually vanishes for these twists. It is conceivable that some other observables of the twisted theory have nonvanishing holomorphic anomaly and exhibit mock modularity.

The C Twist
This twist is only possible on spin manifolds, because the N = 4 theory, considered as a N = 2 theory, has matter in the adjoint representation. The corresponding sigma model has (0, 1) supersymmetry in two dimensions, which for Kähler manifolds is enhanced to (0, 2).
The simplest example is X = CP 1 × CP 1 . Since its signature is zero, for generic metric there will be no harmonic spinors. The effective theory contains a single non-chiral non-compact boson, and also a left-moving chiral boson X L and a right-moving chiral boson X R , as in the example at the end of Section 5.1. The non-compact boson and X R have super-partners: two right-moving Majorana-Weyl fermions ψ 1 , ψ 2 . Unlike in the case of the A twist, the Wess-Zumino term does not play a role as the target space is two-dimensional. The boundary theory Y contains a single fermionic mode that can be saturated by the supercurrent. However, its expectation value turns out to vanish: (n 1 /R + R n 2 /2)q (n 1 /R+R n 2 /2) 2 /4 q (n 1 /R−R n 2 /2) 2 /4 = 0, (5.16) where, as before, v ∈ Z 2 2 labels the discrete gauge flux. The vanishing can be attributed to anti-symmetry under the automorphism of the H 2 (X, Z) lattice acting as n → −n on all the elements. For X = CP 1 × CP 1 this automorphism is induced by the map X → X given by complex conjugation on the complex coordinates. Thus it corresponds to a certain global Z 2 symmetry of the theory. Note that for any other spin manifold X with b + 2 = b − 2 = 1, the intersection form is the same as for CP 1 × CP 1 , due to the classical result about classification of even self-dual lattices. Therefore the result of the calculation will be exactly the same (namely zero) because the contribution from the boundary at infinity of the target space depends only on the intersection form and the action of the Hodge star on H 2 (X, R). However, the involution of the lattice n → −n does not necessarily correspond to an orientation-preserving self-diffeomorphism of X.
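The antisymmetry argument for (5.16) can be illustrated numerically. The sketch below assumes that the sum runs over (n 1 , n 2 ) ∈ Z 2 + v/2, and that the factor with exponent (n 1 /R + R n 2 /2) 2 /4 is anti-holomorphic while the other is holomorphic; the vanishing does not depend on this assignment. The function name `boundary_sum`, the truncation `cutoff` and the flag `weighted` are hypothetical. Each term is odd under n → −n while the shifted lattice is invariant, so a symmetric truncation cancels pairwise.

```python
import cmath
import math

def boundary_sum(v, radius, tau, cutoff=12, weighted=True):
    """Truncated lattice sum over (n1, n2) in Z^2 + v/2.

    With weighted=True each term carries the odd prefactor n1/R + R*n2/2;
    with weighted=False the prefactor is dropped (an ordinary theta series).
    """
    q_arg = 2j * math.pi * tau                    # q = exp(q_arg)
    qbar_arg = -2j * math.pi * tau.conjugate()    # qbar = exp(qbar_arg)
    total = 0j
    # The ranges are chosen so that the truncated set is symmetric under n -> -n.
    for k1 in range(-cutoff, cutoff + 1 - v[0]):
        for k2 in range(-cutoff, cutoff + 1 - v[1]):
            n1 = k1 + v[0] / 2
            n2 = k2 + v[1] / 2
            p_plus = n1 / radius + radius * n2 / 2
            p_minus = n1 / radius - radius * n2 / 2
            weight = cmath.exp(qbar_arg * p_plus ** 2 / 4 + q_arg * p_minus ** 2 / 4)
            total += (p_plus if weighted else 1) * weight
    return total

if __name__ == "__main__":
    tau = 0.3 + 0.8j
    for v in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(v, abs(boundary_sum(v, 1.3, tau)), abs(boundary_sum(v, 1.3, tau, weighted=False)))
```

The weighted sums vanish up to rounding for every flux v, whereas the unweighted theta series are generically nonzero.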
Consider now a smooth spin 4-manifold X with b + 2 = 1 and b − 2 > 1. Note that due to Rokhlin's theorem, b − 2 − 1 = 0 mod 16. If the metric on X is generic, the effective 2d theory will have (b − 2 − 1)/4 ≥ 4 extra right-moving Majorana-Weyl fermions in addition to the fields ψ 1,2 introduced above. They originate from harmonic spinors on X. In this case it is not possible to saturate the corresponding fermionic zero-modes by an insertion of the supercurrent and the anomaly vanishes for a more trivial reason.
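The counting in this paragraph is simple arithmetic and can be sketched as follows; the function name `extra_fermions` is hypothetical. For b + 2 = 1 the signature is 1 − b − 2 , Rokhlin's theorem requires it to be divisible by 16, and the number of extra right-moving Majorana-Weyl fermions quoted above is (b − 2 − 1)/4.

```python
def extra_fermions(b_minus):
    """Number (b_minus - 1)/4 of extra right-moving Majorana-Weyl fermions for a
    smooth spin 4-manifold with b_plus = 1, after checking Rokhlin's condition."""
    signature = 1 - b_minus
    if signature % 16 != 0:
        raise ValueError("Rokhlin's theorem: the signature must be divisible by 16")
    return (b_minus - 1) // 4

if __name__ == "__main__":
    for b_minus in (1, 17, 33):
        print(b_minus, extra_fermions(b_minus))
```

The smallest case with b − 2 > 1 is b − 2 = 17, which already gives four extra fermions, in line with the bound (b − 2 − 1)/4 ≥ 4.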

The B Twist
The sigma model now has (1, 1) supersymmetry, which for Kähler manifolds is enhanced to (1,2). As before, the field content can be determined by counting harmonic forms on X. In the case X = CP 2 the effective theory contains a single non-chiral non-compact boson, a single right-moving compact boson of radius √ 2, and their super-partners: one left-moving and two right-moving Majorana-Weyl fermions. While it is possible to saturate the zero-modes for the right-moving fermions as in the case of the C twist considered above, the zero-mode of the left-moving fermion will remain unpaired, so the result is zero. The same holds for other manifolds with b + 2 = 1 and b 1 = 0.

Acknowledgments
We would like to thank G. Moore, D. Pei, B. Pioline, C. Vafa for useful discussions. EW acknowledges partial support from NSF Grant PHY-1911298.

A Supersymmetrization of the Four-Dimensional Action
Our goal is to determine the constants a and b introduced in (3.25). The constant b will be determined in Appendix A.5 by comparison with the bosonic part of the untwisted action with auxiliary fields put on-shell. The constant a will be determined by identifying a bosonic term linear in the field H (whose zero mode is H) in the supersymmetrization of the bosonic Wess-Zumino term. It is convenient to work directly in the twisted theory because the superalgebra of the scalar supercharges is particularly simple. On a four-manifold of general holonomy, the N = 4 topologically twisted theory has two scalar supercharges with an off-shell realization [1,36,37,52]. For a Kähler manifold X, the holonomy is reduced from SU (2) l × SU (2) r to U (1) l × SU (2) r . Only a U (1) R subgroup of Spin(6) R is used for twisting by replacing The resulting twisted theory thus has unbroken Spin(4) R × U (1) R global symmetry in addition to the U (1) l × SU (2) r holonomy group. Note that since U (1) R is abelian, it remains unbroken even after turning on a non-trivial background. There are then four scalar supercharges which we denote as Q A (A = 1, 2) and QȦ (Ȧ = 1, 2) and which transform as where the superscript denotes the U (1) charge and the subscript denotes the U (1) R charge.

A.1 Off-shell Fields
The N = 4 vector multiplet contains the gauge field and six scalar fields in the untwisted theory. The gauge field is not affected by twisting, and splits into the following two irreducible representations (footnote 17: in the Euclidean theory, they are to be regarded as independent fields not related by complex conjugation): corresponding to the Hodge decomposition of a 1-form on a Kähler manifold into (1,0) and (0,1) forms. Similarly, the exterior derivative splits as d = d + + d − , where d + ≡ ∂, d − ≡ ∂̄ are the Dolbeault differentials. The six scalars of the untwisted theory split into three irreducible representations: (A.3)
We have denoted the fields suppressing the SU (2) r indices to avoid clutter. The A andȦ indices on the fields transform in the spinor and the conjugate spinor representations of Spin(4) R , the superscript denotes the U (1) charge and the subscript denotes the U (1) R charge. There are sixteen fermions which split into six irreducible representations: So far we have ten bosonic and sixteen fermionic fields. Taking into account gauge freedom parametrized by a single scalar bosonic field we then need seven auxiliary fields to obtain an off-shell realization of the four scalar supercharges. We introduce auxiliary fields for each fixed representation of SU (2) r × U (1) as follows: The fields H ±± and H can be seen to arise from a self-dual 2-form, as the (1,1), (2,0) and (0,2) components in the Hodge decomposition on a Kähler manifold.
Here and elsewhere ω denotes the Kähler form normalized such that ∫ X ω ∧ ω = 1. Similarly, the fields H̃ ± combine into a 1-form as its (1,0) and (0,1) components:

A.2 Off-shell Superalgebra
The four supercharges are nilpotent up to a gauge transformation δ gauge (Φ AḂ ) generated by Φ AḂ : The off-shell realization of this algebra can be determined essentially by inspection and by comparison with the untwisted theory as we describe below. The entire vector supermultiplet of the N = 4 theory together with the auxiliary fields splits into three multiplets of the algebra (A.8) satisfied by the four scalar supercharges. The transformations below are strongly constrained by the SU (2) r × Spin(4) R × U (1) × U (1) R global symmetry and the algebra (A.8). Note that the fields can always be rescaled relative to the gauge fields A ± so that the coefficients in the supersymmetry transformations are as below.
Sections of the canonical and anti-canonical bundles belong to two short multiplets: and The remaining fields including the gauge field form a long multiplet: where we have grouped the fields into shorter multiplets with respect to the Q A orQȦ supercharges separately. The fields Φ,ψ,λ, H form a submultiplet of the superalgebra (A.8).
The normalization of the gauge field can be fixed by specifying the coefficient for the kinetic term. The realization of the superalgebra above still leaves the freedom to rescale other fields together with supercharges without changing the supersymmetry transformations: (A.12)

A.3 Supersymmetrization of the Free Action
The bosonic part of the free action can be fixed by requiring that one obtains the standard kinetic term for the gauge field for the unbroken U (1) subgroup of SO(3) on the Coulomb branch after eliminating the auxiliary fields: Using Hodge decomposition (A.6) of H (2+) , the remaining terms are fixed by supersymmetry: One can show that the term in (A.14) proportional to τ 2 is Q-exact. Using Spin(4) R R-symmetry, one can choose Q to be a linear combination of the form where α and β are constants. Define and They satisfy and 18 S free 4d = 2πiτ For convenience one can choose α = β = 1 as in Section 3.3. 18 Note that for the action restricted to the supermultiplet (A.11) it is possible to use just Q 1 orQ 1 .

A.4 Supersymmetrization of the Wess-Zumino Term
To determine the coefficient a of the term linear in H in the zero mode action (3.25), it is necessary to determine the terms in the full action that are linear in the field H. In this subsection we show that such a term is indeed present and is necessary to cancel supersymmetric variation of the bosonic Wess Zumino term (3.16). We start by writing the Wess Zumino term (3.16) in the twisted field variables. The topological twist on a Kähler X is realized by turning on background field strength for the subgroup of R-symmetry Spin(2) R ⊂ Spin(6) R . One can assume that the Spin(2) R rotates in the 4-5 plane in the R 6 field space of scalars in the untwisted theory. With this choice, one can relate the fields Φ I , I = 0 . . . 5 with the bosonic fields in (A.2) as follows: Only nonzero components of F I 1 I 2 in (3.16) are F 45 = −F 54 , related to the curvature of the canonical bundle when restricted to X = CP 2 . Let ω be the Kähler form on CP 2 normalized such that CP 1 ω = 1 for a standard embedding of CP 1 into CP 2 . Since the canonical bundle is O(−3) bundle one has c 1 (KCP 2 ) = −3[ω]. Moreover, F 45 | X = 3 · 2π ω when the connection on KCP 2 is induced by Levi-Civita connection on T CP 2 for the Fubini-Study metric.
Since Φ AȦ and Φ 4 ± iΦ 5 belong to different supermultiplets (A.9) and (A.11) after twisting, one can restrict Φ 4 and Φ 5 to be zero. With the field redefinitions above the bosonic Wess-Zumino term S 4d WZ equals It turns out that (A.23) by itself cannot be supersymmetrized. One can consider instead which can be supersymmetrized to the desired order. This term can arise from a term in the effective action in the untwisted theory of the form for some constants c 1 and c 2 , and can be thought of as a non-minimal coupling to gravity required by supersymmetry on a curved manifold. It respects both scaling and Spin(6) R symmetry and reduces to (A.25) on CP 2 when Φ 4 = Φ 5 = 0. Such a term is indeed known to be present as a four derivative coupling in N = 2 supergravity [53].
Our goal is to determine whether a bosonic term linear in H is required by supersymmetry. Since we are trying to relate a bosonic term to another bosonic term, we will have to consider at least two consecutive supersymmetry transformations. A general variation of S WZ is given by Even though the Wess-Zumino action is defined by a 5d integral, its variation must be a local 4d integral. The supersymmetry variation can be canceled by adding the term With some algebra, one can verify that the variation of S WZ with respect to Q C is completely canceled by the variation of the first two terms with respect to Q C , and that the variation with respect to QĊ is completely canceled by the variation of the last two terms with respect to QĊ. However, the variation of the first two terms in (A.28) with respect to QĊ and of the last two terms with respect to Q C is nonzero. Restricting to one-fermion terms modulo terms linear in H̃ − , one can verify that this variation can be canceled by the variation of the term We have thus concluded that a term linear in F A and linear in H is necessarily present in the supersymmetrization of the Wess-Zumino action. Therefore, the coefficient of this term, and in turn a, is topological in origin and is determined by the anomaly matching which fixes the coefficient of the Wess-Zumino term.

A.5 Restriction to Zero-Modes
Given the relevant terms in the four-dimensional action, one can determine the effective action over zero-modes to the desired order by restricting the four-dimensional fields to the zero-modes. When $X = \mathbb{CP}^2$, the canonical bundle has no harmonic sections and $b_1 = 0$. The only zero-modes arise from fields in representations of the form (1, *, *) 0, * and hence are given by: [...] The supercharges in (A.11), restricted to the zero-modes, can then be realized as the following linear differential operators: [...] (A.31) The restriction of (A.14) to the zero-modes is then given by $S^{\rm free}_{4d}\big|_{\rm zm} = -\frac{\tau_2}{\pi} H(H - 4\pi n) + 2\pi i \tau n^2$, (A.32) where we have used the fact that [...] The restriction of (A.21) is then given by [...] The reduction of the term (A.29) in the effective action to the zero-modes reads [...] It follows that the constant $a$ defined in (3.25) equals $3/\pi$. The coefficient $b$ in (3.25) of the term quadratic in $H$ cannot be fixed by the supersymmetrization of the Wess-Zumino term by the four supercharges considered above. This can be seen by considering a term in the action of the following form, motivated by (3.21): [...] where $f$ is any function and the prime denotes the derivative with respect to the first argument. This term is annihilated by all four supercharges and hence is not fixed by the analysis in this appendix. When restricted to zero-modes, such a term shifts the value of $b$.
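As a small worked step, the dependence of the free zero-mode action (A.32) on the auxiliary field $H$ can be made explicit by completing the square. This is only a sketch at the free level: it uses nothing beyond (A.32) and $\tau = \tau_1 + i\tau_2$, and it does not include the corrections governed by the constants $a$ and $b$ of (3.25). The weight $e^{-S}$ and the notation $\bar q = e^{-2\pi i \bar\tau}$ are conventions assumed here for illustration.

```latex
\begin{align}
S^{\rm free}_{4d}\big|_{\rm zm}
  &= -\frac{\tau_2}{\pi}\, H (H - 4\pi n) + 2\pi i \tau n^2 \\
  &= -\frac{\tau_2}{\pi}\, (H - 2\pi n)^2 + 4\pi \tau_2 n^2 + 2\pi i \tau n^2 \\
  &= -\frac{\tau_2}{\pi}\, (H - 2\pi n)^2 + 2\pi i \bar\tau n^2 ,
\end{align}
% last step: 2\pi i \tau + 4\pi\tau_2 = 2\pi i \tau_1 + 2\pi \tau_2 = 2\pi i \bar\tau
```

At the stationary point $H = 2\pi n$ the free action therefore reduces to $2\pi i \bar\tau n^2$, so that under the assumed conventions $e^{-S} = \bar q^{\,n^2}$, an antiholomorphic weight for the abelian flux $n$.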
To determine $b$, one can appeal to the untwisted on-shell action. If $b \neq -1$, then after integrating out the auxiliary field $H$, there must be a term proportional to [...] on $X = \mathbb{CP}^2$ with a nonzero coefficient. The indices in $F^+_A F^+_A \omega$ are contracted appropriately to obtain a 4-form. Such a term can arise from only two possible terms.
• A term of the form [...] (A.38) in the theory in flat space, where $F_R$ is the background R-symmetry field strength. However, such a term can be invariant only under $Spin(4)_R \times U(1)_R$ and cannot be made invariant under the full $Spin(6)_R$ R-symmetry of the untwisted theory.
• A term of the form [...] where $R$ is the Riemann curvature tensor. The scalar fields $\Phi$ in the term above can be restricted to lie purely in the $\mathcal{N} = 2$ vector multiplet. However, it is known that there is no such four-derivative coupling in $\mathcal{N} = 2$ supergravity [53].
Since neither term is consistent with supersymmetry and R-symmetry, we conclude that $b = -1$.

B SO(3) versus U(2)
In this section we comment on the relation between the partition functions for the $SO(3)$ and $U(2)$ gauge groups. From the point of view of the 6d (2,0) theory, the $u(2)$ Lie algebra is in a certain sense more natural because it describes the stack of two M5-branes without removing the center-of-mass degrees of freedom. Moreover, the six-dimensional $u(2)$ (2,0) theory is absolute because it has a self-dual lattice of string charges and thus has a single partition function. By contrast, the $su(2)$ (2,0) theory is relative and has a vector of partition functions labeled by discrete fluxes. On $X \times T^2$, the partition function of an absolute theory should transform into itself under modular transformations, up to a phase related to 't Hooft anomalies.
Let us consider how the effective two-dimensional theory in the $u(2)$ case is modified compared to the $su(2)$ case. The analysis in Section 4.2 can be repeated for the 6d tensor multiplets valued in the Cartan subalgebra of $u(2)$. In particular, the KK reduction of the self-dual 2-form field $B$ now leads to a right-moving $u(2)_1$ WZW CFT instead of $su(2)_1 \cong u(1)_2$. This two-dimensional theory is now also absolute (as a spin theory, as we comment on below), and its character is [...] (B.1) which again captures the contribution of abelian instantons. We have included the fugacity $e^z$ for the diagonal $u(1)$. From the 4d perspective, such a refinement is realized by adding the topological term $z \int_{\mathbb{CP}^1} c_1$ to the action on $\mathbb{CP}^2$. The decoupling of the diagonal $u(1)$, corresponding to the center-of-mass degrees of freedom, is then done via the following decomposition of characters: [...] (B.2) The characters (4.8) transform as a two-dimensional complex representation of $SL(2, \mathbb{Z})$, up to an overall multiplier system related to the gravitational anomaly. The $S$ and $T$ elements are represented by the following matrices: [...]
The left-hand side of (B.2) is a modular function (again, with a multiplier system) under the index 3 subgroup $\langle S, T^2 \rangle \subset SL(2, \mathbb{Z})$ generated by $S$ and $T^2$. This is expected because $U(2)$ is self-dual, and on a non-spin manifold the $U(2)$ theory is invariant only under shifts of the theta-angle by $4\pi$. From the 2d point of view, $\langle S, T^2 \rangle$ is the subgroup of $SL(2, \mathbb{Z})$ preserving the antiperiodic-antiperiodic spin structure on $T^2$. This is the spin structure for which the character (B.1) is the partition function of the chiral $u(2)_1$ WZW model, which should be considered as a spin theory. Note that the matrices $S$ and $T^2$ transforming the characters of $su(2)_1$, or equivalently $u(1)_2$, happen to be real (up to an overall phase), so the representation is self-conjugate. In particular, this means that formally one could instead take the linear combination [...] to achieve the same effect, that is, to produce an $\langle S, T^2 \rangle$ Jacobi modular function (with a different multiplier system). Therefore there is no contradiction with the discussion in Section 3.1.
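The two group-theoretic statements used here, that $S$ and $T^2$ act by real matrices and that $\langle S, T^2 \rangle$ has index 3, can be checked numerically in a simplified model. The sketch below is an illustration under stated assumptions rather than the computation of this appendix. It replaces the $su(2)_1 \cong u(1)_2$ characters by the bare theta functions $\vartheta_a(\tau) = \sum_{n \in \mathbb{Z}} q^{(n + a/2)^2}$, dropping the eta-function denominator, the multiplier system and any complex conjugation appropriate to right-movers. The matrices `S` and `T` in the code are the standard ones for these theta functions and are not copied from the text. The index is computed modulo 2, which gives the index in $SL(2, \mathbb{Z})$ only because $\langle S, T^2 \rangle$ contains the principal congruence subgroup $\Gamma(2)$; that standard fact is assumed, not verified. All function names are hypothetical.

```python
import cmath
import math
from itertools import product


def theta(a, tau, nmax=40):
    """Truncated sum of q^{(n + a/2)^2} over integers n, with q = exp(2 pi i tau)."""
    return sum(cmath.exp(2j * math.pi * tau * (n + a / 2) ** 2)
               for n in range(-nmax, nmax + 1))


# Assumed standard matrices acting on (theta_0, theta_1); T is diagonal.
S = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]
T = [1, 1j]


def s_error(tau):
    """Largest deviation from theta_a(-1/tau) = sqrt(-i tau) sum_b S[a][b] theta_b(tau)."""
    weight = cmath.sqrt(-1j * tau)  # weight-1/2 factor, principal branch
    return max(abs(theta(a, -1 / tau)
                   - weight * sum(S[a][b] * theta(b, tau) for b in (0, 1)))
               for a in (0, 1))


def t_error(tau):
    """Largest deviation from theta_a(tau + 1) = T[a] theta_a(tau)."""
    return max(abs(theta(a, tau + 1) - T[a] * theta(a, tau)) for a in (0, 1))


def matmul_mod2(A, B):
    """Product of two 2x2 matrices with entries reduced modulo 2."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))


def generated_subgroup(generators):
    """Closure of the generators under multiplication inside SL(2, Z/2)."""
    identity = ((1, 0), (0, 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = matmul_mod2(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group


def index_mod2():
    """Index of the image of <S, T^2> inside SL(2, Z/2)."""
    sl2 = [((a, b), (c, d)) for a, b, c, d in product(range(2), repeat=4)
           if (a * d - b * c) % 2 == 1]
    s_mod2 = ((0, 1), (1, 0))   # S = [[0, -1], [1, 0]] reduced mod 2
    t2_mod2 = ((1, 0), (0, 1))  # T^2 = [[1, 2], [0, 1]] reduced mod 2
    return len(sl2) // len(generated_subgroup([s_mod2, t2_mod2]))
```

In this model $T$ itself acts with the non-real entry $i$ on $\vartheta_1$, while $T^2 = \mathrm{diag}(1, -1)$ and $S$ are real, in line with the self-conjugacy remark above. Calling `s_error` and `t_error` at any point of the upper half-plane, for example `0.3 + 0.8j`, gives a residual at the level of floating-point rounding.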
As an aside, we note that if one considers the periodic-periodic spin structure instead, the partition function of the chiral $u(2)_1$ WZW model reads [...]

C Closed Forms