1. Introduction
The design space for stellarators is larger than that of tokamaks because stellarators exploit three-dimensional (3-D) magnetic fields, by which it is meant that there is no continuous (e.g. rotational) symmetry, whereas tokamaks are notionally axisymmetric (two-dimensional) (Helander Reference Helander2014). The larger design space allows more freedom in the geometry of the plasma boundary. Geometry affects several important plasma properties such as stability and transport (Helander et al. Reference Helander, Beidler, Bird, Drevlak, Feng, Hatzky, Jenko, Kleiber, Proll and Turkin2012), including turbulent transport (Hegna, Terry & Faber Reference Hegna, Terry and Faber2018). If the search includes 3-D configurations, generally, one may expect that it is more likely to find a feasible fusion device.
A particularly advantageous feature of stellarators is that the rotational transform, which is essential for confinement in toroidal configurations, can be provided externally (Mercier Reference Mercier1964), either by current-carrying coils or by permanent magnets. This reduces or even eliminates the need for generating toroidal plasma currents, which can lead to problematic disruptions.
Along with advantages, 3-D configurations can have drawbacks. A feasible fusion device must possess good particle confinement and the plasma equilibrium must be supported by an external set of coils or magnets that do not pose unrealistic engineering challenges (Grieger et al. Reference Grieger, Lotz, Merkel, Nührenberg, Sapper, Strumberger, Wobig, Burhenn, Erckmann and Gasparino1992). Stellarators are guaranteed neither. Plasmas in early stellarator designs were not well confined because of neoclassical particle losses caused by unconfined orbits (Galeev et al. Reference Galeev, Sagdeev, Furth and Rosenbluth1969; Beidler et al. Reference Beidler, Allmaier, Isaev, Kasilov, Kernbichler, Leitold, Maaßberg, Mikkelsen, Murakami and Schmidt2011). Geometrically distorted coils complicated the engineering of the cancelled national compact stellarator experiment (NCSX) (Neilson et al. Reference Neilson, Gruber, Harris, Rej, Simmons and Strykowsky2010) and caused delays in the construction of Wendelstein 7-X (Riße Reference Riße2009).
Nonetheless, with careful exploitation of the large design space of 3-D configurations, confinement in stellarators can be significantly improved, e.g. by using quasi-symmetric (Nührenberg & Zille Reference Nührenberg and Zille1988; Boozer Reference Boozer1995; Henneberg, Drevlak & Helander Reference Henneberg, Drevlak and Helander2020) or quasi-isodynamic fields (Gori, Lotz & Nührenberg Reference Gori, Lotz and Nührenberg1996; Landreman & Catto Reference Landreman and Catto2012). Perfectly quasi-isodynamic fields have vanishing bootstrap current, which allows for simultaneous optimization of both neoclassical transport and small toroidal net current (Helander & Nührenberg Reference Helander and Nührenberg2009), which can be beneficial for stability and is necessary if an island divertor is to be employed. The 3-D optimization effort is encouraged by recent Wendelstein 7-X results (Klinger et al. Reference Klinger, Andreeva, Bozhenkov, Brandt, Burhenn, Buttenschön, Fuchert, Geiger, Grulke and Laqua2019), which was optimized for neoclassical transport (Nührenberg & Zille Reference Nührenberg and Zille1986). There are new methods in coil-design, such as using permanent magnets (Helander et al. Reference Helander, Drevlak, Zarnstorff and Cowley2020; Landreman & Zhu Reference Landreman and Zhu2020; Zhu et al. Reference Zhu, Hammond, Brown, Gates, Zarnstorff, Corrigan, Sibilia and Feibush2020) and designing coils with generous tolerances (Lobsien, Drevlak & Pedersen Reference Lobsien, Drevlak and Pedersen2018; Lobsien et al. Reference Lobsien, Drevlak, Kruger, Lazerson, Zhu and Pedersen2020).
With these developments in understanding and designing 3-D configurations, the question is not whether we should search the large 3-D configuration space, but how. With an order of magnitude more degrees of freedom, give or take, it is significantly more difficult to search the 3-D configuration space. To simultaneously achieve the desired properties of a feasible stellarator, we need efficient optimization approaches, and we need to understand the solution space.
An essential component of stellarator optimization is the evaluation of equilibrium properties.Footnote 1 Equilibrium calculations are conveniently divided into two types, namely fixed boundary and free boundary.Footnote 2 The fixed-boundary approach requires the geometry of the plasma boundary to be provided as input information (Hirshman & Whitson Reference Hirshman and Whitson1983; Bauer, Betancourt & Garabedian Reference Bauer, Betancourt and Garabedian1984; Hudson et al. Reference Hudson, Dewar, Dennis, Hole, McGann, von Nessi and Lazerson2012). Experience suggests that fixed-boundary calculations are faster and more robust than their free-boundary counterpart. In reality, however, the geometry of the plasma boundary is not known a priori. Free-boundary equilibrium calculations require as input the external magnetic field, and the self-consistent plasma boundary is determined as part of the equilibrium calculation (Hirshman, van Rij & Merkel Reference Hirshman, van Rij and Merkel1986; Hudson et al. Reference Hudson, Loizu, Zhu, Qu, Nührenberg, Lazerson, Smiet and Hole2020). Existing free-boundary equilibrium codes invariably perform additional iterations compared with their fixed-boundary analogues. The present algorithm in free-boundary SPEC (Hudson et al. Reference Hudson, Loizu, Zhu, Qu, Nührenberg, Lazerson, Smiet and Hole2020), for example, performs a Picard-style iteration over the fixed-boundary code in order to determine the plasma boundary that is consistent with the external field. The speed of existing fixed-boundary codes is advantageous for stellarator optimization because the equilibrium is typically computed at each iteration.
One might ask: What is the ‘best’ approach for stellarator optimization? We do not expect that there will be a be-all and end-all of stellarator optimization algorithms. We should embrace a variety of methods. In this paper, we seek to elucidate the mathematical structure of the various interrelated calculations that underpin the integrated stellarator optimization problem, which we hereafter call the combined plasma–coil optimization. The overall numerical efficiency of the combined plasma–coil optimization problem cannot be understood without considering how the equilibrium calculation, the coil calculation and the optimization calculation communicate with each other. This needs to be understood from both a mathematical and an algorithmic perspective.
Existing algorithms can loosely be sorted into two categories, which may be called the ‘two-step’ method and the ‘direct-coil’ method, which we now describe.Footnote 3 The two-step approach separates the equilibrium optimization calculation from the problem of finding coils (Nührenberg & Zille Reference Nührenberg and Zille1986). In the first step, the plasma boundary is varied to optimize the desired equilibrium properties, such as rotational-transform profile, magnetohydrodynmics (MHD) stability, etc. (Beidler et al. Reference Beidler, Grieger, Herrnegger, Harmeyer, Kisslinger, Lotz, Maassberg, Merkel, Nührenberg and Rau1990; Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001; Strickler, Berry & Hirshman Reference Strickler, Berry and Hirshman2002a; Henneberg et al. Reference Henneberg, Drevlak, Nührenberg, Beidler, Turkin, Loizu and Helander2019; Bader et al. Reference Bader, Faber, Schmitt, Anderson, Drevlak, Duff, Frerichs, Hegna, Kruger and Landreman2020). In the second step, the geometry of an external set of coils that provides the required external field is determined. The coil geometry must meet certain engineering criteria; the coils cannot have too small curvature and they must not intersect each other, for example. This stellarator optimization can be performed with STELLOPT (Lazerson et al. Reference Lazerson, Schmitt, Zhu and Breslau2020), ROSE (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2019), or SIMSOPT (Simons Hidden Symmetries Collaboration, https://github.com/hiddenSymmetries/simsopt), which are all under active development. The two-step approach was used to design Wendelstein 7-X (Beidler et al. Reference Beidler, Grieger, Herrnegger, Harmeyer, Kisslinger, Lotz, Maassberg, Merkel, Nührenberg and Rau1990), CHS-qa (Okamura et al. Reference Okamura, Matsuoka, Nishimura, Isobe, Nomura, Suzuki, Shimizu, Murakami, Nakajima and Yokoyama2001), NCSX (Nelson et al. Reference Nelson, Berry, Brooks, Cole, Chrzanowski, Fan, Fogarty, Goranson, Heitzenroeder and Hirshman2003), HSX (Canik et al. Reference Canik, Anderson, Anderson, Likin, Talmadge and Zhai2007), ESTELL (Drevlak et al. Reference Drevlak, Brochard, Helander, Kisslinger, Mikhailov, Nührenberg, Nührenberg and Turkin2013) and CFQS (Shimizu et al. Reference Shimizu, Liu, Isobe, Okamura, Nishimura, Suzuki, Xu, Zhang, Liu and Huang2018).
An advantage of the two-step approach is that it is primarily the fusion-relevant performance of the plasma that is most important; so, it is reasonable to optimize the plasma performance first without compromise; that is, without regard to the complexity of the coils. After all, a stellarator that produces fusion power with complicated coils is incomparably more commercially valuable than a stellarator with simple coils that does not. An advantage of using fixed-boundary calculations to optimize the equilibrium properties is the availability of fast and robust fixed-boundary equilibrium codes.
Using fixed-plasma-boundary calculations for the plasma equilibrium leads to an inverse problem for the coil geometry: the required magnetic field is known, and the Biot–Savart law must be inverted to obtain the coil geometry. Many different codes have been developed to solve the inverse problem for the coils, e.g. NESCOIL (Merkel Reference Merkel1987), ONSET (Drevlak Reference Drevlak1998), COILOPT (Brown et al. Reference Brown, Breslau, Gates, Pomphrey and Zolfaghari2015), FOCUS (Zhu et al. Reference Zhu, Hudson, Song and Wan2017), REGCOIL (Landreman Reference Landreman2017), OMIC (Singh et al. Reference Singh, Kruger, Bader, Zhu, Hudson and Anderson2020) and FOCUSADD (McGreivy, Hudson & Zhu Reference McGreivy, Hudson and Zhu2021). There are codes that solve the inverse problem for a set of permanent magnets (Helander et al. Reference Helander, Drevlak, Zarnstorff and Cowley2020; Landreman & Zhu Reference Landreman and Zhu2020; Zhu et al. Reference Zhu, Hammond, Brown, Gates, Zarnstorff, Corrigan, Sibilia and Feibush2020).
A disadvantage of the two-step approach is that not all plasma boundaries can be produced by a discrete set of coils or magnets that are easy to build. If there is no consideration of the coil and magnet complexity until after the plasma equilibrium has been chosen, complicated coils can result. It may be the case that small changes in the plasma geometry that result in small improvements in the plasma performance may also result in large disadvantageous changes in the coil geometry. Integrated algorithms that incorporate both plasma performance and coil complexity are, therefore, desirable (Boozer Reference Boozer2015; Landreman & Boozer Reference Landreman and Boozer2016; Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2019).
Measures of coil complexity can be included in the optimization cost function using the two-step method (Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2019; Carlton-Jones, Paul & Dorland Reference Carlton-Jones, Paul and Dorland2020). This comes at the cost of computing the coil geometry or some approximation of it at every stage of the fixed-boundary plasma optimization.
Another approach for the combined plasma–coil design is the direct coil optimization using a free-boundary equilibrium code (Hudson et al. Reference Hudson, Monticello, Reiman, Boozer, Strickler, Hirshman and Zarnstorff2002; Strickler, Berry & Hirshman Reference Strickler, Berry and Hirshman2002b). The direct coil optimization approach avoids the need for the inverse coil code because the coil geometry is taken to be the independent degree-of-freedom in the optimization, i.e. the coil geometry is assumed to be known. The external magnetic field that is required as input by the free-boundary equilibrium calculation is computed using the Biot–Savart formula. Metrics of coil complexity can easily be included in the optimization cost function.
There are direct coil optimization approaches that dispense with the equilibrium calculation and instead optimize the properties of the vacuum field, which may be thought of the trivial (zero plasma-current, zero pressure) plasma equilibrium state. This should be sufficient for designs that do not target high plasma pressure. However, finite-$\beta$ effects are important: the pressure-induced Shafranov shift (Shafranov Reference Shafranov1963) can affect stability and confinement. A related direct coil optimization uses a near-axis analytic approximation to the equilibrium state (Giuliani et al. Reference Giuliani, Wechsung, Cerfon, Stadler and Landreman2020), which might be sufficient for large aspect-ratio configurations but perhaps not so for ‘compact’ stellarators. Also, recent work has exploited analytical gradient information and adjoint methods for a variety of optimization problems (Landreman & Paul Reference Landreman and Paul2018; Paul et al. Reference Paul, Landreman, Bader and Dorland2018; Zhu et al. Reference Zhu, Hudson, Song and Wan2018; Antonsen, Paul & Landreman Reference Antonsen, Paul and Landreman2019; Paul et al. Reference Paul, Abel, Landreman and Dorland2019; Carlton-Jones et al. Reference Carlton-Jones, Paul and Dorland2020).
In this paper, we review and categorize different combined plasma–coil optimization. We discuss free-boundary calculations and propose an alternative method to calculate the virtual-casing self-consistent vacuum magnetic field, which reduces the cost of a free-boundary calculation to that of a fixed-boundary calculation. In § 2, the multiregion relaxed magnetohydrodynamic (MRxMHD) energy functional, the coil-penalty functional and the virtual casing integral are described. In § 3, we present the relevant variational calculus for the free-boundary equilibrium calculation, the coil geometry optimization and the virtual casing integral that are employed in the combined plasma–coil optimization algorithms discussed in § 5. In § 4, we discuss the construction of the magnetic field using a (weak) Galerkin method. In § 5, we examine various fixed- and free-boundary algorithms for the combined plasma–coil optimization that differ in which quantity, which we denote with $\boldsymbol {z}$, is chosen to be the independent degree-of-freedom, and derive the derivatives of the plasma equilibrium and the coil geometry with respect to $\boldsymbol {z}$ in terms of derivatives of the plasma-energy functional, the coil-penalty functional, and the virtual-casing integral.Footnote 4 Finally, in § 6, we discuss the results of this paper.
2. Plasma equilibria and supporting coils
This paper builds upon the construction of the plasma equilibrium as a stationary point of the MRxMHD energy functional, which we now describe.Footnote 5
The computational domain, $\varOmega \subset \mathbb {R}^{3}$, is considered to be a solid torus with computational boundary ${{\mathcal {D}}} \equiv \partial \varOmega$. The latter is a prescribed toroidal surface, which unless explicitly stated otherwise is held fixed throughout the calculation. The computational domain contains the plasma volume, ${{\mathcal {V}}}\subset \varOmega$, a smaller solid torus enclosed by the plasma boundary ${{\mathcal {S}}}\equiv \partial \mathcal {V}$. The plasma volume is further partitioned into $v = 1, \dots , N_V$ nested toroidal subvolumes, ${{\mathcal {V}}}_v$, which are separated by a set of ideal interfaces, $\mathcal {I}_v$ so that $\partial \mathcal {V}_v={{\mathcal {I}}}_v\cup {{\mathcal {I}}}_{v-1}$, with the outermost interface coinciding with the plasma boundary, ${{\mathcal {I}}}_{N_V} = {{\mathcal {S}}}$. The innermost subregion is a simple (solid) torus, and each other subregion is a toroidal annulus which may be thought of as a ‘hollow’ torus. In each ${{\mathcal {V}}}_v$, the magnetic field is written as the curl of the magnetic vector potential, $\boldsymbol {B}_v = \boldsymbol {\nabla } \times {\boldsymbol {A}}_v$. In the innermost toroidal region, the toroidal flux, computed as a poloidal loop integral of ${\boldsymbol {A}}_1$ on ${{\mathcal {I}}}_1$, is constrained to match a prescribed value. In each annular region, both the enclosed toroidal and poloidal fluxes, computed, respectively, as the difference between the poloidal and toroidal loop integrals of ${\boldsymbol {A}}_v$ on ${{\mathcal {I}}}_v$ and ${{\mathcal {I}}}_{v-1}$, are constrained. In each region, the magnetic helicity, defined as the volume integral of ${\boldsymbol {A}}_v \cdot \boldsymbol {B}_v$, is also constrained. The magnetic field in each region is taken to be tangential to that region's boundary, $\boldsymbol {n}\cdot {\boldsymbol {B}}_v=0$ on ${{\mathcal {I}}}_v$ where $\boldsymbol {n}$ is a unit outward normal vector, but otherwise the topology of the field lines is unconstrained and magnetic islands and irregular field lines are allowed in each ${\mathcal {V}}_v$.
For free-boundary calculations, by which we mean that the plasma boundary is allowed to move, we include the vacuum region ${{\mathcal {V}}}_+=\varOmega {\setminus} {{\mathcal {V}}}$, with the inner boundary of ${{\mathcal {V}}}_+$ coinciding with ${{\mathcal {S}}}$ and its outer boundary with ${{\mathcal {D}}}$. It is on ${{\mathcal {D}}}$ that the normal plasma field plus the normal external field is required to equal the normal total field.
By construction, there are no currents inside ${{\mathcal {V}}}_+$. We may employ a scalar potential and write the vacuum field as $\boldsymbol {B}_{+} = \boldsymbol {\nabla } \varPhi$.Footnote 6 To obtain a unique solution, constraints are imposed on the linking and net toroidal plasma currents, which are computed as loop integrals of $\boldsymbol {B}_+$. We take $\boldsymbol {B}_+$ to be tangential to the plasma surface, $\boldsymbol {n} \boldsymbol {\cdot } \boldsymbol {\nabla } \varPhi |_{{{\mathcal {S}}}}=0$ where $\boldsymbol {n}$ is a unit outward normal vector on ${{\mathcal {S}}}$. The normal field on the computational boundary ${{\mathcal {D}}}$, $B_{T,n} \equiv \bar {\boldsymbol {n}} \boldsymbol {\cdot } \boldsymbol {\nabla } \varPhi |_{{{\mathcal {D}}}}$ where $\bar {\boldsymbol {n}}$ is normal to ${{\mathcal {D}}}$, is the sum, $B_{P,n} + B_{E,n}$, of the plasma field and the external field, neither of which are necessarily known a priori; generally, $B_{T,n} \ne 0$.
The equilibria that we consider are stationary points of the following energy functional:
where
with $p_v$ the pressure in the $v$-th region,Footnote 7 $\gamma$ is the specific heat ratio (adiabatic index) and $H_v$ is the prescribed value of the magnetic helicity in ${{\mathcal {V}}}_v$. The constraints on the toroidal and poloidal fluxes in the plasma volumes, ${{\mathcal {V}}}_v$, are implied; that is, the fields $\boldsymbol {B}_v$ that we consider in (2.2) are restricted to the class of divergence-free vector fields that are tangential to the interfaces and have the prescribed values of the toroidal and poloidal fluxes. The helicity constraints are enforced explicitly using the Lagrange multipliers $\mu _v$. The loop integral constraints on $\boldsymbol {\nabla }\varPhi$ are also implied: the allowed $\varPhi$ in (2.3) is restricted to the class of multivalued scalar functions with prescribed loop integrals of $\boldsymbol {\nabla }\varPhi$. We write $\bar {\varPhi }\equiv \varPhi |_{{{\mathcal {D}}}}$ to denote the scalar potential evaluated on ${{\mathcal {D}}}$ to distinguish it from $\varPhi \equiv \varPhi |_{{{\mathcal {S}}}}$ on ${{\mathcal {S}}}$. Similarly, we write $\mathrm {d}\bar {\boldsymbol {s}}$ to denote an infinitesimal surface element on ${{\mathcal {D}}}$.
We may call ${{\mathcal {F}}}$ the free-plasma-boundary, fixed-computational-boundary, generalized- boundary-condition MRxMHD energy functional. This is to accommodate the common understanding in the magnetic confinement community that a free-boundary equilibrium calculation allows the plasma boundary to move, which it does in this case, and the common terminology in the broader mathematical community that refers to the ‘boundary’ as boundary of the computational domain, which is fixed in this case. We refer to the boundary conditions as ‘generalized’ because the total normal field, $B_{T,n}$, is not constrained to be zero. For brevity, we hereafter refer to ${{\mathcal {F}}}$ simply as the energy functional.
Using the virtual-casing principle (Shafranov & Zakharov Reference Shafranov and Zakharov1972), the normal plasma field, $B_{P,n}$, produced by currents within the plasma volume is computed on the computational boundary as
where ${{\boldsymbol {r}}} \equiv \boldsymbol {x} - \bar {\boldsymbol {x}}$ for $\boldsymbol {x} \in {{\mathcal {S}}}$ and $\bar {\boldsymbol {x}}\in {{\mathcal {D}}}$, and we use $\mathrm {d} \boldsymbol {s}$ to denote a surface element of ${{\mathcal {S}}}$. Here, $\boldsymbol {B}_T|_{{{\mathcal {S}}}_+}=\boldsymbol {\nabla } \varPhi |_{{{\mathcal {S}}}_+}$ is the total magnetic field immediately outside of ${{\mathcal {S}}}$. To accommodate for the possible existence of sheet currents lying on the plasma boundary, we must use in (2.4) the total magnetic field lying immediately outside the plasma boundary; that is, we must use the magnetic field on the inner boundary of the vacuum region. When that information is not available, we can use the magnetic field immediately inside the plasma boundary. In § 4, we shall exploit the circumstance that the evaluation of the plasma field on the computational boundary is a non-local, linear operator of the scalar potential, $\varPhi$.
The difference between the total normal magnetic field and the plasma normal field on ${{\mathcal {D}}}$, which we call the required external normal field, must be provided by an external source. A plasma cannot create its own magnetic bottle (Shafranov Reference Shafranov1966). This quantity, $D_n \equiv B_{T,n}-B_{P,n}$, is very closely related to $B_{E,n}$, the provided external field, but they may differ. The difference is what we later call the ‘coil error’.
For clarity of exposition regarding the externally applied field and to facilitate discussion of the various combined plasma–coil optimization algorithms considered in § 5, we imagine a typical and easily differentiable example; namely, that the external magnetic field is provided by a set of $i = 1, \ldots , N_C$ current-carrying one-dimensional curves (filaments) with geometry $\boldsymbol {c}_i$ that produce a magnetic field given by the Biot–Savart law,
where $I_i$ is the current in the $i$-th coil and $\mathrm {d} {{\boldsymbol {l}}}$ is the differential line segment. For brevity, we shall hereafter ignore the dependency on the magnitude of the currents and we set $I_i = 4{\rm \pi} /\mu _0$. It is a simple matter to extend the following to allow for the variation of the coil currents. It is also a simple matter to extend the following to the case where a surface potential on a prescribed winding surface provides the external field (Merkel Reference Merkel1987; Landreman Reference Landreman2017), or to the case where a ‘finite-build’ approximation is used for the coils (Li et al. Reference Li, Liu, Xu, Shimizu, Kinoshita, Okamura, Isobe, Xiong, Luo and Cheng2020; McGreivy et al. Reference McGreivy, Hudson and Zhu2021; Singh et al. Reference Singh, Kruger, Bader, Zhu, Hudson and Anderson2020), or when permanent magnets are included (Helander et al. Reference Helander, Drevlak, Zarnstorff and Cowley2020; Landreman & Zhu Reference Landreman and Zhu2020; Zhu et al. Reference Zhu, Hammond, Brown, Gates, Zarnstorff, Corrigan, Sibilia and Feibush2020).
If the required external field, $D_{n}$, is given, the geometry of the coils may be chosen to minimize a suitable ‘coil-penalty’ functional; for example (Merkel Reference Merkel1987; Landreman Reference Landreman2017; Hudson et al. Reference Hudson, Zhu, Pfefferlé and Gunderson2018; Zhu et al. Reference Zhu, Hudson, Song and Wan2018),
where $\varphi _2 \equiv \int _{{{\mathcal {D}}}} (Q^{2} \,\mathrm {d}\bar {s})/2$ is the commonly used quadratic-flux error functional, $Q = D_n - B_{E,n}$ where $B_{E,n}[\boldsymbol {c},{{\mathcal {D}}}]$ is the normal component of $\boldsymbol {B}_E$ on ${{\mathcal {D}}}$ produced by the filamentary coils, where we use $\boldsymbol {c}$ to denote the geometry of all the coils; i.e. $\boldsymbol {c} \equiv \{ \boldsymbol {c}_i, i = 1, \dots N_C \}$. We include a regularization term, $L = L[\boldsymbol {c}]$, which we take to be the total length of the coils, and $\omega$ is a scalar penalty.
The geometry of the optimal coils is given by $\delta {{\mathcal {E}}}/\delta \boldsymbol {c} = 0$. This equation may be solved using descent algorithms (Zhu et al. Reference Zhu, Hudson, Song and Wan2017; Hudson et al. Reference Hudson, Zhu, Pfefferlé and Gunderson2018) or Newton-style methods (Zhu et al. Reference Zhu, Hudson, Song and Wan2018). Practically, it is not possible to obtain a coil set that exactly produces the required normal field and generally $\varphi _2\ne 0$. This is why the required external normal field, $D_n$, must be distinguished from the actual external normal field, $B_{E,n}$. Upon solving $\delta {{\mathcal {E}}}/ \delta \boldsymbol {c} = 0$, $\varphi _2$ measures what we call the coil error.Footnote 8
In the above, we have described the three basic components of the combined plasma–coil optimization algorithms that we consider in § 5; namely, the energy functional, ${{\mathcal {F}}}$, the Biot–Savart law, the coil-penalty functional, ${{\mathcal {E}}}$, and the evaluation of the normal plasma field, $B_{P,n}$, on ${{\mathcal {D}}}$ using the virtual-casing functional. In the following section, § 3, we present the first and second variations of ${{\mathcal {F}}}$, ${{\mathcal {E}}}$ and $B_{P,n}$ that will be required in § 5. In § 4, we elaborate upon the numerical construction of the magnetic fields and show that, instead of prescribing the total normal magnetic field, $B_{T,n}$, on ${{\mathcal {D}}}$, computing the equilibrium and then determining the coil geometry that provides (approximately) the required external normal field, we may directly consider the case where the external normal field, $B_{E,n}$, on ${{\mathcal {D}}}$ is given. We can solve for the virtual-casing self-consistent vacuum magnetic field using a Galerkin method. In § 5, we show that the derivatives described in § 3 and the Galerkin construction of the magnetic fields may be combined to construct a variety of combined plasma–coil stellarator optimization algorithms.
3. Variation of the energy, coil-penalty and virtual-casing functionals
In this section, we present the variations of the energy, coil-penalty and virtual casing functionals that are used in § 5. The variational calculus provides useful insight into the physics and mathematics of the different optimization problems, as we shall comment upon in § 6.
Requesting that the first variations of the energy functionals vanish results in weak formulations of the various coupled boundary value problems. When the boundary data is ‘smooth’ enough, it is expected that weak solutions coincide with (strong) solutions. In what follows, we assume the fields are sufficiently regular to apply integration by parts and derive local Euler–Lagrange equations. Conversely, we convert any linear elliptic partial differential equations subject to boundary conditions into variational problems (weak formulation) simply by integrating against arbitrary variations. In the case where the bilinear functional thus obtained is symmetric (self-adjoint), the variational problem can be associated with an energy minimization principle. However, as we will see in § 3.4, the weak formulation of certain boundary value problems do not necessarily result in self-adjoint bilinear functionals, in which case there is no immediate energy minimization principle.
The weak formulation is the basis of our numerical representation of solutions, in particular leading to a spectral-Galerkin method where our fields are approximated by a finite linear combination of Fourier modes and orthogonal polynomials. The coefficients of the linear combination become our degrees of freedom and the weak formulation boils down to a matrix inversion.
3.1. Energy functional
In the plasma volumes we write $\boldsymbol {B}_v = \boldsymbol {\nabla }\times \boldsymbol {A}_v$, and thus $\delta \boldsymbol {B}_v = \boldsymbol {\nabla }\times \delta \boldsymbol {A}_v$. We use the fact that the fields in the vicinity of the interfaces obey ideal MHD, so that the variation of the magnetic field on the interfaces can be expressed with an displacement vector $\boldsymbol {\xi }_v$, $\delta \boldsymbol {B}_v = \boldsymbol {\nabla }\times (\boldsymbol {\xi }_v\times \boldsymbol {B}_v)$, i.e. $\delta \boldsymbol {A}_v = \boldsymbol {\xi }_v\times \boldsymbol {B}_v + \nabla g_v$, where $g_v$ is a gauge potential. In MRxMHD, the mass is constrained in each volume $\mathcal {V}_v$ by the isentopic ideal-gas constraint, $p_v = a_v / V_v^{\gamma }$, where $V_v$ is the volume of the $v$-th region, $V_v = \int _{\mathcal {V}_v} \,\mathrm {d} v$ and $a_v$ is a constant. The first variation with respect to deformations of the volume boundary is given by $\delta p_v = - \gamma p_v (\int _{{{{\mathcal {I}}}}_v} \boldsymbol {\xi }_v\cdot \mathrm {d} {\boldsymbol {s}}-\int _{{{{\mathcal {I}}}}_{v-1}} \boldsymbol {\xi }_{v-1}\cdot \mathrm {d} {\boldsymbol {s}})/V_v$.
Using $\boldsymbol {B}_v \cdot {\boldsymbol {n}}_v = 0$ on the interfaces, where $\boldsymbol {n}_v$ is the normal vector of the $v$-th interface, one can derive
where
In vacuum, the first variation in ${{\mathcal {F}}}_+$ with respect to variations in $\varPhi$ is
Setting the functional derivative,
to zero leads to the Laplace equation. The other terms provide corresponding Neumann boundary conditions.
The second variation with respect to deformations of the interface geometry of ${\mathcal {F}}_v$ is given by
where the tilde ($\sim$) indicates that the second variation can differ from the first. Here, for simplicity, we allowed only one of the two interfaces to vary. For a complete expression (see (3.8)), one has to be careful since cross-terms, including integrals over two different interfaces, appear in the $\gamma p_v$ term.
In vacuum, the second variation with respect to the scalar potential is given by
Collecting all the different terms for the second variation, we obtain
where we used the equilibrium expressions; we included variations of both interfaces; we used $\boldsymbol {B} \boldsymbol {\cdot } ((\boldsymbol {B}\times \boldsymbol {\nabla })\times \boldsymbol {n})=\boldsymbol {B}^{2} \boldsymbol {\nabla }\boldsymbol {\cdot }\boldsymbol {n} - \boldsymbol {B}\boldsymbol {\cdot }\boldsymbol {\nabla } \boldsymbol {n}\boldsymbol {\cdot }\boldsymbol {B}$; and we set $\boldsymbol {\xi }_v$ and $\tilde {\boldsymbol {\xi }}_v$ proportional to the normal vector $\boldsymbol {n}$ since only normal variations of the interfaces affect the plasma energy. The second variation of the energy functional determines the MRxMHD stability of the equilibrium state (Kumar et al. Reference Kumar, Qu, Hole, Wright, Loizu, Hudson, Baillod, Dewar and Ferraro2021).Footnote 9
3.2. Coil-penalty functional
The coil-penalty functional, ${{\mathcal {E}}}$ defined in (2.6) is considered to be a functional of the coil geometry, $\boldsymbol {c}$, the required normal magnetic field, $D_n$, on the computational boundary, ${{\mathcal {D}}}$, and on ${{\mathcal {D}}}$ itself, which is, however, kept fixed here; i.e. ${{\mathcal {E}}}={{\mathcal {E}}}[\boldsymbol {c},D_n]$. We may formally write the first variation, $\delta {{\mathcal {E}}}$, with respect to variations $\delta \boldsymbol {c}$, and $\delta D_n$ as
where $\alpha$ is an angle-like curve parameter, $\boldsymbol {c}_i(\alpha +2{\rm \pi} )=\boldsymbol {c}_i(\alpha )$. The functional derivatives are
where $\boldsymbol {c}_i^{\prime }$ is the derivative of the $i$-th coil with respect to $\alpha$. We have written ${\textsf {R}}_i \equiv 3 \, \boldsymbol {r}_i \boldsymbol {r}_i / r_i^{5} - {{\sf I}} / r_i^{3}$ and defined ${{\boldsymbol {R}}}_{i,n} = {\textsf {R}}_i {\cdot } \boldsymbol {n}$, where $\boldsymbol {r}_i$ is the displacement between a point on the $i$-th coil and the evaluation point on ${{\mathcal {D}}}$, and ${{\sf I}}$ is the identity tensor or idemfactor. The curvature of the $i$-th coil is $\boldsymbol {\kappa }_i \equiv \boldsymbol {c}_i' \times \boldsymbol {c}_i'' / \boldsymbol {c}_i'^{3}$ and the functional derivatives are generalized expressions of the ones presented in Dewar, Hudson & Price (Reference Dewar, Hudson and Price1994) and Hudson et al. (Reference Hudson, Zhu, Pfefferlé and Gunderson2018). We used the expression for the variation of the normal component of the magnetic field with respect to the coil geometry
with the functional derivatives
The required second variations of ${{\mathcal {E}}}$ are (Hudson et al. Reference Hudson, Zhu, Pfefferlé and Gunderson2018)
where
For the case where $i=j$ the variation cannot be written in such a compact form but also does not provide much insight. The other second variations of ${{\mathcal {E}}}$ are not required in § 5 and are omitted for brevity.
We only need the variation with respect to the geometry of the computational boundary, ${{\mathcal {D}}}$, when $D_n=-B_{P,n}=-\boldsymbol {B}_{P}\cdot \bar {\boldsymbol {n}}$. Since there is an explicit dependence with respect to $\bar{\boldsymbol {n}}$, we calculate the combined variation of ${{\mathcal {E}}}$ and $B_{P,n}$ with respect to variations in ${{\mathcal {D}}}$, which allows us to express the results in compact form. The first variation is
where
where we used (12) from Dewar et al. (Reference Dewar, Hudson and Price1994). The notation $\boldsymbol {f}_{s} = ( {{\sf I}} - \bar {\boldsymbol {n}}\bar {\boldsymbol {n}})\cdot \boldsymbol {f}$ denotes the surface projection of an arbitrary vector or tensor, $\boldsymbol {f}$, onto a surface, which in this case is the computational boundary. The second variation of ${{\mathcal {E}}}$ with respect to ${{\mathcal {D}}}$ and $\boldsymbol {c}_i$ is
where
where the mean curvature of ${{\mathcal {D}}}$ is $\boldsymbol {H} \equiv - \bar {\boldsymbol {n}} \, (\boldsymbol {\nabla }\boldsymbol {\cdot }\bar {\boldsymbol {n}} )$. Equation (3.21) is a more general form of (12) of Hudson et al. (Reference Hudson, Zhu, Pfefferlé and Gunderson2018) where $D_n=0$.
3.3. Plasma normal field
The normal plasma field, $B_{P,n}$, on ${{\mathcal {D}}}$, as defined in (2.4), is considered to be a functional of the plasma boundary, ${\mathcal {S}}$, the total (tangential) magnetic field, $\boldsymbol {B}_T|_{{{\mathcal {S}}}}$ on ${{\mathcal {S}}}$, and on ${{\mathcal {D}}}$, which is kept fixed; i.e. $B_{P,n}=B_{P,n}[{\mathcal {S}},\boldsymbol {B}_T|_{{{\mathcal {S}}}}]$. The normal component of the magnetic field produced by plasma currents, $B_{P,n}$, given in (2.4) may be written as
where the antisymmetric tensor, ${\textsf {M}} \equiv ( {\boldsymbol {r}} \, \bar {\boldsymbol {n}} - \bar {\boldsymbol {n}} \, {\boldsymbol {r}} ) / r^{3}$, is defined, where ${\boldsymbol {r}} = \boldsymbol {x}-\bar {\boldsymbol {x}}$ for $\boldsymbol {x} \in {{\mathcal {S}}}$ and $\bar {\boldsymbol {x}}\in {{\mathcal {D}}}$. The first variation setting $\delta {{\mathcal {D}}} =0$ is
where
where we used again (12) from Dewar et al. (Reference Dewar, Hudson and Price1994) for the first expression. Note that only ${\textsf {M}}$ and ${\textsf {R}}$ depend on $\bar {\boldsymbol {x}}$. The relevant variation with respect to the computational boundary ${{\mathcal {D}}}$ is presented in the previous section. The second variations are not required in the following.
3.4. Supplied external field
To enable efficient free-boundary optimization algorithms, which we describe in § 5, we revisit the energy functional; in particular, we pay attention to the boundary condition for the normal field on ${{\mathcal {D}}}$.
It is perfectly legitimate to prescribe the total normal field, $B_{T,n}$, on ${{\mathcal {D}}}$. Indeed, every fixed-boundary equilibrium calculation does as much. As described in § 2, the total normal field is the sum of the plasma normal field and the external normal field, $B_{T,n} = B_{P,n} + B_{E,n}$, neither of which are known a priori in fixed-boundary calculation.
It seems unlikely that there are any circumstances for which we will a priori know $B_{P,n}$, except for the trivial equilibrium, the vacuum, for which $B_{P,n}=0$. After all, this is why the equilibrium calculation is required, to determine the plasma currents and the magnetic field that they produce.
In contrast, a particularly important calculation arises when $B_{E,n}$ is known. In this case, if we were to proceed with the approach of specifying $B_{T,n}$ on ${{\mathcal {D}}}$, then some type of ‘self-consistent’ iteration, for example, must be implemented to determine the $B_{T,n}$ that satisfies the matching condition, namely that $B_{T,n} - B_{P,n}[\boldsymbol {B}_T|_\mathcal {S}] = B_{E,n}$, where $B_{P,n}[\boldsymbol {B}_T|_{{{\mathcal {S}}}}]$ may be considered to be a linear, non-local operator acting on the tangential total field on the plasma boundary $\boldsymbol {B}_T|_{{{\mathcal {S}}}}$, which is only known after the equilibrium has been computed. In the current implementation of free-boundary SPEC (Hudson et al. Reference Hudson, Loizu, Zhu, Qu, Nührenberg, Lazerson, Smiet and Hole2020), for example, a Picard iteration over $B_{T,n}$ is required.
Here, we present a more direct strategy. The boundary value problem in the vacuum region is to find $\boldsymbol {B}_+ = \boldsymbol {\nabla }\varPhi$ such that $\boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {B}_+ = \boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {\nabla } \varPhi = 0$ on ${{\mathcal {V}}}_+$ satisfying boundary conditions $\boldsymbol {n}\boldsymbol {\cdot }\boldsymbol {\nabla }\varPhi =0$ on ${{\mathcal {S}}}$ and $\bar{\boldsymbol {n}}\boldsymbol {\cdot } \boldsymbol {\nabla }\bar {\varPhi } - B_{P,n}[\boldsymbol {\nabla }\varPhi |_{{{\mathcal {S}}}}]= B_{E,n}$ on ${{\mathcal {D}}}$. This translates into the weak problem of finding $\varPhi$ such that
where $\psi$ is an arbitrary (test) function. An important solvability condition is that the normal field of the coils is consistent with a divergence-free field, $\int _{{{\mathcal {D}}}} B_{E,n} \,\mathrm {d} \bar {s} =0$.
As it stands, it is not possible to cast this weak problem into an energy minimization problem because the second term on the left-hand side of (3.26) spoils the required symmetry of the functional. The first variation of a quadratic energy functional necessarily results in symmetric bilinear forms.
Energy principles do exist for exterior Neumann problems in the form of boundary integral methods (Giroire & Nédélec Reference Giroire and Nédélec1978). Their formulation, however, requires the plasma boundary and the computational boundary to coincide. The NESTOR code is a good example (Merkel Reference Merkel1986), as well as BIEST (Malhotra et al. Reference Malhotra, Cerfon, Imbert-Gérard and O'Neil2019). There are several disadvantages and challenges in implementing these energy principles. First, as seen in (3.26) in the limit ${{\mathcal {D}}}\to {{\mathcal {S}}}$, the integral operators tend to have singular kernels, which requires delicate numerical treatment. Second, in free-boundary calculations, the plasma boundary is varied to achieve force balance. This requires updating the numerical representation of the integral operators as well as the source term on the right-hand side of (3.26) every time the boundary changes. For this, the external field (from coils) is effectively needed in an entire volume not just on a single surface, which comes with a sizeable computational cost.Footnote 10
In the following section, § 4, we present a free-boundary equilibrium approach for a given external field that is based on a weak (Galerkin) method for constructing the required magnetic fields while keeping the computational boundary fixed and separate from the plasma boundary so that only the normal component of the external field on the computational boundary need be evaluated and only once. The virtual-casing self-consistent vacuum field is obtained as the solution to a set of linear equations, given below in (4.4). The matching condition is enforced directly in the computation of the vacuum field, and this removes the self-consistent iteration mentioned above that is otherwise required to match the equilibrium to the provided external field.
4. Galerkin method for constructing the vacuum field
Galerkin methods for constructing weak solutions to partial differential equations are well known. In the context of MRxMHD, a Galerkin method for constructing the magnetic fields for fixed- and free-boundary MRxMHD equilibria is described in Hudson et al. (Reference Hudson, Dewar, Dennis, Hole, McGann, von Nessi and Lazerson2012, Reference Hudson, Loizu, Zhu, Qu, Nührenberg, Lazerson, Smiet and Hole2020) and Qu et al. (Reference Qu, Pfefferlé, Hudson, Baillod, Kumar, Dewar and Hole2020). In this paper, we restrict our attention to the problem of constructing $\boldsymbol {B}_+$.
A standard numerical approach for finding stationary points of ${{\mathcal {F}}}_+$ is to represent the scalar potential using a mixed Chebyshev–Fourier representation; e.g. $\varPhi (s,\theta ,\zeta ) = I \theta + G \zeta + \sum _{l,m,n} \varPhi _{l,m,n} T_l(s)\exp (\textrm {i}m\theta -\textrm {i}n\zeta )$, where $(s,\theta ,\zeta )$ is a suitable toroidal coordinate system and $T_l(s)$ is the $l$-th Chebyshev polynomial, and $I$ and $G$ are given by the prescribed value of the loop $\text {integrals}/2{\rm \pi}$. When this representation is inserted into (2.2), and assuming that ${{\mathcal {S}}}$ and ${{\mathcal {D}}}$ do not change, the problem of calculating the vacuum field amounts to solving
where $\boldsymbol {\phi }$ represents the vector of independent degrees-of-freedom, namely the $\varPhi _{l,m,n}$ and the Lagrange multipliers that may be used to enforce the constraints, and ${{\mathcal {A}}}$ is a symmetric matrix whose elements are the second derivatives of ${{\mathcal {F}}}_{+}$ with respect to these degrees-of-freedom and involve volume integrals of coordinate metric elements and the Chebyshev–Fourier functions. The matrix ${{\mathcal {A}}}$ depends on the geometry of ${{\mathcal {S}}}$ and ${{\mathcal {D}}}$; i.e. ${{\mathcal {A}}}={{\mathcal {A}}}[{{\mathcal {S}}},{{\mathcal {D}}}]$. The right-hand side vector, $\boldsymbol {B}_T$, contains the Fourier harmonics of $B_{T,n}$. The system of linear equations given in (4.1) is essentially a discrete realization of Laplace's equation, $\boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {\nabla } \varPhi = 0$, with the supplied boundary conditions and loop integrals, written in matrix form. We can determine how the vacuum magnetic field varies with ${{\mathcal {S}}}$, ${\mathcal {D}}$ and $B_{T,n}$ using matrix perturbation methods,
where $\delta {\mathcal {A}} = \partial {{\mathcal {A}}} / \partial {{\mathcal {S}}} \cdot \delta {{\mathcal {S}}} + \partial {\mathcal {A}}/ \partial {{\mathcal {D}}} \cdot \delta {{\mathcal {D}}}$.
From (2.4), we recognize that the magnetic field produced by the plasma currents is a linear functional of the total (tangential) magnetic field on the immediate outside of the plasma boundary, namely $\boldsymbol {B}_+|_{{{\mathcal {S}}}_+}$. The immediate outside of ${{\mathcal {S}}}$ is the inner boundary of the vacuum region, ${{\mathcal {V}}}_+$, and the magnetic field in ${{\mathcal {V}}}_+$, namely $\boldsymbol {B}_+$, is the gradient of the scalar potential. The Fourier harmonics, $B_{P,n}$, of the plasma field on ${{\mathcal {D}}}$ are linear in the degrees-of-freedom of the scalar potential; that is, we may write
where $\boldsymbol {B}_P$ represents the vector of Fourier harmonics of $B_{P,n}$, and ${\mathcal {B}}$ is a matrix derived from inserting the mixed Chebyshev–Fourier representation for $\varPhi$ into (2.4). We note that ${\mathcal {B}}$ depends only on the geometry of ${{\mathcal {S}}}$ and ${{\mathcal {D}}}$; i.e. ${\mathcal {B}}={\mathcal {B}}[{{\mathcal {S}}},{{\mathcal {D}}}]$.
Provided that the external field $\boldsymbol {B}_E$ is ‘tame’ (in particular $\int _{{{\mathcal {D}}}} B_{E,n}\,\mathrm {d} \bar {s}=0$), we put forward the following propositions: (i) for toroidal annular domains of practical interest, with the constraint that $\boldsymbol {B}_T \cdot \boldsymbol {n} = 0$ on the inner boundary, ${{\mathcal {S}}}$, there exists a normal magnetic field on the outer boundary, ${{\mathcal {D}}}$, that is the sum of an a priori known externally applied field plus the a priori unknown magnetic field that is produced by plasma currents inside ${{\mathcal {S}}}$, as given by the virtual-casing integral given; and (ii) that this ‘virtual-casing self-consistent’ vacuum field may be obtained as the solution to the system of linear equations resulting from combining (4.3) with (4.1); i.e.
where we defined the Laplace-virtual-casing matrix, ${\mathcal {L}}_{vc}\equiv {\mathcal {A}}-{\mathcal {B}}$, and $\boldsymbol {B}_E$ is that part of the right-hand side vector given in (4.1) that does not depend on the plasma currents.Footnote 11 We shall elaborate upon these propositions in a future article.
The virtual-casing self-consistent vacuum field will change with variations in the plasma boundary according to
where $\delta \mathcal {B} = \partial \mathcal {B} / \partial {{\mathcal {S}}} \cdot \delta {{\mathcal {S}}} + \partial \mathcal {B} / \partial {{\mathcal {D}}} \cdot \delta {{\mathcal {D}}}$.
4.1. Restricted energy functionals
To incorporate the Galerkin method for computing the magnetic fields with the energy functional and with the various combined plasma–coil optimization algorithms discussed in § 5 below, we simplify as follows. We redefine the energy functionals, ${{\mathcal {F}}}_v$, in the plasma volumes to
where here and hereafter the magnetic field, $\boldsymbol {B}_v$, in each region is restricted to be the unique magnetic field with the prescribed helicity and fluxes that is tangential to the boundary and obeys $\boldsymbol {\nabla } \times \boldsymbol {B}_v = \mu _v \boldsymbol {B}_v$. This equation is known as the Beltrami equation and is obtained, see (3.2), as the Euler–Lagrange equation $\delta {{\mathcal {F}}}_V / \delta \boldsymbol {A}_v = 0$. We note that $\boldsymbol {B}_v$ depends only on the geometry of the adjacent interfaces; i.e. $\boldsymbol {B}_v = \boldsymbol {B}_v[\boldsymbol {x}_{v-1},\boldsymbol {x}_v]$.Footnote 12 The energy functional in the vacuum region is redefined simply as
where $\boldsymbol {B}_+ = \nabla \varPhi$ is restricted to be the unique magnetic field with the prescribed loop integrals (for the enclosed currents) and is tangential to ${{\mathcal {S}}}$, and that obeys $\boldsymbol {\nabla }\boldsymbol {\cdot }\boldsymbol {\nabla }\varPhi =0$ in ${{\mathcal {V}}}_+$. This field is obtained as the solution to either (4.1) if the total normal field is given or (4.4) if the external normal field is given. With the enclosed currents and ${{\mathcal {D}}}$ held constant, we note that $\boldsymbol {B}_+$ depends only on ${\mathcal {S}}$ and the boundary condition; i.e. $\boldsymbol {B}_+ = \boldsymbol {B}_+[{\mathcal {S}},B_{T,n}]$ or $\boldsymbol {B}_+ = \boldsymbol {B}_+[{\mathcal {S}},B_{E,n}]$.
With these conventions hereafter implied, we omit the explicit dependence of ${{\mathcal {F}}}$ on the vector and scalar potentials and the Lagrange multipliers, and write ${{\mathcal {F}}}={{\mathcal {F}}}_t[\boldsymbol {x}, B_{T,n},{{\mathcal {D}}}]$ or ${{\mathcal {F}}}={{\mathcal {F}}}_e[\boldsymbol {x}, B_{E,n},{{\mathcal {D}}}]$. To distinguish the case when ${B_{T,n}}$ is provided and (4.1) is used to compute the vacuum field, we use ${{\mathcal {F}}}_t$ to denote the energy functional in this case, whereas we instead denote this quantity by ${{\mathcal {F}}}_e$ when ${B_{E,n}}$ is provided and (4.4) is used. We use $\boldsymbol {x}$ to denote the geometry of the $v=1,\ldots N_V$ interfaces, which includes the plasma boundary, i.e. $\boldsymbol {x}_{N_V} = {{\mathcal {S}}}$.
The interface geometry, and by implication the equilibrium magnetic field, is defined by $\delta {{\mathcal {F}}} / \delta \boldsymbol {x} = 0$, which from (3.3) is the equilibrium equation, ${\boldsymbol {F}} \equiv [[p+B^{2}/2]]=0$ across the ideal interfaces. Minima of ${{\mathcal {F}}}$ with respect to ${\boldsymbol {x}}$ can be found using descent-style algorithms, $\partial \boldsymbol {x} / \partial \tau = - \delta {{\mathcal {F}}}/\delta \boldsymbol {x}$, where $\tau$ is an integration parameter, or by Newton-style methods. The geometry of the equilibrium interfaces is considered to be a function of the boundary conditions; i.e. $\boldsymbol {x} = \boldsymbol {x}[B_{T,n},{{\mathcal {D}}}]$ or $\boldsymbol {x} = \boldsymbol {x}[B_{E,n},{{\mathcal {D}}}]$. We shall write the total (tangential) magnetic field, $\boldsymbol {B}_T|_{{{\mathcal {S}}}}$, on ${{\mathcal {S}}}$, which is required for the virtual casing calculation of $B_{P,n}$, as a function of $\boldsymbol {x}$, i.e. $\boldsymbol {B}_T|_{{{\mathcal {S}}}}=\boldsymbol {B}_T|_{{{\mathcal {S}}}}[\boldsymbol {x}]$.
The second derivatives of the restricted energy functions, (4.6) and (4.7), with respect to variations in the interface geometry, $\boldsymbol {x}$, and the boundary conditions, either $B_{T,n}$ of $B_{E,n}$, which are required in the following section, may be constructed from incorporating the derivatives of the magnetic fields as calculated using (4.2) and (4.5) with the expressions presented in § 3.1.Footnote 13
5. Combined plasma–coil optimization algorithms
In this section, we consider the construction of the plasma equilibrium and the coil geometry simultaneously with the optimization of the plasma and coil properties.
When we are required to construct the geometry of the coils as extrema of the coil-penalty functional, ${{\mathcal {E}}}$, we assume that the coil geometry satisfies $\delta {{\mathcal {E}}} / \delta \boldsymbol {c} = 0$. When the coil geometry is taken as the independent degree-of-freedom in the optimization, the coil-penalty functional is not required and the Biot–Savart integral is used to compute ${B_{E,n}} = {B_{E,n}}[\boldsymbol {c},{{\mathcal {D}}}]$.
When the normal plasma field, $B_{P,n}$, on ${{\mathcal {D}}}$ is required and the equilibrium magnetic field is known, $B_{P,n}$ is computed using either the total (tangential) magnetic field either immediately inside, ${{\mathcal {S}}}_-$, or immediately outside, ${{\mathcal {S}}}_+$, the plasma boundary, depending on whether $\boldsymbol {B}_T$ is determined using a fixed-boundary calculation, for which the total magnetic field outside ${{\mathcal {S}}}$ may not be known, or a free-boundary calculation, for which it is. If there is a sheet current on the plasma boundary, these will differ. The total (tangential) magnetic field, $\boldsymbol {B}_T|_{{{\mathcal {S}}}}$ on ${{\mathcal {S}}}$, is obtained as an output of the equilibrium calculation, $\delta {{\mathcal {F}}} / \delta \boldsymbol {x} = 0$.
In the following, we consider choosing the independent degrees-of-freedom, $\boldsymbol {z}$, in the combined plasma–coil optimization to be (i) the plasma boundary, $\boldsymbol {z} \equiv {{\mathcal {S}}}$; (ii) the total normal field on ${{\mathcal {D}}}$, $\boldsymbol {z} \equiv B_{T,n}$; (iii) the required normal field on ${{\mathcal {D}}}$, $\boldsymbol {z} \equiv D_n$; and (iv) the coil geometry, $\boldsymbol {z} \equiv \boldsymbol {c}$. For each of these choices, the descent direction depends on the derivatives of the plasma-energy functional, ${{\mathcal {F}}} = {{\mathcal {F}}}_t[\boldsymbol {x}_v, B_{T,n},{{\mathcal {D}}}]$ or ${{\mathcal {F}}} = {{\mathcal {F}}}_e[\boldsymbol {x}_v, B_{E,n},{{\mathcal {D}}}]$ depending on whether the total or external normal field is given; the coil-penalty functional, ${{\mathcal {E}}} = {{\mathcal {E}}}[\boldsymbol {c},D_n,{{\mathcal {D}}}]$; and the virtual-casing calculation of the plasma normal field, $B_{P,n} = B_{P,n}[{{\mathcal {S}}},\boldsymbol {B}_T|_{{{\mathcal {S}}}},{{\mathcal {D}}}]$.
It is helpful to have a tangible example of what plasma and coil properties we wish to optimize. For the plasma optimization, we imagine a plasma property, ${\mathcal {P}}[\boldsymbol {x}]$, that is explicitly a scalar function of the geometry of the flux surfaces and is to be minimized. Generally, most plasma properties depend on the magnetic field, but here we are assuming that the equilibrium magnetic field depends on the geometry of the interfaces and so there is no loss of generality in treating ${\mathcal {P}}$ as an explicit function of only the geometry of the interfaces. Plasma properties of interest include the integrability of the magnetic field, quasi-symmetry properties, the rotational-transform profile, stability properties,etc.
For the coil geometry optimization, we imagine a measure of the coil complexity, ${\mathcal {C}}[\boldsymbol {c}]$, that is a scalar function of the coil geometry and is to be minimized. We might consider the total length of the coils, as shorter coils tend to be cheaper, or the non-planarness of the coils, as measured by ${\mathcal {C}} = \sum _i \oint \tau ^{2}/2 \,\mathrm {d} l$ where $\tau$ is the torsion. Other properties of interest might include a measure of the coil–plasma separation, which explicitly depends on both the coil geometry and the plasma boundary, e.g. ${\mathcal {Q}} = {\mathcal {Q}}[\boldsymbol {x},\boldsymbol {c}]$. It is straightforward to extend the following treatment to include such properties.
For the combined optimization, we imagine a differentiable cost function, $\mathcal {T}=\mathcal {T}[\mathcal {P},\mathcal {C}]$. If ${\mathcal {P}}[\boldsymbol {x}]$ and ${\mathcal {C}}[\boldsymbol {c}]$ are differentiable functions of, respectively, $\boldsymbol {x}$ and $\boldsymbol {c}$,Footnote 14 then descent algorithms can be implemented if we know the derivatives of $\boldsymbol {x}$ and $\boldsymbol {c}$ with respect to $\boldsymbol {z}$. The derivative of ${\mathcal {T}}$ is
Here and hereafter, for notational clarity, we assume that all quantities are discretized, so that functionals of lines, surfaces and volumes become functions of a finite set of parameters that describe those objects, and the infinite-dimensional Frechét derivatives become finite-dimensional derivatives. One cannot be more explicit regarding constructing the derivatives of ${\mathcal {T}}$, ${\mathcal {P}}$ and ${\mathcal {C}}$ until one has stated what these quantities are, this is left to a future article. In the following, we present expressions for $\partial \boldsymbol {x} / \partial \boldsymbol {z}$ and $\partial \boldsymbol {c} / \partial \boldsymbol {z}$ for the different choices of $\boldsymbol {z}$ mentioned above.
5.1. Fixed-boundary optimization
We can consider the independent degree-of-freedom in the optimization to be the plasma boundary, $\boldsymbol {z} \equiv {{\mathcal {S}}}$. The free-plasma-boundary equilibrium calculation described in § 2 reduces to a fixed-plasma-boundary calculation by eliminating the vacuum region; that is, we choose ${{\mathcal {D}}}={{\mathcal {S}}}$ and $B_{T,n}=0$. In this subsection only, because we are choosing ${{\mathcal {D}}}$ to be coincident with ${{\mathcal {S}}}$ to facilitate a fixed-boundary equilibrium calculation, ${{\mathcal {D}}}$ will move during the optimization.
The equilibrium, $\boldsymbol {x}$, satisfies $\partial {{\mathcal {F}}}_t(\boldsymbol {x},0,{{\mathcal {S}}}) / \partial \boldsymbol {x}=0$. We compute $B_{P,n}({{\mathcal {S}}},\boldsymbol {B}_T|_{{{\mathcal {S}}}_-},{{\mathcal {S}}})$ from the virtual-casing integral.Footnote 15 With $B_{T,n}=0$, the required normal field is $D_n = - B_{P,n}({{\mathcal {S}}},\boldsymbol {B}_T|_{{{\mathcal {S}}}_-},{{\mathcal {S}}})$. The coil geometry satisfies $\partial {{\mathcal {E}}}(\boldsymbol {c}, D_n,{{\mathcal {S}}})/\partial \boldsymbol {c}=0$. Expanding and collecting terms, we obtain
where
and where
The analogous functional derivatives for $\partial ^{2} {{\mathcal {E}}} / \partial \boldsymbol {c} \partial \boldsymbol {c}$ and $\partial ^{2} {{\mathcal {E}}} / \partial \boldsymbol {c} \partial {{\mathcal {D}}}$ are given in (3.16) and (3.21), those for $\partial ^{2} {{\mathcal {F}}}_t/\partial \boldsymbol {x} \partial \boldsymbol {x}$, $\partial ^{2} {{\mathcal {F}}}_t/\partial \boldsymbol {x} \partial {{\mathcal {D}}}$ are given in (3.6) and (3.7), and those for $\partial B_{P,n} / \partial {{\mathcal {S}}}$ and $\partial B_{P,n} / \partial \boldsymbol {B}_T|_{{{\mathcal {S}}}}$ can be found in (3.24) and (3.25).
The coil geometry is constructed by minimizing the coil-penalty functional. With a finite number of discrete coils, the optimal coils will not exactly produce the required magnetic field. It may be beneficial for a combined plasma–coil optimization to minimize the quadratic-flux error $\varphi _2=\varphi _2(D_n,\boldsymbol {c},{{\mathcal {D}}})$ The required derivative is
The analogous functional derivative, $\partial \varphi _2/\partial {{\mathcal {D}}}$, is given in (3.19).
5.2. Generalized fixed-boundary optimization
We can consider the independent degree-of-freedom in the optimization to be the total normal field on the computational boundary, ${\boldsymbol {z}} \equiv B_{T,n}$ on ${{\mathcal {D}}}$. Here ${{\mathcal {D}}}$ is assumed to lie outside the plasma boundary and to remain fixed during the calculation. The equilibrium satisfies $\partial {{\mathcal {F}}}_t(\boldsymbol {x},B_{T,n},{{\mathcal {D}}})/\partial \boldsymbol {x}=0$. The required normal field is $D_n = B_{T,n}-B_{P,n}({{\mathcal {S}}},\boldsymbol {B}_T|_{{{\mathcal {S}}}_+},{{\mathcal {D}}})$. The coil geometry satisfies $\partial {{\mathcal {E}}}(\boldsymbol {c},D_n,{{\mathcal {D}}})/\partial \boldsymbol {c}=0$. Expanding and collecting terms, we obtain
where
and where
The analogous functional derivatives for $\partial {{\mathcal {E}}}/\partial \boldsymbol {c} \partial D_n$, $\partial B_{P,n} / \partial {{\mathcal {S}}}$ and $\partial B_{P,n} / \partial \boldsymbol {B}_T|_{{{\mathcal {S}}}}$ are given in (3.17), (3.24) and (3.25). Note the use of $\boldsymbol {B}_T|_{{{\mathcal {S}}}_-}$ in (5.5) and $\boldsymbol {B}_T|_{{{\mathcal {S}}}_+}$ in (5.10). The derivative of the quadratic-flux error is
The analogous functional derivative for $\partial \phi _2 /\partial D_n$ is given in (3.11).
5.3. Quasi-free-boundary optimization
In this case, we consider the independent degree-of-freedom in the optimization to be the required external normal field on the computational boundary, $\boldsymbol {z} \equiv D_n$ on ${{\mathcal {D}}}$. For the equilibrium calculation, we assume that the ‘actual’ external normal field is equal to the required external normal field, ${B_{E,n}} = D_n$; i.e. we assume that the equilibrium satisfies $\partial {{\mathcal {F}}}_e(\boldsymbol {x},D_n,{{\mathcal {D}}})/\partial \boldsymbol {x}=0$. The virtual casing integral is not explicitly required as it is implicitly included in the construction of the virtual-casing self-consistent vacuum field, (4.4). The coil geometry satisfies $\partial {{\mathcal {E}}}(\boldsymbol {c},D_n,{{\mathcal {D}}}) / \partial \boldsymbol {c} = 0$.
Expanding and collecting terms, we obtain
and for the quadratic-flux error
5.4. Free-boundary (coil) optimization
In this final case, the independent degree-of-freedom in the optimization is the coil geometry, $\boldsymbol {z} \equiv \boldsymbol {c}$. The equilibrium satisfies $\partial {{\mathcal {F}}}_e(\boldsymbol {x},B_{E,n},{{\mathcal {D}}})/\partial \boldsymbol {x}=0$, where $B_{E,n}=B_{E,n}[\boldsymbol {c},{{\mathcal {D}}}]$ is computed using the Biot–Savart integral.
We obtain
The analogous functional derivative for $\partial B_{E,n}/\partial \boldsymbol {c}$ is given in (3.13). The coil geometry is given directly, $\boldsymbol {c} = \boldsymbol {z}$.
Since the input to the equilibrium calculation, ${B_{E,n}}$, is computed directly from the actual coils there is no coil error, $\varphi _2 = 0$.
A summary of the 4 different combined plasma-coil optimization algorithms can be found in Table 1.
6. Discussion
In this paper, we have categorized and investigated four different combined plasma–coil optimization approaches and proposed a novel method for improving free-boundary MRxMHD equilibrium calculation. We began by summarizing all the functional derivatives of the MRxMHD energy,Footnote 16 the coil-penalty and the virtual-casing integral needed for a combined plasma–coil optimization. We emphasized that absence of an energy-principle formulation with an explicit boundary condition on the normal component of the external magnetic field on the computational boundary if it does not coincide with the plasma boundary. This construction of the field in the vacuum region is advantageous for the free-boundary optimization approach. For this special case, we proposed solving for the required field using a weak formulation. Finally, we explicitly stated which derivatives are necessary for the four distinct combined plasma–coil optimization algorithms: fixed-boundary optimization, generalized fixed-boundary optimization, quasi-free-boundary optimization and free-boundary (coil) optimization. To the best of our knowledge, all existing stellarator optimization algorithms can be grouped in either the fixed-boundary optimization or the free-boundary optimization approach. The collection of these four distinct optimization algorithms helps to clarify certain intrinsic problems of combined plasma–coil optimization, as we will discuss later in this section.
The novel proposed approach for calculating the virtual-casing self-consistent vacuum field will reduce the cost of the free-boundary calculation to something comparable to that of a fixed-boundary calculation. A full-Newton method, with analytic derivatives, can be used for both. In the proposed approach, in (4.4) appears the Laplace-virtual-casing matrix ${\mathcal {L}}_{vc}={\mathcal {A}}-\mathcal {B}$, which is an important matrix that deserves some discussion. This matrix, in a vague sense, ‘connects’ the plasma to the external magnetic field. Since it is not symmetric, it cannot directly follow from an energy principle, although there are roundabout ways to force this to happen (e.g. by introducing adjoint degrees of freedom). Investigation of the well-posedness of the Laplace-virtual-casing problem is the subject of future publication.
On the basis of the summary of the four distinct optimization algorithms presented in § 5, we now discuss the differences and potential advantages and disadvantages of the various optimization algorithms described above. The independent degree-of-freedom is (i) a two-dimensional surface, ${{\mathcal {S}}}(\theta ,\zeta )$, where $\theta$ and $\zeta$ are poloidal and toroidal angles, embedded in 3-D space for the fixed-boundary approach; (ii) a scalar function, ${B_{T,n}}(\theta ,\zeta )$, with the constraint that the net-flux is zero for the generalized fixed-boundary approach; (iii) a similar function, $D_n(\theta ,\zeta )$, for the quasi-free-boundary approach; and (iv) a discrete set of one-dimensional curves embedded in 3-D space, $\boldsymbol {c}_i(\theta )$ for $i = 1, \dots , N_C$, for the free-boundary (coil) approach. Note that a surface is really just one scalar function of two angles: even though some codes represent a surface using two functions, namely $R$ and $Z$, this introduces a tangential null space, which is removed by exploiting spectral condensation (Hirshman & Meier Reference Hirshman and Meier1985; Hirshman & Breslau Reference Hirshman and Breslau1998; Lee et al. Reference Lee, Harris, Hirshman and Neilson1988). Also, in (iv) we have restricted attention to when the external field is provided by a finite number of closed current filaments; other representations can be included.
For the quasi-free-boundary approach, the theory of ‘efficient fields’ can be used when $D_{n}$ is the independent degree-of-freedom (Landreman & Boozer Reference Landreman and Boozer2016) to determine which Fourier spectrum of the normal component of the magnetic field corresponds to distant coils. Using only efficient fields, the optimization space can be effectively reduced compared with the fixed-boundary and generalized fixed-boundary approaches. For the free-boundary (coil) algorithm, we expect that a concise parameterization of the coils can be introduced possibly including some engineering constraints which could also reduce the optimization space efficiently. For the other approaches, we suggest using some measure of the coil error in total cost function: the quadratic-flux error would be a good target, and targets weighting resonant normal field errors are probably better.
In each presented case, the $\partial ^{2}{\mathcal {F}}/\partial \boldsymbol {x}\partial \boldsymbol {x}$ matrix needs to be inverted to calculate $\partial \boldsymbol {x} / \partial \boldsymbol {z}$. This is the equilibrium stability matrix, and near marginal stability its eigenvalues vanish. This suggests that the optimization can encounter singular points. Physically, this has to do with the fact that if a plasma is only marginally stable the equilibrium can be changed substantially by even a small perturbation. This presumably would induce a large change in the coil geometry, whose optimal shape is therefore very uncertain. Marginal stability is a particularly relevant and important case. Toroidal confinement of fusion plasmas typically improves as the plasma pressure and/or currents increase, but pressure and currents ultimately are responsible for instabilities. An economically viable fusion reactor would, presumably, operate at the highest fusion-performance parameters but also sufficiently far from bifurcation boundaries in parameter space so that small variations could be controlled. In practice, it would be probably possible to avoid singular points of the stability matrix by including a target of stability in the stellarator optimization and by starting the optimization with a stable state.
Similarly, in all but the last case, the $\partial ^{2} {{\mathcal {E}}}/\partial \boldsymbol {c}\partial \boldsymbol {c}$ matrix needs to be inverted to calculate $\partial \boldsymbol {c}/\partial \boldsymbol {z}$. Small changes in the equilibrium may result in large changes to the coils if this matrix has a small eigenvalue.
Algorithms for combined plasma–coil optimization might become problematic near marginal stability boundaries, particularly if the response in either the plasma or the coils to small variations is not well understood. Depending on the plasma and coil properties chosen, there might be bifurcation boundaries in the optimization parameter space, which determine which initial conditions will converge to which local minima.
Marginal plasma stability is a lower-dimensional subspace of the full space of configurations, and it can be defined when there is a direction in which there is no variation, i.e. there are eigenvalues that are zeros. A similarly defined marginal stability subspace exists in the coil space. The stability boundary of the combined plasma–coil optimization, in the parameter space denoted by ${\boldsymbol {z}}$, will be a combination of both. Depending on the plasma, ${\mathcal {P}}({\boldsymbol {x}})$, and coil, ${\mathcal {C}}({\boldsymbol {c}})$, properties chosen, there may exist subspaces for which $\partial {\mathcal {P}} / \partial {\boldsymbol {x}}=0$ and $\partial {\mathcal {C}} / \partial {\boldsymbol {c}}=0$, and there may be additional bifurcation boundaries in ${\mathcal {T}}({\boldsymbol {z}})$, depending on what ${\mathcal {T}}({\boldsymbol {z}})$ is chosen to be. For robust multiobjective functional optimization, it would be advantageous to understand separately the stability boundaries in the plasma, coil and optimization spaces for a variety of relevant plasma and coil properties.
In this paper, we focused on the $\partial \boldsymbol {x}/\partial \boldsymbol {z}$ and $\partial \boldsymbol {c}/\partial \boldsymbol {z}$ part of (5.1), because these derivatives appear independently of which properties one wants to tailor in the optimization. We neglected the actual cost function with the equilibrium, ${\mathcal {P}}$, and coil, ${\mathcal {C}}$, properties and their derivatives because these quantities are problem specific; for example, a reactor-grade plasma has different design criteria than a research experiment. Recently, much work has been devoted to the development of adjoint methods and automatic differentiation to calculate derivatives of the plasma and coil properties, e.g. for the rotational transform, neoclassical transport and for the volume averaged $\beta$ (Landreman & Paul Reference Landreman and Paul2018; Paul et al. Reference Paul, Landreman, Bader and Dorland2018; Antonsen et al. Reference Antonsen, Paul and Landreman2019; Paul et al. Reference Paul, Abel, Landreman and Dorland2019; Carlton-Jones et al. Reference Carlton-Jones, Paul and Dorland2020). These methods promise significant advantages for stellarator optimization.
For both the equilibrium calculation and the coil calculation, both descent and Newton-style methods can be used. Descent methods can be employed for combined plasma–coil optimization. An interesting question is whether or not all these descent calculations can be performed simultaneously; for example,
where $\tau$ is an arbitrary integration parameter, and $\alpha$ and $\beta$ can be chosen for numerical stability so that, for example, the ‘inner’ equilibrium and coil geometry calculations are sufficiently converged so that the ‘outer’ optimization calculation is provided with sufficiently accurate information. The equations in (6.1a–c) can be thought of as a dynamical system; and, like most dynamical systems, the location and stability of the fixed points of (6.1a–c) gives important information about the ‘dynamics’ (Strogatz Reference Strogatz2018), which in this case is the convergence and stability properties of the combined plasma–coil optimization.
Acknowledgements
The lead author would like to thank J. Loizu, E. Paul, C. Nührenberg and B. Shanahan for helpful conversations. This work has been carried out in the framework of the EUROfusion Consortium and has received funding from the Euratom research and training programme 2014–2018 and 2019–2020 under grant agreement no. 633053. It was also supported by a grant from the Simons Foundation (560651, PH). The views and opinions expressed herein do not necessarily reflect those of the European Commission.
Editor William Dorland thanks the referees for their advice in evaluating this article.
Declaration of interests
The authors report no conflict of interest.