Abstract

agama is a publicly available software library for a broad range of applications in the field of stellar dynamics. It provides methods for computing the gravitational potential of arbitrary analytic density profiles or N-body models, orbit integration and analysis, transformations between position/velocity and action/angle variables, distribution functions expressed in terms of actions and their moments, and iterative construction of self-consistent multicomponent galaxy models. Applications include the inference about the structure of Milky Way or other galaxies from observations of stellar kinematics, preparation of equilibrium initial conditions for N-body simulations, and analysis of snapshots from simulations. The library is written in C++ , provides a python interface, and can be coupled to other stellar-dynamical software: amuse, galpy, and nemo. It is hosted at http://github.com/GalacticDynamics-Oxford/Agama.

1 INTRODUCTION

Galaxy models are vital for understanding their structure and evolution. The rapid increase in quantity and quality of observational data, both for Milky Way and external galaxies, calls for similar advances in modelling techniques. One of the most powerful approaches describes the stars and other mass components by distribution functions (DF) in the space of integrals of motion. For several reasons discussed later in the paper, actions are the most appropriate choice for these integrals of motion. A DF provides a complete description of the system, and various other properties (density, velocity distributions, etc.) can be computed from a DF in a given potential. A flexible representation of the gravitational potential is also a necessity. A dynamically self-consistent model of a stellar system implies certain relations between the potential and the DF; depending on the scientific context, this may or may not be required.

This paper presents a software framework for galaxy modelling – agama. It provides necessary tools for the construction of customized dynamical models described by distribution functions in action space, but many parts of the library, such as general-purpose potential solvers, are applicable to a broader range of problems in stellar dynamics. It is organized into several modules, starting from basic mathematical routines to a complete framework for constructing galaxy models, which are presented in more detail in the following sections:

  • Gravitational potentials, including a few commonly used analytical potential-density pairs and two versatile potential expansions that can be constructed from an arbitrary density distribution or from an N-body model (Section 2).

  • Transformation between coordinate/velocity and action/angle variables, in particular, a new implementation of the Stäckel fudge (Section 3).

  • Several types of DFs expressed in terms of actions (including a new class of disc DF), and associated routines for computing DF moments and creating an N-body representation of a DF in a given potential (Section 4).

  • The framework for iterative construction of self-consistent galaxy models (Section 5).

We illustrate some of the possible applications and compare agama to other similar software projects in Section 6.

The agama library is written primarily in C++ to achieve maximum efficiency, and has a python interface offering greater flexibility for practical work. It is distributed with many example programs both in C++ and python, illustrating various aspects of its use (we present a few short python listings in the paper). It also includes several other stellar-dynamical software projects: an updated version of the Monte Carlo simulation code raga (Vasiliev 2015), the Fokker–Planck code phaseflow (Vasiliev 2017), and parts of the Schwarzschild orbit-superposition code smile (Vasiliev & Valluri, in preparation; Vasiliev 2013). Extensive documentation (Vasiliev 2018) describes the structure and usage of the library, and also contains a more technical description of various methods used in the code (including original implementations of many mathematical tasks such as spline approximation and fitting, sampling from multidimensional distribution functions, etc.) that we do not repeat in this paper.

agama has interfaces to several other astrophysical frameworks: the potential approximations can be used in amuse (Portegies Zwart et al. 2013), galpy (Bovy 2015), and nemo (Teuben 1995), while the routines for action/angle conversion may serve as a more efficient drop-in replacement for the ones in galpy. The library is publicly available at http://github.com/GalacticDynamics-Oxford/Agama.

2 POTENTIALS

At the heart of any galaxy modelling framework lies the gravitational potential. Some commonly used choices (e.g. Hernquist, Miyamoto–Nagai, or Navarro–Frenk–White models) have analytic expressions for the potential only in spherical or at most axisymmetric cases; some other models (e.g. Sérsic, exponential disc, or double-power-law halo profiles) require numerical integration or evaluation of expensive special functions. In a general case, one needs to solve the Poisson equation
\begin{eqnarray*} \nabla ^2 \Phi (\boldsymbol{x}) = 4\pi \, G\, \rho (\boldsymbol{x}) \end{eqnarray*}
(1)
to obtain the potential Φ of a given density profile ρ. agama provides a few standard potential–density pairs and two general-purpose potential solvers suitable for a wide range of applications.

2.1 Spherical-harmonic expansion

Mathematically, the potential can be expressed directly as a 3D integral of the product of the Green’s function and the density:
\begin{eqnarray*} \Phi (\boldsymbol{x}) = -G \int\!\!\!\int\!\!\!\int d^3x^{\prime } \, \rho (\boldsymbol{x}^{\prime }) \times \frac{1}{|\boldsymbol{x}-\boldsymbol{x}^{\prime }|} . \end{eqnarray*}
(2)
This is computationally challenging both because of the need to compute a triple integral over infinite domain for each point, and because the Green’s function is singular. Nevertheless, this approach has been used in the literature (e.g. Fragkoudi et al. 2015).
A commonly used alternative is to replace two of the three integrals by sums over a certain set of basis functions, such as spherical harmonics (Chapter 2.4 Binney & Tremaine 2008). Namely, we write the density and potential in spherical coordinates as
\begin{eqnarray*} \lbrace \rho ,\Phi \rbrace (r,\theta ,\phi) = \sum _{l=0}^\infty \sum _{m=-l}^{l} \lbrace \rho ,\Phi \rbrace _{lm}(r)\, Y_l^m(\theta ,\phi), \end{eqnarray*}
(3)
where |$Y_l^m$| are the real-valued spherical harmonics (products of associated Legendre functions in cos θ and trigonometric functions in mϕ). Thanks to the orthogonality of the spherical-harmonic basis, the coefficients of this expansion at each radius ρlm can be expressed through integrals of the density over two angular variables:
\begin{eqnarray*} \rho _{lm}(r) = \int _0^{\pi } \mathrm{ d}\theta \int _0^{2\pi } \mathrm{ d}\phi \, \rho (r,\theta ,\phi) \, Y_l^{m}(\theta ,\phi) \, A_{lm}, \end{eqnarray*}
(4)
where Alm are normalization constants. Finally, the coefficients of potential expansion are given by 1D integrals:
\begin{eqnarray*} \Phi _{lm}(r) = -\frac{4\pi G}{2l+1}\, \bigg [\, r^{-1-l} & \int _0^r \mathrm{ d}r^{\prime }\, \rho _{lm}(r^{\prime })\, r^{\prime \, l+2}\,\,+ \nonumber \\ r^l & \int _r^\infty \mathrm{ d}r^{\prime }\, \rho _{lm}(r^{\prime })\, r^{\prime \, 1-l} \bigg ]. \end{eqnarray*}
(5)

For this approach to be practical, one needs to restrict the range of summation over l, m in the above formulae to some maximum values 0 ≤ llmax, |m| ≤ min(l, mmax); mmax may be smaller than lmax or even zero for axisymmetric systems. For systems that are not too far from spherical (e.g. elliptical galaxies), lmax could be as small as 4–8, and in many cases even two terms (monopole and quadrupole) are sufficient. On the other hand, for strongly flattened systems one needs to use many terms in the expansion (for instance, lmax = 36 in Holley–Bockelmann, Weinberg & Katz 2005), which is quite demanding computationally.

An elegant modification of this approach was introduced by Kuijken & Dubinski (1995) for axisymmetric discy systems with a separable density profile: ρ(R, |$z$|⁠) = ρR(R|$z$|(⁠|$z$|⁠). The key point is that the potential may be split into two parts: one given by an analytic expression involving ρR and the second antiderivative of ρ|$z$|(⁠|$z$|⁠), and the remainder corresponding to a residual density profile. The latter is not strongly confined to the plane |$z$| = 0 and is efficiently represented by a moderate number of spherical harmonic coefficients. The implementation by W. Dehnen is known as galpot and has been used in many studies, starting from Dehnen & Binney (1998); a nearly equivalent implementation is included in agama. On the other hand, it is restricted to double-exponential axisymmetric disc profiles, and even though the method can be extended to more general density models, the constraints of separability and axisymmetry cannot be lifted.

2.2 Azimuthal-harmonic expansion

Another less well-known possibility is conceptually in between the two described approaches. Namely, the density and potential are both expanded in Fourier harmonics in the azimuthal angle ϕ, but the expressions relating each harmonic coefficient of potential to its corresponding density coefficient are given by 2D integrals:
\begin{eqnarray*} \lbrace \rho ,\Phi \rbrace (R,z,\phi) &= \sum _{m=0}^\infty \lbrace \rho ,\Phi \rbrace _m(R,z)\, \exp (im\phi), \end{eqnarray*}
(6)
\begin{eqnarray*} \Phi _m(R,z) = -G\int_{-\infty}^{\infty} dz^{\prime } \int _0^\infty dR^{\prime }\, \rho _m(R^{\prime },z^{\prime })\, \Xi _m(R,z,R^{\prime },z^{\prime }), \end{eqnarray*}
where Ξm is the Green’s function in cylindrical coordinates, which has an analytic expression in terms of the Legendre function of the second kind Qm − 1/2 (Cohl & Tohline 1999). In the axisymmetric case, only m = 0 term is needed, and in general the above sum may be truncated at a rather moderate mmax ≲ 10. This potential solver has been introduced in the context of Schwarzschild modelling (Vasiliev & Athanassoula 2015) and upgraded for the current implementation.

2.3 Comparison of the two approaches

In both spherical-harmonic (5) and azimuthal-harmonic (6) cases, the potential coefficients Φlm(r) and Φm(R, |$z$|⁠) are pre-computed at the nodes of a 1D grid in spherical radius r or a 2D grid in the meridional plane, using a moderate number of points (few tens per dimension). The global potential approximation is then constructed by creating interpolating splines in suitably scaled coordinates, which allows one to compute the potential and its two derivatives at any point in space in a very efficient way. In the present implementation, we additionally pre-compute the derivatives of the potential at the grid points, which allows the use of a high-accuracy quintic spline interpolation, introduced in Dehnen & Binney (1998). Moreover, following the latter paper, the spherical-harmonic expansion with 1D radial spline interpolation is actually converted to a 2D spline in the r, θ plane for each value of m, which is more efficient than evaluation of Legendre polynomials if the number of harmonics lmax > 2. Note that the difference with the azimuthal-harmonic expansion is in that in the latter case, the 2D spline is constructed for a rectangular grid in R, |$z$| coordinates (see also Fig. 6 in Section 5.2 for an illustration). The computational effort needed to evaluate the potential and/or its derivatives is thus similar for both methods and comparable to that of galpot. However, the potentials in agama can be constructed for any geometry (axisymmetric, triaxial, or even more general), while the galpot approach is limited to axisymmetry.

The two approaches based on harmonic expansion are complementary to each other in several aspects. While the potential evaluation through 2D integration is more accurate than the spherical-harmonic expansion in the case of a highly flattened density distribution, it is also much more computationally intensive to construct. Moreover, it is only accurate enough for density profiles with a finite value at origin and a steep decline at large radii, since the grid in the meridional plane may only cover a finite volume, and the potential outside the grid is extrapolated using a low-degree multipole expansion with a zero Laplacian (hence the extrapolated density is zero). On the other hand, the radial grid in the spherical-harmonic potential approximation covers a wide range of radii (with equally spaced nodes in log r), and the extrapolation to small and large radii is able to represent power-law density profiles outside the grid fairly accurately. Thus the latter approximation is more suitable for spheroidal galaxy components, such as the bulge and the halo, while the former is adequate for the disc component. One may get the best of both worlds by combining the two potential expansions, each one representing one or more density components, into a composite potential.

2.4 Smooth approximations to N-body potentials

Both potential expansions may be computed either from an analytic density profile, or from a set of N point masses, thus approximating the potential of an N-body system by a smooth one. In fact, as shown in Vasiliev (2013) and Vasiliev & Athanassoula (2015), in test cases when the particle positions are sampled from a known analytic density model, the smooth potential expansion more closely matches the corresponding analytic profile than a potential computed directly from the N-body snapshot, e.g. using a tree-code method, while also being much faster to evaluate. Of course, in this case the error in potential is dominated by discreteness noise rather than interpolation error, but typical N-body potential solvers suffer even more from this noise. Thus the potential expansion approach is well suited for analysing the properties of orbits in a ‘frozen’ potential of an N-body system, and could be easily extended to represent a time-dependent potential whose coefficients are interpolated in time, providing a more flexible alternative to parametrized analytic models (used e.g. in Muzzio, Carpintero & Wachlin 2005; Machado & Manos 2016) or tree-code potentials (e.g. Valluri et al. 2010). The computation of a smooth potential from discrete samples is at the core of the Monte Carlo simulation code raga (Vasiliev 2015), which is also included in the framework.

2.5 Miscellanea

For completeness, we mention another approach that replaces all three integrals in (2) with sums over members of a basis set, which typically consists of products of spherical-harmonic functions and certain families of orthogonal polynomials in scaled radial coordinate (e.g. Zhao 1996; Lilley et al. 2018). This approach is commonly associated with the ‘self-consistent field’ method (Hernquist & Ostriker 1992; Meiron et al. 2014) and is similar to the spline-interpolated spherical-harmonic expansion, but is less flexible and more computationally demanding. To compute the potential of a basis-set expansion, one needs to sum over all basis functions, while for the case of spline interpolation only the values at two adjacent grid nodes are required. Vasiliev (2013) compared both approaches and found the spline interpolation to be generally faster and more accurate; therefore, the basis-set expansion is no longer included in the library.

Listing 1 illustrates the construction and use of various potential types, and the creation of a smooth potential approximation from an N-body snapshot (in this example, the snapshot itself was generated from a smooth density profile; in realistic applications it would be taken from an N-body simulation).

Given a potential, it is straightforward to numerically integrate orbits; we use a modified version of the eight order Runge–Kutta integrator dop853 (Hairer, Nørsett & Wanner 1993). All potentials in the library provide up to two derivatives, which may be used for computing the largest Lyapunov exponent (the corresponding routine is included in agama) or other variational chaos indicators (e.g. Carpintero, Maffione & Darriba 2014).

3 ACTION/ANGLE VARIABLES

agama deals with models of stellar systems expressed in terms of action/angle variables |$\boldsymbol{J},\, \boldsymbol{\theta}$| (see Binney & Tremaine 2008, Section 3.5, for the definition), and provides several methods for conversion between position/velocity and action/angle variables. The most practically important one is the axisymmetric Stäckel fudge (Binney 2012), for which our implementation is more efficient and accurate than other existing codes, thanks to an improved method for choosing the focal distance. In this section we review the action/angle formalism and present in detail the approach used in the library.

3.1 General properties

Strictly speaking, action/angle variables are only well-defined when the motion is integrable (multiperiodic); the meaning of actions depends on the orbit type, and in a spherical or an axisymmetric potential, the most convenient choice is the triplet {Jr, J|$z$|, Jϕ}. The first two (radial and vertical actions) describe the extent of oscillations in spherical radius and vertical dimension (for a nearly circular orbit in the equatorial plane they decouple, otherwise the motion is qualitatively similar but more complex), while the third (azimuthal action) is the conserved component L|$z$| of the angular momentum. In the spherical case J|$z$| + |Jϕ| is the total angular momentum L.

Actions are just another set of integrals of motion, so they can be expressed in terms of more familiar integrals such as energy E, angular momentum L or L|$z$|, and the non-classical third integral I3 (assuming that it exists). They have certain advantages over other choices of integrals:

  • Action/angle variables are canonical, i.e. satisfy the Hamilton’s equations of motion, which in this case are trivial: |$\boldsymbol{J} = \mathrm{const}$|⁠, |$\dot{\boldsymbol{\theta }} = \mathrm{const} = \boldsymbol{\Omega } \equiv \partial H/\partial \boldsymbol{J}$|⁠, where H is the Hamiltonian expressed in terms of actions.

  • The possible range of each action variable is [0, ∞) or (− ∞, ∞), independently of others (unlike, say, E and L).

  • The transformation between |$\boldsymbol{x}, \boldsymbol{v}$| and |$\boldsymbol{J}, \boldsymbol{\theta}$| has unit determinant, which is convenient for dealing with DFs (for instance, the mass of the system is given by |$\int f(J)\,\, \mathrm{ d}^3J\, \mathrm{ d}^3\theta$| independently of the potential), see Section 4.

  • Actions are adiabatical invariants, conserved under slow changes in the potential. This is convenient for iterative self-consistent modelling (Section 5), and simplifies the treatment of multicomponent systems (e.g. growing a stellar disc in a dark halo preserves the action-based DF of the latter, even though the correspondence between actions and coordinates changes).

  • Action/angle variables are naturally suited to analyse perturbations from equilibrium state (e.g. Monari, Famaey & Siebert 2016) and collisional relaxation (e.g. Fouvry et al. 2015).

There exist several methods for conversion between position/velocity and action/angle variables (see Sanders & Binney 2016 for a comprehensive review and comparison of approaches). It can be performed in both directions analytically in special cases, such as the Isochrone potential, or using 1D numerical quadratures for an arbitrary Stäckel potential (in particular, a spherical one); these methods are available in the library. For a more interesting practical case of an arbitrary axisymmetric potential, we use the ‘Stäckel fudge’ (Binney 2012), which is an approximate method for computing actions and angles under the assumption that the motion is integrable and is locally well described by a Stäckel potential, separable in prolate spheroidal coordinates. agama contains a fresh implementation of this approach, which is more efficient and accurate than other existing codes (tact, Sanders & Binney 2016, or galpy, Bovy 2015). The reverse transformation (from action/angles to position/velocity) is provided by the torusmapper package (Binney & McMillan 2016), which is included in the agama library with some modifications (most notably, an improved angle mapping method used by Binney & Kumar 1993, Laakso & Kaasalainen 2013). At the moment the action/angle framework applies only to oblate axisymmetric potentials, although there exist methods suitable for triaxial potentials (Sanders & Binney 2014, 2015a).

3.2 Stäckel fudge and the role of focal distance

We now recall the key ingredients of the Stäckel fudge method, which is the main workhorse in agama.

Actions and angles can be computed exactly for a general potential separable in a confocal ellipsoidal coordinate system (e.g. de Zeeuw 1985). We specialize to the case of oblate axisymmetric potentials, as being more relevant for flattened disc galaxies. It is possible to extend the formalism to prolate potentials, but an additional complication will be the existence of two types of tube orbits (inner and outer long-axis tubes), hence it is not implemented at the moment. The potential is expressed in a prolate spheroidal coordinate system defined by the focal distance Δ. The transformation between the cylindrical coordinates R, |$z$| and spheroidal coordinates λ, ν is given, e.g. by equation 8 in Sanders (2012). A Stäckel potential has a form
\begin{eqnarray*} \Phi (\lambda ,\nu) = -\frac{f_\lambda (\lambda)-f_\nu (\nu)}{\lambda -\nu }, \end{eqnarray*}
(7)
in other words, instead of being an arbitrary function of two coordinates, it is defined by two 1D functions. An orbit in such a potential respects three integrals of motion – energy E, |$z$|-component of angular momentum L|$z$|, and a non-classical third integral 1
\begin{eqnarray*} I_3 \equiv \frac{z^2\, v_\phi ^2 + (zv_R-Rv_z)^2 + \Delta ^2\, v_z^2}{2} + \frac{\lambda f_\nu (\nu) - \nu f_\lambda (\lambda)}{\lambda -\nu }, \end{eqnarray*}
(8)
which ranges from I3 = 0 for an equatorial-plane orbit with J|$z$| = 0 to |$I_3^\mathrm{max}\equiv (R_\mathrm{shell}^2+\Delta ^2)\, v_z^2/2$| for a thin (‘shell’) orbit that crosses the plane |$z$| = 0 at a single radius Rshell(E, L|$z$|) with vertical velocity |$v$||$z$| and has Jr = 0. Given the three integrals, one can numerically solve the equations defining the turn-around points (min/max values of λ, ν), and then compute the actions, angles, and frequencies by 1D numerical integration.

The essence of the Stäckel fudge is to use the same expressions, but substitute the real potential instead, which is not of a Stäckel form (without explicitly constructing an approximating Stäckel potential, as in Sanders 2012). Actions computed in this way are approximate, i.e. not conserved along a numerically integrated orbit. The accuracy of approximation depends on the only free parameter in the method – the focal distance Δ, which can be different for each point. We now discuss various methods for choosing Δ and introduce a new one, which appears to be close to optimal.

In a Stäckel potential, Δ can be found from the combination of second derivatives of the potential (equation 15 in Sanders & Binney 2016), hence one possible choice is to use this equation in the real potential at each input point (dubbed ‘Fudge v1’ in that paper, and shown by red curves labelled [1] in Figs 13). Another possibility is to relate Δ to the quantities conserved for each orbit, namely the two classical integrals of motion E and L|$z$|. Since the focal distance is more important for high-inclination orbits and irrelevant for in-plane orbits, Binney (2014) suggested to estimate it for the extreme ‘shell’ orbit, constructed numerically (for each pair E, L|$z$|, locating the radius Rshell such that an orbit launched vertically from the equatorial plane returns back to the same radius). Then Δ can be computed as the focal distance of an ellipse that best matches the shell orbit. A 2D interpolation grid for Δ(E, L|$z$|) is pre-initialized for the given potential and used to find the focal distance for any input point; this method is dubbed ‘Fudge v2’ in Sanders & Binney (2016) and shown by blue lines [2].

Accuracy of the Stäckel fudge for various methods of computing the focal distance Δ. Shown is the relative rms variation of the radial (top panel) and vertical (bottom panel) action for a series of orbits in a realistic galactic potential (Piffl et al. 2014), started at R = 8.3 kpc, $z$ = 0, with a total velocity 240 km s−1 (equal to the local circular speed), split in different proportions between $v$R, $v$$z$ = 0.8$v$R, and $v$ϕ. These initial conditions are similar to the ones considered in Section 5.2 and plotted in Fig. 3 of Sanders & Binney (2016), except that we keep the total velocity constant, not $v$ϕ, and plot the relative rms variation instead of the absolute one. We numerically integrate each orbit for 10 dynamical times and compute the variation of the actions evaluated at 1000 points sampled from the trajectory. Dashed blue curve [2] corresponds to Δ being interpolated from a pre-initialized 2D grid in E, L$z$ using ellipse fits to shell orbits, as in Binney (2014); it matches the ‘Fudge v2’ method in Sanders & Binney (2016). Red dotted line [1] corresponds to Δ being evaluated separately for each input point from the second derivatives of the potential (equation 15 in that paper), it matches the ‘Fudge v1’ method. Cyan dot-dashed line [3] uses a similar approach, but taking the median value of Δ for the entire trajectory (as used in galpy). Green solid line [4] takes Δ from a pre-initialized 2D grid in E, L$z$, using the condition that Jr = 0 for shell orbits (equation 9); this is the new method introduced in agama. Yellow long-dashed line [5] uses the same Δ, but interpolates actions from a pre-initialized 3D grid in E, L$z$, I3. Naturally, the accuracy is best for low-eccentricity orbits (with Jr, J$z$ ≪ Jϕ), and worst for resonant orbits (several strong peaks in both plots). For the former, the median value of Δ for the entire orbit [5] delivers the highest accuracy, but of course requires the orbit to be computed first. Given just a single point, equation (9) produces the most accurate estimate [4], and its interpolated variant [5] is only moderately worse but ∼10× faster.
Figure 1.

Accuracy of the Stäckel fudge for various methods of computing the focal distance Δ. Shown is the relative rms variation of the radial (top panel) and vertical (bottom panel) action for a series of orbits in a realistic galactic potential (Piffl et al. 2014), started at R = 8.3 kpc, |$z$| = 0, with a total velocity 240 km s−1 (equal to the local circular speed), split in different proportions between |$v$|R, |$v$||$z$| = 0.8|$v$|R, and |$v$|ϕ. These initial conditions are similar to the ones considered in Section 5.2 and plotted in Fig. 3 of Sanders & Binney (2016), except that we keep the total velocity constant, not |$v$|ϕ, and plot the relative rms variation instead of the absolute one. We numerically integrate each orbit for 10 dynamical times and compute the variation of the actions evaluated at 1000 points sampled from the trajectory. Dashed blue curve [2] corresponds to Δ being interpolated from a pre-initialized 2D grid in E, L|$z$| using ellipse fits to shell orbits, as in Binney (2014); it matches the ‘Fudge v2’ method in Sanders & Binney (2016). Red dotted line [1] corresponds to Δ being evaluated separately for each input point from the second derivatives of the potential (equation 15 in that paper), it matches the ‘Fudge v1’ method. Cyan dot-dashed line [3] uses a similar approach, but taking the median value of Δ for the entire trajectory (as used in galpy). Green solid line [4] takes Δ from a pre-initialized 2D grid in E, L|$z$|, using the condition that Jr = 0 for shell orbits (equation 9); this is the new method introduced in agama. Yellow long-dashed line [5] uses the same Δ, but interpolates actions from a pre-initialized 3D grid in E, L|$z$|, I3. Naturally, the accuracy is best for low-eccentricity orbits (with Jr, J|$z$|Jϕ), and worst for resonant orbits (several strong peaks in both plots). For the former, the median value of Δ for the entire orbit [5] delivers the highest accuracy, but of course requires the orbit to be computed first. Given just a single point, equation (9) produces the most accurate estimate [4], and its interpolated variant [5] is only moderately worse but ∼10× faster.

Accuracy of frequency determination by the Stäckel fudge for various choices of focal distance. This plot is similar to Fig. 1 and uses the same orbits and methods (except the interpolated one, in which frequency determination is not implemented), but shows the relative rms. variation of three frequencies; it may be compared to fig. 4 in Sanders & Binney (2016). Again the choice of Δ in agama (green solid line [4]) produces the most accurate results.
Figure 2.

Accuracy of frequency determination by the Stäckel fudge for various choices of focal distance. This plot is similar to Fig. 1 and uses the same orbits and methods (except the interpolated one, in which frequency determination is not implemented), but shows the relative rms. variation of three frequencies; it may be compared to fig. 4 in Sanders & Binney (2016). Again the choice of Δ in agama (green solid line [4]) produces the most accurate results.

Accuracy of the Stäckel fudge for various choices of focal distance Δ. We take a large sample of orbits started at R = 8.3 kpc, $z$ = 0, and randomly directed velocity $v$ = 240 km s−1 in the Milky Way potential from Piffl et al. (2014), covering the range from thin disc to halo stars. For each numerically integrated orbit we compute the relative rms variation of the sum of radial and vertical actions, and plot the median error (solid lines) and 90 per cent percentile (dotted lines) for the same five methods as in Fig. 1). The ranking of the methods turns out to be the same, with the one used in agama being optimal among the methods that use only a single point on the trajectory, and its interpolated version being on average 50 per cent less accurate. However, a significant fraction of high-eccentricity orbits turn out to be resonant, with much larger variations of actions (seen as a rapid increase in the 90 per cent percentile error for (Jr + J$z$)/Jϕ ≳ 0.02). For these orbits the Stäckel fudge is not accurate regardless of the choice of Δ. We checked that the ranking between methods is similar for other values of energy and choices of potential, although the severity of the accuracy deterioration near resonances depends on the properties of potential: for instance, it is almost negligible for the MWPotential2014 used in Bovy (2015), because the latter does not have a thin gaseous disc, which creates strong gradients in vertical frequency in the potential models used by Piffl et al. (2014); McMillan (2017).
Figure 3.

Accuracy of the Stäckel fudge for various choices of focal distance Δ. We take a large sample of orbits started at R = 8.3 kpc, |$z$| = 0, and randomly directed velocity |$v$| = 240 km s−1 in the Milky Way potential from Piffl et al. (2014), covering the range from thin disc to halo stars. For each numerically integrated orbit we compute the relative rms variation of the sum of radial and vertical actions, and plot the median error (solid lines) and 90 per cent percentile (dotted lines) for the same five methods as in Fig. 1). The ranking of the methods turns out to be the same, with the one used in agama being optimal among the methods that use only a single point on the trajectory, and its interpolated version being on average 50 per cent less accurate. However, a significant fraction of high-eccentricity orbits turn out to be resonant, with much larger variations of actions (seen as a rapid increase in the 90 per cent percentile error for (Jr + J|$z$|)/Jϕ ≳ 0.02). For these orbits the Stäckel fudge is not accurate regardless of the choice of Δ. We checked that the ranking between methods is similar for other values of energy and choices of potential, although the severity of the accuracy deterioration near resonances depends on the properties of potential: for instance, it is almost negligible for the MWPotential2014 used in Bovy (2015), because the latter does not have a thin gaseous disc, which creates strong gradients in vertical frequency in the potential models used by Piffl et al. (2014); McMillan (2017).

We find that a more suitable choice of Δ is obtained if one demands that Jr = 0 for the same shell orbit. Namely, we write the momentum pλ, canonically conjugate to the radial coordinate λ, as a function of λ, Φ(λ, ν = 0), E, L|$z$|, and I3 (the latter depends on Δ). For a shell orbit, the range of oscillation in λ, determined by the condition |$p_\lambda ^2\ge 0$|⁠, collapses to a single point |$\lambda _\mathrm{shell}=R_\mathrm{shell}^2+\Delta ^2$|⁠; hence |$\partial p_\lambda ^2/\partial \lambda =0$|⁠, which translates to
\begin{eqnarray*} \Delta ^2 = R_\mathrm{shell}^2\,\, \left. \frac{2 \big [E - \Phi \big ] - R \frac{\partial \Phi }{\partial R} }{R \frac{\partial \Phi }{\partial R} - \frac{L_z^2}{R^2} } \right|_{R = R_\mathrm{shell},z=0}. \end{eqnarray*}
(9)

In some cases, this expression produces negative Δ2, which are replaced by zeros (essentially rendering the approximate Stäckel potential spherical). We pre-initialize a 2D interpolation table for Δ(E, L|$z$|), which is a one-time effort for the given potential, requiring a few CPU seconds. Figs 13 compare the accuracy of Stäckel fudge for various choices of Δ, demonstrating that equation (9), shown by the green curve [4], delivers better results than either of the two methods used in Sanders & Binney (2016).

In principle, a further improvement (cyan curve [5]) can sometimes be achieved by taking a median value of Δ from potential derivatives along a numerically computed trajectory (this method is offered in galpy); however, the overhead of orbit integration makes it impractical for calculating the actions at a single point. galpy does not contain any method to retrieve a suitable Δ from an interpolation table, hence a single value has to be used for the entire system, substantially deteriorating the accuracy of action computation.

We checked that for a given Δ, the numerical procedures used in tact, galpy, and agama produce nearly identical results; therefore, the improvement in accuracy in agama arises from a more judicious assignment of Δ that varies across the phase space. The cost of action computation for each input point comes from numerical root-finding and integration routines, which require ∼50 potential evaluations; this is 1.5 − 2 × fewer in agama than in other existing implementations, thanks to various mathematical improvements.

3.3 Interpolation

One could achieve considerable savings by pre-computing a 3D interpolation table for actions Jr, J|$z$| as functions of three integrals of motion (one of them being approximate), as part of the initialization procedure, approximately doubling its computational cost. For an orbit started at R = Rshell(E, L|$z$|), |$z$| = 0 and meridional velocity |$v_\mathrm{mer}=\sqrt{2\big [E-\Phi (R_\mathrm{shell})\big ] - (L_z/R_\mathrm{shell})^2}$| directed at an angle ψ to the equatorial plane, the value of the third integral (8) is |$I_3=(R_\mathrm{shell}^2+\Delta ^2)\, v_\mathrm{mer}^2\, \sin ^2\psi$|⁠, varying from 0 for an in-plane orbit (ψ = 0) to |$I_3^\mathrm{max}(E,L_z)$| for a shell orbit (ψ = π/2). For each combination of E and L|$z$| used in constructing the interpolator for Δ, we pre-compute Jr and J|$z$| on a grid in ψ (or equivalently |$I_3/I_3^\mathrm{max}$|⁠) and construct 3D cubic spline interpolators in E, L|$z$|/Lcirc(E) and ψ for both actions. Then for an arbitrary point |$\lbrace \boldsymbol{x},\boldsymbol{v}\rbrace$| we need to determine the value of I3 to retrieve the actions from the interpolation table. If the potential were of the Stäckel form (7), the third integral would be given by equation (8). One could compute it without explicitly specifying the functions fλ(λ), fν(ν), by replacing the second term in equation (8) by either of the two expressions:
\begin{eqnarray*} f_\lambda (\lambda) + \lambda \Phi (\lambda ,\nu) &= \lambda \,\,\big [ \Phi (\lambda ,\nu) - \Phi (\lambda ,0) \big ], \end{eqnarray*}
(10a)
\begin{eqnarray*} f_\nu (\nu) + \nu \Phi (\lambda ,\nu) &= \nu \,\,\big [ \Phi (\lambda ,\nu) - \Phi (\lambda _\mathrm{shell},\nu) \big ] \nonumber \\ &+ \lambda _\mathrm{shell}\,\,\big [ \Phi (\lambda _\mathrm{shell},\nu) - \Phi (\lambda _\mathrm{shell},0) \big ], \end{eqnarray*}
(10b)

where the right-hand sides only contain the values of potential at certain points. For a non-Stäckel potential, these expressions are not equivalent and yield different numerical values for I3, which furthermore vary along the orbit (Fig. 4, bottom right-hand panel). The former one is related to the radial energy Er (equation 18 Binney 2012), and the latter – to the vertical energy E|$z$| (equation 2 Bovy 2015). The latter paper suggested to use both approximate values of I3 to compute interpolated Jr and J|$z$|, correspondingly, whereas Binney’s approach only uses the latter one for both actions. We also do the same since we find that (10b) is typically better conserved along orbits.

Illustration of the Stäckel fudge and its interpolated variant. Left-hand panel shows the meridional cross-section of an orbit in a realistic axisymmetric Milky Way potential. The orbit starts at point S with R = Rshell(E, L$z$), $z$ = 0, and meridional velocity directed at ∼50° to the equatorial plane and being 4.5 larger than the azimuthal velocity (hence the orbit is fairly eccentric and the inaccuracies in action computation are greater than for the majority of stellar orbits). Blue curve shows the initial part of the trajectory, and the entire region spanned by the orbit is shaded in grey. Dotted grey lines show the auxiliary prolate spheroidal coordinate system, and black dots on the $z$ axis show its focal points located at $z$ = ±Δ. To compute actions for any point P, we first determine the extent of oscillation in the radial (λ) and vertical (ν) directions, and then integrate the canonically conjugate momenta pλ, pν along the lines of constant λ and ν, shown by green dots. If the potential were of a Stäckel form, the orbit would align with the prolate spheroidal coordinates, and its extent would be the same for all points on the trajectory; in reality the match is not perfect, and consequently the determined orbit boundaries vary between points (red and green boxes for the points S and P, respectively). Alternatively, we ‘translate’ the point P back to the initial point S and determine the value of I3 that is related to the initial velocity direction; the actions are then interpolated from a pre-computed table, which was filled using integration contours originating from S (dotted red curves). There are various possible ways of computing I3: by following the line of λ = const. from P to the point L and then the line ν = const. from L to S (equation 10a, resulting in dotted purple curves), or in the opposite order, from P to N and then to S (equation 10b, dashed yellow curves). Right-hand panels show the variation of actions and I3 along this orbit; the two points S and P are marked by vertical dotted lines. The two methods for estimating I3 produce different results, as shown in the bottom sub-panel. The corresponding interpolated actions are shown in the same colour and line style in the other two sub-panels, and their variation mirrors that of I3. The actions computed by the non-interpolated Stäckel fudge are shown in green curves and exhibit smaller variations, because the errors in determining the orbit boundaries are somewhat compensated by using different integration paths for each point. Hence the interpolated Stäckel fudge is generally less accurate not because of interpolation itself, but because of larger fluctuations in the approximate third integral. Moreover, the fluctuations in actions are anticorrelated, therefore the value of a typical distribution function that depends on a linear combination of Jr and J$z$, such as equation (12), is computed more accurately than either of the two actions.
Figure 4.

Illustration of the Stäckel fudge and its interpolated variant. Left-hand panel shows the meridional cross-section of an orbit in a realistic axisymmetric Milky Way potential. The orbit starts at point S with R = Rshell(E, L|$z$|), |$z$| = 0, and meridional velocity directed at ∼50° to the equatorial plane and being 4.5 larger than the azimuthal velocity (hence the orbit is fairly eccentric and the inaccuracies in action computation are greater than for the majority of stellar orbits). Blue curve shows the initial part of the trajectory, and the entire region spanned by the orbit is shaded in grey. Dotted grey lines show the auxiliary prolate spheroidal coordinate system, and black dots on the |$z$| axis show its focal points located at |$z$| = ±Δ. To compute actions for any point P, we first determine the extent of oscillation in the radial (λ) and vertical (ν) directions, and then integrate the canonically conjugate momenta pλ, pν along the lines of constant λ and ν, shown by green dots. If the potential were of a Stäckel form, the orbit would align with the prolate spheroidal coordinates, and its extent would be the same for all points on the trajectory; in reality the match is not perfect, and consequently the determined orbit boundaries vary between points (red and green boxes for the points S and P, respectively). Alternatively, we ‘translate’ the point P back to the initial point S and determine the value of I3 that is related to the initial velocity direction; the actions are then interpolated from a pre-computed table, which was filled using integration contours originating from S (dotted red curves). There are various possible ways of computing I3: by following the line of λ = const. from P to the point L and then the line ν = const. from L to S (equation 10a, resulting in dotted purple curves), or in the opposite order, from P to N and then to S (equation 10b, dashed yellow curves). Right-hand panels show the variation of actions and I3 along this orbit; the two points S and P are marked by vertical dotted lines. The two methods for estimating I3 produce different results, as shown in the bottom sub-panel. The corresponding interpolated actions are shown in the same colour and line style in the other two sub-panels, and their variation mirrors that of I3. The actions computed by the non-interpolated Stäckel fudge are shown in green curves and exhibit smaller variations, because the errors in determining the orbit boundaries are somewhat compensated by using different integration paths for each point. Hence the interpolated Stäckel fudge is generally less accurate not because of interpolation itself, but because of larger fluctuations in the approximate third integral. Moreover, the fluctuations in actions are anticorrelated, therefore the value of a typical distribution function that depends on a linear combination of Jr and J|$z$|, such as equation (12), is computed more accurately than either of the two actions.

Interestingly, the interpolated actions obtained from either expression for I3 have larger variations than the actions computed by the Stäckel fudge: the fluctuations in I3 are largely compensated by using different contours (λ = const. and ν = const.) for each point λ, ν along the orbit, as shown in the left-hand panel of Fig. 4. Hence the lower accuracy of the interpolated actions arises not from the interpolation itself, but from the approximate nature of the third integral. Because the variation of I3 (relative to the maximum value |$I_3^\mathrm{max}$|⁠) is typically larger than the variations of actions computed by the (non-interpolated) Stäckel fudge, this suggests that actions have further advantages compared to the ‘vanilla’ third integral, as used, e.g. in Famaey, van Caelenberg & Dejonghe (2002), Bienaymé, Robin & Famaey (2015).

Besides larger fluctuations, a more serious problem is that the interpolated actions have small but non-negligible systematic bias (compare the solid and dashed or dotted curves in the top and middle right-hand panels of Fig. 4). This bias depends on the potential and exists for any choice of the approximation for I3, and its magnitude and even sign vary across the action space, being more severe for high-eccentricity orbits. This precludes the use of interpolated actions in applications that require an unbiased comparison between different potentials, such as the maximum-likelihood determination of potential parameters using an action-based DF of kinematic tracers (Section 4.5). However, in other contexts a moderate deterioration of accuracy in the interpolated actions is acceptable and leads to a 10× increase in computational efficiency. Therefore the library offers a choice between using interpolated or non-interpolated actions [in both cases it still constructs a 2D interpolation table for Δ(E, L|$z$|) as described above]. Finally, we note that the actions produced by the non-interpolated Stäckel fudge are, for all practical purposes, unbiased – their average values along the orbit agree very well with calculations using a more rigorous generating function approach (Sanders & Binney 2014).

To summarize, the axisymmetric Stäckel fudge in agama is both more efficient and accurate than other existing implementations, thanks to the new method for estimating the focal distance and various algorithmic improvements, and its interpolated version offers further significant speedup at the expense of an often tolerable decrease in accuracy. Listing 2 illustrates the use of action finders.

4 DISTRIBUTION FUNCTIONS

A stellar system consisting of a large enough number of stars generally can be described by a one-particle DF |$f(\boldsymbol{x},\boldsymbol{v},t)$|⁠, and according to Jeans’ theorem, in a steady state it may only depend on the integrals of motion – in our approach, actions. We use the convention for normalization of the DF such that its integral over some region of phase space equals the mass (not number) of stars in that region. Since the integration over angles is trivial, this mass is given by
\begin{eqnarray*} M=(2\pi)^3\int\!\!\!\int\!\!\!\int f(\boldsymbol{J})\, \mathrm{d}^3\!J, \end{eqnarray*}
(11)
without extra weight factors [such as the density of states g(E), equation 4.55 in Binney & Tremaine 2008] and independently of the potential. In the core of the library we only deal with simple DFs of the form |$f(\boldsymbol{J})$|⁠. More generally, the population of stars of various masses, ages, and metallicities can be described by an extended DF that also depends on these extra arguments (Sanders & Binney 2015b); such user-defined extended DFs can be used in the python interface (e.g. Das, Williams & Binney 2016). Alternatively, one may simply use a superposition of single-component DFs for each population (e.g. Bovy et al. 2012; Trick, Bovy & Rix 2016).

agama provides several types of DFs for spheroidal and discy components, including generalizations of previously used models and some new ones. There are routines for computing DF moments (density, velocity dispersion, marginalized 1D velocity distributions, etc.), sampling a DF by particles, and determining the best-fitting DF from discrete samples; some of them are illustrated in Listings 3, 4.

4.1 Double-power-law DF for spheroidal systems

A suitable choice for spheroidal components (bulge, halo) is a family of double-power-law DFs – a generalization of the one introduced by Posti et al. (2015):
\begin{eqnarray*} f(\boldsymbol{J}) &=& \frac{M}{(2\pi \, J_0)^3} \left[1 + \left(\frac{J_0}{h(\boldsymbol{J})}\right)^\eta \right]^{\Gamma /\eta } \left[1 + \left(\frac{g(\boldsymbol{J})}{J_0}\right)^\eta \right]^{\frac{\Gamma -\mathrm{B}}{\eta }} \nonumber \\ &&\times \, \exp \bigg [-\left(\frac{g(\boldsymbol{J})}{J_\mathrm{max}}\right)^\zeta \bigg ] \bigg (1 + \varkappa \tanh \frac{J_\phi }{J_{\phi ,0}}\bigg). \end{eqnarray*}
(12)
Here
\begin{eqnarray*} g(\boldsymbol{J}) &\equiv g_r J_r + g_z J_z\, + g_\phi \, |J_\phi |, \nonumber \\ h(\boldsymbol{J}) &\equiv h_r J_r + h_z J_z + h_\phi |J_\phi | \nonumber \end{eqnarray*}
are linear combinations of actions that control the flattening and anisotropy of the model in the outer region (above the break action J0) and the inner region (below J0), respectively. Only two of the three coefficients in the linear combination are independent, so we set gϕ = 3 − grg|$z$|, hϕ = 3 − hrh|$z$|. The power-law indices B and Γ control the outer and inner slopes of the density profile. The parameter η determines the steepness of the transition between the two regimes. An optional exponential cutoff at JJmax is controlled by the sharpness parameter ζ. The last term allows for rotation by introducing the odd-Jϕ part of DF, proportional to the even part with a coefficient ϰ (Binney 2014); Jϕ, 0 then controls the extent of the central region where the rotation is suppressed.

This DF roughly corresponds to a (possibly flattened) double-power-law density profile. The asymptotic behaviour in a power-law regime at small or large radii can be derived analytically, yielding the values of power-law indices and mixing coefficients corresponding to the given density profile. Posti et al. (2015) constructed DFs approximately corresponding to spherical double-power-law density profiles, by fixing these parameters to their asymptotic values, setting η = 1 and adjusting the only free parameter J0 to achieve the best agreement between the density profile generated by the DF (Section 4.5) and the required analytic density model. We argue that it is preferable to freely adjust all DF parameters in order to attain a better match in the intermediate range of radii at the expense of a moderate deterioration in the asymptotic regime (e.g. reducing the approximation error in density and velocity dispersion from 10 per cent down to |${\lesssim } 1{{\ \rm per\ cent}}$| for a Hernquist model). Williams & Evans (2015a) suggest a somewhat different functional form for the transition regime that can be better tuned to produce desirable anisotropy profiles. In the end, if the goal is to fit the DF parameters using observations (Section 4.6), it does not matter whether the DF produces a density profile that closely follows any particular functional form, so long as it adequately fits the observed kinematics and surface density.

4.2 Quasi-isotropic DF for spheroidal systems

An alternative way of constructing suitable DFs for spheroidal galaxy components is based on the following idea. For an arbitrary spherically symmetric density profile ρ(r) in a spherically symmetric potential Φ(r) (not necessarily related to ρ via Poisson equation), we numerically construct an isotropic DF of the form f(E), using the Eddington inversion formula; this might be generalized to anisotropic models with f(E, L) (e.g. Cuddeford 1991). Even if there is no closed-form analytical expression for such a DF, it can always be approximated by a smooth and fast-to-compute function such as a cubic spline in suitably scaled variables.

In this spherical potential, we also know the expression for energy E(Jr, L) as a function of actions Jr and LJ|$z$| + |Jϕ| (again, it may be approximated to any desired accuracy by a 2D spline). Therefore, the DF may be viewed as a composition of two functions f(E(Jr, L)), and the intermediate mapping E(Jr, L) can be made part of its definition. In this interpretation, a ‘quasi-isotropic DF’ |$f(\boldsymbol{J})$| is a valid action-based DF. Of course, it still has isotropic velocity distribution and produces the original density profile ρ(r) in the same original potential Φ(r). But we may equally well use this DF in any other potential |$\Phi ^{\prime }(\boldsymbol{r})$|⁠, not necessarily spherical. In this case, the DF may produce a non-spherical density profile and anisotropic kinematics (hence the ‘quasi-’ name prefix).

The mapping between |$\lbrace \boldsymbol{x},\boldsymbol{v}\rbrace$| and |$\boldsymbol{J}$| depends on the potential, but the numerical value of |$f(\boldsymbol{J})$| does not. Note that it differs from computing the energy |$E^{\prime }\equiv \Phi ^{\prime }(\boldsymbol{r})+\frac{1}{2}|\boldsymbol{v}|^2$| and using the original function of energy f(E′) – this would not be an action-based DF and hence would not enjoy its useful properties. For instance, the density generated by a quasi-isotropic DF in the potential Φ′ would be identical to the density obtained by evolving the population of stars drawn from the initial DF in a slowly varying potential that adiabatically changes from Φ to Φ′ – and this is achieved in one go, without the need to actually follow this evolution! (of course, this holds for any action-based DF, not only a quasi-isotropic one). This property is handy for the construction of multicomponent self-consistent models, described in the next section. The motivation is that if the final potential is not too different from the initial one (for instance, the initial potential is a spherically symmetric version of the final one), then the resulting density profile would be similar to the initial one, have the same total mass, but somewhat flattened shape and moderately anisotropic kinematics.

A disadvantage of this type of DF is that it does not have a closed analytic form. One may wish to approximate such a numerically constructed DF by equation (12). Among the example programs distributed with the library, we provide a tool for constructing such approximations for arbitrary spherical density and potential profiles (not necessarily related via Poisson equation). Jeffreson et al. (2017) used a similar approach, but with a double-power-law DF of the form proposed by Williams & Evans (2015a).

Thus we have two different approaches for defining a DF of a spheroidal component: one mandates that this DF produces a particular density profile in a particular potential (the quasi-isotropic DF), and the other dispenses with the mention of potential altogether, defining the functional form of the DF purely in terms of actions. As we shall see, a similar distinction can be made for discy DFs.

4.3 Quasi-isothermal DF for discs

Distribution functions for discy components are usually intended to produce density profiles |$\rho (R,z) \simeq \Sigma (R)\, H(z)$|⁠, with the surface density Σ(R) typically declining exponentially with radius, |$\Sigma (R) = \frac{M_\mathrm{disc}}{2\pi \, R_\mathrm{disc}^2}\, \exp (-R/R_\mathrm{disc})$|⁠, and the vertical profile being isothermal, |$H(z)\propto \mathrm{sech}^2\big (\frac{z}{2h}\big)$|⁠. In a kinematically cold disc, the velocity distribution at a radius R is approximately Maxwellian with dispersions σR(R), σ|$z$|(R) that are small compared to the mean streaming velocity |$\overline{v_\phi}$| (itself close to the circular velocity |$v_\circ (R) \equiv \sqrt{R\, \partial \Phi /\partial R}$|⁠). Expressed in terms of integrals of motion, such a DF might look like
\begin{eqnarray*} f(E,L_z) = f_\circ (L_z)\, \exp \big (-E_R/\tilde{\sigma }_R^2-E_z/\tilde{\sigma }_z^2\big), \end{eqnarray*}
(13)
where |$E_z \equiv \Phi (R,z)-\Phi (R,0)+v_z^2/2$| is the energy of vertical motion (an approximate integral), EREEE|$z$| is the energy in planar motion, E is the energy of a circular orbit with angular momentum L|$z$|, and f(L|$z$|) is related to the radial disc profile. The functions controlling the velocity dispersions are tilded, to distinguish them from the true dispersions that are obtained by taking moments of the DF. This form is used by many authors, e.g. Kuijken & Dubinski (1995). To bring this DF into the action-based form, we replace ER and E|$z$|, respectively, with κJr and νJ|$z$|, where
\begin{eqnarray*} \kappa \equiv \sqrt{\frac{\partial ^2\Phi }{\partial R^2} + \frac{3}{R}\, \frac{\partial \Phi }{\partial R}}, \quad \nu \equiv \sqrt{\frac{\partial ^2\Phi }{\partial z^2}} \end{eqnarray*}
are the radial and vertical epicyclic frequencies. The result is the quasi-isothermal DF of Binney & McMillan (2011):
\begin{eqnarray*} f(\boldsymbol{J}) = f_\circ (J_\phi) \times \frac{\kappa }{\tilde{\sigma }_R^2}\exp \left(-\frac{\kappa J_r}{\tilde{\sigma }_R^2}\right) \times \frac{\nu }{\tilde{\sigma }_z^2}\exp \left(-\frac{\nu J_z}{\tilde{\sigma }_z^2}\right). \end{eqnarray*}
(14)
Here the frequencies and velocity dispersions must be functions of actions; again the simplest way to ensure this is to let them depend on the radius R of a circular orbit with angular momentum Jϕ, defined as the solution of the equation |$R^3\, \partial \Phi /\partial R = J_\phi ^2$|⁠. The approximate relation between f(Jϕ) and the surface brightness profile Σ(R) may be derived by integrating the above expression over angles and actions and noting that the integral over Jϕ may be converted into the integral over R(Jϕ):
\begin{eqnarray*} M &=& \int f(\boldsymbol{J})\,\,\mathrm{d}^3J\, \mathrm{d}^3\theta = (2\pi)^3 \int f_\circ (J_\phi)\, \mathrm{d}J_\phi \\ &=& (2\pi)^3 \int f_\circ (J_\phi)\, \frac{\mathrm{d}J_\phi }{\mathrm{d}R_\circ }\, \mathrm{d}R_\circ \\ &=& (2\pi)^3 \int f_\circ (J_\phi)\, \frac{R_\circ \, \kappa ^2}{2\Omega }\, \mathrm{d}R_\circ, \end{eqnarray*}
where Ω ≡ |$v$|/R is the azimuthal epicyclic frequency. On the other hand, the same mass is given by
\begin{eqnarray*} M = \int \Sigma (R_\circ)\, 2\pi \, R_\circ \, \mathrm{d}R_\circ . \end{eqnarray*}
Comparing the two expressions, we infer a suitable form of f(Jϕ):
\begin{eqnarray*} f_\circ (J_\phi) &= \frac{\tilde{\Sigma }(R_\circ)\, \Omega (R_\circ)}{2\pi ^2\, \kappa ^2(R_\circ)}, \end{eqnarray*}
(15)
with the function |$\tilde{\Sigma }(R_\circ)$| matching the surface brightness profile Σ(R).

This DF is not without deficiencies. First, in the case when the potential is cuspy (e.g. a Hernquist bulge), the epicyclic frequencies rise arbitrarily high at small radii, so this DF is not well-behaved at small Jϕ. More generally, for a realistically warm disc there is a significant fraction of stars with max(Jr, J|$z$|) ≳ Jϕ (especially at small radii), for which the dimensional quantities entering the DF (density, frequencies, velocity dispersions) are evaluated at a radius R(Jϕ) that is much smaller than the average radius of the orbit. As a result, the density Σ and velocity dispersions σR, σ|$z$| generated by such a DF are not particularly close to their desired forms |$\tilde{\Sigma },\tilde{\sigma }_R,\tilde{\sigma }_z$| – usually they have a pronounced central depression at radii smaller than the disc scale length.

Dehnen (1999) proposed a modification of this approach: the radius used to compute the surface density, velocity dispersions, and epicyclic frequencies, is taken to be the radius of a circular orbit with the given energy, not angular momentum: he argues that it corresponds more closely to the actual mean radius of an orbit even if the latter is far from circular. In terms of actions, this may be mimicked by replacing R(Jϕ) with |$R_\circ (\tilde{J})$|⁠, where
\begin{eqnarray*} \tilde{J} \equiv |J_\phi | + k_r\, J_r + k_z\, J_z \end{eqnarray*}
(16)
is a linear combination of actions that is approximately constant across the energy surface (see fig. 1 in Binney 2014 for an illustration). For orbits close to the disc plane, this could be achieved by setting kr = κ/Ω, k|$z$| = ν/Ω, but more generally these coefficients may have arbitrary values. Larger k|$z$| produce DF that declines faster at large J|$z$|, suppressing the high |$z$| tail of the vertical density profile or the high |$v$||$z$| tail of the velocity distribution, without noticeably changing its width, and kr has a similar effect on the radial velocity distribution. Values of order unity produce a suitable DF that generates a density profile that more closely follows the desired form |$\tilde{\Sigma }(R)$| than an un-modified DF.
It remains to choose the form of functions |$\tilde{\sigma }_R(R_\circ)$|⁠, |$\tilde{\sigma }_z(R_\circ)$|⁠. The vertical velocity dispersion is often taken to be exponentially declining with radius:
\begin{eqnarray*} \tilde{\sigma }_z = \sigma _{z,0}\, \exp (-R_\circ /R_{\sigma _z}), \end{eqnarray*}
(17a)
because this would produce a constant scale height in an isolated thin disc. In the presence of other components, the self-gravity of the disc is not necessarily dominant at small and large radii, so a better approximation to a constant-scale height disc is obtained by setting
\begin{eqnarray*} \tilde{\sigma }_z(R_\circ) = \sqrt{2}\, h\, \nu (R_\circ). \end{eqnarray*}
(17b)
Similarly, the radial velocity dispersion is also assumed to decline exponentially with radius (for no other reason than similarity to |$\tilde{\sigma }_z$|⁠):
\begin{eqnarray*} \tilde{\sigma }_R(R_\circ) = \sigma _{R,0}\, \exp (-R_\circ / R_{\sigma _R}), \end{eqnarray*}
(18)
with adjustable maximum value σR, 0 and scale length |$R_{\sigma _R} \simeq 2R_\mathrm{disc}$|⁠. We retain this functional form, but place a lower limit on how small |$\tilde{\sigma }_R$| could get, to avoid an unphysical situation that the DF in the disc plane could become arbitrarily high at large radii.
Finally, the disc DF should normally be asymmetric in Jϕ, producing net rotation. Binney & McMillan (2011) achieve this by multiplying the DF by a factor 1 + tanh(Jϕ/Jϕ, 0), with Jϕ, 0 chosen such that in the central part of the disc there is approximate symmetry between prograde and retrograde orbits, and further out only the prograde ones survive. Following Dehnen (1999), we instead introduce an alternative method for extending the DF into the negative-Jϕ half-space, namely, multiply f by
\begin{eqnarray*} f_{\pm } \equiv \left\lbrace \begin{array}{ll}1 &\mbox{if }J_\phi \ge 0, \\ \exp \big (2\Omega J_\phi / \sigma _R^2 \big) &\mbox{if }J_\phi \lt 0, \end{array} \right. \end{eqnarray*}
(19)
where Ω and σR are evaluated at the radius |$R_\circ (\tilde{J})$|⁠. To summarize, the improved variant of quasi-isothermal DF is given by
\begin{eqnarray*} f(\boldsymbol{J}) &=& \frac{\Sigma _0\, \Omega (R_\circ)}{2\pi ^2\, \kappa (R_\circ)^2} \exp \bigg (-\frac{R_\circ }{R_\mathrm{disc}}\bigg) \times f_\pm \nonumber \\ &&\times \, \frac{\kappa (R_\circ)}{\tilde{\sigma }_R^2(R_\circ)} \exp \bigg (-\frac{\kappa (R_\circ)\, J_r}{\tilde{\sigma }_R^2(R_\circ)}\bigg) \nonumber \\ &&\times \, \frac{\nu (R_\circ)}{\tilde{\sigma }_z^2(R_\circ)} \exp \bigg (-\frac{\nu (R_\circ)\, J_z}{\tilde{\sigma }_z^2(R_\circ)}\bigg), \end{eqnarray*}
(20)
where R is a function of |$\tilde{J}$| (16), velocity dispersions are given by (17,18), and f± is given by (19).

If the disc is not too cold, the density and velocity dispersions generated by such a DF may be different from the required values. Kuijken & Dubinski (1995), Dehnen (1999), or Sharma & Bland-Hawthorn (2013) further adjust the functions |$\tilde{\Sigma }(R),\tilde{\sigma }_{R,z}(R)$| to match the resulting profiles more closely to the input ones. We do not employ such adjustments, relying instead on the recomputation of the potential to achieve self-consistency, as described in Section 5.2. Our modifications – the use of a linear combination of actions (16), and a different expression for |$\tilde{\sigma }_z$| (17b) – generally produce a better match to the density profile of a radially exponential, vertically isothermal disc, than the original quasi-isothermal DF of Binney & McMillan (2011) [which could still be recovered by using equation (17a) and setting kr = k|$z$| = 0].

As in the case of quasi-isotropic DFs, the intermediate mappings JϕR → κ, ν, Ω, Σ, etc., become part of the definition of the DF, constructed in a particular potential, but not necessarily the one that the DF is used in later. Of course, if the potential is very different from the one used to initialize these mappings, then the density generated by such a DF will also differ significantly from the exponential/isothermal form.

4.4 Exponential DF for discs

The fact that the DF contains functions defined by a potential is conceptually distressing, and we now introduce another family of disc DFs that is specified entirely in terms of actions, without any auxiliary functions. It is designed to produce density profiles that are close to exponential/isothermal, in a potential with an approximately flat rotation curve (⁠|$v$| ≈ const.). In this case, R(Jϕ) = |Jϕ|/|$v$|, |$\Omega = v_\circ ^2 / |J_\phi |$|⁠, |$\kappa = \sqrt{2}\, \Omega$|⁠, and ν is also approximately proportional to Ω, although with a coefficient that slowly varies with radius. Velocity dispersions are related to the extent of radial (ΔR) and vertical (h) excursions of stars: |$\sigma _R\sim \kappa \, \Delta R$|⁠, |$\sigma _z\sim \nu \, h$|⁠, and characteristic values of actions are, correspondingly, 〈Jr〉 ∼ σRΔR ∼ κΔR2, 〈J|$z$|〉 ∼ σ|$z$|h ∼ ν|$z$|2. The dependence of DF on the vertical action should be exponential with the characteristic scale 〈J|$z$|〉:
\begin{eqnarray*} f \propto \exp \bigg (-\frac{J_z}{\langle J_z \rangle } \bigg) \sim \exp \bigg (-\frac{J_z |J_\phi |}{(v_\circ h)^2} \bigg) = \exp \bigg (-\frac{J_z |J_\phi |}{J_{z,0}^2} \bigg), \end{eqnarray*}
where we introduced a new dimensional parameter J|$z$|⁠, 0|$v$|h. Similar considerations lead to the same exponential dependence of DF on the radial action, with its own characteristic scale Jr, 0. Finally, the analogue of (15) for an exponential radial density profile would be f(Jϕ) ∝ |Jϕ|exp (− |Jϕ|/Jϕ, 0). In practice, these expressions need to be tweaked somewhat: first, we replace |Jϕ| by a linear combination |$\tilde{J}$| of all three actions (16), for the same reasons as in the quasi-isothermal DF (it better corresponds to the average radius of the orbit even when the latter is far from circular). Secondly, recognizing that the potential cannot have a flat rotation curve all the way down to R → 0, we mitigate undesirable deviations of the model from its prescribed properties at small radii by two extra parameters with the dimension of actions Jd, 0 and J|$v$|⁠, 0. In the end, the new ‘exponential’ DF model is defined as follows:
\begin{eqnarray*} f(\boldsymbol{J}) &= \frac{M}{(2\pi)^3}\, \frac{J_d}{J_{\phi ,0}^2}\, \exp \bigg (-\frac{J_d}{J_{\phi ,0}}\bigg) \nonumber \\ &\times \frac{J_v}{J_{r,0}^2}\, \exp \bigg (-\frac{J_v\, J_r}{J_{r,0}^2}\bigg) \times \frac{J_v}{J_{z,0}^2}\, \exp \bigg (-\frac{J_v\, J_z}{J_{z,0}^2}\bigg) \nonumber \\ &\times \left\lbrace \begin{array}{ll}1 &\mbox{if } J_\phi \ge 0 \,\,, \\ \exp \bigg (\displaystyle \frac{J_v\, J_\phi }{J_{r,0}^2} \bigg) &\mbox{if } J_\phi \lt 0 \,\,, \end{array}\right. \nonumber \\ J_d &\equiv \sqrt{\tilde{J}^2 + J_{d,0}^2}, \quad J_v \equiv \sqrt{\tilde{J}^2 + J_{v,0}^2}. \end{eqnarray*}
(21)

The advantage of this DF is a relatively simple functional form, which produces discy density profiles in realistic potentials (not necessarily with flat rotation curves), and a small number of free parameters: M sets the overall normalization (the total mass is of the order of M), Jϕ, 0 determines the scale radius, J|$z$|⁠, 0 – scale height and vertical velocity dispersion, Jr, 0 – radial velocity dispersion, Jd, 0Jr, 0 affects the density profile at small radii, and J|$v$|⁠, 0Jϕ, 0 influences the velocity distribution in the centre.

Thus we have two alternative models for the disc DF – the quasi-isothermal one is specified in terms of parameters with dimensions of length and velocity, and needs a potential model to construct auxiliary functions, whereas the exponential DF is specified entirely in terms of scale actions and does not reference any potential. This parallels the distinction between quasi-isotropic and double-power-law DFs for spheroidal components discussed in the previous section. Depending on the application, one or the other approach may be more convenient. Fig. 5 illustrates that both families of disc DFs produce similar and realistic density profiles and velocity distributions in a typical Milky Way potential.

Illustration of disc distribution functions. We take the Milky Way potential from Piffl et al. (2014) and consider two types of DFs: quasi-isothermal (solid lines) and exponential (dashed), designed to produce a realistic thick disc population (scale radius ∼3 kpc, scale height ∼1 kpc, radial and vertical velocity dispersions at R = 8 kpc equal to σR = 50 km s−1, σ$z$ = 40 km s−1, consistent with the values given in Bland-Hawthorn & Gerhard 2016). The DF parameters are: Rdisc = 3 kpc, σR, 0 = 185 km s−1, $R_{\sigma _R}=2R_\mathrm{disc}$, h = 0.38 kpc for the quasi-isothermal DF (17b–20), or {Jr, 0, J$z$, 0, Jϕ, 0, Jd, 0, J$v$, 0} = {340, 260, 720, 500, 720} kpc × km s−1 for the exponential DF (21); the mixing coefficients in (16) are kr = 1, k$z$ = 0.25 in both cases. Left-hand panel shows the radial dependence of the surface density, normalized to the exponential profile Σ0exp (− R/Rdisc), and the rms value of $z$, expected to be nearly constant. The actual profiles do not exactly follow the expectations, but the deviations are within $10-20{{\ \rm per\ cent}}$, except at the very centre, where the disc is no longer a dominant component. The vertical density profile (not shown) very closely follows the sech2 law. Central panel plots the radial dependences of three components of the velocity dispersion tensor in the equatorial plane. The vertical velocity dispersion σ$z$ is computed from the condition that the scale height is (nearly) constant, and follows very similar profiles for both DF models, but is not exponentially declining with radius, as is commonly assumed (if it were, the scale height would not stay constant). The radial velocity dispersion σR is a nearly exponential function of radius in the quasi-isothermal DF, and consequently falls below the vertical one at large radii, which is not very realistic. In the exponential DF it behaves similarly to σ$z$. Right-hand panel shows the 1D velocity distribution functions $\mathfrak {f}(v)$ for each velocity component, at a point R = 8 kpc, $z$ = 0. They are very similar in both models, nearly Gaussian for $v$R, $v$$z$ and strongly asymmetric for $v$ϕ, peaking around the local circular velocity (240 km s−1), sharply declining towards larger $v$ϕ and having a long tail extending to negative $v$ϕ, described by equation (19) (although in a realistic galaxy model, the halo stellar population dominates at these velocities).
Figure 5.

Illustration of disc distribution functions. We take the Milky Way potential from Piffl et al. (2014) and consider two types of DFs: quasi-isothermal (solid lines) and exponential (dashed), designed to produce a realistic thick disc population (scale radius ∼3 kpc, scale height ∼1 kpc, radial and vertical velocity dispersions at R = 8 kpc equal to σR = 50 km s−1, σ|$z$| = 40 km s−1, consistent with the values given in Bland-Hawthorn & Gerhard 2016). The DF parameters are: Rdisc = 3 kpc, σR, 0 = 185 km s−1, |$R_{\sigma _R}=2R_\mathrm{disc}$|⁠, h = 0.38 kpc for the quasi-isothermal DF (17b20), or {Jr, 0, J|$z$|⁠, 0, Jϕ, 0, Jd, 0, J|$v$|⁠, 0} = {340, 260, 720, 500, 720} kpc × km s−1 for the exponential DF (21); the mixing coefficients in (16) are kr = 1, k|$z$| = 0.25 in both cases. Left-hand panel shows the radial dependence of the surface density, normalized to the exponential profile Σ0exp (− R/Rdisc), and the rms value of |$z$|⁠, expected to be nearly constant. The actual profiles do not exactly follow the expectations, but the deviations are within |$10-20{{\ \rm per\ cent}}$|⁠, except at the very centre, where the disc is no longer a dominant component. The vertical density profile (not shown) very closely follows the sech2 law. Central panel plots the radial dependences of three components of the velocity dispersion tensor in the equatorial plane. The vertical velocity dispersion σ|$z$| is computed from the condition that the scale height is (nearly) constant, and follows very similar profiles for both DF models, but is not exponentially declining with radius, as is commonly assumed (if it were, the scale height would not stay constant). The radial velocity dispersion σR is a nearly exponential function of radius in the quasi-isothermal DF, and consequently falls below the vertical one at large radii, which is not very realistic. In the exponential DF it behaves similarly to σ|$z$|. Right-hand panel shows the 1D velocity distribution functions |$\mathfrak {f}(v)$| for each velocity component, at a point R = 8 kpc, |$z$| = 0. They are very similar in both models, nearly Gaussian for |$v$|R, |$v$||$z$| and strongly asymmetric for |$v$|ϕ, peaking around the local circular velocity (240 km s−1), sharply declining towards larger |$v$|ϕ and having a long tail extending to negative |$v$|ϕ, described by equation (19) (although in a realistic galaxy model, the halo stellar population dominates at these velocities).

4.5 Using the DF

Given the DF and the potential (which determines the mapping |$\boldsymbol{J}[\boldsymbol{x},\boldsymbol{v}]$|⁠), we may compute various moments by integrating the DF over velocity:
\begin{eqnarray*} \rho (\boldsymbol{x}) &\equiv& \int\!\!\!\int\!\!\!\int \mathrm{d}^3v\,\, f\big (\boldsymbol{J}[\boldsymbol{x},\boldsymbol{v}]\big) \, \, \, \, \qquad \quad \rm{density}, \nonumber \\ \overline{\boldsymbol{v}} &\equiv& \frac{1}{\rho } \int\!\!\!\int\!\!\!\int \mathrm{d}^3v\,\, \boldsymbol{v}\,\, f\big (\boldsymbol{J}\big) \, \qquad \qquad \rm{mean\ velocity}, \nonumber \\ \overline{v^2_{ij}} &\equiv& \frac{1}{\rho } \int\!\!\!\int\!\!\!\int \mathrm{d}^3v\,\, v_i v_j\,\, f\big (\boldsymbol{J}\big) \qquad \quad \rm{second\ moment\ of\ velocity}, \nonumber \\ \sigma ^2_{ij} &\equiv& \overline{v^2_{ij}} - \overline{v_{i}}\, \overline{v_{j}} \qquad \qquad \qquad \quad \rm{velocity\ dispersion\ tensor}. \end{eqnarray*}
(22)
A more complete characterization of kinematic structure is given by the 1D distribution of any velocity component |$v$|1, marginalized over the other two components |$v$|2, |$v$|3:
\begin{eqnarray*} \mathfrak {f}(\boldsymbol{x};\, v_1) \equiv \frac{1}{\rho } \int\!\!\!\int \mathrm{d}v_2\, \mathrm{d}v_3\,\,f\big (\boldsymbol{J}\big) . \end{eqnarray*}
(23)
We compute these integrals by an adaptive multidimensional routine cubature. Listing 3 illustrates the computation of DF moments and velocity distributions.

The DF can be sampled with particles in the 3D action space, or, more usefully in practice, in the 6D position/velocity space – the latter could be used to create initial conditions for N-body simulations. We employ an adaptive multidimensional rejection sampling algorithm, which iteratively refines regions in the sampling domain where the function values are highest. It is completely general (does not use any prior information about the function), quite efficient (acceptance fraction is typically |${\gtrsim } 10-30{{\ \rm per\ cent}}$| even for a strongly localized function), and can be used in many other contexts. For instance, a mock catalogue of stars for a particular survey is produced by sampling from a DF multiplied by this survey’s selection function.

4.6 Fitting a DF

The reverse scenario is to use a sample of observed stars to infer the parameters of the galaxy model M – the potential and the DF (e.g. Bovy & Rix 2013; McMillan & Binney 2013). Suppose that we know the survey’s selection function |$S(\boldsymbol{w})$|⁠, where |$\boldsymbol{w}$| are the observationally determined quantities (e.g. sky coordinates, parallax, proper motion, line-of-sight velocity, magnitude, etc.); this function describes how likely we are to observe a star with these parameters if it actually exists in the galaxy. We know the mapping between |$\boldsymbol{w}$| and the phase-space coordinates |$\lbrace \boldsymbol{x},\boldsymbol{v}\rbrace$|⁠, and in a given potential, also know how they translate into actions |$\boldsymbol{J}$|⁠. Since these quantities are not measured exactly, and some of them may not be accessible at all, we need to consider the multivariate error distribution |$E(\boldsymbol{w} | \boldsymbol{w}^{\prime })$|⁠, describing the probability of measuring |$\boldsymbol{w}$| given the true values |$\boldsymbol{w}^{\prime}$|⁠. Then the likelihood of observing a star s with coordinates |$\boldsymbol{w}_s$|⁠, given the model M and the survey S, is
\begin{eqnarray*} \mathcal {L}_s = \frac{ S(\boldsymbol{w}_s)\,\, \int \mathrm{d} w^{\prime }\,\, E(\boldsymbol{w}_s | \boldsymbol{w}^{\prime })\,\, f\big (\boldsymbol{J}[\boldsymbol{w}^{\prime }]\big)\,\, \mathcal {J}(\boldsymbol{w}^{\prime }) }{ \int \mathrm{d} w^{\prime }\,\, S(\boldsymbol{w}^{\prime })\,\, f\big (\boldsymbol{J}[\boldsymbol{w}^{\prime }]\big)\,\, \mathcal {J}(\boldsymbol{w}^{\prime }) } . \end{eqnarray*}
(24)
Here f is the DF of stars in the model, |$\mathcal {J} \equiv |\partial (\boldsymbol{x},\boldsymbol{v})/\partial \boldsymbol{w}|$| is the Jacobian of coordinate transformation, the integral in the numerator is the convolution with error distribution, and the integral in the denominator is the normalization factor (the total number of stars that we expect to find in the survey for the given model). In practice, the former integral is computed in a Monte Carlo way, by drawing K sample points |$\boldsymbol{w}_{s,k}$| from the (known) multivariate error distribution of observables for each star s, propagating them to action space, and replacing the integral by a sum |$\frac{1}{K} \sum _{k=1}^K f\big (\boldsymbol{J}[\boldsymbol{w}_{s,k}]\big)$|⁠. It could well be that some of these points are unphysical (e.g. have negative parallax, or velocity exceeding the escape velocity); they will not contribute to the integral because f = 0 for such points. As long as f is non-zero for some points, the likelihood of a star remains positive. The integral in the denominator is the same for all stars, and it should be computed accurately (e.g. Trick et al. 2016, Section 2.6); we again use the adaptive cubature routine. Finally, the log-likelihood of the model is given by summing the log-likelihoods of all stars in the observed sample. The model parameters can then be varied to maximize this quantity. In doing so, it is advisable to fix the Monte Carlo samples used to compute the error convolution for each star, minimizing the impact of Poisson noise (McMillan & Binney 2013).

Depending on the context, we may keep either the potential or the DF fixed; for instance, if the goal is to infer the parameters of the potential, we should consider all possible combinations of the potential and the DF, and compute the posterior probability distributions of potential parameters, marginalized over the DF parameters. Listing 4 illustrates the simplest scenario with full phase-space information and no errors. The true parameters of the potential and the DF are easily recovered using deterministic optimization routine; in practice this should be followed or replaced by a full exploration of parameter space using Markov chain Monte Carlo or similar aproaches. A more realistic scenario of inferring the potential from a catalogue of tracer stars with only sky position and noisy line-of-sight velocity (no distance or proper motion information), but still assuming a spherical geometry and neglecting the selection function, was applied to the Gaia Challenge test suite (Read et al., in preparation); the example code is provided with the library.

A particular case of inferring the DF from discrete samples occurs in the context of spherical isotropic DFs constructed from N-body snapshots. Here the DF is a function of one variable (energy, or rather its action-like counterpart – phase volume), and the method of constructing a non-parametric spline representation of a 1D probability distribution from discrete samples is based on penalized log-density estimate. It is used in the Monte Carlo code raga to compute two-body relaxation coefficients.

5 SELF-CONSISTENT MODELS

We now consider the task of constructing a galaxy model consisting of an arbitrary number of components (e.g. disc, bulge, halo), each specified by its DF |$f_c(\boldsymbol{J})$|⁠, plus optionally some external potential (e.g. a central black hole). The problem is to find the solution of the coupled system of equations:

  • actions |$\boldsymbol{J}(\boldsymbol{x}, \boldsymbol{v}\, |\, \Phi)$| depend on the potential;

  • density of each component is the integral of its DF (22);

  • the potential is related to the total density via the Poisson equation (1).

Because of this circular dependence, the solution is obtained iteratively, starting from a suitable initial guess for the potential and repeating the three steps until convergence. Our approach extends previous works, described in the next section, and improves them in several ways:

  • we always recompute the density from the DF on each iteration, instead of relying on the assumption that the DF (e.g. a quasi-isothermal one) approximately generates the desired density profile;

  • consequently, we recalculate the total potential of all components without any simplifying approximations (e.g. of a separable exponential disc profile);

  • the use of actions facilitates the construction of models with arbitrary combination of components and accelerates the convergence of iterations.

5.1 Previous work

The first application of this method dates to Prendergast & Tomer (1970), who constructed models of elliptical galaxies specified by f(E, L|$z$|), using spherical-harmonic potential representation; they obtained solution in a few dozen iterations, but not for every possible choice of DF (very flattened models could not be constructed with their family of DFs). Rowley (1988) used a similar approach, with a different ansatz for f(E, L|$z$|), and added a static disc potential (without a corresponding DF).

Kuijken & Dubinski (1995) extended the method to the case of three-component disc galaxies with a bulge and a halo. They start from suitable ansatzes for the DFs of spheroidal components, expressed as analytic functions of E, L|$z$|, and a static potential of the disc. Then they employ the same iterative procedure to recompute the total potential, using the modification of spherical-harmonic expansion for separable discs, described in Section 2. Finally, the disc DF is constructed in the form given by (13), using the energy of vertical motion E|$z$| as the approximate third integral. The functions f(R), |$\tilde{\sigma }_R(R_\circ)$|⁠, |$\tilde{\sigma }_z(R_\circ)$| are initially taken to be exponential with radius, and are iteratively adjusted until the density matches the desired form (exponential in radius, isothermal vertically) at some pre-selected points; they argue that this produces the model sufficiently close to equilibrium so that the disc density does not need to be recomputed in the iteratively adjusted potential. Apart from this assumption, another difficulty is that the DFs for spheroidal components only produce their intended density profiles in isolation, but not in the composite model. This complicates the setup, because the relation between DF parameters and the resulting density profiles is rather non-intuitive.

Widrow, Pym & Dubinski (2008) improved the method by using the Eddington inversion formula to numerically construct DFs of spheroidal components in the spherical approximation of the total potential (retaining only the monopole term of the disc potential). Then they kept these expressions for f(E), substituting E with the energy computed in the actual potential at each iteration and rescaled back into the same range as used in the original DF. This approach resembles the quasi-isotropic DF introduced in Section 4.2, but with important differences: the energy is not an adiabatic invariant, even after rescaling, and the total mass of such a DF depends on the potential even if the functional form f(E) remains unchanged. In the final model the density of spheroidal components is constant on equipotential surfaces, i.e. somewhat flattened compared to the initial profiles, but follows approximately the same radial dependence; the velocity distribution is necessarily isotropic. This method, dubbed galactics, has been used to construct initial conditions for N-body simulations, as well as models of actual galaxies (Milky Way, M31) using a variety of observational constraints, and more recently also incorporating integral-field kinematic data for more distant galaxies (Taranu et al. 2017).

Debattista & Sellwood (2000) constructed initial conditions for their disc + halo N-body simulations using a similar iterative technique, recomputing the density profile of the halo from its DF (also expressed in terms of E and L|$z$|) and then readjusting the potential. For the latter step, they employed the same Cartesian-grid Poisson solver as used in the actual N-body simulation, rendering unnecessary the modification of spherical-harmonic expansion used by Kuijken & Dubinski (1995). However, they also did not recompute the disc density profile, drawing particles from an approximately isothermal DF at the end of iterative procedure.

McMillan & Dehnen (2007) used a somewhat different approach to construct equilibrium multicomponent models, implemented in a program mkgalaxy included in the nemo framework. They also start from a spherical model with DFs of bulge and halo computed from Cuddeford’s(1991) anisotropic generalization of Eddington formula in the spherical approximation of the total potential (including the monopole component of the disc). Then the two spheroidal components are converted to N-body representations, and the potential of the disc component (which is still an external contribution to the N-body system of bulge and halo) is slowly deformed into the final shape, while the density profiles of the other two components adiabatically evolve while retaining overall self-consistency throughout the N-body simulation. Finally, disc particles are drawn from the DF of Dehnen (1999), in which all relevant quantities (circular radius corresponding to the given energy, epicyclic frequencies as functions of radius, etc.) are computed in the final potential, and the auxiliary functions f(R), |$\tilde{\sigma }_R(R)$| are also adjusted to produce the required density and velocity dispersion profiles [similar to Kuijken & Dubinski (1995)]. The vertical velocity dispersion is taken from an isolated exponential disc, resulting in an out-of equilibrium vertical structure, which should probably be regarded as an oversight rather than an intrinsic deficiency of the method. This approach circumvents the need for iterations, replacing it with an actual adiabatic transformation of the distribution of particles (which is, in fact, more computationally intensive); however, it also forfeits the description of the bulge and halo in terms of a DF.

In these approaches the DF is represented as a function of energy E, angular momentum L or L|$z$|, and, for the disc component, an approximate third integral (the energy of vertical motion E|$z$|). However, these variables are not ideal for iterative self-consistent modelling, whereas actions have several important advantages:

  • A DF specified in terms of energy is inconvenient because the possible range of energy depends on the overall potential. Moreover, even the total mass of each component is hard to specify in advance. In the method of Widrow et al. (2008), the DF is rescaled at each iteration, which probably explains slow convergence (several tens of iterations are needed, with the order of spherical-harmonic expansion increased gradually). By contrast, a DF given in terms of actions always has the same mass (11) and functional form, independently of the potential.

  • A superposition of several such components is straightforward, and keeps unchanged the explicit form of the DF in terms of |$\boldsymbol{J}$|⁠. Physically this corresponds to adiabatic modification of the density profile of each component upon the addition of another component.

  • The vertical energy is a good proxy for the integral of motion only for cold orbits near the disc plane (where the epicyclic approximation is valid). The third integral in the Stäckel approximation (e.g. Bienaymé et al. 2015) is conserved better even far from the equatorial plane, but the vertical action J|$z$| in the same approximation is an even better integral, as illustrated in Fig. 4. Ultimately, action/angle formalism allows us to use a systematic and mathematically well-grounded non-perturbative approach based on canonical transformations specified by numerically constructed generating functions (Sanders & Binney 2014; Binney & McMillan 2016) to compute actions to arbitrary accuracy. Sanders & Evans (2015) compared the quality of self-consistent models constructed using the more accurate action finder with those using the Stäckel fudge, and concluded that the latter approximation is sufficient in practice.

The iterative approach to the construction of self-consistent models with DF specified in terms of actions has been applied by Binney (2014) to a flattened generalization of the Isochrone model. Sanders & Evans (2015) used a double-power-law DF of Williams & Evans (2015a) to construct mildly triaxial one-component models. Piffl, Penoyre & Binney (2015), Binney & Piffl (2015), and Cole & Binney (2017) extended the approach to a multicomponent model of the Milky Way, using a family of quasi-isothermal DFs (14) for the disc population and a double-power-law DF (12) for the dark halo. They recomputed both the halo and the disc density at each iteration; however, to solve the Poisson equation, they approximated the disc density by a separable exponential profile and employed the galpot approach, hence the potential is only approximately self-consistent and the error cannot be systematically reduced. Moreover, to construct a quasi-isothermal DF, one needs a ‘seed’ potential, which should be reasonably close to the final one. Therefore they employed a two-stage procedure: first the disc density was kept fixed (using the analytic disc profile), while the density and potential of the halo was recomputed iteratively; after a few iterations, the total potential was used to construct the disc DF, and in several more iterations both disc and halo density were recomputed from the DF. Because of the functional form of their quasi-isothermal DF, the final disc density looked rather different from the initial analytic profile, complicating the setup. The need for an intermediate potential for initializing the disc DF also introduced non-trivial dependence between the parameters of the disc and the halo.

5.2 The present method

The iterative self-consistent modelling approach in agama is completely general, although presently restricted to axisymmetric systems due to the availability of action finders.

A few practical considerations need to be mentioned. First, the overall potential in a general case is represented by a sum of two approximations – the azimuthal-harmonic spline expansion for the disc components and the spherical-harmonic spline expansion for the spheroidal components. In the case of an elliptical galaxy model the first one may not be needed, although even elliptical and lenticular galaxies often have significant discy components (Rix & White 1990; Emsellem et al. 2007). Each of the two potential components, in turn, may correspond to one or several density components – for instance, thin and thick stellar discs for the first one, and stellar bulge and extended dark halo for the second one. In addition, there may be density components that are not specified by their DFs, but are simply ‘static’ contributions to the total potential: for instance, a thin gaseous disc that does not satisfy the collisionless Boltzmann equation, or a ‘deadweight’ softened point mass potential instead of the bulge in the case that our main interest lies outside the central region of the galaxy and we do not need to model the DF of the bulge. The user must explicitly assign each mass component to either of the two potential solvers.

Secondly, in solving the Poisson equation in either spherical-harmonic (5) or azimuthal-harmonic (6) form, we need to evaluate the density at many points in space while computing the relevant integrals. However, doing this directly by integrating |$f(\boldsymbol{J})$| over velocities at each point (22) would be extremely inefficient. Instead, we pre-compute the values of density of each component at a moderate number of points on a rectangular 2D grid in R, |$z$| or r, θ planes, and constructing an interpolating spline for fast evaluation of density at any location. This is not quite a trivial step, as the accuracy of potential approximation is determined by the accuracy of density interpolation, and ultimately by the number and location of grid points. For the disc components, 15–30 grid nodes per direction (R, |$z$|⁠) is enough, provided that the grid covers the spatial region enclosing almost all of the mass of the given component, and the nodes are spaced more densely close to the disc plane where the density is highest (Fig. 6). For the spheroidal components, we use a uniform grid in log-radius with 20–40 nodes, and nodes of the Gauss–Legendre quadrature rule of order lmax + 1 in cos θ, with the order of multipole expansion lmax ≃ 6 − 8.

Illustration of grids used to represent the density in self-consistent modelling. Depending on the type of the density profile (spheroidal or discy), we use two different types of grids and correspondingly two classes of potential solvers (spherical-harmonic and azimuthal-harmonic, correspondingly). In the first case, shown by red circles, the density values are collected at the nodes of a logarithmically spaced grid in r and the nodes of a Gauss–Legendre quadrature rule in cos θ; in this example we used eight radial and five angular points, corresponding to lmax = 8 (only the upper half of the grid is used). In the second case, shown by blue crosses, the density values are collected at the nodes of a separable grid in R and $z$, which are also spaced non-uniformly, but include a point at the origin. This arrangement reflects the grids used in the potential interpolation, which is performed in spherical or cylindrical coordinates, respectively. The azimuthal direction (ϕ) is represented by Fourier harmonics in both cases; however, presently only the m = 0 azumuthal harmonic is used, because we rely on the axisymmetric Stäckel fudge in the rest of the modelling framework.
Figure 6.

Illustration of grids used to represent the density in self-consistent modelling. Depending on the type of the density profile (spheroidal or discy), we use two different types of grids and correspondingly two classes of potential solvers (spherical-harmonic and azimuthal-harmonic, correspondingly). In the first case, shown by red circles, the density values are collected at the nodes of a logarithmically spaced grid in r and the nodes of a Gauss–Legendre quadrature rule in cos θ; in this example we used eight radial and five angular points, corresponding to lmax = 8 (only the upper half of the grid is used). In the second case, shown by blue crosses, the density values are collected at the nodes of a separable grid in R and |$z$|⁠, which are also spaced non-uniformly, but include a point at the origin. This arrangement reflects the grids used in the potential interpolation, which is performed in spherical or cylindrical coordinates, respectively. The azimuthal direction (ϕ) is represented by Fourier harmonics in both cases; however, presently only the m = 0 azumuthal harmonic is used, because we rely on the axisymmetric Stäckel fudge in the rest of the modelling framework.

Thirdly, the essence of the method is that the DF is fixed, and gives rise to different density profiles in different potentials. At the same time, expressions for some parametrized DFs (e.g. that of a quasi-isothermal disc, Binney & McMillan 2011) contain the potential implicitly, through the epicyclic frequencies as functions of radius of a circular orbit, which itself is expressed through the actions. This implies that prior to using such a DF, we must compute the dependence of these frequencies on L|$z$| given a plausible initial guess for the total potential. On the other hand, it is essential to fix this dependence throughout the iterative scheme, so the value of the DF as a function of actions stays constant, even though the epicyclic frequencies in the finally obtained self-consistent potential might be somewhat different from the ones used at the beginning.

In practice, if the goal is to create a model with prescribed density profiles, then one may use quasi-isotropic DFs for spheroidal components and quasi-isothermal DFs for discy components, provided that they were constructed from the desired density profiles in the corresponding total potential. In this case the resulting density profiles are typically only moderately different (⁠|${\lesssim } 10{{\ \rm per\ cent}}$|⁠) from the input ones. However, the DFs are not simply expressible functions, even though they can always be evaluated numerically. The alternative is to choose some analytic form of DFs, e.g. double-power-law (12) or exponential (21), and this completely specifies the resulting model, regardless of the choice of the initial potential: a bad choice may only delay, but not prevent the convergence. Typically, the change in the potential during each subsequent iteration is ∼2 × smaller, and with a good initial guess, five iterations is enough to be within |$1{{\ \rm per\ cent}}$| of the asymptotic solution; even with a poor guess, 10 iterations should be sufficient in practice. In terms of wall-clock time, each iteration takes only a few seconds on a 16-core workstation. The present implementation is considerably faster (by a factor of few tens) than the one used in Binney (2014), Piffl et al. (2015), and Binney & Piffl (2015).

5.3 Example

Listing 5 illustrates the iterative method for constructing self-consistent models with a simple single-component system, resembling the flattened isochrone of Binney (2014).

We now consider a more interesting example of a three-component axisymmetric bulge–disc–halo galaxy used in Section 4.1 of Vasiliev & Athanassoula (2015). In that paper, five different methods were given the task of creating an equilibrium multicomponent model with the given density profile. It has an exponential disc, a flattened Sérsic bulge with axial ratio q = 0.8, and a nearly spherical NFW halo with an outer cutoff; the disc is relatively thick and warm (Toomre parameter Q ≳ 1.7) to minimize the impact of non-axisymmetric instabilities. We created N-body realizations of this composite model and checked how close they were to equilibrium by running N-body simulations and recording the evolution of radial profiles of three components of the velocity dispersion tensor of disc particles. We found that the Schwarzschild method, implemented in the smile code (Vasiliev 2013), produced a model that was closest to equilibrium; galactics and mkgalaxy models were also acceptable, although the latter one failed to correctly assign the vertical velocity dispersion profile (neglecting contributions from spheroidal components). magalie (Boily, Kroupa & Peñarrubia 2001) and galic (Yurin & Springel 2014) produced models that were significantly out of equilibrium: the former – due to a simplified treatment of composite potential and a Maxwellian assumption about the velocity distribution, the latter – because of unrealistic requirement that σR = σϕ.

We now add another method to this test suite, using the same density profile to initialize quasi-isotropic DFs of the spheroidal components (bulge and halo) and a quasi-isothermal DF of the disc, and performing five iterations of the self-consistent modelling procedure. The resulting density profiles are not exactly the same as the initial ones, but the difference is |${\lesssim } 10{{\ \rm per\ cent}}$|⁠. We then created an N-body realization of the model and evolved it for 100 time units with the fast-multipole code gyrfalcon (Dehnen 2000), as in the previous study. To minimize the impact of relaxation on the disc structure, we increased the number of particles fivefold compared to the previous paper, using 0.2, 0.8, and 4 million particles for the bulge, disc, and halo components, respectively.

Fig. 7 shows the initial and evolved velocity dispersion profiles of the disc component, comparing agama with two other similar methods (galactics and mkgalaxy), which were also re-run with a higher number of particles. The initial profiles are not exactly the same due to the differences in the methods, but they are close enough for a meaningful comparison. All three models displayed some evolution caused primarily by the development of a rather moderate spiral instability in the outer parts of the disc. In general, though, the changes in velocity dispersion profiles and other dynamical characteristics were minor, indicating that the initial conditions were in a good equilibrium. To verify this, we computed the orbits of disc particles for the same time interval in a fixed potential of all three components, using the two potential approximations constructed from the particles of the initial N-body snapshot for each method. The final states of these fixed-potential models were very close to the initial ones (we do not plot the respective curves because they very nearly coincide), except for the vertical velocity dispersion in mkgalaxy (shown as a dash-double-dotted line), which, as mentioned above, was underestimated in the initial conditions.

Self-consistent disc–bulge–halo models constructed by three different methods: mkgalaxy (McMillan & Dehnen 2007), galactics (Widrow et al. 2008), and agama (this paper). Top panel shows the intended rotation curve, which is reproduced with minor deviations by all codes. Other panels show the radial profiles of three components of velocity dispersion tensor of the disc component, normalized by the value of vertical velocity dispersion that would correspond to an isolated thin disc: $\sigma _\mathrm{z,iso} \equiv \sqrt{G\, M_\mathrm{disc}\, h / R_\mathrm{disc}^2}\: \exp \big (-\frac{1}{2} R/R_\mathrm{disc}\big)$, where Mdisc = 1, Rdisc = 1, h = 1/16 are the mass, scale radius, and scale height of the disc. Solid lines correspond to the initial models, and dashed lines – to the models evolved for 100 time units. Dash-double-dotted line in the second panel shows the vertical velocity dispersion profile for a model evolved in a fixed (initial) potential, which is noticeably higher than the initial one; for other methods and velocity components, the fixed-potential evolution leaves the profiles essentially unchanged and they are not shown to avoid crowding. Except for this one case, the evolution of live models leads only to rather moderate structural changes, mainly in the outer parts that develop some degree of spiral instability. Fig. 2 in Vasiliev & Athanassoula (2015) shows three other methods in the same comparison project (note that σR and σϕ are erroneously swapped in that figure).
Figure 7.

Self-consistent disc–bulge–halo models constructed by three different methods: mkgalaxy (McMillan & Dehnen 2007), galactics (Widrow et al. 2008), and agama (this paper). Top panel shows the intended rotation curve, which is reproduced with minor deviations by all codes. Other panels show the radial profiles of three components of velocity dispersion tensor of the disc component, normalized by the value of vertical velocity dispersion that would correspond to an isolated thin disc: |$\sigma _\mathrm{z,iso} \equiv \sqrt{G\, M_\mathrm{disc}\, h / R_\mathrm{disc}^2}\: \exp \big (-\frac{1}{2} R/R_\mathrm{disc}\big)$|⁠, where Mdisc = 1, Rdisc = 1, h = 1/16 are the mass, scale radius, and scale height of the disc. Solid lines correspond to the initial models, and dashed lines – to the models evolved for 100 time units. Dash-double-dotted line in the second panel shows the vertical velocity dispersion profile for a model evolved in a fixed (initial) potential, which is noticeably higher than the initial one; for other methods and velocity components, the fixed-potential evolution leaves the profiles essentially unchanged and they are not shown to avoid crowding. Except for this one case, the evolution of live models leads only to rather moderate structural changes, mainly in the outer parts that develop some degree of spiral instability. Fig. 2 in Vasiliev & Athanassoula (2015) shows three other methods in the same comparison project (note that σR and σϕ are erroneously swapped in that figure).

This example illustrates that the DF-based self-consistent method in agama produces models that are at least as good as the other commonly used methods; however, it offers much greater flexibility in choosing the model parameters, number, and type of components, etc. It is also competitive in terms of computational effort, taking about 1 min of wall-clock time on a 16-core workstation.

6 DISCUSSION

We presented a software framework for stellar dynamics and galaxy modelling. It provides general-purpose methods for constructing smooth potentials from arbitrary density distributions or from N-body snapshots, conversion between position/velocity and action/angle variables (presently only for spherical or oblate axisymmetric potentials), distribution functions and their moments, iterative construction of multicomponent self-consistent models.

Methods that are new or considerably improved compared to previous works include:

  • Efficient and versatile general-purpose potential representations in terms of spherical-harmonic and azimuthal-harmonic expansion. They can be used to accurately compute the potential of almost any density profile provided in a functional form, as well as construct a smooth approximation to the potential of an N-body system. The first of these approaches is a more flexible alternative to the familiar basis-set expansion technique, while the second has been little used so far in the astrophysical community, but is highly suitable for discy galaxies, especially non-axisymmetric.

  • A novel implementation of the Stäckel fudge approach for computing approximate action/angle variables in arbitrary axisymmetric potentials, which is more accurate and efficient than other existing codes.

  • Several types of action-based DFs, including improvements and generalizations of previously proposed ones, and a new class of disc DF formulated entirely in terms of actions.

  • Various tools for working with DFs: optimized computation of DF moments and 1D marginalized velocity distributions, general routines for sampling from a multidimensional probability distribution and for non-parametric penalized density estimates from discrete samples.

  • Framework for iterative construction of DF-based multicomponent self-consistent models, extending previous similar approaches and improving both the performance and flexibility.

The agama library is written in C++ , provides python and fortran interfaces and bindings to several other stellar-dynamical packages. Considerable effort is invested into computational efficiency and mathematical robustness. Most time-consuming operations are internally parallelized using OpenMP, both when using the library natively in C++ programs or as a python extension module.

6.1 Comparison to other similar projects

There are several other software projects with broadly similar scope.

The most well-known is galpy (Bovy 2015) – a python package for galactic dynamics, offering a large collection of gravitational potentials, orbit integration routines, methods for conversion between position/velocity and action/angle coordinates, action-based distribution functions, coordinate conversion, and plotting facilities. As evident from this list, there is a lot of overlap with the features provided by agama. galpy has a larger variety of built-in potentials, however, agama offers two general-purpose potential solvers that can suit almost any imaginable need, and can be constructed from arbitrary user-provided density profiles as well as from N-body snapshots. Both libraries have several methods for action computation: isochrone, arbitrary spherical potentials, Stäckel fudge for axisymmetric potentials. galpy additionally offers adiabatic approximation (which is generally inferior compared to the Stäckel fudge) and a method based on canonical transformation using numerically integrated orbits, which is more accurate but far more expensive; on the other hand, the most practical Stäckel approximation in agama is more accurate and substantially faster. galpy is primarily written in python, but for greater efficiency parts of the code (some potentials and action finders) are also re-written in C, which leads to a rather complicated internal structure and code duplication. By contrast, all computationally intensive parts of agama are written in C++; the python interface exposes most of the underlying functionality without sacrificing the performance, but offering greater flexibility (for instance, a user-provided python function can be used as a density profile for the two general-purpose potential solvers). Perhaps the best approach is to combine the advantages of both packages – for instance, any agama potential can be used as a regular galpy potential (the library provides a simple wrapper), its action finders can also serve as more efficient substitutes to those from galpy, while the latter provides nice visualization features.

Another package with similar functionality is gala (Price-Whelan 2017), written in cython (a more performance-oriented superset of python that is compiled just like C/C++ code). It also offers a collection of gravitational potentials (most of them have counterparts in agama), orbit integration routines, and an orbit-based action transformation method of Sanders & Binney (2014).

tact (Sanders & Binney 2015a) is a C++ library with the broadest collection of action transformation methods; those missing from agama, such as the triaxial Stäckel fudge or the generating function method, can be easily coupled to it thanks to the shared language and similar conventions.

6.2 Possible usage scenarios

  • Approximate the potential of a snapshot from an N-body simulation with a smooth non-parametric potential expansion, which faithfully represents global features while discarding small-scale noise. This drastically reduces the amount of information needed to represent the ‘frozen’ potential of the given system, which can then be used to integrate and classify orbits of particles. This task traditionally involved using the same Poisson solver as in the simulation itself (e.g. a tree code), as in Valluri et al. (2010, 2016); Portail, Wegg & Gerhard (2015), or approximating the potential using a basis-set expansion (e.g. Hoffman et al. 2010; Bryan et al. 2012; Röttgers, Naab & Oser 2014), or fitting it by some combination of analytic models (e.g. Muzzio et al. 2005; Machado & Manos 2016; Maffione et al. 2018). The N-body potential solvers are both expensive and noisy, and analytic models or conventional basis sets do not offer enough flexibility and could lead to significant biases (e.g. Carpintero & Wachlin 2006; Kalapotharakos, Efthymiopoulos & Voglis 2008). Our spline-interpolated spherical- and azimuthal-harmonic expansions provide an optimal balance between efficiency and accuracy of potential approximation; the first of these is used in Zhu et al. (2017), le Bret et al. (in preparation).

  • These potential solvers can also be applied to smooth density models, in which the alternative approaches for solving the Poisson equation may be expensive, such as a triaxial bar or spiral arms (e.g. Pichardo et al. 2003; Antoja et al. 2011; Fragkoudi et al. 2015). They are also used for orbit computation in the Schwarzschild (1979) orbit-superposition method, implemented in the smile code (Vasiliev 2013), which is included in the library.

  • Instead of classifying numerically integrated orbits, one may compute particle actions and use clustering algorithms to detect substructure such as clumps or streams (e.g. Sanderson, Helmi & Hogg 2015; Helmi et al. 2017).

  • DF-based models provide a compact representation of the full 6D phase space of N-body models, similar to what smooth potentials do for the 3D density profile. At present, agama provides either non-parametric spherical isotropic DFs (1D), or a choice from several families of parametric three-integral models.

  • Spherical isotropic DFs, while highly idealized, nevertheless can serve as a good approximation to some collisional systems, such as globular clusters. They can be used to compute classical two-body relaxation coefficients, which have been shown to describe the actual evolution of corresponding N-body systems quite well (e.g. Vasiliev 2015). Using the tools from agama, Beraldo e Silva et al. (2017) demonstrated the agreement between entropy evolution measured from N-body simulations and predicted by the smooth DF extracted from the simulations.

  • DF-based models are well suited for describing observed stellar systems. A prime example is our Galaxy, where such models have been fitted to a vast collection of observational data (e.g. the Besançon model, Robin et al. 2003). Much effort has gone into determination of suitable DFs describing the kinematics of stars in the solar neighbourhood (e.g. Binney 2010; Sharma et al. 2014; Sanders & Binney 2015b; Posti et al. 2018) or in the galactic halo (e.g. Williams & Evans 2015b; Das et al. 2016; Das & Binney 2016), assuming that the gravitational potential is known and focusing on velocity anisotropy and rotation, or relations between age, chemical composition, and kinematics of different stellar populations.

  • One may instead consider the stars as kinematic tracers of the potential, and use their motion to constrain the total distribution of matter (both visible and dark). This is often performed using Jeans equations, but DF-based models have an advantage of always providing a physically valid solution (which is not guaranteed in the Jeans approach). McMillan & Binney (2013), Ting, Rix & Bovy J. van de Ven (2013), Piffl et al. (2014), and Trick et al. (2016) constructed action-based DF models of the Milky Way, varying the parameters of both the DF and the potential when fitting them to the observations. The DF approach has also been applied to infer the potential of nearby dwarf spheroidal galaxies from radial velocities of individual stars (e.g. Wilkinson et al. 2002; Amorisco & Evans 2011; Walker & Peñarrubia 2011), and Pascale et al. (2018) pioneered the use of action-based self-consistent models for one of these systems, although only in a spherical geometry. In this approach the parameters of DF and the potential are varied independently, and the contribution of stars to the total potential is typically ignored as negligible.

  • Ultimately, the complete galaxy model must combine the potential and the DFs of all components (including dark matter) in a dynamically self-consistent way. Binney & Piffl (2015), Piffl et al. (2015), and Cole & Binney (2017) used the iterative method with action-based DFs to construct such models of Milky Way. agama brings this method to a new level by substantially improving the performance and abandoning the ad hoc approximation of separable disc density used in computing its potential, in favour of more general potential solvers. These models can be fitted to a variety of observational constraints and hence present the most general way of self-consistently describing the Galaxy in terms of just a handful of parameters (Binney & Vasiliev, in preparation).

  • The same method may be used to construct an equilibrium galaxy model with known (or assumed) DFs and then use it to generate an N-body snapshot. In this context it is similar to the galactics code (Kuijken & Dubinski 1995; Widrow et al. 2008; Taranu et al. 2017) but more flexible and potentially more accurate thanks to the general-purpose potential solvers. The formulation of the DF in terms of actions, as opposed to any other integrals of motion, is also beneficial especially for construction of multicomponent models, which can be done simply by adding together several DFs and readjusting the total potential without modifying the explicit expressions for DFs.

  • Axisymmetric models are only an approximation for real disc galaxies (such as Milky Way), but they could serve as a suitable starting point for a more complicated analysis. In this aspect, an explicitly known three-integral DF expressed in terms of actions opens up many possibilities, e.g. using perturbation theory to study deviations from axisymmetric equilibrium state (e.g. Monari et al. 2016; Binney 2018), normal modes and instabilities of self-gravitating discs (e.g. Kalnajs 1977; Jalali & Hunter 2005; Polyachenko 2005), or the impact of resonances on the relaxation of dynamically cold systems (e.g. Fouvry et al. 2015).

  • DF-based models are a valuable tool in generating mock catalogues for the given survey parameters (e.g. the galaxia code, Sharma et al. 2011). These mock data sets may be used to test the performance of the chosen modelling approach (agama was used in this context by Zhu et al. 2016 to validate their discrete Jeans models). The method for sampling from a multidimensional probability distribution (the product of a DF and a selection function of the survey) provided in agama is completely general and quite efficient, potentially superseding earlier schemes tuned for a particular form of DF.

6.3 Advantages and limitations

The approach to dynamical modelling based on a DF has several advantages both in theoretical and observational applications. A DF is a smooth representation of a stellar system and can be treated as a probability distribution in various contexts. One is the creation of equilibrium N-body models with arbitrary large number of particles, something that cannot be easily done with the Schwarzschild (1979) or made-to-measure (M2M, Syer & Tremaine 1996) models, which have a finite number of discrete elements. The other comes into play when comparing models to data using a likelihood approach. In general, there seems to be no practical way of doing this if both the model and the data points are discrete samples of some underlying probability distribution, without smoothing or binning either of them (see Saha 1998, Chaname, Kleyna & van der Marel 2008 and de Lorenzi et al. 2008 in the context of N-body, Schwarzschild and M2M models, respectively). It has also been argued that the discrete nature of Schwarzschild-type models presents challenges in likelihood-based inference of high-quality data such as that for stars in the Milky Way (McMillan & Binney 2013), which are avoided in models with a smooth DF. Finally, it is easy to compare different models specified in terms of a DF with a known functional form, while this hardly can be done if the model consists of individual particles or orbits.

Of course, this approach is not without limitations. It relies on the fundamental assumption of integrability of motion in the given potential. Clearly, this assumption is violated in triaxial systems, such as elliptical or barred disc galaxies, but even in purely axisymmetric systems the phase space may contain multiple orbit families separated by chaotic layers. Nevertheless, the formalism of action/angle variables may be meaningfully used even in this case (e.g. Kaasalainen 1995a,b; Binney 2016). Still, even ignoring the complications arising from non-integrability, one may argue that if the method used to compute actions behaves regularly in the presence of resonant and chaotic orbits, then the description in terms of a smooth DF remains approximately valid for the task of creating a (nearly) self-consistent model (Binney 2018). Other limitations of the present implementation, specific to action computation, are its restriction to axisymmetric systems and the use of the Stäckel fudge. They are not fundamental and may be lifted in the future; the work of Sanders & Evans (2015) demonstrates both the extension of iterative DF-based method to non-rotating triaxial models of elliptical galaxies and the use of a more accurate action determination algorithm, which was found not to change the results appreciably.

To summarize, the agama framework complements (and in some aspects supersedes) the existing software libraries for galaxy modelling, and can be used in many theoretical and observationally motivated applications. As with most scientific software, it is being continuously developed; we hope that releasing it publicly will be beneficial for the community, and welcome the feedback and suggestions for further extension and improvement.

ACKNOWLEDGEMENTS

I am grateful to L. Beraldo e Silva, J. Binney, D. Cole, P. Das, V. Debattista, W. Dehnen, K. Hattori, E. Polyachenko, L. Posti, J. Sanders, M. Valluri, the anonymous referee, and other colleagues for valuable comments on the paper and continuous feedback on the code. This work was supported by the European Research council under the 7th Framework programme (grant No. 321067).

Footnotes

1

There are various definitions of I3 in the literature, but they are equivalent up to a shift or multiplication by a constant that depends on E, L|$z$|, and Δ.

REFERENCES

Amorisco
N. C.
,
Evans
N. W.
,
2011
,
MNRAS
,
411
,
2118

Antoja
T.
,
Figueras
F.
,
Romero-Gómez
M.
,
Pichardo
B.
,
Valenzuela
O.
,
Moreno
E.
,
2011
,
MNRAS
,
418
,
1423

Beraldo e Silva
L.
,
de Siqueira Pedra
W.
,
Sodré
L.
,
Perico
E. L. D.
,
Lima
M.
,
2017
,
ApJ
,
846
,
125

Bienaymé
O.
,
Robin
A. C.
,
Famaey
B.
,
2015
,
A&A
,
581
,
A123

Binney
J.
,
2010
,
MNRAS
,
401
,
2318

Binney
J.
,
2012
,
MNRAS
,
426
,
1324

Binney
J.
,
2014
,
MNRAS
,
440
,
787

Binney
J.
,
2016
,
MNRAS
,
462
,
2792

Binney
J.
,
2018
,
MNRAS
,
474
,
2706

Binney
J.
,
Kumar
S.
,
1993
,
MNRAS
,
261
,
584

Binney
J.
,
McMillan
P.
,
2011
,
MNRAS
,
413
,
1889

Binney
J.
,
McMillan
P. J.
,
2016
,
MNRAS
,
456
,
1982

Binney
J.
,
Piffl
T.
,
2015
,
MNRAS
,
454
,
3653

Binney
J.
,
Tremaine
S.
,
2008
,
Galactic Dynamics
. 2nd edn.,
Princeton Univ. Press
,
Princeton
, p.
920

Bland-Hawthorn
J.
,
Gerhard
O.
,
2016
,
ARA&A
,
54
,
529

Boily
Ch.
,
Kroupa
P.
,
Peñarrubia
J.
,
2001
,
New Astron.
,
6
,
27

Bovy
J.
,
2015
,
ApJS
,
216
,
29

Bovy
J.
,
Rix
H.-W.
,
2013
,
ApJ
,
779
,
115

Bovy
J.
,
Rix
H. W.
,
Liu
C.
,
Hogg
D. W.
,
Beers
T. C.
,
Lee
Y. S.
,
2012
,
ApJ
,
753
,
148

Bryan
S. E.
,
Mao
S.
,
Kay
S. T.
,
Schaye
J.
,
Dalla Vecchia
C.
,
Booth
C. M.
,
2012
,
MNRAS
,
422
,
1863

Carpintero
D. D.
,
Wachlin
F. C.
,
2006
,
CeMDA
,
96
,
129

Carpintero
D.
,
Maffione
N.
,
Darriba
L.
,
2014
,
Astron. Comput.
,
5
,
19

Chanamé
J.
,
Kleyna
J.
,
van der Marel
R.
,
2008
,
ApJ
,
682
,
841

Cohl
H. S.
,
Tohline
J. E.
,
1999
,
ApJ
,
527
,
86

Cole
D. R.
,
Binney
J.
,
2017
,
MNRAS
,
465
,
798

Cuddeford
P.
,
1991
,
MNRAS
,
253
,
414

Das
P.
,
Binney
J.
,
2016
,
MNRAS
,
460
,
1725

Das
P.
,
Williams
A.
,
Binney
J.
,
2016
,
MNRAS
,
463
,
3169

de Lorenzi
F.
,
Gerhard
O.
,
Saglia
R. P.
,
Sambhus
N.
,
Debattista
V. P.
,
Pannella
M.
,
Méndez
R. H.
,
2008
,
MNRAS
,
385
,
1729

de Zeeuw
T.
,
1985
,
MNRAS
,
216
,
273

Debattista
V. P.
,
Sellwood
J. A.
,
2000
,
ApJ
,
543
,
704

Dehnen
W.
,
1999
,
AJ
,
118
,
1201

Dehnen
W.
,
2000
,
ApJ
,
536
,
L39

Dehnen
W.
,
Binney
J.
,
1998
,
MNRAS
,
294
,
429

Emsellem
E.
et al. ,
2007
,
MNRAS
,
379
,
401

Famaey
B.
,
Van Caelenberg
K.
,
Dejonghe
H.
,
2002
,
MNRAS
,
335
,
201

Fouvry
J. B.
,
Pichon
C.
,
Magorrian
J.
,
Chavanis
P. H.
,
2015
,
A&A
,
584
,
A129

Fragkoudi
F.
,
Athanassoula
E.
,
Bosma
A.
,
Iannuzzi
F.
,
2015
,
MNRAS
,
450
,
229

Hairer
E.
,
Nørsett
S.
,
Wanner
G.
,
1993
,
Solving Ordinary Differential Equations
.
Springer–Verlag
,
Berlin
, p.
528

Helmi
A.
,
Veljanoski
J.
,
Breddels
M. A.
,
Tian
H.
,
Sales
L. V.
,
2017
,
A&A
,
598
,
A58

Hernquist
L.
,
Ostriker
J. P.
,
1992
,
ApJ
,
386
,
375

Hoffman
L.
,
Cox
T. J.
,
Dutta
S.
,
Hernquist
L.
,
2010
,
ApJ
,
723
,
818

Holley-Bockelmann
K.
,
Weinberg
M.
,
Katz
N.
,
2005
,
MNRAS
,
363
,
991

Jalali
M. A.
,
Hunter
C.
,
2005
,
ApJ
,
630
,
804

Jeffreson
S. M. R.
et al. ,
2017
,
MNRAS
,
469
,
4740

Kaasalainen
M.
,
1995a
,
MNRAS
,
275
,
162

Kaasalainen
M.
,
1995b
,
Phys. Rev. E
,
52
,
1193

Kalapotharakos
C.
,
Efthymiopoulos
C.
,
Voglis
N.
,
2008
,
MNRAS
,
383
,
971

Kalnajs
A. J.
,
1977
,
ApJ
,
212
,
637

Kuijken
K.
,
Dubinski
J.
,
1995
,
MNRAS
,
277
,
1341

Laakso
T.
,
Kaasalainen
M.
,
2013
,
Phys. D
,
243
,
14

Lilley
E. J.
,
Sanders
J. L.
,
Evans
N. W.
,
Erkal
D.
,
2018
,
MNRAS
,
476
,
2092

Machado
R. E. G.
,
Manos
T.
,
2016
,
MNRAS
,
458
,
3578

Maffione
N. P.
et al. ,
2018
,
MNRAS
,
478
,
4052

McMillan
P. J.
,
2017
,
MNRAS
,
465
,
76

McMillan
P. J.
,
Binney
J. J.
,
2013
,
MNRAS
,
433
,
1411

McMillan
P. J.
,
Dehnen
W.
,
2007
,
MNRAS
,
378
,
541

Meiron
Y.
,
Li
B.
,
Holley-Bockelmann
K.
,
Spurzem
R.
,
2014
,
ApJ
,
792
,
98

Monari
G.
,
Famaey
B.
,
Siebert
A.
,
2016
,
MNRAS
,
457
,
2569

Muzzio
J. C.
,
Carpintero
D. D.
,
Wachlin
F. C.
,
2005
,
CeMDA
,
91
,
173

Pascale
R.
,
Posti
L.
,
Nipoti
C.
,
Binney
J.
,
2018
,
MNRAS
,
480
,
927

Pichardo
B.
,
Martos
M.
,
Moreno
E.
,
Espresate
J.
,
2003
,
ApJ
,
582
,
230

Piffl
T.
et al. ,
2014
,
MNRAS
,
445
,
3133

Piffl
T.
,
Penoyre
Z.
,
Binney
J.
,
2015
,
MNRAS
,
451
,
639

Polyachenko
E. V.
,
2005
,
MNRAS
,
357
,
559

Portail
M.
,
Wegg
C.
,
Gerhard
O.
,
2015
,
MNRAS
,
450
,
L66

Portegies Zwart
S.
,
McMillan
S.
,
van Elteren
E.
,
Pelupessy
I.
,
de Vries
N.
,
2013
,
Comput. Phys. Commun.
,
184
,
456

Posti
L.
,
Binney
J.
,
Nipoti
C.
,
Ciotti
L.
,
2015
,
MNRAS
,
447
,
3060

Posti
L.
,
Helmi
A.
,
Veljanoski
J.
,
Breddels
M. A.
,
2018
,
A&A
,
615
,
A70

Prendergast
K. H.
,
Tomer
E.
,
1970
,
AJ
,
75
,
674

Price-Whelan
A.
,
2017
,
J. Open Source Softw.
,
2
,
388

Rix
H. W.
,
White
S. D. M.
,
1990
,
ApJ
,
362
,
52

Robin
A. C.
,
Reylé
C.
,
Derrière
S.
,
Picaud
S.
,
2003
,
A&A
,
409
,
523

Röttgers
B.
,
Naab
T.
,
Oser
L.
,
2014
,
MNRAS
,
445
,
1065

Rowley
G.
,
1988
,
ApJ
,
331
,
124

Saha
P.
,
1998
,
AJ
,
115
,
1206

Sanders
J.
,
2012
,
MNRAS
,
426
,
128

Sanders
J.
,
Binney
J.
,
2015a
,
MNRAS
,
447
,
2479

Sanders
J.
,
Binney
J.
,
2015b
,
MNRAS
,
449
,
3479

Sanders
J. L.
,
Binney
J.
,
2014
,
MNRAS
,
441
,
3284

Sanders
J. L.
,
Binney
J.
,
2016
,
MNRAS
,
457
,
2107

Sanders
J. L.
,
Evans
N. W.
,
2015
,
MNRAS
,
454
,
299

Sanderson
R. E.
,
Helmi
A.
,
Hogg
D. W.
,
2015
,
ApJ
,
801
,
98

Schwarzschild
M.
,
1979
,
ApJ
,
232
,
236

Sharma
S.
,
Bland-Hawthorn
J.
,
2013
,
ApJ
,
773
,
183

Sharma
S.
,
Bland-Hawthorn
J.
,
Johnston
K. V.
,
Binney
J.
,
2011
,
ApJ
,
730
,
3

Sharma
S.
et al. ,
2014
,
ApJ
,
793
,
51

Syer
D.
,
Tremaine
S.
,
1996
,
MNRAS
,
282
,
223
M2M

Taranu
D. S.
et al. ,
2017
,
ApJ
,
850
,
70

Teuben
P.
,
1995
, in
Shaw
R. A.
,
Payne
H. E.
,
Hayes
J. J. E.
, eds,
ASP Conf. Ser. Vol. 77, Astronomical Data Analysis Software and Systems IV
.
Astron. Soc. Pac
,
San Francisco
p.
398

Ting
Y. S.
,
Rix
H. W.
,
Bovy
J.
,
van de Ven
G.
,
2013
,
MNRAS
,
434
,
652

Trick
W. H.
,
Bovy
J.
,
Rix
H. W.
,
2016
,
ApJ
,
830
,
97

Valluri
M.
,
Debattista
V. P.
,
Quinn
T.
,
Moore
B.
,
2010
,
MNRAS
,
403
,
525

Valluri
M.
,
Shen
J.
,
Abbott
C.
,
Debattista
V. P.
,
2016
,
ApJ
,
818
,
141

Vasiliev
E.
,
2013
,
MNRAS
,
434
,
3174

Vasiliev
E.
,
2015
,
MNRAS
,
446
,
3150

Vasiliev
E.
,
2017
,
ApJ
,
848
,
10

Vasiliev
E.
,
2018
, preprint (arXiv:1802.08255)

Vasiliev
E.
,
Athanassoula
E.
,
2015
,
MNRAS
,
450
,
2842

Walker
M. G.
,
Peñarrubia
J.
,
2011
,
ApJ
,
742
,
20

Widrow
L. M.
,
Pym
B.
,
Dubinski
J.
,
2008
,
ApJ
,
679
,
1239

Wilkinson
M. I.
,
Kleyna
J.
,
Evans
N. W.
,
Gilmore
G.
,
2002
,
MNRAS
,
330
,
778

Williams
A.
,
Evans
N.
,
2015a
,
MNRAS
,
448
,
1360

Williams
A.
,
Evans
N.
,
2015b
,
MNRAS
,
454
,
698

Yurin
D.
,
Springel
V.
,
2014
,
MNRAS
,
444
,
62

Zhao
H.
,
1996
,
MNRAS
,
278
,
488

Zhu
L.
,
van de Ven
G.
,
Watkins
L. L.
,
Posti
L.
,
2016
,
MNRAS
,
463
,
1117

Zhu
Q.
,
Hernquist
L.
,
Marinacci
F.
,
Springel
V.
,
Li
Y.
,
2017
,
MNRAS
,
466
,
3876

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)