A New Method and a New Scaling For Deriving Fermionic Mean-field Dynamics

We introduce a new method for deriving the time-dependent Hartree or Hartree-Fock equations as an effective mean-field dynamics from the microscopic Schroedinger equation for fermionic many-particle systems in quantum mechanics. The method is an adaption of the method used in [Pickl, Lett. Math. Phys., 97(2):151-164, 2011] for bosonic systems to fermionic systems. It is based on a Gronwall type estimate for a suitable measure of distance between the microscopic solution and an antisymmetrized product state. We use this method to treat a new mean-field limit for fermions with long-range interactions in a large volume. Some of our results hold for singular attractive or repulsive interactions. We can also treat Coulomb interaction assuming either a mild singularity cutoff or certain regularity conditions on the solutions to the Hartree(-Fock) equations. In the considered limit, the kinetic and interaction energy are of the same order, while the average force is subleading. For some interactions, we prove that the Hartree(-Fock) dynamics is a more accurate approximation than a simpler dynamics that one would expect from the subleading force. With our method we also treat the mean-field limit coupled to a semiclassical limit, which was discussed in the literature before, and we recover some of the previous results. All results hold for initial data close (but not necessarily equal) to antisymmetrized product states and we always provide explicit rates of convergence.


Introduction
The behavior of an interacting many-body system in classical or quantum mechanics can be very complicated and the microscopic equations governing its behavior are usually practically impossible to solve for more than three or four particles. For large interacting systems it is therefore essential to use a statistical description in order to make statements about the typical behavior of a large number of particles. One type of such statistical descriptions is to approximate the microscopic dynamics by an effective one-body dynamics, i.e., replace the microscopic evolution equation with very many degrees of freedom by a simpler, usually non-linear equation with very few degrees of freedom. Famous examples are the Boltzmann, Navier-Stokes and Vlasov equations in classical mechanics, and the Hartree, Hartree-Fock and Gross-Pitaevskii equations in quantum mechanics; different regimes can lead to different effective evolution equations, e.g., kinetic equations or meanfield equations. Apart from the study of these equations and their interesting properties and consequences, it is an ongoing project of mathematical physics to derive them from the microscopic dynamics, i.e., to prove rigorously that the solutions to the effective equations approximate the solutions to the microscopic equations well in certain situations. Only some cases of such a rigorous derivation are known. In classical mechanics, this could for example be shown for the Vlasov equation [10], however, for the Boltzmann equation it has been shown only for very short times [30], and for the Navier-Stokes equation it is still an open problem (see [44] for an excellent overview). In quantum mechanics, starting in the 70s and by a series of recent works, the derivation of effective dynamics for bosons near a condensate is well understood, see [28,42,21,23,40,38,29] for the case of the Hartree equation and [16,17,18,19,20,37,36,5] for the Gross-Pitaevskii equation. However, despite recent progress [3,15,22,7], there are still several open questions regarding the derivation of mean-field dynamics for fermions which is the content of this article. For fermionic many-particle systems in quantum mechanics, the microscopic evolution is given by the Schrödinger equation (we set = 1 = 2m throughout this work) i∂ t ψ t = Hψ t (1) for antisymmetric complex-valued N -particle wave functions ψ t ∈ L 2 (R 3N ) (for simplicity, we neglect spin and always work in three dimensions). Antisymmetry means that ψ t (. . . , x j , . . . , x k , . . .) = −ψ t (. . . , x k , . . . , x j , . . .) ∀j = k. We consider Hamiltonians where H 0 j acts only on x j and v (N ) (x) = v (N ) (−x) is a real-valued pair-interaction potential (the superscript (N ) denotes a possible scaling, see below). According to (1), the unitary time evolution of an initial wave function ψ 0 is given by ψ t = e −iHt ψ 0 if H is self-adjoint which we henceforth assume. Note that for antisymmetric initial conditions ψ 0 , the wave function ψ t remains antisymmetric under the Schrödinger evolution (1) with Hamiltonian (2) for all times. For the desired mean-field description, consider N orthonormal one-particle wave functions (also called orbitals) ϕ t 1 , . . . , ϕ t N ∈ L 2 (R 3 ) which are solutions to the fermionic Hartree equations (sometimes called reduced Hartree-Fock equations). These are the coupled system of non-linear differential equations for j = 1, . . . , N , where ⋆ denotes convolution, and ρ t N = N i=1 |ϕ t i | 2 is the spatial density. Note that for orthonormal initial conditions ϕ 0 1 , . . . , ϕ 0 N , (3) preserves the orthonormality for all times. The term is called the mean-field. It can be viewed as the average value of the interaction potential at point x, created by particles distributed according to the density ρ t N . Note that closely related effective equations for fermions are the Hartree-Fock equations, where an additional exchange term is present on the right-hand side of (3). In general, the Hartree-Fock equations are expected to be a better approximation than the fermionic Hartree equations; however, the exchange term is always smaller than the direct term (4), and, as we see later, is negligibly small in our setting. Now suppose that some initial ϕ 0 1 , . . . , ϕ 0 N are given. Let the initial N -particle wave function be (here S N is the symmetric group and (−1) σ the sign of the permutation σ) is the antisymmetrized product of ϕ 1 , . . . , ϕ N . Then, under the Schrödinger evolution (1), this initial wave function evolves to ψ t = e −iHt ψ 0 . We want to compare this ψ t to the wave function N j=1 ϕ t j , where the ϕ t j are the solutions to the fermionic Hartree equations (3). In other words, if still ψ t ≈ N j=1 ϕ t j at some time t, then the Schrödinger dynamics is approximated well by the Hartree dynamics. To show such a statement is the goal of this article. We introduce its rigorous version in Section 2. Now note that in the presence of an interaction potential v (N ) it is in general never true that e −iHt N j=1 ϕ 0 j = N j=1 ϕ t j , since the interaction leads to correlations between the particles (the wave function evolves into a superposition of many antisymmetrized product states). These correlations are caused by deviations from the mean-field behavior, i.e., by fluctuations around the mean-field. For a rigorous derivation we thus need to consider a regime where the fluctuations become small for large N , while there is still non-trivial microscopic interaction. The appropriate limit N → ∞ is then called the mean-field limit.

Density ∝ N Regime
One such regime describes a system of fermions with very high density proportional to N . In this case, the microscopic wave function ψ t (x 1 , . . . , x N ) is a solution to the Schrödinger equation (here for non-relativistic particles) and ϕ t 1 , . . . , ϕ t N are solutions to the corresponding fermionic Hartree equations for j = 1, . . . , N (recall ρ t N = N i=1 |ϕ t i | 2 ), where w (N ) is a real-valued external field. The evolution (7) should be considered for initial data in a volume independent of N (e.g., particles confined by a nice external trapping potential w (N ) ). Then, due to the antisymmetry of ψ t , the kinetic energy is (at least) ∝ N 5 3 (which can, e.g., be read off from the kinetic energy inequality in Section 8.1). Together with the N -dependent prefactors in (7), the kinetic energy is thus of the same order in N as the interaction energy. The extra N − 1 3 in front of the time derivative comes from consideration of the time scales: Due to the high kinetic energy, the average velocities of the particles are ∝ N 1 3 , i.e., one can only expect to follow the dynamics for short times. Taking into account the N − 1 3 in front of the time derivative, one considers time scales of order one.
Note that the equation (7) has a semiclassical structure due to the N − 1 3 factors, which play the role of a very small parameter (like the in the Schrödinger equation with units). As a consequence, the solutions to the Schrödinger equation (7) are close to solutions to the classical Vlasov equation, e.g., in the sense that the Wigner transform of a solution to (7) is close to a classical phase space density ρ t (x, p) that solves the Vlasov equation. Such a derivation of the Vlasov equation from the microscopic Schrödinger equation has been considered in [34] and improved in [43]. A derivation of the mean-field dynamics (8) from the microscopic dynamics (7) has first been given in [15] for bounded analytic interactions and short times. In [7,6], the derivation was crucially improved in the sense that it was shown to hold for all times, for initial data with a certain semiclassical structure. Furthermore, the result holds for pseudo-relativistic free Hamiltonians and with fewer regularity assumptions on the interaction which is however still assumed to be non-singular.

Density ∝ 1 Regime
Let us now introduce a new fermionic mean-field regime which describes particles with density independent of N . The idea is that for such systems the volume is ∝ N and a mean-field approximation is valid for long-range interactions like Coulomb interaction. The microscopic evolution for the wave function ψ t (x 1 , . . . , x N ) is given by the Schrödinger equation (for non-relativistic particles) and the corresponding fermionic Hartree equations are , scaling exponent β ∈ R and v(x) = v(−x) a real-valued longrange interaction. The Schrödinger equation (9) is interesting for initial data with total kinetic energy bounded by AN , for some constant A > 0. By the kinetic energy inequality (see Section 8.1), this implies that the system volume is (at least) ∝ N . Now the crucial observation is that the total interaction energy for v with long-range decay is not ∝ N 2 (as one might expect from the ∝ N 2 terms in the double sum i<j in (9)), but in fact smaller, i.e., ∝ N 1+β , with some 0 ≤ β < 1. This can be seen heuristically by considering the mean-field for constant density and interactions |x| −s , since (V N denotes a volume with size ∝ N ) for appropriate s > 0 (see also Lemma 8.2). Thus, in the described situation, the kinetic energy and the interaction energy coming from the Equations (9) and (10) are of the same order in N . 1 Note that systems with total kinetic energy ∝ N and long-range interactions like Coulomb interaction are very relevant in many applications, e.g., for describing the dynamics of electrons in large molecules (e.g., chemical reactions of proteins) or solid states.
The described regime, to our knowledge, has not been considered in the literature before for a derivation of mean-field dynamics. Note, however, that [3,4,2,22] consider the case β = 1 which, for Coulomb interaction, leads to subleading interaction ∝ N − 1 3 .
1 However, let us for a moment consider Coulomb interaction and the corresponding β = 2 3 . Then one can show that the "mean force" ∇N − 2 3 (v⋆ρ t N ) ≤ CN − 1 3 , at least if the regularity conditions (57) on ρ t N hold. It is interesting to note that although the kinetic and interaction energy are of the same order in N here, the "mean force" on the particles is in many cases subleading. For those cases one would thus expect the dynamics (10) to be close to some sort of semiclassical (but in general not free) dynamics.

Outline
We introduce here a new method for deriving fermionic mean-field limits in both regimes. This method does not rely on BBGKY hierarchies but on a Gronwall type estimate for a suitably defined functional, which leads to many technical advantages. The method is defined and discussed in Section 2. In Section 3, we present our main results. The focus is on the new scaling from (9). We first state very general results in Section 3.1 that tell us under which conditions on the solutions to the fermionic Hartree equations does the derivation succeed, for arbitrary self-adjoint Hamiltonian with pair interaction. In Section 3.2, we evaluate these conditions for the scaling from (9). Under the only condition that the kinetic energy stays bounded by AN , we derive the mean-field limit (10) for weakly singular interactions, and for Coulomb interaction with singularity cut off or weakened in a very mild way. Under additional conditions on the solutions to the fermionic Hartree equations, we derive the mean-field limit also for full Coulomb interaction. In Section 3.3, we derive the meanfield equations (8) for basically bounded interactions and initial states with a certain semiclassical structure. For this case, we mostly only reproduce the results from [7], with minor improvements on the initial conditions. We give this derivation anyway to demonstrate the generality of the new method. All the proofs are given in Sections 4 to 9. Finally, let us remark that the length of the paper is due to the fact that we introduce the new method in great detail and generality. Once this is done in Sections 4 to 7.2, the actual proofs of the theorems are indeed very short, which exemplifies the technical advantages of the method.

The New Method
We now present the new method for deriving fermionic mean-field dynamics form the microscopic Schrödinger dynamics. The key of the method is to define a suitable functional α f (ψ, ϕ 1 , . . . , ϕ N ) that measures the closeness of a many-body wave function ψ ∈ L 2 (R 3N ) to the antisymmetrized product state N j=1 ϕ j , with ϕ 1 , . . . , ϕ N ∈ L 2 (R 3 ). For the definition of α f we need to introduce several projectors that play a crucial role in the proofs later.
The functional α f has first been introduced by one of the authors (P.P.) for bosons [38], that is, with p m = |ϕ ϕ| m . The functional was used in [38,29] for the derivation of the bosonic Hartree equation, and in [36,37] for the derivation of the Gross-Pitaevskii equation. A variant of it has been used in [14,13] to derive the effective dynamics of a tracer particle in a Bose gas.
Let us now explain these definitions a little further. First, note that p m , q m and P (N,k) are indeed projectors, since ϕ 1 , . . . , ϕ N are assumed to be orthonormal. As operators on L 2 (R 3 ), p 1 projects on the subspace spanned by ϕ 1 , . . . , ϕ N , and q 1 on its complement, i.e., in particular, where the last equation is true since P (N,k) contains exactly k q-projectors in each summand and q m p m = 0. We can thus define the decomposition ψ = N k=0 P (N,k) ψ of the microscopic wave function, where now each contribution P (N,k) ψ has (N − k) particles in one of the orbitals ϕ 1 , . . . , ϕ N ("good particles") and k particles not in the orbitals ϕ 1 , . . . , ϕ N ("bad particles").
With the functional α f , each contribution P (N,k) ψ is given the weight f (k), i.e., f specifies how much weight is given to the number of bad particles, i.e., the number of particles outside the antisymmetrized product state N j=1 ϕ j . Since 0 ≤ α f ≤ 1, α f ≈ 0 means (for suitable weight functions f ) that the approximation of ψ by N j=1 ϕ j is very good, while α f ≈ 1 means that the approximation is not valid at all. By choosing an appropriate f we can fine tune what exactly is meant by the closeness of ψ to N j=1 ϕ j . One obvious and very simple weight is the relative number k N . We always denote this weight function by and the corresponding functional by α n . For this weight function, due to the antisymmetry of ψ and (17), the functional has the simple form (recall that each summand in P (N,k) contains exactly k projectors q) Let us note here that α n has been used before in [1,25] to measure deviation from the antisymmetrized product structure in the static setting; see also the remarks following (32). Furthermore, α n coincides with the measure introduced in [7] for states in Fock space with fixed particle number N . Another important weight is with some 0 < γ ≤ 1. The function m (γ) (k) gives a much larger weight to already very few particles outside the antisymmetrized product structure. On the other hand, for k > N γ , i.e., very many particles outside the antisymmetrized product structure, m (γ) (k) gives the same weight 1 for all k > N γ . These properties enable us to derive mean-field approximations for a much wider range of physical situations, e.g., singular or weakly scaled interaction potentials. Let us stress that the freedom in the choice of the weight function is a key feature and advantage of the described method.
Note that we could also define projectors p ϕ = N m=1 p ϕ m and q ϕ = 1 − p ϕ . For antisymmetric ψ, the α f functional defined with those projectors coincides with α f from Definition 2.1, since k j=1 q ϕj N j=k+1 p ϕj sym ψ = P (N,k) ψ, as can be seen from multiplying out the left-hand side. Note that the projectors p ϕ are related to fermionic creation and annihilation operators a(ϕ) and a † (ϕ) (see, e.g., [32]) by p ϕ = a † (ϕ)a(ϕ).
The goal of this work is to prove bounds on α f (t) = α f ψ t , ϕ t 1 , . . . , ϕ t N , where ψ t is a solution to the Schrödinger equation and ϕ t 1 , . . . , ϕ t N are solutions to the fermionic Hartree equations. In more detail, we first look for a bound on the time derivative of α f (t) of the type which then, by Gronwall's Lemma, implies the bound where the function C(t) is independent of N , and δ > 0 is called the convergence rate. In the main theorems of Section 3, the weight function is either n from (18) or m (γ) from (20). A bound of the form (22) implies that if initially (at time t = 0) α f is small, then it stays small for times t > 0 and N large enough. In the limit of N → ∞, we arrive at the statement that lim N →∞ α f (t = 0) = 0 implies lim N →∞ α f (t) = 0 for all t > 0. Note that for all f ≥ n, we have α f ≥ α n , i.e., if the bound (21) holds for such an f , we find Thus, by using a weight function f ≥ n (which can be advantageous in the estimates, see below) we can still control α n (t), but now under stronger conditions on the initial state, namely that α f (0) has to be small in N . Let us give a brief overview of how a bound (21) can be obtained and thereby show that the idea to "count the number of particles" not in the antisymmetrized product is very natural and has a clear physical interpretation which is reflected in the proof. We first consider the weight function n(k) = k N . Note that q t 1 is a solution to the Heisenberg equation of motion i∂ t q t is the "mean-field Hamiltonian" from the right-hand side of (3), with V Using the antisymmetry of ψ t , we then find where, using p t 1 + q t 1 = 1 = p t 2 + q t 2 , Let us emphasize here that the kinetic and external field terms coming from the Schrödinger and the fermionic mean-field equations cancel, which is why the main theorems in Section 3.1 hold for any H 0 j . Also, terms (II) n and (III) n do not depend on the mean-field V (N ) 1 ; it is only term (I) n where a difference between Schrödinger interaction and the mean-field enters. Note that these three terms have a nice intuitive explanation. The change ∂ t α n (t) in the "number of bad particles" is given by three different kind of transitions due to the microscopic interaction: transitions between one good, one bad particle and two good particles (term (I) n ); between two bad and two good particles (term (II) n ); and transitions between two bad and one good, one bad particle (term (III) n ). Note, however, that for α n (0) = 0, i.e., ψ 0 = N j=1 ϕ 0 j , we find ∂ t α n (t) t=0 = 0. Thus, what causes the deviations from the antisymmetrized product structure in the first place, is a higher order effect, viz., the fluctuations around the mean-field. The exponential growth in time in (22) comes from the fact that the larger the number of bad particles, the more possibilities do good particles have to become correlated, i.e., as in (21), the change in the number of bad particles at some time t is expected to be proportional to the number of bad particles at this time (plus fluctuations).
Let us briefly discuss how the three terms in (24) can be bounded rigorously (Lemma 7.3). Let us first show why the terms can be bounded by C N α n (t) (plus error terms of O(N −1 ) 2 ), where C N denotes some possibly N -dependent constant, and afterwards discuss why C N in fact does not depend on N for the scalings discussed in Sections 1.1 and 1.2. Recall that ||q t 1 ψ t || = ψ t , q t 1 ψ t = α n (t). Thus, by using Cauchy-Schwarz, term (III) n is bounded by C N α n (t). In term (II) n we also have two q's available, but both q's are left from v (N ) 12 , so we cannot directly apply Cauchy-Schwarz. However, there is a trick that uses the antisymmetry of ψ t to shift the q t 2 to the right side of the scalar product, on the expense of a boundary term of O(N −1 ). Then, by Cauchy-Schwarz, we conclude (II) n ≤ C N (α n (t) + N −1 ). Term (I) n is the crucial term in the sense that there is only one q-projector available (which is not enough for the desired bound) and one thus has to use a cancellation between the microscopic interaction and the mean-field. From the proof of Lemma 7.3 we see that one can indeed extract an extra q-projector from the difference , again at the expense of an error term of O(N −1 ), which yields the desired bound. 3 For Coulomb interaction and the corresponding scaling v in front of the scalar products in the three terms from (24). For the desired bound one thus needs to gain an extra N − 1 3 from the scalar products, so that the constants C N from above are of O(1), i.e., N -independent. For the case discussed in Section 1.2, it is the long-range decay of the Coulomb interaction that gives us this additional factor. In the estimates, the p projectors are used to smooth out the singularity, which leads to terms like v ⋆ ρ t N or v 2 ⋆ ρ t N . From these terms we gain the additional N − 1 3 by using kinetic energy inequalities (Lieb-Thirring inequalities, see Section 8) which take into account the orthonormality of ϕ t 1 , . . . , ϕ t N (fermi pressure), i.e., the fact that the systems volume is ∝ N when the total kinetic energy is bounded by AN . This leads to the results in Section 3.2. In the semiclassical case, we have to gain the additional N − 1 3 by other means, namely from the fact that in a constant volume, again due to the orthonormality of ϕ t 1 , . . . , ϕ t N , the average velocities are ∝ N 1 3 . In the estimates, this is reflected by the condition Finally, let us discuss where and how the use of the weight function (20) helps in the estimates. For strong singularities or weak scalings, term (III) n can be problematic, since there is only one projector p available to smooth out the singularity and gain the additional N − 1 3 by the kinetic energy inequalities. Here we would like to improve the estimate. For general weight functions, ∂ t α f (t) is calculated in Section 6. Each of the transitions between good and bad particles is now weighted with the corresponding change in the weight function, i.e., a derivative of the weight function appears. More exactly, ∂ t α f (t) is again given by three terms similar to (24), but with ψ t replaced by where f ′(d) is a discrete derivative of f . (This is consistent with (24), since the derivative of The key observation is now that the derivative of the weight function m (γ) is zero for all k > N γ , i.e., all contributions coming from P (N,k) ψ t with k > N γ vanish. This helps us to make term (III) m (γ) small, which only gives contributions from large k due to the three projectors q. 2 We use the Landau notation, i.e., 3 As an aside, note that here lies the crucial difference compared to the bosonic Hartree equation, as treated in [38,29]. There, since there is just one orbital ϕ, i.e., pm = |ϕ ϕ|m, and since v (N) = N −1 v, one can directly calculate that i.e., the corresponding term (I)n can directly be seen to be bounded by C N αn(t) + N −1 . (Note that in [38,29], the term v ⋆ |ϕ| 2 (x 1 ) q 2 is usually regarded as being part of term (III)n.)

Connection to Density Matrices
It turns out that the functional α f is closely related to the trace and Hilbert-Schmidt norms of the difference between reduced one-particle density matrices. For any normalized antisymmetric ψ ∈ L 2 (R 3N ), the reduced one-particle density matrix is defined by its integral kernel Note that for an antisymmetrized product state N j=1 ϕ j we find Let us now consider α n , i.e., the α-functional with the weight n(k) = k N . Note that where tr(·) denotes the trace. It is the expression tr(γ ψ 1 q 1 ) that has been used before to measure deviation from the antisymmetrized product structure in the static setting, see [1,25]. The relations between α n and the differences between Schrödinger and Hartree reduced one-particle density matrices in trace norm ||·|| tr and Hilbert-Schmidt norm ||·|| HS are summarized in the following lemma.
Note that the extra N factors are due to the choice of normalization; indeed γ ϕj 1 tr = 1 and γ i.e., convergence of γ ψ 1 to γ ϕj 1 in certain norms is equivalent to convergence of α n to zero. However, there is a difference in the convergence rates, e.g., if α n converges with rate N −1 , then the density matrices converge in trace norm only with rate N − 1 2 , as can be seen from (33). Let us now consider more general weight functions f with f (k) ≥ n(k) for all k. This includes in particular the weight m (γ) (k) from (20) which we use later. For those weights, α f ≥ α n , thus Lemma 2.2 directly implies the following. Lemma 2.3. Let ψ ∈ L 2 (R 3N ) be antisymmetric and normalized, and let ϕ 1 , . . . , It follows that convergence of α f to zero still implies convergence of γ ψ 1 to γ ϕj 1 , but in general not the other way around.
To conclude this section, let us make a remark about convergence in operator norm. Note that ||γ ψ 1 || op ≤ N −1 for antisymmetric ψ, so a possible indicator of convergence would be the operator norm times N . This, however, is not a good type of convergence for our purpose, since the operator norm is given by the largest eigenvalue (which at most can be N −1 for fermionic density matrices; recall our choice of normalization). Thus, while convergence of N times the operator norm does imply convergence of α n , the opposite is not true. One orbital not in the antisymmetrized product of the ϕ 1 , . . . , ϕ N is enough to let the operator norm of N times the difference between the density matrices be equal to one, while α n converges to zero.

General Hamiltonians
The two Theorems 3.1 and 3.3 in this section cover very general Hamiltonians. The theorems are of the form: Given certain properties of the solutions to the fermionic Hartree equations, the mean-field approximation for the dynamics is good, i.e., α f (t) ≤ C(t) α f (0) + N −δ for some δ > 0. We consider wave functions ψ t ∈ L 2 (R 3N ) that are solutions to the Schrödinger equation (1) with self-adjoint Hamiltonian (2) with real and possibly scaled interaction The most important example for the free part H 0 j is the non-relativistic free Hamiltonian with external field, H 0 , but we could also replace the Laplacian by relativistic operators like The fermionic mean-field equations for the oneparticle wave functions ϕ t 1 , . . . , ϕ t N ∈ L 2 (R 3 ) are given by (3). The first theorem gives a bound on α n (t) = ψ t , q t 1 ψ t .
Then there is a positive C(t) = 9 D(t), such that Proof. See Section 7.4.

Remarks.
1. From Lemma 2.2 it follows that (40) implies for the reduced one-particle density matrices the bounds and (39) is only a condition on the solutions to the fermionic Hartree equations (3), and not on the solutions to the Schrödinger equation (1). We show in Theorem 3.4 for H 0 j = −∆ j + w (N ) (x j ) more specifically for which situations condition (39) holds. At this point, note that for solutions ϕ t 1 , . . . , ϕ t N ∈ H 1 (R 3 ) (the first Sobolev space) and interesting interactions, e.g., Coulomb interaction, the left-hand side of (39) is always finite due to Hardy's inequality. Therefore, the challenge lies in the N -dependence on the right-hand side of (39).

The condition
3. Note that condition (39) implies that (by Cauchy-Schwarz and ρ t N = N ) i.e., the scaled mean-field interaction is everywhere bounded.
4. There is an interesting connection between condition (39) and the fluctuations around the mean-field. At each point y ∈ R 3 , let us denote the "fluctuations" in the interaction at y in . , x N denote the integration variables in the scalar product). Then, by a simple calculation, we find Var . Therefore, if condition (39) holds, then the fluctuations around the mean-field vanish at each point for large N , with rate N −1 . Note that N −1 is the typical size of fluctuations in the (weak) law of large numbers, for independently identically distributed random variables. It is therefore not surprising that under this condition the derivation of the mean-field dynamics succeeds (although condition (39) could well be too restrictive; see also Theorem 3.3). The condition (39) can be relaxed and replaced by other conditions if we use the weight function m (γ) (k) from (20). This allows to treat more singular interactions and smaller scaling exponents. Let us first summarize the precise assumptions that we need on the scaled interaction v (N ) and the density ρ t N and afterwards state the theorem. sup sup Proof. See Section 7.4.

Remarks.
6. Similar to (41), the bound (48) implies and a similar bound for the Hilbert-Schmidt norm. However, in general it is not true that 7. We show in Theorem 3.4 for which situations Assumption 3.2 holds.

Density ∝ 1 Regime with Interactions |x| −s
In this section we explicitly consider the non-relativistic Schrödinger equation (9) and the corresponding fermionic Hartree equations (10), as discussed in Section 1.2. The results in this subsection are concerned with interaction potentials |x| −s with 0 < s < 6 5 , sometimes with singularity weakened or cutoff, and the corresponding β = 1 − s 3 (see also Lemma 8.2). For the following results we assume the existence of solutions to the equations (9) and (10) and that the total kinetic energy of the solutions to (10) (but not necessarily to (9)) is bounded by AN , i.e., N i=1 ||∇ϕ t i || 2 ≤ AN . There are several works about solution theory to the Hartree(-Fock) equations [8,12,11,9] which establish existence and uniqueness of solutions even for singular (attractive or repulsive) interactions and external fields like |x| −1 . A blowup of solutions is only expected to happen for strong attractive interactions (e.g., for gravitating fermions) with semirelativistic free Hamiltonian, see, e.g., [24,27,26]. (Indeed, as is shown in [24], even for bounded interactions, v ⋆ ρ t N can become infinite in the limit of t → T , for some T < ∞.) Therefore, due to the conservation of the total Hartree energy, N i=1 ||∇ϕ t i || 2 ≤ AN always holds if it holds for the initial states ϕ 0 1 , . . . , ϕ 0 N and if the external field is nice enough (e.g., scaled external Coulomb fields generated by nuclei with some N -independent distances to each other are ok). In the following, we give the desired bounds only in terms of α m (γ) , since the corresponding bounds for the convergence of density matrices can be read off from Lemmas 2.2 and 2.3.

Remarks.
8. Theorem 3.4 shows in particular that the mean-field approximation holds for Coulomb interaction with very mild singularity cutoff: either by bounding the Coulomb potential on a small ball with radius N −δ , δ < 1 6 around the singularity (i.e., changing it only in a volume shrinking in N ) or by replacing the singularity by the weaker singularity |x| −s , s < 1 3 . We can also treat full Coulomb interaction with singularity under the stronger conditions (57) or (39) on the solutions to the fermionic Hartree equations.
Finally, we remark that for all times t of order O(N −δ ) for any δ > 0, the dynamics (9) is indeed free (in the limit N → ∞). We summarize this in the following proposition which for simplicity we only state for Coulomb interaction without external fields.

Density ∝ N Regime
Let us now state our result for the derivation of the mean-field dynamics for the regime discussed in Section 1.1. Such a derivation has recently been given in [7] and here, we mostly reproduce these results (actually using estimates about the propagation of properties of the initial data from [7]). A slight improvement is that our conditions on the closeness of the initial data are more transparent and general. We consider the Schrödinger equation (7) (for simplicity, without external fields) and the corresponding fermionic Hartree equations (8). Note that for the interactions we consider the exchange term can easily be shown to be subleading (compared to the error term in the bound (66)), so the theorem can be proven directly for both the fermionic Hartree and for the Hartree-Fock equations . Note that here we do not have to use the long-range behavior of the interaction, since we are interested in solutions in some constant, N -independent volume, i.e., very high densities. That the average velocities are ∝ N 1 3 (or higher) is crucial for the proof and is captured by the condition (64). For technical reasons, we consider basically bounded interactions. Our theorem is analogous to [7, Thm. 2.1]. We denote the trace norm by ||·|| tr (see also Section 5) and the commutator by [A, B] = AB − BA.

Remarks.
9. It follows from Lemma 2.2 that (66) implies for density matrices the estimate for some constantsC 1 ,C 2 and a similar estimate for the Hilbert-Schmidt norm.
10. The theorem also holds with external fields that are such that they preserve the bounds (64) and (65) for all t.
11. The double exponential in the bound (66) comes from using the Gronwall Lemma twice: once for propagating the conditions (64) and (65) in time, and once for the time derivative of α n (t).

Notation and Preliminaries
We summarize here some notation, list often used inequalities and establish some basic properties of the projectors p ϕ m that we use in the proofs throughout the following sections. The Hilbert space of complex square integrable functions on R d is denoted by In order to distinguish the scalar product on L 2 (R 3N ) from scalar products on another L 2 (R d ) (usually L 2 (R 3 )) we always write ·, · for the scalar product on L 2 (R 3N ). We denote by ·, · a+1,...,N the scalar product only in the variables x a+1 , . . . , x N , i.e., it is a "partial trace" or "partial scalar product", formally defined for any χ, ψ ∈ L 2 (R 3N ) by ψ, χ a+1,...,N (x 1 , . . . , x a ) : which should be regarded as a vector in L 1 (R 3a ) (for χ = ψ, it is the diagonal of the reduced a-particle density matrix, a generalization of (30)). In the same style we denote by |· m a vector in L 2 (R 3 ) acting only on the m-th variable of L 2 (R 3 ) N , by ·| m its dual, and by ·, · m the scalar product only in the m-th variable. Given a function h : In general, subscripts i 1 , . . . , i a usually indicate that an operator acts only on x i1 , . . . , x ia . We always denote by B R (x) = y ∈ R d : |x − y| < R the open ball with radius R around x. For any set Ω ⊂ R d we write Ω c = R d \ Ω. We frequently use some well-known inequalities (for proofs, see, e.g., [31]), the Cauchy-Schwarz inequality for a, b ∈ C N , the Cauchy-Schwarz inequality for ψ, χ ∈ L 2 (Ω) (for (Ω, Σ, µ) a general measure space), and Hölder's inequality for f ∈ L p (Ω), g ∈ L q (Ω), with 1 ≤ p, q ≤ ∞, 1 p + 1 q = 1, ||f g|| 1 ≤ ||f || p ||g|| q . (72) Note that the operator p ϕ m from Definition 2.1 is indeed a projector on L 2 (R 3N ) and that for ϕ i ⊥ ϕ j and all m, n = 1, . . . , N we have which yields (77).
By inserting two identities 1 = p 1 + q 1 we find, using (90), (91), (85) and the triangle inequality, Since 0 ≤ α n ≤ 1, it is indeed true that . We find, using (90), (91), tr(q 1 γ ψ 1 p 1 ) = tr(p 1 γ ψ 1 q 1 ) = 0 and |tr(A)| ≤ ||A|| tr , that Note that indeed ||p 1 − q 1 || op = 1, since for all f ∈ L 2 (R 3 ), We now show γ We now show α n ≤ √ N γ The expression for the time derivative of for arbitrary weight functions f (k) follows from direct calculation (recall that the projectors P (N,k) depend on t through their dependence on ϕ t 1 , . . . , ϕ t N ). We use that the time derivative of ψ t ∈ L 2 (R 3N ) is given by the Schrödinger equation (1) with self-adjoint Hamiltonian H given by (2) with real interaction v (N ) (x) = v (N ) (−x). As effective equations for the one-particle wave functions ϕ t 1 , . . . , ϕ t N ∈ L 2 (R 3 ) we consider the general equations where the operator V (N ) can possibly depend on N , j and ϕ t 1 , . . . , ϕ t N . The two interesting cases are when there is only direct interaction, where ρ t N = N i=1 |ϕ t i | 2 , and when there is direct and exchange interaction, We calculate the time derivative of α f (t) in two steps. First, in Lemma 6.1 we directly calculate a straightforward expression for the time derivative. We then prove an auxiliary lemma which we use to bring the time derivative into a form (Lemma 6.5) that can nicely be estimated (later in Lemma 7.3). Lemma 6.1. Let ψ t ∈ L 2 (R 3N ) be an antisymmetric solution to the Schrödinger equation (1) and let ϕ t 1 , . . . , ϕ t N ∈ L 2 (R 3 ) be orthonormal solutions to the effective equations (99). Then where Proof. Note that the operators p m , q m , P (N,k) all depend on t through the orbitals ϕ t 1 , . . . , ϕ t N . For ease of notation, we do not explicitly write out this t-dependence. Let us first prove that f fulfills the Heisenberg equation of motion Note that and, using p m + q m = 1, Then, by applying the product rule and using H mf m , p n = 0 ∀m = n, it follows that which implies (103). Using this and the antisymmetry of ψ t , we find for the time derivative of α f (t), Let us now bring the expression (102) into a form we can use for the desired Gronwall estimate. For that, we need to define shifted operators and discrete derivatives of weight functions f . Definition 6.2. Let d ∈ Z and f be a weight function as in Definition 2.1.
(a) We define the shifted operators where for the last step we defined f (k) = 0 for all k < 0 and k > N .
(b) For d > 0, we define discrete first derivatives of f by where ½ A (k) = 1 for k ∈ A and 0 otherwise.
For Lemma 6.4 and the proof of Lemma 6.5 we need some more notation.
where the subscript sym means the symmetrized tensor product.
In accordance with Definition 2.1 we have P (k) 1...N = P (N,k) . For the proof of Lemma 6.5 we need an auxiliary Lemma that shows how to shift an f to the other side of a two-particle operator. Lemma 6.4. As in Definition 6.3, we abbreviate P (0) 12 = p 1 p 2 , P (1) 12 = p 1 q 2 + q 1 p 2 and P (2) 12 = q 1 q 2 . Let h 12 be an operator that acts only on the first and second particle index. Then, for all a, b = 0, 1, 2, Proof. We only prove (112) since (113) can be proven in just the same way. In the following calculation we use the splitting where P (d) 12 contains exactly d q-projectors, and P . (115) Note that Lemma 6.4 holds also more generally for m-particle operators h 1...m , i.e., holds for all a, b ∈ {0, . . . , m}. As last preparatory remark, note that for all f ≥ 0, since P (N,k) P (N,ℓ) = δ kℓ P (N,k) . 4 With Lemma 6.4 we can simplify the expression (102) for the time derivative of α f (t) by splitting it into three parts, each of which will be estimated separately later in Lemma 7.3. The advantage of having the square root of the "derivatives" f ′(d) next to ψ t is made clear in Lemma 7.1 and the estimates in Lemma 7.3. Lemma 6.5. Let ψ t ∈ L 2 (R 3N ) be an antisymmetric solution to the Schrödinger equation (1) and let ϕ t 1 , . . . , ϕ t N ∈ L 2 (R 3 ) be orthonormal solutions to the effective equations (99). Then, for all monotone increasing f (k), Remarks. 4 Note that for all 0 < s ∈ Q also the more general relations f 12. Note that the time derivative is formally the same as for bosons, where p 1 := |ϕ ϕ| 1 , see [37,38]. Note that in [37,38] the splitting into three summands is done slightly differently: compared to (118), an additional identity 1 = p 2 + q 2 is added in front of V (N ) 1 and the operator f ′(−d) 1 2 is pulled over to the left side of the scalar product. 13. For the case f (k) = n(k) = k N we find a simple expression for the time derivative of α n (t) = ψ t , q t 1 ψ t . It can easily be found by direct calculation as in (24) or from (118). Note that, in view of Definition 2.1 and the identity N k=0 P (N,k) = 1, and in the same way Then (118) simplifies to 14. Note that the proof of Lemma 6.5 can easily be generalized to m-particle interactions W 1...m . In this case, with the proper definition of W 1...m coming from Lemma 6.1 and using (116), we find Proof of Lemma 6.5. We calculate the time derivative of α f (t) starting from the expression (102) of Lemma 6.1. The idea of the proof is to insert two identities 1 = p 1 + q 1 and 1 = p 2 + q 2 in front of each ψ t (which leads to 16 summands) and then to use Lemma 6.4 in order to shift f . It turns out that a lot of terms drop out due to the commutator structure. Let us first note two auxiliary calculations. For all a, b = 0, 1, 2 with a > b, we have where the positivity is true since f was assumed to be monotone increasing, and Then, by inserting two identities 1 = 2 a=0 P (a) 12 (in the following sums, the indices a, b always run from 0 to 2), we find By inserting the definition , using p m q m = 0 and, in the last step, using q 2 = 1 − p t on the right side of the scalar product in the third summand, we find 7 Proof of Results for General Hamiltonians

Using the New Weight Function
In Section 6 we went through some trouble to handle general weight functions in the time derivative of α f (t). The next Lemma shows what we gain by choosing the weight function m (γ) (k) defined in (20). Morally, the Lemma says the following. By choosing γ < 1 compared to γ = 1, we make the convergence rate worse when there is no q available (see (131)), we do not loose anything when there is one q available (see (132)) and we gain powers in N when there are two q's available (see (133)). (20) for some 0 < γ ≤ 1 and let ψ ∈ L 2 (R 3N ) be antisymmetric and normalized. Then, for all d = 1, 2, and Remarks.
15. Note that more generally the estimate holds for all 1 < d < N and 1 < n < N , for some constant C that depends only on d and n.
Proof of Lemma 7.1. First, recall that for all antisymmetric φ ∈ L 2 (R 3N ), and in a similar way We denote by ⌊·⌋ the floor function, i.e., for any x ∈ R, ⌊x⌋ = max{m ∈ Z : m ≤ x}. Then we find and, by using (135), and, by using (136), Careful consideration of the boundary terms around the k = ⌊N γ ⌋ summands gives the values of the constants C(d) andC(d).

Diagonalization of p 2 h 12 p 2 and Related Lemmas
For handling the terms in the time derivative of α f (t) from equation (118), it is often useful to diagonalize operators p 2 h 12 p 2 . Let us briefly summarize what we mean by that and introduce some notation on the way. For any h : , the operator p 2 h 12 p 2 is a multiplication operator in x 1 and a projector onto the N -dimensional subspace span(ϕ 1 , . . . , ϕ N ) in the second variable. Therefore one can write it as an x 1 dependent non-negative self-adjoint (N × N )-matrix acting on the second variable, i.e., (recall our notation from Section 4) where the eigenvectors |χ x1 i 2 are orthonormal and can be written as |χ x1 1 , . . . , χ x1 N ) = span(ϕ 1 , . . . , ϕ N ) for all x 1 ∈ R 3 and that the projector p 2 is independent of the choice of basis, i.e., The eigenvalues λ i (x 1 ) have the properties that and furthermore, for all i = j, We use this diagonalization in the proof of Lemma 7.3 for the operator p 2 v (N ) 12 p 2 from term (I) f from the time derivative of α f (t) in (118), and to prove the following lemma which we need to bound term (II) f and (III) f from (118). Note that this lemma is similar to (83); the additional N −1 factors come from Lemma 4.1.
Proof. Recall our notation for the "partial scalar product" from (69). For proving (145) we use the diagonalization (140), Lemma 4.1 and (143). We find For proving (146) we diagonalize We call the eigenvalues µ j and the eigenvectorsφ j . With the diagonalization (140) and Lemma 4.1 we find If ψ is antisymmetric in all variables except x 3 , then, according to Lemma 4.1, one can only extract factors (N − 2) −1 instead of (N − 1) −1 , and (N − 1) −1 instead of N −1 from the antisymmetry of ψ.

Bounds on ∂ t α f (t)
We now give the rigorous bounds for the three terms in the time derivative of α f (t) given by (118).
Here, we use the weight function m (γ) (k) from (20). This also contains the case where γ = 1, i.e., where the weight function is n(k). The estimates are collected in the following lemma, which constitutes the heart of the proof of our main results. We state this lemma only for positive v (N ) . If v (N ) contains both positive and negative parts,  3 or Ω N = ∅). Then, using the weight function m (γ) (k) as in (20), we find for all 0 < γ ≤ 1 for the three terms from (118), Proof. For ease of notation, we omit all the subscripts and superscripts N , (N ) or (γ) in this proof, i.e., we abbreviate , ρ = ρ N , Ω = Ω N and m = m (γ) . Term (I) m . In the following, we diagonalize the operator p m v 1m p m and use the same notation as in (140). We split each eigenvalue into two parts, Note that, by using Cauchy-Schwarz, and For bounding (I) m it is useful to introduce the projectors that act on all but the first variable. From these projectors we only need the properties that for all ψ 1 as that are antisymmetric in all variables except x 1 , q which remains true if |ϕ i m = |χ x1 i m (m ≥ 2). In the following we abbreviate φ = m ′(1) For the first summand in (157), we find by Cauchy-Schwarz (C.-S.), (156), Lemma 7.1 and (153), For the second summand in (157), we find by Cauchy-Schwarz (C.-S.), (156), Lemma 7.1 and (154), Term (II) m . The idea of the bound for this term is to shift one q to the right side of the scalar product by using the antisymmetry of ψ. We abbreviate φ = m ′(2)

Proof of the Theorems
Proof of Theorems 3.1 and 3.3. First, we split v (N ) = v (N ) where Term ± refers to (I) m (γ) , (II) m (γ) , (III) m (γ) from (119), (120) and (121) with interaction v (N ) ± . We bound Term ± separately by using Lemma 7.3, which proves under the stated assumptions the bound Applying the Gronwall Lemma gives the desired bound The values of the constant C(t) in Theorems 3.1 and 3.3 can be obtained by using the respective assumptions, together with Lemma 7.3 and (162).

Remarks.
16. Following up on Remark 5 after Theorem 3.1, let us consider the size of the error we make by neglecting the exchange term. Suppose that the exchange term is of O(N −δ ). It then gives an additional term CN −δ α n (t) ≤ C α n (t) + N −2δ in the time derivative of α n (t) (where the α n (t) comes from the q 1 in term (I) n ).

Proof of Results for Density ∝ 1 Regime
In this section, C denotes a constant which can be different from line to line.

Kinetic Energy Inequalities
Let us first state some well-known inequalities which we use in Section 8.2 to show that the conditions of Theorems 3.1 and 3.3 hold if the total kinetic energy is bounded by AN . The first is the Lieb-Thirring or kinetic energy inequality [33,32] which we state here only for antisymmetrized product states.
Note that this inequality in a more general form reads for any a > 0, see, e.g., [41]. We also need the Hardy-Littlewood-Sobolev inequality (see, e.g., [31,Thm. 4.3]), here stated for three dimensions. Let p, r > 1 and 0 < λ < 3 with 1 p + λ 3 + 1 r = 2. Let f ∈ L p (R 3 ) and h ∈ L r (R 3 ). Then there is a constant C = C(λ, p), such that Let us also recall Hardy's inequality (see, e.g., [32]). Let f ∈ H 1 (R 3 ). Then Let us now show for interactions |x| −s , 0 < s < 6 5 , that the mean-field term v (N ) ⋆ ρ N is bounded independent of N , if it is scaled with β = 1 − s 3 and if the total kinetic energy is bounded by AN . We need this statement for the proofs in Section 8.2.
Proof. We define B R (y) = {x ∈ R 3 : |x − y| < R} and B R (y) c = R 3 \ B R (y). First, note that for 0 < s < 6 5 , Then, using Hölder's inequality, Lemma 8.1, ρ N = N and (171), we find for any R > 0, Setting R = N 1 3 (if we set R = N δ and then optimize (172) with respect to δ we find δ = 1 3 ) we find Using the explicit values of the constants from Lemma 8.1 and (171), setting R N = rN 1 3 , with N -independent r > 0, and minimizing the resulting expression (172) with respect to r gives an explicit value for the constant of (173),

Proof of the Results
Proof of Theorem 3.4. We consider the three different interactions separately.
• Let v s (x) = ±|x| −s , with 0 < s < 3 5 and β = 1 − s 3 and note that v 2 s = |v 2s |. We can therefore use Lemma 8.2 to show that the condition (39) from Theorem 3.1 holds. We find v (N ) If we use that the constant in (170) is proportional to A s 2 , we find that the constant in (175) is proportional to A s and thus the C appearing in the α n -estimate (52) is proportional to A s 2 .
and from the Hardy-Littlewood-Sobolev inequality (178) with s = 1, Using Hölder's inequality and the kinetic energy inequality from Lemma 8.1, we find (recall that R < 1) Finally, sup i.e., we need − 2 3 + 2s 9−15s ≤ − 1 2 − γ 2 , i.e., γ ≤ 1 3 − 4s 9−15s . • For v(x) = ±|x| −1 , we use Theorem 3.1, i.e., we have to verify (39). By Hölder's inequality we find for any R > 0, Now BR(0) |x| −2p d 3 x ≤ CR 3−2p for p < 3 2 , i.e., q > 3. For q < ∞ we use the generalized kinetic energy inequality (166) with a = 3 2 (q − 1) > 3 and for q = ∞ we use ||ρ t N || ∞ ≤ C. Then we find for R = N 1 3 (recall 1 Proof of Proposition 3.5. We start from the expression (125) which yields, by using p m + q m = 1, V Then, by Cauchy-Schwarz, Lemmas 7.2 and 8.2 and (188) we find |∂ t α n (t)| ≤ 2N Proof of Theorem 3.6. The strategy of the proof is again to bound |∂ t α n (t)| ≤ C(t) α n (t) + N −1 (197) and use the Gronwall Lemma to conclude the desired bound (66). Recall that we use the weight function n(k) = k N here, i.e., we can use the form (125) for the time derivative of α n (t). Using the scaling v (N ) = N −1 v and noting the additional N − 1 3 in front of the time derivatives in the Schrödinger and mean-field equations, we find that ∂ t α n (t) is given by the sum of the three terms (II) n = 2N − 2 3 Im ψ t , q t 1 q t 2 (N − 1)v 12 p t 1 p t 2 ψ t , with V 1 = V dir 1 in the case of the fermionic Hartree equations, and V 1 = V dir 1 + V exch 1 in the case of the Hartree-Fock equations. For ease of notation we omit the t-dependence in the following, i.e., we write ψ t = ψ, ϕ t i = ϕ i and the constants C can be t-dependent. The double exponential in (66) comes from one exponential in Lemma 9.1 and another exponential from the Gronwall Lemma applied to (197). In the following estimates, we decompose v(x) = d 3 kv(k)e ikx . Note that the assumption d 3 k (1 + |k| 2 ) |v(k)| < ∞ in particular implies that d 3 k |v(k)| < ∞ and d 3 k |k| |v(k)| < ∞. Term (I) n . Let us first bound the contribution from the exchange term. Using the Fourier decomposition of v and Cauchy-Schwarz we find Here we see explicitly that the contribution from the exchange term is of lower order in N . Let us now bound (I) n only with direct interaction. Using the Fourier decomposition of v we find ϕ j , e −ikx ϕ j q 1 e ikx1 p 1 ψ .