Heterotic Compactifications in the Era of Data Science

Abstract: The goal of this thesis is to review and investigate recent applications of machine learning to problems in string theory. String theory, the leading candidate for a unification of gravity and the standard model of particle physics, requires the introduction of additional space-time dimensions. To match experimental observations of our universe, these additional dimensions need to be curled up on a compact space. The most common choice for this compact space is a manifold of Calabi-Yau type. These manifolds come with favourable mathematical and phenomenological properties. In the first half of this thesis, Calabi-Yau manifolds, which are complex Kähler manifolds admitting a Ricci-flat metric, are introduced. The popular construction as complete intersections in products of complex projective spaces is explained, and the mathematical machinery necessary to compute their topological quantities is presented. This part is followed by a review of machine learning applications to the study of their Hodge numbers and the cohomologies of line bundles. Next, the new Python library cymetric is presented for modeling numerical approximations of the unknown Ricci-flat metric. The metric tensor is a required component in the calculation of Yukawa couplings. It is learned by a neural network trained against a custom loss function that encodes all the necessary mathematical properties. In the second half, Calabi-Yau manifolds are used to compactify the heterotic string and construct standard-model-like vacua. Those are vacua which match the particle content and gauge group of a supersymmetric extension of the standard model. First, the popular compactification procedure utilising line bundle sums is reviewed and applied to the newly discovered constructions of generalized complete intersection Calabi-Yau manifolds. Second, an exploration of such models is initiated in previously uncharted territory.
This includes two Calabi-Yau manifolds with more than 7 Kähler moduli, which are beyond systematic computational reach. In total, 19538 new models are found by using Actor-Critic agents from deep reinforcement learning.