Machine Learning Models For Galaxy Morphology

State-of-the-art hydrodynamical simulations can reproduce the diversity of galaxy morphological types, but they cannot exactly recreate individual observed galaxies. Over the last decade, machine learning (ML) has produced very promising results in image recognition and dimensionality reduction. The goal of my thesis is to investigate how ML can be used to build galaxy morphology models that encode the information contained in modern simulations. I used simulated galaxies from IllustrisTNG to build a galaxy morphology model.

Principal Component Analysis

The main part of the thesis investigates how principal component analysis (PCA) can serve as a galaxy morphology model. The so-called "eigengalaxies" computed by PCA are galaxy images that act as basis vectors of the image space, so that each galaxy in the dataset can be described as a linear combination of the eigengalaxies.
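As a rough illustration of the idea, the sketch below computes eigengalaxies with a singular value decomposition of mean-centered, flattened images. It is only a minimal sketch and not the thesis code: the function name `compute_eigengalaxies`, the image size, and the random stand-in data are assumptions made for this example.

```python
import numpy as np

def compute_eigengalaxies(images, n_components):
    """Return (mean_image, eigengalaxies) for a stack of 2D images.

    images:        array of shape (n_galaxies, height, width)
    eigengalaxies: array of shape (n_components, height, width)
    """
    n, h, w = images.shape
    X = images.reshape(n, h * w).astype(float)  # one row per galaxy
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, ordered by
    # decreasing singular value (i.e. decreasing explained variance).
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components]
    return mean.reshape(h, w), basis.reshape(n_components, h, w)

# ASSUMPTION: random noise as stand-in data, not IllustrisTNG images.
rng = np.random.default_rng(0)
images = rng.random((100, 16, 16))
mean_image, eigengalaxies = compute_eigengalaxies(images, n_components=10)
```

Each returned eigengalaxy has the same shape as an input image, so it can be displayed like any other galaxy image.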

Eigengalaxies

These are the calculated eigengalaxies.

Projection

Projecting galaxy images onto the space spanned by the eigengalaxies gives a low-dimensional representation of each galaxy.
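A minimal sketch of such a projection, assuming the images have been flattened to vectors and the basis has orthonormal rows. The helper name `project`, the number of components, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def project(images, mean, basis):
    """Coefficients of each image with respect to the basis.

    images: (n_galaxies, n_pixels) flattened images
    mean:   (n_pixels,) mean image
    basis:  (n_components, n_pixels) with orthonormal rows
    """
    return (images - mean) @ basis.T

# ASSUMPTION: random vectors as stand-in for flattened galaxy images.
rng = np.random.default_rng(1)
X = rng.random((200, 64))
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:5]                    # keep the 5 leading components

coeffs = project(X, mean, basis)  # one row of 5 coefficients per image
```

In this toy setup each 64-pixel image is summarized by 5 numbers, which is the low-dimensional representation.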

Projection

The reconstruction error incurred by projecting each galaxy image onto the space spanned by 49 eigengalaxies is shown in the following histogram. For 99% of the galaxy images, the reconstruction error is below 9%.
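The text does not state how the reconstruction error is defined. The sketch below assumes a relative L2 error per image, which is one common choice and may differ from the metric used in the thesis. The function name and the synthetic data are likewise assumptions.

```python
import numpy as np

def reconstruction_errors(X, n_components):
    """Relative reconstruction error per image after PCA truncation.

    X: (n_galaxies, n_pixels) flattened images
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:n_components]
    # Project onto the truncated basis, then map back to pixel space.
    X_hat = mean + (Xc @ basis.T) @ basis
    # ASSUMPTION: error = ||x - x_hat|| / ||x|| (relative L2 norm).
    return np.linalg.norm(X - X_hat, axis=1) / np.linalg.norm(X, axis=1)

# ASSUMPTION: random vectors as stand-in for flattened galaxy images.
rng = np.random.default_rng(2)
X = rng.random((300, 100))
errors = reconstruction_errors(X, n_components=49)

# Fraction of images below a 9% error threshold on this toy data.
fraction_below = np.mean(errors < 0.09)
```

On random noise this fraction says nothing about real galaxies. The point is only how a per-image error distribution like the one in the histogram could be obtained.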

Reconstruction Error

PCA in Three Dimensions

We also apply PCA to three-dimensional galaxy images to find the 3D eigengalaxies. Here are some examples:
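One way to carry PCA over to three dimensions is to flatten each voxel cube into a vector. Because a cube has many more voxels than there are galaxies, the sketch below uses the Gram matrix trick, an eigendecomposition of the small galaxy-by-galaxy matrix. This technique is an assumption made for the example and not necessarily what the thesis does. The function name, grid size, and data are hypothetical.

```python
import numpy as np

def eigengalaxies_3d(cubes, n_components):
    """PCA basis for voxel cubes of shape (n_galaxies, nx, ny, nz).

    n_components must not exceed the rank of the centered data
    (at most n_galaxies - 1).
    """
    n = cubes.shape[0]
    grid_shape = cubes.shape[1:]
    X = cubes.reshape(n, -1).astype(float)  # flatten each cube
    mean = X.mean(axis=0)
    Xc = X - mean
    # Eigendecomposition of the (n x n) Gram matrix instead of the
    # much larger (n_voxels x n_voxels) covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(Xc @ Xc.T)  # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    # Map Gram eigenvectors to voxel space and normalize to unit length.
    basis = (Xc.T @ eigvecs[:, order]) / np.sqrt(eigvals[order])
    return (mean.reshape(grid_shape),
            basis.T.reshape((n_components,) + grid_shape))

# ASSUMPTION: random 8x8x8 cubes as stand-in for 3D galaxy images.
rng = np.random.default_rng(3)
cubes = rng.random((40, 8, 8, 8))
mean_cube, eigencubes = eigengalaxies_3d(cubes, n_components=5)
```

Each 3D eigengalaxy has the same grid shape as the input cubes, so it can be sliced or volume-rendered in the same way.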

GitHub