The diffsky mock galaxy catalogs are available on NERSC to anyone with an account. There are two options for accessing the data:
Access directly on NERSC via Jupyter or a job script
Use the OpenCosmo Compute Portal to query a smaller subset of the data, and move the result to a computing resource of your choice. To request access, fill out this Google form
Feel free to use either (or both!) of these methods. In either case, the data will be stored in the OpenCosmo Format. You can use the OpenCosmo Python Toolkit to access this data and further query it. If working on NERSC, you can find the data at /global/cfs/cdirs/hacc/OpenCosmo/LastJourney/synthetic_galaxies/
Computing Mass Accretion and Star Formation Histories with Diffmah¶
This notebook demonstrates how to compute mass accretion and star formation histories for a population of synthetic galaxies, with data stored in the OpenCosmo Format.
Step 0: Install Requirements¶
In order to run this notebook, you will need to be using an environment with opencosmo, diffmah, and diffstar installed.
You can install opencosmo with:
conda create -n diffsky python=3.12 conda-forge::opencosmoOnce opencosmo is installed, install the diffsky packages with:
conda activate diffsky
opencosmo install diffsky --file /path/to/synthetic/catalog.hdf5If you’re working with the original data on NERSC, the dataset will be split into a large number of individual files. You may use any of them for the step above. The file contains information about the versions of Diffmah and Diffstar that were used to produce the catalog. The command above will automatically install the correct versions into your environment, as well as additional dependencies like jax. See below for the paths.
Additional information about the OpenCosmo toolkit, including additional installation options, can be found on ReadTheDocs
Step 1: Selecting a Catalog and Loading the Data¶
The data folder contains several different catalogs, based on the dataset they are calibrated against. Catalogs are tagged with the date they were produced. For each callibration there is also a “latest” folder, which always points to the most recent version of the catalogs for a given calibration.
Now that we have the software we need, we can load the data using the OpenCosmo toolkit. We recommend passing all the files in the catalog to the toolkit. Don’t worry, it won’t actually read anything yet. We’ll restrict to a subset of the data later.
Note about Synthetic Cores¶
By default, every mock galaxy in Diffsky lives in a “subhalo core” that was resolved in the underlying N-body simulation (for more information about subhalo cores, see Rangel+20, https://
Not all catalogs have synthetic cores, but if you are working with one that does you can include them by setting the synth_cores flag to True when you open the data, e.g.
import opencosmo as oc
oc.open("lc_cores-411.diffsky_gals.hdf5", synth_cores=True)from pathlib import Path
import opencosmo as oc
import numpy as np
import matplotlib.pyplot as plt# This tutorial assumes you are running on NERSC
all_catalogs_path = Path("/global/cfs/cdirs/hacc/OpenCosmo/LastJourney/synthetic_galaxies/")
available_catalogs = list(p.stem for p in all_catalogs_path.glob("*") if "latest" in p.stem)
print(available_catalogs)# Select one of the catalogs, and open the data
data_path = all_catalogs_path / "galacticus_in_plus_ex_situ_latest"
files = list(f for f in data_path.glob("*.hdf5") if f.stem.startswith("lc_cores"))
ds = oc.open(*files)Step 1:¶
If you are running on NERSC, the data are much too large to access all at once. We can cut it down a bit to make it more manageable. The cell below will select 50,000 galaxies at random between redshift 0.5 and 1.0
ds = ds.with_redshift_range(0.5, 1.0)
ds = ds.take(50_000)The OpenCosmo dataset object does not actually contain any data yet, it just knows how to go get the data when you ask for it. However it does have lots of additional useful information, such as the cosmology of the simulation this catalog was produced from. We can preview the dataset by printing it
dsStep 2: Get the Cosmology Information¶
Diffmah and Diffstar were produced from from halo catlogs from the LastJourney simulation. In order to produce the fits, we need to provide them with information about the cosmology this simulation was run with. We can easily access the astropy Cosmology object for the simulation with the cosmology attribute.
cosmology = ds.cosmology
cosmologyStep 3: Retrive the parameters for the Mass Accretion History fits¶
To compute the mass accretion histories, we need to retrieve the relavent columns from the catalog. By selecting these columns with ds.select, we ensure OpenCosmo doesn’t try to load a ton of unnecessary data into memory.
from diffmah import DEFAULT_MAH_PARAMS
# Get the names of the relevant columns
diffmah_columns = [key for key in DEFAULT_MAH_PARAMS._fields]
# Getting the data for those columns as numpy arrays
diffmah_data = ds.select(diffmah_columns).get_data("numpy")
# Convert to the format Diffmah expects
diffmah_data = [diffmah_data[key] for key in diffmah_columns]
mah_params = DEFAULT_MAH_PARAMS._make(diffmah_data)Step 4: Compute Mass Accretion Histories and make a plot¶
from diffmah import mah_halopop
t0 = cosmology.age(0).value
tarr = np.linspace(0.5, t0, 100)
dmdht, log_mah = mah_halopop(mah_params, tarr, np.log10(t0))
fig, ax = plt.subplots(1, 1)
for i in range(10):
__=ax.plot(tarr, log_mah[i, :])
_ = ax.set_ylim(8,12.5)
ax.set_xlabel("Cosmic Time (Gyr)")
ax.set_ylabel(r"Maximum Halo Mass $log(M_{halo}/M_{sun})$")Step 5: Repeat Step 2 for Star Formation History Parameters¶
from diffstar import DEFAULT_DIFFSTAR_PARAMS
sfh_columns = DEFAULT_DIFFSTAR_PARAMS.ms_params._fields + DEFAULT_DIFFSTAR_PARAMS.q_params._fields
diffstar_data = ds.select(sfh_columns).data
sfh_ms_data = [diffstar_data[key].value for key in DEFAULT_DIFFSTAR_PARAMS.ms_params._fields]
sfh_q_data = [diffstar_data[key].value for key in DEFAULT_DIFFSTAR_PARAMS.q_params._fields]
sfh_ms_params = DEFAULT_DIFFSTAR_PARAMS.ms_params._make(sfh_ms_data)
sfh_q_params = DEFAULT_DIFFSTAR_PARAMS.q_params._make(sfh_q_data)
sfh_params = DEFAULT_DIFFSTAR_PARAMS._make((sfh_ms_params, sfh_q_params))Step 6: Compute SFH and make a plot¶
from diffstar import calc_sfh_galpop
fb_last_journey = cosmology.Ob0 / cosmology.Om0
sfh, smh = calc_sfh_galpop(sfh_params, mah_params, tarr, lgt0=np.log10(t0), fb=fb_last_journey, return_smh=True)
fig, ax = plt.subplots(1, 1)
yscale = ax.set_yscale('log')
yscale = ax.set_ylim(1e-5, 200)
for i in range(10):
__=ax.plot(tarr, sfh[i, :])
ax.set_ylim(1e-3, 1e2)