Sponsored by BMBF

The Nature of Dark Energy

Contact Person: Gottloeber, Stefan, sgottloeber@aip.de,
0331-7499516, Astrophysical Institute Potsdam


Current observational data from supernova distances as well as
measurements of the cosmic microwave background show that the
expansion of the universe is presently accelerating. This cannot be
explained in a universe which contains only photons and matter
(baryons, dark matter and massive neutrinos). The source of this
acceleration has been designated as dark energy, which contributes
about 70 % of the total energy density of the universe. The nature of
this dark energy is not yet known. The simplest candidate for dark
energy is a cosmological constant, as already introduced by Einstein
in 1917. This constant can be interpreted as vacuum energy. However,
there is a huge discrepancy between the theoretically predicted
magnitude and the value inferred from astrophysical observations.
Other models postulate a new field (evolving dark energy) or
modifications of General Relativity. Since dark energy has fundamental
implications for the formation and evolution of the universe, it is
important to understand its nature and, in particular, its evolution.

All present observational data are consistent with a cosmological
constant, i.e. vacuum energy, although the current uncertainties are
large enough to accommodate many other models. Obviously, one needs to
improve the accuracy of the measurements of the present expansion rate
and to investigate the expansion history. The Hobby-Eberly Telescope
Dark Energy Experiment (HETDEX) plans to measure the baryon acoustic
oscillations (BAO) in a large volume and with a large number of
galaxies. The BAO are a feature in the power spectrum of the galaxy
distribution which can be used as a standard ruler to measure the
expansion history of the universe. HETDEX proposes to obtain the most
accurate measurement to date of the expansion history in the redshift
range 1.9 < z < 3.8. The aim of our numerical study is to measure the
BAO in a large cosmological simulation, both in the dark matter
distribution and in the dark matter halos, so that the effect of
galaxy bias can be estimated. We will test whether the parameter w of
the dark energy equation of state can be obtained from these
measurements. We have already finished at LRZ a simulation assuming
the standard LCDM model (w = -1) with 1024^3 particles in a box of
1000 h^-1 Mpc side length (mass resolution 6.2 x 10^11 h^-1 M_sun). In
order to study the effect of a dark energy that is not a cosmological
constant, we plan to run within the AstroGrid a second simulation
assuming w = -0.8. Details of the analysis procedure can be found in
Wagner et al., arXiv:0705.0354.
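The effect of w on the expansion history, which the BAO standard ruler probes, can be illustrated with a short numerical sketch. This is not part of the ART code; the flat geometry and Omega_m = 0.3 are illustrative assumptions, and distances are given in units of the Hubble distance c/H0.

```python
import math

def E(z, omega_m, w):
    """Dimensionless Hubble rate for a flat wCDM universe:
    E(z)^2 = Omega_m (1+z)^3 + (1 - Omega_m) (1+z)^(3(1+w))."""
    return math.sqrt(omega_m * (1 + z) ** 3
                     + (1 - omega_m) * (1 + z) ** (3 * (1 + w)))

def comoving_distance(z, omega_m, w, n=10000):
    """Comoving distance D_C = int_0^z dz'/E(z'), trapezoidal rule,
    in units of the Hubble distance c/H0."""
    dz = z / n
    s = 0.5 * (1 / E(0.0, omega_m, w) + 1 / E(z, omega_m, w))
    for i in range(1, n):
        s += 1 / E(i * dz, omega_m, w)
    return s * dz

# For w = -0.8 the dark energy density grows with redshift, so E(z) is
# larger and the comoving distance shorter than for w = -1:
d_lcdm = comoving_distance(2.0, 0.3, -1.0)
d_w08 = comoving_distance(2.0, 0.3, -0.8)
print(f"D_C(z=2): LCDM {d_lcdm:.3f}, w=-0.8 {d_w08:.3f} (c/H0)")
```

A percent-level shift of this distance scale translates directly into a shift of the apparent BAO scale, which is why accurate BAO measurements constrain w.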


The numerical simulations will be done using the Adaptive Refinement Tree
(ART) code which implements successive spatial and temporal refinement
in high density regions. The ART code was built by a number of people.
It started as a particle mesh code (PM) in 1979 written by A. Klypin
in collaboration with A. Doroshkevich and S. Shandarin. In 1995 A.
Khokhlov developed his Fully Threaded Tree algorithm for Adaptive Mesh
Refinement and provided routines to handle data structures in the new
adaptive mesh refinement scheme. In 1997, based on this algorithm, A.
Kravtsov wrote the first version of the ART code, which used OpenMP
parallelization. In 2002-2003 S. Gottloeber and A. Klypin developed
the hybrid MPI+OpenMP code; at that time the code also gained the
ability to treat particles of different masses. The reason for
designing a hybrid MPI+OpenMP code was to address two issues. On the
one hand, OpenMP parallelization alone is not very efficient: scaling
depends on the particular computer architecture and on the particular
system under study, and typically the code scales well only up to 4
processors on shared-memory computers. MPI parallelization is
therefore necessary if one wishes to run the code on a substantially
larger number of CPUs. On the other hand, the code requires a lot of
memory, often more than is associated with a single CPU. OpenMP
provides a way to access a larger memory, since all the memory of a
node consisting of several CPUs is accessible. Thus the combination of
OpenMP and MPI is the best way to fulfil both requirements of the ART
code.

We use rectangular domains for the MPI parallelization. The whole
simulation volume - a cube - is split into non-overlapping, fully
covering parallelepipeds. Each domain is handled by one MPI task,
which can spawn OpenMP threads. The boundaries of the parallelepipeds
can move as time goes on in order to equalize the load of the
different MPI tasks. The exchange of information between MPI tasks
happens very infrequently: only at the beginning of each zero-level
time step. The main idea is the same as in tree codes: when forces are
calculated, the mass distribution at large distances needs to be
approximated only roughly. In ART this idea is implemented by
creating, around the domain handled by an MPI task, a buffer zone of
low-mass primary particles, and by using massive temporary particles
farther away from it.
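The classification of particles relative to one task's domain can be sketched as follows. This is a hypothetical illustration, not the actual ART data structures: one dimension only, a unit box with periodic boundaries, and an assumed buffer width.

```python
def classify(x, lo, hi, buffer_width, box=1.0):
    """Classify a particle at x-coordinate x relative to the domain
    [lo, hi) of one MPI task, with periodic boundaries of size box.
    Returns 'primary' (inside the domain), 'buffer' (in the thin zone
    around it, kept at full mass) or 'temporary' (far away, to be
    represented by a massive temporary particle)."""
    def dist(a, b):
        # minimum-image distance on a periodic axis
        d = abs(a - b)
        return min(d, box - d)

    if lo <= x < hi:
        return "primary"
    # distance to the nearer face of the domain
    d = min(dist(x, lo), dist(x, hi))
    return "buffer" if d < buffer_width else "temporary"

print(classify(0.30, 0.25, 0.50, 0.05))  # inside the domain
print(classify(0.22, 0.25, 0.50, 0.05))  # just outside: buffer zone
print(classify(0.80, 0.25, 0.50, 0.05))  # far away: temporary
```

In the real code the domains are three-dimensional parallelepipeds and the temporary particles are created by merging distant mass, but the distinction between primary, buffer and temporary particles follows this pattern.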

In this way each MPI task covers the whole volume; apart from the
creation of temporary particles and the exchange of the buffer and
temporary particles, there is no other exchange of information between
MPI tasks. During one zero-level time step each MPI task advances all
of its particles (primary or not). Once the time step is finished, the
CPU time consumed by every MPI task is gathered by the root task,
which decides how to move the boundaries of the domains in order to
improve the load balance. Then the primary particles are redistributed
so that they reside on the tasks that handle the corresponding
domains, and the process starts again: exchange buffer particles,
create and send temporary particles. With the ART-MPI code we
routinely run simulations with 1024^3 particles on 500 CPUs.
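The load-balancing step can be sketched in a simplified form. This is a hypothetical one-dimensional illustration, not the ART implementation: given the CPU time measured per slab during the last zero-level time step, the root task places new slab boundaries so that each task receives an equal share of the total cost, assuming the cost is uniform within each old slab.

```python
def rebalance(boundaries, costs):
    """boundaries: n+1 ascending slab edges; costs: n measured per-slab
    CPU times. Returns new edges that split the total cost evenly."""
    n = len(costs)
    total = sum(costs)
    target = total / n          # cost each task should get
    new = [boundaries[0]]
    acc = 0.0                   # cost accumulated so far
    k = 1                       # index of the next boundary to place
    for i in range(n):
        lo, hi, c = boundaries[i], boundaries[i + 1], costs[i]
        while k < n and acc + c >= k * target:
            # the k-th boundary falls inside old slab i;
            # place it by linear interpolation in cost
            frac = (k * target - acc) / c
            new.append(lo + frac * (hi - lo))
            k += 1
        acc += c
    new.append(boundaries[-1])
    return new

# A slab that consumed three times the CPU time of its neighbour shrinks:
print(rebalance([0.0, 0.5, 1.0], [3.0, 1.0]))
```

The overloaded slab [0.0, 0.5] is reduced to [0.0, 1/3], so both tasks carry an equal cost of 2.0 in the next step.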


In order to follow the evolution of clustering we need to store many
outputs, typically 150. Each output step consists of 125 files with a
total size of 60 GB, which amounts to 9 TB for 150 outputs. The output
files serve at the same time as restart files. Additional files come
from data analysis and from possible additional output steps. Thus we
expect a maximum of 18 TB of data, which will be stored in the
archive.
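The storage budget is simple arithmetic; the sketch below spells it out. The factor of two for analysis products and additional outputs is our reading of the 18 TB maximum quoted above.

```python
# Back-of-the-envelope check of the storage budget (1 TB = 1000 GB).
gb_per_output = 60        # one output step: 125 files, 60 GB in total
n_outputs = 150

raw_tb = n_outputs * gb_per_output / 1000   # simulation outputs alone
max_tb = 2 * raw_tb                         # incl. analysis products etc.
print(raw_tb, "TB of outputs,", max_tb, "TB maximum")
```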

CPU time

For the existing run with 1024^3 particles in the 1000 h^-1 Mpc box we
used in total 250,000 CPU hours at LRZ and NAS Ames, but this run was
done with extremely high resolution and small time steps. For the dark
energy project this is not necessary. We will reduce the resolution
and slightly increase the time step, so that we expect to finish a run
within 150,000 CPU hours. We will run three models with w = -0.8, -1.0
and -0.9, which results in a total of about 450,000 CPU hours plus
some time for analysis.