Right now, analyzing data in parallel with yt is a tricky business. With the next major release of yt, we intend to include a significantly improved infrastructure for dispatching analysis via a queuing system.
Below is a script (parallel_projection_mpi4py.py) that shows a proof-of-concept parallel projection. It relies on MPI4Py and PyTables, and currently only works with a decomposition onto a square number of processors.
#
# This is a quick-and-dirty method of testing the parallel projection sketch
# I've created. (Matt)
#
fn = "my_gigantic_data.dir/my_gigantic_data"
from mpi4py import MPI
import math, time
from yt.config import ytcfg
num_proc = MPI.COMM_WORLD.size
my_id = MPI.COMM_WORLD.rank
field_name = "Density"
time.sleep(my_id) # Offset our IO slightly
# Now a bit of boilerplate to ensure we're not doing any IO
# unless we're the root processor.
ytcfg["yt","logfile"] = "False"
ytcfg["lagos","ReconstructHierarchy"] = "False"
if my_id == 0:
    ytcfg["lagos","serialize"] = "True"
    ytcfg["lagos","onlydeserialize"] = "False"
else:
    ytcfg["lagos","onlydeserialize"] = "True"
from yt.mods import *
pf = get_pf()
# Domain decomposition.
x_edge = na.mgrid[0:1:(na.sqrt(num_proc) + 1)*1j]
y_edge = na.mgrid[0:1:(na.sqrt(num_proc) + 1)*1j]
xe_i = int(math.floor(my_id/na.sqrt(num_proc)))
ye_i = int(my_id % na.sqrt(num_proc)) # cast to int so it can be used as an index
# Note that here we are setting it to be projected along axis zero
LE = [0.0, x_edge[xe_i], y_edge[ye_i]]
RE = [1.0, x_edge[xe_i+1], y_edge[ye_i+1]]
reg = pf.h.region([.5,.5,.5],LE,RE) # center at 0.5 but only project sub-regions
# Record the corners of our region
open("LE_RE_%02i.txt" % my_id,"w").write("%s, %s\n" % (LE,RE))
proj = pf.h.proj(0,field_name,source=reg) # Actually *do* the projection here
if my_id == 0:
    # Now we collect!
    d = [proj.data]
    for i in range(1,num_proc):
        # Blocking receive of the pickled dictionary from processor i
        d.append(MPI.COMM_WORLD.recv(source=i, tag=0))
    new_proj = {}
    for key in proj.data.keys():
        new_proj[key] = na.concatenate([mm[key] for mm in d])
    proj_array = na.array([new_proj['px'],new_proj['py'],
                           new_proj['pdx'],new_proj['pdy'],
                           new_proj[field_name]])
    # We've now received all of our data and constructed an
    # array of the pixelization. So, let's store it.
    import tables
    p = tables.openFile("result_mpi4py.h5","w")
    p.createArray("/","Test",proj_array)
    p.close()
else:
    # proj.data is where the dictionary of projection values is kept;
    # send it to the root processor as a pickled object
    MPI.COMM_WORLD.send(proj.data, dest=0, tag=0)
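The script would be launched under your MPI environment on a square number of processors (for example, with something like mpirun -np 4 python parallel_projection_mpi4py.py, depending on your installation). The root processor writes the concatenated projection to result_mpi4py.h5 under the node name "Test". Below is a minimal sketch of reading that file back serially with PyTables for inspection; the file name, node name, and row ordering come from the script above, while the print statements are purely illustrative.

# Read back the projection array stored by the root processor above.
# The node name ("Test") and the row ordering (px, py, pdx, pdy, Density)
# follow the script; everything else here is illustrative.
import tables

h5file = tables.openFile("result_mpi4py.h5", "r")
px, py, pdx, pdy, density = h5file.root.Test.read()
h5file.close()

print "Number of projected cells:", px.size
print "Density min/max: %0.5e / %0.5e" % (density.min(), density.max())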
Analyzing objects in parallel is significantly simpler when no in-line data comparison is required (for example, when creating phase diagrams of multiple galaxies in a large-scale cosmological simulation) and can be done in a very straightforward fashion.
The example below reads in a list of HOP centers, generated by yt and written out with the write_out() method, and then chooses which halos to examine based on the processor number. Note that MPI4Py might be overkill here, but it suffices for our purposes.
from mpi4py import MPI
from yt.mods import *
num_proc = MPI.COMM_WORLD.size
my_id = MPI.COMM_WORLD.rank
pf = get_pf()
hop_centers = []
hop_radii = []
for line in open("HOP.txt"):
    if line[0] == "#": continue
    # Maximum density location
    hop_centers.append( [float(i) for i in line.split()[4:7]] )
    # Maximum radius
    hop_radii.append(float(line.split()[-1]))
results = open("results_%04i.txt" % my_id, "w")
# Now we want to start at my_id, jump by num_proc each step,
# and stop before len(hop_centers).
for hop_id in na.mgrid[my_id:len(hop_centers):num_proc]:
    # This is where our analysis goes.
    sp = pf.h.sphere(hop_centers[hop_id], hop_radii[hop_id])
    axv = sp.quantities["WeightedAverageQuantity"]("x-velocity","CellMassMsun")
    results.write("%04i\t%0.9e\n" % (hop_id, axv))
results.close()
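Each processor writes its own results_NNNN.txt file, so a short serial post-processing step is needed to gather everything into one place. Below is a minimal sketch, assuming all of the per-processor files sit in the working directory and follow the naming convention used above.

# Combine the per-processor result files into a single, sorted file.
# The file naming and the two-column (hop_id, average velocity) format
# follow the script above; the output file name is arbitrary.
import glob

entries = []
for fn in glob.glob("results_*.txt"):
    for line in open(fn):
        hop_id, axv = line.split()
        entries.append((int(hop_id), float(axv)))
entries.sort()

combined = open("results_combined.txt", "w")
for hop_id, axv in entries:
    combined.write("%04i\t%0.9e\n" % (hop_id, axv))
combined.close()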