Communication
To parallelize our codes we need a communication infrastructure. The basic idea is to divide the lattice equally between processes that run in parallel and manage their own resources: CPU memory, GPU memory, etc. The processes are mostly equivalent: the calculations and memory access patterns are very similar. This allows us to write codes that look very much like serial codes: it is only the lattice-wide operations that benefit from the multi-processor hardware. However, since the computational time is dominated by these routines, the codes scale well.
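As a rough sketch of this decomposition (the lattice size and the helper function below are hypothetical; get_num_nodes and get_node_rank are the low level routines described later on this page):

 // Hypothetical sketch: each process owns an equal, contiguous slice of the lattice.
 #include <cstddef>
 #include "comm_low.h"   // get_num_nodes(), get_node_rank()
 
 size_t my_first_site(size_t total_sites) {
   size_t local_sites = total_sites / get_num_nodes();  // sites per process
   return get_node_rank() * local_sites;                // start of this process's slice
 }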
While the I/O resources are also managed by each process, in most of our codes the I/O is handled by the rank 0 process.
Code organization
The communication is organized in three layers:
- low level: these routines send/receive unstructured data from one process to another
- intermediate level: these routines handle data movement between lattice data structures
- high level: these routines implement shifts
Low level routines
These routines are simple wrappers that sit on top of the communication library (most likely MPI). Their purpose is to insulate our codes from the communication layer. Currently, our code provides an implementation that uses MPI and a vanilla implementation that is used for single node architectures. Here is a list of the low level routines in comm_low.h with a short description:
void init_machine(int &argc, char **&argv, bool single_node = false)
- This function needs to be called if the communication library is linked, to initialize the communication layer. The parameters argc and argv are the ones passed to the main routine by the operating system. The last parameter indicates whether the code is only meant to run on single-node machines.
void shutdown_machine()
- communication library cleanup; call this before exiting the main program.
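A minimal sketch of how these two calls might wrap MPI; the actual implementation in comm_low.h may differ (for instance in how it records whether MPI was started):

 #include <mpi.h>
 
 static bool mpi_active = false;    // remember whether MPI was initialized
 
 void init_machine(int &argc, char **&argv, bool single_node) {
   if (single_node) return;         // nothing to set up on a single node
   MPI_Init(&argc, &argv);          // hand the command line to the MPI runtime
   mpi_active = true;
 }
 
 void shutdown_machine() {
   if (mpi_active) MPI_Finalize();  // release all communication resources
 }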
int get_num_nodes()
- returns the number of processes in the current program.
rank_t get_node_rank()
- returns the rank (number) of the current process. rank_t is defined to be unsigned int in layout.h.
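Both are thin wrappers in the MPI build; a sketch under that assumption:

 #include <mpi.h>
 
 typedef unsigned int rank_t;            // matches the definition in layout.h
 
 int get_num_nodes() {
   int n;
   MPI_Comm_size(MPI_COMM_WORLD, &n);    // total number of processes
   return n;
 }
 
 rank_t get_node_rank() {
   int r;
   MPI_Comm_rank(MPI_COMM_WORLD, &r);    // rank of the calling process
   return (rank_t)r;
 }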
void synchronize()
- inserts a communication barrier. This blocks execution in all processes until every one of them reaches this point.
void broadcast(char* buffer, size_t sz)
- broadcasts the buffer of sz bytes to all nodes, i.e., copies it from the node of rank 0 to all other nodes.
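In the MPI implementation these map almost directly onto MPI collectives; a sketch, not the actual code:

 #include <mpi.h>
 #include <cstddef>
 
 void synchronize() {
   MPI_Barrier(MPI_COMM_WORLD);   // every process waits here for the others
 }
 
 void broadcast(char* buffer, size_t sz) {
   // rank 0 is the source; all other ranks overwrite buffer with its copy
   MPI_Bcast(buffer, (int)sz, MPI_BYTE, 0, MPI_COMM_WORLD);
 }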
void send(char* buffer, size_t sz, rank_t dest, int tag=1)
- sends a buffer of sz bytes from the current node to the node of rank dest, attaching a numeric tag to it (if necessary). This is a blocking call that returns when a matching receive is executed on node dest.
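A corresponding MPI sketch; note that standard MPI_Send is also allowed to buffer small messages and return before the receive is posted, so the blocking behavior described above is the conservative reading:

 #include <mpi.h>
 #include <cstddef>
 
 typedef unsigned int rank_t;
 
 void send(char* buffer, size_t sz, rank_t dest, int tag) {
   // blocking point-to-point send; node dest must post a matching receive
   MPI_Send(buffer, (int)sz, MPI_BYTE, (int)dest, tag, MPI_COMM_WORLD);
 }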
This layer is found in the comm folder; it handles the communication between nodes.
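Put together, a minimal program built on this layer could look as follows (the parameter string and buffer size are made up for illustration):

 #include <cstring>
 #include "comm_low.h"
 
 int main(int argc, char **argv) {
   init_machine(argc, argv);            // bring up the communication layer
 
   char params[64] = {0};
   if (get_node_rank() == 0)
     std::strcpy(params, "beta=5.7");   // rank 0 handles the I/O
   broadcast(params, sizeof(params));   // now every node has the parameters
 
   synchronize();                       // wait until all nodes are ready
   shutdown_machine();                  // clean up before exiting
   return 0;
 }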