Communication

From Gw-qcd-wiki
Revision as of 20:33, 13 December 2011 by Alexan

To parallelize our codes we need a communication infrastructure. The basic idea is to divide the lattice equally among processes that run in parallel, each managing its own resources (CPU memory, GPU memory, etc.). The processes are mostly equivalent: their calculations and memory access patterns are very similar. This allows us to write codes that look very much like serial codes. Most of the code path is run redundantly by all processes; only the lattice-wide operations benefit from the multi-processor hardware. However, since the computational time is dominated by these routines, the codes scale well.
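As a rough illustration of dividing the lattice equally, the sketch below assigns each process a contiguous block of sites. It is a one-dimensional simplification, not the layout used by our codes: the names LocalBlock and local_block are hypothetical, and the lattice volume is assumed to be an exact multiple of the number of processes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of the code base): the contiguous
// block of lattice sites owned by one process.
struct LocalBlock {
    std::size_t first_site;  // global index of the first site owned
    std::size_t num_sites;   // number of sites owned by this process
};

// Split `volume` lattice sites equally among `num_procs` processes and
// return the block that belongs to process `rank`.
// Assumption: the volume is an exact multiple of the process count.
inline LocalBlock local_block(std::size_t volume, unsigned num_procs, unsigned rank) {
    assert(num_procs > 0 && rank < num_procs);
    assert(volume % num_procs == 0);
    const std::size_t per_proc = volume / num_procs;
    LocalBlock block;
    block.first_site = rank * per_proc;
    block.num_sites = per_proc;
    return block;
}
```

With 16 sites and 4 processes, every process owns 4 sites and the process of rank 2 starts at site 8. Each process loops only over its own block, which is why most of the code can look like serial code.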

While the I/O resources are also managed independently by each process, in most of our codes the I/O is handled by the rank 0 process. This is mainly because we cannot guarantee that the disk is mounted by all compute nodes and, even if it is, most I/O libraries do not handle multi-process access to a single file well.
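The rank 0 I/O pattern can be sketched with the low level routines get_node_rank and broadcast described further down this page. The helper read_file_on_rank0 is hypothetical, and the two routines are replaced here by single-node stand-ins so that the example compiles on its own; a real build would link the MPI implementation instead.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

typedef unsigned int rank_t;  // matches the definition quoted from layout.h

// Single-node stand-ins (assumptions for this sketch) with the signatures
// listed for comm_low.h.
inline rank_t get_node_rank() { return 0; }
inline void broadcast(char* /*buffer*/, std::size_t /*sz*/) {}  // one process: nothing to copy

// Hypothetical helper: only rank 0 touches the disk, then every process
// receives the same bytes. Returns false on all ranks if the read failed.
inline bool read_file_on_rank0(const std::string& path, std::vector<char>& data) {
    assert(!path.empty());
    int ok = 0;
    std::size_t size = 0;
    if (get_node_rank() == 0) {
        std::ifstream in(path.c_str(), std::ios::binary);
        if (in) {
            data.assign(std::istreambuf_iterator<char>(in),
                        std::istreambuf_iterator<char>());
            size = data.size();
            ok = 1;
        }
    }
    // Share the outcome first so that all ranks take the same branch.
    broadcast(reinterpret_cast<char*>(&ok), sizeof(ok));
    if (!ok) {
        data.clear();
        return false;
    }
    // Share the size so that the other ranks can allocate, then the payload.
    broadcast(reinterpret_cast<char*>(&size), sizeof(size));
    data.resize(size);
    if (size > 0) broadcast(&data[0], size);
    return true;
}
```

Broadcasting the success flag before the data keeps the processes in step: if the read fails on rank 0, every process returns false instead of waiting for a payload that never arrives.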

Code organization

The communication is organized in three layers:

  • low level: these routines send/receive unstructured data from one process to another
  • intermediate level: these routines handle data movement between lattice data structures
  • high level: these routines implement shifts

Low level routines

These routines are simple wrappers that sit on top of the communication library (most likely MPI). Their purpose is to insulate our codes from the communication layer. Currently, our code provides an implementation that uses MPI and a vanilla implementation that is used for single node architectures. Here is a list of the low level routines in comm_low.h with a short description:

void init_machine(int &argc, char **&argv, bool single_node = false)
This function needs to be called if the communication library is linked, to initialize the communication layer. The parameters argc and argv are the ones passed to the main routine by the operating system. The last parameter indicates whether the code is only meant to run on single-node machines.
void shutdown_machine()
Cleans up the communication library. Call it before exiting the main program.
int get_num_nodes()
Returns the number of processes in the current program.
rank_t get_node_rank()
Returns the rank (number) of the current process. The type rank_t is defined to be unsigned int in layout.h.
void synchronize()
Inserts a communication barrier. This blocks execution in all processes until all of them reach this point.
void broadcast(char* buffer, size_t sz)
Broadcasts the buffer of sz bytes to all nodes, i.e., copies it from the node of rank 0 to all other nodes.
void send(char* buffer, size_t sz, rank_t dest, int tag=1)
Sends a buffer of sz bytes from the current node to the node of rank dest and attaches a numeric tag to it (if necessary). This is a blocking call that returns when a matching receive is executed on node dest.

This code is found in the comm folder, which handles the communication between nodes.
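A typical call sequence for the low level routines listed above might look like the following sketch. The routine bodies are single-node stand-ins written only for this example (assumptions, not the contents of comm_low.h), and run_sketch is a hypothetical driver whose body would normally live in main.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned int rank_t;  // matches the definition quoted from layout.h

// Single-node stand-ins (assumptions for this sketch) with the signatures
// listed above; with one process there is nothing to set up or copy.
inline void init_machine(int& /*argc*/, char**& /*argv*/, bool /*single_node*/ = false) {}
inline void shutdown_machine() {}
inline int get_num_nodes() { return 1; }
inline rank_t get_node_rank() { return 0; }
inline void synchronize() {}
inline void broadcast(char* /*buffer*/, std::size_t /*sz*/) {}

// Hypothetical driver: initialize, agree on a value chosen by rank 0,
// wait for everyone, and shut down.
inline unsigned long run_sketch(int& argc, char**& argv) {
    init_machine(argc, argv, true);  // true: single-node run

    // Every rank must be smaller than the number of processes.
    assert(get_node_rank() < static_cast<rank_t>(get_num_nodes()));

    unsigned long seed = 0;
    if (get_node_rank() == 0) seed = 12345UL;  // only rank 0 chooses the value

    // Copy the value from rank 0 to all other processes.
    broadcast(reinterpret_cast<char*>(&seed), sizeof(seed));

    synchronize();       // all processes reach this point before cleanup
    shutdown_machine();  // call before leaving the main program
    return seed;
}
```

Because application code calls only these wrappers, swapping the stand-ins for the MPI implementation should not require changes to run_sketch.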