Difference between revisions of "Communication"
From Gw-qcd-wiki
Revision as of 17:37, 13 December 2011
Code organization
The communication is organized in three layers:
- low level: these routines send/receive unstructured data from one process to another
- intermediate level: these routines handle data movement between lattice data structures
- high level: these routines implement shifts
Low level routines
These routines are simple wrappers that sit on top of the communication library (most likely MPI). Their purpose is to insulate the rest of our code from the communication layer. Currently, our code provides an implementation that uses MPI and a vanilla implementation that is used for single-node architectures. Here is a list of the low level routines in comm_low.h with a short description:
void init_machine(int &argc, char **&argv, bool single_node = false)
- This function must be called to initialize the communication layer whenever the communication library is linked. The parameters argc and argv are the ones passed to the main routine by the operating system. The last parameter indicates whether the code is meant to run only on single-node machines.
void shutdown_machine()
- Cleans up the communication library. Call it before exiting the main program.
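To illustrate the call order, here is a minimal sketch of how a single-node ("vanilla") build might implement these two routines. Only the two signatures come from the list above; the namespace `vanilla_sketch`, the `g_initialized` flag, and the helpers `is_initialized()` and `run_program()` are assumptions made for this example and are not taken from comm_low.h.

```cpp
#include <cassert>

namespace vanilla_sketch {

// Hypothetical bookkeeping flag; the real library may track state differently.
static bool g_initialized = false;

// Same signature as listed above. A single-node build has no communication
// library to start, so this sketch only records that it was called.
void init_machine(int &argc, char **&argv, bool single_node = false) {
    (void)argc;
    (void)argv;
    (void)single_node;
    assert(!g_initialized && "init_machine called twice");
    g_initialized = true;
}

// Counterpart of init_machine: nothing to tear down in this sketch.
void shutdown_machine() {
    assert(g_initialized && "shutdown_machine called before init_machine");
    g_initialized = false;
}

// Hypothetical query used only to make the lifecycle observable.
bool is_initialized() { return g_initialized; }

// Typical lifecycle: initialize first, do the work, shut down last.
int run_program(int argc, char **argv) {
    init_machine(argc, argv, /*single_node=*/true);
    int status = is_initialized() ? 0 : 1;  // stand-in for the real work
    shutdown_machine();
    return status;
}

}  // namespace vanilla_sketch
```

In a real program the body of `run_program` would be the body of `main`, with all other communication calls placed between the two routines.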
int get_num_nodes()
- Returns the number of processes in the current program.
rank_t get_node_rank()
- Returns the rank (number) of the current process. rank_t is defined to be unsigned int in layout.h.
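A common use of these two values is to decide which part of a global workload belongs to the current process. The sketch below shows one possible way to do this; the helper `local_range` is hypothetical and not part of comm_low.h. It takes the values that `get_num_nodes()` and `get_node_rank()` would return as plain arguments so that it can be read and tested without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

typedef unsigned int rank_t;  // mirrors the definition quoted above from layout.h

// Hypothetical helper: returns the half-open index range [begin, end) owned
// by `rank` when `total` items are split as evenly as possible over
// `num_nodes` processes. The first (total % num_nodes) ranks get one extra item.
std::pair<std::size_t, std::size_t>
local_range(std::size_t total, int num_nodes, rank_t rank) {
    assert(num_nodes > 0 && rank < static_cast<rank_t>(num_nodes));
    const std::size_t n = static_cast<std::size_t>(num_nodes);
    const std::size_t base = total / n;
    const std::size_t extra = total % n;
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t length = base + (rank < extra ? 1 : 0);
    return std::make_pair(begin, begin + length);
}
```

With the library linked, a call would presumably look like `local_range(total, get_num_nodes(), get_node_rank())`.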
void synchronize()
- Inserts a communication barrier. This blocks execution in every process until all of them reach this point.
void broadcast(char* buffer, size_t sz)
- Broadcasts the buffer of size sz bytes to all nodes, i.e., copies it from the node of rank 0 to all other nodes.
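The effect of a broadcast can be pictured with a small in-memory model. The sketch below does not call the library; it represents each node's buffer as one entry in a vector and copies the bytes of rank 0 into all the others. The function `simulate_broadcast` and this data model are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model: node_buffers[r] stands for the buffer held by the node of rank r.
// After the call, the first sz bytes of every buffer equal those of rank 0.
void simulate_broadcast(std::vector<std::vector<char> > &node_buffers,
                        std::size_t sz) {
    assert(!node_buffers.empty());
    const std::vector<char> &root = node_buffers[0];
    assert(root.size() >= sz);
    for (std::size_t r = 1; r < node_buffers.size(); ++r) {
        assert(node_buffers[r].size() >= sz);
        if (sz > 0) {
            std::memcpy(node_buffers[r].data(), root.data(), sz);
        }
    }
}
```

Note that in this model, as in the description above, the data on rank 0 is left untouched and whatever the other nodes held before the call is overwritten.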
void send(char* buffer, size_t sz, rank_t dest, int tag=1)
- Sends a buffer of size sz bytes from the current node to the node of rank dest and attaches a numeric tag to it (if necessary). This is a blocking call that returns only when a matching receive is executed on node dest.
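Because broadcast and send operate on raw bytes (a char* and a size in bytes), structured data has to be viewed as a byte buffer before it is handed to them. The following sketch shows one way to do that for simple, trivially copyable types. The struct `Header` and the helpers `to_bytes` and `from_bytes` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical payload; any trivially copyable type works the same way.
struct Header {
    int site_count;
    double beta;
};

// Copies the object representation of `value` into a byte buffer.
template <typename T>
std::vector<char> to_bytes(const T &value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be sent as raw bytes");
    std::vector<char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Rebuilds a value from a byte buffer of exactly sizeof(T) bytes.
// T must also be default constructible for this sketch.
template <typename T>
T from_bytes(const char *buffer, std::size_t sz) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be rebuilt from raw bytes");
    assert(sz == sizeof(T));
    T value;
    std::memcpy(&value, buffer, sizeof(T));
    return value;
}
```

With such helpers, a call might look like `send(bytes.data(), bytes.size(), dest)`. This approach assumes that sender and receiver use the same data layout (same architecture and compiler settings).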
This code is found in the comm folder and handles the communication between nodes.
- comm_intermediate.cpp
- comm_low_cuda.cu