Two additions are made to many collective communication calls:
- Collective communication can occur ``in place'' for intracommunicators, with the output buffer being identical to the input buffer.
- Collective communication applies to intercommunicators. If the operation is rooted (e.g., broadcast, gather, scatter), the transfer is unidirectional; if it is unrooted (e.g., all-to-all), the transfer is bidirectional.
Rationale.
The ``in place'' operations are provided to reduce unnecessary memory motion
by both the MPI implementation and the user. Note that while the simple check
of testing whether the send and receive buffers have the same address will
work for some cases (e.g., MPI_ALLREDUCE), it is inadequate in
others (e.g., MPI_GATHER, with root not equal to zero). Further,
Fortran explicitly prohibits aliasing of arguments; the approach of using a
special value to denote ``in place'' operation eliminates that difficulty.
(End of rationale.)
Advice to users.
By allowing the ``in place'' option, the receive buffer in many of the collective calls becomes a send-and-receive buffer. For this reason, a Fortran binding that includes INTENT must mark these as INOUT, not OUT.
Note that MPI_IN_PLACE is a special kind of value; it has the same restrictions on its use that MPI_BOTTOM has.
Some intracommunicator collective operations do not support the ``in place''
option (e.g., MPI_ALLTOALLV).
(End of advice to users.)
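For concreteness, the following is a minimal sketch (in the C binding) of the ``in place'' option as it applies to MPI_ALLREDUCE; the buffer contents are illustrative:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    /* MPI_IN_PLACE as sendbuf says the input already sits in vals;
     * the reduced result overwrites it, so no separate send buffer
     * (and no extra copy) is needed. */
    double vals[4] = {1.0, 2.0, 3.0, 4.0};
    MPI_Allreduce(MPI_IN_PLACE, vals, 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    printf("vals[0] summed over all processes: %g\n", vals[0]);
    MPI_Finalize();
    return 0;
}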
Note that the ``in place'' option for intracommunicators does not apply to intercommunicators since in the intercommunicator case there is no communication from a process to itself.
Rationale.
Rooted operations are unidirectional by nature, and there is a clear
way of specifying direction. Non-rooted operations, such as
all-to-all, will often occur as part of an exchange, where it makes
sense to communicate in both directions at once.
(End of rationale.)
In the following, the definitions of the collective routines are provided to
enhance the readability and understanding of the associated text. They do not
change the definitions of the argument lists from MPI-1.
The C and Fortran language bindings for these routines are unchanged from
MPI-1, and are
not repeated here. Since new C++ bindings for the intercommunicator
versions are required, they are included.
The text provided for each routine is appended to the
definition of the routine in MPI-1.
MPI_BCAST(buffer, count, datatype, root, comm)
[ INOUT buffer] starting address of buffer (choice)
[ IN count] number of entries in buffer (integer)
[ IN datatype] data type of buffer (handle)
[ IN root] rank of broadcast root (integer)
[ IN comm] communicator (handle)
void MPI::Comm::Bcast(void* buffer, int count, const MPI::Datatype& datatype, int root) const = 0
The ``in place'' option is not meaningful here.
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is broadcast from the root to all processes in group B. The receive buffer arguments of the processes in group B must be consistent with the send buffer argument of the root.
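The following sketch illustrates this root-argument convention in the C binding. It assumes an intercommunicator built by splitting MPI_COMM_WORLD into even ranks (group A) and odd ranks (group B) and must be run with at least two processes; the even/odd split and the broadcast value are illustrative choices, not part of the standard:

#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Split the world into group A (even ranks) and group B (odd ranks),
     * then join the two groups with an intercommunicator.  The remote
     * leader is named by its rank in the peer communicator. */
    int in_group_a = (rank % 2 == 0);
    MPI_Comm intra, inter;
    MPI_Comm_split(MPI_COMM_WORLD, in_group_a, rank, &intra);
    MPI_Intercomm_create(intra, 0, MPI_COMM_WORLD, in_group_a ? 1 : 0, 0, &inter);

    /* Broadcast from rank 0 of group A to every process in group B:
     * the root passes MPI_ROOT, the other members of group A pass
     * MPI_PROC_NULL, and group B passes the root's rank in group A. */
    int data = 42;                  /* significant only at the root */
    int root;
    if (in_group_a)
        root = (rank == 0) ? MPI_ROOT : MPI_PROC_NULL;
    else
        root = 0;
    MPI_Bcast(&data, 1, MPI_INT, root, inter);

    MPI_Comm_free(&inter);
    MPI_Comm_free(&intra);
    MPI_Finalize();
    return 0;
}

The gather, gatherv, scatter, and scatterv intercommunicator calls below use the same root-argument convention.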
MPI_GATHER(sendbuf, sendcount, sendtype, recvbuf,
recvcount, recvtype, root, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcount] number of elements in send buffer (integer)
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice,
significant only at root)
[ IN recvcount] number of elements for any single receive (integer,
significant only at root)
[ IN recvtype] data type of recv buffer elements
(handle, significant only at root)
[ IN root] rank of receiving process (integer)
[ IN comm] communicator (handle)
void MPI::Comm::Gather(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, int recvcount, const MPI::Datatype& recvtype, int root) const = 0
The ``in place'' option for intracommunicators is specified by passing MPI_IN_PLACE as the value of sendbuf at the root. In such a case, sendcount and sendtype are ignored, and the contribution of the root to the gathered vector is assumed to be already in the correct place in the receive buffer.
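A minimal sketch of the root's ``in place'' form in the C binding; the per-process values are illustrative:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int root = 0;
    int contribution = rank * 10;   /* illustrative per-process value */

    if (rank == root) {
        /* The root writes its own contribution directly into the slot
         * it would otherwise receive it in; with MPI_IN_PLACE as
         * sendbuf, sendcount and sendtype are ignored. */
        int *recvbuf = malloc(size * sizeof(int));
        recvbuf[root] = contribution;
        MPI_Gather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                   recvbuf, 1, MPI_INT, root, MPI_COMM_WORLD);
        free(recvbuf);
    } else {
        /* Receive arguments are significant only at the root. */
        MPI_Gather(&contribution, 1, MPI_INT,
                   NULL, 0, MPI_INT, root, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}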
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is gathered from all processes in group B to the root. The send buffer arguments of the processes in group B must be consistent with the receive buffer argument of the root.
MPI_GATHERV(sendbuf, sendcount, sendtype, recvbuf,
recvcounts, displs, recvtype, root, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcount] number of elements in send buffer (integer)
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice,
significant only at root)
[ IN recvcounts] integer array (of length group size)
containing the number of elements that are received from each process
(significant only at root)
[ IN displs] integer array (of length group size). Entry
i specifies the displacement relative to recvbuf at
which to place the incoming data from process i (significant only
at root)
[ IN recvtype] data type of recv buffer elements
(handle, significant only at root)
[ IN root] rank of receiving process (integer)
[ IN comm] communicator (handle)
void MPI::Comm::Gatherv(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, const int recvcounts[], const int displs[], const MPI::Datatype& recvtype, int root) const = 0
The ``in place'' option for intracommunicators is specified by passing MPI_IN_PLACE as the value of sendbuf at the root. In such a case, sendcount and sendtype are ignored, and the contribution of the root to the gathered vector is assumed to be already in the correct place in the receive buffer.
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is gathered from all processes in group B to the root. The send buffer arguments of the processes in group B must be consistent with the receive buffer argument of the root.
MPI_SCATTER(sendbuf, sendcount, sendtype, recvbuf,
recvcount, recvtype, root, comm)
[ IN sendbuf] address of send buffer (choice, significant
only at root)
[ IN sendcount] number of elements sent to each process
(integer, significant only at root)
[ IN sendtype] data type of send buffer elements
(handle, significant only at root)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcount] number of elements in receive buffer (integer)
[ IN recvtype] data type of receive buffer elements (handle)
[ IN root] rank of sending process (integer)
[ IN comm] communicator (handle)
void MPI::Comm::Scatter(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, int recvcount, const MPI::Datatype& recvtype, int root) const = 0
The ``in place'' option for intracommunicators is specified by passing MPI_IN_PLACE as the value of recvbuf at the root. In such a case, recvcount and recvtype are ignored, and root ``sends'' no data to itself. The scattered vector is still assumed to contain n segments, where n is the group size; the root-th segment, which root should ``send to itself,'' is not moved.
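A minimal sketch of the root's ``in place'' form in the C binding; the segment contents are illustrative:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int root = 0;

    if (rank == root) {
        /* With MPI_IN_PLACE as recvbuf, recvcount and recvtype are
         * ignored and the root-th segment of sendbuf is simply left
         * where it is. */
        int *sendbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = i * 10;    /* illustrative segments */
        MPI_Scatter(sendbuf, 1, MPI_INT,
                    MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, root, MPI_COMM_WORLD);
        free(sendbuf);
    } else {
        /* Send arguments are significant only at the root. */
        int value;
        MPI_Scatter(NULL, 0, MPI_INT,
                    &value, 1, MPI_INT, root, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}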
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is scattered from the root to all processes in group B. The receive buffer arguments of the processes in group B must be consistent with the send buffer argument of the root.
MPI_SCATTERV(sendbuf, sendcounts, displs, sendtype,
recvbuf, recvcount, recvtype, root, comm)
[ IN sendbuf] address of send buffer (choice, significant
only at root)
[ IN sendcounts] integer array (of length group size)
specifying the number of elements to send to each process
[ IN displs] integer array (of length group size). Entry
i specifies the displacement (relative to sendbuf) from
which to take the outgoing data to process i
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcount] number of elements in receive buffer (integer)
[ IN recvtype] data type of receive buffer elements (handle)
[ IN root] rank of sending process (integer)
[ IN comm] communicator (handle)
void MPI::Comm::Scatterv(const void* sendbuf, const int sendcounts[], const int displs[], const MPI::Datatype& sendtype, void* recvbuf, int recvcount, const MPI::Datatype& recvtype, int root) const = 0
The ``in place'' option for intracommunicators is specified by passing MPI_IN_PLACE as the value of recvbuf at the root. In such a case, recvcount and recvtype are ignored, and root ``sends'' no data to itself. The scattered vector is still assumed to contain n segments, where n is the group size; the root-th segment, which root should ``send to itself,'' is not moved.
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Data is scattered from the root to all processes in group B. The receive buffer arguments of the processes in group B must be consistent with the send buffer argument of the root.
MPI_ALLGATHER(sendbuf, sendcount, sendtype, recvbuf,
recvcount, recvtype, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcount] number of elements in send buffer (integer)
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcount] number of elements received from any
process (integer)
[ IN recvtype] data type of receive buffer elements (handle)
[ IN comm] communicator (handle)
void MPI::Comm::Allgather(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, int recvcount, const MPI::Datatype& recvtype) const = 0
The ``in place'' option for intracommunicators is specified by passing the
value
MPI_IN_PLACE to the argument sendbuf at all processes.
sendcount and sendtype are ignored. Then the input data
of each process is assumed to be in the area where that
process would receive its own contribution to the receive buffer.
Specifically, the outcome of a call to MPI_ALLGATHER in the
``in place'' case is as if all processes executed n calls to
MPI_GATHER(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, recvbuf, recvcount, recvtype, root, comm) for root = 0, ..., n - 1.
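A minimal sketch of this in the C binding; each process seeds its own slot of the receive buffer before the call (values illustrative):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process places its contribution in its own slot of the
     * receive buffer; MPI_IN_PLACE then gathers directly from those
     * slots, with sendcount and sendtype ignored. */
    int *recvbuf = malloc(size * sizeof(int));
    recvbuf[rank] = rank * 10;      /* illustrative value */
    MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                  recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    free(recvbuf);
    MPI_Finalize();
    return 0;
}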
If comm is an intercommunicator, then each process in group A contributes a data item; these items are concatenated and the result is stored at each process in group B. Conversely the concatenation of the contributions of the processes in group B is stored at each process in group A. The send buffer arguments in group A must be consistent with the receive buffer arguments in group B, and vice versa.
Advice to users.
The communication pattern of MPI_ALLGATHER executed on an intercommunication domain need not be symmetric. The number of items sent by processes in group A (as specified by the arguments sendcount, sendtype in group A and the arguments recvcount, recvtype in group B), need not equal the number of items sent by processes in group B (as specified by the arguments sendcount, sendtype in group B and the arguments recvcount, recvtype in group A). In particular, one can move data in only one direction by specifying sendcount = 0 for the communication in the reverse direction.
(End of advice to users.)
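A sketch of such one-directional use in the C binding, reusing the illustrative even/odd intercommunicator construction from the broadcast sketch above (run with at least two processes):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Even/odd intercommunicator, built as in the broadcast sketch. */
    int in_group_a = (rank % 2 == 0);
    MPI_Comm intra, inter;
    MPI_Comm_split(MPI_COMM_WORLD, in_group_a, rank, &intra);
    MPI_Intercomm_create(intra, 0, MPI_COMM_WORLD, in_group_a ? 1 : 0, 0, &inter);

    /* Data moves only from group A to group B: group B passes
     * sendcount = 0, and group A passes recvcount = 0. */
    int item = rank;
    if (in_group_a) {
        MPI_Allgather(&item, 1, MPI_INT, NULL, 0, MPI_INT, inter);
    } else {
        int remote_size;
        MPI_Comm_remote_size(inter, &remote_size);
        int *recvbuf = malloc(remote_size * sizeof(int));
        MPI_Allgather(&item, 0, MPI_INT, recvbuf, 1, MPI_INT, inter);
        free(recvbuf);
    }

    MPI_Comm_free(&inter);
    MPI_Comm_free(&intra);
    MPI_Finalize();
    return 0;
}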
MPI_ALLGATHERV(sendbuf, sendcount, sendtype, recvbuf,
recvcounts, displs, recvtype, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcount] number of elements in send buffer (integer)
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcounts] integer array (of length group size)
containing the number of elements that are received from each process
[ IN displs] integer array (of length group size). Entry
i specifies the displacement (relative to recvbuf) at
which to place the incoming data from process i
[ IN recvtype] data type of receive buffer elements (handle)
[ IN comm] communicator (handle)
void MPI::Comm::Allgatherv(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, const int recvcounts[], const int displs[], const MPI::Datatype& recvtype) const = 0
The ``in place'' option for intracommunicators is specified by passing the
value
MPI_IN_PLACE to the argument sendbuf at all processes.
sendcount and sendtype are ignored. Then the input data
of each process is assumed to be in the area where that
process would receive its own contribution to the receive buffer.
Specifically, the outcome of a call to MPI_ALLGATHERV in the
``in place'' case is as if all processes executed n calls to
MPI_GATHERV(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, recvbuf, recvcounts, displs, recvtype, root, comm) for root = 0, ..., n - 1.
If comm is an intercommunicator, then each process in group A contributes a data item; these items are concatenated and the result is stored at each process in group B. Conversely the concatenation of the contributions of the processes in group B is stored at each process in group A. The send buffer arguments in group A must be consistent with the receive buffer arguments in group B, and vice versa.
MPI_ALLTOALL(sendbuf, sendcount, sendtype, recvbuf,
recvcount, recvtype, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcount] number of elements sent to each process (integer)
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcount] number of elements received from any
process (integer)
[ IN recvtype] data type of receive buffer elements (handle)
[ IN comm] communicator (handle)
void MPI::Comm::Alltoall(const void* sendbuf, int sendcount, const MPI::Datatype& sendtype, void* recvbuf, int recvcount, const MPI::Datatype& recvtype) const = 0
No ``in place'' option is supported.
If comm is an intercommunicator, then the outcome is as if each process in group A sends a message to each process in group B, and vice versa. The j-th send buffer of process i in group A should be consistent with the i-th receive buffer of process j in group B, and vice versa.
Advice to users.
When all-to-all is executed on an intercommunication domain, then the number of data items sent from processes in group A to processes in group B need not equal the number of items sent in the reverse direction. In particular, one can have unidirectional communication by specifying sendcount = 0 in the reverse direction.
(End of advice to users.)
MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype,
recvbuf, recvcounts, rdispls, recvtype, comm)
[ IN sendbuf] starting address of send buffer (choice)
[ IN sendcounts] integer array (of length group size)
specifying the number of elements to send to each process
[ IN sdispls] integer array (of length group size). Entry
j specifies the displacement (relative to sendbuf) from
which to take the outgoing data destined for process j
[ IN sendtype] data type of send buffer elements (handle)
[ OUT recvbuf] address of receive buffer (choice)
[ IN recvcounts] integer array (of length group size)
specifying the number of elements that can be received from
each process
[ IN rdispls] integer array (of length group size). Entry
i specifies the displacement (relative to recvbuf) at
which to place the incoming data from process i
[ IN recvtype] data type of receive buffer elements (handle)
[ IN comm] communicator (handle)
void MPI::Comm::Alltoallv(const void* sendbuf, const int sendcounts[], const int sdispls[], const MPI::Datatype& sendtype, void* recvbuf, const int recvcounts[], const int rdispls[], const MPI::Datatype& recvtype) const = 0
No ``in place'' option is supported.
If comm is an intercommunicator, then the outcome is as if each process in group A sends a message to each process in group B, and vice versa. The j-th send buffer of process i in group A should be consistent with the i-th receive buffer of process j in group B, and vice versa.