Buffered send (MPI_Bsend):
- MPI_Bsend is a buffered send: the sender places the message in a buffer and carries on executing without waiting for the receiver to confirm.
- Unlike MPI_Send and MPI_Ssend, MPI_Bsend does not block the sender; the message goes into the buffer and is transferred asynchronously by the MPI system later.
- MPI_Bsend is often used when sending a lot of data, to reduce the sender's waiting overhead, but MPI_Buffer_detach must be called at a suitable point to make sure the buffered messages have actually been sent (see the sketch below).
* More details in Modes
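A minimal sketch of the buffered-send pattern (my own example, not from the notes): rank 0 buffered-sends 10 ints to rank 1, assuming the program is run with at least two ranks. A user buffer must be attached with MPI_Buffer_attach before MPI_Bsend can use it, and detaching it waits until the buffered messages have gone.
#include <mpi.h>
#include <stdlib.h>
int main(int argc, char **argv) {
    int rank, x[10] = {0};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        int bufsize = 10 * sizeof(int) + MPI_BSEND_OVERHEAD;
        void *buf = malloc(bufsize);
        MPI_Buffer_attach(buf, bufsize);                  // hand MPI a buffer for Bsend
        MPI_Bsend(x, 10, MPI_INT, 1, 0, MPI_COMM_WORLD);  // returns once the message is copied into the buffer
        MPI_Buffer_detach(&buf, &bufsize);                // blocks until buffered messages have been sent
        free(buf);
    } else if (rank == 1) {
        MPI_Recv(x, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}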
sending a message
int MPI_Ssend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
from rank 1 to rank 3
int x[10];
MPI_Ssend(x, 10, MPI_INT, 3, 0, MPI_COMM_WORLD);  // 3 is the destination rank, 0 is the tag
receiving a message
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
from rank 1 on rank 3
int y[10];
MPI_Status status;
MPI_Recv(y, 10, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);  // 1 is the source rank, 0 is the tag
for a communication to succeed
valid destination rank in sender
valid source rank in receiver
same communicator
same tags
same message types
receiver's buffer must be large enough
wildcarding
receiver can wildcard
to receive from any source MPI_ANY_SOURCE
to receive with any tag MPI_ANY_TAG
actual source and tag are returned in the receiver's status parameter.
received message count
int MPI_Get_count(MPI_Status *status, MPI_Datatype datatype, int *count)
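A short sketch combining a wildcard receive with MPI_Get_count (my own fragment; it assumes <mpi.h> and <stdio.h> are included, MPI is initialised, and some other rank has sent at most 10 ints):
MPI_Status status;
int y[10], count;
MPI_Recv(y, 10, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_INT, &count);  // number of MPI_INT elements actually received
printf("received %d ints from rank %d with tag %d\n", count, status.MPI_SOURCE, status.MPI_TAG);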
all MPI communications take place within a communicator
communicator - a group of processes
message can only be received within the same communicator from which it was sent
can be split into pieces by MPI_Comm_split()
each process has a new rank within each sub-communicator
guarantee messages from different pieces do not interact
can make copy by MPI_Comm_dup
containing same processes but in a new communicator
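A possible sketch of splitting MPI_COMM_WORLD into two pieces by even/odd rank and duplicating it; the colour/key choices are my own illustration, not from the notes:
int world_rank, sub_rank;
MPI_Comm subcomm, dupcomm;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
MPI_Comm_split(MPI_COMM_WORLD,
               world_rank % 2,       // colour: which piece this process joins
               world_rank,           // key: ordering of the new ranks within the piece
               &subcomm);
MPI_Comm_rank(subcomm, &sub_rank);   // new rank within the sub-communicator
MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);  // same processes, but a separate communication space
MPI_Comm_free(&subcomm);
MPI_Comm_free(&dupcomm);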
Non-blocking Communications
non-blocking operations
all non-blocking operations should have matching wait operations
some systems cannot free resources until the wait has been called
a non-blocking operation immediately followed by a matching wait is equivalent to the blocking operation
operation continues after the call has returned
Non-blocking operations differ from ordinary sequential subroutine calls: a non-blocking call returns immediately, so the program can carry on with other work without waiting for the operation to finish, whereas a conventional call only returns once the subroutine has completed.
can be separated into three phases:
initiate non-blocking communication
do some work
wait for non-blocking communication to complete
a request handle is allocated when a communication is initiated
non-blocking synchronous send
a blocking send can be used with a non-blocking receive, and vice-versa
non-blocking sends can use any mode - synchronous, buffered or standard
non-blocking modes
Non-blocking operation    MPI call
Standard send             MPI_Isend
Synchronous send          MPI_Issend
Buffered send             MPI_Ibsend
Receive                   MPI_Irecv
example
MPI_Request request;
MPI_Status status;
if (rank == 0) {
    MPI_Issend(sendarray, 10, MPI_INT, 1, tag, MPI_COMM_WORLD, &request);
    Do_something_else_while_Issend_happens();
    // now wait for send to complete
    MPI_Wait(&request, &status);
} else if (rank == 1) {
    MPI_Irecv(recvarray, 10, MPI_INT, 0, tag, MPI_COMM_WORLD, &request);
    Do_something_else_while_Irecv_happens();
    // now wait for receive to complete
    MPI_Wait(&request, &status);
}
important notes
synchronous mode affects what completion means
after a wait on MPI_Issend, the matching receive has started, so the message has been (or is being) received
after a wait on MPI_Isend, you only know the send buffer can safely be reused; the message may still be buffered by the system and not yet received
the request can be re-used after a wait
You must not access send or receive buffers until communications are complete
cannot overwrite send buffer until after a wait on Issend / Isend
cannot read from a receive buffer until after a wait on Irecv
testing multiple non-blocking comms
see slide L07
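A small sketch (illustrative only; the slide referenced above covers this properly) of waiting on several outstanding requests at once with MPI_Waitall; left, right, left_buf, right_buf and N are hypothetical names for neighbour ranks, buffers and message length:
MPI_Request requests[2];
MPI_Status statuses[2];
MPI_Irecv(left_buf,  N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &requests[0]);
MPI_Irecv(right_buf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &requests[1]);
// ... do work that does not touch left_buf or right_buf ...
MPI_Waitall(2, requests, statuses);  // returns once both receives have completed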
Collective communication
characteristics
collective action over a communicator
all processes must communicate
synchronisation may or may not occur
standard collective operations are blocking
no tags
receive buffer must be exactly the right size
comms
barrier synchronisation
int MPI_Barrier(MPI_Comm comm)
Creates a synchronisation point: every process waits at the barrier until all processes in the communicator have reached it, after which they all continue.
broadcast
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
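A minimal sketch of a broadcast; `rank` is assumed to have been obtained from MPI_Comm_rank, and the initialisation on the root is my own placeholder:
int data[10];
if (rank == 0) {
    for (int i = 0; i < 10; i++) data[i] = i;   // root fills in the values
}
MPI_Bcast(data, 10, MPI_INT, 0, MPI_COMM_WORLD); // every rank makes the same call
// afterwards, data[] on every rank holds the root's values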
Virtual topologies
each process is “connected” to its neighbours in a virtual grid
boundaries can be cyclic, or not.
optionally re-order ranks to allow MPI implementation to optimise for underlying network interconnectivity
processes are identified by cartesian coordinates
create a cartesian virtual topology
int MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *periods, int reorder, MPI_Comm *comm_cart)
balanced processor distribution
int MPI_Dims_create( int nnodes, int ndims, int *dims )
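A sketch combining MPI_Dims_create and MPI_Cart_create; the choice of a 2-D grid that is cyclic only in dimension 0 is my own for illustration:
int size, dims[2] = {0, 0}, periods[2] = {1, 0};  // cyclic in dimension 0, not in dimension 1
MPI_Comm cart_comm;
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Dims_create(size, 2, dims);                   // fills in a balanced dims[0] x dims[1] == size
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart_comm);  // reorder = 1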
mapping functions
int MPI_Cart_rank( MPI_Comm comm, int *coords, int *rank) // Mapping process grid coordinates to ranks
int MPI_Cart_coords(MPI_Comm comm, int rank, int maxdims, int *coords) // Mapping ranks to process grid coordinates
int MPI_Cart_shift(MPI_Comm comm, int direction, int disp, int *rank_source, int *rank_dest) // Computes the ranks of my neighbouring processes, following the conventions of MPI_Sendrecv
rank_source is not your rank! it is an output not an input
For messages passed round a ring
rank_source would be rank - 1
rank_dest would be rank + 1
Different convention to MPI_Cart_coords()
you are implicitly asking for your neighbours
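A sketch of the mapping calls on the cart_comm created above; the dimension and displacement choices are my own:
int my_rank, coords[2], rank_source, rank_dest;
MPI_Comm_rank(cart_comm, &my_rank);
MPI_Cart_coords(cart_comm, my_rank, 2, coords);   // my coordinates in the process grid
// displacement +1 along dimension 0: rank_source is the neighbour "behind" me,
// rank_dest the neighbour "ahead"; either may be MPI_PROC_NULL at a non-cyclic boundary
MPI_Cart_shift(cart_comm, 0, 1, &rank_source, &rank_dest);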
Partitioning
int MPI_Cart_sub ( MPI_Comm comm, int *remain_dims, MPI_Comm *newcomm)
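A sketch of partitioning the 2-D grid from above into one sub-communicator per row; keeping only dimension 1 is my own choice for illustration:
int remain_dims[2] = {0, 1};   // drop dimension 0, keep dimension 1
MPI_Comm row_comm;
MPI_Cart_sub(cart_comm, remain_dims, &row_comm);
// row_comm now contains exactly the processes in my row of the grid
MPI_Comm_free(&row_comm);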