Problem with MPI_Comm_Split and MPI_Bcast

Asked by: Giorgio Aveni  Asked: 11/10/2023  Last edited by: Gerhardh / Giorgio Aveni  Updated: 11/10/2023  Views: 68

Question:

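A few days ago I asked a question and, thanks to a kind person, I was able to solve my problem. Now I am trying to implement a more complex program in C with MPI, using MPI_Comm_split to create communicators for certain groups of processes. I have one process with global rank 0 that does not belong to any group, because I want to treat it as a central server: its task is to initialize an array of 4 floating-point numbers and send it to the processes that belong to group 0. Group 0 consists of 3 processes (those with global ranks 1, 2 and 3), which I treat as intermediate servers, so they receive the array from the central server and forward it to the other groups. I also have 3 other groups that contain the remaining processes (each of these 3 groups has the same number of processes).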

Here is the code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 7) {
        if (rank == 0) {
            printf("This program requires at least 7 processes.\n");
        }
    } else {
        int group_number;

        if (rank == 0) {
            group_number = -1; // Process with rank 0 does not belong to any group
        } else if (rank >= 1 && rank <= 3) {
            group_number = 0; // New group formed by processes with rank 1, 2, and 3
        } else {
            group_number = (rank - 4) % 3 + 1; // Subdivision of the other processes into 3 groups with numbers 1, 2, and 3
        }

        MPI_Comm comm_group;
        MPI_Comm_split(MPI_COMM_WORLD, group_number, rank, &comm_group);

        int group_rank;
        int group_size;
        MPI_Comm_rank(comm_group, &group_rank);
        MPI_Comm_size(comm_group, &group_size);

        if (rank == 0) {
            printf("Process %d does not belong to any group because it is the central server.\n", rank);
        } else if(group_number != 0) {
            printf("Process %d in group %d. Rank in the group: %d\n", rank, group_number, group_rank);
        }else{
            printf("Process %d in intermediate server group %d. Rank in the group: %d\n", rank, group_number, group_rank);
        }

        float global_message[4];
        float intermediary_message[4];

        if (rank == 0)
        {
           // Process with global rank 0 initializes the message
            global_message[0] = 10.0;
            global_message[1] = 11.0;
            global_message[2] = 12.0;
            global_message[3] = 13.0;

            printf("Message to send: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);
        }

        // Send message only to processes in group 0
        if (group_number == 0 || rank == 0) {
            MPI_Bcast(&global_message, 4, MPI_FLOAT, 0,MPI_COMM_WORLD);
        }

        if (group_number == 0 ) {
            // Print the message received from process 0
            printf("Received message from central server: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);
            
        } 

        if (rank == 1) {
            
            printf("intermediate message 1: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);

            for (int i = 0; i < 4; i++) {
                intermediary_message[i] = global_message[i];
            }
        }

        if (rank == 2) {
            
            printf("intermediate message 2: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);

            for (int i = 0; i < 4; i++) {
                intermediary_message[i] = global_message[i];
            }
        }

        if (rank == 3) {
            
            printf("intermediate message 3: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);

            for (int i = 0; i < 4; i++) {
                intermediary_message[i] = global_message[i];
            }
        }

        if (group_number == 1 || rank == 1)
        {
            MPI_Bcast(&intermediary_message, 4, MPI_FLOAT, 0, comm_group);
        }

         if (group_number == 2 || rank == 2)
        {
            MPI_Bcast(&intermediary_message, 4, MPI_FLOAT, 1, comm_group);
        }

         if (group_number == 3 || rank == 3)
        {
            MPI_Bcast(&intermediary_message, 4, MPI_FLOAT, 2, comm_group);
        }

        if (group_number == 1 || group_number == 2 || group_number == 3 )
        {
            // Print the message received from each process in group 0
            printf("Received the message from the intermediate server: %f %f %f %f\n\n", global_message[0], global_message[1], global_message[2], global_message[3]);
        }

        int group_members[group_size];
        MPI_Gather(&rank, 1, MPI_INT, group_members, 1, MPI_INT, 0, comm_group);

        if (group_number != -1) {
            if (group_rank == 0) {
                printf("Group %d includes the following members: ", group_number);
                for (int i = 0; i < group_size; i++) {
                    printf("%d ", group_members[i]);
                }
                printf("\n\n");
            }
        }

        if (group_number >= 0 && group_number != 0) {
            MPI_Comm_free(&comm_group);
        }
    }

    MPI_Finalize();
    return 0;
}

This is the output:

mpiexec -np 13 third_example

Process 8 in group 2. Rank in the group: 1
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Process 0 does not belong to any group because it is the central server.
Message to send: 10.000000 11.000000 12.000000 13.000000

Process 1 in intermediate server group 0. Rank in the group: 0
Received message from central server: 10.000000 11.000000 12.000000 13.000000

intermediate message 1: 10.000000 11.000000 12.000000 13.000000

Group 0 includes the following members: 1 2 3

Process 2 in intermediate server group 0. Rank in the group: 1
Received message from central server: 10.000000 11.000000 12.000000 13.000000

intermediate message 2: 10.000000 11.000000 12.000000 13.000000

Process 5 in group 2. Rank in the group: 0
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Group 2 includes the following members: 5 8 11

Process 12 in group 3. Rank in the group: 2
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Process 7 in group 1. Rank in the group: 1
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Process 9 in group 3. Rank in the group: 1
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Process 3 in intermediate server group 0. Rank in the group: 2
Received message from central server: 10.000000 11.000000 12.000000 13.000000

intermediate message 3: 10.000000 11.000000 12.000000 13.000000

Process 6 in group 3. Rank in the group: 0
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Group 3 includes the following members: 6 9 12

Process 11 in group 2. Rank in the group: 2
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Process 4 in group 1. Rank in the group: 0
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

Group 1 includes the following members: 4 7 10

Process 10 in group 1. Rank in the group: 2
Received the message from the intermediate server: -0.000001 0.000000 -0.000001 0.000000

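As you can see, forwarding the array to group 0 works, but forwarding it to the other groups does not. The communicator for the processes I call "intermediate servers" and for groups 1, 2 and 3 is the same, but it does not work. I think the problem is that the communicator used to forward the array from the central server to the intermediate server group is the global communicator (MPI_COMM_WORLD), but when I try to replace it with the group communicator, the code does not execute. Could someone please help me?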

Tags: c, mpi

Comments

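0 votes Victor Eijkhout 11/11/2023
1. Regarding "// Send message only to processes in group 0 if (group_number == 0 || rank == 0) {": how do you think the other ranks will receive the bcast message if they never call bcast? 2. You have a bunch of bcast calls on the group communicator, but they are all conditional. I don't like this kind of code: it is too easy to miss a case. Make a single bcast call, and use the conditionals to set up different buffers.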
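1 vote Giorgio Aveni 11/11/2023
@VictorEijkhout When you say the other ranks do not call bcast, do you mean the ranks in group 0 or the ranks in the other groups? I used so many conditions because I am a beginner at programming with MPI: since I managed to send the message to all the processes in group 0, I tried to replicate the same logic for the other groups. I know I am asking a lot, but could you explain in more detail what I can do to make the code behave as I expect?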
0 votes Gilles Gouaillardet 11/11/2023
MPI_Bcast() happens within a communicator. So if you want to broadcast from 0 to [1-3] (ranks in MPI_COMM_WORLD), all of these tasks must be in the same communicator.
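0 votes Giorgio Aveni 11/11/2023
@GillesGouaillardet I am very confused. As I understand it, the process with rank 0 in MPI_COMM_WORLD must send the message only to the processes contained in group 0 (i.e. the processes with ranks 1, 2, 3 in MPI_COMM_WORLD). In fact, when I write in the code // Send message only to processes in group 0 if (group_number == 0 || rank == 0) { MPI_Bcast(&global_message, 4, MPI_FLOAT, 0,MPI_COMM_WORLD); } do the group 0 processes (ranks 1, 2, 3 in MPI_COMM_WORLD) actually receive the message, or am I wrong?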
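0 votes Giorgio Aveni 11/11/2023
I don't want to keep bothering you with questions, so do you happen to know of any resources I could use to learn MPI programming (for example books, video courses or links)?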

Answers: no answers yet