Asked by: DJNZ  Asked: 11/10/2023  Last edited by: Ian Bush, DJNZ  Updated: 11/22/2023  Views: 79
Processing a shared array by a passed index in a subroutine in a parallel loop
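Q:
In a parallel loop, I process a shared array with a subroutine, passing the array and the current private DO index to it as arguments, but the program crashes with an array-out-of-bounds error. How do I call a subroutine correctly so that it processes the shared array and receives the parallel loop index?
Code in main.F: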
      PROGRAM TESTER
      USE OMP_LIB
      USE PRINTER
      INTEGER, PARAMETER:: N = 5
      REAL*4,DIMENSION(:),ALLOCATABLE, SAVE :: ARG_1, ARG_2
      REAL*4,DIMENSION(:),ALLOCATABLE:: RES
C=======================================================================
C$OMP THREADPRIVATE(ARG_1, ARG_2)
C=======================================================================
      ALLOCATE(RES(N))
      PRINT *,'MAIN: "RES" IS ALLOCATED = ',
     >         ALLOCATED(RES)
C$OMP PARALLEL PRIVATE(I) SHARED(RES) NUM_THREADS(2)
      ALLOCATE(ARG_1(N))
      PRINT *,'MAIN: "ARG_1" IS ALLOCATED = ',
     >         ALLOCATED(ARG_1)
      ALLOCATE(ARG_2(N))
      PRINT *,'MAIN: "ARG_2" IS ALLOCATED = ',
     >         ALLOCATED(ARG_2)
C     Step 1: Initialize working arrays:
      CALL WORK1(ARG_1,N, ARG_2,N)
      CALL WORK2(ARG_1,N, ARG_2,N)
C     Step 2: Print working arrays:
      CALL PRINT_ARR(ARG_1,N)
      CALL PRINT_ARR(ARG_2,N)
      PRINT *,'===================================='
C     Step 3: Parallel Loop:
c-----------------------------------------------------------------------
C$OMP DO
      DO I=1,N
         CALL WORK3(RES,I,ARG_1(I),ARG_2(I))
      ENDDO
C$OMP END DO
      CALL PRINT_ARR(RES,N)
c-----------------------------------------------------------------------
C$OMP END PARALLEL
      DEALLOCATE(ARG_1,ARG_2)
      DEALLOCATE(RES)
      END PROGRAM TESTER
Code in the work.F file:
      SUBROUTINE WORK1(ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
      INTEGER DIM_1, DIM_2,I,J
      REAL*4 ARG_ARR_2(DIM_2)
      REAL*4 ARG_ARR_1(DIM_1)
      REAL*4 ARG1, ARG2
      REAL*4,DIMENSION(:),ALLOCATABLE:: ARG_ARR_3
      SAVE
c-----------------------------------------------------------------------
C$OMP THREADPRIVATE (I)
c-----------------------------------------------------------------------
      DO I=1,DIM_1
         ARG_ARR_1(I)= 1.0
      ENDDO
      RETURN
      ENTRY WORK2 (ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
      DO I=1,DIM_2
         ARG_ARR_2(I)= 2.0
      ENDDO
      RETURN
      ENTRY WORK3 (ARG_ARR_3,J,ARG1,ARG2)
      ARG_ARR_3(J)= ARG1+ARG2
      RETURN
      END SUBROUTINE WORK1
And the module.f code:
      MODULE PRINTER
      CONTAINS
      SUBROUTINE PRINT_ARR(ARR_VAR,SIZE)
      REAL*4,DIMENSION(:),ALLOCATABLE:: ARR_VAR
      INTEGER SIZE
      INTEGER,SAVE:: J
c-----------------------------------------------------------------------
C$OMP THREADPRIVATE(J)
c-----------------------------------------------------------------------
      DO J=1,SIZE
         PRINT *,'ARR_VAR(',J,')=',ARR_VAR(J)
      ENDDO
      FLUSH(6)
      END SUBROUTINE PRINT_ARR
      END MODULE PRINTER
My compile and run commands:
gfortran -fopenmp -O0 -g -fcheck=all -fbacktrace -c module1.f work.F main.F
gfortran -fopenmp *.o -o a.x
./a.x
My output:
MAIN: "RES" IS ALLOCATED = T
MAIN: "ARG_1" IS ALLOCATED = T
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
MAIN: "ARG_1" IS ALLOCATED = T
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
At line 21 of file work.F
Fortran runtime error: Index '4' of dimension 1 of array 'arg_arr_3' above upper bound of 2
Error termination. Backtrace:
#0 0x7f1e90ed3ad0 in ???
#1 0x7f1e90ed2c35 in ???
#2 0x7f1e90c8051f in ???
at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3 0x55d38d47e43d in master.0.work1
at .../work.F:21
#4 0x55d38d47e04f in work3_
at .../work.F:20
#5 0x55d38d47dae6 in MAIN__._omp_fn.0
at .../main.F:44
#6 0x7f1e90e7aa15 in ???
#7 0x55d38d47d45b in tester
at .../main.F:18
#8 0x55d38d47d58d in main
at .../main.F:2
Segmentation fault (core dumped)
I use gfortran: gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
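A:
1 upvote
PierU
11/10/2023
#1
This is not a complete solution, but it is too much to detail in the comments.
- First, you have to remove the SAVE statement in WORK1(); it is a potential killer for multithreading.
- Once that is done, there is no longer any need to make the loop index I threadprivate.
- You don't need the allocatable attribute on the ARG_ARR_3 argument (and it would not work anyway, unless you put the routine in a module):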
      SUBROUTINE WORK1(ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
      INTEGER DIM_1, DIM_2,I,J
      REAL*4 ARG_ARR_2(DIM_2)
      REAL*4 ARG_ARR_1(DIM_1)
      REAL*4 ARG1, ARG2
      REAL*4 ARG_ARR_3(*)
      DO I=1,DIM_1
         ARG_ARR_1(I)= 1.0
      ENDDO
      RETURN
      ENTRY WORK2 (ARG_ARR_1,DIM_1,ARG_ARR_2,DIM_2)
      DO I=1,DIM_2
         ARG_ARR_2(I)= 2.0
      ENDDO
      RETURN
      ENTRY WORK3 (ARG_ARR_3,J,ARG1,ARG2)
      ARG_ARR_3(J)= ARG1+ARG2
      RETURN
      END SUBROUTINE WORK1
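To illustrate the remark about putting the routine in a module: an allocatable or assumed-shape dummy argument needs an explicit interface, and a module procedure provides one automatically. The following is only a minimal sketch, not the code from the question. It is written in free-form source (for example a .f90 file), and the names work_mod, add_into and demo are hypothetical.

```fortran
! Sketch only: work_mod, add_into and demo are hypothetical names.
module work_mod
  implicit none
contains
  ! res is an assumed-shape dummy, so the caller's bounds travel with it.
  ! This is legal because the module gives callers an explicit interface.
  subroutine add_into(res, j, a, b)
    real, intent(inout) :: res(:)
    integer, intent(in) :: j
    real, intent(in)    :: a, b
    res(j) = a + b
  end subroutine add_into
end module work_mod

program demo
  use work_mod
  implicit none
  integer, parameter :: n = 5
  real :: res(n), a(n), b(n)
  integer :: i

  a = 1.0
  b = 2.0

  ! res, a and b are shared; the loop index i is private to each thread.
  !$omp parallel do shared(res, a, b) private(i)
  do i = 1, n
    call add_into(res, i, a(i), b(i))
  end do
  !$omp end parallel do

  print *, res
end program demo
```

Built with gfortran -fopenmp -fcheck=all, every element of res should come out as 3.0. Each iteration writes a different element, so the shared array needs no extra synchronization.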
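Also, in your main program, THREADPRIVATE(ARG_1, ARG_2) is overkill: threadprivate is meant to give you private variables that persist between parallel regions, and I don't see any need for that here. Keep it simple and declare instead:
C$OMP PARALLEL PRIVATE(I,ARG_1,ARG_2) SHARED(RES) NUM_THREADS(2)
Finally, DEALLOCATE(ARG_1,ARG_2) should be placed before the end of the parallel region.
Try that... But this is definitely a bad design (ENTRY is a relic of the past, as is fixed-form source).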
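The suggestion about PRIVATE work arrays and the placement of DEALLOCATE can be sketched as follows. This is a minimal illustration under assumptions, not a drop-in replacement for the question's code: it uses free-form source, the program name private_demo is hypothetical, and the work arrays are filled inline rather than through WORK1/WORK2.

```fortran
! Sketch only: private_demo is a hypothetical name.
program private_demo
  implicit none
  integer, parameter :: n = 5
  real, allocatable :: arg_1(:), arg_2(:)   ! per-thread work arrays
  real, allocatable :: res(:)               ! shared result array
  integer :: i

  allocate(res(n))

  !$omp parallel private(i, arg_1, arg_2) shared(res) num_threads(2)
  ! The originals are unallocated on entry, so each thread starts with
  ! its own unallocated private copy and allocates it here.
  allocate(arg_1(n), arg_2(n))
  arg_1 = 1.0
  arg_2 = 2.0

  !$omp do
  do i = 1, n
    res(i) = arg_1(i) + arg_2(i)
  end do
  !$omp end do

  ! Free the private copies while still inside the parallel region.
  deallocate(arg_1, arg_2)
  !$omp end parallel

  print *, res
  deallocate(res)
end program private_demo
```

Because the private copies exist only inside the region, both the ALLOCATE and the DEALLOCATE have to sit between the PARALLEL and END PARALLEL directives. Outside the region the original arrays were never allocated.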
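Comments
0 upvotes
DJNZ
11/22/2023
Thanks for the note about placing deallocate before the end of the parallel region.
0 upvotes
DJNZ
11/22/2023
#2
Thank you very much for your answers and comments, @PierU and @IanBush! I apologize for the long delay in replying!
I want to add a few words about the keywords threadprivate and save:
- The code I am working with has ~100k lines in about 70+ files, with subroutines that are hundreds or thousands of lines long. Almost all of these subroutines contain a standalone SAVE statement with no variable list, which therefore applies to every local variable in scope. My example simulates this environment, and for critical variables such as loop counters and array iterators I am forced to use the threadprivate statement.
- However, you are absolutely right that this construct (SAVE plus a variable list) should be used as little as possible. In my target code these subroutines are called inside large OpenMP loops, and some of their local variables are explicitly marked THREADPRIVATE because it is impossible to mark them as private.
- All of the above also applies to the use of entry: it is part of the legacy environment. I would like to move all the code into modules and avoid a lot of problems, but I don't have time for that right now.
So I was able to work out my own solution, presented below. The key is describing the shared variable correctly: the array Y must be shared (as a module variable).
main.F: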
      PROGRAM TESTER
      USE OMP_LIB
      USE PRINTER
      REAL*4,DIMENSION(:),ALLOCATABLE, SAVE :: ARG_1, ARG_2
      REAL*4,DIMENSION(:),ALLOCATABLE:: RES
C=======================================================================
C$OMP THREADPRIVATE(ARG_1, ARG_2)
C=======================================================================
      ALLOCATE(RES(N))
      ALLOCATE(Y(N))
      PRINT *,'MAIN: "RES" IS ALLOCATED = ',
     >         ALLOCATED(RES)
c-----------------------------------------------------------------------
C$OMP PARALLEL PRIVATE(I) SHARED(Y) NUM_THREADS(2)
c-----------------------------------------------------------------------
      ALLOCATE(ARG_1(N))
      PRINT *,'MAIN: "ARG_1" IS ALLOCATED = ',
     >         ALLOCATED(ARG_1)
      ALLOCATE(ARG_2(N))
      PRINT *,'MAIN: "ARG_2" IS ALLOCATED = ',
     >         ALLOCATED(ARG_2)
C     Initialize working arrays:
      CALL WORK1(ARG_1,N, ARG_2,N)
      CALL WORK2(ARG_1,N, ARG_2,N)
C     Step 1: Print working arrays:
      CALL PRINT_ARR(ARG_1,N)
      CALL PRINT_ARR(ARG_2,N)
      PRINT *,'===================================='
      FLUSH(6)
C     Step 2: Parallel Loop:
c-----------------------------------------------------------------------
C$OMP DO
      DO I=1,N
c        RES(I)=ARG_1(I) + ARG_2(I)
         CALL WORK3(I,ARG_1(I),ARG_2(I))
      ENDDO
C$OMP END DO
      CALL PRINT_ARR(Y,N)
      DEALLOCATE(ARG_1,ARG_2)
c-----------------------------------------------------------------------
C$OMP END PARALLEL
c-----------------------------------------------------------------------
      DEALLOCATE(RES)
      END PROGRAM TESTER
work.F:
      SUBROUTINE WORK1(ARG1_W1,DIM_1,ARG2_W2,DIM_2)
      USE PRINTER
c------ Input arguments: -----------------------------------------------
      INTEGER DIM_1, DIM_2, J
      REAL*4 ARG2_W2(DIM_2)
      REAL*4 ARG1_W1(DIM_1)
c     dummy arguments for WORK3:
      REAL*4 ARG1_W3, ARG2_W3
c------ Locals: --------------------------------------------------------
      INTEGER I
      SAVE I
c------ OpenMP spells: -------------------------------------------------
c$OMP THREADPRIVATE (I)
c-----------------------------------------------------------------------
      DO I=1,DIM_1
         ARG1_W1(I) = 1.0
      ENDDO
      RETURN
      ENTRY WORK2 (ARG1_W1,DIM_1,ARG2_W2,DIM_2)
      DO I=1,DIM_2
         ARG2_W2(I) = 2.0
      ENDDO
      RETURN
      ENTRY WORK3 (J,ARG1_W3,ARG2_W3)
      Y(J)= ARG1_W3 + ARG2_W3
      RETURN
      END SUBROUTINE WORK1
module1.f:
      MODULE PRINTER
      INTEGER, PARAMETER:: N = 5
c     NB: array Y is shared!
      REAL*4,DIMENSION(:),ALLOCATABLE::Y
      CONTAINS
      SUBROUTINE PRINT_ARR(ARR_VAR,SIZE)
      REAL*4,DIMENSION(:),ALLOCATABLE:: ARR_VAR
      INTEGER SIZE
      INTEGER,SAVE:: J
c------ OpenMP spells: -------------------------------------------------
c$OMP THREADPRIVATE(J)
c-----------------------------------------------------------------------
      DO J=1,SIZE
         PRINT *,'ARR_VAR(',J,')=',ARR_VAR(J)
      ENDDO
      FLUSH(6)
      END SUBROUTINE PRINT_ARR
      END MODULE PRINTER
My compile and run commands:
sudo rm -R -f {*.o,*.x,*.mod}
gfortran -fopenmp -O0 -g -fcheck=all -fbacktrace -c module1.f work.F main.F
gfortran -fopenmp *.o -o a.x
./a.x
My output (because several threads print at once, the output may be somewhat interleaved):
MAIN: "RES" IS ALLOCATED = T
MAIN: "ARG_1" IS ALLOCATED = T
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
MAIN: "ARG_1" IS ALLOCATED = T
MAIN: "ARG_2" IS ALLOCATED = T
ARR_VAR( 1 )= 1.00000000
ARR_VAR( 2 )= 1.00000000
ARR_VAR( 3 )= 1.00000000
ARR_VAR( 4 )= 1.00000000
ARR_VAR( 5 )= 1.00000000
ARR_VAR( 1 )= 2.00000000
ARR_VAR( 2 )= 2.00000000
ARR_VAR( 3 )= 2.00000000
ARR_VAR( 4 )= 2.00000000
ARR_VAR( 5 )= 2.00000000
====================================
ARR_VAR( 1 )= 3.00000000
ARR_VAR( 2 )= 3.00000000
ARR_VAR( 3 )= 3.00000000
ARR_VAR( 4 )= 3.00000000
ARR_VAR( 5 )= 3.00000000
ARR_VAR( 1 )= 3.00000000
ARR_VAR( 2 )= 3.00000000
ARR_VAR( 3 )= 3.00000000
ARR_VAR( 4 )= 3.00000000
ARR_VAR( 5 )= 3.00000000