CUDA external class linkage and unresolved extern function in ptxas file

Question:

I am working with CUDA and have created an int2_ class to handle complex integer numbers.

The class declaration, in the file ComplexTypes.h, is as follows:

namespace LibraryNameSpace
{
    class int2_ {

        public:
            int x;
            int y;

            // Constructors
            __host__ __device__ int2_(const int,const int);
            __host__ __device__ int2_();
            // etc.

            // Equalities with other types      
            __host__ __device__ const int2_& operator=(const int);
            __host__ __device__ const int2_& operator=(const float);
            // etc.

    };
}

The class implementation, in the file ComplexTypes.cpp, is as follows:

#include "ComplexTypes.h"

__host__ __device__         LibraryNameSpace::int2_::int2_(const int x_,const int y_)           { x=x_; y=y_;}
__host__ __device__         LibraryNameSpace::int2_::int2_() {}
// etc.

__host__ __device__ const   LibraryNameSpace::int2_& LibraryNameSpace::int2_::operator=(const int a)                        { x = a;            y = 0.;             return *this; }
__host__ __device__ const   LibraryNameSpace::int2_& LibraryNameSpace::int2_::operator=(const float a)                      { x = (int)a;       y = 0.;             return *this; }
// etc.

Everything works fine. In main (which includes ComplexTypes.h), I can work with int2_ numbers.

In the file CudaMatrix.cu, I now include ComplexTypes.h, then define and explicitly instantiate the following __global__ function:

template <class T1, class T2>
__global__ void evaluation_matrix(T1* data_, T2* ob, int NumElements)
{
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if(i < NumElements) data_[i] = ob[i];
}

template __global__ void evaluation_matrix(LibraryNameSpace::int2_*,int*,int);

The situation in the CudaMatrix.cu file seems to be symmetric to that of the main function. Nevertheless, the compiler complains:

Error   19  error : Unresolved extern function '_ZN16LibraryNameSpace5int2_aSEi'    C:\Users\Documents\Project\Test\Testing_Files\ptxas simpleTest

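Please consider that:

Before I moved the implementation to separate files, everything worked correctly when both the declarations and the implementations were in the main file.
The problematic instruction is data_[i] = ob[i].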

Does anyone have an idea of what is going on?

Tags: c++, class, cuda, unresolved-external

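Answer:

The file ComplexTypes.cpp must be renamed to ComplexTypes.cu so that nvcc can intercept the CUDA keywords __device__ and __host__. Talonmies pointed this out in his comment. Actually, before posting I had already changed the file extension from .cpp to .cu, but the compiler kept complaining with the same error, so I had naively reverted the change.
In Visual Studio 2010, one also has to use View -> Property Pages; Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true). This is necessary for separate compilation. Indeed, the NVIDIA CUDA Compiler Driver NVCC documentation says: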

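CUDA works by embedding device code into host objects. In whole program compilation, it embeds executable device code into the host object. In separate compilation, we embed relocatable device code into the host object, and run the device linker (nvlink) to link all the device code together. The output of nvlink is then linked together with all the host objects by the host linker to form the final executable. The generation of relocatable vs executable device code is controlled by the --relocatable-device-code={true,false} option, which can be shortened to -rdc={true,false}.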
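For illustration, here is a minimal sketch of the setup described above. The symbol in the error message, _ZN16LibraryNameSpace5int2_aSEi, is the mangled name of LibraryNameSpace::int2_::operator=(int), which is the function the kernel calls in data_[i] = ob[i]. The listing is kept in a single file so that it can be built on its own. The comments mark which parts would live in ComplexTypes.cu and CudaMatrix.cu, and show command-line builds that should correspond to the Visual Studio setting. The object and executable names in those commands, and the host helper assignment_runs_on_device, are assumptions made for this example and are not part of the original code.

```cuda
// Sketch only. As a single file this builds with:  nvcc -c sketch.cu
// When split across files as in the question, each .cu needs relocatable
// device code so that nvlink can resolve the call across translation units:
//   nvcc -rdc=true -c ComplexTypes.cu -o ComplexTypes.o
//   nvcc -rdc=true -c CudaMatrix.cu   -o CudaMatrix.o
//   nvcc ComplexTypes.o CudaMatrix.o <other objects> -o app   (names are hypothetical)
#include <cuda_runtime.h>
#include <vector>

// ---- would live in ComplexTypes.h ----
namespace LibraryNameSpace
{
    class int2_ {
        public:
            int x;
            int y;
            __host__ __device__ int2_();
            __host__ __device__ int2_(const int, const int);
            __host__ __device__ const int2_& operator=(const int);
    };
}

// ---- would live in ComplexTypes.cu (not .cpp, so nvcc emits device code) ----
__host__ __device__ LibraryNameSpace::int2_::int2_() {}
__host__ __device__ LibraryNameSpace::int2_::int2_(const int x_, const int y_) { x = x_; y = y_; }
__host__ __device__ const LibraryNameSpace::int2_&
LibraryNameSpace::int2_::operator=(const int a) { x = a; y = 0; return *this; }

// ---- would live in CudaMatrix.cu ----
template <class T1, class T2>
__global__ void evaluation_matrix(T1* data_, T2* ob, int NumElements)
{
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < NumElements) data_[i] = ob[i];   // calls int2_::operator=(int) on the device
}

template __global__ void evaluation_matrix(LibraryNameSpace::int2_*, int*, int);

// Hypothetical host helper: fills an int array with 1..n, runs the kernel,
// and checks that every int2_ element received (value, 0).
bool assignment_runs_on_device(int n)
{
    using LibraryNameSpace::int2_;
    std::vector<int> h_in(n);
    for (int i = 0; i < n; ++i) h_in[i] = i + 1;

    int*   d_in  = nullptr;
    int2_* d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(int2_)) != cudaSuccess) { cudaFree(d_in); return false; }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 128;
    const int blocks  = (n + threads - 1) / threads;
    evaluation_matrix<<<blocks, threads>>>(d_out, d_in, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    std::vector<int2_> h_out(n);
    if (ok) ok = (cudaMemcpy(h_out.data(), d_out, n * sizeof(int2_),
                             cudaMemcpyDeviceToHost) == cudaSuccess);
    for (int i = 0; ok && i < n; ++i)
        ok = (h_out[i].x == i + 1 && h_out[i].y == 0);

    cudaFree(d_in);
    cudaFree(d_out);
    return ok;
}
```

Calling assignment_runs_on_device(1000) from a host main function should return true on a machine with a working CUDA device. In the single-file form the member definitions are visible to the kernel, so -rdc=true is not needed. It becomes necessary once the definitions move into their own .cu file.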
