Asked by: sara | Asked: 11/17/2023 | Last edited by: sara | Updated: 11/18/2023 | Views: 55
c++ pthreads causing longer wall times than serial run
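Q:
I am currently writing/running a program on Linux that essentially bounces objects around in a domain. This process can be done for many objects (~10^10 or more), and I am trying to figure out how to speed it up (it currently takes 3+ hours in serial). For each object, I initialize it, start its movement, and check whether it has moved into a new domain grid cell. It is not very complicated computationally: I do a lot of array lookups, and then I write the distance the object traveled in the previous cell to an array. I am using pthreads to run on >50 tasks, and I am seeing a slowdown when running the code (it takes 4+ hours).
The code itself is a bit too long to paste here, but the important steps are: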
// global variables
unsigned int Nthreads, Ncells; // these are input to the code
std::vector<std::vector<double>> distVec; // one accumulation array per thread

// pthread entry point: forwards to the member function on the owning object
static void * start_moveObject(void *args)
{
    return ((cellargs *)args)->context->moveObject(args);
}

void * moveObject(void * args)
{
    unsigned int startv = ((cellargs *)args)->st;
    unsigned int endv = ((cellargs *)args)->end;
    unsigned int thID = ((cellargs *)args)->thID;
    unsigned int dim = ((cellargs *)args)->context->dim;
    bool ncell = false;
    double dist = 0.;
    unsigned int currCell, prevCell;
    for(unsigned int index = startv; index < endv; index++)
    {
        // do the things here that move the object and calculate the distance (dist) traveled in currCell
        // check if it is in new cell
        if(currCell != prevCell)
        {
            distVec[thID][prevCell] += dist;
            dist = 0;
            prevCell = currCell;
        }
    }
    return NULL;
}

int main()
{
    pthread_t * threads = new pthread_t[Nthreads];
    cellargs * args = new cellargs[Nthreads];
    unsigned int st, end, prev;
    distVec.resize(Nthreads);
    prev = 0;
    for(unsigned int ith = 0; ith < Nthreads; ith++)
    {
        st = prev;
        end = st + Ncells / Nthreads; // integer division already rounds down
        if(ith == Nthreads - 1)
            end = Ncells; // last thread takes the remainder
        prev = end; // ranges are half-open [st, end), matching index < endv
        distVec[ith].resize(Ncells, 0.);
        args[ith].st = st;
        args[ith].end = end;
        args[ith].thID = ith;
        if(pthread_create(&threads[ith], NULL, start_moveObject, (void*)&args[ith]) != 0)
            return 1; // thread creation failed
    }
    for(unsigned int ith = 0; ith < Nthreads; ith++)
        pthread_join(threads[ith], NULL);
}
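For reference, the index split done in main can be isolated into a small helper. The following is only a sketch under assumptions: the name `splitRange` and its return type are hypothetical and not part of my code. It divides [0, n) into contiguous half-open ranges so that a loop of the form `index < endv` visits every index exactly once, with the remainder spread over the first few ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, n) into `parts` contiguous half-open ranges
// [first, second). The first n % parts ranges receive one extra element, so
// range sizes differ by at most one.
std::vector<std::pair<std::size_t, std::size_t>> splitRange(std::size_t n, std::size_t parts)
{
    assert(parts > 0); // at least one range is required
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    ranges.reserve(parts);
    const std::size_t base = n / parts;
    const std::size_t extra = n % parts;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        ranges.emplace_back(begin, begin + len);
        begin += len;
    }
    return ranges;
}
```

Each pair could then be copied into the `st` and `end` fields of the per-thread argument struct before the thread is created.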
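Is there another way to write this code so it runs in parallel, or, if it cannot be optimized, is there another library better suited to this?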
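To make the question concrete, here is the same pattern written with `std::thread` from the C++ standard library instead of raw pthreads. It is a minimal sketch, not a claim that it removes the slowdown: `Step`, `accumulateDistances`, and the `moveOne` callback are hypothetical stand-ins for the real per-object work, and `moveOne` is assumed to be safe to call from several threads at once. Each thread accumulates into its own array and the arrays are summed after all threads have joined.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical result of moving one object: the cell it travelled in and
// the distance covered there.
struct Step { std::size_t cell; double dist; };

// Hypothetical driver: runs moveOne(i) for i in [0, nObjects) on nThreads
// threads, accumulating into per-thread arrays, then sums them.
template <typename MoveFn>
std::vector<double> accumulateDistances(std::size_t nObjects, std::size_t nCells,
                                        unsigned nThreads, MoveFn moveOne)
{
    if (nThreads == 0) nThreads = 1;
    // One private accumulation array per thread, so no locking is needed.
    std::vector<std::vector<double>> partial(nThreads, std::vector<double>(nCells, 0.0));
    std::vector<std::thread> workers;
    workers.reserve(nThreads);
    for (unsigned t = 0; t < nThreads; ++t) {
        // Half-open range [begin, end) of the items handled by thread t.
        const std::size_t begin = nObjects * t / nThreads;
        const std::size_t end = nObjects * (t + 1) / nThreads;
        workers.emplace_back([&partial, &moveOne, nCells, t, begin, end] {
            std::vector<double>& local = partial[t];
            for (std::size_t i = begin; i < end; ++i) {
                const Step s = moveOne(i);
                assert(s.cell < nCells); // guard against out-of-range cells
                local[s.cell] += s.dist;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    // Reduce the per-thread arrays into a single result.
    std::vector<double> total(nCells, 0.0);
    for (const std::vector<double>& p : partial)
        for (std::size_t c = 0; c < nCells; ++c) total[c] += p[c];
    return total;
}
```

A call might look like `accumulateDistances(Ncells, Ncells, Nthreads, fn)`, where `fn` wraps the existing move-and-measure logic. Timing it with 1 thread and then with N threads would show whether the per-thread work itself scales.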
A: No answers yet