提问人:Mico Mrkaic 提问时间:11/17/2023 最后编辑:Mico Mrkaic 更新时间:11/17/2023 访问量:64
Eigen c++ 线性代数比 numpy 慢得多——这是怎么回事?
Eigen c++ linear algebra much slower than numpy -- what is going on?
问:
我编译了以下 C++ 程序,并使用优化 O3 使用 GCC 对其进行编译。得到的矩阵逆比等效的简单 numpy 代码 b=np.linalg.inv(a) 慢约 3-4 倍。我的印象是 Eigen 速度超快,所以我一定做错了什么,但我不知道是什么。
有人可以告诉我我可以做些什么来改进代码吗?
c++ 代码是
#include <iostream>
#include <iomanip>
#include <chrono>
#include <Eigen/LU>
#include <Eigen/Dense>
using Eigen::MatrixXd;
int main()
{
const int max_rep = 5;
const int N=1000;
MatrixXd inv_m(N,N);
std::cout << "**** Eigen Library -- matrix inversion speed test ****" << std::endl;
std::cout << "Matrix size = " << N << std::endl;
for(int i=1;i<=max_rep;i++) {
MatrixXd m = MatrixXd::Random(N,N);
auto start = std::chrono::system_clock::now();
inv_m = m.inverse();
auto stop = std::chrono::system_clock::now();
std::cout << "Test no." << std::setw(3) << i;
std::cout << ", CPU time = "
<< std::chrono::duration<double>{stop - start}.count()
<< " seconds." << std::endl;
}
return 0;
}
Python 代码是
import numpy as np
import time
for n in range(500, 5001, 500):
a=np.random.rand(n,n)
t=time.time();
b=np.linalg.inv(a)
t=time.time()-t;
print(f'Size {n=:5d}, {t=:.2f}')
我对 numpy 和 Eigen 代码进行了基准测试,并期望 Eigen 更快,但结果恰恰相反。(这也适用于计算特征值,而不仅仅是逆的特殊情况)。
numpy 配置是
In [26]: np.show_config()
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
NOT AVAILABLE
accelerate_info:
NOT AVAILABLE
atlas_3_10_blas_threads_info:
NOT AVAILABLE
atlas_3_10_blas_info:
NOT AVAILABLE
atlas_blas_threads_info:
NOT AVAILABLE
atlas_blas_info:
NOT AVAILABLE
blas_info:
libraries = ['blas', 'blas']
library_dirs = ['/usr/lib/x86_64-linux-gnu']
include_dirs = ['/usr/local/include', '/usr/include']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
libraries = ['blas', 'blas']
library_dirs = ['/usr/lib/x86_64-linux-gnu']
include_dirs = ['/usr/local/include', '/usr/include']
language = c
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
NOT AVAILABLE
openblas_clapack_info:
NOT AVAILABLE
flame_info:
NOT AVAILABLE
atlas_3_10_threads_info:
NOT AVAILABLE
atlas_3_10_info:
NOT AVAILABLE
atlas_threads_info:
NOT AVAILABLE
atlas_info:
NOT AVAILABLE
lapack_info:
libraries = ['lapack', 'lapack']
library_dirs = ['/usr/lib/x86_64-linux-gnu']
language = f77
lapack_opt_info:
libraries = ['lapack', 'lapack', 'blas', 'blas']
library_dirs = ['/usr/lib/x86_64-linux-gnu']
language = c
define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
include_dirs = ['/usr/local/include', '/usr/include']
Supported SIMD extensions in this NumPy install:
baseline = SSE,SSE2,SSE3
found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CLX,AVX512_CNL,AVX512_ICL
答: 暂无答案
评论
np.show_config()
的输出,那将会很有趣-march=native -fopenmp