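Q:

Most C++ users who learned C first prefer to use the printf / scanf family of functions, even when they are coding in C++.

Although I admit that I find the IOStreams interface much better (especially compared with the POSIX-like format strings and localization), it seems that an overwhelming concern is performance.

Taking a look at this question:

How can I speed up line by line reading of a file

It seems that the best answer is to use fscanf, and that C++ ifstream is consistently 2-3 times slower.

I thought it would be great if we could compile a repository of "tips" for improving IOStreams performance: what works and what does not.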

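Points to consider

buffering (rdbuf()->pubsetbuf(buffer, size))
synchronization (std::ios_base::sync_with_stdio)
locale handling (could we use a trimmed-down locale, or remove it altogether?)

Of course, other approaches are welcome.

Note: a "new" implementation by Dietmar Kuhl was mentioned, but I was unable to find many details about it. Previous references seem to be dead links.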

Tags: c++ optimization iostream c++-faq c++-standard-library

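If you get better performance by using the file buffer directly, that means the parsing code (which is what you use std::istream for in the first place) is the performance drain, since that is the layer wrapping the buffer. Unfortunately, widespread IO streams implementations use printf()/scanf() behind the scenes, which is certainly slower than using the C std lib IO directly. (See also my comment to @Konrad on that question.)

17 upvotes MaHuJa 10/23/2011

"C code except for using cout and iostream": we call that "C with iostreams", and it is synonymous with C++ in many university courses.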

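53 upvotes Matthieu M. 3/2/2011 #2

Here is what I have gathered so far:

Buffering:

If the buffer is very small by default, increasing the buffer size can definitely improve performance:

it reduces the number of HDD hits
it reduces the number of system calls

The buffer can be set by accessing the underlying streambuf implementation.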

char Buffer[N];

std::ifstream file("file.txt");

file.rdbuf()->pubsetbuf(Buffer, N);
// the pointer returned by rdbuf is guaranteed
// to be non-null after successful constructor

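Warning courtesy of @iavr: according to cppreference, it is best to call pubsetbuf before opening the file. Otherwise, the various standard library implementations behave differently.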
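As a minimal sketch of the ordering described in that warning, the buffer can be requested on a default-constructed stream and the file opened afterwards. The helper name count_words and its parameters are made up for illustration; whether the buffer is actually honored remains up to the implementation.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not from the original answer): counts the
// whitespace-separated words in a file, requesting a caller-supplied
// buffer BEFORE the file is opened.
std::size_t count_words(const char* path, char* buffer, std::streamsize size)
{
    std::ifstream file;                     // constructed, but not yet open
    file.rdbuf()->pubsetbuf(buffer, size);  // request the buffer first
    file.open(path);                        // then open the file

    std::size_t count = 0;
    std::string word;
    while (file >> word) { ++count; }       // loop is skipped if open failed
    return count;
}
```

The buffer has to outlive the stream, which is why the caller owns it here; the stream is destroyed when the function returns.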

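Locale handling:

Locales can perform character conversion, filtering, and more clever tricks where numbers or dates are involved. They go through a complex system of dynamic dispatch and virtual calls, so removing them can help trim down the penalty.

The default C locale is meant to perform no conversion and to be uniform across machines. It is a good default to use.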
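One way to pin a single stream to the C locale, whatever the global locale happens to be, is to imbue it with std::locale::classic(). The sketch below assumes a made-up helper, parse_double_classic, and is only meant to show the call; it makes no claim about how much time this saves.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parses a double using the classic "C" locale,
// independent of whatever global locale the program has installed.
double parse_double_classic(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());  // use the "C" locale for this stream
    double value = 0.0;
    in >> value;                       // value stays 0.0 if parsing fails
    return value;
}
```

The same imbue call works on a std::ifstream or on std::cin.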

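Synchronization:

I could not see any performance improvement using this facility.

The global setting (a static member of std::ios_base) can be accessed through the sync_with_stdio static function.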
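For reference, a small sketch of how the setting is usually toggled. The wrapper name disable_stdio_sync is an assumption; the call is meant to happen before any I/O on the standard streams, since the effect of calling it later is implementation-defined.

```cpp
#include <ios>
#include <iostream>

// Hypothetical helper: turns off synchronization between the C++ standard
// streams and C stdio, and returns the previous setting so that the caller
// can restore it later with std::ios_base::sync_with_stdio(previous).
bool disable_stdio_sync()
{
    return std::ios_base::sync_with_stdio(false);
}
```

After this call, mixing printf/scanf with std::cout/std::cin in the same program may interleave output in unexpected ways.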

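Measurements:

Playing with this, I toyed with a simple program, compiled with gcc 3.4.2 using -O2 on SUSE 10p3.

C  : 7.76532e+06
C++: 1.0874e+07

This represents a slowdown of about 20%... for the default code. Indeed, tampering with the buffer (in either C or C++) or with the synchronization parameter (C++) did not yield any improvement.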

Results by others:

@Irfy on g++ 4.7.2-2ubuntu1, -O3, virtualized Ubuntu 11.10, 3.5.0-25-generic, x86_64, enough ram/cpu, 196MB file built from several "find / >> largefile.txt" runs

C  : 634572
C++: 473222

C++ 25% faster

@Matteo Italia on g++ 4.4.5, -O3, Ubuntu Linux 10.10 x86_64, with a random 180 MB file

C  : 910390
C++: 776016

C++ 17% faster

@Bogatyr on g++ i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664), mac mini, 4GB ram, idle except for this test, with a 168MB datafile

C  : 4.34151e+06
C++: 9.14476e+06

C++ 111% slower

@Asu on clang++ 3.8.0-2ubuntu4, Kubuntu 16.04, Linux 4.8-rc3, 8GB ram, i5 Haswell, Crucial SSD, 88MB datafile (tar.xz archive)

C  : 270895
C++: 162799

C++ 66% faster

So the answer is: it is a quality of implementation issue, and it really depends on the platform :/

The full code is here for those interested in benchmarking:

#include <fstream>
#include <iostream>
#include <iomanip>

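#include <clocale>   // setlocale
#include <cmath>
#include <cstdio>
#include <cstdlib>   // atoi
#include <cstring>   // strcmp
#include <string>    // std::string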

#include <sys/time.h>

template <typename Func>
double benchmark(Func f, size_t iterations)
{
  f();

  timeval a, b;
  gettimeofday(&a, 0);
  for (; iterations --> 0;)
  {
    f();
  }
  gettimeofday(&b, 0);
  return (b.tv_sec * (unsigned int)1e6 + b.tv_usec) -
         (a.tv_sec * (unsigned int)1e6 + a.tv_usec);
}


struct CRead
{
  CRead(char const* filename): _filename(filename) {}

  void operator()() {
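    FILE* file = fopen(_filename, "r");
    if (!file) { return; }  // nothing to read if the file cannot be opened

    int count = 0;
    // the field width keeps fscanf from overflowing _buffer (1024 bytes)
    while ( fscanf(file,"%1023s", _buffer) == 1 ) { ++count; }

    fclose(file);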
  }

  char const* _filename;
  char _buffer[1024];
};

struct CppRead
{
  CppRead(char const* filename): _filename(filename), _buffer() {}

  enum { BufferSize = 16184 };

  void operator()() {
    std::ifstream file(_filename, std::ifstream::in);

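    // comment to remove extended buffer
    // (called after the file is opened here; see the warning above about
    // pubsetbuf behaving differently across implementations in that case)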
    file.rdbuf()->pubsetbuf(_buffer, BufferSize);

    int count = 0;
    std::string s;
    while ( file >> s ) { ++count; }
  }

  char const* _filename;
  char _buffer[BufferSize];
};


int main(int argc, char* argv[])
{
  size_t iterations = 1;
  if (argc > 1) { iterations = atoi(argv[1]); }

  char const* oldLocale = setlocale(LC_ALL,"C");
  if (strcmp(oldLocale, "C") != 0) {
    std::cout << "Replaced old locale '" << oldLocale << "' by 'C'\n";
  }

  char const* filename = "largefile.txt";

  CRead cread(filename);
  CppRead cppread(filename);

  // comment to use the default setting
  bool oldSyncSetting = std::ios_base::sync_with_stdio(false);

  double ctime = benchmark(cread, iterations);
  double cpptime = benchmark(cppread, iterations);

  // comment if oldSyncSetting's declaration is commented
  std::ios_base::sync_with_stdio(oldSyncSetting);

  std::cout << "C  : " << ctime << "\n"
               "C++: " << cpptime << "\n";

  return 0;
}

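The question that led to this question was not about preference; it was about concrete measurements of "typical" input handling. Your benchmark is not interesting, because it does not match a real-world case. Instead, why don't you write a shell script that runs your program with 1 iteration over a set of large files, and measure the aggregate wall-clock time?

4 upvotes Konrad Rudolph 3/2/2011

@Bogatyr If anything, gettimeofday is the better choice compared with time. Furthermore, this is a good approximation of a real-world case: reading data. After all, we do not want to measure anything else, only the reading of the data. So this benchmark is fine. Putting both codes in the same executable is also completely fine. Just make sure to run enough benchmark iterations to offset the warm-up slowdown (or run it once at the start, which is what Matthieu did). This benchmark is much better than your suggested "improvement".

1 upvote Irfy 3/20/2013

I just tested on 3 linux machines, compiling with g++ versions from 4.5.4 to 4.7.2, and the differences ranged from C++ being 25% faster to C++ being 40% faster.

21 upvotes user4385532 2/11/2016 #3

Two more improvements: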

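Issue `std::cin.tie(nullptr);` before heavy input/output.

Quoting http://en.cppreference.com/w/cpp/io/cin:

Once std::cin is constructed, std::cin.tie() returns &std::cout, and likewise, std::wcin.tie() returns &std::wcout. This means that any formatted input operation on std::cin forces a call to std::cout.flush() if any characters are pending for output.

You can avoid flushing the buffer by untying std::cin from std::cout. This is relevant with multiple mixed calls to std::cin and std::cout. Note that calling `std::cin.tie(nullptr);` makes the program unsuitable for interactive use, since output may be delayed.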
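A small sketch of the untying step, wrapped in a made-up helper (untie_cin) that hands back the previously tied stream so it can be restored, for example before switching back to interactive prompts:

```cpp
#include <iostream>

// Hypothetical helper: unties std::cin from whatever output stream it is
// currently tied to and returns that stream (or a null pointer if none),
// so the caller can restore the tie later with std::cin.tie(previous).
std::ostream* untie_cin()
{
    return std::cin.tie(nullptr);
}
```

With the tie removed, reads from std::cin no longer force std::cout to flush first.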

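Use `'\n'` instead of `std::endl`.

Quoting http://en.cppreference.com/w/cpp/io/manip/endl:

Inserts a newline character into the output sequence os and flushes it as if by calling os.put(os.widen('\n')) followed by os.flush().

You can avoid flushing the buffer by printing '\n' instead of endl.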
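To illustrate, here is a sketch that ends every line with '\n' and flushes once at the end instead of once per line. The function names write_lines and render_lines are invented for this example.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes each line followed by '\n' and flushes the
// stream a single time at the end, rather than after every line.
void write_lines(std::ostream& os, const std::vector<std::string>& lines)
{
    for (const std::string& line : lines) {
        os << line << '\n';  // newline only, no flush
    }
    os.flush();              // one explicit flush for the whole batch
}

// Hypothetical helper: same output, collected into a string.
std::string render_lines(const std::vector<std::string>& lines)
{
    std::ostringstream out;
    write_lines(out, lines);
    return out.str();
}
```

Calling write_lines(std::cout, lines) produces the same text as a loop using std::endl, just without the per-line flush.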

How to get IOStream to perform better?