How can I read and parse CSV files in C++?

Question:

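I need to load and use CSV file data in C++. At this point it can really just be a comma-delimited parser (i.e. don't worry about escaping newlines and commas). The main need is a line-by-line parser that returns a vector for the next line each time the method is called.

I found this article, which looks quite promising: http://www.boost.org/doc/libs/1_35_0/libs/spirit/example/fundamental/list_parser.cpp

I've never used Boost's Spirit, but I am willing to try it, provided there isn't a more straightforward solution I'm overlooking.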

c++ parsing text csv

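12 upvotes chrish 7/14/2009

I have looked at boost::spirit for parsing. It is meant more for parsing grammars than for parsing a simple file format. Someone on my team tried to use it to parse XML and it was a pain to debug. Stay away from boost::spirit if possible.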

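52 upvotes MattyT 7/14/2009

Sorry chrish, but that's terrible advice. Spirit isn't always an appropriate solution, but I have used it successfully in a number of projects and continue to do so. Compared to similar tools (Antlr, Lex/yacc, etc.) it has significant advantages. Now, for parsing CSV it's probably overkill...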

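4 upvotes fho 7/14/2014

@MattyT IMHO spirit is pretty hard to use for a parser combinator library. Having had some (very pleasant) experience with Haskell's (atto)parsec libraries, I expected it (spirit) to work similarly well, but gave up on it after fighting with 600-line compiler errors.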

4 upvotes SomethingSomething 8/24/2014

C CSV parser: sourceforge.net/projects/cccsvparser
C CSV writer: sourceforge.net/projects/cccsvwriter

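0 upvotes let me down slowly 5/18/2021

Why don't you want to escape commas and newlines! Every search links to this question, and I couldn't find a single answer that considers escaping! :|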

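Answers:

9 upvotes anon 7/13/2009 #1

You might want to look at my FOSS project CSVfix (updated link), which is a CSV stream editor written in C++. The CSV parser is no prize, but it does the job, and the whole package may do what you need without you writing any code.

See alib/src/a_csv.cpp for the CSV parser, and csvlib/src/csved_ioman.cpp (IOManager::ReadCSV) for a usage example.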

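0 upvotes neuro 7/13/2009

Looks great... What about the status: beta or production?

0 upvotes 7/13/2009

The status is "in development", as the version numbers suggest. I really need more feedback from users before going to version 1.0. I also have a couple more features I want to add, to do with XML production from CSV.

0 upvotes neuro 7/13/2009

Bookmarking it; I will give it a try the next time I have to deal with those wonderful standard CSV files...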

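361 upvotes Martin York 7/13/2009 #2

If you don't care about escaping commas and newlines,
AND you can't embed commas and newlines in quotes (if you can't escape, then...),
then it's only about three lines of code (OK, 14, but it's only 15 to read the whole file).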

std::vector<std::string> getNextLineAndSplitIntoTokens(std::istream& str)
{
    std::vector<std::string>   result;
    std::string                line;
    std::getline(str,line);

    std::stringstream          lineStream(line);
    std::string                cell;

    while(std::getline(lineStream,cell, ','))
    {
        result.push_back(cell);
    }
    // This checks for a trailing comma with no data after it.
    if (!lineStream && cell.empty())
    {
        // If there was a trailing comma then add an empty element.
        result.push_back("");
    }
    return result;
}

I would just create a class representing a row.
Then stream into that object:

#include <iterator>
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <string>
#include <string_view>   // std::string_view, used by CSVRow::operator[]

class CSVRow
{
    public:
        std::string_view operator[](std::size_t index) const
        {
            return std::string_view(&m_line[m_data[index] + 1], m_data[index + 1] -  (m_data[index] + 1));
        }
        std::size_t size() const
        {
            return m_data.size() - 1;
        }
        void readNextRow(std::istream& str)
        {
            std::getline(str, m_line);

            m_data.clear();
            m_data.emplace_back(-1);
            std::string::size_type pos = 0;
            while((pos = m_line.find(',', pos)) != std::string::npos)
            {
                m_data.emplace_back(pos);
                ++pos;
            }
            // Record the end of the line as the final field boundary;
            // this also covers a trailing comma with no data after it.
            pos   = m_line.size();
            m_data.emplace_back(pos);
        }
    private:
        std::string         m_line;
        std::vector<int>    m_data;
};

std::istream& operator>>(std::istream& str, CSVRow& data)
{
    data.readNextRow(str);
    return str;
}   
int main()
{
    std::ifstream       file("plop.csv");

    CSVRow              row;
    while(file >> row)
    {
        std::cout << "4th Element(" << row[3] << ")\n";
    }
}

But with a little work we could technically create an iterator:

class CSVIterator
{   
    public:
        typedef std::input_iterator_tag     iterator_category;
        typedef CSVRow                      value_type;
        typedef std::size_t                 difference_type;
        typedef CSVRow*                     pointer;
        typedef CSVRow&                     reference;

        CSVIterator(std::istream& str)  :m_str(str.good()?&str:nullptr) { ++(*this); }
        CSVIterator()                   :m_str(nullptr) {}

        // Pre Increment
        CSVIterator& operator++()               {if (m_str) { if (!((*m_str) >> m_row)){m_str = nullptr;}}return *this;}
        // Post increment
        CSVIterator operator++(int)             {CSVIterator    tmp(*this);++(*this);return tmp;}
        CSVRow const& operator*()   const       {return m_row;}
        CSVRow const* operator->()  const       {return &m_row;}

        bool operator==(CSVIterator const& rhs) {return ((this == &rhs) || ((this->m_str == nullptr) && (rhs.m_str == nullptr)));}
        bool operator!=(CSVIterator const& rhs) {return !((*this) == rhs);}
    private:
        std::istream*       m_str;
        CSVRow              m_row;
};


int main()
{
    std::ifstream       file("plop.csv");

    for(CSVIterator loop(file); loop != CSVIterator(); ++loop)
    {
        std::cout << "4th Element(" << (*loop)[3] << ")\n";
    }
}

Now that we are in 2020, let's add a CSVRange object:

class CSVRange
{
    std::istream&   stream;
    public:
        CSVRange(std::istream& str)
            : stream(str)
        {}
        CSVIterator begin() const {return CSVIterator{stream};}
        CSVIterator end()   const {return CSVIterator{};}
};

int main()
{
    std::ifstream       file("plop.csv");

    for(auto& row: CSVRange(file))
    {
        std::cout << "4th Element(" << row[3] << ")\n";
    }
}
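
The rows above still hand back text. As a minimal sketch, assuming C++17 and a hypothetical helper named fieldToInt (it is not part of the answer), one field can be converted to an integer with std::from_chars:

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: convert one CSV field to int.
// Returns std::nullopt if the field is empty, is not a number,
// or has trailing characters after the digits.
std::optional<int> fieldToInt(std::string_view field)
{
    int value = 0;
    const char* first = field.data();
    const char* last  = field.data() + field.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last)
    {
        return std::nullopt;
    }
    return value;
}
```

With the CSVRow class above, fieldToInt(row[3]) would give the fourth column as a number whenever it parses cleanly, and an empty optional otherwise.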

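30 upvotes Martin York 7/14/2009

first() next(). What is this, Java! Only joking.

5 upvotes Martin York 1/13/2012

@DarthVader: A sweeping statement that is silly precisely because it is so broad. If you want to clarify why it is bad, then explain why that badness applies in this context.

12 upvotes Martin York 1/13/2012

@DarthVader: I think it is silly to make broad generalizations. The code above works correctly, so I can't actually see anything wrong with it. But if you have any specific comment on the above, I will definitely consider it in this context. I can see, though, how you could come to that conclusion by mindlessly following a set of generalized rules for C# and applying them to another language.

5 upvotes sk29910 6/28/2013

Also, if you run into weird linking problems with the above code because another library somewhere (Eigen, for example) defines istream::operator>>, add an inline before the operator declaration to fix it.

4 upvotes Maxim Egorushkin 7/3/2014

The parsing part is missing; one still ends up with strings. This is just an over-engineered line splitter.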

55 upvotes dtw 7/14/2009 #3

A solution using Boost Tokenizer:

std::vector<std::string> vec;
using namespace boost;
tokenizer<escaped_list_separator<char> > tk(
   line, escaped_list_separator<char>('\\', ',', '\"'));
for (tokenizer<escaped_list_separator<char> >::iterator i(tk.begin());
   i!=tk.end();++i) 
{
   vec.push_back(*i);
}

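12 upvotes Rolf Kristensen 4/14/2010

The boost tokenizer does not fully support the complete CSV standard, but there are some quick workarounds. See stackoverflow.com/questions/1120140/csv-parser-in-c/...

4 upvotes NPike 4/28/2011

Do you have to have the whole boost library on your machine, or can you just use a subset of its code to do this? 256 MB seems like a lot for CSV parsing.

6 upvotes ildjarn 5/25/2011

@NPike: You can use the bcp utility that comes with boost to extract only the headers you actually need.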

1 upvote ravenspoint 7/14/2009 #4

Excuse me, but this all seems like a great deal of elaborate syntax to hide a few lines of code.

Why not this:

/**

  Read line from a CSV file

  @param[in] fp file pointer to open file
  @param[in] vls reference to vector of strings to hold next line

  */
void readCSV( FILE *fp, std::vector<std::string>& vls )
{
    vls.clear();
    if( ! fp )
        return;
    char buf[10000];
    if( ! fgets( buf, sizeof buf, fp ) )   // use the real buffer size, not 999
        return;
    std::string s = buf;
    int p,q;
    q = -1;
    // loop over columns
    while( 1 ) {
        p = q;
        q = s.find_first_of(",\n",p+1);
        if( q == -1 ) 
            break;
        vls.push_back( s.substr(p+1,q-p-1) );
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    std::vector<std::string> vls;
    FILE * fp = fopen( argv[1], "r" );
    if( ! fp )
        return 1;
    readCSV( fp, vls );
    readCSV( fp, vls );
    readCSV( fp, vls );
    std::cout << "row 3, col 4 is " << vls[3].c_str() << "\n";

    return 0;
}

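0 upvotes Timmmm 11/19/2014

Er, why would there be ",\n" in the string?

0 upvotes Martyn Shutt 6/6/2015

@Timmmm Look up the substr method of the String class and you'll see that it takes multiple characters; \n is the newline character, so it counts as a single character in this instance. It doesn't search for the entire value as a whole. It searches for each individual character, namely a comma or a newline. substr will return the position of the first character it finds, and -1 if it finds neither, which means it has finished reading the line. fp keeps track of the position in the file internally, so each call to readCSV moves it one row at a time.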
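
The call that actually does the searching in readCSV is std::string::find_first_of, which treats its argument as a set of characters rather than as a substring. A small sketch that mirrors the loop in readCSV (the helper name splitOnCommaOrNewline is made up for illustration):

```cpp
#include <string>
#include <vector>

// Illustrative helper: split s at every ',' or '\n'.
// find_first_of(",\n", pos) returns the index of the first character at or
// after pos that is EITHER ',' OR '\n'; it does not look for the
// two-character sequence ",\n".
std::vector<std::string> splitOnCommaOrNewline(const std::string& s)
{
    std::vector<std::string> out;
    std::string::size_type start = 0;
    for (;;)
    {
        std::string::size_type end = s.find_first_of(",\n", start);
        if (end == std::string::npos)
        {
            break; // no further terminator: stop, as readCSV does
        }
        out.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    return out;
}
```

Like readCSV, this sketch drops a final field that is not followed by a comma or newline, so a last line without a trailing newline loses its last column.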

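1 upvote MadH 9/18/2009 #5

You could also take a look at the capabilities of the Qt library.

It has regular expression support, and the QString class has nice methods, e.g. split(), which returns a QStringList: the list of strings obtained by splitting the original string at a provided delimiter. That should suffice for CSV files.

To get a column with a given header name I use the following: c++ inheritance Qt problem qstring

0 upvotes Ezee 3/26/2015

This will not handle commas inside quotes.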

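34 upvotes Matthieu N. 9/25/2009 #6

The C++ String Toolkit Library (StrTk) has a token grid class that lets you load data from text files, strings or char buffers, and parse/process it in a row-column fashion.

You can specify the row delimiters and column delimiters, or just use the defaults.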

void foo()
{
   std::string data = "1,2,3,4,5\n"
                      "0,2,4,6,8\n"
                      "1,3,5,7,9\n";

   strtk::token_grid grid(data,data.size(),",");

   for(std::size_t i = 0; i < grid.row_count(); ++i)
   {
      strtk::token_grid::row_type r = grid.row(i);
      for(std::size_t j = 0; j < r.size(); ++j)
      {
         std::cout << r.get<int>(j) << "\t";
      }
      std::cout << std::endl;
   }
   std::cout << std::endl;
}

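More examples can be found here.

1 upvote rampion 8/29/2017

Though strtk supports double-quoted fields, and even stripping the surrounding quotes (via options.trim_dquotes = true), it does not support removing doubled double quotes (e.g. the field "She said ""oh no"", and left." as the C string "She said \"oh no\", and left."). You'll have to do that yourself.

1 upvote rampion 8/30/2017

When using strtk, you'll also have to manually handle double-quoted fields that contain newline characters.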
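
For the "do that yourself" step, a minimal sketch in plain standard C++ (the helper name collapseDoubledQuotes is hypothetical and not part of strtk). It assumes the surrounding quotes have already been stripped from the field:

```cpp
#include <string>

// Illustrative helper: given the contents of a quoted field with the outer
// quotes already removed, turn each doubled quote ("") into a single quote (").
std::string collapseDoubledQuotes(const std::string& field)
{
    std::string out;
    out.reserve(field.size());
    for (std::string::size_type i = 0; i < field.size(); ++i)
    {
        out.push_back(field[i]);
        // Skip the second quote of a "" pair.
        if (field[i] == '"' && i + 1 < field.size() && field[i + 1] == '"')
        {
            ++i;
        }
    }
    return out;
}
```

A lone quote that is not part of a pair is left untouched, which seems the least surprising choice for malformed input.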

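16 upvotes Rolf Kristensen 10/20/2009 #7

When using the Boost Tokenizer escaped_list_separator for CSV files, one should be aware of the following:

It requires an escape character (default: backslash, \)
It requires a splitter/separator character (default: comma, ,)
It requires a quote character (default: double quote, ")

The CSV format specified by wiki states that data fields can contain separators inside quotes (supported):

1997,Ford,E350,"Super, luxurious truck"

The CSV format specified by wiki states that a quote inside a field should be written as a doubled quote (escaped_list_separator will strip away all quote characters):

1997,Ford,E350,"Super ""luxurious"" truck"

The CSV format does not specify that any backslash characters should be stripped away (escaped_list_separator will strip away all escape characters).

A possible workaround to fix the default behavior of the boost escaped_list_separator:

First, replace every backslash character (\) with two backslash characters (\\) so that they are not stripped away.
Second, replace every doubled quote ("") with a single backslash character and a quote (\").

This workaround has the side effect that empty data fields represented by a doubled quote are transformed into a single-quote token. When iterating through the tokens, one must check whether the token is a single quote and treat it as an empty string.

Not pretty, but it works, as long as there are no newlines within the quotes.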
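
The two replacement steps can be sketched with the standard library alone. The helper names replaceAll and prepareForEscapedListSeparator are invented for this illustration; the result is meant to be passed to the tokenizer afterwards:

```cpp
#include <string>

// Illustrative helper: replace every occurrence of `from` in `s` with `to`.
void replaceAll(std::string& s, const std::string& from, const std::string& to)
{
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos)
    {
        s.replace(pos, from.size(), to);
        pos += to.size(); // continue after the text just inserted
    }
}

// Pre-process one CSV line before handing it to escaped_list_separator:
// 1. double every backslash so it survives as a literal character;
// 2. rewrite each doubled quote ("") as backslash-quote (\").
std::string prepareForEscapedListSeparator(std::string line)
{
    replaceAll(line, "\\", "\\\\");
    replaceAll(line, "\"\"", "\\\"");
    return line;
}
```

The order matters: backslashes are doubled first so that the backslashes introduced in the second step are not doubled again. As noted above, an empty quoted field ("") also turns into \" and therefore comes out of the tokenizer as a single quote.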

29 upvotes Joel de Guzman 11/20/2009 #8

It is not overkill to use Spirit for parsing CSVs. Spirit is well suited to micro-parsing tasks. For instance, with Spirit 2.1, it is as easy as:

bool r = phrase_parse(first, last,

    //  Begin grammar
    (
        double_ % ','
    )
    ,
    //  End grammar

    space, v);

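The vector, v, gets stuffed with the values. There is a series of tutorials touching on this in the new Spirit 2.1 docs, just released with Boost 1.41.

The tutorial progresses from simple to complex. The CSV parsers are presented somewhere in the middle and touch on various techniques for using Spirit. The generated code is as tight as hand-written code. Check out the generated assembler!

20 upvotes Gerdiner 12/2/2012

Actually it is overkill; the compilation time hit is enormous and makes using Spirit for simple "micro-parsing tasks" unreasonable.

15 upvotes Gerdiner 12/2/2012

I'd also like to point out that the code above does not parse CSV; it just parses a comma-delimited range of the vector's element type. It doesn't handle quotes, columns of varying types, and so on. In short, 19 votes for something that doesn't answer the question at all seems a bit suspicious to me.

9 upvotes Konrad Rudolph 12/6/2012

@Gerdiner Nonsense. The compilation time hit for small parsers isn't that big, and it is also irrelevant, because you stuff the code into its own compilation unit and compile it once. Then you only need to link it, and that's as efficient as it gets. As for your other comment, there are as many dialects of CSV as there are processors for it. This one certainly isn't a very useful dialect, but it can be trivially extended to handle quoted values.

12 upvotes Gerdiner 1/11/2013

@konrad: Simply including "#include <boost/spirit/include/qi.hpp>" in an otherwise empty file with only a main takes 9.7 seconds with MSVC 2012 on a Core i7 running at 2.ghz. It's needless bloat. The accepted answer compiles in under 2 seconds on the same machine; I'd hate to imagine how long the "proper" Boost.Spirit example would take to compile.

11 upvotes 2/25/2014

@Gerdiner I have to agree with you: the overhead of using Spirit for something as simple as CSV processing is far too great.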

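32 upvotes stefanB 2/24/2010 #9

You can use Boost Tokenizer with escaped_list_separator.

escaped_list_separator parses a superset of CSV (see Boost::tokenizer).

This only uses the Boost tokenizer header files; no linking to boost libraries is required.

Here is an example (see "Parse CSV File With Boost Tokenizer In C++" or Boost::tokenizer for details):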

#include <iostream>     // cout, endl
#include <fstream>      // fstream
#include <vector>
#include <string>
#include <algorithm>    // copy
#include <iterator>     // ostream_iterator
#include <boost/tokenizer.hpp>

int main()
{
    using namespace std;
    using namespace boost;
    string data("data.csv");

    ifstream in(data.c_str());
    if (!in.is_open()) return 1;

    typedef tokenizer< escaped_list_separator<char> > Tokenizer;
    vector< string > vec;
    string line;

    while (getline(in,line))
    {
        Tokenizer tok(line);
        vec.assign(tok.begin(),tok.end());

        // vector now contains strings from one row, output to cout here
        copy(vec.begin(), vec.end(), ostream_iterator<string>(cout, "|"));

        cout << "\n----------------------" << endl;
    }
}

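0 upvotes stefanB 1/13/2011

If you want to be able to parse embedded newlines, see mybyteofcode.blogspot.com/2010/11/....

0 upvotes Rob Smallshire 6/27/2012

While this technique works, I've found it to have very poor performance. Parsing a 90000-line CSV file with ten fields per line takes about 8 seconds on my 2 GHz Xeon. The Python standard library csv module parses the same file in about 0.3 seconds.

0 upvotes tofutim 7/12/2012

@Rob That's interesting. What does the Python csv module do differently?

1 upvote stefanB 7/16/2012

@RobSmallshire This is simple example code, not high-performance code. It makes copies of all the fields on every line. For higher performance you would use different options and return just references to the fields in the buffer instead of making copies.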
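
As an illustration of the "references instead of copies" idea, here is a sketch that assumes C++17 and uses std::string_view in place of Boost Tokenizer. The helper name splitViews is hypothetical, and quoting is deliberately not handled:

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Illustrative helper: split `line` on commas without copying any field.
// Each returned view points into the caller's buffer, so that buffer must
// outlive the views. Quoted fields are NOT handled.
std::vector<std::string_view> splitViews(std::string_view line)
{
    std::vector<std::string_view> fields;
    std::size_t start = 0;
    for (;;)
    {
        std::size_t comma = line.find(',', start);
        if (comma == std::string_view::npos)
        {
            fields.push_back(line.substr(start)); // last (possibly empty) field
            break;
        }
        fields.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
    return fields;
}
```

Because the views alias the line buffer, reading the next line into the same std::string invalidates them; any field that has to be kept longer must be copied first.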

18 upvotes Michael 3/20/2010 #10

If you DO care about parsing CSV correctly, this will do it... relatively slowly, as it works one character at a time.

 void ParseCSV(const string& csvSource, vector<vector<string> >& lines)
    {
       bool inQuote(false);
       bool newLine(false);
       string field;
       lines.clear();
       vector<string> line;

       string::const_iterator aChar = csvSource.begin();
       while (aChar != csvSource.end())
       {
          switch (*aChar)
          {
          case '"':
             newLine = false;
             inQuote = !inQuote;
             break;

          case ',':
             newLine = false;
             if (inQuote == true)
             {
                field += *aChar;
             }
             else
             {
                line.push_back(field);
                field.clear();
             }
             break;

          case '\n':
          case '\r':
             if (inQuote == true)
             {
                field += *aChar;
             }
             else
             {
                if (newLine == false)
                {
                   line.push_back(field);
                   lines.push_back(line);
                   field.clear();
                   line.clear();
                   newLine = true;
                }
             }
             break;

          default:
             newLine = false;
             field.push_back(*aChar);
             break;
          }

          aChar++;
       }

       if (field.size())
          line.push_back(field);

       if (line.size())
          lines.push_back(line);
    }

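0 upvotes Jeremy Friesner 6/19/2014

AFAICT this won't handle embedded quote marks correctly (e.g. "This string has ""embedded quote marks""","foo",1)

1 upvote NPike 4/30/2011 #11

If you don't want to deal with including boost in your project (it is considerably large if all you are going to use it for is CSV parsing...)

I have had luck with the CSV parsing here:

http://www.zedwood.com/article/112/cpp-csv-parser

It handles quoted fields, but does not handle inline \n characters (which is probably fine for most uses).

1 upvote tofutim 7/12/2012

Shouldn't the compiler strip out everything that is non-essential?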

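11 upvotes jxh 7/18/2012 #12

Since all the CSV questions seem to get redirected here, I thought I'd post my answer here. This answer does not directly address the asker's question. I wanted to be able to read from a stream that is known to be in CSV format, where the type of each field is also known in advance. Of course, the method below could be used to treat every field as a string.

As an example of how I wanted to be able to use a CSV input stream, consider the following input (taken from Wikipedia's page on CSV):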

const char input[] =
"Year,Make,Model,Description,Price\n"
"1997,Ford,E350,\"ac, abs, moon\",3000.00\n"
"1999,Chevy,\"Venture \"\"Extended Edition\"\"\",\"\",4900.00\n"
"1999,Chevy,\"Venture \"\"Extended Edition, Very Large\"\"\",\"\",5000.00\n"
"1996,Jeep,Grand Cherokee,\"MUST SELL!\n\
air, moon roof, loaded\",4799.00\n"
;

Then, I wanted to be able to read in the data like this:

std::istringstream ss(input);
std::string title[5];
int year;
std::string make, model, desc;
float price;
csv_istream(ss)
    >> title[0] >> title[1] >> title[2] >> title[3] >> title[4];
while (csv_istream(ss)
       >> year >> make >> model >> desc >> price) {
    //...do something with the record...
}

This was the solution I ended up with.

struct csv_istream {
    std::istream &is_;
    csv_istream (std::istream &is) : is_(is) {}
    void scan_ws () const {
        while (is_.good()) {
            int c = is_.peek();
            if (c != ' ' && c != '\t') break;
            is_.get();
        }
    }
    void scan (std::string *s = 0) const {
        std::string ws;
        int c = is_.get();
        if (is_.good()) {
            do {
                if (c == ',' || c == '\n') break;
                if (s) {
                    ws += c;
                    if (c != ' ' && c != '\t') {
                        *s += ws;
                        ws.clear();
                    }
                }
                c = is_.get();
            } while (is_.good());
            if (is_.eof()) is_.clear();
        }
    }
    template <typename T, bool> struct set_value {
        void operator () (std::string in, T &v) const {
            std::istringstream(in) >> v;
        }
    };
    template <typename T> struct set_value<T, true> {
        template <bool SIGNED> void convert (std::string in, T &v) const {
            if (SIGNED) v = ::strtoll(in.c_str(), 0, 0);
            else v = ::strtoull(in.c_str(), 0, 0);
        }
        void operator () (std::string in, T &v) const {
            convert<is_signed_int<T>::val>(in, v);
        }
    };
    template <typename T> const csv_istream & operator >> (T &v) const {
        std::string tmp;
        scan(&tmp);
        set_value<T, is_int<T>::val>()(tmp, v);
        return *this;
    }
    const csv_istream & operator >> (std::string &v) const {
        v.clear();
        scan_ws();
        if (is_.peek() != '"') scan(&v);
        else {
            std::string tmp;
            is_.get();
            std::getline(is_, tmp, '"');
            while (is_.peek() == '"') {
                v += tmp;
                v += is_.get();
                std::getline(is_, tmp, '"');
            }
            v += tmp;
            scan();
        }
        return *this;
    }
    template <typename T>
    const csv_istream & operator >> (T &(*manip)(T &)) const {
        is_ >> manip;
        return *this;
    }
    operator bool () const { return !is_.fail(); }
};

It relies on the following helpers, which may be simplified by the new integral traits templates in C++11:

template <typename T> struct is_signed_int { enum { val = false }; };
template <> struct is_signed_int<short> { enum { val = true}; };
template <> struct is_signed_int<int> { enum { val = true}; };
template <> struct is_signed_int<long> { enum { val = true}; };
template <> struct is_signed_int<long long> { enum { val = true}; };

template <typename T> struct is_unsigned_int { enum { val = false }; };
template <> struct is_unsigned_int<unsigned short> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned int> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned long> { enum { val = true}; };
template <> struct is_unsigned_int<unsigned long long> { enum { val = true}; };

template <typename T> struct is_int {
    enum { val = (is_signed_int<T>::val || is_unsigned_int<T>::val) };
};
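
A sketch of what that simplification might look like with the C++11 <type_traits> header. The _std suffixed names are invented here so they do not clash with the helpers above; they are not part of the answer:

```cpp
#include <type_traits>

// Sketch: the hand-written helpers expressed with C++11 <type_traits>.
// One difference to be aware of: std::is_integral is also true for bool
// and the character types, which the hand-written versions exclude.
template <typename T> struct is_int_std {
    enum { val = std::is_integral<T>::value };
};
template <typename T> struct is_signed_int_std {
    enum { val = std::is_integral<T>::value && std::is_signed<T>::value };
};
template <typename T> struct is_unsigned_int_std {
    enum { val = std::is_integral<T>::value && std::is_unsigned<T>::value };
};
```

Whether the wider definition is acceptable depends on how char and bool fields should be read; with these traits a char field would go through the strtoll path rather than operator>>.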


8 upvotes Heygard Flisch 12/18/2012 #13

Another CSV I/O library can be found here:

http://code.google.com/p/fast-cpp-csv-parser/

#include "csv.h"

int main(){
  io::CSVReader<3> in("ram.csv");
  in.read_header(io::ignore_extra_column, "vendor", "size", "speed");
  std::string vendor; int size; double speed;
  while(in.read_row(vendor, size, speed)){
    // do stuff with the data
  }
}

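3 upvotes quant_dev 12/27/2015

Nice, but it forces you to choose the number of columns at compile time. Not very useful for many applications.

0 upvotes Hari 9/30/2021

The GitHub link to the same repository: github.com/ben-strasser/fast-cpp-csv-parser

3 upvotes Jim M. 2/19/2013 #14

Here is code for reading a matrix; note that there is also a csvwrite function in MATLAB.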

void loadFromCSV( const std::string& filename )
{
    std::ifstream       file( filename.c_str() );
    std::vector< std::vector<std::string> >   matrix;
    std::vector<std::string>   row;
    std::string                line;
    std::string                cell;

    while( file )
    {
        std::getline(file,line);
        std::stringstream lineStream(line);
        row.clear();

        while( std::getline( lineStream, cell, ',' ) )
            row.push_back( cell );

        if( !row.empty() )
            matrix.push_back( row );
    }

    for( int i=0; i<int(matrix.size()); i++ )
    {
        for( int j=0; j<int(matrix[i].size()); j++ )
            std::cout << matrix[i][j] << " ";

        std::cout << std::endl;
    }
}

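4 upvotes marcp 6/27/2013 #15

This is an old thread, but it is still at the top of the search results, so I'm adding my solution, which uses std::stringstream and a simple string-replace method by Yves Baumes that I found here.

The following example reads a file line by line, ignores comment lines starting with //, and parses the other lines into a combination of strings, ints and doubles. The stringstream does the parsing, but it expects fields to be delimited by whitespace, so I use StringReplace to turn commas into spaces first. It handles tabs fine, but does not deal with quoted strings.

Bad or missing input is simply ignored, which may or may not be good, depending on your circumstances.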

#include <string>
#include <sstream>
#include <fstream>

void StringReplace(std::string& str, const std::string& oldStr, const std::string& newStr)
// code by  Yves Baumes
// http://stackoverflow.com/questions/1494399/how-do-i-search-find-and-replace-in-a-standard-string
{
  size_t pos = 0;
  while((pos = str.find(oldStr, pos)) != std::string::npos)
  {
     str.replace(pos, oldStr.length(), newStr);
     pos += newStr.length();
  }
}

void LoadCSV(std::string &filename) {
   std::ifstream stream(filename);
   std::string in_line;
   std::string Field;
   std::string Chan;
   int ChanType;
   double Scale;
   int Import;
   while (std::getline(stream, in_line)) {
      StringReplace(in_line, ",", " ");
      std::stringstream line(in_line);
      line >> Field >> Chan >> ChanType >> Scale >> Import;
      if (Field.substr(0,2)!="//") {
         // do your stuff 
         // this is CBuilder code for demonstration, sorry
         ShowMessage((String)Field.c_str() + "\n" + Chan.c_str() + "\n" + IntToStr(ChanType) + "\n" +FloatToStr(Scale) + "\n" +IntToStr(Import));
      }
   }
}

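1 upvote Fabien 7/19/2013 #16

For what it is worth, here is my implementation. It deals with wstring input, but could easily be adjusted to string. It does not handle newlines in fields (my application does not either, though adding support isn't too difficult), and it does not comply with the RFC's "\r\n" line endings (assuming you use std::getline), but it does handle whitespace trimming and double quotes correctly (hopefully).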

using namespace std;

// trim whitespaces around field or double-quotes, remove double-quotes and replace escaped double-quotes (double double-quotes)
wstring trimquote(const wstring& str, const wstring& whitespace, const wchar_t quotChar)
{
    wstring ws;
    wstring::size_type strBegin = str.find_first_not_of(whitespace);
    if (strBegin == wstring::npos)
        return L"";

    wstring::size_type strEnd = str.find_last_not_of(whitespace);
    wstring::size_type strRange = strEnd - strBegin + 1;

    if((str[strBegin] == quotChar) && (str[strEnd] == quotChar))
    {
        ws = str.substr(strBegin+1, strRange-2);
        strBegin = 0;
        while((strEnd = ws.find(quotChar, strBegin)) != wstring::npos)
        {
            ws.erase(strEnd, 1);
            strBegin = strEnd+1;
        }

    }
    else
        ws = str.substr(strBegin, strRange);
    return ws;
}

pair<unsigned, unsigned> nextCSVQuotePair(const wstring& line, const wchar_t quotChar, unsigned ofs = 0)
{
    pair<unsigned, unsigned> r;
    r.first = line.find(quotChar, ofs);
    r.second = wstring::npos;
    if(r.first != wstring::npos)
    {
        r.second = r.first;
        while(((r.second = line.find(quotChar, r.second+1)) != wstring::npos)
            && (line[r.second+1] == quotChar)) // WARNING: assumes null-terminated string such that line[r.second+1] always exist
            r.second++;

    }
    return r;
}

unsigned parseLine(vector<wstring>& fields, const wstring& line)
{
    unsigned ofs, ofs0, np;
    const wchar_t delim = L',';
    const wstring whitespace = L" \t\xa0\x3000\x2000\x2001\x2002\x2003\x2004\x2005\x2006\x2007\x2008\x2009\x200a\x202f\x205f";
    const wchar_t quotChar = L'\"';
    pair<unsigned, unsigned> quot;

    fields.clear();

    ofs = ofs0 = 0;
    quot = nextCSVQuotePair(line, quotChar);
    while((np = line.find(delim, ofs)) != wstring::npos)
    {
        if((np > quot.first) && (np < quot.second))
        { // skip delimiter inside quoted field
            ofs = quot.second+1;
            quot = nextCSVQuotePair(line, quotChar, ofs);
            continue;
        }
        fields.push_back( trimquote(line.substr(ofs0, np-ofs0), whitespace, quotChar) );
        ofs = ofs0 = np+1;
    }
    fields.push_back( trimquote(line.substr(ofs0), whitespace, quotChar) );

    return fields.size();
}

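6 upvotes sashoalm 10/15/2013 #17

Here is another implementation of a Unicode CSV parser (it works with wchar_t). I wrote part of it, while Jonathan Leffler wrote the rest.

Note: This parser is aimed at replicating Excel's behavior as closely as possible, specifically when importing broken or malformed CSV files.

This is the original question: Parsing CSV file with multiline fields and escaped double quotes

This is the code as an SSCCE (Short, Self-Contained, Correct Example).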

#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

extern const wchar_t *nextCsvField(const wchar_t *p, wchar_t sep, bool *newline);

// Returns a pointer to the start of the next field,
// or zero if this is the last field in the CSV
// p is the start position of the field
// sep is the separator used, i.e. comma or semicolon
// newline says whether the field ends with a newline or with a comma
const wchar_t *nextCsvField(const wchar_t *p, wchar_t sep, bool *newline)
{
    // Parse quoted sequences
    if ('"' == p[0]) {
        p++;
        while (1) {
            // Find next double-quote
            p = wcschr(p, L'"');
            // If we don't find it or it's the last symbol
            // then this is the last field
            if (!p || !p[1])
                return 0;
            // Check for "", it is an escaped double-quote
            if (p[1] != '"')
                break;
            // Skip the escaped double-quote
            p += 2;
        }
    }

    // Find next newline or comma.
    wchar_t newline_or_sep[4] = L"\n\r ";
    newline_or_sep[2] = sep;
    p = wcspbrk(p, newline_or_sep);

    // If no newline or separator, this is the last field.
    if (!p)
        return 0;

    // Check if we had newline.
    *newline = (p[0] == '\r' || p[0] == '\n');

    // Handle "\r\n", otherwise just increment
    if (p[0] == '\r' && p[1] == '\n')
        p += 2;
    else
        p++;

    return p;
}

static wchar_t *csvFieldData(const wchar_t *fld_s, const wchar_t *fld_e, wchar_t *buffer, size_t buflen)
{
    wchar_t *dst = buffer;
    wchar_t *end = buffer + buflen - 1;
    const wchar_t *src = fld_s;

    if (*src == L'"')
    {
        const wchar_t *p = src + 1;
        while (p < fld_e && dst < end)
        {
            if (p[0] == L'"' && p+1 < fld_e && p[1] == L'"')   // bound check against the field end
            {
                *dst++ = p[0];
                p += 2;
            }
            else if (p[0] == L'"')
            {
                p++;
                break;
            }
            else
                *dst++ = *p++;
        }
        src = p;
    }
    while (src < fld_e && dst < end)
        *dst++ = *src++;
    if (dst >= end)
        return 0;
    *dst = L'\0';
    return(buffer);
}

static void dissect(const wchar_t *line)
{
    const wchar_t *start = line;
    const wchar_t *next;
    bool     eol;
    wprintf(L"Input %3zd: [%.*ls]\n", wcslen(line), wcslen(line)-1, line);
    while ((next = nextCsvField(start, L',', &eol)) != 0)
    {
        wchar_t buffer[1024];
        wprintf(L"Raw Field: [%.*ls] (eol = %d)\n", (next - start - eol), start, eol);
        if (csvFieldData(start, next-1, buffer, sizeof(buffer)/sizeof(buffer[0])) != 0)
            wprintf(L"Field %3zd: [%ls]\n", wcslen(buffer), buffer);
        start = next;
    }
}

static const wchar_t multiline[] =
   L"First field of first row,\"This field is multiline\n"
    "\n"
    "but that's OK because it's enclosed in double quotes, and this\n"
    "is an escaped \"\" double quote\" but this one \"\" is not\n"
    "   \"This is second field of second row, but it is not multiline\n"
    "   because it doesn't start \n"
    "   with an immediate double quote\"\n"
    ;

int main(void)
{
    wchar_t line[1024];

    while (fgetws(line, sizeof(line)/sizeof(line[0]), stdin))
        dissect(line);
    dissect(multiline);

    return 0;
}

1 upvote Antonello 1/24/2014 #18

Here is a ready-to-use function if all you need is to load a data file of doubles (no integers, no text).

#include <sstream>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

/**
 * Parse a CSV data file and fill the 2d STL vector "data".
 * Limits: only "pure datas" of doubles, not encapsulated by " and without \n inside.
 * Further no formatting in the data (e.g. scientific notation)
 * It however handles both dots and commas as decimal separators and removes thousand separator.
 * 
 * returnCodes[0]: file access 0-> ok 1-> not able to read; 2-> decimal separator equal to comma separator
 * returnCodes[1]: number of records
 * returnCodes[2]: number of fields. -1 If rows have different field size
 * 
 */
vector<int>
readCsvData (vector <vector <double>>& data, const string& filename, const string& delimiter, const string& decseparator){

 int vv[3] = { 0,0,0 };
 vector<int> returnCodes(&vv[0], &vv[0]+3);

 string rowstring, stringtoken;
 double doubletoken;
 int rowcount=0;
 int fieldcount=0;
 data.clear();

 ifstream iFile(filename, ios_base::in);
 if (!iFile.is_open()){
   returnCodes[0] = 1;
   return returnCodes;
 }
 while (getline(iFile, rowstring)) {
    if (rowstring=="") continue; // empty line
    rowcount ++; //let's start with 1
    if(delimiter == decseparator){
      returnCodes[0] = 2;
      return returnCodes;
    }
    if(decseparator != "."){
     // remove dots (used as thousand separators)
     string::iterator end_pos = remove(rowstring.begin(), rowstring.end(), '.');
     rowstring.erase(end_pos, rowstring.end());
     // replace decimal separator with dots.
     replace(rowstring.begin(), rowstring.end(),decseparator.c_str()[0], '.'); 
    } else {
     // remove commas (used as thousand separators)
     string::iterator end_pos = remove(rowstring.begin(), rowstring.end(), ',');
     rowstring.erase(end_pos, rowstring.end());
    }
    // tokenize..
    vector<double> tokens;
    // Skip delimiters at beginning.
    string::size_type lastPos = rowstring.find_first_not_of(delimiter, 0);
    // Find first "non-delimiter".
    string::size_type pos     = rowstring.find_first_of(delimiter, lastPos);
    while (string::npos != pos || string::npos != lastPos){
        // Found a token, convert it to double add it to the vector.
        stringtoken = rowstring.substr(lastPos, pos - lastPos);
        if (stringtoken == "") {
      tokens.push_back(0.0);
    } else {
          istringstream totalSString(stringtoken);
      totalSString >> doubletoken;
      tokens.push_back(doubletoken);
    }     
        // Skip delimiters.  Note the "not_of"
        lastPos = rowstring.find_first_not_of(delimiter, pos);
        // Find next "non-delimiter"
        pos = rowstring.find_first_of(delimiter, lastPos);
    }
    if(rowcount == 1){
      fieldcount = tokens.size();
      returnCodes[2] = tokens.size();
    } else {
      if ( tokens.size() != fieldcount){
    returnCodes[2] = -1;
      }
    }
    data.push_back(tokens);
 }
 iFile.close();
 returnCodes[1] = rowcount;
 return returnCodes;
}

2 upvotes Maxim Egorushkin 7/3/2014 #19

Another quick and easy way is to use Boost.Fusion I/O:

#include <iostream>
#include <sstream>

#include <boost/fusion/adapted/boost_tuple.hpp>
#include <boost/fusion/sequence/io.hpp>

namespace fusion = boost::fusion;

struct CsvString
{
    std::string value;

    // Stop reading a string once a CSV delimeter is encountered.
    friend std::istream& operator>>(std::istream& s, CsvString& v) {
        v.value.clear();
        for(;;) {
            auto c = s.peek();
            if(std::istream::traits_type::eof() == c || ',' == c || '\n' == c)
                break;
            v.value.push_back(c);
            s.get();
        }
        return s;
    }

    friend std::ostream& operator<<(std::ostream& s, CsvString const& v) {
        return s << v.value;
    }
};

int main() {
    std::stringstream input("abc,123,true,3.14\n"
                            "def,456,false,2.718\n");

    typedef boost::tuple<CsvString, int, bool, double> CsvRow;

    using fusion::operator<<;
    std::cout << std::boolalpha;

    using fusion::operator>>;
    input >> std::boolalpha;
    input >> fusion::tuple_open("") >> fusion::tuple_close("\n") >> fusion::tuple_delimiter(',');

    for(CsvRow row; input >> row;)
        std::cout << row << '\n';
}

Outputs:

(abc 123 true 3.14)
(def 456 false 2.718)

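1 upvote Amruta Ghodke 11/10/2014 #20

You can open and read a .csv file using the fopen and fscanf functions, but the important part is parsing the data. The simplest way to parse the data is by its delimiter; in the case of .csv, the delimiter is ','.

Suppose your data1.csv file is as follows: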

A,45,76,01
B,77,67,02
C,63,76,03
D,65,44,04

You can tokenize the data, store it in char arrays, and later use functions such as atoi() for the appropriate conversions.

FILE *fp;
char str1[10], str2[10], str3[10], str4[10];

fp = fopen("G:\\data1.csv", "r");
if(NULL == fp)
{
    printf("\nError in opening file.");
    return 0;
}
/* One conversion per buffer; the width 9 keeps each field inside its 10-byte array. */
while(4 == fscanf(fp, " %9[^,], %9[^,], %9[^,], %9s", str1, str2, str3, str4))
{
    printf("\n%s %s %s %s", str1, str2, str3, str4);
}
fclose(fp);

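In %[^,], the ^ inverts the logic: it means "match any run of characters that does not contain a comma". The , that follows then matches the comma that terminated the previous string.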
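
To see the scanset on its own, here is a small sketch using std::sscanf on a single line held in memory. The helper name scanFourFields is hypothetical, and the sketch assumes exactly four fields of at most nine characters each:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Illustrative helper: split a line of exactly four comma-separated fields
// using scansets. %9[^,] reads up to 9 characters that are not commas
// (the width leaves room for the terminating '\0' in a 10-byte buffer);
// the literal ',' in the format then consumes the separator.
// Returns an empty vector unless all four fields were converted.
std::vector<std::string> scanFourFields(const char* line)
{
    char a[10], b[10], c[10], d[10];
    int converted = std::sscanf(line, " %9[^,], %9[^,], %9[^,], %9s", a, b, c, d);
    if (converted != 4)
    {
        return {};
    }
    return {a, b, c, d};
}
```

A scanset has to match at least one character, so an empty field (two adjacent commas) makes the conversion fail. That is one reason this approach only suits simple, fully populated data.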

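83 upvotes sastanin 5/20/2015 #21

My version uses nothing but the standard C++11 library. It copes well with Excel CSV quoting:

spam eggs,"foo,bar","""fizz buzz"""
1.23,4.567,-8.00E+09

The code is written as a finite-state machine and consumes one character at a time. I think that makes it easier to reason about.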

#include <istream>
#include <string>
#include <vector>

enum class CSVState {
    UnquotedField,
    QuotedField,
    QuotedQuote
};

std::vector<std::string> readCSVRow(const std::string &row) {
    CSVState state = CSVState::UnquotedField;
    std::vector<std::string> fields {""};
    size_t i = 0; // index of the current field
    for (char c : row) {
        switch (state) {
            case CSVState::UnquotedField:
                switch (c) {
                    case ',': // end of field
                              fields.push_back(""); i++;
                              break;
                    case '"': state = CSVState::QuotedField;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedField:
                switch (c) {
                    case '"': state = CSVState::QuotedQuote;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedQuote:
                switch (c) {
                    case ',': // , after closing quote
                              fields.push_back(""); i++;
                              state = CSVState::UnquotedField;
                              break;
                    case '"': // "" -> "
                              fields[i].push_back('"');
                              state = CSVState::QuotedField;
                              break;
                    default:  // end of quote
                              state = CSVState::UnquotedField;
                              break; }
                break;
        }
    }
    return fields;
}

/// Read CSV file, Excel dialect. Accept "quoted fields ""with quotes"""
std::vector<std::vector<std::string>> readCSV(std::istream &in) {
    std::vector<std::vector<std::string>> table;
    std::string row;
    while (!in.eof()) {
        std::getline(in, row);
        if (in.bad() || in.fail()) {
            break;
        }
        auto fields = readCSVRow(row);
        table.push_back(fields);
    }
    return table;
}

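0 votes dr_rk 4/6/2018

The top answer did not work for me because I am on an older compiler. This answer worked; the vector initialization may need this: const char *vinit[] = {""}; vector<string> fields(vinit, end(vinit));

0 votes Jan Kratochvil 11/6/2023

It fails to parse newlines inside a field. Such a CSV file is, written as a C++ string, "\"a\n\"\n". At least that is what LibreOffice.org Calc produces. MS Excel 365 cannot export to CSV, and I do not have a native MS Excel.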

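2 votes Elizabeth Card 10/7/2015 #22

The first thing you need to do is make sure the file exists. To do this, just try to open a file stream at the path. After you have opened the file stream, use stream.fail() to see whether it worked as expected.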

bool fileExists(string fileName)
{

ifstream test;

test.open(fileName.c_str());

if (test.fail())
{
    test.close();
    return false;
}
else
{
    test.close();
    return true;
}
}

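You must also verify that the provided file is of the correct type. To do this, look through the provided file path until you find the file extension. Once you have the extension, make sure it is .csv.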

bool verifyExtension(string filename)
{
int period = 0;

for (unsigned int i = 0; i < filename.length(); i++)
{
    if (filename[i] == '.')
        period = i;
}

string extension;

for (unsigned int i = period; i < filename.length(); i++)
    extension += filename[i];

if (extension == ".csv")
    return true;
else
    return false;
}

This function returns the file extension, which is used later in an error message.

string getExtension(string filename)
{
int period = 0;

for (unsigned int i = 0; i < filename.length(); i++)
{
    if (filename[i] == '.')
        period = i;
}

string extension;

if (period != 0)
{
    for (unsigned int i = period; i < filename.length(); i++)
        extension += filename[i];
}
else
    extension = "NO FILE";

return extension;
}
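
Both verifyExtension and getExtension above look for the last period with a hand-written loop. As a rough sketch, the same lookup can be done with std::string::rfind. The names fileExtension and hasCsvExtension are hypothetical, and the comparison is case-sensitive, so ".CSV" would not be accepted.

```cpp
#include <string>

// Return the extension of `filename` including the leading dot,
// or an empty string when the last path component contains no dot.
std::string fileExtension(const std::string& filename)
{
    const std::string::size_type slash = filename.find_last_of("/\\");
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) {
        return "";
    }
    // A dot that belongs to a directory name is not an extension.
    if (slash != std::string::npos && dot < slash) {
        return "";
    }
    return filename.substr(dot);
}

// True when the file name ends in ".csv" (exact, case-sensitive match).
bool hasCsvExtension(const std::string& filename)
{
    return fileExtension(filename) == ".csv";
}
```

Returning an empty string for "no extension" avoids the special "NO FILE" marker, and ignoring dots in directory names keeps a path such as dir.v2/table from being reported as having the extension ".v2/table".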

This function actually calls the error checks created above and then parses the file.

void parseFile(string fileName)
{
    if (fileExists(fileName) && verifyExtension(fileName))
    {
        ifstream fs;
        fs.open(fileName.c_str());
        string fileCommand;

        while (fs.good())
        {
            string temp;

            getline(fs, fileCommand, '\n');

            for (unsigned int i = 0; i < fileCommand.length(); i++)
            {
                if (fileCommand[i] != ',')
                    temp += fileCommand[i];
                else
                    temp += " ";
            }

            if (temp != "\0")
            {
                // Place your code here to run the file.
            }
        }
        fs.close();
    }
    else if (!fileExists(fileName))
    {
        cout << "Error: The provided file does not exist: " << fileName << endl;

        if (!verifyExtension(fileName))
        {
            if (getExtension(fileName) != "NO FILE")
                cout << "\tCheck the file extension." << endl;
            else
                cout << "\tThere is no file in the provided path." << endl;
        }
    }
    else if (!verifyExtension(fileName)) 
    {
        if (getExtension(fileName) != "NO FILE")
            cout << "Incorrect file extension provided: " << getExtension(fileName) << endl;
        else
            cout << "There is no file in the following path: " << fileName << endl;
    }
}

1 vote scap3y 11/18/2015 #23

I wrote a nice way of parsing CSV files and thought I should add it as an answer:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <stdlib.h>
#include <stdio.h>

struct CSVDict
{
  std::vector< std::string > inputImages;
  std::vector< double > inputLabels;
};

/**
\brief Splits the string

\param str String to split
\param delim Delimiter on the basis of which splitting is to be done
\return results Output in the form of vector of strings
*/
std::vector<std::string> stringSplit( const std::string &str, const std::string &delim )
{
  std::vector<std::string> results;

  for (size_t i = 0; i < str.length(); i++)
  {
    std::string tempString = "";
    while ((str[i] != *delim.c_str()) && (i < str.length()))
    {
      tempString += str[i];
      i++;
    }
    results.push_back(tempString);
  }

  return results;
}

/**
\brief Parse the supplied CSV File and obtain Row and Column information. 

Assumptions:
1. Header information is in first row
2. Delimiters are only used to differentiate cell members

\param csvFileName The full path of the file to parse
\param inputColumns The string of input columns which contain the data to be used for further processing
\param inputLabels The string of input labels based on which further processing is to be done
\param delim The delimiters used in inputColumns and inputLabels
\return Vector of Vector of strings: Collection of rows and columns
*/
std::vector< CSVDict > parseCSVFile( const std::string &csvFileName, const std::string &inputColumns, const std::string &inputLabels, const std::string &delim )
{
  std::vector< CSVDict > return_CSVDict;
  std::vector< std::string > inputColumnsVec = stringSplit(inputColumns, delim), inputLabelsVec = stringSplit(inputLabels, delim);
  std::vector< std::vector< std::string > > returnVector;
  std::ifstream inFile(csvFileName.c_str());
  int row = 0;
  std::vector< size_t > inputColumnIndeces, inputLabelIndeces;
  for (std::string line; std::getline(inFile, line, '\n');)
  {
    CSVDict tempDict;
    std::vector< std::string > rowVec;
    line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
    rowVec = stringSplit(line, delim);

    // for the first row, record the indeces of the inputColumns and inputLabels
    if (row == 0)
    {
      for (size_t i = 0; i < rowVec.size(); i++)
      {
        for (size_t j = 0; j < inputColumnsVec.size(); j++)
        {
          if (rowVec[i] == inputColumnsVec[j])
          {
            inputColumnIndeces.push_back(i);
          }
        }
        for (size_t j = 0; j < inputLabelsVec.size(); j++)
        {
          if (rowVec[i] == inputLabelsVec[j])
          {
            inputLabelIndeces.push_back(i);
          }
        }
      }
    }
    else
    {
      for (size_t i = 0; i < inputColumnIndeces.size(); i++)
      {
        tempDict.inputImages.push_back(rowVec[inputColumnIndeces[i]]);
      }
      for (size_t i = 0; i < inputLabelIndeces.size(); i++)
      {
        double test = std::atof(rowVec[inputLabelIndeces[i]].c_str());
        tempDict.inputLabels.push_back(std::atof(rowVec[inputLabelIndeces[i]].c_str()));
      }
      return_CSVDict.push_back(tempDict);
    }
    row++;
  }

  return return_CSVDict;
}

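1 vote g24l 12/2/2015 #24

It is possible to use std::regex.

Depending on the size of your file and the memory available to you, you can read it either line by line or entirely into a std::string.

To read the file, you can use: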

std::ifstream t("file.txt");
std::string sin((std::istreambuf_iterator<char>(t)),
                 std::istreambuf_iterator<char>());

Then you can match against it with a pattern that is customizable to your needs.

// Matches each run of characters that is neither a comma nor whitespace.
std::regex word_regex("[^,\\s]+");
auto what = 
    std::sregex_iterator(sin.begin(), sin.end(), word_regex);
auto wend = std::sregex_iterator();

std::vector<std::string> v;
for (; what != wend; ++what) {
    std::smatch match = *what;
    v.push_back(match.str());
}
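
Matching the cells themselves skips empty fields. If empty fields matter, one alternative is to match the delimiter instead and ask std::sregex_token_iterator for the text between matches. This is only a sketch under the assumption that fields are unquoted; splitOnCommas is a name made up for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Split `line` on commas with std::sregex_token_iterator.
// Passing -1 as the submatch index yields the text *between* matches,
// so an empty field in the middle of the line is preserved.
std::vector<std::string> splitOnCommas(const std::string& line)
{
    static const std::regex comma(",");
    std::sregex_token_iterator first(line.begin(), line.end(), comma, -1);
    std::sregex_token_iterator last; // default-constructed: end of sequence
    return std::vector<std::string>(first, last);
}
```

One caveat of this approach: an empty field after a trailing comma is not reported, so "a,b," yields two fields rather than three.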

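7 votes Pietro Saccardi 12/6/2015 #25

Another solution, similar to Loki Astari's answer, in C++11. The rows here are std::tuples of a given type. The code scans one line, then scans up to each delimiter, and then converts the value and dumps it directly into the tuple (with a bit of template code).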

for (auto row : csv<std::string, int, float>(file, ',')) {
    std::cout << "first col: " << std::get<0>(row) << std::endl;
}

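Advantages:

quite clean and simple to use, only C++11.
automatic type conversion into std::tuple<t1, ...> via operator>>.

What's missing:

escaping and quoting
no error handling in case of malformed CSV.

The main code:

#include <iterator>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>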

namespace csvtools {
    /// Read the last element of the tuple without calling recursively
    template <std::size_t idx, class... fields>
    typename std::enable_if<idx >= std::tuple_size<std::tuple<fields...>>::value - 1>::type
    read_tuple(std::istream &in, std::tuple<fields...> &out, const char delimiter) {
        std::string cell;
        std::getline(in, cell, delimiter);
        std::stringstream cell_stream(cell);
        cell_stream >> std::get<idx>(out);
    }

    /// Read the @p idx-th element of the tuple and then calls itself with @p idx + 1 to
    /// read the next element of the tuple. Automatically falls in the previous case when
    /// reaches the last element of the tuple thanks to enable_if
    template <std::size_t idx, class... fields>
    typename std::enable_if<idx < std::tuple_size<std::tuple<fields...>>::value - 1>::type
    read_tuple(std::istream &in, std::tuple<fields...> &out, const char delimiter) {
        std::string cell;
        std::getline(in, cell, delimiter);
        std::stringstream cell_stream(cell);
        cell_stream >> std::get<idx>(out);
        read_tuple<idx + 1, fields...>(in, out, delimiter);
    }
}

/// Iterable csv wrapper around a stream. @p fields the list of types that form up a row.
template <class... fields>
class csv {
    std::istream &_in;
    const char _delim;
public:
    typedef std::tuple<fields...> value_type;
    class iterator;

    /// Construct from a stream.
    inline csv(std::istream &in, const char delim) : _in(in), _delim(delim) {}

    /// Status of the underlying stream
    /// @{
    inline bool good() const {
        return _in.good();
    }
    inline const std::istream &underlying_stream() const {
        return _in;
    }
    /// @}

    inline iterator begin();
    inline iterator end();
private:

    /// Reads a line into a stringstream, and then reads the line into a tuple, that is returned
    inline value_type read_row() {
        std::string line;
        std::getline(_in, line);
        std::stringstream line_stream(line);
        std::tuple<fields...> retval;
        csvtools::read_tuple<0, fields...>(line_stream, retval, _delim);
        return retval;
    }
};

/// Iterator; just calls recursively @ref csv::read_row and stores the result.
template <class... fields>
class csv<fields...>::iterator {
    csv::value_type _row;
    csv *_parent;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef csv::value_type         value_type;
    typedef std::size_t             difference_type;
    typedef csv::value_type *       pointer;
    typedef csv::value_type &       reference;

    /// Construct an empty/end iterator
    inline iterator() : _parent(nullptr) {}
    /// Construct an iterator at the beginning of the @p parent csv object.
    inline iterator(csv &parent) : _parent(parent.good() ? &parent : nullptr) {
        ++(*this);
    }

    /// Read one row, if possible. Set to end if parent is not good anymore.
    inline iterator &operator++() {
        if (_parent != nullptr) {
            _row = _parent->read_row();
            if (!_parent->good()) {
                _parent = nullptr;
            }
        }
        return *this;
    }

    inline iterator operator++(int) {
        iterator copy = *this;
        ++(*this);
        return copy;
    }

    inline csv::value_type const &operator*() const {
        return _row;
    }

    inline csv::value_type const *operator->() const {
        return &_row;
    }

    bool operator==(iterator const &other) {
        return (this == &other) or (_parent == nullptr and other._parent == nullptr);
    }
    bool operator!=(iterator const &other) {
        return not (*this == other);
    }
};

template <class... fields>
typename csv<fields...>::iterator csv<fields...>::begin() {
    return iterator(*this);
}

template <class... fields>
typename csv<fields...>::iterator csv<fields...>::end() {
    return iterator();
}

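I put a tiny working example on GitHub; I've been using it for parsing some numerical data and it served its purpose.

1 vote MrPisarik 1/29/2016

You may not care about inlining, because most compilers decide it on their own. At least I'm sure of that for Visual C++. It can inline a method independently of your method specification.

1 vote Pietro Saccardi 1/29/2016

That's precisely why I marked them explicitly. Gcc and Clang, the ones I mostly use, have their own conventions as well. The "inline" keyword should be just a hint.

2 votes nikos_k 10/16/2016 #26

Since I'm not used to Boost right now, I will suggest a simpler solution. Let's suppose that your .csv file has 100 lines with 10 numbers in each line, separated by a ','. You can load this data into an array with the following code: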

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    int A[100][10];
    ifstream ifs;
    ifs.open("name_of_file.csv");
    string s1;
    char c;
    for(int k=0; k<100; k++)
    {
        getline(ifs,s1);
        stringstream stream(s1);
        int j=0;
        while(1)
        {
            stream >>A[k][j];
            stream >> c;
            j++;
            if(!stream) {break;}
        }
    }


}
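
The array above fixes the file shape at 100 by 10. When the number of values per line is not known in advance, the same stream-extraction idea can fill a std::vector instead. This is a hedged sketch, not part of the original answer; parseIntLine is a hypothetical helper and it assumes plain integers separated by commas.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Parse one line of comma-separated integers, e.g. "1,2,3".
// Parsing stops at the first cell that is not an integer.
std::vector<int> parseIntLine(const std::string& line)
{
    std::vector<int> values;
    std::stringstream stream(line);
    int value;
    char separator;
    while (stream >> value) {
        values.push_back(value);
        // Consume the comma; stop when the line ends or something else follows.
        if (!(stream >> separator) || separator != ',') {
            break;
        }
    }
    return values;
}
```

Calling it once per std::getline result and pushing each returned vector into a std::vector<std::vector<int>> gives a table whose size follows the file rather than the other way round.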

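4 votes d99kris 5/29/2017 #27

I needed an easy-to-use C++ library for parsing CSV files but couldn't find any available, so I ended up building one. Rapidcsv is a C++11 header-only library that gives direct access to parsed columns (or rows) as vectors, in the datatype of your choice. For example: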

#include <iostream>
#include <vector>
#include <rapidcsv.h>

int main()
{
  rapidcsv::Document doc("../tests/msft.csv");

  std::vector<float> close = doc.GetColumn<float>("Close");
  std::cout << "Read " << close.size() << " values." << std::endl;
}

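1 vote Maksym Ganenko 7/23/2017

Nice work, but the library doesn't work properly if the header has empty labels. That's typical for an Excel/LibreOffice NxN table. Also, it may skip the last line of data. Unfortunately, your library is not robust.

1 vote d99kris 7/24/2017

Thanks for the feedback, @MaksymGanenko. I've fixed the "last line of data" bug for last lines without a trailing line break. As for the other issue mentioned, "headers with empty labels", I'm not sure what it refers to. The library should handle empty labels (both quoted and unquoted). It can also read CSV without a header row/column, but then it requires the user to specify this (column title ID -1 and row title ID -1). Please provide more details or report a bug on the GitHub page if there is a specific use case you'd like to see supported. Thanks!

15 votes m0meni 5/30/2017 #28

I wrote a header-only C++11 CSV parser. It's well tested, fast, supports the entire CSV spec (quoted fields, delimiters/terminators inside quotes, quote escaping, etc.), and is configurable to account for CSVs that don't adhere to the specification.

Configuration is done through a fluent interface: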

// constructor accepts any input stream
CsvParser parser = CsvParser(std::cin)
  .delimiter(';')    // delimited by ; instead of ,
  .quote('\'')       // quoted fields use ' instead of "
  .terminator('\0'); // terminated by \0 instead of by \r\n, \n, or \r

Parsing is just a range-based for loop:

#include <fstream>
#include <iostream>
#include "../parser.hpp"

using namespace aria::csv;

int main() {
  std::ifstream f("some_file.csv");
  CsvParser parser(f);

  for (auto& row : parser) {
    for (auto& field : row) {
      std::cout << field << " | ";
    }
    std::cout << std::endl;
  }
}

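2 votes Maksym Ganenko 7/23/2017

Nice work, but you need to add three more things: (1) read the header, (2) provide field indexing by name, (3) don't reallocate memory in the loop; reuse the same vector of strings.

0 votes m0meni 7/24/2017

@MaksymGanenko I do #3. Could you elaborate on #2?

1 vote Maksym Ganenko 7/24/2017

It's very useful to get fields not by position in a row, but by the name given in the header (the first row of the CSV table). For example, I expect a CSV table with a "Date" field, but I don't know what the index of the "Date" field in a row is.

2 votes m0meni 7/25/2017

@MaksymGanenko ah, I see what you mean. There is github.com/ben-strasser/fast-cpp-csv-parser for when you know the columns of your CSV at compile time, and it's probably better than mine. What I wanted was a CSV parser for cases where you want to use the same code for many different CSVs and don't know what they look like ahead of time. So I probably won't add #2, but I will add #1 sometime in the future.

2 votes vadamsky 6/30/2017 #29

You can use this library: https://github.com/vadamsky/csvworker

Example code: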

#include <iostream>
#include "csvworker.h"

using namespace std;

int main()
{
    //
    CsvWorker csv;
    csv.loadFromFile("example.csv");
    cout << csv.getRowsNumber() << "  " << csv.getColumnsNumber() << endl;

    csv.getFieldRef(0, 2) = "0";
    csv.getFieldRef(1, 1) = "0";
    csv.getFieldRef(1, 3) = "0";
    csv.getFieldRef(2, 0) = "0";
    csv.getFieldRef(2, 4) = "0";
    csv.getFieldRef(3, 1) = "0";
    csv.getFieldRef(3, 3) = "0";
    csv.getFieldRef(4, 2) = "0";

    for(unsigned int i=0;i<csv.getRowsNumber();++i)
    {
        //cout << csv.getRow(i) << endl;
        for(unsigned int j=0;j<csv.getColumnsNumber();++j)
        {
            cout << csv.getField(i, j) << ".";
        }
        cout << endl;
    }

    csv.saveToFile("test.csv");

    //
    CsvWorker csv2(4,4);

    csv2.getFieldRef(0, 0) = "a";
    csv2.getFieldRef(0, 1) = "b";
    csv2.getFieldRef(0, 2) = "r";
    csv2.getFieldRef(0, 3) = "a";
    csv2.getFieldRef(1, 0) = "c";
    csv2.getFieldRef(1, 1) = "a";
    csv2.getFieldRef(1, 2) = "d";
    csv2.getFieldRef(2, 0) = "a";
    csv2.getFieldRef(2, 1) = "b";
    csv2.getFieldRef(2, 2) = "r";
    csv2.getFieldRef(2, 3) = "a";

    csv2.saveToFile("test2.csv");

    return 0;
}

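0 votes ferdymercury 9/23/2020

Another interesting library is github.com/roman-kashitsyn/text-csv

0 votes okeyla 1/4/2023

I get an error: "binary '<<': no operator found which takes a right-hand operand of type 'row' (or there is no acceptable conversion)". Is there any solution?

3 votes jav 7/31/2017 #30

You have to feel proud when you use something as beautiful as boost::spirit.

Here is my attempt at a parser that (almost) complies with the CSV specification (I didn't need line breaks within fields; also, spaces around the commas are discarded).

After you overcome the shocking experience of waiting 10 seconds :) for this code to compile, you can sit back and enjoy.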

// csvparser.cpp
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>

#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;
namespace bascii = boost::spirit::ascii;

template <typename Iterator>
struct csv_parser : qi::grammar<Iterator, std::vector<std::string>(), 
    bascii::space_type>
{
    qi::rule<Iterator, char()                                           > COMMA;
    qi::rule<Iterator, char()                                           > DDQUOTE;
    qi::rule<Iterator, std::string(),               bascii::space_type  > non_escaped;
    qi::rule<Iterator, std::string(),               bascii::space_type  > escaped;
    qi::rule<Iterator, std::string(),               bascii::space_type  > field;
    qi::rule<Iterator, std::vector<std::string>(),  bascii::space_type  > start;

    csv_parser() : csv_parser::base_type(start)
    {
        using namespace qi;
        using qi::lit;
        using qi::lexeme;
        using bascii::char_;

        start       = field % ',';
        field       = escaped | non_escaped;
        escaped     = lexeme['"' >> *( char_ -(char_('"') | ',') | COMMA | DDQUOTE)  >> '"'];
        non_escaped = lexeme[       *( char_ -(char_('"') | ',')                  )        ];
        DDQUOTE     = lit("\"\"")       [_val = '"'];
        COMMA       = lit(",")          [_val = ','];
    }

};

int main()
{
    std::cout << "Enter CSV lines [empty] to quit\n";

    using bascii::space;
    typedef std::string::const_iterator iterator_type;
    typedef csv_parser<iterator_type> csv_parser;

    csv_parser grammar;
    std::string str;
    int fid;
    while (getline(std::cin, str))
    {
        fid = 0;

        if (str.empty())
            break;

        std::vector<std::string> csv;
        std::string::const_iterator it_beg = str.begin();
        std::string::const_iterator it_end = str.end();
        bool r = phrase_parse(it_beg, it_end, grammar, space, csv);

        if (r && it_beg == it_end)
        {
            std::cout << "Parsing succeeded\n";
            for (auto& field: csv)
            {
                std::cout << "field " << ++fid << ": " << field << std::endl;
            }
        }
        else
        {
            std::cout << "Parsing failed\n";
        }
    }

    return 0;
}

Compile:

make csvparser

Test (example stolen from Wikipedia):

./csvparser
Enter CSV lines [empty] to quit

1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00
Parsing succeeded
field 1: 1999
field 2: Chevy
field 3: Venture "Extended Edition, Very Large"
field 4: 
field 5: 5000.00

1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00"
Parsing failed

3 votes Pedro Vicente 9/1/2017 #31

This solution detects the following 4 cases.

The complete class is at

https://github.com/pedro-vicente/csv-parser

1,field 2,field 3,
1,field 2,"field 3 quoted, with separator",
1,field 2,"field 3
with newline",
1,field 2,"field 3
with newline and separator,",

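It reads the file character by character and reads 1 row at a time into a vector (of strings), so it is suitable for very large files.

Usage is

Iterate until an empty row is returned (end of file). A row is a vector in which each entry is a CSV column.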

read_csv_t csv;
csv.open("../test.csv");
std::vector<std::string> row;
while (true)
{
  row = csv.read_row();
  if (row.size() == 0)
  {
    break;
  }
}

The class declaration

class read_csv_t
{
public:
  read_csv_t();
  int open(const std::string &file_name);
  std::vector<std::string> read_row();
private:
  std::ifstream m_ifs;
};

The implementation

std::vector<std::string> read_csv_t::read_row()
{
  bool quote_mode = false;
  std::vector<std::string> row;
  std::string column;
  char c;
  while (m_ifs.get(c))
  {
    switch (c)
    {
      /////////////////////////////////////////////////////////////////////////////////////////////////////
      //separator ',' detected. 
      //in quote mode add character to column
      //push column if not in quote mode
      /////////////////////////////////////////////////////////////////////////////////////////////////////

    case ',':
      if (quote_mode == true)
      {
        column += c;
      }
      else
      {
        row.push_back(column);
        column.clear();
      }
      break;

      /////////////////////////////////////////////////////////////////////////////////////////////////////
      //quote '"' detected. 
      //toggle quote mode
      /////////////////////////////////////////////////////////////////////////////////////////////////////

    case '"':
      quote_mode = !quote_mode;
      break;

      /////////////////////////////////////////////////////////////////////////////////////////////////////
      //line end detected
      //in quote mode add character to column
      //return row if not in quote mode
      /////////////////////////////////////////////////////////////////////////////////////////////////////

    case '\n':
    case '\r':
      if (quote_mode == true)
      {
        column += c;
      }
      else
      {
        return row;
      }
      break;

      /////////////////////////////////////////////////////////////////////////////////////////////////////
      //default, add character to column
      /////////////////////////////////////////////////////////////////////////////////////////////////////

    default:
      column += c;
      break;
    }
  }

  //return empty vector if end of file detected 
  m_ifs.close();
  std::vector<std::string> v;
  return v;
}

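0 votes victimofleisure 12/19/2018 #32

If you're using Visual Studio / MFC, the following solution may make your life easier. It supports both Unicode and MBCS, has comments, has no dependencies other than CString, and works well enough for me. It doesn't support line breaks embedded within a quoted string, but I don't care as long as it doesn't crash in that case, which it doesn't.

The overall strategy is to handle quoted and empty strings as special cases and use Tokenize for the rest. For quoted strings, the strategy is to find the real closing quote, keeping track of whether pairs of consecutive quotes were encountered. If they were, use Replace to convert the pairs to singles. No doubt there are more efficient methods, but performance wasn't critical enough in my case to justify further optimization.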

class CParseCSV {
public:
// Construction
    CParseCSV(const CString& sLine);

// Attributes
    bool    GetString(CString& sDest);

protected:
    CString m_sLine;    // line to extract tokens from
    int     m_nLen;     // line length in characters
    int     m_iPos;     // index of current position
};

CParseCSV::CParseCSV(const CString& sLine) : m_sLine(sLine)
{
    m_nLen = m_sLine.GetLength();
    m_iPos = 0;
}

bool CParseCSV::GetString(CString& sDest)
{
    if (m_iPos < 0 || m_iPos > m_nLen)  // if position out of range
        return false;
    if (m_iPos == m_nLen) { // if at end of string
        sDest.Empty();  // return empty token
        m_iPos = -1;    // really done now
        return true;
    }
    if (m_sLine[m_iPos] == '\"') {  // if current char is double quote
        m_iPos++;   // advance to next char
        int iTokenStart = m_iPos;
        bool    bHasEmbeddedQuotes = false;
        while (m_iPos < m_nLen) {   // while more chars to parse
            if (m_sLine[m_iPos] == '\"') {  // if current char is double quote
                // if next char exists and is also double quote
                if (m_iPos < m_nLen - 1 && m_sLine[m_iPos + 1] == '\"') {
                    // found pair of consecutive double quotes
                    bHasEmbeddedQuotes = true;  // request conversion
                    m_iPos++;   // skip first quote in pair
                } else  // next char doesn't exist or is normal
                    break;  // found closing quote; exit loop
            }
            m_iPos++;   // advance to next char
        }
        sDest = m_sLine.Mid(iTokenStart, m_iPos - iTokenStart);
        if (bHasEmbeddedQuotes) // if string contains embedded quote pairs
            sDest.Replace(_T("\"\""), _T("\""));    // convert pairs to singles
        m_iPos += 2;    // skip closing quote and trailing delimiter if any
    } else if (m_sLine[m_iPos] == ',') {    // else if char is comma
        sDest.Empty();  // return empty token
        m_iPos++;   // advance to next char
    } else {    // else get next comma-delimited token
        sDest = m_sLine.Tokenize(_T(","), m_iPos);
    }
    return true;
}

// calling code should look something like this:

    CStdioFile  fIn(pszPath, CFile::modeRead);
    CString sLine, sToken;
    while (fIn.ReadString(sLine)) { // for each line of input file
        if (!sLine.IsEmpty()) { // ignore blank lines
            CParseCSV   csv(sLine);
            while (csv.GetString(sToken)) {
                // do something with sToken here
            }
        }
    }
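
For readers without MFC, the same strategy (treat quoted fields specially, look for the real closing quote, collapse doubled quotes) can be sketched with std::string alone. This is an illustration under the same limitation as above, namely that a line holds a complete record; splitQuotedLine is a made-up name, not part of the answer's class.

```cpp
#include <string>
#include <vector>

// Split one CSV line into fields. A field that starts with '"' runs to the
// closing quote, where a doubled quote ("") stands for one literal quote.
// Assumption: the line holds a complete record (no embedded line breaks).
std::vector<std::string> splitQuotedLine(const std::string& line)
{
    const std::string::size_type npos = std::string::npos;
    std::vector<std::string> fields;
    std::string::size_type pos = 0;
    while (true) {
        std::string field;
        if (pos < line.size() && line[pos] == '"') {
            ++pos;                                   // skip the opening quote
            while (pos < line.size()) {
                if (line[pos] == '"') {
                    if (pos + 1 < line.size() && line[pos + 1] == '"') {
                        field += '"';                // "" becomes "
                        pos += 2;
                        continue;
                    }
                    ++pos;                           // skip the closing quote
                    break;
                }
                field += line[pos++];
            }
            pos = line.find(',', pos);               // move to the next delimiter
        } else {
            const std::string::size_type comma = line.find(',', pos);
            field = line.substr(pos, comma == npos ? npos : comma - pos);
            pos = comma;
        }
        fields.push_back(field);
        if (pos == npos) {
            break;                                   // no delimiter left
        }
        ++pos;                                       // step over the comma
    }
    return fields;
}
```

Unlike the Replace-based approach, this version builds each field character by character, so doubled quotes are collapsed in the same pass that finds the closing quote.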

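0 votes Jack Of Blades 12/20/2018 #33

I have a faster solution, originally written for this question:

How to pull specific part of different strings?

But it apparently got closed. I'm not going to throw this away, though: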

#include <iostream>
#include <string>
#include <regex>

std::string text = "\"4,\"\"3\"\",\"\"Mon May 11 03:17:40 UTC 2009\"\",\"\"kindle2\"\",\"\"tpryan\"\",\"\"TEXT HERE\"\"\";;;;";

int main()
{
    std::regex r("(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")(\".*\")");
    std::smatch m;
    std::regex_search(text, m, r);
    std::cout<<"FOUND: "<<m[9]<<std::endl;

    return 0;
}

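Just pick out the match you want from the smatch collection by index. Regex is bliss.

0 votes Luke Usherwood 4/3/2019

A haiku comes to mind: You have a problem. You solve it with reg ex. Now two problems. :-D

2 votes Olikay Gokce 1/24/2020 #34

Parsing CSV file lines with a stream

I wrote a small example of parsing CSV file lines; it can be developed further with for and while loops if desired: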

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main() {


ifstream fin("Infile.csv");
ofstream fout("OutFile.csv");
string strline, strremain, strCol1 , strout;

string delimeter =";";

int d1;

To continue until the end of the file:

while (!fin.eof()){

Get the next line from the input file:

    getline(fin,strline,'\n');

Find the delimiter position in the line:

    d1 = strline.find(';');

And parse the first column:

    strCol1 = strline.substr(0,d1); // parse first Column
    d1++;
    strremain = strline.substr(d1); // remaining line

Create the output line in CSV format:

    strout = strCol1; // start a fresh output line; append() here would pile up earlier lines
    strout.append(delimeter);

Write the line to the output file:

    fout << strout << endl; //out file line

} 

fin.close();
fout.close();

return(0);
}

This code compiles and runs. Good luck!
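
The example above extracts only the first column. As a possible extension, here is a sketch of the loop that walks the whole line with find and substr; splitColumns is a hypothetical helper and, like the original, it does not handle quoted fields.

```cpp
#include <string>
#include <vector>

// Split `line` at every occurrence of `delimiter`, keeping empty columns.
std::vector<std::string> splitColumns(const std::string& line, char delimiter)
{
    std::vector<std::string> columns;
    std::string::size_type start = 0;
    std::string::size_type end;
    while ((end = line.find(delimiter, start)) != std::string::npos) {
        columns.push_back(line.substr(start, end - start));
        start = end + 1; // continue right after the delimiter
    }
    columns.push_back(line.substr(start)); // last column (may be empty)
    return columns;
}
```

With this helper, the body of the read loop could call splitColumns(strline, ';') and pick whichever columns it needs by index.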

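0 votes Romain Laneuville 2/6/2020 #35

Since everybody has posted a solution, here is mine using templates, lambdas and tuples.

It can transform any CSV with the desired columns into a C++ vector of tuples.

It works by defining each CSV row element type in a tuple.

You also need to define, for each element, the Formatter lambda that converts a std::string to the element type (using std::strtod, for example).

Then you get a vector of this structure that corresponds to your CSV data.

You can easily reuse it to match any CSV structure.

StringHelpers.hpp

#include <string>
#include <fstream>
#include <vector>
#include <functional>
#include <tuple>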

namespace StringHelpers
{
    template<typename Tuple>
    using Formatter = std::function<Tuple(const std::vector<std::string> &)>;

    std::vector<std::string> split(const std::string &string, const std::string &delimiter);

    template<typename Tuple>
    std::vector<Tuple> readCsv(const std::string &path, const std::string &delimiter, Formatter<Tuple> formatter);
};

StringHelpers.cpp

#include "StringHelpers.hpp"

namespace StringHelpers
{
    /**
     * Split a string with the given delimiter into several strings
     *
     * @param string - The string to extract the substrings from
     * @param delimiter - The substrings delimiter
     *
     * @return The substrings
     */
    std::vector<std::string> split(const std::string &string, const std::string &delimiter)
    {
        std::vector<std::string> result;
        size_t                   last = 0,
                                 next = 0;

        while ((next = string.find(delimiter, last)) != std::string::npos) {
            result.emplace_back(string.substr(last, next - last));
            last = next + delimiter.length(); // skip the whole delimiter, not just one character
        }

        result.emplace_back(string.substr(last));

        return result;
    }

    /**
     * Read a CSV file and store its values into the given structure (Tuple with Formatter constructor)
     *
     * @tparam Tuple - The CSV line structure format
     *
     * @param path - The CSV file path
     * @param delimiter - The CSV values delimiter
     * @param formatter - The CSV values formatter that take a vector of strings in input and return a Tuple
     *
     * @return The CSV as vector of Tuple
     */
    template<typename Tuple>
    std::vector<Tuple> readCsv(const std::string &path, const std::string &delimiter, Formatter<Tuple> formatter)
    {
        std::ifstream      file(path, std::ifstream::in);
        std::string        line;
        std::vector<Tuple> result;

        if (file.fail()) {
            throw std::runtime_error("The file " + path + " could not be opened");
        }

        while (std::getline(file, line)) {
            result.emplace_back(formatter(split(line, delimiter)));
        }

        file.close();

        return result;
    }

    // Forward template declarations

    template std::vector<std::tuple<double, double, double>> readCsv<std::tuple<double, double, double>>(const std::string &, const std::string &, Formatter<std::tuple<double, double, double>>);
} // End of StringHelpers namespace

main.cpp (some usage)

#include "StringHelpers.hpp"

/**
 * Example of use with a CSV file which have (number,Red,Green,Blue) as line values. We do not want to use the 1st value
 * of the line.
 */
int main(int argc, char **argv)
{
    // Declare CSV line type, formatter and template type
    typedef std::tuple<double, double, double>                          CSV_format;
    typedef std::function<CSV_format(const std::vector<std::string> &)> formatterT;

    enum RGB { Red = 1, Green, Blue };

    const std::string COLOR_MAP_PATH = "/some/absolute/path";

    // Load the color map
    auto colorMap = StringHelpers::readCsv<CSV_format>(COLOR_MAP_PATH, ",", [](const std::vector<std::string> &values) {
        return CSV_format {
                // Here is the formatter lambda that convert each value from string to what you want
                std::strtod(values[Red].c_str(), nullptr),
                std::strtod(values[Green].c_str(), nullptr),
                std::strtod(values[Blue].c_str(), nullptr)
        };
    });

    // Use your colorMap as you  wish...
}

0 votes user2891006 2/7/2020 #36

A minor variation of @sastanin's solution, so that it can handle newlines within quotes.

std::vector<std::vector<std::string>> readCSV(std::istream &in) {
    std::vector<std::vector<std::string>> table;
    CSVState state = CSVState::UnquotedField;
    std::vector<std::string> fields {""};
    size_t i = 0; // index of the current field
    char c;
    // Read character by character so that a newline inside quotes
    // stays in the field instead of ending the row.
    while (in.get(c)) {
        switch (state) {
            case CSVState::UnquotedField:
                switch (c) {
                    case ',': // end of field
                              fields.push_back(""); i++;
                              break;
                    case '"': state = CSVState::QuotedField;
                              break;
                    case '\n': // end of row
                              table.push_back(fields);
                              fields = std::vector<std::string>{""};
                              i = 0;
                              break;
                    default:  fields[i].push_back(c);
                              break; }
                break;
            case CSVState::QuotedField:
                switch (c) {
                    case '"': state = CSVState::QuotedQuote;
                              break;
                    default:  fields[i].push_back(c); // includes '\n'
                              break; }
                break;
            case CSVState::QuotedQuote:
                switch (c) {
                    case ',': // , after closing quote
                              fields.push_back(""); i++;
                              state = CSVState::UnquotedField;
                              break;
                    case '"': // "" -> "
                              fields[i].push_back('"');
                              state = CSVState::QuotedField;
                              break;
                    case '\n': // end of row right after a closing quote
                              table.push_back(fields);
                              state = CSVState::UnquotedField;
                              fields = std::vector<std::string>{""};
                              i = 0;
                              break;
                    default:  // end of quote
                              state = CSVState::UnquotedField;
                              break; }
                break;
        }
    }
    // Keep a final row that is not terminated by a newline.
    if (i > 0 || !fields[0].empty())
        table.push_back(fields);
    return table;
}

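7 votes Alexander 2/22/2021 #37

You can use the header-only Csv::Parser library.

It fully supports RFC 4180, including quoted values, escaped quotes, and newlines in field values.
It requires only standard C++ (C++17).
It supports reading CSV data from std::string_view at compile time.
It is extensively tested using Catch2.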
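
To give an idea of what compile-time reading from std::string_view can look like, here is a minimal sketch in plain C++17. It does not use Csv::Parser and says nothing about that library's actual API; countFields is a hypothetical helper made up for this illustration. It only counts the fields of a single record while respecting quotes.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (NOT part of Csv::Parser): counts the fields in one
// CSV record, treating commas inside quoted fields as ordinary text.
constexpr std::size_t countFields(std::string_view record) {
    std::size_t count = 1;      // an empty record still holds one (empty) field
    bool inQuotes = false;
    for (std::size_t i = 0; i < record.size(); ++i) {
        const char c = record[i];
        if (c == '"') {
            // A doubled quote ("") toggles twice, so the state is unchanged.
            inQuotes = !inQuotes;
        } else if (c == ',' && !inQuotes) {
            ++count;
        }
    }
    return count;
}

// Evaluated by the compiler, not at run time.
static_assert(countFields("a,b,c") == 3, "plain fields");
static_assert(countFields("a,\"b,still b\",c") == 3, "comma inside quotes");
static_assert(countFields("\"say \"\"hi\"\", ok\",x") == 2, "doubled quotes");
```

If one of the static_assert conditions were false, the build would fail, which is the main appeal of parsing constant CSV data at compile time.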

-1 votes Exlife 3/29/2021 #38

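CSV files are text files made up of lines, and each line is made up of tokens separated by commas. There are a few things you should know when parsing them:

(0) The file is encoded using the "CP_ACP" code page. You should decode the file contents with the same code page.

(1) CSV loses "composite cell" information (such as rowspan > 1), so when the file is read back into Excel, that information is gone.

(2) Cell text can be quoted with a " at its start and end, and a literal quote character inside it is written as two quote characters. The closing quote is therefore a quote character that is NOT followed by another quote character. For example, a cell that contains a comma must be quoted in CSV, because the comma has a meaning in CSV.

(3) When the cell content spans multiple lines, it is quoted in the CSV. In that case the parser must keep reading the following consecutive lines of the file until it reaches the closing quote that matches the opening one, so that the current logical line is complete before its tokens are parsed.

For example, in a CSV file the following 3 physical lines form one logical line made up of 3 tokens: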

    --+----------
    1 |a,"b-first part
    2 |b-second part
    3 |b-third part",c
    --+----------

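1 vote Deoclecio Freire 5/24/2022 #39

If I may, here is my simple and quick contribution. No Boost.

It accepts separators inside delimiters, and delimiters inside delimiters as long as they come in pairs or are not next to a separator.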

#include <iostream>
#include <string>
#include <vector>
#include <fstream>

std::vector<std::string> SplitCSV(const std::string &data, char separator, char delimiter)
{
  std::vector<std::string> Values;
  std::string Val = "";
  bool VDel = false; // Is within delimiter?
  size_t CDel = 0; // Delimiters counter within delimiters.
  const char *C = data.c_str();
  size_t P = 0;
  do
  {
    if ((Val.length() == 0) && (C[P] == delimiter))
    {
      VDel = !VDel;
      CDel = 0;
      P++;
      continue;
    }
    if (VDel)
    {
      if (C[P] == delimiter)
      {
        if (((CDel % 2) == 0) && ( (C[P+1] == separator) || (C[P+1] == 0) || (C[P+1] == '\n') || (C[P+1] == '\r') ))
        {
          VDel = false;
          CDel = 0;
          P++;
          continue;
        }
        else
          CDel++;
      }
    }
    else
    {
      if (C[P] == separator)
      {
        Values.push_back(Val);
        Val = "";
        P++;
        continue;
      }
      if ((C[P] == 0) || (C[P] == '\n') || (C[P] == '\r'))
        break;
    }
    Val += C[P];
    P++;
  } while(P < data.length());
  Values.push_back(Val);
  return Values;
}

bool ReadCsv(const std::string &fname, std::vector<std::vector<std::string>> &data,
  char separator = ',', char delimiter = '\"')
{
  bool Ret = false;
  std::ifstream FCsv(fname);
  if (FCsv)
  {
    FCsv.seekg(0, FCsv.end);
    size_t Siz = FCsv.tellg();
    if (Siz > 0)
    {
      FCsv.seekg(0);
      data.clear();
      std::string Line;
      while (getline(FCsv, Line, '\n'))
        data.push_back(SplitCSV(Line, separator, delimiter));
      Ret = true;
    }
    FCsv.close();
  }
  return Ret;
}

int main(int argc, char *argv[])
{
  std::vector<std::vector<std::string>> Data;
  ReadCsv("fsample.csv", Data);
  return 0;
}
