Asked by: user8229029 Asked: 9/26/2022 Last edited by: user8229029 Updated: 9/27/2022 Views: 37
Parsing xml data in BGG xml api with R
Q:
This question is part two of an earlier question: How to parse XML lists and tables for the BGG API in R.
I want to produce a data frame from this table:
<marketplacelistings>
<listing>
<listdate>Thu, 19 Jan 2006 22:08:15 +0000</listdate>
<price currency="EUR">90.00</price>
<condition>likenew</condition>
<notes>Siedler von Catan / Settlers of Catan-Set (Basisspiel/basic game + Erweiterungen Die Seefahrer/ Städte und Ritter/ 5-6 Spieler / extensions The Seafarers/ Cities and Knights/ 5-6 players); 3 x gespielt (Neuwertig; lediglich alle Bestandteile in EINER der Originalboxen verstaut) / 3 times played (like new; only all items in ONE original box stored); Abgabe nur komplett / selling only all together; KEIN Festpreis (nur um überhaupt etwas einzugeben) – erwarte Angebot! / no fixed price (just to complete the entries)– make an offer; Versand weltweit zu Lasten Käufer / shipping worldwide, paid by buyer</notes>
<link href="https://boardgamegeek.com/market/product/40605" title="marketlisting"/>
</listing>
<listing>
<listdate>Mon, 29 Sep 2008 15:25:32 +0000</listdate>
<price currency="USD">34.95</price>
<condition>new</condition>
<notes>Brand New Sealed Board Game. Released from MayFair Games. Price is in USD. If you wish to pay in CAD...then we will convert at market rate. Shipping is $10.95 USD. We also carry the 5-6 Player Expansion that goes with this for $24.95 USD. We have sold thousands of board games across Canada. Please feel free to buy with confidence.</notes>
<link href="https://boardgamegeek.com/market/product/116347" title="marketlisting"/>
</listing>
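This is where I don't know what to do. There are about 100 listings for this game, and I'd like to make a data frame from them. Where do I start? The code below doesn't work because it returns NULL.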
listings_df <- do.call(rbind, lapply(
  getNodeSet(xmltop, '//marketplacelistings'),
  function(x) data.frame(
    XML:::xmlAttrsToDataFrame(xmlChildren(x)),
    row.names = NULL
  )))
Edit: Here is my final solution. It may not be elegant, but it works.
marketplace_df_func <- function(xmltop) {
  marketplace_df <- data.frame(
    listdate = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//listdate"), xmlValue),
    currency = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//price[@currency]"), xmlAttrs),
    price = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//price"), xmlValue),
    condition = xmlSApply(getNodeSet(xmltop, "//marketplacelistings//listing//condition"), xmlValue))
  # Keep only the first 25 characters of the date string (drops the UTC offset)
  marketplace_df$listdate <- substr(marketplace_df$listdate, 1, 25)
  return(marketplace_df)
}
A:
1 upvote
Parfait
9/26/2022
#1
Since this XML now holds more of its data in elements than in attributes, simply run the convenient xmlToDataFrame directly, without an lapply loop:
library(XML)
url <- "..."
doc <- xmlParse(readLines(url))
listings_df <- xmlToDataFrame(doc, nodes = getNodeSet(doc, "//listing"))
str(listings_df)
# 'data.frame': 103 obs. of 5 variables:
# $ listdate : chr "Thu, 19 Jan 2006 22:08:15 +0000" "Mon, 29 Sep 2008 15:25:32 +0000" "Sat, 18 Jul 2009 20:42:03 +0000" "Fri, 04 Dec 2009 14:25:25 +0000" ...
# $ price : chr "90.00" "34.95" "49.00" "40.00" ...
# $ condition: chr "likenew" "new" "verygood" "new" ...
# $ notes : chr "Siedler von Catan / Settlers of Catan-Set (Basisspiel/basic game + Erweiterungen Die Seefahrer/ Städte und Rit"| __truncated__ "Brand New Sealed Board Game. Released from MayFair Games. Price is in USD. If you wish to pay in CAD...then w"| __truncated__ "inlcudes 5/6 player expansion" "" ...
# $ link : chr "" "" "" "" ...
To bind the underlying attributes as well, use the special, non-exported method XML:::xmlAttrsToDataFrame:
listings_df <- data.frame(
  xmlToDataFrame(doc, nodes = getNodeSet(doc, "//listing")),
  XML:::xmlAttrsToDataFrame(getNodeSet(doc, "//listing/price")),
  XML:::xmlAttrsToDataFrame(getNodeSet(doc, "//listing/link")),
  row.names = NULL
)
str(listings_df)
# 'data.frame': 103 obs. of 8 variables:
# $ listdate : chr "Thu, 19 Jan 2006 22:08:15 +0000" "Mon, 29 Sep 2008 15:25:32 +0000" "Sat, 18 Jul 2009 20:42:03 +0000" "Fri, 04 Dec 2009 14:25:25 +0000" ...
# $ price : chr "90.00" "34.95" "49.00" "40.00" ...
# $ condition: chr "likenew" "new" "verygood" "new" ...
# $ notes : chr "Siedler von Catan / Settlers of Catan-Set (Basisspiel/basic game + Erweiterungen Die Seefahrer/ Städte und Rit"| __truncated__ "Brand New Sealed Board Game. Released from MayFair Games. Price is in USD. If you wish to pay in CAD...then w"| __truncated__ "inlcudes 5/6 player expansion" "" ...
# $ link : chr "" "" "" "" ...
# $ currency : chr "EUR" "USD" "EUR" "EUR" ...
# $ href : chr "https://boardgamegeek.com/market/product/40605" "https://boardgamegeek.com/market/product/116347" "https://boardgamegeek.com/market/product/158433" "https://boardgamegeek.com/market/product/181379" ...
# $ title : chr "marketlisting" "marketlisting" "marketlisting" "marketlisting" ...
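As a rough, self-contained sketch of the same idea, the following uses only exported functions (xpathSApply, xmlValue, xmlGetAttr) instead of the non-exported XML:::xmlAttrsToDataFrame. It runs against a two-listing inline sample modeled on the excerpt in the question. The names xml_text and sample_df and the shortened notes are made up for illustration. It also assumes every listing contains exactly one of each element, since otherwise the columns would not line up.

```r
library(XML)

# Hypothetical two-listing sample modeled on the question's excerpt (notes shortened).
xml_text <- '<marketplacelistings>
  <listing>
    <listdate>Thu, 19 Jan 2006 22:08:15 +0000</listdate>
    <price currency="EUR">90.00</price>
    <condition>likenew</condition>
    <notes>sample note one</notes>
    <link href="https://boardgamegeek.com/market/product/40605" title="marketlisting"/>
  </listing>
  <listing>
    <listdate>Mon, 29 Sep 2008 15:25:32 +0000</listdate>
    <price currency="USD">34.95</price>
    <condition>new</condition>
    <notes>sample note two</notes>
    <link href="https://boardgamegeek.com/market/product/116347" title="marketlisting"/>
  </listing>
</marketplacelistings>'

# Parse from a string instead of a URL.
doc <- xmlParse(xml_text, asText = TRUE)

# One XPath query per column; element text via xmlValue, attributes via xmlGetAttr.
# Assumes each <listing> has exactly one of each child element.
sample_df <- data.frame(
  listdate  = xpathSApply(doc, "//listing/listdate", xmlValue),
  price     = as.numeric(xpathSApply(doc, "//listing/price", xmlValue)),
  currency  = xpathSApply(doc, "//listing/price", xmlGetAttr, "currency"),
  condition = xpathSApply(doc, "//listing/condition", xmlValue),
  href      = xpathSApply(doc, "//listing/link", xmlGetAttr, "href"),
  stringsAsFactors = FALSE
)

str(sample_df)
```

Converting price with as.numeric here is a design choice for the sketch; the answer's version leaves it as character.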