Asked by: Humberto R Asked: 8/24/2023 Last edited by: MarkHumberto R Updated: 8/26/2023 Views: 38
R String Match anywhere and replace
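Q:
I have a vector of strings from a survey; it holds each respondent's job position. Some of the responses are variants such as CEO, ceo, Ceo, CEO/Owner, and CEO/Founder.
I want to replace any string that contains the word ceo (upper case, lower case, with a trailing space, with a leading space) with CEO.
I have tried the code below. It replaces some of them, but not all.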
aiprm$job_title <- gsub("ceo|CEO|owner|Ceo|Owner|executive|CEO |CEo|CE0|CEO/CEO|\\ceo|\\CEO", "CEO",aiprm$job_title)
Some are still missed, such as: Broker/CEO, Business CEO/Health Coach, CEO and Producer, CEO/Creative Director, CEO/Designer, CEO/Operator.
A:
0 votes
r2evans
8/26/2023
#1
Use grep to find where "ceo" matches (case-insensitively), then replace the whole value.
quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
#     job_title      industry
# 14  CEO            Advertising
# 28  CEO            Marketing Agency
# 64  CEO            Education
# 70  Founder & CEO  AI Consulting
# 81  CEO            ZK ART criations
# 83  CEO            Marketing
# 110 Ceo            Digital marketing
# 111 CEO            Web Design
# 120 CEO            Marketing & Advertisement
# 124 CEO            Trainings
# 125 CEO            Healthcare
# 128 CEO            consultation
# 132 CEO            IT-Services
# 144 Ceo            Media
# 167 CEO            BRANDING AND PRINTING
# 176 CEO            Civil Engineering
# 180 ceo            software
# 195 ceo            marketing digital
# 210 CEO & Producer Comm (Rádio and Audio Producer)
# 217 CEO            services
# 253 CEO            Trucking; Travel, e-commerce
# 256 CEO            Home
# 262 President, CEO Management Consulting
# 272 CEO            Short Term Rentals and Hospitality
# 280 ceo            eletrônicos
# 285 Ceo/owner      Entrepreneur
# 312 CEO            Nonprofit- services for people with I/DD
# 316 Ceo            Marketing
# 321 Founder & CEO  Digital Media
# 330 CEO            PR
# 333 CEO            Marketing
# 337 ceo            agri
# 359 CEO            Marketing
# 366 CEO            Media
# 378 CEO            IT/SocialNetwork
# 404 CEO            Digital Marketing Agency
# 419 CEO            Publicité
# 431 Ceo            Ceo
# 439 CEO            SaaS
# 442 CEO            Digital Marketing
# 443 CEO            wellness and health, Real State
# 445 Owner/ceo      Disability
# 452 CEO            Advising and Entrepreneurship
# 453 CEO            KCARBONFREE G-W-RBIO
# 458 CEO            Software
quux$job_title[grepl("ceo", quux$job_title, ignore.case = TRUE)] <- "CEO"
quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
#     job_title industry
# 14  CEO       Advertising
# 28  CEO       Marketing Agency
# 64  CEO       Education
# 70  CEO       AI Consulting
# 81  CEO       ZK ART criations
# 83  CEO       Marketing
# 110 CEO       Digital marketing
# 111 CEO       Web Design
# 120 CEO       Marketing & Advertisement
# 124 CEO       Trainings
# 125 CEO       Healthcare
# 128 CEO       consultation
# 132 CEO       IT-Services
# 144 CEO       Media
# 167 CEO       BRANDING AND PRINTING
# 176 CEO       Civil Engineering
# 180 CEO       software
# 195 CEO       marketing digital
# 210 CEO       Comm (Rádio and Audio Producer)
# 217 CEO       services
# 253 CEO       Trucking; Travel, e-commerce
# 256 CEO       Home
# 262 CEO       Management Consulting
# 272 CEO       Short Term Rentals and Hospitality
# 280 CEO       eletrônicos
# 285 CEO       Entrepreneur
# 312 CEO       Nonprofit- services for people with I/DD
# 316 CEO       Marketing
# 321 CEO       Digital Media
# 330 CEO       PR
# 333 CEO       Marketing
# 337 CEO       agri
# 359 CEO       Marketing
# 366 CEO       Media
# 378 CEO       IT/SocialNetwork
# 404 CEO       Digital Marketing Agency
# 419 CEO       Publicité
# 431 CEO       Ceo
# 439 CEO       SaaS
# 442 CEO       Digital Marketing
# 443 CEO       wellness and health, Real State
# 445 CEO       Disability
# 452 CEO       Advising and Entrepreneurship
# 453 CEO       KCARBONFREE G-W-RBIO
# 458 CEO       Software
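A possible reason the gsub attempt in the question leaves entries like Broker/CEO behind is that gsub rewrites only the matched substring, while the grepl-plus-assignment approach overwrites the whole element. The sketch below contrasts the two on a small made-up vector; titles, partial, and whole are illustrative names, not part of the original data.

```r
# Toy vector (made up for illustration); includes shapes the question reports as missed.
titles <- c("CEO", "ceo ", "Broker/CEO", "Ceo & Producer", "Marketing Director", NA)

# gsub() rewrites only the matched substring, so the surrounding text survives.
partial <- gsub("ceo", "CEO", titles, ignore.case = TRUE)

# grepl() flags whole elements; the assignment overwrites each flagged element entirely.
# grepl() returns FALSE for NA, so missing values are left untouched.
whole <- titles
whole[grepl("ceo", whole, ignore.case = TRUE)] <- "CEO"

print(partial)
print(whole)
```

In partial, "Broker/CEO" is still "Broker/CEO"; in whole, it has become "CEO".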
Here is a sample of the data used above; the full data set is too large for a Stack answer.
quux <- structure(list(job_title = c("Marketing Director", "Digital Content Manager", "Owner", "Content Principal", "Chief Consultant", "Managing Director", "Senior SEO Analyst", "Senior SEO Specialist", "Head of Content", "", "", "", "content manager", "CEO", "Head of SEO", "Owner/SEO Consultant", "Snr Manager, SEO + Talent", "Head of Web and Digital Communications", "VP of SEO & Content", "SEO consultant"), industry = c("Sporting Goods", "Marketing and Advertising", "Software", "Tech", "Business technology services", "Marketing", "SEO", "Automotive", "Marketing", "", "", "", "entertainment", "Advertising", "Digital Marketing", "Marketing", "SEO", "Online publishing", "private equity / finance", "SEO")), row.names = c(NA, 20L), class = "data.frame")
Comments
replace(aiprm$job_title, grepl("\\bCEO\\b", aiprm$job_title, ignore.case=TRUE), "CEO")
gsub(".*\\bCEO\\b.*", "CEO", aiprm$job_title, ignore.case=TRUE)
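Both comment variants wrap CEO in \\b word boundaries. A minimal sketch of why that can matter, assuming a made-up vector in which "ceo" also occurs inside a longer word; titles, loose, strict, via_replace, and via_gsub are illustrative names only.

```r
# Made-up titles: "Voiceover" contains the letters "ceo" but is not the word CEO.
titles <- c("Founder & CEO", "Ceo/owner", "Voiceover Artist", "Head of SEO")

# Plain substring match: also hits the "ceo" inside "Voiceover".
loose <- grepl("ceo", titles, ignore.case = TRUE)

# Word-boundary match: "ceo" must stand alone as a word.
strict <- grepl("\\bceo\\b", titles, ignore.case = TRUE)

# The two idioms from the comments, applied to the same vector.
via_replace <- replace(titles, strict, "CEO")
via_gsub <- gsub(".*\\bCEO\\b.*", "CEO", titles, ignore.case = TRUE)

print(data.frame(titles, loose, strict, via_replace, via_gsub))
```

With the boundary in place, "Voiceover Artist" is left alone, while the plain "ceo" pattern would have turned it into "CEO". Whether that stricter behaviour is wanted depends on the actual survey responses.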