Change pyspark.df column value based on match to list item

Q:

I have a df like this:

List	index_in_List
['a', 'b']	0
['c', 'd']	0
['d', 'a']	1

index_in_List was created by looking up the index in List at which the string ("a") matches, using:

df = df.withColumn("index_in_List", (F.array_position(df.List, "a")))

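Unfortunately, the resulting index is 0 both when the match is at index 0 and when "a" is not present. Now I want to change the 0 to None (null) where List[index_in_List] != "a", and otherwise keep the original index_in_List.

The result should look like this:

List	index_in_List
['a', 'b']	0
['c', 'd']	null
['d', 'a']	1

I tried several things to replace the non-matches with None: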

    #conditions1= F.when(df["index_in_List"]=="a",df["index_in_List"]).otherwise(F.col("index_in_List",None)
    #conditions2= F.when(df["index_in_List"]=="a",df["index_in_List"]).otherwise(F.lit(None)
    #conditions3= F.when(df["List"][F.col("index_in_List")]!="a",F.lit(None)).otherwise(df["index_in_List"])          
    conditions4= F.when( df["List"].getItem([df["index_in_List"]])!="a",F.lit(None)).otherwise(df["index_in_List"])        
    #conditions5= F.when(df["index_in_List"]=="a",).otherwise(F.col(idxcolname,None)

    df=df.withColumn("index_in_List",conditions4)

Could you help me change these values to None (null) correctly?

Or is there a way to return None instead of 0 when no match is found? Thanks

python list pyspark replace match
