Geopy Google v3 - Extracting address components from location.raw

Asked by: Fletch Asked: 4/21/2021 Updated: 11/9/2023 Views: 859

Q:

I have this Python script where I fetch a list of addresses from a SQL table, pass them to Google's API using Geopy to geocode them, and then write the data back to a different SQL table.

I am currently stuck trying to extract the address parts from address_components.

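I've tried many things, such as converting to JSON and using other address parsers (but I'm not in the USA), but my Python isn't strong. I also can't directly reference the list part of location.raw, because different addresses will have different lengths, so when I later apply it to a data frame it fails because the lists aren't all the same length. For example: loc_raw0.append(location.raw['address_components'][0]['long_name'])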

I am currently trying to use a nested for loop to get the street number, and then replicate that for the other parts.

What's happening is that k equals 'address_components', but v equals '{'long_name': '46', 'short_name': '46', 'types': ['street_number']}' and not just 'street_number'.

        for k, v in location.raw.items():
            if k == 'address_components' and 'street_number' in v:
                loc_street_number.append(location.raw['long_name'])
                print(loc_street_number)

(Image: contents of the list v)

df = pd.DataFrame(SQL_Query, columns=['address'])


loc_Inputaddress = []
loc_Longitude = []
loc_Latitude = []
loc_Matchedaddress = []

loc_subpremise = []
loc_street_number = []
loc_road = []
loc_locality = []
loc_AdminArea1 = []
loc_AdminArea2 = []
loc_postcode = []
loc_type = []


for address in df.address:
    try:
        inputAddress = address
        location = g.geocode(inputAddress, timeout=15)

        loc_Inputaddress.append(inputAddress)
        loc_Longitude.append(location.longitude)
        loc_Latitude.append(location.latitude)
        loc_Matchedaddress.append(location.address)
        loc_type.append(location.raw['types'][0])

        #get address type
        print(loc_type.append(location.raw['types'][0]))

        #print(location.raw['address_components'])

  
        for k, v in location.raw.items():
            if k == 'address_components' and 'street_number' in v:
                loc_street_number.append(location.raw['long_name'])
                print(loc_street_number)

    except Exception as e:
        print('Error, skipping address...', e)
Tags: nested-lists, geopy

Comments

1 upvote Hernán Alarcón 4/21/2021
Please remember that posting images of data is discouraged. See How do I ask a good question?

Answers:

0 upvotes Hernán Alarcón 4/21/2021 #1

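In your code, v is a list of dictionaries and, as I understand it, you want the long_name of the dictionary that has the street_number type. This example should help you: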

v = [{
    "long_name": "40",
    "short_name": "40",
    "types": ["subpremise"]
},
{
    "long_name": "46",
    "short_name": "46",
    "types": ["street_number"]
},
{
    "long_name": "Aongatete",
    "short_name": "Aongatete",
    "types": ["locality", "political"]
}]

# Iterate over v because it is a list
for address_component in v:
    # Check if one of the address component types is "street_number"
    if "street_number" in address_component["types"]:
        print(address_component["long_name"])
        break

# Output:
# 46

Demo on myCompiler
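
To illustrate why the condition in the question never matches, here is a small sketch. It assumes the list-of-dicts shape shown in the example above; the sample values are made up for illustration.

```python
# Sample data in the shape of location.raw['address_components'] (illustrative values).
v = [
    {"long_name": "46", "short_name": "46", "types": ["street_number"]},
    {"long_name": "Aongatete", "short_name": "Aongatete", "types": ["locality", "political"]},
]

# `in` on a list compares the string against each element (a dict), so this is False.
found_in_list = "street_number" in v

# The type names live one level deeper, inside each component's "types" list.
found_in_types = any("street_number" in component["types"] for component in v)

print(found_in_list)   # False
print(found_in_types)  # True
```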

Comments

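0 upvotes Fletch 4/21/2021
Thank you so much, that worked perfectly. I had spent 1.5 days trying different things, but I can finally get this data out!
0 upvotes Fletch 4/21/2021
In some cases an address might not have a street number, for example. Do you have any tips for keeping the arrays the same length when there is no data to add?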
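
One possible way to handle the follow-up about missing parts is to append exactly one value per address to every list, falling back to a placeholder when the type is absent. This is only a sketch: the helper name get_component and the sample data are hypothetical and not part of geopy. In the real script the list would come from location.raw['address_components'].

```python
def get_component(address_components, wanted_type, default=None):
    """Return long_name of the first component with wanted_type, else default."""
    for component in address_components:
        if wanted_type in component["types"]:
            return component["long_name"]
    return default


# Two made-up results: the second one has no street number.
sample_results = [
    [
        {"long_name": "46", "short_name": "46", "types": ["street_number"]},
        {"long_name": "Aongatete", "short_name": "Aongatete", "types": ["locality", "political"]},
    ],
    [
        {"long_name": "Aongatete", "short_name": "Aongatete", "types": ["locality", "political"]},
    ],
]

loc_street_number = []
loc_locality = []

for components in sample_results:
    # Exactly one append per list per address keeps every list the same length.
    loc_street_number.append(get_component(components, "street_number"))
    loc_locality.append(get_component(components, "locality"))

print(loc_street_number)  # ['46', None]
print(loc_locality)       # ['Aongatete', 'Aongatete']
```

Because each list gets one entry per address, the lists stay aligned and can be combined into a single DataFrame, where pandas treats None as a missing value. The same idea applies when g.geocode returns None for an address that could not be matched: appending a placeholder to every list in that case keeps the rows in step.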

0 upvotes N8- 11/9/2023 #2

Here is a function to extract the address components, based on geocodezip's js example, found here:

def extract_address_details(address_components):
    """
    extract_address_details extracts address parts from the details of the google maps api response

    :param address_components: a list of dicts representing the details['address_components'] response from the google maps api
    :return: a dict of the address components
    """
    # default every part to an empty string so that a component missing
    # from the response does not raise UnboundLocalError further down
    subpremise = street_number = route = city = state = postal_code = ""

    # loop through the address components
    for component in address_components:

        # loop through the types of this component
        for component_type in component['types']:

            # match the type, pull the short_name from the appropriate component as a string
            # (the match statement requires Python 3.10 or later)
            match component_type:
                case "subpremise":
                    subpremise = str(component['short_name'])
                case "street_number":
                    street_number = str(component['short_name'])
                case "route":
                    route = str(component['short_name'])
                case "locality":
                    city = str(component['short_name'])
                case "administrative_area_level_1":
                    state = str(component['short_name'])
                case "postal_code":
                    postal_code = str(component['short_name'])

    # assemble the street address, skipping any parts that are empty
    address1 = " ".join(part for part in (street_number, route, subpremise) if part)

    # populate the return values
    data = {
        'address1': address1,
        'city': city,
        'state': state,
        'postal_code': postal_code
    }

    return data
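
For interpreters older than Python 3.10, where the match statement is unavailable, a lookup table can play the same role. The sketch below is a variant written under that assumption, not the code from this answer: TYPE_TO_FIELD and extract_address_row are hypothetical names, and the sample components are made up. It returns a dict with the same keys for every address, using an empty string for anything the response did not contain.

```python
# Hypothetical mapping from Google component types to output field names.
TYPE_TO_FIELD = {
    "subpremise": "subpremise",
    "street_number": "street_number",
    "route": "route",
    "locality": "city",
    "administrative_area_level_1": "state",
    "postal_code": "postal_code",
}


def extract_address_row(address_components):
    """Return a dict with every field in TYPE_TO_FIELD, using "" when absent."""
    row = {field: "" for field in TYPE_TO_FIELD.values()}
    for component in address_components:
        for component_type in component["types"]:
            field = TYPE_TO_FIELD.get(component_type)
            if field is not None:
                row[field] = str(component["short_name"])
    return row


# Made-up components with no subpremise, state, or postal code.
sample = [
    {"long_name": "46", "short_name": "46", "types": ["street_number"]},
    {"long_name": "Example Road", "short_name": "Example Rd", "types": ["route"]},
    {"long_name": "Aongatete", "short_name": "Aongatete", "types": ["locality", "political"]},
]

row = extract_address_row(sample)
print(row["street_number"], row["route"], row["city"])  # 46 Example Rd Aongatete
print(repr(row["postal_code"]))                         # ''
```

Since every row has identical keys, a list of such rows can be passed to pd.DataFrame directly, which avoids the unequal-length problem described in the question.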