How to Upload a File in Chunks to SharePoint (Office 365) via API in Python (using startupload/continueupload/finishupload)

Asked by: d.rat  Asked: 11/13/2023  Updated: 11/15/2023  Views: 46

Q:

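I am trying to upload a file to SharePoint in chunks via the REST API.

Uploading a "small" file in a single request works (.../files/add(overwrite=true,url='{file_name}')), so authentication and the basic setup are fine.

But when I use startupload/continueupload/finishupload, it fails with this response (code: -2147024809): "serverRelativeUrl Parameter name: Specified value is not supported for the serverRelativeUrl parameter."

I need to solve this without any additional framework (such as Office365-REST-Python-Client), because I need to find out the correct API calls.

I have already checked these references:
https://learn.microsoft.com/en-us/previous-versions/office/developer/sharepoint-rest-reference/dn450841(v=office.15)
https://learn.microsoft.com/en-us/previous-versions/office/sharepoint-server/dn760924(v=office.15)

I also tried with and without urllib.parse.quote, etc. (I have not been able to test the fileOffset part yet, because I never get a valid response.)

Here is my code for the "large" file case.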

def uploadFile(bearer_token: str, tenant_name: str, site_name: str, folder_name: str, file_name: str) -> str:
    with open(file_name, "rb") as file:
        file_size = os.stat(file_name).st_size    
        big_file = file_size > CHUNK_SIZE
        if big_file:
            headers = {
                'Authorization': f'Bearer {bearer_token}',
                'Accept': 'application/json;odata=verbose',
                'Content-Type': 'application/octet-stream;odata=verbose'
                }        
                      
            GUID = uuid.uuid4()
            chunk_count = 1
            chunks_to_be_sent = file_size / CHUNK_SIZE
            continue_upload = True
            
            file_path = f"Shared%20Documents/{folder_name}/{file_name}"
            #file_path = urllib.parse.quote(file_path)
           
            web_host_url = f"{SHAREPOINT_HOST}"#/sites/{site_name}" #'<host web url>'
            # web_host_url = urllib.parse.quote(web_host_url)

            while continue_upload:
                big_file_chunk = file.read(CHUNK_SIZE)
                if not big_file_chunk or chunk_count >= chunks_to_be_sent: # Finish Upload
                    file_offset = (CHUNK_SIZE * (chunk_count - 1))# + 1
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                    continue_upload = False
                elif chunk_count < 2: # Start Upload (First Chunk)
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/startupload(uploadId=guid'{GUID}')"#?@target='{web_host_url}'"
                else: # continue Upload
                    file_offset = (CHUNK_SIZE * chunk_count)# + 1
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
             
                url = urllib.parse.quote(file_url)
                conn = http.client.HTTPSConnection(SHAREPOINT_HOST)
                conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                resp = conn.getresponse()
                resp = resp.read()
                #icecream.ic(chunk_count, file_url, resp)
                print(chunk_count, " -- ", resp)
                chunk_count += 1

            
            # resp = conn.getresponse()
            # resp = resp.read()
            return resp.decode("utf-8")

I get this error message/response (code: -2147024809, "Specified value is not supported for the serverRelativeUrl parameter"):

1 -- b'{"error":{"code":"-2147024809, System.ArgumentException","message":{"lang":"en-US","value":"serverRelativeUrl\\r\\nParameter name: Specified value is not supported for the serverRelativeUrl parameter."}}}'
2 -- b'{"error":{"code":"-2147024809, System.ArgumentException","message":{"lang":"en-US","value":"serverRelativeUrl\\r\\nParameter name: Specified value is not supported for the serverRelativeUrl parameter."}}}'

python sharepoint file-upload upload sharepoint-api

Comments

0 votes  d.rat  11/15/2023
I have figured it out, or at least I have something that works ;) Please see my answer below.

A:

0 votes  d.rat  11/15/2023  #1

Here is my working solution. I think my mistake was applying urllib.parse.quote() to the URL, but I am not sure.
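One way to see why urllib.parse.quote() could be the culprit is to look at what it does to a path that is already percent-encoded. The sketch below only demonstrates the standard-library behavior; whether this is what SharePoint actually rejected is not confirmed by the post. The upload id 'abc' is a placeholder.

```python
from urllib.parse import quote

path = "Shared%20Documents/TestFolder/FILE.EXT"

# quote() treats '%' as a literal character, so an existing %20 becomes %2520.
double_encoded = quote(path)
print(double_encoded)   # Shared%2520Documents/TestFolder/FILE.EXT

# Quoting the unencoded form yields a single level of encoding.
encoded_once = quote("Shared Documents/TestFolder/FILE.EXT")
print(encoded_once)     # Shared%20Documents/TestFolder/FILE.EXT

# By default only '/' is left as-is; parentheses, quotes and '=' are escaped.
call = quote("startupload(uploadId=guid'abc')")
print(call)             # startupload%28uploadId%3Dguid%27abc%27%29
```

In the question, file_path already contains %20 and the whole URL is then passed through quote(), so the server receives Shared%2520Documents.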

There are the following ways to address the file:

  1. GetFileByServerRelativePath(decodedurl='{URL}'), e.g. url = f"/sites/{site_name}/_api/web/GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')/startupload(uploadId=guid'{GUID}')"

  2. GetFileById('{unique_file_id}'), e.g. file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/startupload(uploadId=guid'{GUID}')"

General:

  • If the file does not yet exist on the server, you must create it first; otherwise neither method works.
  • Using urllib.parse.quote() together with GetFileByServerRelativePath(decodedurl='{URL}') did not work in my solution.

Approach

  1. Look up the file's UniqueId via ".../GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')"
  2. If the file does not exist on the server, create a new (empty, in-memory) file and upload it via ".../files/add(overwrite=true,url='{file_name}')"
  3. Generate a GUID -> GUID = uuid.uuid4()
  4. Read the file in chunks -> big_file_chunk = file.read(CHUNK_SIZE)
  5. Start with ".../startupload(uploadId=guid'{GUID}')" and save the file offset from the response
  6. Continue with ".../continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})" for as long as you are not at the last chunk; again, save the file offset from the response
  7. Finish with the last chunk using ".../finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
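Steps 4 to 7 can be dry-run without touching the network. The sketch below is an illustration, not the author's code: plan_chunks is a hypothetical helper that lists which call each chunk would go to and with which offset. It computes offsets locally, whereas the approach above takes them from the server's responses.

```python
from typing import Iterator


def plan_chunks(file_size: int, chunk_size: int) -> Iterator[tuple[str, int, int]]:
    """Yield (method, file_offset, length) for each request of a chunked upload."""
    if file_size <= 0 or chunk_size <= 0:
        raise ValueError("file_size and chunk_size must be positive")
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        if offset + length >= file_size:
            method = "finishupload"    # last chunk
        elif offset == 0:
            method = "startupload"     # first chunk, sent without fileOffset
        else:
            method = "continueupload"  # any chunk in between
        yield method, offset, length
        offset += length


for method, offset, length in plan_chunks(file_size=25, chunk_size=10):
    print(method, offset, length)
```

A file that fits into a single chunk yields only finishupload here, with no preceding startupload. Such files are probably better sent with the single-request files/add call from step 2.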

Expanded URLs

host = "COMPANY.sharepoint.com"
site_name = "DUMMYSUPPLIER"
folder_path = "Shared%20Documents/TestFolder"
file_name = "FILE.EXT"
UniqueId = '3e6be666-8b5b-40fa-89ab-7cf9092a603d'
guid = '57dd389e-5325-4cc4-95cd-55af2131ae67'

# Variant 1: address the file by its UniqueId (GetFileById)
url = "/sites/DUMMYSUPPLIER/_api/web/getfilebyid('3e6be666-8b5b-40fa-89ab-7cf9092a603d')/startupload(uploadId=guid'57dd389e-5325-4cc4-95cd-55af2131ae67')"
# Variant 2: address the file by its path (GetFileByServerRelativePath)
url = "/sites/DUMMYSUPPLIER/_api/web/GetFileByServerRelativePath(decodedurl='/sites/DUMMYSUPPLIER/Shared%20Documents/TestFolder/FILE.EXT')/startupload(uploadId=guid'57dd389e-5325-4cc4-95cd-55af2131ae67')"

Solution using GetFileById('{unique_file_id}'):

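import http.client
import io
import json
import os
import urllib.parse
import uuid

# CHUNK_SIZE (bytes per chunk) must be defined at module level; the post does not show its value.


def _uploadLargeFile(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> http.client.HTTPResponse:
    """
    Uploads a file in chunks to the server and overwrites existing files. (Recommended for large files; mandatory for file sizes >= 250MB.)
    """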
    headers = {
            'Authorization': f'Bearer {bearer_token}',
            'Accept': 'application/json;odata=verbose',
            'Content-Type': 'application/octet-stream;odata=verbose'
            }                              

    unique_file_id = get_file_unique_id(host=host, bearer_token=bearer_token, site_name=site_name, folder_path=folder_path, file_name=file_name)
    if not unique_file_id:
        unique_file_id, resp = create_and_upload_empty_file(host=host, bearer_token=bearer_token, site_name=site_name, folder_path=folder_path, file_name=file_name)

    GUID = uuid.uuid4()
    file_size = os.stat(file_name).st_size
    with open(file_name, "rb") as file:
        continue_upload = True
        file_offset = 0 # While uploading in chunks the fileoffset can be calculated as follows: fileoffset = chunk_count * CHUNK_SIZE
        while continue_upload:
            big_file_chunk = file.read(CHUNK_SIZE)

            conn = http.client.HTTPSConnection(host=host)
            
            if not big_file_chunk or _is_last_chunk(file_size, file_offset, CHUNK_SIZE): # Finish Upload
                file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                url = urllib.parse.quote(file_url)
                conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                resp = conn.getresponse()                
                continue_upload = False
            elif file_offset == 0: # Start Upload
                file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/startupload(uploadId=guid'{GUID}')"
                url = urllib.parse.quote(file_url)
                conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                resp = conn.getresponse()
                resp_json = json.load(resp)
                file_offset = int(resp_json['d']['StartUpload'])
            else: # continue Upload
                file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                url = urllib.parse.quote(file_url)
                conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                resp = conn.getresponse()
                resp_json = json.load(resp)
                file_offset = int(resp_json['d']['ContinueUpload'])

            # Report progress based on the offset the server returned.
            if continue_upload:
                percent = (file_offset / file_size) * 100
                print(f"Uploading file '{file_name}': {percent:.2f}% uploaded ({file_offset}/{file_size})")
            else:
                print(f"Uploading file '{file_name}': {100.0:.1f}% uploaded ({file_size}/{file_size})")
        return resp
        
        
def get_file_unique_id(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> str | None:
    """
    Returns the file's UniqueId if it exists, otherwise None.
    """

    headers = {
    'Authorization': f'Bearer {bearer_token}',
    'Accept': 'application/json;odata=verbose',
    'Content-Type': 'application/octet-stream;odata=verbose'
    }

    url = f"/sites/{site_name}/_api/Web/GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')"

    conn = http.client.HTTPSConnection(host)
    conn.request(method="POST", url=url, body=None, headers=headers)
    resp = conn.getresponse()
    resp_json = json.load(resp)
    
    if 'd' in resp_json:       
        return resp_json['d']['UniqueId']
    else:
        return None

        
def create_and_upload_empty_file(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> tuple[str | None, http.client.HTTPResponse]:
    """
    Creates an empty file (in-memory) and uploads it to the server. Returns the UniqueId of the uploaded file (or None on failure) together with the response.
    """
    emptyfile = io.BytesIO(b"") # In-Memory file content

    headers = {
    'Authorization': f'Bearer {bearer_token}',
    'Accept': 'application/json;odata=verbose',
    'Content-Type': 'application/octet-stream;odata=verbose'
    }

    file_url = f"/sites/{site_name}/_api/web/getfolderbyserverrelativeurl('{folder_path}')/files/add(overwrite=true,url='{file_name}')"
    url = urllib.parse.quote(file_url)
    
    conn = http.client.HTTPSConnection(host)
    conn.request("POST", url, emptyfile, headers)
    resp = conn.getresponse()
    resp_json = json.load(resp)    
    if 'd' in resp_json:       
        return (resp_json['d']['UniqueId'], resp)
    else:
        return (None, resp)


def _is_last_chunk(file_size, file_offset, chunk_size) -> bool:
    return file_size - file_offset <= chunk_size
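The solution reads the next offset straight out of resp_json['d'], which raises a KeyError if the server answers with an error body instead. A slightly more defensive variant could look like the sketch below. The helper name next_offset is hypothetical, and the sample bodies are made up for illustration: they mirror the 'd' / 'StartUpload' / 'ContinueUpload' keys the code above reads, not captured server output.

```python
import io
import json
from typing import BinaryIO


def next_offset(response_body: BinaryIO, method_name: str) -> int:
    """Read the next fileOffset from a StartUpload/ContinueUpload response body."""
    payload = json.load(response_body)
    if "d" not in payload or method_name not in payload["d"]:
        raise ValueError(f"unexpected response: {payload!r}")
    # The solution above wraps this value in int(), which suggests it may arrive
    # as a string; int() handles both a string and a number.
    return int(payload["d"][method_name])


# Illustrative bodies (assumed shape, see lead-in).
ok_body = io.BytesIO(b'{"d": {"StartUpload": "10485760"}}')
print(next_offset(ok_body, "StartUpload"))

error_body = io.BytesIO(b'{"error": {"code": "-2147024809, System.ArgumentException"}}')
try:
    next_offset(error_body, "ContinueUpload")
except ValueError as exc:
    print("upload step failed:", exc)
```

Because http.client.HTTPResponse is also a binary file-like object, the response returned by conn.getresponse() could be passed in the same way as the BytesIO stand-ins.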