Wednesday, July 25, 2018

Using Microsoft Graph to get a PDF preview of a file in SharePoint by file path


There are multiple ways to get a PDF version of a file, so I figured I’d show how you can start from the path to a file in SharePoint and use the Microsoft Graph API to get a PDF version of that file. I’ll be using the Graph drive item conversion API for this.

A sample URL could look something like this: https://contoso.sharepoint.com/sites/asite/FooLib/lala/Document.docx

[Update]

After posting the question on Stack Overflow I received an answer from Vadim Gremyachev which takes it down to one API call.

Basically he clued me in on how you can create a sharing token from the item URL, which can then be used in place of the file id. Code for this is listed in the Graph Sharing API docs.

First you base64 encode the URL, strip the trailing = padding, replace / with _ and + with -, and prefix the result with u!. Then you access the file via the /shares API. The code below uses PowerShell to construct the token.

$url = 'https://contoso.sharepoint.com/sites/asite/FooLib/lala/Document.docx'
"u!"+[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($url)).TrimEnd('=').Replace('/','_').Replace('+','-')

u!aHR0cHM6Ly9jb250b3NvLnNoYXJlcG9pbnQuY29tL3NpdGVzL2FzaXRlL0Zvb0xpYi9sYWxhL0RvY3VtZW50LmRvY3g
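If you are not on PowerShell, the same token construction can be sketched in Python. This is a minimal equivalent of the one-liner above, not code from the Graph docs; the function name sharing_token is my own choice for illustration.

```python
import base64


def sharing_token(url: str) -> str:
    """Encode a file URL as a Graph sharing token (the u! form)."""
    # URL-safe base64 already swaps '+' for '-' and '/' for '_'.
    encoded = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
    # Drop the '=' padding and add the u! prefix.
    return "u!" + encoded.rstrip("=")


print(sharing_token("https://contoso.sharepoint.com/sites/asite/FooLib/lala/Document.docx"))
```

Using the URL-safe base64 variant saves the two explicit character replacements, so only the padding has to be removed by hand.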

Armed with the token, the resulting API call is:

https://graph.microsoft.com/v1.0/shares/u!aHR0cHM6Ly9jb250b3NvLnNoYXJlcG9pbnQuY29tL3NpdGVzL2FzaXRlL0Zvb0xpYi9sYWxhL0RvY3VtZW50LmRvY3g/driveItem/content?format=pdf
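As a rough sketch of how that single call could be made from Python using only the standard library: the helper names pdf_request_url and download_pdf are hypothetical, and I assume you already hold a valid Graph access token (acquiring one is out of scope here) and that Graph answers with a redirect to the converted file, which urlopen follows.

```python
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"


def pdf_request_url(token: str) -> str:
    """Build the /shares call that asks for the PDF rendition of the item."""
    return f"{GRAPH}/shares/{token}/driveItem/content?format=pdf"


def download_pdf(token: str, access_token: str) -> bytes:
    """Fetch the PDF bytes; access_token is assumed to be a valid Graph bearer token."""
    request = urllib.request.Request(pdf_request_url(token))
    # An unredirected header goes to Graph only and is not passed on
    # to whatever host Graph redirects the request to.
    request.add_unredirected_header("Authorization", f"Bearer {access_token}")
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The bytes returned by download_pdf can be written straight to a .pdf file opened in binary mode.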

[Original post]

To get to the actual file, two API calls are needed: one to fetch the drive (library) id, and one to fetch the file.

Note: This solution will not work on the root site collection, as I make assumptions about the number of parts in the URL. The following file formats are supported: csv, doc, docx, odp, ods, odt, pot, potm, potx, pps, ppsx, ppsxm, ppt, pptm, pptx, rtf, xls, xlsx.

Deconstructing the file URL

Splitting the URL on slashes we get the parts needed to get the id of the document library and the id of the file.

0 https:
1
2 contoso.sharepoint.com
3 sites
4 asite
5 FooLib
6 lala
7 Document.docx

Part 2 is the tenant hostname, parts 3 and 4 form the site path, part 5 is the document library, and parts 6 onward form the item path relative to the document library.
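The split can be sketched in a few lines of Python. The names FileUrlParts and split_file_url are made up for this example, and the code bakes in the same assumption as the rest of this post: the file lives in a site under /sites/{name}, not in the root site collection.

```python
from typing import NamedTuple


class FileUrlParts(NamedTuple):
    hostname: str   # part 2
    site_path: str  # parts 3 and 4, server-relative
    library: str    # part 5
    item_path: str  # parts 6 onward, relative to the library


def split_file_url(url: str) -> FileUrlParts:
    """Split a SharePoint file URL of the form https://host/sites/name/library/path."""
    parts = url.split("/")
    if len(parts) < 7:
        raise ValueError("expected https://host/sites/name/library/path")
    return FileUrlParts(
        hostname=parts[2],
        site_path="/" + "/".join(parts[3:5]),
        library=parts[5],
        item_path="/".join(parts[6:]),
    )


print(split_file_url("https://contoso.sharepoint.com/sites/asite/FooLib/lala/Document.docx"))
```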

Getting the drive id (id of document library)

Using the sample URL above, we combine the sites and drives APIs in one query:

/v1.0/sites/{hostname}:{server-relative-path}:/drives

resulting in the following query, where we select id and webUrl:

https://graph.microsoft.com/v1.0/sites/contoso.sharepoint.com:/sites/asite:/drives?$select=id,webUrl
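A small sketch of building that query from the hostname and site path follows. The helper name drives_query_url is hypothetical, and the site path is assumed to be server-relative with a leading slash, for example /sites/asite.

```python
def drives_query_url(hostname: str, site_path: str) -> str:
    """Build the combined sites + drives query for one site."""
    # site_path must start with '/', e.g. '/sites/asite'.
    return (
        "https://graph.microsoft.com/v1.0"
        f"/sites/{hostname}:{site_path}:/drives?$select=id,webUrl"
    )


print(drives_query_url("contoso.sharepoint.com", "/sites/asite"))
```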

The output of this call is a list of all the libraries in the site.

{
    "@odata.context": "https://graph.microsoft.com/v1.0/$metadata#drives(id,webUrl)",
    "value": [
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_psgYyKuXH2VR7fGsvWPyBOt",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/Documents"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_pv8T5clDnpiRZq2uVmXgGRU",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/FooLib"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_psUQF8PSnx9T7aXwvRalLc_",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/PublishingImages"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_pv01hj6qcWyR5wulob7Lk7-",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/Pages"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_pvEaXdch-3DToEk0qR4g-xx",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/SiteCollectionDocuments"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_ptwBh2OaBQOTbJMXT5jLKwi",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/SiteCollectionImages"
        },
        {
            "id": "b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_pv-q5N0D8gWSLB-0MY7_RS3",
            "webUrl": "https://contoso.sharepoint.com/sites/asite/Translation%20Packages"
        }
    ]
}

Ideally you would use a $filter query to pick out just the library you want, but this is not supported for the drives endpoint, so you need to post-filter yourself.

By picking the item whose webUrl matches parts 0 through 5 combined (the URL up to and including the library name), you have the library you are looking for.
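The post-filtering could look something like the sketch below. It takes the parsed JSON response and the library URL (scheme, hostname, site path and library name) and returns the matching drive id. The function name find_drive_id is my own, and comparing the URLs case-insensitively after URL-decoding is an assumption on my part, made so that entries like Translation%20Packages still match.

```python
from typing import Optional
from urllib.parse import unquote


def find_drive_id(drives_response: dict, library_url: str) -> Optional[str]:
    """Return the id of the drive whose webUrl equals library_url, or None."""
    # Normalise both sides: decode %20 and friends, ignore case and a trailing slash.
    wanted = unquote(library_url).rstrip("/").lower()
    for drive in drives_response.get("value", []):
        if unquote(drive.get("webUrl", "")).rstrip("/").lower() == wanted:
            return drive["id"]
    return None
```

A None result means no library matched, which is the case to handle if the URL points somewhere other than a document library.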

Getting the PDF URL for the file

With the id of the document library in hand, it’s time for the next query which will return the URL of the PDF version in a 302 Location header.

/v1.0/drives/{drive-id}/root:/{item-path}:/content?format=pdf

Using the drive id from the previous call together with the document path, I end up with the following URL:

https://graph.microsoft.com/v1.0/drives/b!H11aFSof8062NsPf4rr-qE3OKQpUIjVEp7PzqdeT_pv8T5clDnpiRZq2uVmXgGRU/root:/lala/Document.docx:/content?format=pdf
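Here is a minimal sketch of filling in the template from a drive id and an item path. The helper name pdf_content_url is hypothetical. Since the drive root is the library itself, the item path is the part of the URL after the library name. Percent-encoding the path is my own addition, so that folder and file names containing spaces still produce a valid URL.

```python
from urllib.parse import quote


def pdf_content_url(drive_id: str, item_path: str) -> str:
    """Build the content?format=pdf URL for an item path relative to the library root."""
    # quote() leaves '/' alone but escapes spaces and other unsafe characters.
    encoded_path = quote(item_path)
    return (
        "https://graph.microsoft.com/v1.0"
        f"/drives/{drive_id}/root:/{encoded_path}:/content?format=pdf"
    )
```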

If you look at the Location header in the returned response you will find something similar to:

https://northeurope1-mediap.svc.ms/transform/pdf?provider=spo&inputFormat=docx&cs=N2FiNzg2….

This is a pre-authenticated URL which can be called directly from anywhere without the need to log in. The URL is valid for a few minutes only.
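If you want the pre-authenticated URL itself and not the file bytes, you need a client that does not follow redirects. Below is a sketch using Python's http.client, which never follows them on its own. The names request_target and pdf_location are made up for this example, and the access token is assumed to be a valid Graph bearer token obtained elsewhere.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def request_target(url: str) -> Tuple[str, str]:
    """Split a full URL into (host, path-with-query) for http.client."""
    parts = urlsplit(url)
    path = parts.path + ("?" + parts.query if parts.query else "")
    return parts.netloc, path


def pdf_location(request_url: str, access_token: str) -> Optional[str]:
    """Return the Location header of the 302 response, or None for any other status."""
    host, path = request_target(request_url)
    connection = http.client.HTTPSConnection(host)
    try:
        connection.request("GET", path, headers={"Authorization": f"Bearer {access_token}"})
        response = connection.getresponse()
        response.read()  # drain the body before closing the connection
        if response.status == 302:
            return response.getheader("Location")
        return None
    finally:
        connection.close()
```

Because the returned URL expires after a few minutes, it is best fetched right before it is handed to whatever will download or display the PDF.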