By Robert Ellison. Updated on Thursday, November 28, 2024.
There is a stunningly simple way to get a file out of SharePoint and I'll get to that soon (or just skip to the very end of the post).
I have been automating the shit out of a lot of routine work in Microsoft Teams recently. Teams is the result of Skype and SharePoint having too much to drink at the Microsoft holiday party. It often shows. One annoyance is that channel threads are ordered by the time that someone last responded. Useful for quickly seeing the latest gossip but a pain when you need to keep an eye on each individual thread. After listlessly scrolling around trying to keep up with the flow, I came up with a dumb solution - I sync the channel to Obsidian (my choice of note app, could be anything) and then I can just check there for new threads. It's a small convenience but has meaningfully improved my life.
Unfortunately I got greedy. These messages usually have a PowerPoint presentation attached to them and so why not have an LLM summarize this while updating my notes?
It doesn't look like Copilot has a useful API yet. You can build plug-ins, but I don't want to talk to Copilot about presentations, I just want it to do the heavy lifting while I sleep so I can read the summary in the morning. Hopefully in the future there will be a simple way to say hey, Copilot, summarize this PPTX. Not yet.
So the outline of a solution here is to download the presentation, send it to ChatGPT, generate a summary and stick that in Obsidian. This felt like a half-hour type of project. And it should have been - getting GPT-4 Turbo to summarize a PPTX file took about ten minutes. Downloading the file has taken days and sent my self-esteem back to primary school.
You would think that downloading a file would be the Graph API's bread and butter. Especially as I have a ChatMessage from the channel that includes attachments and links. The link is for a logged-in human, but it must be easy to translate from this to an API call, right?
It turns out that all you need is the site ID, the drive ID and the item ID.
These IDs are not in the attachment URL or the ChatMessageAttachment. It would be pretty RESTful to include the obvious next resource I'm going to need in that return type. No dice though.
I tried ChatGPT, which helpfully suggested API calls that looked really plausible but that did not in fact exist. So I then read probably hundreds of blogs and forum posts from equally confused and desperate developers. Here is a typical example:
"Now how can I upload and download files to this library with the help of Graph API (GraphServiceClient)."
To which Microsoft, terrifyingly, reply:
"We are currently looking into this issue and will give you an update as soon as possible."
Ignoring the SharePoint part and glossing over where that drive ID is coming from. Other documentation suggests that you can look up your site by the URL, and then download a list of drives to go looking for the right one. Well, the first page in a paginated drive collection anyway, implying that just finding the ID might get you a call from the quota police.
I know Microsoft is looking after a lot of files for a lot of organizations, but how can it be this hard?
It isn't. It's just hidden. I eventually found this post from Alex Terentiev that points out that you just need to base64 encode the sharing URL, swap some characters around and then call:
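A minimal sketch of that encoding step, in Python for brevity, assuming the scheme Microsoft documents for the shares endpoint: base64 encode the URL, drop the trailing `=` padding, replace `/` with `_` and `+` with `-`, then prefix `u!`. The function names and the example sharing URL are illustrative assumptions, not taken from the original post.

```python
import base64

# Graph v1.0 root; the shares endpoint hangs off this.
GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def encode_sharing_url(sharing_url: str) -> str:
    """Return 'u!' plus the unpadded base64url form of a sharing URL."""
    # urlsafe_b64encode already swaps '+' for '-' and '/' for '_'.
    b64 = base64.urlsafe_b64encode(sharing_url.encode("utf-8")).decode("ascii")
    # Graph expects the padding characters to be removed.
    return "u!" + b64.rstrip("=")


def share_content_url(sharing_url: str) -> str:
    """Build the URL that resolves a sharing link to the file content."""
    token = encode_sharing_url(sharing_url)
    return f"{GRAPH_ROOT}/shares/{token}/driveItem/content"


if __name__ == "__main__":
    # Hypothetical link of the kind found on a Teams message attachment.
    link = "https://contoso.sharepoint.com/:p:/s/team/EXAMPLE"
    print(share_content_url(link))
```

A GET on that URL with a bearer token in the `Authorization` header should return the file; Graph typically answers with a redirect to a short-lived download URL. No site ID, drive ID or item ID required.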
If Google was doing its job right this would be the top result. I should be grateful they're still serving results at all and not just telling me that my pastimes are all harmful.
The documentation is here and Microsoft should link to it on every page that discusses drives and DriveItems. For GraphServiceClient the call to get to an actual stream is:
(Published to the Fediverse as:
Download a Sharepoint File with GraphServiceClient (Microsoft Graph API) #code#ml#graph#sharepoint#c# Everyone developing applications with the Graph API should know about the shares endpoint that allows you to download files easily.)
"In messages during the pandemic, he referred to ministers as “useless fuckpigs,” “morons,” and “cunts.” The inquiry’s lawyer asked Cummings if he thought his language had been too strong. “I would say, if anything, it understated the position,” he replied."
This is a depressing but definitive read as we wait for the UK election to be announced. #politics#uk
Paresh Dave in Wired writes about TDCommons.org, a Google-funded but bepress-operated site. The idea is to publish technical disclosures as prior art that might invalidate future patents. It's an interesting overview of the subject, including a USPTO attempt to do the same thing (I covered this here) and a commercial competitor, IP.com. Apparently USPTO is looking for help with this problem:
"Google is hoping TDCommons has a chance to be embraced as Kathi Vidal, a tech patent attorney who was sworn in as director of the USPTO almost two years ago, settles into her role. Deciding that generative AI programs can’t be patent holders has been a higher priority, she says, but creating a better search tool for prior art is an issue she’s discussed with a lot of organizations. Vidal says she’s open to the USPTO administering and funding its own prior art repository, offering up her email, [email protected], for feedback on how to do so."
I'm not super-convinced that she's providing her actual email address, but when I have a few minutes I might suggest my own plan - issue all patent applications and shift the examination to the start of any litigation or enforcement attempt.
(Published to the Fediverse as:
TDCommons and the Future of Patent Law #politics#patents#uspto Prior art attempts like TDCommons, IP.com and even the USPTO's failed scheme should be replaced with a fundamental overhaul of the patent system.)