San Francisco Budget Chatbot (GPT)

I just published a San Francisco Budget GPT to the ChatGPT store. This is free to use, you just need an OpenAI account.

The chatbot is constructed from the 2010-2025 data available on DataSF, the most recent 2025-2026 budget draft and a five year revenue projection. I added some instructions to help interpret the data and I've tested against a range of queries. This is generative AI and so be cautious about what it says. It's pretty good when it can find the right data and will happily invent things if it can't.

Not that this is limited to chatbots. There was a lot of press yesterday around San Francisco being the worst run city in the nation based on this 'analysis' from WalletHub. They are dividing a measure of service quality by budget dollars per capita. This fails to take into account that San Francisco is a city and a county and so the budget includes county services like a Sheriff department. It also ignores that San Francisco runs an international airport and port, not paid for by taxpayers but still in the budget. And it doesn't adjust for regional differences in income that make services more expensive to provide (and higher total dollar taxes to provide them). So I'll take an occasional chatbot glitch over a willfully incurious press pushing out a PR piece that tickles their confirmation biases.

Enjoy, and let me know if it works for you.

Add your comment...

Summer Solstice 2024

By Robert Ellison.

Summer starts at 20:51 UTC on June 20, 2024, Winter if you happen to live south of the equator. Rendered in Catfood Earth.

Add your comment...

Vernal (Spring) Equinox 2024

By Robert Ellison.

Spring starts now (03:07 UTC on March 20, 2024) in the Northern Hemisphere and Autumn for the equatorially challenged. The image above shows the precise moment of the equinox in Catfood Earth.

Add your comment...

Download a Sharepoint File with GraphServiceClient (Microsoft Graph API)

By Robert Ellison. Updated on Thursday, November 28, 2024.

There is a stunningly simple way to get a file out of sharepoint and I'll get to that soon (or just skip to the very end of the post).

I have been automating the shit out of a lot of routine work in Microsoft Teams recently. Teams is the result of Skype and Sharepoint having too much to drink at the Microsoft holiday party. It often shows. One annoyance is that channel threads are ordered by the time that someone last responded. Useful for quickly seeing the latest gossip but a pain when you need to keep an eye on each individual thread. After listlessly scrolling around trying to keep up with the flow I came up with a dumb solution - I sync the channel to Obsidian (my choice of note app, could be anything) and then I can just check there for new threads. It's a small convenience but has meaningully improved my life.

Unfortunately I got greedy. These messages usually have a PowerPoint presentation attached to them and so why not have an LLM summarize this while updating my notes?

It doesn't look like Copilot has a useful API yet. You can build plug-ins, but I don't want to talk to Copilot about presentations, I just want it to do the heavy lifting while I sleep so I can read the summary in the morning. Hopefully in the future there will be a simple way to say hey, Copilot, summarize this PPTX. Not yet.

So the outline of a solution here is download the presentation, send it ChatGPT, generate a summary and stick that in Obsidian. This felt like a half hour type of project. And it should have been - getting GPT4 Turbo to summarize a PPTX file took about ten minutes. Downloading the file has taken days and sent my self esteem back to primary school.

You would think that downloading a file would be the Graph API's bread and butter. Especially as I have a ChatMessage from the channel that includes attachments and links. The link is for a logged in human, but it must be easy to translate from this to an API call, right?

It turns out that all you need is the site ID, the drive ID and the item ID.

These IDs are not in the attachment URL or the ChatMessageAttachment. It would be pretty RESTful to include the obvious next resource I'm going to need in that return type. No dice though.

I tried ChatGPT which helpfully suggested API calls that looked really plausible and helpful but that did not in fact exist. So I then read probably hundreds of blogs and forum posts from equally confused and desperate developers. Here is a typical example:

"Now how can I upload and download files to this library with the help of Graph API (GraphServiceClient)."

To which Microsoft, terrifyingly, reply:

"We are currently looking into this issue and will give you an update as soon as possible."

Before eventually suggesting:

"await graphClient.Drives["{drive-id}"].Items["{driveItem-id}"].Content.GetAsync();"

Ignoring the sharepoint part and glossing over where that drive ID is coming from. Other documentation suggests that you can lookup your site by the URL, and then download a list of drives to go looking for the right one. Well, the first page in paginated drive collection anyway implying that just finding the ID might get you a call from the quota police.

I know Microsoft is looking after a lot of files for a lot of organizations, but how can it be this hard?

It isn't. It's just hidden. I eventually found this post from Alex Terentiev that points out that you just need to base64 encode the sharing url, swap some characters around and then call:

"GET https://graph.microsoft.com/v1.0/shares/{sharing-url}/driveItem"

If Google was doing its job right this would be the top result. I should be grateful they're still serving results at all and not just telling me that my pastimes are all harmful.

The documentation is here and Microsoft should link to it on every page that discusses drives and DriveItems. For GraphServiceClient the call to get to an actual stream is:

"graphClient.Shares[encodedUrl].DriveItem.Content.GetAsync()"

Add your comment...

Winter Solstice 2023

By Robert Ellison.

Winter starts right now for those of us at the top of the planet. It's summer time down under. Winter Solstice 2023 rendered in Catfood Earth (03:28 on December 22, 2023 UTC).

Add your comment...

Catfood Earth 4.40

By Robert Ellison. Updated on Saturday, January 20, 2024.

Catfood Earth 4.40 is now available to download.

With this release Catfood Earth is 20 years old! This update includes version 2023c of the Time Zone Database and the following bug fixes.

The National Weather Service changed one letter in the URL of their one hour precipitation weather radar product. It needs to be BOHA instead of BOHP. Presumably just checking that data consumers are paying attention? Weather radar is working again.

Not to be left out the Smithsonian Institution Global Vulcanism Program has decided to drop the www from their web site. The convention here is to redirect but they're content with just being unavailable at the former address. Recent volcanoes are working again as well.

The final fix is to the locations layer. Editing a location was crashing. This was due to a new format in the zoneinfo database that was not contemplated by the library that I use. As far as I can tell this isn't maintained any more since the death of CodePlex. While working on this update I started using GitHub Copilot, their AI assistant based on GPT 3.5. I was amazed at how helpful it was figuring out and then fixing this rather fiddly bug. The locations layer is back to normal, and I have regenerated all the time zone mapping as well.

Add your comment...

Rob 2.0

By Robert Ellison. Updated on Wednesday, November 6, 2024.

If I'm going to be replaced with AI then I may as well be the person to do it. I need an AI Rob that I can be proud of and that's going to take some work.

My approach so far is to generate some training data. I've answered lots of questions in a spreadsheet. This is an ongoing project and there will be dot releases as I work towards a usable product (one that I can just plug into email or Teams). Probably this is going to require a mix of fine tuning and retrieval augmented generation (RAG). To start with I'm just fine tuning GPT 3.5 Turbo from OpenAI.

Fine tuning was painless. As usual the difficult part was randomly trying different versions of Python to find one that would coexist with some stubborn dependency (tiktoken in this case, which will live with Python 3.11 but is very unhappy with Python 3.12).

You can try this below - just leave a comment and Rob 2.0 will reply. Anything you post goes through the regular moderation system, this is just to stop spam. any legitimate questions are fair game (and likely to make it into the training corpus if the answer is no good!).

Due to safety systems it doesn't swear like the real thing. That might require a different model / corporate host at some point in the future. I'll update this post as I make progress.

Updated 2023-12-20 00:46:

I had most of a day spare today and so decided to get a little closer to my own personal singularity. Rob 2.1 is live and answering your questions in the comments below.

The first thing I did was add a few hundred more questions and answers to my training data set. I then fine tuned GPT 3.5 on the new data.

I wanted to get the LLM trinity - prompt, retrieval augmented generation (RAG) and fine turing. Initially I thought that I could just use the OpenAI assistant API to get there, and I got as far as coding the whole thing up before stubbing my toe on a harsh reality. It only supports retrieval for gpt-3.5-turbo-1106 and gpt-4-1106-preview. Hopefully this changes at some point but no way to get everything I need from assistants yet.

Not a big deal - I rolled up my sleeves (and also GitHub Copilot's sleeves) and added my own RAG based on the Q&A training data and refined my prompt to include the most relevant answer as well as some more specific instructions. It's pretty basic - whatever you ask is compared to the existing question library using cosine distance of OpenAI embeddings. Maybe I'll add a vector database if I have the patience to answer enough questions about myself, but a brute force in memory search works fine for now.

Add your comment...

Autumnal Equinox 2023

By Robert Ellison.

Fall starts at 06:60 on September 23 UTC. Autumn if you're British. Spring if you're Australian. Rendered in Catfood Earth.

Add your comment...

Catfood WebCamSaver 3.22

By Robert Ellison. Updated on Saturday, September 23, 2023.

Catfood WebCamSaver 3.22 is available to download. This release updates the webcam list.

Add your comment...

Catfood Earth for Android 4.30

By Robert Ellison. Updated on Saturday, September 23, 2023.

Catfood Earth for Android now supports random locations. The slice of Earth displayed will change periodically throughout the day. You can still set a manual location or have Catfood Earth use your current location. Install from Google Play, existing users will get this update over the next few days.

Add your comment...

San Francisco Budget Chatbot (GPT)

Related Posts

Summer Solstice 2024

Related Posts

Vernal (Spring) Equinox 2024

Related Posts

Download a Sharepoint File with GraphServiceClient (Microsoft Graph API)

Related Posts

Winter Solstice 2023

Related Posts

Catfood Earth 4.40

Related Posts

Rob 2.0

Related Posts

Autumnal Equinox 2023

Related Posts

Catfood WebCamSaver 3.22

Related Posts

Catfood Earth for Android 4.30

Related Posts

Newsletter

Popular Posts

Recent Posts