I've been feeding webcam images into the Google Cloud Vision API for a few weeks now, so I thought I'd take a look at what it thinks it can see. The image above shows every label returned from the API, with my confidence going from the bottom to the top and Google's confidence going from left to right (so the top right-hand corner contains labels that we both agree on).
Google is super-confident that it has seen a location. Can't really argue with it there.
It's more confident that it has seen an ice hotel than a sunrise (and it has seen a lot of sunrises at this point). Maybe I need to explore the Outer Sunset more.
Google is 60.96% confident that it has seen a ballistic missile submarine. I suppose that's plausible: I do have an ocean view, but it's rather far away, and unless there was an emergency blow that didn't make the news I'm going to have to call bullshit on that one. It's 72.66% confident that an Aston Martin DB9 went past, which is pretty specific. Possibly a helicopter-slung delivery?
Maybe I'm sending basically the same image in too many times and the poor system is going quietly mad and throwing out increasingly desperate guesses. Probably I've just learned that I should use 80%+ as my confidence threshold before triggering an email...
Nest (previously DropCam) can email you when it detects activity but that gets boring quickly. How about an email only when it sees something totally new?
The script below downloads a frame from a webcam and then calls the Google Cloud Vision API to label features. It keeps a record of everything that has previously been seen and only sends an email when a new feature is detected. You could easily tweak this to email on a specific feature (e.g. every time your dog is spotted), or to count the number of times a feature appears. I'm using a Nest cam but any security camera that has a publicly visible image download URL will work.
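The "only email on something new" idea boils down to a set difference. Here's a minimal sketch of that step in plain JavaScript, assuming a modern (V8) Apps Script runtime and label objects with `description` and `score` fields. The `findNewLabels` name and the lower-casing of descriptions are my own choices for illustration, not necessarily what the full script does.

```javascript
// Hypothetical helper: returns the labels whose description has not been
// seen before, and records them in `seen` so they only count as new once.
function findNewLabels(seen, labels) {
  const fresh = [];
  for (const label of labels) {
    // Assumption: compare case-insensitively so "Sky" and "sky" match.
    const key = label.description.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      fresh.push(label);
    }
  }
  return fresh;
}

// Example: "sky" is already known, so only the submarine counts as new.
const seenSoFar = new Set(['sky', 'cloud']);
const fresh = findNewLabels(seenSoFar, [
  { description: 'Sky', score: 0.97 },
  { description: 'Ballistic missile submarine', score: 0.61 },
]);
console.log(fresh.map(function (l) { return l.description; }));
```

An email would go out only when the returned array is non-empty. In Apps Script the set has to survive between runs, so it would need to be saved somewhere persistent (script properties or a spreadsheet, for example) and reloaded at the start of each run.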
There is a bit of setup to get this working. Create a new Apps Script project in Google Drive and paste the code in. You'll need to provide your own values for the three variables at the top.
OAuthCreds is the contents of the JSON format private key file for a Google Developer Console project. Go to the console, create a new project and enable the Cloud Vision API. You'll also need to enable billing (more on this below) - a trial account will work fine for this. Once the API is enabled create a service account under Credentials and download the JSON file. Just paste the contents of this into the script.
That's the hard part over. Now enter the URL of the image to monitor (see this post for instructions on finding this for a Nest / DropCam device) as MonitorImageUrl and your email address for SendEmailTo.
One last thing - follow the instructions here to reference the OAuth2 for Apps Script library.
Once this is all done run the script (the main() function) and authorize it. You should get an email with a picture attached and a list of the labels detected together with a confidence score from 0 to 1. If this doesn't happen check the logs (under the View menu).
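If the email doesn't look right, it helps to know the shape of the data going in and out. The sketch below shows the JSON body for a label detection request to the Vision API's `images:annotate` endpoint (`https://vision.googleapis.com/v1/images:annotate`) and how the labels and scores come back under `labelAnnotations`. The function names are hypothetical, and the actual HTTP call and OAuth handling are left out.

```javascript
// Builds the request body for label detection on one base64-encoded image.
function buildLabelRequest(imageBase64, maxResults) {
  return {
    requests: [{
      image: { content: imageBase64 },
      features: [{ type: 'LABEL_DETECTION', maxResults: maxResults }],
    }],
  };
}

// Pulls { description, score } pairs out of a parsed API response.
// Returns an empty array if the response contains no labels.
function extractLabels(apiResponse) {
  const first = (apiResponse.responses || [])[0] || {};
  return (first.labelAnnotations || []).map(function (annotation) {
    return { description: annotation.description, score: annotation.score };
  });
}

// Formats labels one per line for the email body, scores to two decimals.
function formatLabels(labels) {
  return labels.map(function (label) {
    return label.description + ': ' + label.score.toFixed(2);
  }).join('\n');
}
```

Logging the output of `formatLabels(extractLabels(response))` for a parsed response is a quick way to check what the API returned before worrying about the email step.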
You can now schedule the script to run repeatedly (Resources -> Current project's triggers). You get up to 1,000 units a month for free so once an hour should be safe. If you need more frequent updates check the Cloud Vision pricing guide for details.
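Here's the arithmetic behind "once an hour should be safe", as a quick sketch. It assumes each run sends one image and requests one feature, so one run costs one unit, and it uses a 31-day month as the worst case. The function names are made up for this example.

```javascript
// Assumption: one run = one image with one feature = one billable unit.
function unitsPerMonth(intervalMinutes, daysInMonth) {
  const minutesInMonth = daysInMonth * 24 * 60;
  return Math.ceil(minutesInMonth / intervalMinutes);
}

// True if running every `intervalMinutes` stays within `freeUnits`
// even in a 31-day month.
function fitsFreeTier(intervalMinutes, freeUnits) {
  return unitsPerMonth(intervalMinutes, 31) <= freeUnits;
}

console.log(unitsPerMonth(60, 31));  // 744
console.log(fitsFreeTier(60, 1000)); // true
console.log(fitsFreeTier(30, 1000)); // false (1488 units)
```

So hourly runs use at most 744 of the 1,000 free units, while every 30 minutes goes over.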
After a few runs you should only get an email when something new is detected. If you're seeing too many wild guesses then add a filter on the score to exclude low confidence features.
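A score filter can be as small as the sketch below. It assumes labels are objects with a `score` between 0 and 1, and `filterByScore` is a hypothetical name; the 0.8 cut-off is the threshold I settled on above, so adjust it to taste.

```javascript
// Keeps only the labels the API is at least `minScore` confident about.
function filterByScore(labels, minScore) {
  return labels.filter(function (label) {
    return label.score >= minScore;
  });
}

// With a 0.8 threshold the submarine and the Aston Martin both drop out.
const confident = filterByScore([
  { description: 'Sky', score: 0.97 },
  { description: 'Aston Martin DB9', score: 0.7266 },
  { description: 'Ballistic missile submarine', score: 0.6096 },
], 0.8);
console.log(confident.map(function (l) { return l.description; })); // [ 'Sky' ]
```

Running the filter before comparing against the list of previously seen features means low-confidence guesses never get recorded, so they can still trigger an email later if the API becomes more certain about them.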
Enjoy, and leave a comment if you have problems (or modify this in interesting ways).