Philip Brewer - An actually useful use-case for large language models

I just thought of a possibly actually useful use-case for large language models (what’s being called AI these days): Generating metadata for your photo library.

This is useful because almost nobody is willing to generate their own metadata for photos. Most people have vast libraries with literally nothing but the image itself, the date, time, and location captured by their phone or camera, and details of the capture (exposure time, ISO, etc.).

Using the date, time, and location info, together with the image itself, AI could:

Write a brief description of the image.
Tell you where it was taken from (not just the latitude and longitude, but the name of the place where you were standing).
Look up whether an event was underway at that place and time and say what it was (county fair, protest march).
Tell you any number of other things, such as whether something was going on with the weather at that time (blizzard, wind chill advisory), but only if it was interesting.
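
To make the list above a little more concrete, here is a rough sketch of how the capture data might be turned into a request for a model. Everything named here (CaptureInfo, build_prompt, describe_photo, ask_model) is made up for illustration, and ask_model stands in for whatever local model you happen to run. The event and weather items assume the model has some way to look those things up, which a bare LLM would not.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class CaptureInfo:
    """What a phone or camera already records (hypothetical container)."""
    taken_at: datetime
    latitude: float
    longitude: float


def build_prompt(info: CaptureInfo) -> str:
    """Turn the capture data into the four requests from the list above."""
    return (
        f"This photo was taken on {info.taken_at:%Y-%m-%d} at {info.taken_at:%H:%M}, "
        f"at latitude {info.latitude:.4f}, longitude {info.longitude:.4f}.\n"
        "1. Write a brief description of the image.\n"
        "2. Name the place where the photographer was standing.\n"
        "3. If an event was underway at that place and time, say what it was.\n"
        "4. Mention the weather only if it was unusual.\n"
    )


def describe_photo(image_bytes: bytes, info: CaptureInfo,
                   ask_model: Callable[[str, bytes], str]) -> str:
    """Send the prompt and image to a model; ask_model is an assumed stand-in."""
    return ask_model(build_prompt(info), image_bytes)
```

Passing the model in as a plain function keeps the sketch independent of any particular LLM library, so the prompt-building part can be tried out on its own.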

I know Google Photos can already do some of this. I don’t think it writes metadata for you, but it will find all of your photos that were taken in St. Croix, for example. (I’d heard that it could locate all your photos of a particular sculpture, but it didn’t work for the sculpture I just tried to find.) In any case, an LLM running on your own computer and saving the data to your photo library would have all kinds of advantages. There are the obvious privacy advantages, but also sharing advantages: the metadata (or a subset that you selected) would be available to be included when you shared the image with a friend.
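
As a minimal sketch of the "saving the data to your photo library" and "subset that you selected" ideas, the generated metadata could be written to a small JSON file next to each photo. This is a simplification: the ".jpg.json" naming and the function names are assumptions made for this example, and a real photo library would more likely store the data in the image's own metadata fields.

```python
import json
from pathlib import Path


def sidecar_path(photo: Path) -> Path:
    """IMG_0001.jpg -> IMG_0001.jpg.json (a naming choice made up for this sketch)."""
    return photo.with_name(photo.name + ".json")


def save_metadata(photo: Path, metadata: dict) -> Path:
    """Write the generated metadata next to the photo, on your own disk."""
    target = sidecar_path(photo)
    target.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target


def subset_for_sharing(metadata: dict, share_keys: set) -> dict:
    """Keep only the fields you chose to share with a friend."""
    return {k: v for k, v in metadata.items() if k in share_keys}
```

For example, you might keep the description and place name when sharing, and leave out anything you would rather keep to yourself.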