
Semantic Desktops Powered by LLMs

Over the last two years I’d exported around 250 photos from my Lightroom Classic catalog as JPEGs to share with friends and family. Now I wanted to revisit those RAW files—re-edit them and re-export for this site. The problem: I don’t know where the originals are. They’re buried somewhere among thousands of photos, and all I have are filenames like DSC_0043-Pano-Enhanced-NR.jpg.

I started by manually pasting each filename into Lightroom’s Ctrl+F search to find the corresponding RAW file in my catalog and add it to a collection. That’s fine for a few photos, but a bit ridiculous for over 200.

At that point, the only thing I knew about Lightroom Classic’s automation capabilities was that it stores catalog information in a SQLite database, so there had to be a way to automate this, even if it was just an ugly one-time hacky script1. It was time to turn to the LLM (large language model)2 to do my dirty work.

After I described my problem to an LLM, a much better idea emerged: use Lightroom’s Lua plugin SDK. I guess it’s not going to be so dirty after all (aside from Lua with its 1-based indexing3).

A few minutes later, after letting the LLM execute some tools and following its instructions, I had my photo collection just as I wanted.

The Solution

The conversation started with me describing the problem and clarifying that the exported files are JPEGs while the originals are RAWs. After some back and forth about options—ExifTool, Smart Collections, Lua plugins—the LLM proposed a Lua plugin approach. I gave it the green light with the specific paths:

the directory with all the photo file names (.jpg) I want to add to collection: /mnt/c/Users/jbuck/Downloads/photo_export and you can go ahead and create the lightroom plugin directory here: /mnt/c/Users/jbuck/Documents/LrPlugins/FindByFilename.lrplugin/

From there, the LLM got to work. First, extract the base identifiers from the exported files and filter out photos I’d already processed:

# Extract DSC_XXXX pattern from exported JPEGs
ls ~/exports/*.jpg | xargs -n1 basename \
  | sed 's/\(DSC_[0-9]*\).*/\1/' | sort -u > /tmp/export_ids.txt

# Remove IDs already in the site manifest
jq -r '.photos[].importedFile' manifest.json \
  | sed 's/.*\(DSC_[0-9]*\).*/\1/' | sort -u > /tmp/done_ids.txt

comm -23 /tmp/export_ids.txt /tmp/done_ids.txt > remaining_ids.txt

(I learned about comm in the process…)
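
If comm is new to you too: it compares two sorted files and prints three columns (lines only in the first, lines only in the second, lines in both); -23 suppresses the last two, leaving the exported IDs that aren’t in the manifest yet. A tiny illustration with made-up IDs:

printf 'DSC_0001\nDSC_0002\nDSC_0003\n' > /tmp/exported
printf 'DSC_0002\n' > /tmp/already_done

# -23 keeps only column 1: lines unique to the first file
comm -23 /tmp/exported /tmp/already_done
# DSC_0001
# DSC_0003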

Then, a Lightroom plugin to search the catalog and create a collection:

-- Setup (abridged; the full plugin is linked below): grab the active
-- catalog and the set of target IDs. Note: withWriteAccessDo must be
-- called from an LrTasks async task; that scaffolding and the
-- file-list parsing are omitted here.
local LrApplication = import 'LrApplication'
local catalog = LrApplication.activeCatalog()
local targetIds = {}   -- populated from remaining_ids.txt (loading code omitted here)
local matches = {}

-- Extract DSC_XXXX pattern from a filename
local function extractDscId(filename)
    return filename:match("(DSC_%d+)")
end

-- Search catalog and create collection
local allPhotos = catalog:getAllPhotos()
for _, photo in ipairs(allPhotos) do
    local filename = photo:getFormattedMetadata("fileName")
    local dscId = extractDscId(filename)
    if dscId and targetIds[dscId:upper()] then
        table.insert(matches, photo)
    end
end

catalog:withWriteAccessDo("Create Collection", function()
    local collection = catalog:createCollection("To Re-Export", nil, true)
    collection:addPhotos(matches)
end)

And finally, a few UI steps to install and run the plugin (as instructed by the LLM4):

  1. Open Lightroom Classic
  2. Go to File → Plug-in Manager
  3. Click Add (bottom left)
  4. Navigate to the .lrplugin folder and select it
  5. Click Done
  6. Go to Library → Plug-in Extras → Create Collection from File List

Result: 208 photos found and collected, ready for review. The full plugin and steps involved are documented here, and the raw LLM session has the full chat.

What Just Happened?

This task crossed half a dozen domains: filesystem operations, regex and text processing, JSON parsing, Lua scripting, the Lightroom SDK (which I didn’t know existed), and UI interactions to install and run the plugin. The LLM did the legwork across all of them. I described what I wanted, it bridged the domain boundaries, and I had a working solution in a few minutes.

This is getting at an old idea: the semantic desktop.

The term dates back to the 2000s. The vision was a computing environment where all your data—files, emails, contacts, calendar events—could be seamlessly connected and queried. Your computer would understand the meaning of your data, not just store bytes and execute independent processes. Projects like NEPOMUK and the Semantic Web aimed to achieve this through structured metadata, ontologies, and standardized schemas. The idea was compelling: ask your computer “show me all photos from trips where I met with John” and have it understand the relationships between photos, locations, calendar events, and contacts.

There’s a related idea in the Unix philosophy: small tools that do one thing well, composed together via pipes. du -ah . | sort -rh | head -10—each tool is simple, but composition creates power. I’m a big fan of text-centric applications and APIs5 precisely because they inherit this wealth of tools for composition.
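
Spelled out, each stage does one narrow job and the pipe composes them:

du -ah . |    # disk usage of every file and directory under the current one
  sort -rh |  # sort by the human-readable sizes, largest first
  head -10    # keep the ten biggest entries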

But both approaches hit the same wall. The semantic desktop required everyone to agree on schemas and metadata standards—chasing a “universal API of everything.” The Unix philosophy breaks down at application boundaries. How do you pipe data into Lightroom? How do you compose a spreadsheet with a calendar with a photo library? Some of these ideas work within silos—modern email clients can link messages to contacts and calendar events—but step outside and you’re back to manually connecting the dots.

Why LLMs Change This

A productive human-computer system needs several things working together: knowledge of tools and domains, the ability to execute operations, composition between tools, understanding of what data means (semantics), and some intent driving the whole thing.

Before LLMs, the division of labor was clear. Computers handled execution—running code, crunching numbers, storing bytes. Humans handled everything else. We built tools to help: web search for knowledge, APIs and SDKs for composition, programming languages to express intent. But these just assisted us. The computer couldn’t step in and directly span those gaps.

LLMs shift that boundary. The computer can now participate in knowledge, composition, and semantics—whether from training data or documentation added to the session’s context. When I say “find the original RAW files that match these exported JPEGs,” the LLM understands the semantic relationship even though no formal schema defines it. It translates my intent into the specific incantations each tool requires.

The LLM becomes the glue layer that the semantic desktop always needed.

More Examples

Here’s a simpler example that shows the same pattern. I wanted to know the source code file size distribution in a large Rust codebase (Agave Solana validator). Traditionally, I’d write a script, maybe reach for Python and a plot library.

Instead, I sent a single-shot query directly to an LLM:

show me a distribution of line counts for *.rs files in this repo

Less than two minutes later:

  Lines        Count     %      Distribution
  0-50           257   21.0%  ███████████████████████████████████
  51-100         172   14.0%  ███████████████████████
  101-200        213   17.4%  █████████████████████████████
  201-500        291   23.8%  ████████████████████████████████████████
  501-1000       152   12.4%  ████████████████████
  1001-2000       80    6.5%  ██████████
  2001-5000       44    3.6%  ██████
  5000+           16    1.3%  ██

Summary Statistics:
  Min:        0 lines
  Max:    20564 lines
  Median:   184 lines
  Mean:     508 lines

The LLM handled the file traversal, line counting, bucketing, statistics, and ASCII visualization by composing CLI tools with bash. The raw session is here. The bash commands make it easy to verify how the numbers were derived; the bash is here for the curious.
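
For a sense of what such a composition looks like, here is my own rough reconstruction of that kind of pipeline (bucket boundaries copied from the table above), not the commands from the session:

# count lines per .rs file, bucket the counts, print an ASCII histogram
find . -name '*.rs' -exec wc -l {} + | grep -v ' total$' | awk '
  BEGIN { n = split("0-50 51-100 101-200 201-500 501-1000 1001-2000 2001-5000 5000+", label, " ")
          split("50 100 200 500 1000 2000 5000 999999999", limit, " ") }
  { for (i = 1; i <= n; i++) if ($1 + 0 <= limit[i] + 0) { count[i]++; total++; break } }
  END { if (total == 0) exit
        for (i = 1; i <= n; i++) if (count[i] > max) max = count[i]
        for (i = 1; i <= n; i++) {
          bar = ""
          for (j = 0; j < 40 * count[i] / max; j++) bar = bar "#"
          printf "%-10s %6d  %5.1f%%  %s\n", label[i], count[i] + 0, 100 * count[i] / total, bar
        } }'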

Combining tools isn’t always the point. Sometimes the emphasis is on domain knowledge:

Combining colors: #0096FF + #FF2E2E produces what color?

The LLM understood I was asking about RGB color mixing, computed the average of each channel, and explained the result:

- #0096FF (blue): RGB(0, 150, 255)
- #FF2E2E (red): RGB(255, 46, 46)

Average:
- R: (0 + 255) / 2 = 128
- G: (150 + 46) / 2 = 98
- B: (255 + 46) / 2 = 151

Result: #806297 — a muted purple/violet.
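
Once the intent is pinned down, the arithmetic is trivial to script; here is a throwaway bash version of the same channel averaging (rounding half up), which reproduces the result above:

# average two "#rrggbb" colors channel by channel, rounding half up
mix() {
  local a=${1#\#} b=${2#\#}
  printf '#%02x%02x%02x\n' \
    $(( (0x${a:0:2} + 0x${b:0:2} + 1) / 2 )) \
    $(( (0x${a:2:2} + 0x${b:2:2} + 1) / 2 )) \
    $(( (0x${a:4:2} + 0x${b:4:2} + 1) / 2 ))
}

mix '#0096FF' '#FF2E2E'   # -> #806297

The point, of course, is that I never had to write it; the LLM handled the parsing, the math, and the naming of the result.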

When I followed up asking “what if I add orange?”, it understood the context, added #FFA500 to the mix, and explained how combining many colors tends toward muddy browns (including subtle differences based on how they are combined/mixed). The raw session shows the full exchange.

The mapping from hex codes to RGB to English color names—these are all semantic relationships the LLM has internalized.

For something this simple, you could write a lookup table. But consider the variations: spelling errors (blu, orenge), other languages (azul, rouge), different color spaces (HSV, HSL), associations with objects (apples can be red). The LLM handles all of these without custom code—and can map this knowledge to tool parameters and other domains as needed.

A quick example with typos and mixed languages:

give me a hexcode for a dark blu and bright yelo and then combine them

- Dark Blue: #1a3a5c
- Bright Yellow: #ffdd00

Combined (averaging RGB): #8d8b2e — a muted olive/khaki green

mix in more verde and increase the sateration

Result: #4cb82a — a vivid, saturated green with a slight lime quality

Notice “blu”, “yelo”, “sateration”, and the Spanish “verde”—the LLM handled the semantics without issue. Raw session for reference.

The Glue Layer

For decades, we’ve had powerful tools that lived in silos. The semantic desktop tried to connect them through schemas and standards—too brittle. The Unix philosophy showed composition works, but only when tools speak the same text conventions.

LLMs offer a different approach: a layer that can bridge domains on the fly, translating intent into whatever syntax each tool demands. It’s not perfect, and understanding what’s happening under the hood still matters. But for crossing domain boundaries? The friction has dropped dramatically.


Site Updates

A few things have changed since the last post:

  • New photos — I’ve been steadily adding to the gallery as I work through my backlog
  • Infinite scroll — The photo gallery now loads more photos as you scroll, rather than always loading all thumbnails
  • Mobile improvements — Better touch interactions and layout on smaller screens
  • UI improvements — Better fonts, UI polish, and some minor fixes
  • Internal improvements — Lots of changes to support richer posts with syntax highlighting, rendered subpages, footnotes, etc.

  1. I had visions of a Python script that would read and write the SQLite database directly (which has no public API and no stability guarantees).
  2. I use the term “LLM” instead of “AI” throughout this post to emphasize that it is a Large Language Model. “AI” could mean a lot of things: a fully autonomous synthetic brain, a hive-mind network of different models, Skynet, etc. I am using just a single LLM (Opus 4.5) running with scaffolding for tool use and agentic behavior (Claude Code or Opencode).
  3. This is in jest (mostly), I actually find Lua enjoyable to use.
  4. I do this manually but in theory the LLM could be screen shared and have tools for mouse/keyboard input.
  5. Neovim by the way.