Dev Log: January 16, 2026
Courses
Spent time evaluating heterogeneity-aware GPU scheduling (Gavel) and scoping a CS244 project. The Gavel evaluation helped clarify when smart GPU allocation actually pays off versus when it reduces to a simple baseline. On the project side, reviewed the Shockwave reproduction as a reference point for calibrating scope with a four-person team. Later, worked on generating Word documents with internal hyperlinks and citation tracking.
The evaluation reveals when heterogeneity-awareness matters most:
- High cluster load (queues build up, smart allocation prevents explosion)
- Diverse workloads (mix of jobs with different GPU preferences)
- Diverse hardware (more GPU types = more optimization opportunity)
If your cluster has one GPU type or one job type, Gavel reduces to the baseline. The win comes from matching diverse jobs to diverse hardware.
The Shockwave sample shows the typical CS244 scope: 2 students reproduced one key figure from the paper by implementing ~3 core components. For 4 people, we need roughly double this scope with clearly separable work streams.
Internal hyperlinks in python-docx: Word documents use two mechanisms for internal links: (1) bookmarks - invisible anchors at target locations, and (2) hyperlinks with w:anchor attribute pointing to bookmark names. Since python-docx doesn’t expose these directly, we manipulate the underlying OOXML elements via OxmlElement and the qn() namespace helper.
Word internal links: Unlike HTML anchors, Word uses a two-part system: <w:bookmarkStart> marks the destination with a unique ID and name, then <w:hyperlink w:anchor="name"> links to it. The ID must be unique across all bookmarks in the document, while the name is what hyperlinks reference.
Section-scoped state: The pattern used here - a global _cited_in_section set that gets cleared on section boundaries - is a lightweight way to implement “first occurrence per scope” logic without restructuring the entire parser. It trades some purity for simplicity compared to passing state through every function.
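A minimal sketch of that pattern (names are illustrative, not the parser's actual API):

```python
# "First occurrence per scope" via module-level state: the set is
# cleared whenever the parser crosses a section boundary.
_cited_in_section: set[str] = set()

def start_section() -> None:
    # Called at each section boundary; resets per-section citation state.
    _cited_in_section.clear()

def cite(key: str) -> bool:
    """Return True only for the first occurrence of `key` in this section."""
    if key in _cited_in_section:
        return False
    _cited_in_section.add(key)
    return True
```

The trade-off is exactly as described: global mutable state instead of threading a context object through every function.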
Podcast Summarizer v2
A big day for the podcast summarizer. Started by fixing a SQL Server CASCADE constraint issue, then dealt with a CI outage blocking a merge. Shifted gears to rewriting the project’s CLAUDE.md with better design philosophy, then investigated the Azure OpenAI Batch API as a cost optimization for summarization. The most significant find was a dead-code bug where the daily delivery limit logic was implemented and tested but never wired into the production code path.
SQL Server CASCADE Constraint Limitation: SQL Server prevents multiple CASCADE paths to the same table because it can’t determine which path to follow during deletion. Here, deliveries already cascades to users via subscriptions (subscription → user), so adding a direct deliveries.user_id → users.user_id CASCADE creates ambiguity. The fix is to use NO ACTION or SET NULL instead.
Summary of the fix:
- Root Cause: SQL Server rejects multiple CASCADE paths to the same table. The `deliveries` table already cascades to `users` via `subscriptions`, so a direct `user_id → users` CASCADE created a cycle.
- Fix: Changed `ondelete="CASCADE"` to `ondelete="NO ACTION"` for the `user_id` FK, matching the existing pattern used for `validation_job_id`.
- Impact: When a user is deleted, direct deliveries with that `user_id` will not be automatically deleted (you'd need to handle this in application code or via a separate cleanup process).
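In SQLAlchemy terms, the resulting FK declarations look roughly like this (a sketch; the table and column names follow the log, everything else is assumed):

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Delivery(Base):
    __tablename__ = "deliveries"
    delivery_id = Column(Integer, primary_key=True)
    # Existing CASCADE path to users: deliveries -> subscriptions -> users.
    subscription_id = Column(
        Integer, ForeignKey("subscriptions.subscription_id", ondelete="CASCADE")
    )
    # A second CASCADE path to users would be rejected by SQL Server,
    # so this direct FK uses NO ACTION instead.
    user_id = Column(
        Integer, ForeignKey("users.user_id", ondelete="NO ACTION")
    )
```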
Situation: The code fix for NULLS LAST is correct (unit tests pass), but the GPU Docker image build is failing because Ubuntu’s Launchpad PPA service is experiencing an outage. This prevents the workflow from completing.
Options:
- Wait for Launchpad to recover and re-run
- Merge with admin bypass since this is infrastructure-related, not code-related
- The fix is in PR #93 - you can merge when ready
CLAUDE.md Design Philosophy: The purpose of CLAUDE.md is to give AI assistants (and developers) the context they need to work effectively without reading every design doc. Key principles:
- Constraints first - What NOT to do is often more valuable than what to do
- Decisions over descriptions - Capture the “why” behind architectural choices
- Single source of truth - Reference detailed docs rather than duplicating
Key improvements based on best practices:
- Progressive disclosure - References design docs instead of duplicating content
- Constraints-first - DO NOT VIOLATE table is prominent and expanded
- Removed redundancy - Dropped env vars table (duplicates config.py), testing strategy (covered in Commands)
Current summarization flow:
- CPU processor claims deliveries with ready transcripts (lines 455-479 in `cpu_processor.py`)
- Calls `LLMProvider.generate_summary()` synchronously for each transcript
- Has a `TokenRateLimiter` that tracks TPM/RPM and waits when approaching Azure quotas
- One-at-a-time processing with 50k TPM limit
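The TPM-limiting behavior described above could be sketched like this (a toy version; the real `TokenRateLimiter` may also track RPM and use different bookkeeping):

```python
import time

class TokenRateLimiter:
    """Rolling-window tokens-per-minute limiter (illustrative sketch)."""

    def __init__(self, tpm_limit: int = 50_000):
        self.tpm_limit = tpm_limit
        self.window: list[tuple[float, int]] = []  # (timestamp, tokens)

    def _used_last_minute(self, now: float) -> int:
        # Drop entries older than 60s, then sum what's left.
        self.window = [(t, n) for t, n in self.window if now - t < 60]
        return sum(n for _, n in self.window)

    def acquire(self, tokens: int) -> None:
        # Block until this request fits under the rolling TPM budget.
        while self._used_last_minute(time.monotonic()) + tokens > self.tpm_limit:
            time.sleep(0.1)
        self.window.append((time.monotonic(), tokens))
```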
Why batch API fits your philosophy: The CLAUDE.md states “prefer batching and scale-to-zero over low latency. Optimize for cost, not speed” - this is exactly what the batch API is designed for (50% cost savings, 24-hour SLA).
Azure OpenAI Batch API basics:
- Requires a Global Batch deployment (separate from your standard `gpt-5.2-chat` deployment)
- You upload a JSONL file to blob storage with all requests
- Azure processes asynchronously, writes results back to blob
- 50% cost savings vs standard pricing
- 24-hour SLA (typically faster in practice)
- Results available for 24 hours after completion
Good news: You already have Azure Blob storage set up, which is required for batch input/output files.
The daily limit SQL is dead code. The sophisticated `claim_deliveries` function with daily limit logic exists in `cpu_processor.py:284`, but the actual production path (`run_cpu_processor` → `delivery_service.deliver_pending`) never calls it.
- Tests pass because they call `claim_deliveries` directly
- Production fails because `deliver_pending` uses its own simple query with no daily limit
This is a classic integration gap - the feature was implemented and unit tested, but never wired into the actual code path.
The Bug: The claim_deliveries function with daily limit logic was implemented but never called by the production code path. The run_cpu_processor function called delivery_service.deliver_pending() which had its own simple query with no daily limit checking.
Why tests passed: The existing tests called claim_deliveries directly, not through run_cpu_processor. Classic integration gap.
The Fix: Three changes:
- Modified `run_cpu_processor` to use `claim_deliveries()` instead of bypassing it
- Added a `pre_claimed` parameter to `deliver_one()` to skip re-claiming
- Updated status methods (`_mark_sent`, `_mark_failed`, `_mark_retry`, `_release_delivery`) to clear `processing_owner_id`
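A stripped-down sketch of the rewiring (toy stand-ins, not the project's actual functions):

```python
def claim_deliveries(pending, sent_today, daily_limit):
    """Claim at most (daily_limit - sent_today) pending deliveries."""
    budget = max(0, daily_limit - sent_today)
    return pending[:budget]

def run_cpu_processor(pending, sent_today, daily_limit, deliver_one):
    # Before the fix, this path called deliver_pending(), whose own
    # query ignored the daily limit entirely.
    for delivery in claim_deliveries(pending, sent_today, daily_limit):
        deliver_one(delivery, pre_claimed=True)  # skip re-claiming
```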
Tools
Worked on improving a Safari web clipper AppleScript. The main challenge was getting HTML content written to disk reliably, since shell heredocs break on HTML metacharacters. Switched to native AppleScript file I/O, replaced trafilatura with an embedded JavaScript-based HTML-to-Markdown converter, and added image downloading support.
Why heredocs fail here:
- `do shell script` in AppleScript passes commands to `/bin/sh -c` as a single string
- Heredocs (`<< 'EOF'`) require multi-line handling that doesn't work well in this context
- HTML contains many shell metacharacters (`<`, `>`, `$`, backticks, quotes) that cause parsing errors
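The failure mode is easy to reproduce from Python, which can route a string through `/bin/sh -c` the same way `do shell script` does (a demonstration, not part of the clipper):

```python
import subprocess
import tempfile

html = '<p>Total: $HOME and `whoami`</p>'

# Unsafe: interpolating HTML into a shell command line. Even inside
# double quotes, /bin/sh expands $HOME and runs the backticked command,
# so the content is silently corrupted.
shell_out = subprocess.run(
    ["/bin/sh", "-c", f'echo "{html}"'],
    capture_output=True, text=True,
).stdout

# Safe: writing the bytes directly, analogous to AppleScript's native
# file I/O. No shell ever sees the content.
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as f:
    f.write(html)
    path = f.name
```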
Why native AppleScript file I/O is better here:
- No shell escaping needed - writes raw bytes directly
- `«class utf8»` ensures proper UTF-8 encoding for international characters
- `set eof of fileRef to 0` clears any existing content before writing
- The nested `try` in the error handler ensures the file handle is always closed
Key AppleScript improvements I’ll make:
- Native file I/O - Bypasses shell entirely, avoiding all escaping issues
- Better PDF detection - Handles URLs with query strings (e.g., `file.pdf?token=xxx`)
- curl timeout - Prevents hanging on slow/unresponsive servers
- Cleaner error handling - Consolidated cleanup in error paths
- Remove scroll-to-bottom - This was a visual indicator but feels unnecessary
Why native AppleScript I/O is robust:
- `«class utf8»` ensures proper encoding for any Unicode content
- `set eof of fileRef to 0` truncates any existing content before writing
- The nested `try` in error handling guarantees the file handle is always closed, preventing resource leaks
- No shell interpolation means `<script>`, `$` variables, and backticks are written literally
Why trafilatura was chosen:
- It’s a battle-tested library for extracting article content from messy HTML
- Handles boilerplate removal, ads, nav elements, etc.
- But it has ~15 transitive dependencies (lxml, urllib3, certifi, etc.)
How the embedded JS converter works:
- Recursively walks the DOM tree, converting each element to Markdown
- Handles all common elements: headers, paragraphs, lists, links, images, code blocks, tables, blockquotes
- Leverages Safari’s Reader Mode when available for pre-cleaned content
- Falls back to finding `<article>`, `<main>`, or similar semantic elements
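The recursive DOM-walk idea translates to a toy Python version using only the stdlib (the real converter is JavaScript and handles far more element types):

```python
from html.parser import HTMLParser

class MiniMarkdown(HTMLParser):
    """Toy HTML-to-Markdown walker: a few common tags, to show the idea."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.out.append("[")
        elif tag == "code":
            self.out.append("`")

    def handle_endtag(self, tag):
        if tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None
        elif tag == "code":
            self.out.append("`")

    def handle_data(self, data):
        self.out.append(data)

def to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    return "".join(parser.out).strip()
```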
How image downloading works in the JS version:
- Extract - grep parses the markdown for image-link patterns
- Resolve - Handles relative URLs (`/path`, `//host/path`, `path`)
- Download - curl fetches with a proper User-Agent and a 15s timeout
- Dedupe - Adds a counter suffix if the filename already exists
- Rewrite - sed replaces the markdown syntax with `![[folder/file.jpg]]`
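The extract/resolve/rewrite steps can be sketched in Python (the actual clipper uses grep/curl/sed; `rewrite_images` is illustrative and leaves out the download and dedupe stages):

```python
import re
from urllib.parse import urljoin

# Markdown image syntax: ![alt](url)
IMG_RE = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")

def rewrite_images(markdown: str, page_url: str, folder: str):
    """Return (rewritten markdown, absolute image URLs to download)."""
    urls = []

    def repl(match):
        # urljoin resolves /path, //host/path, and bare relative paths.
        absolute = urljoin(page_url, match.group(1))
        urls.append(absolute)
        name = absolute.rsplit("/", 1)[-1] or "image"
        return f"![[{folder}/{name}]]"

    return IMG_RE.sub(repl, markdown), urls
```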