Dev Log: January 16, 2026
Courses
Spent time evaluating heterogeneity-aware GPU scheduling (Gavel) and scoping a CS244 project. The Gavel evaluation helped clarify when smart GPU allocation actually pays off versus when it reduces to a simple baseline. On the project side, reviewed the Shockwave reproduction as a reference point for calibrating scope with a four-person team. Later, worked on generating Word documents with internal hyperlinks and citation tracking.
The evaluation reveals when heterogeneity-awareness matters most:
- High cluster load (queues build up, smart allocation prevents explosion)
- Diverse workloads (mix of jobs with different GPU preferences)
- Diverse hardware (more GPU types = more optimization opportunity)
If your cluster has one GPU type or one job type, Gavel reduces to the baseline. The win comes from matching diverse jobs to diverse hardware.
The Shockwave sample shows the typical CS244 scope: 2 students reproduced one key figure from the paper by implementing ~3 core components. For 4 people, we need roughly double this scope with clearly separable work streams.
Internal hyperlinks in python-docx: Word documents use two mechanisms for internal links: (1) bookmarks - invisible anchors at target locations, and (2) hyperlinks with w:anchor attribute pointing to bookmark names. Since python-docx doesn’t expose these directly, we manipulate the underlying OOXML elements via OxmlElement and the qn() namespace helper.
Word internal links: Unlike HTML anchors, Word uses a two-part system: <w:bookmarkStart> marks the destination with a unique ID and name, then <w:hyperlink w:anchor="name"> links to it. The ID must be unique across all bookmarks in the document, while the name is what hyperlinks reference.
Section-scoped state: The pattern used here - a global _cited_in_section set that gets cleared on section boundaries - is a lightweight way to implement “first occurrence per scope” logic without restructuring the entire parser. It trades some purity for simplicity compared to passing state through every function.
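A minimal sketch of that pattern (names are illustrative, not the parser's actual API):

```python
# "First occurrence per scope" via module-level state: the set is
# cleared whenever the parser crosses a section boundary.
_cited_in_section: set[str] = set()

def start_section() -> None:
    # Called at each section boundary; resets per-section citation state.
    _cited_in_section.clear()

def cite(key: str) -> bool:
    """Return True only for the first occurrence of `key` in this section."""
    if key in _cited_in_section:
        return False
    _cited_in_section.add(key)
    return True
```

The trade-off is exactly as described: global mutable state instead of threading a context object through every function.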
Podcast Summarizer v2
A big day for the podcast summarizer. Started by fixing a SQL Server CASCADE constraint issue, then dealt with a CI outage blocking a merge. Shifted gears to rewriting the project’s CLAUDE.md with better design philosophy, then investigated the Azure OpenAI Batch API as a cost optimization for summarization. The most significant find was a dead-code bug where the daily delivery limit logic was implemented and tested but never wired into the production code path.
SQL Server CASCADE Constraint Limitation: SQL Server prevents multiple CASCADE paths to the same table because it can’t determine which path to follow during deletion. Here, deliveries already cascades to users via subscriptions (subscription → user), so adding a direct deliveries.user_id → users.user_id CASCADE creates ambiguity. The fix is to use NO ACTION or SET NULL instead.
Summary of the fix:
- Root Cause: SQL Server rejects multiple CASCADE paths to the same table. The `deliveries` table already cascades to `users` via `subscriptions`, so a direct `user_id → users` CASCADE created a cycle.
- Fix: Changed `ondelete="CASCADE"` to `ondelete="NO ACTION"` for the `user_id` FK, matching the existing pattern used for `validation_job_id`.
- Impact: When a user is deleted, direct deliveries with that `user_id` will not be automatically deleted (you'd need to handle this in application code or via a separate cleanup process).
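In SQLAlchemy terms, the resulting FK declarations look roughly like this (a sketch; the table and column names follow the log, everything else is assumed):

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Delivery(Base):
    __tablename__ = "deliveries"
    delivery_id = Column(Integer, primary_key=True)
    # Existing CASCADE path to users: deliveries -> subscriptions -> users.
    subscription_id = Column(
        Integer, ForeignKey("subscriptions.subscription_id", ondelete="CASCADE")
    )
    # A second CASCADE path to users would be rejected by SQL Server,
    # so this direct FK uses NO ACTION instead.
    user_id = Column(
        Integer, ForeignKey("users.user_id", ondelete="NO ACTION")
    )
```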
Situation: The code fix for NULLS LAST is correct (unit tests pass), but the GPU Docker image build is failing because Ubuntu’s Launchpad PPA service is experiencing an outage. This prevents the workflow from completing.
Options:
- Wait for Launchpad to recover and re-run
- Merge with admin bypass since this is infrastructure-related, not code-related
- The fix is in PR #93 - you can merge when ready
CLAUDE.md Design Philosophy: The purpose of CLAUDE.md is to give AI assistants (and developers) the context they need to work effectively without reading every design doc. Key principles:
- Constraints first - What NOT to do is often more valuable than what to do
- Decisions over descriptions - Capture the “why” behind architectural choices
- Single source of truth - Reference detailed docs rather than duplicating
Key improvements based on best practices:
- Progressive disclosure - References design docs instead of duplicating content
- Constraints-first - DO NOT VIOLATE table is prominent and expanded
- Removed redundancy - Dropped env vars table (duplicates config.py), testing strategy (covered in Commands)
Current summarization flow:
- CPU processor claims deliveries with ready transcripts (lines 455-479 in `cpu_processor.py`)
- Calls `LLMProvider.generate_summary()` synchronously for each transcript
- Has a `TokenRateLimiter` that tracks TPM/RPM and waits when approaching Azure quotas
- One-at-a-time processing with 50k TPM limit
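The TPM-limiting behavior described above could be sketched like this (a toy version; the real `TokenRateLimiter` may also track RPM and use different bookkeeping):

```python
import time

class TokenRateLimiter:
    """Rolling-window tokens-per-minute limiter (illustrative sketch)."""

    def __init__(self, tpm_limit: int = 50_000):
        self.tpm_limit = tpm_limit
        self.window: list[tuple[float, int]] = []  # (timestamp, tokens)

    def _used_last_minute(self, now: float) -> int:
        # Drop entries older than 60s, then sum what's left.
        self.window = [(t, n) for t, n in self.window if now - t < 60]
        return sum(n for _, n in self.window)

    def acquire(self, tokens: int) -> None:
        # Block until this request fits under the rolling TPM budget.
        while self._used_last_minute(time.monotonic()) + tokens > self.tpm_limit:
            time.sleep(0.1)
        self.window.append((time.monotonic(), tokens))
```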
Why batch API fits your philosophy: The CLAUDE.md states “prefer batching and scale-to-zero over low latency. Optimize for cost, not speed” - this is exactly what the batch API is designed for (50% cost savings, 24-hour SLA).
Azure OpenAI Batch API basics:
- Requires a Global Batch deployment (separate from your standard `gpt-5.2-chat` deployment)
- You upload a JSONL file to blob storage with all requests
- Azure processes asynchronously, writes results back to blob
- 50% cost savings vs standard pricing
- 24-hour SLA (typically faster in practice)
- Results available for 24 hours after completion
Good news: You already have Azure Blob storage set up, which is required for batch input/output files.
The daily limit SQL is dead code. The sophisticated `claim_deliveries` function with daily limit logic exists in `cpu_processor.py:284`, but the actual production path (`run_cpu_processor` → `delivery_service.deliver_pending`) never calls it.
- Tests pass because they call `claim_deliveries` directly
- Production fails because `deliver_pending` uses its own simple query with no daily limit
This is a classic integration gap - the feature was implemented and unit tested, but never wired into the actual code path.
The Bug: The claim_deliveries function with daily limit logic was implemented but never called by the production code path. The run_cpu_processor function called delivery_service.deliver_pending() which had its own simple query with no daily limit checking.
Why tests passed: The existing tests called claim_deliveries directly, not through run_cpu_processor. Classic integration gap.
The Fix: Three changes:
- Modified `run_cpu_processor` to use `claim_deliveries()` instead of bypassing it
- Added a `pre_claimed` parameter to `deliver_one()` to skip re-claiming
- Updated status methods (`_mark_sent`, `_mark_failed`, `_mark_retry`, `_release_delivery`) to clear `processing_owner_id`
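A stripped-down sketch of the rewiring (toy stand-ins, not the project's actual functions):

```python
def claim_deliveries(pending, sent_today, daily_limit):
    """Claim at most (daily_limit - sent_today) pending deliveries."""
    budget = max(0, daily_limit - sent_today)
    return pending[:budget]

def run_cpu_processor(pending, sent_today, daily_limit, deliver_one):
    # Before the fix, this path called deliver_pending(), whose own
    # query ignored the daily limit entirely.
    for delivery in claim_deliveries(pending, sent_today, daily_limit):
        deliver_one(delivery, pre_claimed=True)  # skip re-claiming
```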
Tools
Worked on improving a Safari web clipper AppleScript. The main challenge was getting HTML content written to disk reliably, since shell heredocs break on HTML metacharacters. Switched to native AppleScript file I/O, replaced trafilatura with an embedded JavaScript-based HTML-to-Markdown converter, and added image downloading support.
Why heredocs fail here:
- `do shell script` in AppleScript passes commands to `/bin/sh -c` as a single string
- Heredocs (`<< 'EOF'`) require multi-line handling that doesn't work well in this context
- HTML contains many shell metacharacters (`<`, `>`, `$`, backticks, quotes) that cause parsing errors
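The failure mode is easy to reproduce from Python, which can route a string through `/bin/sh -c` the same way `do shell script` does (a demonstration, not part of the clipper):

```python
import subprocess
import tempfile

html = '<p>Total: $HOME and `whoami`</p>'

# Unsafe: interpolating HTML into a shell command line. Even inside
# double quotes, /bin/sh expands $HOME and runs the backticked command,
# so the content is silently corrupted.
shell_out = subprocess.run(
    ["/bin/sh", "-c", f'echo "{html}"'],
    capture_output=True, text=True,
).stdout

# Safe: writing the bytes directly, analogous to AppleScript's native
# file I/O. No shell ever sees the content.
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as f:
    f.write(html)
    path = f.name
```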
Why native AppleScript file I/O is better here:
- No shell escaping needed - writes raw bytes directly
- `«class utf8»` ensures proper UTF-8 encoding for international characters
- `set eof of fileRef to 0` clears any existing content before writing
- The nested `try` in the error handler ensures the file handle is always closed
Key AppleScript improvements I’ll make:
- Native file I/O - Bypasses shell entirely, avoiding all escaping issues
- Better PDF detection - Handles URLs with query strings (e.g., `file.pdf?token=xxx`)
- curl timeout - Prevents hanging on slow/unresponsive servers
- Cleaner error handling - Consolidated cleanup in error paths
- Remove scroll-to-bottom - This was a visual indicator but feels unnecessary
Why native AppleScript I/O is robust:
- `«class utf8»` ensures proper encoding for any Unicode content
- `set eof of fileRef to 0` truncates any existing content before writing
- The nested `try` in error handling guarantees the file handle is always closed, preventing resource leaks
- No shell interpolation means `<script>`, `$` variables, and backticks are written literally
Why trafilatura was chosen:
- It’s a battle-tested library for extracting article content from messy HTML
- Handles boilerplate removal, ads, nav elements, etc.
- But it has ~15 transitive dependencies (lxml, urllib3, certifi, etc.)
How the embedded JS converter works:
- Recursively walks the DOM tree, converting each element to Markdown
- Handles all common elements: headers, paragraphs, lists, links, images, code blocks, tables, blockquotes
- Leverages Safari’s Reader Mode when available for pre-cleaned content
- Falls back to finding `<article>`, `<main>`, or similar semantic elements
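The recursive DOM-walk idea translates to a toy Python version using only the stdlib (the real converter is JavaScript and handles far more element types):

```python
from html.parser import HTMLParser

class MiniMarkdown(HTMLParser):
    """Toy HTML-to-Markdown walker: a few common tags, to show the idea."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.out.append("[")
        elif tag == "code":
            self.out.append("`")

    def handle_endtag(self, tag):
        if tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None
        elif tag == "code":
            self.out.append("`")

    def handle_data(self, data):
        self.out.append(data)

def to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    return "".join(parser.out).strip()
```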
How image downloading works in the JS version:
- Extract - grep parses the markdown for image-link patterns
- Resolve - Handles relative URLs (`/path`, `//host/path`, `path`)
- Download - curl fetches with a proper User-Agent and a 15s timeout
- Dedupe - Adds a counter suffix if the filename already exists
- Rewrite - sed replaces the markdown syntax with `![[folder/file.jpg]]`
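The extract/resolve/rewrite steps can be sketched in Python (the actual clipper uses grep/curl/sed; `rewrite_images` is illustrative and leaves out the download and dedupe stages):

```python
import re
from urllib.parse import urljoin

# Markdown image syntax: ![alt](url)
IMG_RE = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")

def rewrite_images(markdown: str, page_url: str, folder: str):
    """Return (rewritten markdown, absolute image URLs to download)."""
    urls = []

    def repl(match):
        # urljoin resolves /path, //host/path, and bare relative paths.
        absolute = urljoin(page_url, match.group(1))
        urls.append(absolute)
        name = absolute.rsplit("/", 1)[-1] or "image"
        return f"![[{folder}/{name}]]"

    return IMG_RE.sub(repl, markdown), urls
```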