Bhavana AI

AI/ML insights

Dev Log: December 25, 2025

podcast-summarizer-v2

Built a full admin dashboard for the podcast summarizer, from backend API to frontend SPA deployment. Started with FastAPI admin endpoints using dependency-chained authorization, then implemented eight routers covering dashboard stats, user/channel/request management, content pipeline monitoring, and manual job triggers. Moved to the frontend with a React + TanStack Query SPA, wired up Azure Static Web Apps with SPA routing, added CI/CD via GitHub Actions, and wrapped up the day with a major query optimization pass that collapsed N+1 patterns across the codebase into efficient batch queries.

Task 1 Pattern: The AdminUser = Annotated[User, Depends(require_admin)] type alias is a FastAPI best practice. It chains dependencies—when you use AdminUser as a parameter type, FastAPI automatically:

  1. Calls get_current_user (from CurrentUser)
  2. Then calls require_admin with that user
  3. Injects the validated admin user into your endpoint

Router Aggregation Pattern: The admin router uses a hub-and-spoke pattern. The main router.py will later include sub-routers for each domain (dashboard, users, channels) with router.include_router(dashboard.router, prefix="/dashboard"). This keeps each domain’s endpoints in separate files while presenting a unified /admin/* API.

Pydantic v2 Config: The model_config = {"from_attributes": True} setting allows direct instantiation from SQLAlchemy models:

user = AdminUserSummary.model_validate(db_user)  # Works!

Pydantic reads db_user.email, db_user.tier, etc. automatically.

SQLAlchemy 2.0 Patterns: The dashboard uses modern query style:

db.execute(select(func.count()).where(Delivery.status == "pending")).scalar()

Key differences from 1.x: select() is a function not method, execute() returns a Result, scalar() extracts single value. This is type-safe and explicit.

Backend Complete! The admin API now has 8 routers providing comprehensive admin functionality:

  • Dashboard/Stats: System health monitoring
  • Users/Channels/Requests: Entity management
  • Deliveries/Episodes: Content pipeline management
  • Jobs: Manual job triggers for debugging

All endpoints use the AdminUser dependency for consistent authorization.

These pages follow a consistent React + TanStack Query pattern:

  • useQuery for data fetching with queryKey for cache invalidation
  • refetch() after mutations to update the UI
  • Skeleton/loading states for better UX
  • Badge variants for status differentiation

React Router v7 Route Layout Pattern: The <Route element={<AppShell />}> acts as a layout route. Child routes render inside the <Outlet /> component within AppShell. This pattern keeps navigation consistent across all pages while allowing the main content to change.

TanStack Query Configuration:

  • staleTime: 30000 means data is considered fresh for 30s (reduces API calls)
  • refetchOnWindowFocus: false prevents refetching when user tabs back (better for admin dashboards)

Azure Static Web Apps SPA Config: The navigationFallback with rewrite: "/index.html" is critical for SPAs with client-side routing. Without it, direct navigation to routes like /users would return a 404. The exclude pattern prevents rewriting requests for actual static assets.

Infrastructure as Code Pattern: Per the project’s CLAUDE.md rules, all production changes must go through Bicep. The corsOrigins parameter uses a JSON string because Azure doesn’t natively support array parameters in Bicep for Container Apps ingress config. The json(corsOrigins) call parses it at deployment time.

GitHub Actions Path Filtering: The paths trigger ensures this workflow only runs when admin-dashboard/** files change, avoiding unnecessary CI runs when only backend code changes. The defaults.run.working-directory sets the working directory for all run: steps, reducing repetition.

SQL Server vs PostgreSQL syntax difference: The ORDER BY ... NULLS LAST clause is PostgreSQL/SQLite syntax. SQL Server doesn’t support it directly - you need to use a CASE expression instead: ORDER BY CASE WHEN col IS NULL THEN 1 ELSE 0 END, col DESC

Issues Fixed:

  1. Nullable created_at fields - Some database records have NULL for created_at. Changed schemas to use datetime | None instead of datetime
  2. SQL Server compatibility - Replaced nullslast() (PostgreSQL/SQLite syntax) with a CASE expression that works across all databases: ORDER BY CASE WHEN col IS NULL THEN 1 ELSE 0 END, col DESC

Query efficiency note: The current implementation runs multiple queries per channel (N+1 problem). For a production dashboard with many channels, you’d want to batch these counts using window functions or subqueries. For an admin dashboard with 34 channels, this is acceptable.

Query Optimization Applied:

BeforeAfter
1 + 5N queries (N = channels per page)2 queries total
For 20 channels: 101 queriesFor 20 channels: 2 queries

Technique used: Pre-computed subqueries with LEFT OUTER JOIN. Each subquery aggregates counts grouped by channel_id, then the main query joins all subqueries at once. COALESCE(..., 0) handles channels with no matching rows.

Trade-off: The query is more complex but executes in a single database round-trip. The database optimizer can parallelize the subqueries internally.

Query Optimization Summary

LocationBeforeAfterImpact
Admin Users List (users.py)1 + 2N queries2 queries40 → 2 for 20 users
Admin Episodes List (episodes.py)1 + 2N queries2 queries40 → 2 for 20 episodes
Admin Channels List (channels.py)1 + 5N queries2 queriesAlready fixed earlier
Dashboard Stats (dashboard.py)5 separate COUNT queries1 query with CASE5 → 1
Processing Stats (stats.py)2 COUNT queries1 query with CASE2 → 1
Delivery Service countsSELECT * + len()COUNT() aggregateMemory + latency
Channel Detail helper3 separate queries1 query with subqueries3 → 1
Transcript cache lookupSequential OR queriesSingle OR condition2 → 1
Delivery creationN existence checksBatch check with INN → 2

Key Techniques Used:

  1. Pre-computed subqueries - Calculate aggregates once, join to main query
  2. CASE expressions - Multiple counts in single table scan
  3. OR conditions - Combine multiple lookups into single query
  4. Batch IN clauses - Check multiple existence conditions at once
  5. Code deduplication - Import shared helpers instead of copy-paste