Dev Log: January 9, 2026
CS244 GPU Scheduling Traces
Continued digging into the GPU cluster trace datasets for the CS244 project. The key question was understanding what kinds of GPU allocation patterns each trace captures, since that directly affects which scheduling problems you can study with them. It turns out the Alibaba and Philly traces have fundamentally different allocation models, which shapes the fragmentation dynamics you can reproduce.
Key difference between the traces:
- Alibaba traces: Include fractional GPU requests (e.g., “50% of a GPU”) - this sharing model is what produces the fragmentation that FGD (Fragmentation Gradient Descent) addresses
- Philly traces: Record only whole-GPU allocations (jobs get gpu0, gpu1, etc.) - no fractional sharing, so GPU-sharing fragmentation can’t be studied from them
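To make the contrast concrete, here is a minimal sketch (with made-up numbers, not values from either trace) of why fractional requests fragment a node: leftover GPU fractions can add up to plenty of free capacity while no single GPU can host a whole-GPU job.

```python
# Hypothetical first-fit placement on one node. Each entry in
# `free_fractions` is the unallocated fraction of one GPU.
def can_place(free_fractions, request):
    """Try to place `request` (a fraction of a single GPU) on one GPU."""
    for i, free in enumerate(free_fractions):
        if free >= request:
            free_fractions[i] = round(free - request, 2)
            return True
    return False

# An 8-GPU node where every GPU already hosts a 0.5-GPU task
# (the kind of allocation the Alibaba traces contain).
node = [0.5] * 8
print(sum(node))             # 4.0 GPUs free in aggregate...
print(can_place(node, 1.0))  # ...but a whole-GPU job fits nowhere: False
```

With only whole-GPU allocations (the Philly model), this situation cannot arise: every GPU is either fully free or fully busy.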
Internet Censorship Research
Spent time reading through a decade’s worth of papers on the Great Firewall’s active probing infrastructure. The research arc from the early 2010s through 2024 tells a compelling story of how measurement researchers and censors have co-evolved. The censor’s systems have gotten faster and more sophisticated, but so have the techniques for fingerprinting and exploiting them.
These papers reveal a fascinating cat-and-mouse dynamic:
- Memory safety matters for censors too - the GFW’s C code has the same buffer over-read bugs as any other software, and researchers can now exploit them to study the censor from the outside
- Centralization leaves fingerprints - Despite using thousands of source IPs, shared TCP timestamps and ISN patterns expose that active probing comes from just a few processes
- The 10-year gap between papers shows evolution - From 15-minute probing queues (2011) to real-time probing (2015), and from basic DNS injection to Wallbleed disclosure (2021-2024)
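The shared-clock fingerprinting idea in the second bullet can be sketched roughly like this. The data and field names below are synthetic and illustrative, not taken from the papers: if probes from many source IPs all originate from one host, their TCP timestamps advance on a single clock, so (capture time, timestamp) points from every IP fall on one line.

```python
# Hypothetical check: do probes from "different" source IPs share one
# TCP timestamp clock? Assumes a 1000 Hz timestamp tick rate.
def shares_one_clock(observations, hz=1000, tolerance=5):
    """observations: list of (src_ip, capture_time_s, tcp_ts).
    All points fit one line tcp_ts ~= base + hz * capture_time
    iff the inferred base value is (nearly) constant."""
    base_estimates = [ts - hz * t for _, t, ts in observations]
    return max(base_estimates) - min(base_estimates) <= tolerance

# Synthetic probes: three distinct source IPs, one shared clock.
probes = [
    ("1.2.3.4", 0.0, 500_000),
    ("5.6.7.8", 1.5, 501_500),
    ("9.9.9.9", 3.2, 503_201),  # one tick of jitter, within tolerance
]
print(shares_one_clock(probes))  # True under these synthetic numbers
```

Two genuinely independent machines would have unrelated timestamp bases, so the same check would fail across their probes.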