Links:
* my little whisper.cpp bug fix
* why the logits were calculated inconsistently
* wav2vec2 on arxiv and huggingface
* the openai whisper asr model announcement
* beam search patience
* top notch guide to CTC
* the bitter lesson (and in meme form)

Errata:
* I referred in the show to LLMs as encoder-decoder models. Most modern LLMs are decoder-only.
* I messed up readability at Google. It means approvability, apparently. 🤷
--------
35:16
Iota: Random algorithms
In which I ramble about randomness and random algorithms. Now with theme music!

Paper Cuts planned reading: Habitability and Piecemeal Growth, in Patterns of Software (just pages 7–16 of the book, which is pages 25–32 of the PDF)

Selected links:
* SIEVE cache replacement algorithm
* Power of two random choices
* Marc Brooker's blog
* Random forests
* Count Min Sketch
* Monte Carlo Simulation
* Random projection
* T-Digest
* The fix for my embarrassing compiler bug
--------
21:36
Write It Down with Shay Nehmad
Shay Nehmad on how writing is the key to becoming a better engineer, how to do it, and more.

Links:
* Cup O' Go podcast
* Code Complete book
* Shay's blog
* Obsidian and Logseq
--------
41:40
Litestream and LiteFS with Ben Johnson
This was a fun and decidedly humbling conversation with Ben Johnson about SQLite, databases, Litestream, and LiteFS.

Links:
* Ben on GitHub
* Litestream
* LiteFS
--------
36:25
Iota: Rolling Hashes and FastCDC
No guest for this inaugural episode--just me this round.

I cover the basics of rolling hashes and FastCDC, which appears to be the state of the art in content defined chunking.

Mentioned in the episode:
* 8 bit hash bug
* FastCDC paper
* Perkeep (once called Camlistore, oops)
* bup
* rsync

I know the audio is slightly subpar (to say nothing of the content). But I think I know how to improve for next time. Finding my feet. :)

Feedback welcome: josh@sigpod.dev