xania.org

Latest posts

Walking the dog with Claude
May 24, 2026
Written with LLM assistance. The interview format is genuine; the prose is lightly tidied from voice notes. I had lunch with a pal yesterday, and we got onto the subject of why so much technical material is either accurate-but-impenetrable or polished-but-slightly-wrong. It's a gap I think about a lot, partly because I make videos that try to land in the middle of it and don't always succeed. The
2025 in Review
Dec 31, 2025
Written by me, proof-read by an LLM. Details at end. 2025 has been quite a year for me. The big ticket things for me were having the majority of the year on a non-compete, a new job, and some videos and conference talks. It was a bumper year for my public talks, which included: I also appeared in a number of Computerphile videos: On the front, I finally solved a three-year-old problem with - ou
Thank you
Dec 25, 2025
Written by me, proof-read by an LLM. Details at end. It's the 25th! Whatever you celebrate this time of year, I wish you the very best and hope you are having a lovely day. For me, this is a family time: I'm not at all religious but was brought up to celebrate Christmas. So, today we'll be cooking a massive roast dinner and enjoying family time. This series was an idea I had around this time las
When compilers surprise you
Dec 24, 2025
Written by me, proof-read by an LLM. Details at end. Every now and then a compiler will surprise me with a really smart trick. When I first saw this optimisation I could hardly believe it. I was looking at loop optimisation, and wrote something like this simple function that sums all the numbers up to a given value: So far so decent: GCC has done some preliminary checks, then fallen into a loop t
Switching it up a bit
Dec 23, 2025
Written by me, proof-read by an LLM. Details at end. The standard wisdom is that switch statements compile to jump tables. And they do - when the compiler can't find something cleverer to do instead. Let's start with a really simple example: Here the compiler has spotted the relationship between x and the return value, and rewritten the code as: - pretty neat. No jump table, just maths! if (x < 5
Clever memory tricks
Dec 22, 2025
Written by me, proof-read by an LLM. Details at end. After exploring SIMD vectorisation over the last of , let's shift gears to look at another class of compiler cleverness: memory access patterns. String comparisons seem straightforward enough - check the length, compare the bytes, done. But watch what Clang does when comparing against compile-time constants, and you'll see some rather clever t
When SIMD Fails: Floating Point Associativity
Dec 21, 2025
Written by me, proof-read by an LLM. Details at end. Yesterday we saw SIMD work beautifully with integers. But floating point has a surprise in store. Let's try summing an array: Looking at the core loop, the compiler has pulled off a clever trick: The compiler is using a vectorised add instruction which treats the as 8 separate integers, adding them up individually to the corresponding element
SIMD City: Auto-vectorisation
Dec 20, 2025
Written by me, proof-read by an LLM. Details at end. It's time to look at one of the most sophisticated optimisations compilers can do: autovectorisation. Most "big data" style problems boil down to "do this maths to huge arrays", and the limiting factor isn't the maths itself, but the feeding of instructions to the CPU, along with the data it needs. To help with this problem, CPU designers came
Chasing your tail
Dec 19, 2025
Written by me, proof-read by an LLM. Details at end. Inlining is fantastic, as we've seen recently. There's a place it surely can't help though: recursion! If we call our own function, then surely we can't inline... Let's see what the compiler does with the classic recursive "greatest common divisor" routine - surely it can't avoid calling itself? And yet: The compiler is able to avoid the recurs
Partial inlining
Dec 18, 2025
Written by me, proof-read by an LLM. Details at end. We've learned how important inlining is to optimisation, but also that it might sometimes cause code bloat. Inlining doesn't have to be all-or-nothing! Let's look at a simple function that has a fast path and slow path; and then see how the compiler handles it. In this example we have some function that has a really trivial fast case for numb
Inlining - the ultimate optimisation
Dec 17, 2025
Written by me, proof-read by an LLM. Details at end. Sixteen days in, and I've been dancing around what many consider the fundamental compiler optimisation: inlining. Not because it's complicated - quite the opposite! - but because inlining is less interesting for what it does (copy-paste code), and more interesting for what it enables. Initially inlining was all about avoiding the expense of the
Calling all arguments
Dec 16, 2025
Written by me, proof-read by an LLM. Details at end. Today we're looking at calling conventions - which aren't purely optimisation related but are important to understand. The calling convention is part of the ABI (Application Binary Interface), and varies from architecture to architecture and even OS to OS. Today I'll concentrate on the System V ABI for x86 on Linux, as (to me) it's the most san
Aliasing
Dec 15, 2025
Written by me, proof-read by an LLM. Details at end. Yesterday we ended on a bit of a downer: aliasing stopped optimisations dead in their tracks. I know this is supposed to be the Advent of Compiler Optimisations, not the Advent of Compiler Giving Up! Knowing why your compiler can't optimise is just as important as knowing all the clever tricks it can pull off. Let's take a simple example of a c
When LICM fails us
Dec 14, 2025
Written by me, proof-read by an LLM. Details at end. Yesterday's LICM post ended with the compiler pulling invariants like size() and get_range out of our loop - clean assembly, great performance. Job done, right? Not quite. Let's see how that optimisation can disappear. Let's say you had a const char *, and wanted to write a function to return if there was an exclamation mark or not: Here we're
Loop-Invariant Code Motion
Dec 13, 2025
Written by me, proof-read by an LLM. Details at end. Look back at our simple loop example - there's an optimisation I completely glossed over. Let me show you what I mean: On every loop iteration we are calling to compare the index value, and to check if the index has reached the end of the vector. However, looking in the assembly, the compiler has pulled the size calculation out of the loop ent
Unswitching loops for fun and profit
Dec 12, 2025
Written by me, proof-read by an LLM. Details at end. Sometimes the compiler decides the best way to optimise your loop is to... write it twice. Sounds counterintuitive? Let's change our sum example from before to optionally return a sum-of-squares: At the compiler turns the ternary into: - using a multiply and add () instruction to do the multiply and add, and conditionally picking either or
Pop goes the...population count?
Dec 11, 2025
Written by me, proof-read by an LLM. Details at end. Who among us hasn't looked at a number and wondered, "How many one bits are in there?" No? Just me then? Actually, this "population count" operation can be pretty useful in some cases like data compression algorithms, , and . How might one write some simple C to return the number of one bits in an unsigned 64 bit value?cryptography, chess, erro
Unrolling loops
Dec 10, 2025
Written by me, proof-read by an LLM. Details at end. A common theme for helping the compiler optimise is to give it as much information as possible. Using the right signedness of types, targeting the right CPU model, keeping loop iterations independent, and for today's topic: telling it how many loop iterations there are going to be ahead of time. Taking the range-based sum example , but using a
Induction variables and loops
Dec 09, 2025
Written by me, proof-read by an LLM. Details at end. Loop optimisations often surprise us. What looks expensive might be fast, and what looks clever might be slow. Yesterday we saw how the compiler canonicalises loops so it (usually) doesn't matter how you write them, they'll come out the same. What happens if we do something a little more expensive inside the loop? Let's take a look at something
Going loopy
Dec 08, 2025
Written by me, proof-read by an LLM. Details at end. Which loop style is "best"? This very question led to the creation of Compiler Explorer! In 2011 I was arguing with my team about whether we could switch all our loops from ordinal or iterator-style to the "new" range-for. I wrote a small shell script to iteratively show the compiler output as I edited code in , and the seed of was born. Compi