Error ABI

(matklad.github.io)

87 points | by todsacerdoti 4 days ago

44 comments

  • Panzerschrek 3 days ago

    After reading this article I have yet another idea for how to handle errors. A caller should provide two return addresses for a callee - one for normal return and another one for error return. Doing so isn't particularly costly - it requires pushing an error-handler block address before the call instruction and popping it after. It should be easier compared to unwinding, since no side-tables are necessary and thus going the error path shouldn't be substantially costly compared to the happy path.

    • zozbot234 3 days ago

      The nice thing about this convention is that the error return address can be modeled in earlier versions of the API/ABI simply as an extra parameter - technically a continuation that the function tailcalls into; this is extensible to any return of enum-like data, not just error values.
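
      A rough way to picture the two-return-address idea in source form is continuation passing: the callee receives one closure per outcome and finishes by calling exactly one of them. This is only an illustrative sketch; `parse_u32_cps` and `describe` are made-up names, and Rust does not guarantee that these calls are compiled as tail calls.

```rust
use std::num::ParseIntError;

/// Parses `input` and "returns" through one of two continuations:
/// `on_ok` plays the role of the normal return address,
/// `on_err` the role of the error return address.
fn parse_u32_cps<R>(
    input: &str,
    on_ok: impl FnOnce(u32) -> R,
    on_err: impl FnOnce(ParseIntError) -> R,
) -> R {
    match input.parse::<u32>() {
        Ok(value) => on_ok(value),
        Err(err) => on_err(err),
    }
}

/// Hypothetical caller: each outcome lands directly in its own handler.
fn describe(input: &str) -> String {
    parse_u32_cps(
        input,
        |value| format!("ok: {value}"),
        |err| format!("error: {err}"),
    )
}

fn main() {
    println!("{}", describe("42"));
    println!("{}", describe("forty-two"));
}
```

      The caller never inspects a tag after the call: `describe("42")` ends up in the first closure and `describe("forty-two")` in the second.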

  • fleventynine 3 days ago

    Rigid ABIs aren't necessary for statically linked programs. Ideally, the compiler would look at the usage of the function in context and figure out an ABI specifically for that function that minimizes unnecessary copies and register churn.

    IMHO this is the next logical step in LTO; today we leave a lot of code size and performance on the floor in order to meet some arbitrary ABI.

    • ozgrakkurt 3 days ago

      Can’t offload everything into the compiler. It is already too slow.

      Soon people will demand it just figures out what you are implementing and rewrites your whole codebase

      • fleventynine 2 days ago

        > Can’t offload everything into the compiler. It is already too slow.

        Speak for yourself. On embedded platforms I'd happily make my compiles twice as slow for 10% code size improvements.

      • acedTrex 3 days ago

        > Soon people will demand it just figures out what you are implementing and rewrites your whole codebase

        We have this now, it is indeed very slow lol. Gemini is pretty fast however.

      • thfuran 3 days ago

        Isn't that what people are using Claude for now?

    • layer8 3 days ago

      I would argue that most of today’s performance problems in software are unrelated to ABI.

      • stmw 3 days ago

        I would argue that is largely true because we got the ABIs and the hardware to support them to be highly optimized. Things slow down very quickly if one gets off that hard-won autobahn of ABI efficiency.

        • pornel 2 days ago

          Compilers inline everything they can.

          Partly it's due to lack of better ideas for effective inter-procedural analysis and specialization, but it could also be a symptom of working around the cost of ABIs.

          • layer8 2 days ago

            The point of interfaces is to decouple caller implementation details from callee implementation details, which almost by definition prevents optimization opportunities that rely on the respective details. There is no free lunch, so to speak. Whole-program optimization affords more optimizations, but also reduces tractability of the generated code and its relation to the source code, including the modularization present in the source code.

            In the current software landscape, I don’t see these additional optimizations as a priority.

      • fleventynine 2 days ago

        When looking at the rv32imc emitted by the Rust compiler, it's clear that there would be a lot less code if the compiler could choose different registers than those defined in the ABI for the arguments of leaf functions.

        Not to mention issues like the op mentions making it impossible to properly take advantage of RVO with stuff like Result<T> and the default ABI.

  • eqvinox 3 days ago

    > Instead, when returning an error, rather than jumping to the return address, we look it up in the side table to find a corresponding error recovery address, and jump to that. Stack unwinding!

    Since this is a custom ABI, how about just adding something large enough (cacheline sized, e.g. 64) to the return address? Saves the table shenanigans. Just tell the codegen to force emit the error path at that address... (I wonder if LLVM can actually do that...)

    • stmw 3 days ago

      This is a good direction - many ABIs provide for this kind of thing at the top of the stack for interrupts (meaning x bytes below the downward-growing stack are reserved by the OS).

  • carterschonwald 3 days ago

    My monad-ste library on Hackage does a pretty clean no dirty state core dump in code space. I wrote it originally for the ability to have mutable state optimizations in an industrial interpreter where I didn’t want the overhead of every monadic bind doing a tag check. It winds up codifying a fun pattern for avoiding dirty state, though it could probably be extended to regions or something. But it's unclear if the UX would stay nice.

  • malkia 3 days ago

    I loved Java checked exceptions when learning the language (coming from C++), then understood the biggest pitfall - if you need to extend with a new exception, or remove an existing one, you have to change all the code using it (maybe there are better approaches to this, but that's what I remember). I'm back to C++ but it felt good having exceptions done like this.

    • bigstrat2003 3 days ago

      That is not a pitfall, that is the whole point of checked exceptions. By making the errors part of the type system, you get the ability for the compiler to warn you "hey man, that method started to raise a new error so you have to make sure to handle it". That is a great benefit, not a pitfall!

      Put another way - it would be a lot easier for the programmer to write code if there were no types checked by the compiler at all, but we recognize that the safety net they give us is worth the additional effort at times when refactoring. So why would the benefits of static type checking be worth it, but not the benefits of static error type checking? It seems to me that either both are good ideas, or neither is.

      • malkia 3 days ago

        The pitfall was that if I had to extend it with new exceptions, I had to go and fix all the code that was handling it to handle the new one too.

        Works great in a monorepo, but I'm not sure it does when the code is spread out.

        • kelnos 3 days ago

          Right, but I think the GP is saying that, with checked exceptions, the 'throws' list is part of that function's API. Changing the 'throws' list is exactly like changing the arguments to or return type of the function.

          It's not so much a "pitfall" as it is an intended part of the deal.

          It just turns out that many people hated it, so most of the time functions omit the 'throws' list and throw unchecked subclasses of RuntimeException. Which also has its trade-offs! (Or "pitfalls", if you want to use the same term.)

          • malkia 2 days ago

            Rephrasing it this way makes me rethink what I knew. Thanks!

      • nofriend 3 days ago

        Types help the programmer. When the compiler gives me a type error, it is telling me about something I messed up and that would otherwise be an error at runtime. Sometimes the type system is wrong and I need an escape hatch. A type system that is wrong too often isn't very useful. For the majority of exceptions, there is no useful thing that can be done at the call site to deal with them beyond bubbling them up the stack. A type system that constantly makes me explicitly deal with such errors is making my job harder, not easier.

        • vips7L 3 days ago

          There are plenty of errors/exceptions that don't need to be bubbled up the call stack. I don't think that's the main issue with them. Like you say, the issue with checked exceptions is that there is no escape hatch to shut the compiler up and panic if you can't handle an error or it's not possible. They desperately need a shorthand like Rust's ?, Swift's try?, or Kotlin's !!.

              A a;
              try {
                  a = someFnToGetA();
              } catch (AException aex) {
                  // not possible in this situation
                  throw new RuntimeException(aex);
              }
          
          In a modern language that has checked errors that just becomes something like:

              val a = try! someFnToGetA();
              val a = someFnToGetA()!!
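
          For reference, this is roughly how the two styles look in Rust today: `?` forwards the error to the caller, while `expect` is the "cannot happen here" escape hatch that panics instead of propagating. The function names below are hypothetical and only serve the illustration.

```rust
use std::num::ParseIntError;

fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// `?` returns early with the error, so the caller decides what to do.
fn next_port(text: &str) -> Result<u16, ParseIntError> {
    let port = parse_port(text)?;
    Ok(port.saturating_add(1))
}

/// `expect` panics instead of propagating: for "not possible here" cases.
fn default_port() -> u16 {
    parse_port("8080").expect("hard-coded port literal must parse")
}

fn main() {
    println!("{:?}", next_port("80"));
    println!("{:?}", next_port("eighty"));
    println!("{}", default_port());
}
```
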

          • kbolino 2 days ago

            Yeah, the problems seem to largely be ergonomic.

            As another example, the exception type hierarchy doesn't pull enough weight. Exception is the base class of all checked exceptions and RuntimeException is the base class of all "ordinary" unchecked exceptions, but it confusingly subclasses Exception. So there's no way to catch only "all checked exceptions". Then, Error is distinct from that hierarchy, but some things that smell like errors were made into exceptions instead (e.g. NullPointerException).

            This was compounded by the fact that, in the original design, you could only call out one exception type in a catch statement. So if you had 3 different disjoint exception types that you simply wanted to wrap and rethrow, you had to write 3 different catch blocks for them. Java 7 added the ability to catch multiple exceptions in the same block, but it was too little, too late (as far as redeeming checked exceptions goes).

            • vips7L 2 days ago

              > So if you had 3 different disjoint exception types that you simply wanted to wrap and rethrow, you had to write 3 different catch blocks for them.

              Agreed. There's a proposal for exception catching in switch [0] which I'm hopeful will alleviate a lot of this. I think that with that JEP, plus combining exceptions with sealed types, error handling will be convenient and easy.

                  sealed abstract class SomeException extends Exception permits AException, BException {};
                  
                  A someFn() throws SomeException;
                 
                  // hypothetically handling in switch would let you enumerate the subtypes of the exception
                  var a = switch (someFn()) {
                      case A a -> a;
                      case throws AException aex -> new A();
                      case throws BException bex -> throw new RuntimeException(bex);
                  };
              
              > As another example, the exception type hierarchy doesn't pull enough weight.

              Kotlin has an interesting proposal for their language that creates their own "error" type that will allow type unions [1]. The only thing I worry about is that it pushes Kotlin further away from Java, making interop a lot harder.

              [0] https://openjdk.org/jeps/8323658

              [1] https://github.com/Kotlin/KEEP/blob/main/proposals/KEEP-0441...
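
              For comparison, the shape the hypothetical switch above is reaching for is what a closed error enum plus an exhaustive `match` gives in Rust. This is a sketch with made-up names (`SomeError`, `some_fn`, `handle`), not a rendering of either proposal.

```rust
/// Closed set of failures, playing the role of the sealed exception type.
#[derive(Debug)]
enum SomeError {
    A,
    B(String),
}

fn some_fn(input: i32) -> Result<i32, SomeError> {
    match input {
        n if n < 0 => Err(SomeError::A),
        0 => Err(SomeError::B("zero is not allowed".to_string())),
        n => Ok(n),
    }
}

/// The compiler rejects this match if a variant of `SomeError` is left out.
fn handle(input: i32) -> i32 {
    match some_fn(input) {
        Ok(value) => value,
        Err(SomeError::A) => 0, // recover with a default
        Err(SomeError::B(msg)) => panic!("unrecoverable: {msg}"),
    }
}

fn main() {
    println!("{}", handle(5));
    println!("{}", handle(-1));
}
```
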

  • aw1621107 3 days ago

    iex [0] might be potentially relevant here. From what I understand it basically implements this bit:

    > Finally, another option is to say that -> Result<T, E> behaves exactly as -> T ABI-wise, no error affordances whatsoever. Instead, when returning an error, rather than jumping to the return address, we look it up in the side table to find a corresponding error recovery address, and jump to that. Stack unwinding!

    And at least based on the listed benchmarks it can indeed result in better performance than "regular" Result<T, E>.

    (Might be nice to mention this on the corresponding lobste.rs thread as well to see if anyone has anything interesting to add, if anyone has access)

    [0]: https://github.com/iex-rs/iex

  • quotemstr 3 days ago

    > That is the reason why mature error handling libraries hide the error behind a thin pointer, approached pioneered in Rust by failure and deployed across the ecosystem in anyhow. But this requires global allocator, which is also not entirely zero cost.

    No, it doesn't require a global allocator. You make the thinly-pointed-to error object have a vtable and, in this vtable, you provide a "deallocate" or "drop" function. No need for a single global allocator.

    > (this requires the errors to be register-sized).

    Uh, what? No, you can make it a pointer, and honestly, in every real-world ABI, there are tons of callee-clobbered registers we COULD use for errors or other side information and just... don't.
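
    A sketch of the vtable approach described above, under several assumptions: every name here (`ThinError`, `VTable`, `Header`, `Object`) is hypothetical, the payload is fixed to `&'static str` to keep it short, and this is not how anyhow is implemented. It shows that the handle stays one pointer wide while each object's `release` entry decides how, or whether, memory is freed, so a statically allocated error never touches an allocator.

```rust
use std::ptr::NonNull;

/// Per-object function table. `release` knows how the object was allocated.
struct VTable {
    message: unsafe fn(NonNull<Header>) -> String,
    release: unsafe fn(NonNull<Header>),
}

/// Every error object starts with a pointer to its vtable.
#[repr(C)]
struct Header {
    vtable: &'static VTable,
}

/// Error object: header first, payload after it.
#[repr(C)]
struct Object<T> {
    header: Header,
    payload: T,
}

/// One machine word: a pointer to the header of some error object.
struct ThinError(NonNull<Header>);

unsafe fn message_str(p: NonNull<Header>) -> String {
    // SAFETY: only installed in vtables of `Object<&'static str>`.
    let object = unsafe { p.cast::<Object<&'static str>>().as_ref() };
    object.payload.to_string()
}

unsafe fn release_boxed(p: NonNull<Header>) {
    // SAFETY: only installed for objects created by `ThinError::boxed`.
    drop(unsafe { Box::from_raw(p.cast::<Object<&'static str>>().as_ptr()) });
}

unsafe fn release_noop(_p: NonNull<Header>) {}

static BOXED_VTABLE: VTable = VTable { message: message_str, release: release_boxed };
static STATIC_VTABLE: VTable = VTable { message: message_str, release: release_noop };

/// Preallocated error: reporting it needs no allocator at all.
static OUT_OF_MEMORY: Object<&'static str> = Object {
    header: Header { vtable: &STATIC_VTABLE },
    payload: "out of memory",
};

impl ThinError {
    fn boxed(msg: &'static str) -> Self {
        let object = Box::new(Object { header: Header { vtable: &BOXED_VTABLE }, payload: msg });
        ThinError(NonNull::from(Box::leak(object)).cast())
    }

    fn out_of_memory() -> Self {
        ThinError(NonNull::from(&OUT_OF_MEMORY).cast())
    }

    fn message(&self) -> String {
        // SAFETY: the pointer always refers to a live object with a valid header.
        unsafe { (self.0.as_ref().vtable.message)(self.0) }
    }
}

impl Drop for ThinError {
    fn drop(&mut self) {
        // SAFETY: each vtable's `release` matches how its object was created.
        unsafe { (self.0.as_ref().vtable.release)(self.0) }
    }
}

fn main() {
    println!("{}", ThinError::boxed("disk full").message());
    println!("{}", ThinError::out_of_memory().message());
    println!("{} bytes", std::mem::size_of::<ThinError>());
}
```

    The boxed variant still uses the global allocator here; swapping in an arena or pool would only mean a different constructor and a different `release` entry.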

    • stmw 3 days ago

      Good upvote-worthy points, I'd just say that this is exactly the issue - eventually you end up with a Java++ implementation of exceptions.

      As you say, the more limited implementation can be done more efficiently than the post claims, but it has to be restricted and managed in a way that existing runtimes/compilers/languages haven't succeeded at so far... "just say no" earlier.

  • stmw 3 days ago

    I'm simultaneously amused and concerned by the recurring proposals for Rust to add

    1. exceptions
    2. garbage collection

    Sometimes in slightly modified forms or names, and often with very well-articulated, technically competent justifications (as is the case here).

    Just say no!

    • kbolino 3 days ago

      But the argument is not about adding exceptions to the language.

      It's about borrowing a technique from languages that do have exceptions (and Rust's own panic unwinding) to implement Rust's existing error handling without making any changes to the language.

      The post is light on how this would actually work, though. I think that re-using the name "stack unwinding" is a little misleading. The stack would only unwind if the error was bubbled up, which would only happen if the programmer coded it to. Indeed, the error could just get dropped anywhere up the stack, stopping it from unwinding further.

      I think this would be tricky to implement, since the compiler would have to figure out which code actually is on the error path, and any given function could have multiple error paths (as many as one per each incoming error return, if they're all handled differently). It'd also make Result<T,E> even more special than it already is.

      All that having been said, if you squint a bit, the end result does vaguely resemble Java's checked exceptions.

      • stmw 3 days ago

        And that was really my (unpopular) opinion above - that exceptions and garbage collectors are two examples of runtime/compiler/language features that after some iteration around the corner cases tend to end up in roughly the same place Java did 20+ years ago - as you say, "if you squint a bit".

        It is interesting to think why it is so for exceptions specifically. The discussion here offers some of the possible reasons, I think, including violent disagreement of what we all even mean by exceptions vs stack unwinding vs error propagation.

    • necubi 3 days ago

      This doesn't have anything to do with exceptions, and the context appears to be Zig, not Rust.

      The article is about how we represent errors not their control flow (i.e., exceptions).

      • stmw 3 days ago

        Fair point re: Zig vs Rust, but my larger point was about exceptions by any other name, in the linked article:

        "Instead, when returning an error, rather than jumping to the return address, we look it up in the side table to find a corresponding error recovery address, and jump to that. Stack unwinding!

        The bold claim is that unwinding is the optimal thing to do! .."

        • necubi 3 days ago

          If by "exceptions" you're talking about stack unwinding (as opposed to the language-level control flow constructs like throw/catch) then Rust has always had that with panics and panic=unwind.

          • stmw 3 days ago

            I am talking about the wider category that includes stack unwinding as an error handling pattern for errors other than catastrophic ones.

            My opinion (which one need not approve of) is that it asymptotically approaches the language-level control flow constructs.

    • stmw 3 days ago

      For clarification, this is not just a Rust, or a Zig thing. [removed incorrect info re: Linux]

      • comex 3 days ago

        To be clear, that has nothing to do with exceptions, garbage collection, or error ABIs. But I guess you're saying it's a case of tacking on language features to languages that don't have them, since the main desired MS extension is one that sort of resembles class inheritance.

    • MaulingMonkey 3 days ago

      Rust already has exceptions (when panic=unwind)

      • stmw 3 days ago

        But it's not really encouraged as it is in C++ or Java. Quoting Rustonomicon: "There is an API called catch_unwind that enables catching a panic without spawning a thread. Still, we would encourage you to only do this sparingly. In particular, Rust's current unwinding implementation is heavily optimized for the "doesn't unwind" case. If a program doesn't unwind, there should be no runtime cost for the program being ready to unwind. As a consequence, actually unwinding will be more expensive than in e.g. Java. Don't build your programs to unwind under normal circumstances. Ideally, you should only panic for programming errors or extreme problems."
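
        For readers who have not used it, a minimal sketch of the `catch_unwind` API the quote refers to; `checked_div` and `try_div` are made-up names. It only catches anything when the program is built with the default `panic=unwind`; with `panic=abort` the process terminates instead.

```rust
use std::panic;

/// Panics on a zero divisor, standing in for a "programming error".
fn checked_div(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// Turns a panic inside `checked_div` into `None` at this boundary.
fn try_div(a: i32, b: i32) -> Option<i32> {
    panic::catch_unwind(move || checked_div(a, b)).ok()
}

fn main() {
    println!("{:?}", try_div(6, 3));
    // The default panic hook still prints the panic message to stderr here.
    println!("{:?}", try_div(1, 0));
}
```
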

  • deathanatos 3 days ago

    > Naively composing errors out of ADTs does pessimize the happy path. Error objects recursively composed out of enums tend to be big, which inflates size_of<Result<T, E>>, which pushes functions throughout the call stack to “return large structs through memory” ABI. Error virality is key here — just a single large error on however rare code path leads to worse code everywhere.

    This isn't true; proof by counter-example: anyhow::Error.

    For example, a lot of Rust code uses "anyhow", a crate which provides sort of a catch-all "anyhow::Error" type. Any other error can be put into an anyhow::Error, and an anyhow::Error is not good for much except displaying, and adding additional context to it. (For that reason, anyhow::Error is usually used at a high-level, where you don't care what specifically went wrong, b/c the only thing you'll use it for is propagation & display.)

    No matter what error we put into an anyhow::Error, the stack size is 8 B. (Because it's a pointer to the error E, effectively, though in practice "it's a bit more complicated", but not in any way that harms the argument here.) So clearly, here, we can stuff as much context/data/etc. into the error type E without virally infecting the whole stack with a larger Result<T, E>.

    (Rust does allow you to make E larger, and that can mean a Result<T, E> gets larger, yes. But you're one pointer away from moving that to the heap & fixing that. Rust, being a low level language, … permits you that / leaves that up to you. The stack space isn't free — as TFA points out, spilling registers has a cost — but nor are heap allocations free. Rust leaves it up to you, effectively.)
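
    The size effect is easy to measure. In this sketch `BigError` is a made-up stand-in for a large, recursively composed error type, and `Box` plays the role of the pointer indirection that anyhow provides.

```rust
use std::mem::size_of;

/// Hypothetical large error carrying a lot of inline context.
#[allow(dead_code)]
struct BigError {
    context: [u8; 256],
    code: u32,
}

/// Size of a Result that stores the error inline.
fn inline_size() -> usize {
    size_of::<Result<u32, BigError>>()
}

/// Size of a Result that stores only a pointer to the error.
fn boxed_size() -> usize {
    size_of::<Result<u32, Box<BigError>>>()
}

fn main() {
    println!("Result<u32, BigError>:      {} bytes", inline_size());
    println!("Result<u32, Box<BigError>>: {} bytes", boxed_size());
    println!("Box<BigError>:              {} bytes", size_of::<Box<BigError>>());
}
```

    The exact numbers depend on the target and compiler, which is why the program prints them; what holds everywhere is that the inline form is at least as large as `BigError` itself, while the boxed form only has to hold a pointer or a `u32`.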

    My understanding of Zig the other day is that it doesn't permit associated data at all, and errors are just integer error codes, effectively, under the hood. This is a pretty sad state of affairs - I hate the classic unix problem where you get something like,

      $ mkdir $P
      mkdir: no such file or directory
    
    Which I now special-case in the neurons in my head so as to short-circuit wandering the desert of "yeah, no such directory … that's why I'm asking you to create it". (And all other variations of this pattern.)

    All of that could have been avoided if Unix had the ability to tell us what didn't exist. (And there are so many variants: what exists unexpectedly? what perm did we lack? what device failed I/O?)
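
    As a small illustration of an error that carries associated data, here is a sketch that wraps `std::fs::create_dir` and attaches the path to the failure. `MkdirError` and `mkdir_with_context` are made-up names, and the result still only names the path that was passed in; it does not say which component was missing.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// The bare OS error plus the path the operation was attempted on.
#[derive(Debug)]
struct MkdirError {
    path: PathBuf,
    source: io::Error,
}

impl fmt::Display for MkdirError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "mkdir {}: {}", self.path.display(), self.source)
    }
}

impl std::error::Error for MkdirError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.source)
    }
}

/// Creates one directory, attaching the path to any failure.
fn mkdir_with_context(path: &Path) -> Result<(), MkdirError> {
    fs::create_dir(path).map_err(|source| MkdirError {
        path: path.to_path_buf(),
        source,
    })
}

fn main() {
    // Assumes this relative parent directory does not exist.
    let target = Path::new("no-such-parent-7f3a/child");
    if let Err(err) = mkdir_with_context(target) {
        eprintln!("{err}");
    }
}
```
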

    (And I suppose you could make Result<T, E> special / known to the compiler, and it could implement stack unwinding specifically. I don't think that leaves me with good vibes in the language design dept., and there are other types that have similar stack-propagating behavior to Result (Option, Poll, maybe someday a generator type). What about them?)

    • Philpax 3 days ago

      The article mentions anyhow:

      > That is the reason why mature error handling libraries hide the error behind a thin pointer, approached pioneered in Rust by failure and deployed across the ecosystem in anyhow. But this requires global allocator, which is also not entirely zero cost.

      • deathanatos 3 days ago

        Oh oops … IDK how I missed that … but also that seems to really undercut the article's own thesis then if they're aware of it.

        > But this requires global allocator, which is also not entirely zero cost.

        Heap allocs are not free. But then, IDK that the approach of using the unwinding infra is any better. You still have to store the associated data somewhere, & then unwind the stack. That "somewhere" might require a global allocator¹.

        (¹Say you put the associated data on the stack, and unwind, and your "recovery point"/catch/etc. site might get a pointer to it. But what if that recovery point then calls a function, and that function requires more stack depth than exists prior to the object?

        I suppose you could put it somewhere, and then move it up the stack into the stack frame of the recovery function, but that's more expensive. That might work, though.

        But since C++ impls put it on the heap, that leads me to assume there's a gotcha somewhere here.)

    • kbolino 3 days ago

      There is a middle ground that I think the post glosses over, which would be to split apart the Result<T,E> value whenever its two cases differ significantly in size. You'd also have to track the discriminant of course.

      Basically, supposing T alone fits in a register or two, but E is so big that the union of T and E would spill onto the stack, treat them as two different values instead of one.
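
      Written out by hand, the split might look like the sketch below: the small success value and the discriminant travel together as an `Option<u32>`, and the large error is written through an out-parameter only on failure. `BigError` and `parse_split` are made-up names, and this merely models at the source level what the compiler would have to do at the ABI level.

```rust
/// Hypothetical large error type.
#[allow(dead_code)]
#[derive(Debug)]
struct BigError {
    context: [u8; 256],
    code: u32,
}

/// The success value comes back by value (small, register-friendly);
/// the large error is written through `err_out` only on failure.
fn parse_split(input: &str, err_out: &mut Option<BigError>) -> Option<u32> {
    match input.parse::<u32>() {
        Ok(value) => Some(value),
        Err(_) => {
            *err_out = Some(BigError { context: [0; 256], code: 1 });
            None
        }
    }
}

fn main() {
    let mut err_slot = None;
    match parse_split("12x", &mut err_slot) {
        Some(value) => println!("ok: {value}"),
        None => println!("error code: {}", err_slot.map(|e| e.code).unwrap_or(0)),
    }
}
```
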