The Cost of a Function Call

(lemire.me)

18 points | by ingve a day ago

7 comments

  • account42 a day ago

    This post is surprisingly shallow considering the author. I don't think it should be surprising to anyone even remotely familiar with low level programming that function calls have overhead but inlining doesn't always end up making things faster.

    Inlining is also not a binary yes or no question. E.g. modern compilers can create clones of functions with constants propagated into some of the arguments, which gives some of the benefits of inlining. They are also free to change the calling convention (or make one up on the spot) for internal functions instead of inlining - something I'd like to see compilers explore further.
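    A hand-written sketch of what such a constant-propagated clone looks like (GCC does this kind of specialization automatically, e.g. under -fipa-cp-clone; the function names here are made up for illustration):

    ```c
    #include <assert.h>

    /* Generic function: 'base' is an ordinary runtime argument. */
    static int scale(int x, int base) {
        return x * base;
    }

    /* A compiler that sees scale() called mostly with base == 10 can emit
       a specialized clone like this one, where the multiply is by a known
       constant, without fully inlining the call site. */
    static int scale_base10(int x) {
        return x * 10;
    }

    int main(void) {
        assert(scale(7, 10) == 70);
        assert(scale_base10(7) == scale(7, 10));  /* same result, cheaper body */
        return 0;
    }
    ```

    The call sites still pay a call, but the specialized body is smaller and faster than the generic one, which is part of the benefit of inlining without the code-size cost.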

    • randomtoast a day ago

      > This post is surprisingly shallow considering the author.

      I think a low effort post from time to time is okay.

  • bob1029 a day ago

    > For functions that can be fast or slow, the decision as to whether to inline or not depends on the input.

    This is one area where modern JIT runtimes can dominate static compilation. Dynamic profile-guided optimization adjusts things like inlining based on how the live workload actually performs. You don't need to have any profiling data ahead of time.

    There are very few cases where I would trade this sort of long tail optimization capability for something like faster startup. Cold start happens once. If the machine rarely stops, it's not very relevant to the typical operational concerns.
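    The quoted fast-or-slow case can be sketched like this (hypothetical function names; the point is that only the fast path is worth inlining, which a JIT can decide after observing which branch the live inputs actually take):

    ```c
    #include <assert.h>

    /* Hypothetical slow path: rarely taken, large body. */
    static int expensive_fallback(int x) {
        int s = 0;
        for (int i = 0; i < -x; i++)  /* stand-in for real work */
            s += i;
        return s;
    }

    /* The common case is one branch and one add. A compiler or JIT that
       knows the slow path is almost never taken can inline just the fast
       path and leave an out-of-line call for the fallback. */
    static int mostly_fast(int x) {
        if (x >= 0)
            return x + 1;              /* hot, tiny: good inlining candidate */
        return expensive_fallback(x);  /* cold: keep as a call */
    }

    int main(void) {
        assert(mostly_fast(41) == 42);
        assert(mostly_fast(-4) == 6);  /* 0 + 1 + 2 + 3 */
        return 0;
    }
    ```

    A static compiler has to guess which case dominates (or be told via PGO); a JIT simply measures it.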

    • 3836293648 18 hours ago

      It absolutely theoretically can, but afaik neither V8 nor the JVM can actually do it to a level where it outperforms the static optimisations of GCC or LLVM.

      Is this still the case or am I going on outdated info on the matter?

  • RandomBK a day ago

    Code length will itself become a problem. The instruction cache is limited in size and often quite small. Bloating instruction counts with lots of duplicated code will eventually have a negative effect on performance.

    Ultimately, there are too many factors to predetermine which approach is faster. Write clean code, and let a profiler guide optimizations when needed.
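    One concrete tool for the i-cache concern: GCC and Clang accept noinline and cold attributes that ask the compiler to keep a rarely-executed path out of line and away from hot code. A minimal sketch (the function and variable names are made up):

    ```c
    #include <assert.h>

    /* 'noinline' and 'cold' (GCC/Clang) keep this rarely-executed path as
       an out-of-line call placed away from hot code, so the loop below
       stays small in the instruction cache. */
    __attribute__((noinline, cold))
    static long handle_overflow(long acc) {
        return acc / 2;  /* stand-in for rare recovery work */
    }

    static long sum_capped(const int *v, int n, long cap) {
        long acc = 0;
        for (int i = 0; i < n; i++) {  /* hot loop: tiny body */
            acc += v[i];
            if (acc > cap)             /* cold branch */
                acc = handle_overflow(acc);
        }
        return acc;
    }

    int main(void) {
        int v[] = {1, 2, 3, 4};
        assert(sum_capped(v, 4, 100) == 10);
        return 0;
    }
    ```

    This is the inverse of forcing inlining: you are telling the compiler which code it should *not* pull into the hot instruction stream.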

    • LorenPechtel 17 hours ago

      Exactly. Memory access is a major factor in runtime, often more important than instruction counts. And in the vast, vast majority of cases it doesn't matter. I trust the compiler to make reasonable choices; something would have to be deployed at a very large scale before the programmer time spent considering such things becomes cheaper than the hardware savings from doing it. And the vast majority of code simply doesn't execute often enough to matter one way or the other.

      Save your brainpower for the right algorithms and for the inner loops the profiler identifies (I did not expect to learn that the slowest piece of code was referring to SQL fields by name!). Ignore the rest.
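      The fields-by-name anecdote is an instance of a general pattern a profiler tends to surface: a per-row string lookup that can be resolved once outside the loop. A hypothetical sketch (column names and helpers are made up, not any particular database API):

      ```c
      #include <assert.h>
      #include <string.h>

      static const char *columns[] = {"id", "name", "price"};
      enum { NCOLS = 3 };

      /* The kind of thing the profiler flags: a strcmp scan per access. */
      static int col_index(const char *name) {
          for (int i = 0; i < NCOLS; i++)
              if (strcmp(columns[i], name) == 0)
                  return i;
          return -1;
      }

      static double total_price(double rows[][NCOLS], int nrows) {
          /* Fix: resolve the name to an index once, not once per row. */
          int price = col_index("price");
          double sum = 0.0;
          for (int i = 0; i < nrows; i++)
              sum += rows[i][price];
          return sum;
      }

      int main(void) {
          double rows[2][NCOLS] = {{1, 0, 2.5}, {2, 0, 4.5}};
          assert(total_price(rows, 2) == 7.0);
          return 0;
      }
      ```

      No cleverness about call overhead required; the win comes from hoisting one lookup out of the hot loop, which is exactly the kind of change a profiler points you at.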

  • lelag a day ago

    This is all fairly obvious, no?

    At first, write clean code with functions and don’t obsess over call overhead. Once it works, profile, then optimize where it actually matters.

    Premature optimization, etc.