25 comments

  • nickcw 2 days ago

    I love it :-)

    Back in the distant past I wrote some really big ARM 32 assembly projects. 64 bit ARM is really very similar!

    I had a look through the code. Some ENTRY/EXIT macros to help with the drudgery of saving and restoring registers and the stack frame would probably help. Some register renaming would also help readability (e.g. if a register points to incoming data throughout a subroutine, rename it pdata).

    I salute your effort and please enjoy the core dumps :-)

    • imtomt 2 days ago

      Thank you! Definitely, some macros would probably be super helpful. Register renaming, too, I'm sure. When I started this project I didn't even know about register renaming lol, and at this point it's so big I'd have to dig pretty deep to find 'em all. Definitely worth doing, though, I'm sure.

    • gjvc 2 days ago

      RISC OS had some amazing apps written in ARM26/32 -- my teenage mind was boggled

  • benj111 2 days ago

    Cool. I particularly like the O'Reilly book cover that never was. Although I fear you may have misunderstood what wasm is...

    Question/critique. Isn't getting the mime type by file extension a bit windowsy? Would it not be easier to read the magic number when you're at the assembly level?

    • jprjr_ 2 days ago

      You could argue, well, if you have to open the file to read it anyway, you may as well look for magic numbers.

      That doesn't work well with text documents, which won't have any kind of magic number. So now you're doing some heuristics to determine whether this is text/plain, text/html, or image/svg+xml. You're pretty much just guessing at that point.

      A good number of file formats out there are just Zip files with a particular structure. JAR files, docx - so relying on magic numbers doesn't really work for those, either.

      Also to service a HEAD request you'd have to open the file and read a few bytes that you just discard.

      If you just do it by extensions you don't need to read files at all or perform heuristics, and no ambiguity for what mimetypes to use for text documents, zip-based formats, etc.
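      A minimal sketch of the extension-based lookup described above, written in Python for readability rather than assembly. The table contents and the names `MIME_BY_EXTENSION`, `mime_from_extension`, and `looks_like_zip` are illustrative assumptions, not taken from ymawky, nginx, or Apache. The second helper shows why the magic number alone cannot tell a JAR from a docx: both start with the same ZIP signature.

```python
import os

# Hypothetical extension table; a real server would carry a much longer list.
MIME_BY_EXTENSION = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".svg": "image/svg+xml",
    ".zip": "application/zip",
    ".jar": "application/java-archive",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}
DEFAULT_MIME = "application/octet-stream"

# Signature of a ZIP local file header; JAR and docx files share it.
ZIP_MAGIC = b"PK\x03\x04"


def mime_from_extension(path):
    """Pick a MIME type from the file name alone; the file is never opened."""
    _, ext = os.path.splitext(path)
    return MIME_BY_EXTENSION.get(ext.lower(), DEFAULT_MIME)


def looks_like_zip(first_bytes):
    """Magic-number check: true for plain ZIPs, JARs, and docx files alike."""
    return first_bytes.startswith(ZIP_MAGIC)
```

      Because `mime_from_extension` only inspects the name, the same call can answer a HEAD request without touching the file's contents.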

    • imtomt 2 days ago

      You could do that, but it's not really necessary, and adds extra overhead and complexity. And as the other commenter pointed out, it wouldn't work for text file types without magic numbers. I considered reading the magic number at first, but after doing more research, I've found most web servers (nginx and apache, anyway) just match based on file extension. I figure if it's good enough for them, it's good enough for ymawky.

  • mrbluecoat 2 days ago

    > written entirely by-hand in ARM64 assembly as a fun project. It's probably got a lot of vulnerabilities I'm unaware of

    Impressive, but that second part worries me. I hope one day AI security scans upon commit (or integrated in the IDE) will alleviate that risk.

    What's the current security gold standard for web servers? Hiawatha? https://hiawatha.leisink.net/

    • imtomt 2 days ago

      Well, if security is a major concern, definitely don't use ymawky in production! That said, I did try my best to harden it. I've fuzzed the parser extensively with afl-fuzz, and got several hours without a single hang or crash. There's no major vulns I'm aware of, but in a ~4500 SLOC assembly project, there's probably gonna be some vulnerabilities that are hiding.

    • efficax 2 days ago

      Hiawatha is written in C, and so despite its security posture, it probably contains vulnerabilities.

  • kunley 2 days ago

    Ahh, this little gem ported to Linux, great! That opens up many more possibilities to play with it, thanks

  • Lucasoato 2 days ago

    Is an assembly webserver more performant than webservers written in other languages? Are there any hard limits on how much you can squeeze when using a particular framework?

    • imtomt 2 days ago

      The language matters much less than the architecture of the web server. A server written in assembly using a fork-on-request model like ymawky is going to be much slower than a server written in C using an async event loop like nginx, because forking is very inefficient at scale. Plus a big bottleneck is the networking syscalls, rather than the code itself, so two servers with the same model written in Assembly and C would likely be roughly equal.
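      A rough sketch of the fork-on-request model mentioned above, in Python rather than assembly. This is not ymawky's code; `RESPONSE`, `handle`, and `serve_one_forked` are hypothetical names, and the request is read but not parsed. The point it illustrates is that every connection costs a `fork()` plus a process teardown, which an event loop avoids.

```python
import os
import socket

# Fixed placeholder response; a real server would build this per request.
RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


def handle(conn):
    """Read one request (ignored in this sketch) and send the canned reply."""
    conn.recv(4096)
    conn.sendall(RESPONSE)


def serve_one_forked(listener):
    """Accept a single connection and serve it in a freshly forked child."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child process: owns this one connection, then exits.
        try:
            listener.close()
            handle(conn)
            conn.close()
        finally:
            os._exit(0)
    # Parent process: drop its copy of the socket and reap the child.
    # Waiting here keeps the sketch simple; a real server would reap
    # children asynchronously and go straight back to accept().
    conn.close()
    os.waitpid(pid, 0)
    return pid
```

      Calling `serve_one_forked` in a loop gives one process per request; an async design would instead keep one process and multiplex many sockets.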

  • hparadiz 2 days ago

    I love projects like this because I think eventually all common computing tasks will be broken down into their most computationally optimized constituent components

  • tosti 2 days ago

    This isn't a bad thing per se. I imagine this could be a thing for an embedded side project or a tiny rescue system.

    Edit: or learning arm64 assembly :)

  • sylware 2 days ago

    arm64 is an IP-locked ISA, so it is not worth writing assembly for; stick to plain and simple C.

    RISC-V is worth it. I am self-hosting many of my internet thingies. I plan to move to RISC-V only hardware and to rewrite my internet software directly in single-threaded paranoid RISC-V assembly.

    • imtomt 2 days ago

      For sure. My laptop has an arm64 chip, which is why this is written in arm64. If I had an intel chip, it would be written in x86_64. RISC-V is very interesting, though, and I'd love to learn more at some point.

      Good luck with your RISC-V asm stuff! Hit me up if you publish any of it :)

      • sylware 7 hours ago

        I have a good set of RISC-V thingies on the internet already, but not cleanly in some git repositories (obviously not on microsoft github.com or gitlab).

        I am currently attempting to define some binary specifications for a wayland compositor on linux, and to do that I write RISC-V assembly which I run on x86_64 thx to a very small interpreter. So the "cleanup" will happen "after" I get my own real-life wayland compositor (I am currently finishing the memory layout for keyboard and mouse support if you were curious about it).

        Letting people know here is seriously compromised: HN started to aggressively block web browsers which are not based on one of the 'whatwg cartel' web engines, or their security provider is in love with gogol, dunno, could be my internet lines actively filtered too :) noscript/basic HTML is the only way for web freedom.

    • AntronX 2 days ago

      > arm64 is an IP-locked ISA, so it is not worth writing assembly for

      Why is that a problem? A Google search returns a 3K-page arm64 ISA manual. What else do you need to write asm?

      • jprjr_ 2 days ago

        Nothing else. I think what they're more concerned with is how open an architecture is.

        I think with RISC-V if you wanted to design your own chips and stuff you can just do it, whereas ARM doesn't let you do that.

        I'm not about to build my own chips so it doesn't matter all that much to me but I understand where the person is coming from. They'd rather write assembly for the more open architecture.

        • sylware 7 hours ago

          Exactly what I am doing: I even have a small interpreter to run RISC-V machine code, to a certain extent ofc, on x86_64, and I could just do that on arm64 too. Since the 'R' in RISC-V means "Reduced", such an implementation is more than reasonable, even for an average dev.

  • wewewedxfgdf 2 days ago

    You wrote this by hand? Impressive.

  • radhitya 2 days ago

    "raw syscalls only: no libc wrappers"

    insane! i wonder how many times you have spent to learn about them!

    • binaryturtle 2 days ago

      Like one time? It is like any other API. You look it up, study the parameters (in this case you need to look up which register each argument is expected in) and off it goes.

    • imtomt 2 days ago

      Honestly it's easier than you'd think! All the syscall numbers are in /usr/include/asm-generic/unistd.h (on linux), and you can read the man page for any of them.
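      One way to browse those numbers without scrolling through the header by hand, sketched in Python. The helper names `parse_syscall_numbers` and `load_syscall_numbers` are made up for illustration, and the regex assumes plain `#define __NR_name number` lines; aliases that point at another macro instead of a number are skipped.

```python
import re

# Header location as given in the comment above (Linux only).
HEADER_PATH = "/usr/include/asm-generic/unistd.h"

# Matches lines such as "#define __NR_write 64".
# Defines whose value is another macro name are deliberately not matched.
_DEFINE = re.compile(r"^#define[ \t]+__NR_(\w+)[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def parse_syscall_numbers(header_text):
    """Return a {syscall name: number} mapping from unistd.h-style text."""
    return {name: int(number) for name, number in _DEFINE.findall(header_text)}


def load_syscall_numbers(path=HEADER_PATH):
    """Read the header from disk and parse it; the path may differ per distro."""
    with open(path, encoding="utf-8") as header:
        return parse_syscall_numbers(header.read())
```

      On arm64 Linux the number found this way is what goes in `x8` before `svc #0`, with arguments in `x0` through `x5`.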

  • wewewedxfgdf 2 days ago

    There are critical security flaws in this web server. Consult your local LLM for a security analysis.