Defrag.exfat Is Inefficient and Dangerous

(github.com)

30 points | by dxdxdt a day ago

7 comments

  • zapzupnz 14 hours ago

    Does nobody else think the responses from the person who wrote the code read like the usual sycophantic “you’re absolutely right!” tone you get from AI these days?

    • pajko 9 hours ago

      There are at least 2 AIs.

  • ycombinatrix 15 hours ago

    >We prioritized simplicity and correctness first, and plan to incrementally introduce performance optimizations in future iterations.

    Sir, this is a correctness issue.

  • forgotpwd16 18 hours ago

    >After reviewing the core defrag logic myself, I've come to a conclusion that it's AI slop.

    I'd call it human slop. AI may have given them some code, but they certainly haven't used it fully. I uploaded defrag.c to ChatGPT, asked for a review of performance, correctness, and safety, and it pointed out the same issues as you (alongside a bunch of others that I'm not interested in reviewing at the moment).

    • stuaxo 16 hours ago

      Talk about a baptism of fire for the dev.

      Seems like they are very new to things and didn't expect it to be adopted, but were hoping for a bit of feedback.

  • burnt-resistor 16 hours ago

    Sigh. Piss-poor engineering, likely by humans. For the love of god, do atomic updates by duplicating the data first, as in a move-out-of-the-way-first strategy, before doing metadata updates. And keep a backup of the metadata at each point in time to maximize crash consistency and crash recovery while minimizing the potential for data loss. An online defrag kernel module would likely be much more useful, but I don't trust them to be able to handle such an undertaking.
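
    To make the ordering concrete, here is a minimal in-memory sketch of the copy-first idea. Everything in it is hypothetical (the ToyVolume class, relocate, read_file, and the crash_before_commit flag); it is not defrag.exfat's code and does not model the exFAT on-disk format. It only shows the sequence: duplicate the data, snapshot the metadata, repoint the metadata, and only then release the old cluster.

```python
import copy


class ToyVolume:
    """Hypothetical in-memory stand-in for a volume (not the exFAT layout)."""

    def __init__(self, n_clusters):
        self.clusters = [None] * n_clusters  # cluster payloads
        self.files = {}                      # metadata: name -> list of cluster indices
        self.metadata_backup = None          # last snapshot taken before a commit

    def free_clusters(self):
        # A cluster is free when no file's chain references it.
        used = {c for chain in self.files.values() for c in chain}
        return [i for i in range(len(self.clusters)) if i not in used]


def relocate(vol, name, pos, dst, crash_before_commit=False):
    """Move the cluster at chain position `pos` of file `name` to free cluster `dst`."""
    if dst not in vol.free_clusters():
        raise ValueError("destination cluster is not free")
    src = vol.files[name][pos]

    # 1. Duplicate the data first; the old copy stays valid and referenced.
    vol.clusters[dst] = vol.clusters[src]

    # 2. Snapshot the metadata before touching it.
    vol.metadata_backup = copy.deepcopy(vol.files)

    if crash_before_commit:
        return  # simulated crash: metadata still points at the old cluster

    # 3. Commit by repointing the metadata at the new copy.
    vol.files[name][pos] = dst

    # 4. Only now release the old cluster.
    vol.clusters[src] = None


def read_file(vol, name):
    """Concatenate the payloads of a file's cluster chain."""
    return b"".join(vol.clusters[c] for c in vol.files[name])
```

    In this toy model a crash between steps 1 and 3 leaves the file readable through its old chain, and the stray copy sits in a cluster that still counts as free. A real tool would also need to flush writes to the device between steps, which this sketch does not attempt.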

    If a user has double storage available, it's probably best to do the old-fashioned "defrag" by single-threaded copying all files and file metadata to a newly-formatted volume.

    • doubled112 9 hours ago

      That last paragraph sums up the ZFS defrag procedure at one shop I worked at. Buy new disks and send/receive the pool.

      At our size and use case the timing was usually close to perfect. The pools were getting close to full and fragmented as larger disks became inexpensive.