Google proposes Open Knowledge Format based on Markdown

(cloud.google.com)

69 points | by itherseed 2 days ago

16 comments

  • sadschnitzel 2 days ago

    I love the simplicity of this OKF spec, but I'm not sure everything can be represented well in "just Markdown".

    I've recently become intrigued by representing concepts so that AI can co-contribute effectively and token-efficiently (typically: find a good way to represent something as semi-structured sequential text), but also without compromising the human lens on the representation. We shouldn't accept a downgrade of the human knowledge representation experience just to make it AI-accessible. That's especially true if traditionally non-dev personas need to contribute, and they almost certainly find "weird text format + git" much worse than their current authoring/viz tools.

    I'm excited to see how standards for semantically representing different kinds of knowledge emerge in the next few years!

    Successful examples I can think of to mix in are open standards like DBML for schemas/E-R, LikeC4 for architecture, and diagrams-as-code ideas like Mermaid, all of which LLMs seem to "get" well (or can be told about from a short EBNF prompt). Crucially, they also have pretty human viz forms, and you can just ```code block``` inline them in Markdown next to natural language. And you can get LLMs to help you author the syntax.
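
    As a rough sketch of why inline code blocks are easy for tooling to pick up, here is one way to pull language-tagged fenced blocks out of a Markdown string in Python. The names `fenced_blocks`, `FENCE` and `TICKS` are made up for this illustration, and the regex only handles simple, unnested, unindented fences:

```python
import re

# Build the fence marker at runtime so this snippet can itself sit inside a fenced block.
TICKS = "`" * 3
FENCE = re.compile(rf"^{TICKS}(\w+)[ \t]*\n(.*?)^{TICKS}[ \t]*$", re.MULTILINE | re.DOTALL)

def fenced_blocks(markdown):
    """Return (language, body) pairs for every fenced block that has a language tag."""
    return [(m.group(1), m.group(2)) for m in FENCE.finditer(markdown)]

# A tiny document with one Mermaid block between two lines of prose.
doc = f"Some prose.\n{TICKS}mermaid\ngraph TD; A-->B\n{TICKS}\nMore prose.\n"
blocks = fenced_blocks(doc)
```

    A viewer could then route each body to a renderer chosen by its language tag, while the surrounding prose stays plain Markdown.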

    Harder to crack is stuff where there's implicit human meaning in spatial layout and colour, like in complex spreadsheets or Miro. I haven't found good alternatives for those yet.

    My own attempt in my (data engineering) domain is https://equalexperts.github.io/satsuma-lang/ for AI-and-human source-to-target mappings and transforms. A succinct structured text representation that allows natural language, but also nice viz and LSP/grammar tooling that helps agents not to have to slice and dice big docs token-inefficiently to reason about things like lineage or completeness or undefined sources.

    • xamde a day ago

      OKF seems OK, but bound to Markdown. A Markdown document can be turned into an OKF document by adding a 'type' to the frontmatter YAML.
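
      For illustration only, a minimal Python sketch of that step. It assumes plain `---` frontmatter delimiters and does no real YAML parsing; the function name `add_type` and the value `note` are made up here and are not taken from the OKF spec:

```python
def add_type(markdown: str, doc_type: str) -> str:
    """Return the document with a top-level `type` key in its YAML frontmatter."""
    lines = markdown.split("\n")
    if lines and lines[0].strip() == "---":
        # Find the line that closes the existing frontmatter block.
        for end in range(1, len(lines)):
            if lines[end].strip() == "---":
                header = lines[1:end]
                # Leave an existing top-level `type` key untouched.
                if not any(line.startswith("type:") for line in header):
                    lines.insert(end, f"type: {doc_type}")
                return "\n".join(lines)
    # No closed frontmatter block found: prepend a new one.
    return f"---\ntype: {doc_type}\n---\n" + markdown

# Hypothetical usage: "note" is a placeholder type value, not one defined by OKF.
plain = "# Title\n\nBody\n"
converted = add_type(plain, "note")
```

      Anything beyond flat keys (nested mappings, multi-line values) would want a proper YAML parser rather than line matching.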

      What about a knowledge graph language, which can be stated in Markdown prose, in Markdown code blocks, but also everywhere a text field is waiting for you? In the minimalistic language https://ddot.it you can link outside the Markdown world, to files, URLs or even just labels. Like OKF it's just a format.

      Disclaimer: I wrote that (short) spec.

      • sadschnitzel 19 hours ago

        I love how unobtrusive that is, great compromise between readability and expressiveness!

        • xamde 16 hours ago

          Thanks, it is based on an unreleased 50-page complex syntax spec with over 40 different kinds of arrows. Luckily, I simplified BEFORE release :-)

    • jarym a day ago

      Markdown is the de facto format for LLMs and humans to interoperate. And I agree not everything can be represented well, but that’s missing the point - it seems to win because Markdown is the lowest common denominator for both humans and AI models.

    • UltraSane a day ago

      You can't represent knowledge well without a graph format showing labeled relationships between entities.
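
      To make "labeled relationships between entities" concrete, here is a minimal Python sketch that stores edges as (subject, label, object) triples. The facts are taken from this thread; the label names and the helpers `build_index` and `related` are invented for the example and do not come from any particular graph standard:

```python
from collections import defaultdict

# Each edge is a (subject, label, object) triple; the label names are illustrative.
TRIPLES = [
    ("OKF", "proposed_by", "Google"),
    ("OKF", "based_on", "Markdown"),
    ("OKF", "uses", "YAML frontmatter"),
]

def build_index(triples):
    """Index edges by subject so outgoing relationships can be looked up quickly."""
    index = defaultdict(list)
    for subject, label, obj in triples:
        index[subject].append((label, obj))
    return index

def related(index, subject, label):
    """Return every object linked from `subject` by an edge with the given label."""
    return [obj for edge_label, obj in index.get(subject, []) if edge_label == label]

index = build_index(TRIPLES)
based_on = related(index, "OKF", "based_on")
```

      The same triples could also be written as plain lines of text, which is roughly the trade-off being argued over in this thread.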

  • mrkiouak 2 days ago

    I love revisiting RDF/OWL Semantic Web formats every 10 years.

    One of these years will be the one!

    https://en.wikipedia.org/wiki/Semantic_Web

  • yladiz 17 hours ago

    Having looked at many PDFs that needed to be “translated” to Markdown, it feels like a strange choice - I know it’s primarily to make things easily accessible to AI, but if we’re going to train models anyway, why not train them on something better? Markdown is quite limited, and can’t render something like a nested table for example, and if the point of having “open knowledge” is for AI, why do we need to use a format that won’t really be read by humans?

  • sermakarevich 18 hours ago

    Love the approach. I am a big fan of hierarchical knowledge organization. I think that almost all current Claude abstractions for knowledge management are broken. It becomes visible when you start running many coders concurrently or need to create 1K+ skills, e.g.: https://news.ycombinator.com/item?id=48407998

  • port11 11 hours ago

    Google has announced… Markdown with YAML front-matter, ladies and gentlemen. Please applaud. 15 KB of spec for this!

    (I’d be less sardonic if we could all stop using oops-you-missed-an-indent-YAML.)

  • bsimpson a day ago

    Is the flavor of Markdown (e.g. CommonMark) specified? I didn't see anything about it when perusing the first few pages, but that feels important for a spec.

  • matthewbarras a day ago

    Check out barrasindustries.com/okfind/

    Just an idea for an OKF bundle registry
