OpenTelemetry profiles enters public alpha

(opentelemetry.io)

114 points | by tanelpoder 5 hours ago

12 comments

  • SEJeff 2 hours ago

    I wonder how this compares to Grafana Pyroscope, which is really good for this sort of thing and already quite mature:

    https://grafana.com/oss/pyroscope/

    https://github.com/grafana/pyroscope

  • genthree 4 hours ago

    Relatedly: Has anyone profiled the performance and reliability characteristics of rsyslogd (Linux and FreeBSD distributed syslogger, maybe other platforms too) in its mode where it’s shipping logs to a central node? I’ve configured and used it with relatively small (high single digit nodes, bursts of activity to a million or two requests per minute or so) set-ups but have wondered if there’s a reason it’s not a more common solution for distributed logging and tracing (yes it doesn’t solve the UI problem for those, but it does solve collecting your logs)

    Like… has anyone done a Jepsen-like stress test on rsyslogd and shared the results? I’ve half-assedly looked before and not been able to find anything.

    • jbaiter an hour ago

      We're doing this with a few dozen GiB of logs a day (rsyslog -> central rsyslog -> Elasticsearch). It works reliably, but the config is an absolute nightmare, the documentation is a mixed bag, and troubleshooting often involves deep dives into the C code. We're planning to migrate to Alloy+Loki.

    • nesarkvechnep 2 hours ago

      People don’t care about syslog. 98% of my colleagues haven’t heard of it.

      • malux85 2 hours ago

        You are drawing a global conclusion from a tiny sample!

  • ollien 2 hours ago

    Very excited for this. We've used the Elixir version of this at $WORK a handful of times and have found it exceptionally useful.

  • secondcoming 4 hours ago

    > Continuously capturing low-overhead performance profiles in production

    It surprises me that anything designed by the OTel community could ever meet 'low-overhead' expectations.

    • tanelpoder 4 hours ago

      The reference implementation of the profiler [1] was originally built by the Optimyze team, which Elastic then acquired (Elastic later donated the profiler to OTel). That team is very good at what they do. For example, they invented the .eh_frame walking technique to get stack traces from binaries without frame pointers enabled.

      Some of the OGs from that team later founded Zymtrace [2] and they're doing the same for profiling what happens inside GPUs now!

      [1] https://github.com/open-telemetry/opentelemetry-ebpf-profile...

      [2] https://zymtrace.com/article/zero-friction-gpu-profiler/

    • felixge 4 hours ago

      OTel Profiling SIG maintainer here: I understand your concern, but we’ve tried our best to make things efficient across the protocol and all involved components.

      Please let us know if you find any issues with what we are shipping right now.

    • phillipcarter 4 hours ago

      Anything to actually add?