High Cardinality Metrics - Prometheus Remote Write (experimental-prometheus-rw) #3761

Open
jameshounshell opened this issue May 30, 2024 · 1 comment

jameshounshell commented May 30, 2024

Feature Description

With the Prometheus Remote Write output, there is currently no way to limit which labels are included in the series sent to Prometheus. In our case, the full URL is included as a label on the k6_http_ prefixed metrics. Some of our developers' tests generate random unique IDs in those URLs, which explodes the cardinality and leads to poor performance when querying, or when developers use the k6 Grafana dashboard for time spans longer than a few minutes.

For example, URLs like this one result in high cardinality within Prometheus:
http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78

I could ask the developers to use a smaller, fixed set of IDs, but this doesn't seem feasible in the long run.

Suggested Solution (optional)

I'm looking for something similar to Prometheus' relabel config functionality (regex matching and manipulation), or for the k6 library to have some way to mark or sanitize URLs containing unique IDs (for example, OpenTelemetry tracing does this automatically based on common HTTP server frameworks).

Alternatively, even some command-line flags to drop certain labels from the emitted metrics would be helpful.
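
To make the URL-sanitizing idea concrete, here is a minimal sketch of the kind of rewrite I have in mind. `sanitizeUrl` and the `:id` placeholder are hypothetical names for illustration, not an existing k6 API, and the sketch assumes the unique IDs are UUIDs like the one in the example URL above.

```javascript
// Hypothetical helper, not part of k6: collapses UUID path segments so that
// many distinct URLs map to a single label value.
// Assumption: the unique IDs are canonical 8-4-4-4-12 hex UUIDs.
const UUID_SEGMENT =
  /\/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=\/|\?|#|$)/gi;

function sanitizeUrl(url) {
  // Replace every "/<uuid>" path segment with the fixed placeholder "/:id".
  return url.replace(UUID_SEGMENT, '/:id');
}

console.log(
  sanitizeUrl(
    'http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78'
  )
);
// prints "http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/:id"
```

With a rewrite like this applied before the label is emitted, every request to the applications endpoint would share one label value regardless of which ID the test generated.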

Already existing or connected issues / PRs (optional)

No response

Edit/Update

  • I found this in the k6 documentation about how to limit the name label, but I'd need to test whether it can also be used to edit or limit the url label. It still leaves things to the discretion of the developer and their test, though, rather than allowing me to override the default behavior.

jameshounshell commented Jun 5, 2024

We've arrived at a solution: we use k6's URL grouping with the tag function.

This is good enough, but it leaves the implementation to the developers. I can imagine many will forget, and we'll still have to chase down the offending k6 test and deal with the fallout of whatever cardinality explosion happens before we catch it.

I'd still like a way to control this with an environment variable (or config file) so that we, as the platform engineering team, can enforce it with a Kyverno policy or similar. We manage an internal k6 Helm chart for the developers, so we'd have the opportunity to set it there.
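
For anyone landing here, the URL grouping workaround described in this comment might look roughly like the sketch below. It sets a fixed `name` tag through the `tags` request parameter, which is the approach the k6 URL grouping documentation describes. `getApplication`, `baseUrl`, and the `:id` placeholder are illustrative names, and `http` is assumed to be k6's HTTP module (`import http from 'k6/http'`), passed in as an argument so the helper can be exercised outside k6.

```javascript
// Sketch of per-request URL grouping. `getApplication` is a hypothetical
// helper; `http` is assumed to be the module from `import http from 'k6/http'`.
function getApplication(http, baseUrl, id) {
  // The real request URL still contains the unique ID...
  const url = `${baseUrl}/api/v1/applications/${id}`;
  // ...but every request carries the same `name` tag value.
  return http.get(url, {
    tags: { name: `${baseUrl}/api/v1/applications/:id` },
  });
}

// Inside a k6 script this would be called from the default function, e.g.:
//   export default function () {
//     getApplication(http, 'http://redacted.redacted.svc.cluster.local:8080', someId);
//   }
```

As far as I can tell, k6 also offers an `http.url` template tag that achieves the same grouping with less typing. Either way, each test author has to remember to do it, which is the gap described above.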
