Allow grouping traces using a common traceID #3369

Open
pablochacin opened this issue Oct 4, 2023 · 4 comments

@pablochacin

pablochacin commented Oct 4, 2023

Background

In the OpenTelemetry API, a Span represents a single operation within a trace. Spans can be nested to form a trace tree. Each trace contains a root span, which typically describes the entire operation and, optionally, one or more sub-spans for its sub-operations.

A span is said to be a root span if it does not have a parent. Each trace includes a single root span, which is the shared ancestor of all other spans in the trace. Implementations MUST provide an option to create a Span as a root span, and MUST generate a new TraceId for each root span created.
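
The parent/child relationship described above can be sketched in a few lines of plain JavaScript. This is only an illustration, not the OpenTelemetry SDK: `randomHex`, `startRootSpan`, and `startChildSpan` are hypothetical helpers, and `Math.random` stands in for a real ID generator.

```javascript
// Hypothetical helper: returns `bytes` random bytes as a lowercase hex string.
function randomHex(bytes) {
  let out = '';
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, '0');
  }
  return out;
}

// A root span has no parent and always receives a freshly generated traceId.
function startRootSpan(name) {
  return { name, traceId: randomHex(16), spanId: randomHex(8), parentId: null };
}

// A child span inherits its parent's traceId and records the parent's spanId.
function startChildSpan(parent, name) {
  return { name, traceId: parent.traceId, spanId: randomHex(8), parentId: parent.spanId };
}

const journey = startRootSpan('userJourney');
const login = startChildSpan(journey, 'login');
const otherJourney = startRootSpan('userJourney');
```

Here `login` shares `journey.traceId`, while `otherJourney` starts a separate trace with its own ID.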

Example

The discussion that follows uses this example as a reference. It describes the whole journey of a user in a shop: the user logs in, browses the catalog, adds products to the cart, and then checks out.

function userJourney() {
  // start of journey
  http.post(); // login

  // browse catalogue and add to cart
  http.get();
  http.get();
  http.put();

  // checkout
  http.get();
  http.post();
}

Problem

Presently, the experimental/tracing module generates a new traceID for each HTTP request and propagates it using an HTTP header. The resulting trace tree for the example will be:

- root span
   |- post
- root span
   |- get
- root span
   |- get
- root span
   |- put
- root span
   |- get
- root span
   |- post
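
As a rough sketch of this per-request behavior, the snippet below builds a fresh trace header for every request. It assumes the W3C Trace Context `traceparent` header as the propagation format; the issue only says "an HTTP header", so that choice, along with the `randomHex`, `traceparent`, and `headersForRequest` helpers, is an assumption made for illustration.

```javascript
// Hypothetical helper: returns `bytes` random bytes as a lowercase hex string.
function randomHex(bytes) {
  let out = '';
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, '0');
  }
  return out;
}

// W3C traceparent layout: version "00", 16-byte trace ID, 8-byte parent ID, flags.
function traceparent(traceId, parentId) {
  return `00-${traceId}-${parentId}-01`;
}

// Per-request behavior: each call generates a new traceID and a random parent ID.
function headersForRequest() {
  return { traceparent: traceparent(randomHex(16), randomHex(8)) };
}

const first = headersForRequest();
const second = headersForRequest();
```

Because `first` and `second` carry different trace IDs, the backend sees two unrelated traces, which is what produces one root per request in the tree above.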

This feature is useful to allow distributed tracing of each individual request. However, for more complex tests, it is customary to group requests that model a user journey in order to collect metrics such as error rates.

It would be convenient for developers to be able to create a single tree of traces with all the requests in that journey.

In practice, this means that all requests under the journey should share the same traceID.

Some mechanism is needed in order to tell the tracing module when a new root span starts and a new traceID is needed.

An important nuance is that the traceID generated by the tracing module must follow a specific format in order to work with the cloud tracing feature, which makes traces generated by the application under test visible in the test results. Therefore, the test developer cannot supply it through an API; the tracing module must generate it.

Another important consideration is that the tracing module is not presently generating any span that corresponds to this traceID. As a consequence, the tracing backend will report that the "root span is missing". Addressing this issue is outside the scope of this feature but we intend that the proposed solution could also help to address it.

Suggested Solution

Automatic traceID generation based on the request context

The tracing extension keeps track of the context in which each request is executed, including the scenario, iteration, and group. This information is used as metadata and sent to the cloud tracing backend.

We propose that the tracing extension generate a different traceID for each context and that all requests in the same context use the same traceID.

The resulting trace tree for the example would be:

- root span
   |- post
   |- get
   |- get
   |- put
   |- get
   |- post

All requests for the journey are now under the same root span.
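
One way to picture this proposal is a lookup table keyed by the request context. The sketch below is an assumption about how such a table could work, not the extension's actual code: the context shape (`scenario`, `iteration`, `group`) follows the fields named above, while `randomHex` and `traceIdFor` are hypothetical and the random hex string does not reproduce the format required by cloud tracing.

```javascript
// Hypothetical helper: returns `bytes` random bytes as a lowercase hex string.
function randomHex(bytes) {
  let out = '';
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, '0');
  }
  return out;
}

// One traceID per distinct (scenario, iteration, group) combination.
const traceIdsByContext = new Map();

function traceIdFor(context) {
  const key = JSON.stringify([context.scenario, context.iteration, context.group]);
  if (!traceIdsByContext.has(key)) {
    traceIdsByContext.set(key, randomHex(16)); // first request in this context
  }
  return traceIdsByContext.get(key);
}

const journey = { scenario: 'default', iteration: 0, group: 'userJourney' };
const loginTraceId = traceIdFor(journey);
const checkoutTraceId = traceIdFor(journey);
const nextIterationTraceId = traceIdFor({ ...journey, iteration: 1 });
```

Requests made in the same context (`loginTraceId`, `checkoutTraceId`) share an ID, and the next iteration gets a new one. A real implementation would also need to discard entries for finished iterations so the table does not grow without bound.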

Alternative: Explicit API for controlling root spans

Instead of automatically creating a new traceID for each context (as described in the previous section), the tracing module could provide an interface for signaling when a new root span starts and a new TraceID is needed, following the OTEL specification.

The startTrace() function generates a new traceID to be used for all subsequent requests until the current root span is ended using the endTrace() function.

Unless startTrace() has been called, each request uses a new traceID, as it does today.

function userJourney() {
  // start of journey
  startTrace();

  http.post(); // login

  // browse catalogue and add to cart
  http.get();
  http.get();
  http.put();

  // checkout
  http.get();
  http.post();

  // end of journey
  endTrace();
}

The main advantage of this option is that it will give control to the developer to define the scope of each trace.

Additionally, it could be extended to cover other requirements, such as creating sub-spans for grouping steps in a test following the OTEL API.

However, this interface is error-prone: starting a new trace in the wrong place (for example, in a sub-function) will break the trace tree.

Also, it doesn't add any immediate value when compared with the automatically generated traces unless it is complemented with other features.
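
To make the semantics and the pitfall of this alternative concrete, here is a minimal sketch of how `startTrace()` and `endTrace()` might behave. The internals (`activeTraceId`, `traceIdForRequest`, `randomHex`) and the `addToCart` helper are hypothetical; only the two function names come from the proposal.

```javascript
// Hypothetical helper: returns `bytes` random bytes as a lowercase hex string.
function randomHex(bytes) {
  let out = '';
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, '0');
  }
  return out;
}

let activeTraceId = null; // null means "no explicit trace in progress"

function startTrace() {
  activeTraceId = randomHex(16);
}

function endTrace() {
  activeTraceId = null;
}

// Requests reuse the active traceID, or fall back to one traceID per request.
function traceIdForRequest() {
  return activeTraceId === null ? randomHex(16) : activeTraceId;
}

// Pitfall: a sub-function that starts its own trace replaces the caller's trace.
function addToCart() {
  startTrace(); // misplaced call
  return traceIdForRequest();
}

startTrace();
const loginId = traceIdForRequest();
const browseId = traceIdForRequest();
const cartId = addToCart();
const checkoutId = traceIdForRequest();
endTrace();
const afterJourneyId = traceIdForRequest();
```

`loginId` and `browseId` match, but the misplaced `startTrace()` inside `addToCart` gives `cartId` and `checkoutId` a different ID, so the journey is split into two traces.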

Related problems not covered by this proposal

Generating a root span for each traceID

As mentioned before, the tracing extension currently only generates the traceID and associates it with a random parentID; no root span is reported to the tracing backend.

This proposal does not address this issue. However, there appears to be no impediment to implementing this functionality in the future.

Creating sub-spans

For complex tests, it would be convenient to allow developers to group requests in sub-spans that represent different steps in the test.

For example, group the requests for the browse catalog and checkout steps under different sub-spans:

- root span
   |- post
   |- sub-span (browse catalog)
   |   |- get
   |   |- get
   |   |- put
   |- sub-span (checkout)
       |- get
       |- post

Creating this span tree will require a mechanism to inform the tracing extension of the start and end of a new span to be used as the parent of the requests.
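
Such a mechanism could, for instance, be a stack of open spans in which the innermost one becomes the parent of whatever starts next. The sketch below is purely illustrative: `withSpan`, `startSpan`, `endSpan`, `request`, and `randomHex` are hypothetical names, not part of the tracing extension or of this proposal.

```javascript
// Hypothetical helper: returns `bytes` random bytes as a lowercase hex string.
function randomHex(bytes) {
  let out = '';
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, '0');
  }
  return out;
}

const spanStack = []; // currently open spans, innermost last
const spans = [];     // every span created, in start order

function startSpan(name) {
  const parent = spanStack.length > 0 ? spanStack[spanStack.length - 1] : null;
  const span = {
    name,
    traceId: parent ? parent.traceId : randomHex(16), // only roots get a new traceId
    spanId: randomHex(8),
    parentId: parent ? parent.spanId : null,
  };
  spanStack.push(span);
  spans.push(span);
  return span;
}

function endSpan() {
  spanStack.pop();
}

// Runs fn inside a span and always closes the span, even if fn throws.
function withSpan(name, fn) {
  const span = startSpan(name);
  try {
    fn();
  } finally {
    endSpan();
  }
  return span;
}

// Stand-in for an HTTP request: a leaf span under the innermost open span.
function request(name) {
  return withSpan(name, () => {});
}

let login, browse, firstGet, checkout, pay;
const root = withSpan('userJourney', () => {
  login = request('post');
  browse = withSpan('browse catalog', () => {
    firstGet = request('get');
    request('get');
    request('put');
  });
  checkout = withSpan('checkout', () => {
    request('get');
    pay = request('post');
  });
});
```

Wrapping each step in `withSpan` reproduces the tree above, and closing spans in a `finally` block avoids the misplaced start/end calls that a pair of separate functions would allow.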

Implementing this functionality is outside the scope of this proposal.

Already existing or connected issues / PRs (optional)

#3029 mentions the ability to export traces, which was removed from the tracing module.

@pablochacin
Author

Edited to add considerations for handling nested groups.

@pablochacin
Author

Edited to add an alternative to the JS API based on a discussion with the core and insights team.

@oleiade
Member

oleiade commented Nov 9, 2023

ref #2728
ref #3212

@pablochacin pablochacin changed the title Allow relating traces under a group using the a common traceID Nov 9, 2023
@pablochacin pablochacin changed the title Allow relating traces sing the a common traceID Nov 9, 2023
@pablochacin
Author

pablochacin commented Nov 9, 2023

Massive re-editing to

  • Add OTEL definitions as context
  • Limit the scope of the problem to defining a single root span for all requests in a "context"
  • Identify related issues not covered by the proposal
@oleiade oleiade removed the triage label Dec 20, 2023
@olegbespalov olegbespalov removed their assignment Apr 19, 2024