Web Frontend Instrumentation and Monitoring with OpenTelemetry and Elastic

DevOps, SRE and software engineering teams all require telemetry data to understand what's going on across their infrastructure and full-stack applications. Indeed we have covered instrumentation of backend services in several language ecosystems using OpenTelemetry (OTel) in the past. Yet for frontend tools, teams are often still relying on RUM agents, or sadly no instrumentation at all, due to the subtle differences in metrics that are needed to understand what's going on.

In this blog, we will discuss the current state of client instrumentation for the browser, along with an example showing how to instrument a simple JavaScript frontend using the OpenTelemetry browser instrumentation. Furthermore, we'll also share how the baggage propagators help us build a full picture of what is going on across the entire application by connecting backend traces with frontend signals. If you want to dive straight into the code, check out the repo here.

Application Overview

The application that we use for this blog is called OTel Record Store, a simple web application written with Svelte and JavaScript (albeit our implementation is compatible with other web frameworks), communicating with a Java backend. Both send telemetry signals to an Elastic backend.

Eagle-eyed readers will notice that signals from our frontend pass through a proxy and collector. The proxy is required to ensure that the appropriate Cross-Origin headers are populated to allow the signals to pass into Elastic, as well as for the traditional reasons such as security, privacy and access control:

events {}

http {

  server {

    listen 8123; 

    # Traces endpoint exposed as example, others available in code repo
    location /v1/traces {
      proxy_pass http://host.docker.internal:4318;
      # Apply CORS headers to ALL responses, including POST
      add_header 'Access-Control-Allow-Origin' 'http://localhost:4173' always;
      add_header 'Access-Control-Allow-Methods' 'POST, OPTIONS' always;
      add_header 'Access-Control-Allow-Headers' 'Content-Type' always;
      add_header 'Access-Control-Allow-Credentials' 'true' always;

      # Preflight requests receive a 204 No Content response
      if ($request_method = OPTIONS) {
        return 204;
      }
    }
  }
}

While collectors can also be used to add headers, we have left this example to perform traditional tasks such as routing and processing.

Prerequisites

This example requires an Elastic cluster, run locally via start-local, on Elastic Cloud, or on Elastic Serverless. Here we use the Managed OTLP endpoint in Elastic Serverless. Whichever option you choose, you need to specify several key environment variables, listed in the .env-example file:

ELASTIC_ENDPOINT=https://my-elastic-endpoint:443
ELASTIC_API_KEY=my-api-key

Running the application

To run our example, follow the steps in the project README, summarized below:

# Terminal 1: backend service, proxy and collector
docker-compose build
docker-compose up

# Terminal 2: frontend and sample telemetry data
cd records-ui
npm install
npm run generate

Java Backend Instrumentation

We will not cover the specifics of instrumenting Java services with EDOT, as there is already a great getting-started guide in the elastic-otel-java README. The example is here purely to showcase the propagation that is important for investigating UI issues. All you need to know is that we make use of automatic instrumentation, sending logs, metrics and traces via the OpenTelemetry Protocol (OTLP) using the environment variables below:

OTEL_RESOURCE_ATTRIBUTES=service.version=1,deployment.environment=dev
OTEL_SERVICE_NAME=record-store-server-java
OTEL_EXPORTER_OTLP_ENDPOINT=$ELASTIC_ENDPOINT
OTEL_EXPORTER_OTLP_HEADERS="Authorization=ApiKey ${ELASTIC_API_KEY}"
OTEL_TRACES_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp

The instrumentation is then initialized using the -javaagent option:

ENV JAVA_TOOL_OPTIONS="-javaagent:./elastic-otel-javaagent-1.2.1.jar"

Client Instrumentation

Now that we have established our prerequisites, let's dive into the instrumentation code for our simple web application. Although we'll cover the implementation in sections, the full solution is available here in frontend.tracer.ts.

State of OTel Client Instrumentation

At time of writing, the OpenTelemetry JavaScript SDK has stable support for metrics and traces, with logs currently under development and therefore subject to breaking changes as listed in their documentation:

Traces	Metrics	Logs
Stable	Stable	Development

What differs from many other SDKs is the note warning that client instrumentation for the browser is experimental and mostly unspecified. It is subject to breaking change, and many pieces such as plugin support for measuring Google Core Web Vitals are in progress as reflected in the Client Instrumentation SIG project board. In subsequent sections we'll show examples for signal capture, and also browser specific instrumentations including document load, user interaction and Core Web Vitals capture.

Resource Definition

When instrumenting web UIs, we need to establish our UI as an OpenTelemetry Resource. By definition, resources are entities that produce telemetry information. We want to see our UI as an entity in our system that interacts with other entities, which can be specified using the following code:

// Defines a Resource to include metadata like service.name, required by Elastic
import { resourceFromAttributes, detectResources } from '@opentelemetry/resources';

// Experimental detector for browser environment
import { browserDetector } from '@opentelemetry/opentelemetry-browser-detector';

// Provides standard semantic keys for attributes, like service.name
import { ATTR_SERVICE_NAME } from '@opentelemetry/semantic-conventions';

const detectedResources = detectResources({ detectors: [browserDetector] });
let resource = resourceFromAttributes({
	[ATTR_SERVICE_NAME]: 'records-ui-web',
	'service.version': 1,
	'deployment.environment': 'dev'
});
resource = resource.merge(detectedResources);

A unique identifier for the service is required, and is common to all SDKs. What differs from other implementations is the inclusion of the browserDetector which, when merged with our defined resource attributes, adds browser attributes such as platform, brands (e.g. Chrome versus Edge) and whether a mobile browser is being used:

Having this information on spans and errors is useful in diagnostic situations in identifying application and dependency compatibility issues with certain browsers (such as Internet Explorer from my time as an engineer 🤦).

Logs

Traditionally, frontend engineers rely on the DevTools console of their favourite browser to examine logs. With UI log messages only being accessible within your browser rather than forwarded to a file somewhere, which is the common pattern with backend services, we lose visibility of this resource when triaging user issues.

OpenTelemetry defines the concept of an exporter, which allows us to send signals such as logs to a particular destination.

// Get logger and severity constant imports
import { logs, SeverityNumber } from '@opentelemetry/api-logs';

// Provider and batch processor for sending logs
import { BatchLogRecordProcessor, LoggerProvider } from '@opentelemetry/sdk-logs';

// Export logs via OTLP
import { OTLPLogExporter } from '@opentelemetry/exporter-logs-otlp-http';

// Configure logging to send to the collector via nginx
const logExporter = new OTLPLogExporter({
	url: 'http://localhost:8123/v1/logs' // nginx proxy
});

const loggerProvider = new LoggerProvider({
	resource: resource, // see resource initialisation above
	processors: [new BatchLogRecordProcessor(logExporter)]
});

logs.setGlobalLoggerProvider(loggerProvider);

Once the provider has been initialized, we need to get hold of the logger to send our logs to Elastic rather than using good ol' console.log('Help!'):

// Example gets logger and sends a message to Elastic
const logger = logs.getLogger('default', '1.0.0');
logger.emit({
	severityNumber: SeverityNumber.INFO,
	severityText: 'INFO',
	body: 'Logger initialized'
});

They will now be visible in Discover and the Logs views, allowing us to search for relevant outages as part of investigations and incidents:
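Existing console calls do not reach the logger on their own. As a rough sketch of one way to bridge them, the hypothetical helper below wraps the console methods so each message is also emitted as a log record. The LoggerLike and LogRecordLike types are local stand-ins for the @opentelemetry/api-logs types so the snippet is self-contained, and the severity values follow the OpenTelemetry logs data model (DEBUG 5, INFO 9, WARN 13, ERROR 17). This helper is an assumption for illustration and is not part of the example repo.

```typescript
// Minimal stand-ins for the OTel log types used above, so the sketch is self-contained.
// In the application these would come from '@opentelemetry/api-logs'.
interface LogRecordLike {
	severityNumber: number;
	severityText: string;
	body: string;
}
interface LoggerLike {
	emit(record: LogRecordLike): void;
}

// Severity numbers from the OpenTelemetry logs data model
const SEVERITY = { debug: 5, info: 9, warn: 13, error: 17 } as const;
type Level = keyof typeof SEVERITY;

// Anything exposing the four console methods we want to bridge
type ConsoleLike = Record<Level, (...args: unknown[]) => void>;

// Hypothetical helper: wraps each method so the message is also emitted as a log record
function forwardConsoleToLogger(target: ConsoleLike, logger: LoggerLike): void {
	for (const level of Object.keys(SEVERITY) as Level[]) {
		const original = target[level].bind(target);
		target[level] = (...args: unknown[]) => {
			original(...args); // keep the DevTools output
			logger.emit({
				severityNumber: SEVERITY[level],
				severityText: level.toUpperCase(),
				body: args.map(String).join(' ')
			});
		};
	}
}
```

In the application this might be called once after the logger provider is registered, passing the browser console as the target and the logger obtained from logs.getLogger.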

Traces

The power of traces in diagnosing UI issues lies in seeing not just what is going on within the web application, but also the connections and time taken to make calls to the labyrinth of services behind it. To instrument a web-based application, we need to make use of the WebTracerProvider with the OTLPTraceExporter, in a similar way to how exporters work for logs and metrics:

/* Packages for exporting traces */

// Import the WebTracerProvider, which is the core provider for browser-based tracing
import { WebTracerProvider } from '@opentelemetry/sdk-trace-web';

// BatchSpanProcessor forwards spans to the exporter in batches to prevent flooding
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';

// Import the OTLP HTTP exporter for sending traces to the collector over HTTP
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

// Configure the OTLP exporter to talk to the collector via nginx
const exporter = new OTLPTraceExporter({
	url: 'http://localhost:8123/v1/traces' // nginx proxy
});

// Instantiate the trace provider and inject the resource
const provider = new WebTracerProvider({
	resource: resource,
	spanProcessors: [
		// Send each completed span through the OTLP exporter
		new BatchSpanProcessor(exporter)
	]
});

Next we need to register our provider. One thing that's slightly different in the web world is how we configure propagation. Context propagation in OpenTelemetry refers to the concept of moving context between services and processes which, in our case, allows us to correlate the web signals with those of backend services. Often this is done automatically. As you will see from the below snippet, there are 3 concepts that help us with propagation:

// This context manager ensures span context is maintained across async boundaries in the browser
import { ZoneContextManager } from '@opentelemetry/context-zone';

// Context Propagation across signals
import {
	CompositePropagator,
	W3CBaggagePropagator,
	W3CTraceContextPropagator
} from '@opentelemetry/core';

// Provider instantiation code omitted

// Register the provider with propagation and set up the async context manager for spans
provider.register({
	contextManager: new ZoneContextManager(),
	propagator: new CompositePropagator({
		propagators: [new W3CBaggagePropagator(), new W3CTraceContextPropagator()]
	})
});

The first is the ZoneContextManager, which propagates context such as spans and traces across asynchronous operations. Web developers may be familiar with zone.js, the library used by frameworks such as Angular to provide an execution context that persists across async tasks.

Additionally, we have combined the W3CBaggagePropagator and W3CTraceContextPropagator using the CompositePropagator to ensure key-value pair attributes are passed between signals as per the W3C specification defined here. In the case of the W3CTraceContextPropagator, it allows the propagation of the traceparent and tracestate HTTP headers as per the specification located here.
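To make the header formats concrete, here is a small sketch of what the propagators put on the wire. It parses a version 00 traceparent value (version, trace ID, parent span ID and trace flags, separated by hyphens) and formats a baggage header as comma-separated key=value pairs. Both function names are hypothetical helpers for illustration only. In the application the propagators handle this for us, and the session.id key is an invented example.

```typescript
// Shape of a parsed W3C traceparent header (version 00 form)
interface TraceParent {
	version: string;
	traceId: string; // 32 lowercase hex characters
	parentId: string; // 16 lowercase hex characters (the calling span's id)
	sampled: boolean; // lowest bit of the trace-flags field
}

const TRACEPARENT_PATTERN = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: returns undefined when the header is malformed
function parseTraceParent(header: string): TraceParent | undefined {
	const match = TRACEPARENT_PATTERN.exec(header.trim());
	if (!match) return undefined;
	const [, version, traceId, parentId, flags] = match;
	// Version ff and all-zero ids are invalid per the specification
	if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined;
	return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Hypothetical helper: builds a baggage header value with percent-encoded entries
function formatBaggage(entries: Record<string, string>): string {
	return Object.entries(entries)
		.map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
		.join(',');
}

// Example value taken from the W3C Trace Context specification
const parsedExample = parseTraceParent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
const baggageExample = formatBaggage({ 'session.id': 'abc 123' }); // 'session.id=abc%20123'
```

The trace ID is what lets the backend spans join the same trace as the browser span, while baggage carries application-defined key-value pairs alongside it.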

Auto Instrumentation

The simplest way to start instrumenting a web application is to register the web auto-instrumentations. At time of writing the documentation states that the following instrumentations can be configured via this approach:

Configuration for each instrumentation can be passed to registerInstrumentations, as shown in the below example configuring the fetch and XMLHttpRequest instrumentations:

// Used to auto-register built-in instrumentations
import { registerInstrumentations } from '@opentelemetry/instrumentation';

// Import the auto-instrumentations for web, which includes common libraries, frameworks and document load
import { getWebAutoInstrumentations } from '@opentelemetry/auto-instrumentations-web';

// Enable the web auto-instrumentations, configuring fetch and XMLHttpRequest
registerInstrumentations({
  instrumentations: [
    getWebAutoInstrumentations({
      '@opentelemetry/instrumentation-fetch': {
        propagateTraceHeaderCorsUrls: /.*/,
        clearTimingResources: true
      },
      '@opentelemetry/instrumentation-xml-http-request': {
        propagateTraceHeaderCorsUrls: /.*/
      }
    })
  ]
});

Taking the @opentelemetry/instrumentation-fetch instrumentation as an example, we are able to see traces for HTTP requests, and the propagators also ensure that the spans can connect with our Java backend services to give a full picture of the amount of time taken to process the request at each stage:

While auto-instrumentations are a great way to get common instrumentations, we can also instantiate instrumentations directly, as we'll see in the remainder of this article.
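One note on the propagateTraceHeaderCorsUrls value used in the configuration above: the /.*/ pattern matches every URL, so trace headers would be attached to any cross-origin request the page makes. Outside of a demo, it may be preferable to list only the backends you control. The sketch below shows such a list with a small check to reason about what would match. The origins are assumptions for illustration (the real backend address is not stated in this post), and wouldPropagate is an illustrative helper, not the library's own matcher.

```typescript
// Assumed origins for illustration only; substitute the real backend URLs
const TRACE_HEADER_URLS: RegExp[] = [
	/^http:\/\/localhost:8080\//, // hypothetical Java backend address
	/^https:\/\/api\.example\.com\//
];

// Illustrative check of which request URLs the patterns cover
function wouldPropagate(url: string, patterns: RegExp[] = TRACE_HEADER_URLS): boolean {
	return patterns.some((pattern) => pattern.test(url));
}
```

An array like TRACE_HEADER_URLS could then be supplied as the propagateTraceHeaderCorsUrls value in place of /.*/, since the option accepts a string, a regular expression, or an array of either.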

Document Load Instrumentation

Another consideration unique to web frontend is the time taken to load assets such as images, JavaScript files and even stylesheets. Such assets taking considerable time to load can impact metrics such as First Contentful Paint, and therefore the user experience. The OTel Document Load instrumentation allows for automatic instrumentation of the time taken to load assets when using the @opentelemetry/sdk-trace-web package.

It is simply a case of adding the instrumentation to the instrumentations array we have provided to our provider using registerInstrumentations:

// Used to auto-register built-in instrumentations like page load and user interaction
import { registerInstrumentations } from '@opentelemetry/instrumentation';

// Document Load Instrumentation automatically creates spans for document load events
import { DocumentLoadInstrumentation } from '@opentelemetry/instrumentation-document-load';

// Configuration discussed above omitted

// Enable automatic span generation for document load
registerInstrumentations({
  instrumentations: [
    // Automatically tracks when the document loads
    new DocumentLoadInstrumentation({
      ignoreNetworkEvents: false,
      ignorePerformancePaintEvents: false
    }),
    // Other instrumentations omitted
  ]
});

This configuration will create a new trace, conveniently named documentLoad, that will show us the time taken to load resources within the document, similar to the following:

Each span will have metadata attached to help us identify which resources are taking considerable time to load, such as this image example, where the resource takes 837ms to load:
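When checking these spans locally, it can help to compare them with the browser's own Resource Timing entries. The following is a minimal sketch, independent of the instrumentation, that filters a list of timing entries for slow resources. The slowResources helper, the ResourceTimingLike type and the 500ms threshold are all hypothetical choices for illustration.

```typescript
// Subset of the PerformanceResourceTiming fields used by this sketch
interface ResourceTimingLike {
	name: string; // resource URL
	initiatorType: string; // e.g. 'img', 'script', 'link'
	duration: number; // milliseconds
}

// Hypothetical helper: returns resources at or above the threshold, slowest first
function slowResources<T extends ResourceTimingLike>(entries: T[], thresholdMs: number): T[] {
	return entries
		.filter((entry) => entry.duration >= thresholdMs)
		.sort((a, b) => b.duration - a.duration);
}

// Sample entries with made-up URLs; the 837ms image mirrors the example above
const sampleEntries: ResourceTimingLike[] = [
	{ name: 'http://localhost:4173/app.js', initiatorType: 'script', duration: 120 },
	{ name: 'http://localhost:4173/cover.png', initiatorType: 'img', duration: 837 },
	{ name: 'http://localhost:4173/styles.css', initiatorType: 'link', duration: 510 }
];
const slowest = slowResources(sampleEntries, 500);
```

In the browser console, the entries would typically come from performance.getEntriesByType('resource') rather than a hand-written array.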

Click Events

You may wonder why we want to capture user interactions with web applications for diagnostic purposes. Being able to see the trigger points for errors can be useful in incidents to establish a timeline of what happened, and to determine whether users are indeed being impacted, as is the case with Real User Monitoring tools. But if we also consider the field of Digital Experience Monitoring, or DEM, software teams need details on the usage of application features to understand the user journey and how it could be improved in a data-driven way. Capturing user events is required for both.

The OTel UserInteraction instrumentation for web is how we capture these events. Similar to the document load instrumentation, it depends on the @opentelemetry/sdk-trace-web package, and when used with zone.js and the ZoneContextManager it also supports async operations.

Like other instrumentations, it is added via registerInstrumentations:

// Used to auto-register built-in instrumentations like page load and user interaction
import { registerInstrumentations } from '@opentelemetry/instrumentation';

// Automatically creates spans for user interactions like clicks
import { UserInteractionInstrumentation } from '@opentelemetry/instrumentation-user-interaction';

// Configuration discussed above omitted

// Enable automatic span generation for document load and user click interactions
registerInstrumentations({
  instrumentations: [
    // User events
    new UserInteractionInstrumentation({
      eventNames: ['click', 'input'] // instrument click and input events only
    }),
    // Other instrumentations omitted
  ]
});

It will capture and label spans for the user events we configure and, leveraging the propagators configured previously, can connect spans from other resources to the user event, similar to the below example where we see the service call to get records when the user adds a search term to the input box:

Metrics

There are numerous measurements that are helpful indicators of the availability and performance of web applications, such as latency, throughput or the number of 404 errors. Google Core Web Vitals are a set of standard metrics used by web developers to measure the real-world user experience of web sites, including loading performance, responsiveness to user input and visual stability. Given that, at time of writing, the Core Web Vitals Plugin for OTel Browser is on the backlog, let's try building our own custom instrumentation using the web-vitals JS library to capture these as OTel metrics.

In OpenTelemetry you can create your own custom instrumentation by extending InstrumentationBase, overriding the constructor to create the MeterProvider, Meter and OTLPMetricExporter that will allow us to send our Core Web Vital measurements to Elastic via our proxy, as presented in web-vitals.instrumentation.ts. Note that below we show only the LCP meter for succinctness, but the full example here measures all web vitals.

/* OpenTelemetry JS packages */
// Instrumentation base to create a custom Instrumentation for our provider
import {
	InstrumentationBase,
	type InstrumentationConfig,
	type InstrumentationModuleDefinition
} from '@opentelemetry/instrumentation';

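// Metrics API
import {
	metrics,
	type ObservableGauge,
	type Meter,
	type Attributes,
	type ObservableResult
} from '@opentelemetry/api';

// Meter provider and periodic reader used in the constructor below
import { MeterProvider, PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

// Export metrics via OTLP
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';

// Resource type passed into the constructor
import { type Resource } from '@opentelemetry/resources';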

export class WebVitalsInstrumentation extends InstrumentationBase {

  // Meter captures measurements at runtime
	private cwvMeter: Meter;

	/* Core Web Vitals Measures, LCP provided, others omitted */
	private lcp: ObservableGauge;

	constructor(config: InstrumentationConfig, resource: Resource) {
		super('WebVitalsInstrumentation', '1.0', config);

    // Create metric reader to process metrics and export using OTLP
		const metricReader = new PeriodicExportingMetricReader({
			exporter: new OTLPMetricExporter({
				url: 'http://localhost:8123/v1/metrics' // nginx proxy
			}),
			// Default is 60000ms (60 seconds).
			// Set to 10 seconds for demo purposes only.
			exportIntervalMillis: 10000
		});

    // Creating Meter Provider factory to send metrics
		const myServiceMeterProvider = new MeterProvider({
			resource: resource,
			readers: [metricReader]
		});
		metrics.setGlobalMeterProvider(myServiceMeterProvider);

    // Create web vitals meter
		this.cwvMeter = metrics.getMeter('core-web-vitals', '1.0.0');

		// Initialising CWV metric gauge instruments (LCP given as example, others omitted here)
		this.lcp = this.cwvMeter.createObservableGauge('lcp', { unit: 'ms', description: 'Largest Contentful Paint' });
	}

	protected init(): InstrumentationModuleDefinition | InstrumentationModuleDefinition[] | void {}

  // Other steps discussed later
}

You'll notice in our LCP example we have created an ObservableGauge to capture the value at the time it is read via a callback function. This can be set up when we enable our custom instrumentation, specifying that when the LCP event is triggered the value will be sent via result.observe:

/* Web Vitals Frontend package, LCP shown as example */
import {
	onLCP,
	type LCPMetric,
	type CLSMetric,
	type INPMetric,
	type TTFBMetric,
	type FCPMetric
} from 'web-vitals';

/* OpenTelemetry JS packages */
// Instrumentation base to create a custom Instrumentation for our provider
import {
	InstrumentationBase,
	type InstrumentationConfig,
	type InstrumentationModuleDefinition
} from '@opentelemetry/instrumentation';

// Metrics API
import {
	metrics,
	type ObservableGauge,
	type Meter,
	type Attributes,
	type ObservableResult,

} from '@opentelemetry/api';
 
// Other OTel Metrics imports omitted

// Time calculator via performance component
import { hrTime } from '@opentelemetry/core';

type CWVMetric = LCPMetric | CLSMetric | INPMetric | TTFBMetric | FCPMetric;

export class WebVitalsInstrumentation extends InstrumentationBase {

	/* Core Web Vitals Measures */
	private lcp: ObservableGauge;

	// Constructor and Initialization omitted

	enable() {
		// Capture Largest Contentful Paint, other vitals omitted
		onLCP(
			(metric) => {
				this.lcp.addCallback((result) => {
					this.sendMetric(metric, result);
				});
			},
			{ reportAllChanges: true }
		);
	}

  // Callback utility to add attributes and send captured metric
	private sendMetric(metric: CWVMetric, result: ObservableResult<Attributes>): void {
		const now = hrTime();

		const attributes = {
			startTime: now,
			'web_vital.name': metric.name,
			'web_vital.id': metric.id,
			'web_vital.navigationType': metric.navigationType,
			'web_vital.delta': metric.delta,
			'web_vital.value': metric.value,
			'web_vital.rating': metric.rating,
			// metric specific attributes
			'web_vital.entries': JSON.stringify(metric.entries)
		};

		result.observe(metric.value, attributes);
	}
}
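One thing to watch in the enable() snippet above: with reportAllChanges set, onLCP may invoke its callback several times, and each invocation registers a further callback on the gauge. A possible alternative, sketched below under stated assumptions, is to keep only the most recent report and register a single callback up front. LatestVitalGauge and the three Like types are hypothetical stand-ins so the snippet runs on its own, and this is not how the example repo implements it.

```typescript
// Minimal stand-ins for the OTel metric types, so the sketch runs on its own
interface ObservableResultLike {
	observe(value: number, attributes?: Record<string, string | number>): void;
}
interface ObservableGaugeLike {
	addCallback(callback: (result: ObservableResultLike) => void): void;
}
// Subset of the web-vitals metric fields used here
interface VitalLike {
	name: string;
	value: number;
	rating: string;
}

// Hypothetical helper: keeps the most recent report and registers one callback
class LatestVitalGauge {
	private latest: VitalLike | undefined;

	constructor(gauge: ObservableGaugeLike) {
		gauge.addCallback((result) => {
			const latest = this.latest;
			if (!latest) return; // nothing reported yet
			result.observe(latest.value, {
				'web_vital.name': latest.name,
				'web_vital.rating': latest.rating
			});
		});
	}

	// Intended to be called from the web-vitals callback, e.g. the one passed to onLCP
	update(metric: VitalLike): void {
		this.latest = metric;
	}
}
```

With this shape, each export cycle observes one value per vital, namely the latest one reported, regardless of how many times the web-vitals callback has fired.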

To use our own instrumentation, we need to register it, just as we did in frontend.tracer.ts for the web instrumentations that capture document load and user events:

registerInstrumentations({
  instrumentations: [
    // Other web instrumentations omitted
    // Custom Web Vitals instrumentation
    new WebVitalsInstrumentation({}, resource)
    ]
});

The lcp metric, along with the attributes we specified as part of our sendMetric function, will be sent to our Elastic cluster:

These metrics will not feed into the User Experience dashboard due to compatibility, but we can create a dashboard leveraging the values to show the trends of each of our vitals:

Summary

In this blog, we presented the current state of client instrumentation for the browser, along with an example showing how to instrument a simple JavaScript frontend using the OpenTelemetry browser instrumentation. To reflect back on the code, check out the repo here. If you have any questions or want to learn from other developers connect with the Elastic Community.

Developer resources:

OTel Record Store Application

JavaScript Browser Instrumentation
