# MCP analytics

Every authenticated MCP request the Zuplo MCP Gateway handles produces a set of
structured analytics events. The events power the MCP tab on the Zuplo Portal's
**Analytics** page and feed the same data into Zuplo's standard log and metrics
pipelines. This page explains why each event family exists, the dimensions that
scope the data, and the operational questions the dashboard is designed to
answer.

## What the analytics are for

A platform team running an MCP Gateway usually wants to answer a small number of
recurring questions:

- Is the gateway healthy right now? What's the success rate, and where are the
  failures coming from — the gateway, the upstream, or the client?
- Which capabilities (tools, prompts, resources) are users actually exercising,
  and which are slow or error-prone?
- Who is using the gateway and how heavily?
- Did the upstream OAuth flow finish for the user who just complained, or did
  they hit a connect-required state nobody resolved?
- When latency went up, was it the gateway or the upstream?

The analytics event taxonomy is designed so an operator can answer each of those
questions without leaving the dashboard.

## The three event families

Every MCP analytics event belongs to one of three families. The split matters
because each family answers a different kind of question.

- **`mcp_request`** events fire at the route boundary. They record the
  acceptance or rejection of an inbound MCP request before any JSON-RPC routing
  happens — what authentication and authorization decisions the gateway made and
  why. These are the events that tell you "the gateway rejected this request"
  versus "the gateway accepted it and something downstream went wrong."
- **`capability_invocation`** events fire on every parsed JSON-RPC call. They
  record what the client asked for (the `mcpMethod` and `capabilityName`) and
  what happened — success, error, latency. This family feeds the
  top-capabilities tables and the per-tool error-rate views.
- **`auth_event`** events record the OAuth lifecycle: tokens issued and
  validated, consent approvals, upstream connections established, and token
  revocations. This family answers the "did the user actually finish OAuth"
  question.

Together the three families let an operator pivot from a failed tool call to the
OAuth event that issued the token, and then to the boundary event that accepted
the request, without leaving the analytics surface.

## Outcomes drive the chart colors

Every event carries an `outcome` value in one of seven classes — `success`,
`failure`, `denied`, `application_error`, `connect_required`, `partial`,
`cancelled`. Outcome class drives chart colors and the success-rate KPI.
Failures break down further by `failureOrigin` (gateway, upstream, client) for
the failure-origin chart and KPI.

The split between `denied`, `application_error`, and `failure` matters: a 401 on
the route is a `denied`, an upstream returning an MCP-level error inside a 200
response is an `application_error`, and an actual operational failure (timeout,
network error, malformed response) is a `failure`. The operator sees the same
red chart slice in all three cases but can pivot to the right next question by
clicking the slice.
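
The three-way split can be sketched as a small classifier. This is an
illustrative sketch only, not the gateway's implementation: the `Observation`
shape and its field names are hypothetical, and only the three rules stated
above are encoded. The remaining outcome classes are out of scope here.

```typescript
type Outcome = "success" | "denied" | "application_error" | "failure";

// Hypothetical summary of one request; these fields are not part of Zuplo's API.
interface Observation {
  routeStatus: number; // HTTP status returned at the MCP route
  operationalError?: "timeout" | "network_error" | "malformed_response";
  upstreamJsonRpcError?: boolean; // upstream sent an MCP-level error inside a 200
}

function classifyOutcome(o: Observation): Outcome {
  // A 401 on the route is an authentication rejection.
  if (o.routeStatus === 401) return "denied";
  // Timeouts, network errors, and malformed responses are operational failures.
  if (o.operationalError !== undefined) return "failure";
  // The transport succeeded, but the payload carries an MCP-level error.
  if (o.routeStatus === 200 && o.upstreamJsonRpcError === true) {
    return "application_error";
  }
  // Simplification for this sketch: anything else counts as success.
  return "success";
}
```

Under these assumptions, `classifyOutcome({ routeStatus: 401 })` yields
`"denied"`, while a 200 response carrying an upstream JSON-RPC error yields
`"application_error"`.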

## Dimensions you'll filter by

Each event carries the route and identity fields that scope it:

- `operationId` (surfaced as `virtualServerName`) — the route's identity
- `upstreamServerId` (surfaced as `upstreamServerName`) — the upstream's id
- `subjectId` — the authenticated user
- `authProfileId` and `upstreamAuthMode` — which OAuth surface produced the call
- `httpMethod`, `transport`, `mcpMethod`, `clientName` — protocol shape
- `latencyMs`, with the gateway and upstream slices when both halves are
  measured
- `reasonCode` and `errorType` on failure events — stable programmer-friendly
  strings like `missing_token`, `invalid_audience`, `connect_required`,
  `upstream_timeout`

Reason codes appear in both analytics events and the structured-log
counterparts, which lets a single string cross-reference the two data sources
when an operator is debugging.

The dashboard's drill-in model uses these dimensions. Clicking any value in a
breakdown table (a user, an upstream, a capability) scopes the entire dashboard
to that value; clicking again toggles the filter off. Multiple drill-ins compose
with AND semantics — clicking a user, then an upstream, then a capability type
narrows the view to that combination.
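
The toggle-and-compose behavior described above can be modeled roughly as
follows. This is a minimal sketch, not the Portal's code: the `Dimension` names
reuse fields from this page where they exist, `capabilityType` is a hypothetical
name for the capability-type drill-in, and the sketch assumes that clicking a
different value in an already-filtered dimension replaces the old value.

```typescript
// Hypothetical subset of drill-in dimensions.
type Dimension = "subjectId" | "upstreamServerId" | "capabilityType";
type Filters = Partial<Record<Dimension, string>>;

// Clicking a value sets the filter; clicking the same value again clears it.
function toggleFilter(filters: Filters, dim: Dimension, value: string): Filters {
  const next: Filters = { ...filters };
  if (next[dim] === value) {
    delete next[dim];
  } else {
    next[dim] = value;
  }
  return next;
}

// AND semantics: an event is shown only if it matches every active filter.
function matches(event: Filters, filters: Filters): boolean {
  return (Object.keys(filters) as Dimension[]).every(
    (dim) => event[dim] === filters[dim],
  );
}
```

Returning a new object from `toggleFilter` instead of mutating the input keeps
each drill-in step easy to undo, which suits a click-to-toggle interaction.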

## How the dashboard answers each question

The Portal renders the MCP analytics in a fixed order so the layout doesn't
reshape when filters or time ranges change. The order maps to the recurring
operator questions.

The headline cards across the top — total events, success rate, p95 latency, and
failure origin decomposition — answer "is the gateway healthy?" at a glance. The
latency card splits gateway and upstream slices beneath the p95 number, which is
the fastest way to tell "the gateway is slow" from "the upstream is slow."
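
For readers who forward the raw events and want to reproduce a similar
comparison, the sketch below computes a p95 per latency slice. It rests on
several assumptions: the slice field names `gatewayLatencyMs` and
`upstreamLatencyMs` are hypothetical (this page only says the slices exist), and
the nearest-rank percentile is one common method, not necessarily the one the
Portal uses.

```typescript
// Hypothetical sample shape; only `latencyMs` is named on this page.
interface LatencySample {
  latencyMs: number;
  gatewayLatencyMs?: number;
  upstreamLatencyMs?: number;
}

// Nearest-rank percentile over a list of samples.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

const isNumber = (v: number | undefined): v is number => v !== undefined;

// Compare the p95 of each slice to see which side dominates latency.
function slowerSide(samples: LatencySample[]): "gateway" | "upstream" | "unknown" {
  const gateway = samples.map((s) => s.gatewayLatencyMs).filter(isNumber);
  const upstream = samples.map((s) => s.upstreamLatencyMs).filter(isNumber);
  // Slices are only present when both halves were measured.
  if (gateway.length === 0 || upstream.length === 0) return "unknown";
  return percentile(gateway, 95) >= percentile(upstream, 95) ? "gateway" : "upstream";
}
```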

The Events Over Time chart breaks volume down by event family and outcome, with
failure outcomes always rendered in red. An error spike has a characteristic
shape — a red bar appearing in a previously-green window — that's easy to spot
on the chart and click into.

The capability tables — Top Capabilities (with view toggles for Most Calls, Most
Errors, and Slowest) and the per-type filter (Tool, Resource, Prompt) — answer
"what are users actually doing?" and "what's broken?" An operator can pivot from
a slow tool to its server-and-type drill-in with one click, narrowing the
dashboard to that capability across every other view.

The Top Users table groups by `subjectId`, with email-style subjects rendered as
the email (so `auth0|google-apps|alex@example.com` shows as `alex@example.com`)
and other subject formats shown as-is. It's the fastest path from "alex says the
gateway is broken" to alex's failed events.
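
One way to approximate that display rule when working with exported events is
shown below. The exact rule the Portal applies isn't specified here, so this
sketch assumes a simple heuristic: if the last `|`-separated segment of the
subject looks like an email address, show that segment; otherwise show the
subject unchanged. The function name `displaySubject` is hypothetical.

```typescript
// Assumed heuristic, not the Portal's implementation.
function displaySubject(subjectId: string): string {
  const segments = subjectId.split("|");
  const last = segments[segments.length - 1];
  // Loose email check: something@something.something with no spaces or pipes.
  const looksLikeEmail = /^[^\s@|]+@[^\s@|]+\.[^\s@|]+$/.test(last);
  return looksLikeEmail ? last : subjectId;
}
```

With this heuristic, `auth0|google-apps|alex@example.com` is shown as
`alex@example.com`, and a subject without an email segment, such as a
hypothetical `auth0|abc123`, is shown as-is.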

The Top MCP Routes and Top Upstream Servers tables sit side by side. For most
projects the two tables mirror each other because each route proxies one
upstream; they diverge when one upstream sits behind multiple routes (for
example, a full-featured route and a read-only one). The MCP Methods, Top
Clients, and Transport panels add a final layer of protocol-shape detail: which
JSON-RPC methods are flowing, which clients identified themselves at
`initialize`, and which transport they used.

The JSON-RPC Error Codes and Failure Origins panels decompose failures so an
operator can answer "is this our problem, theirs, or the client's?" without
leaving the dashboard. The Top Reason Codes table at the bottom is the most
direct path into structured logs — the same `reasonCode` value appears on both
surfaces.

## Where to find it

The MCP analytics dashboard is a tab on the Zuplo Portal's **Analytics** page.
At the account scope,
[Analytics → MCP](https://portal.zuplo.com/+/account/analytics) aggregates
across every project on the account that has MCP routes. At the project scope,
[Analytics → MCP](https://portal.zuplo.com/+/account/project/analytics) shows
that project's events only.

The MCP tab appears automatically once any MCP request has been recorded for the
project. New projects show the empty state until the first MCP request lands.

## Reference: event types

The dashboard is built from a fixed set of event types. New types may be added
over time, but the families and outcome classes above stay stable.

### `mcp_request`

Boundary events at the MCP route. Examples include `mcp_request_accepted` and
`mcp_request_rejected`. Carries `operationId`, `subjectId` (when known),
`httpMethod`, `transport`, the `reasonCode` on rejection, and the `latencyMs`
spent at the boundary.

### `capability_invocation`

Per-capability events emitted by
[`McpProxyHandler`](../code-config/mcp-proxy-handler.mdx). Each invoked call
emits two events: an `mcp_capability_invoked` event before the upstream fetch
(carrying the parsed `mcpMethod` and `capabilityName`), and an
`mcp_capability_completed` event afterward (carrying `outcome`, `mcpStatus`,
`latencyMs`, and any JSON-RPC error details).

### `auth_event`

OAuth and upstream-auth lifecycle events. Examples include
`mcp_auth_downstream_token_issued`, `mcp_auth_downstream_token_validated`,
`mcp_auth_upstream_connection_established`, and `mcp_auth_consent_approved`.
Carries the same identity fields as the other families when applicable, plus
`authProfileId` and `upstreamAuthMode`.
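
When consuming forwarded events in code, the three families map naturally onto a
discriminated union. The sketch below is illustrative, not a published schema:
the field names come from this page, but the `family` and `eventType`
discriminators, the optionality of each field, and the `summarizeEvent` helper
are assumptions.

```typescript
type Outcome =
  | "success" | "failure" | "denied" | "application_error"
  | "connect_required" | "partial" | "cancelled";

// Fields shared across families (optionality is assumed).
interface BaseEvent {
  eventType: string; // for example "mcp_request_rejected"
  outcome: Outcome;
  operationId: string;
  subjectId?: string;
  latencyMs?: number;
  reasonCode?: string;
}

interface McpRequestEvent extends BaseEvent {
  family: "mcp_request";
  httpMethod: string;
  transport: string;
}

interface CapabilityInvocationEvent extends BaseEvent {
  family: "capability_invocation";
  mcpMethod: string;
  capabilityName: string;
}

interface AuthEvent extends BaseEvent {
  family: "auth_event";
  authProfileId?: string;
  upstreamAuthMode?: string;
}

type McpAnalyticsEvent = McpRequestEvent | CapabilityInvocationEvent | AuthEvent;

// Narrowing on `family` gives type-safe access to family-specific fields.
function summarizeEvent(e: McpAnalyticsEvent): string {
  const reason = e.reasonCode ? ` (${e.reasonCode})` : "";
  switch (e.family) {
    case "mcp_request":
      return `${e.httpMethod} over ${e.transport}: ${e.outcome}${reason}`;
    case "capability_invocation":
      return `${e.mcpMethod} ${e.capabilityName}: ${e.outcome}${reason}`;
    case "auth_event":
      return `${e.eventType}: ${e.outcome}${reason}`;
  }
}
```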

## Forwarding the underlying data

The same events that back the dashboard also flow through Zuplo's standard log
and metrics pipelines. Every event corresponds to a structured log entry; see
the MCP [Logging](./logging.mdx) page for the MCP-specific log fields. Supported
log destinations include Datadog, AWS CloudWatch, Google Cloud Logging, Splunk,
Sumo Logic, New Relic, Loki, Dynatrace, and VMware Log Insight; see the general
[Logging](../../articles/logging.mdx) article for the full list of destinations
and how to enable them.

For metrics, see the built-in
[metrics plugins](../../articles/metrics-plugins.mdx) (Datadog, Dynatrace, New
Relic, OpenTelemetry). The OpenTelemetry plugin specifically exports traces and
logs for the MCP request, every inbound policy, the handler, and the upstream
fetch.

## Related

- [Logging](./logging.mdx) — the structured-log counterpart, including the field
  model and OpenTelemetry export.
- [`McpProxyHandler` reference](../code-config/mcp-proxy-handler.mdx) — the
  handler whose capability instrumentation drives the dashboard's top-capability
  views.
- [Troubleshooting](../troubleshooting.mdx) — the operator playbook when an
  analytics chart shows something concerning.
