Masking sensitive LLM data
Masking gives you control over the tracing data sent to Langfuse. Use masking functions to redact sensitive information before trace data leaves your application, for example to:
- Redact sensitive information from trace or observation inputs, outputs, and metadata.
- Transform OpenTelemetry span attributes before export.
- Implement fine-grained data filtering for compliance or privacy requirements.
Learn more about Langfuse's data security and privacy measures concerning the stored data in our security and compliance overview.
Configure masking
The Python SDK supports two masking hooks. For new Python SDK setups, prefer mask_otel_spans.
| Method | Status | When it runs | What it covers |
|---|---|---|---|
mask_otel_spans | Recommended | At export stage, after Langfuse decides which OpenTelemetry spans this client should export and after Langfuse media handling | Raw OpenTelemetry span attributes from Langfuse SDK spans and third-party instrumentations exported by this Langfuse client |
mask | Legacy | Synchronously when Langfuse SDK attributes are created | Data set through Langfuse SDK APIs such as start_observation(), update(), and set_trace_io() |
Use mask_otel_spans to patch OpenTelemetry span attributes before the Langfuse client exports them. The hook receives read-only snapshots of one OpenTelemetry export batch and returns sparse patches for the spans that should change.
from typing import Optional
from langfuse import Langfuse
from langfuse.types import (
MaskOtelSpansParams,
MaskOtelSpansResult,
OtelSpanPatch,
)
def mask_otel_spans(
*, params: MaskOtelSpansParams
) -> Optional[MaskOtelSpansResult]:
patches = {}
for identifier, span in params.spans.items():
if span.instrumentation_scope_name == "openai":
patches[identifier] = OtelSpanPatch(
delete_attributes=(
"gen_ai.prompt.0.content",
"gen_ai.completion.0.content",
),
set_attributes={"masking.applied": True},
)
return MaskOtelSpansResult(span_patches=patches)
langfuse = Langfuse(mask_otel_spans=mask_otel_spans)mask_otel_spans behavior
mask_otel_spans follows the public Python SDK type contract:
- It receives one OpenTelemetry export batch as
params.spans. A batch is not guaranteed to contain a complete trace, request, or Langfuse observation tree. - Each key is an
OtelSpanIdentifier(trace_id, span_id). Reuse the identifier objects fromparams.spanswhen returning patches. - Each value is an
OtelSpanDatasnapshot aftershould_export_spanfiltering and export-stage media handling. Itsattributesandresource_attributesmappings are read-only. - Return
Noneto leave the whole batch unchanged. - Return
MaskOtelSpansResult(span_patches=...)to delete or replace attributes on selected spans. - Patches are sparse. Omit spans that do not need changes.
OtelSpanPatchdeletesdelete_attributesfirst and then appliesset_attributes, soset_attributeswins when the same key appears in both.set_attributesvalues must be valid OpenTelemetry attribute values: strings, booleans, integers, floats, or homogeneous sequences of those scalar types.- The hook can only change span attributes. It cannot change the span name, IDs, parent relationship, resource attributes, events, links, or instrumentation scope.
- The hook affects only spans exported by this Langfuse client. If the same OpenTelemetry spans are sent to another exporter, that exporter receives its own unmodified copy.
mask_otel_spans only affects spans that pass through the Langfuse Python SDK span processor and are exported by this Langfuse client. If you also send telemetry to another observability backend through a separate OpenTelemetry span processor or exporter, that backend receives its own unmodified copy of the spans. Configure masking separately for any non-Langfuse exporter.
mask_otel_spans is synchronous. It usually runs on the OpenTelemetry batch span processor worker thread, so it should not block the main caller thread of your application. During flush() and shutdown it may run on the caller thread.
Keep the function deterministic and fast. Network calls are possible, but slow masking backs up the OpenTelemetry export queue and delays span export. Avoid long-running work, unbounded retries, request-local state, the current active span, and async I/O inside the masking function.
If mask_otel_spans raises an exception or returns an invalid MaskOtelSpansResult, Langfuse drops the whole export batch. If an individual OtelSpanPatch is invalid, Langfuse drops only that span from the Langfuse export. Invalid returned attribute values delete only the affected attribute.
Legacy mask behavior
The mask parameter is the legacy Python SDK masking hook. It runs synchronously when Langfuse SDK attributes are created and applies only to data set through Langfuse SDK APIs such as start_observation(), update(), and set_trace_io(). It does not inspect final raw OpenTelemetry span attributes from third-party instrumentations.
Use mask only when you specifically need to transform data at Langfuse SDK attribute creation time.
from typing import Any
from langfuse import Langfuse
def masking_function(*, data: Any, **kwargs: Any) -> Any:
if isinstance(data, str) and data.startswith("SECRET_"):
return "REDACTED"
if isinstance(data, dict):
return {key: masking_function(data=value) for key, value in data.items()}
if isinstance(data, list):
return [masking_function(data=item) for item in data]
return data
langfuse = Langfuse(mask=masking_function)For new Python SDK masking setups, prefer mask_otel_spans.
Examples
Redact credit card numbers with mask_otel_spans
This example scans exported OpenTelemetry string attributes for credit card-like patterns and replaces matches before the span is exported to Langfuse.
import re
from typing import Optional
from langfuse import Langfuse, observe
from langfuse.types import (
MaskOtelSpansParams,
MaskOtelSpansResult,
OtelSpanPatch,
)
credit_card_pattern = re.compile(r"\b(?:\d[ -]*?){13,19}\b")
def mask_otel_spans(
*, params: MaskOtelSpansParams
) -> Optional[MaskOtelSpansResult]:
patches = {}
for identifier, span in params.spans.items():
replacements = {}
for key, value in span.attributes.items():
if isinstance(value, str):
masked_value = credit_card_pattern.sub(
"[REDACTED CREDIT CARD]", value
)
if masked_value != value:
replacements[key] = masked_value
if replacements:
patches[identifier] = OtelSpanPatch(set_attributes=replacements)
return MaskOtelSpansResult(span_patches=patches)
langfuse = Langfuse(mask_otel_spans=mask_otel_spans)
@observe()
def process_payment():
return "Customer paid with card number 4111 1111 1111 1111."
result = process_payment()
print(result)
# Output: Customer paid with card number 4111 1111 1111 1111.
# Flush spans in short-lived applications.
langfuse.flush()The function result printed in your application is unchanged. The exported span attributes sent to Langfuse contain the redacted value.
Redact email addresses and phone numbers
import re
from typing import Optional
from langfuse import Langfuse
from langfuse.types import (
MaskOtelSpansParams,
MaskOtelSpansResult,
OtelSpanPatch,
)
email_pattern = re.compile(r"\b[\w.-]+?@[\w.-]+?\.\w+?\b")
phone_pattern = re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b")
def mask_otel_spans(
*, params: MaskOtelSpansParams
) -> Optional[MaskOtelSpansResult]:
patches = {}
for identifier, span in params.spans.items():
replacements = {}
for key, value in span.attributes.items():
if isinstance(value, str):
masked_value = email_pattern.sub("[REDACTED EMAIL]", value)
masked_value = phone_pattern.sub("[REDACTED PHONE]", masked_value)
if masked_value != value:
replacements[key] = masked_value
if replacements:
patches[identifier] = OtelSpanPatch(set_attributes=replacements)
return MaskOtelSpansResult(span_patches=patches)
langfuse = Langfuse(mask_otel_spans=mask_otel_spans)To prevent sensitive data from being sent to Langfuse, you can provide a mask function to the LangfuseSpanProcessor. This function will be applied to the input, output, and metadata of every observation.
The function receives an object { data }, where data is the stringified JSON of the attribute's value. It should return the masked data.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";
const spanProcessor = new LangfuseSpanProcessor({
mask: ({ data }) => {
const maskedData = data.replace(
/\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/g,
"***MASKED_CREDIT_CARD***",
);
return maskedData;
},
});
const sdk = new NodeSDK({
spanProcessors: [spanProcessor],
});
sdk.start();See JS/TS SDK docs for more details.
When using the CallbackHandler, you can pass mask to the constructor:
import { CallbackHandler } from "langfuse-langchain";
function maskingFunction(params: { data: any }) {
if (typeof params.data === "string" && params.data.startsWith("SECRET_")) {
return "REDACTED";
}
return params.data;
}
const handler = new CallbackHandler({
mask: maskingFunction,
});Related resources
- Data Retention - Automatically delete traces, observations, scores, and media assets after a configured retention period.
- Data Deletion - Manually delete individual or batches of traces.
GitHub discussions
Last edited