What is the difficulty level and which AI tool should I use?

This prompt is advanced level. It works with Claude.

💻 Development · Advanced · Claude

Set Up Application Observability

Implement the three pillars of observability (logs, metrics, traces) with OpenTelemetry, Prometheus, and Grafana dashboards.

Paste into your AI

Paste this prompt into ChatGPT, Claude, or Gemini and customize the variables in brackets.

You are an SRE (Site Reliability Engineering) expert specializing in observability for distributed systems. I need to implement the three pillars of observability for my application.

**Application to instrument:**
- Type: [EX: Node.js API, Python microservices, Go application]
- Infrastructure: [EX: Kubernetes, Docker Compose, single VPS]
- Traffic volume: [EX: 10k requests/minute]
- Current stack: [EX: no monitoring, Sentry only, basic logs]

**Available or preferred tools:**
- Logs: [EX: Loki + Grafana, ELK Stack, Datadog]
- Metrics: [EX: Prometheus + Grafana, CloudWatch, Datadog]
- Traces: [EX: Jaeger, Zipkin, Tempo, Datadog APM]

Implement the three pillars of observability, along with dashboards and alerts:

1. **Structured logging**: implement JSON logging with standard fields (timestamp, level, service, trace_id, user_id, duration). Define the log levels and when to use each one. Avoid noisy logs that drown out the important ones.

2. **Prometheus metrics**: instrument key business metrics (order success rate, revenue per hour) and technical metrics (P95/P99 latency, error rate, resource utilization). Provide the instrumentation code.

3. **Distributed tracing with OpenTelemetry**: configure the OpenTelemetry SDK, instrument HTTP routes and database calls, and propagate the trace context between services.

4. **Grafana dashboards**: provide the JSON definition of 3 dashboards: health overview, latency and errors, and business metrics.

5. **Alerts**: define relevant alerting rules with thresholds based on SLOs (Service Level Objectives) and the error budget.

Why this prompt works

This prompt structures observability around the three industry-recognized pillars (logs, metrics, traces), so that each signal covers what the others miss: logs for error context, metrics for trends and alerts, and traces for understanding inter-service interactions.

The distinction between technical and business metrics matters for product teams: knowing that P99 latency is 500 ms is useful for engineering, but knowing that the order success rate dropped 5% is critical information for the business. Both types of metrics must coexist.

Adopting OpenTelemetry for tracing helps avoid vendor lock-in because it is an open-source, vendor-neutral standard: the same instrumentation code can send traces to Jaeger, Tempo, or Datadog as needed, without modifying application code.
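To make the logging pillar more concrete, here is a minimal sketch of structured JSON logging that uses only the Python standard library. It emits the fields the prompt asks for (timestamp, level, service, trace_id, user_id, duration). The service name `orders-api`, the helpers `build_logger` and `handle_request`, and the random `trace_id` are hypothetical; in a real setup the trace ID would normally come from the tracing SDK rather than being generated locally.

```python
import json
import logging
import time
import uuid
from datetime import datetime, timezone

SERVICE_NAME = "orders-api"  # hypothetical service name, for illustration only


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": SERVICE_NAME,
            "message": record.getMessage(),
            # Values passed through `extra=` become attributes on the record.
            "trace_id": getattr(record, "trace_id", None),
            "user_id": getattr(record, "user_id", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(entry)


def build_logger(stream=None) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(SERVICE_NAME)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger, user_id: str) -> str:
    """Simulate one request and log it with correlation fields."""
    trace_id = uuid.uuid4().hex  # assumption: stand-in for an ID from the tracer
    start = time.perf_counter()
    # ... the actual request handling would happen here ...
    duration_ms = round((time.perf_counter() - start) * 1000, 3)
    logger.info(
        "order processed",
        extra={"trace_id": trace_id, "user_id": user_id, "duration_ms": duration_ms},
    )
    return trace_id


if __name__ == "__main__":
    handle_request(build_logger(), user_id="u-123")
```

Because every line carries a `trace_id`, a log entry can be matched to the corresponding trace, which is what lets the three pillars be used together during an incident.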

Use Cases

- Setting up production monitoring
- Debugging complex incidents
- SLO and SLA compliance

Expected Output

Complete configuration of all three pillars with instrumentation code, Grafana dashboards, and SLO-based alert rules.
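The SLO-based alert rules rest on simple error-budget arithmetic, sketched below in plain Python. The function names are hypothetical, and the default paging threshold of 14.4 is an assumption taken from commonly cited multi-window burn-rate guidance (spending about 2% of a 30-day budget in one hour); the right value depends on your own SLO window and alerting policy.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the budget.

    A value of 1.0 means the budget is being spent exactly at the pace
    the SLO allows; higher values mean it will run out early.
    """
    if total == 0:
        return 0.0  # no traffic in the window, so nothing was burned
    return (failed / total) / error_budget(slo_target)


def should_page(
    failed: int, total: int, slo_target: float, threshold: float = 14.4
) -> bool:
    """Page when the burn rate in the window reaches the threshold.

    The default threshold is an illustrative assumption, not a rule.
    """
    return burn_rate(failed, total, slo_target) >= threshold


if __name__ == "__main__":
    # 20 failures out of 1000 requests against a 99.9% SLO: roughly 20x burn.
    print(burn_rate(20, 1000, 0.999))
    print(should_page(20, 1000, 0.999))
```

Alerting on burn rate rather than on a raw error-rate threshold ties each page to how quickly the error budget is being consumed, which is the behavior the prompt's fifth step asks for.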
