How do I understand Grafana log ingestion, telemetry endpoints, and datasources in DuploCloud?

Context

When working with Grafana in DuploCloud, you may need to understand how logs are ingested into Loki, where telemetry data comes from, how to control log labels, and what each datasource is used for. This information is essential for setting up proper monitoring, alerts, and dashboards.

Answer

Log Ingestion Delay

Logs in Loki are nearly live with minimal delay. The system buffers logs for 10 seconds before making a bulk request to push them to Loki, so you can expect to see logs approximately 10 seconds after they are generated.

Telemetry Endpoints

For telemetry data, DuploCloud uses an OpenTelemetry (OTEL) endpoint where APM metrics and traces are pushed. If your application is already pushing data to this endpoint, the metrics should be available for querying in Grafana.

The OTEL endpoint for use within EKS clusters is:

http://duplo-monitoring-alloy.duploservices-otel-prod01.svc.cluster.local:4317

Note: This endpoint only works within the EKS cluster environment.

Log Labels Control

There is a limit of approximately 25 labels per log line (including labels like cluster, container, flags, detected_level). For better performance, it's recommended to keep the number of labels low. Additionally, label values should not be dynamic (such as session IDs or timestamps), as this can impact performance.

Datasource Documentation

Each datasource in your Grafana setup serves a specific purpose (metrics, CloudWatch, profiling, etc.). For detailed information about each component and datasource, refer to the comprehensive documentation available at: DuploCloud Automation Platform Documentation

This documentation will help you understand which alerts, metrics, and dashboards are appropriate for tracking your specific use cases.
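
As an illustration of the label guidance under Log Labels Control, the following is a minimal Python sketch of a pre-flight check you could run on a label set before shipping logs. It is not part of DuploCloud or Loki: the function name check_labels, the MAX_LABELS constant, and the regular expressions used to spot "dynamic-looking" values are all assumptions made for this example, and the patterns are rough heuristics rather than rules that Loki enforces.

```python
import re

# Approximate per-log-line label limit quoted in this article (assumed constant name).
MAX_LABELS = 25

# Heuristic patterns for values that look dynamic. These are illustrative
# assumptions, not rules enforced by Loki or DuploCloud.
_DYNAMIC_PATTERNS = [
    # UUID-style identifiers, e.g. session IDs
    re.compile(
        r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
        r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
    ),
    # Long all-digit values, e.g. epoch timestamps
    re.compile(r"^\d{10,}$"),
    # Values starting with an ISO 8601 date and time
    re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}"),
]


def check_labels(labels, max_labels=MAX_LABELS):
    """Return a list of warnings for a dict of label name -> label value."""
    problems = []
    if len(labels) > max_labels:
        problems.append(
            f"{len(labels)} labels exceeds the limit of about {max_labels}"
        )
    for name, value in labels.items():
        if any(p.search(value) for p in _DYNAMIC_PATTERNS):
            problems.append(
                f"label '{name}' has a dynamic-looking value: {value}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical label set: two static labels and one session ID.
    sample = {
        "cluster": "prod01",
        "container": "api",
        "session_id": "3f2b8c1e-9d4a-4e7b-8a1c-5f6d7e8a9b0c",
    }
    for problem in check_labels(sample):
        print(problem)
```

A check like this fits best in a test or CI step for your logging configuration. Values such as session IDs or timestamps are usually better kept in the log line body than promoted to labels.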
