integrations/tgi/documentation.yaml (41 lines of code) (raw):

exporter_type: included app_name_short: TGI app_name: Text Generation Inference app_site_name: {{app_name_short}} app_site_url: https://huggingface.co/docs/text-generation-inference/en/index exporter_name: the {{app_name_short}} exporter exporter_repo_url: https://huggingface.co/docs/text-generation-inference/en/reference/metrics gke_setup_url: /kubernetes-engine/docs/tutorials/serve-gemma-gpu-tgi additional_prereq_info: | {{app_site_name}} exposes Prometheus-format metrics automatically; you do not have to install it separately. To verify that {{exporter_name}} is emitting metrics on the expected endpoints, do the following: 1. Set up port forwarding by using the following command: <pre class="devsite-click-to-copy"> kubectl -n {{namespace_name}} port-forward {{pod_name}} 8080:8080 </pre> 2. Access the endpoint `localhost:8080/metrics` by using the browser or the `curl` utility in another terminal session. dashboard_available: true multiple_dashboards: false dashboard_display_name: {{app_name_short}} Prometheus Overview podmonitoring_config: | apiVersion: monitoring.googleapis.com/v1 kind: PodMonitoring metadata: name: tgi labels: app.kubernetes.io/name: tgi app.kubernetes.io/part-of: google-cloud-managed-prometheus spec: endpoints: - port: 8080 scheme: http interval: 30s path: /metrics selector: matchLabels: app: tgi-gemma-server additional_podmonitoring_info: | Ensure that the values of the `port` and `matchLabels` fields match those of the {{app_name_short}} pods you want to monitor. sample_promql_query: up{job="tgi", cluster="{{cluster_name}}", namespace="{{namespace_name}}"}