Path  Lines of Code
README.md  147
applications/jupyter/README.md  3
applications/rag/README.md  3
applications/ray/README.md  41
applications/ray/tfvars_examples/README.md  10
benchmarks/65k-cpu-nodes-simulated-ai-benchmark.md  88
benchmarks/README.md  6
benchmarks/accelerator-based-ai-benchmark.md  159
benchmarks/benchmark/README.md  4
benchmarks/benchmark/dataset/README.md  2
benchmarks/benchmark/dataset/ShareGPT_v3_unflitered_cleaned_split/README.md  23
benchmarks/benchmark/dataset/ShareGPT_v3_unflitered_cleaned_split/requirements.txt  2
benchmarks/benchmark/tools/README.md  1
benchmarks/benchmark/tools/dlio/README.md  97
benchmarks/benchmark/tools/locust-load-inference/README.md  191
benchmarks/benchmark/tools/locust-load-inference/locust-custom-exporter/README.md  37
benchmarks/benchmark/tools/locust-load-inference/locust-custom-exporter/go.mod  7
benchmarks/benchmark/tools/locust-load-inference/locust-docker/locust-tasks/requirements.txt  37
benchmarks/benchmark/tools/locust-load-inference/locust-runner/requirements.txt  7
benchmarks/benchmark/tools/model-load-benchmark/README.md  86
benchmarks/benchmark/tools/model-load-benchmark/benchmarker.ini  2
benchmarks/benchmark/tools/model-load-benchmark/go.mod  53
benchmarks/benchmark/tools/model-load-benchmark/requirements.txt  2
benchmarks/benchmark/tools/profile-generator/README.md  148
benchmarks/benchmark/tools/profile-generator/container/requirements.txt  36
benchmarks/inference-server/README.md  8
benchmarks/inference-server/jetstream/README.md  112
benchmarks/inference-server/text-generation-inference/README.md  96
benchmarks/inference-server/text-generation-inference/autoscaling.md  20
benchmarks/inference-server/triton/README.md  126
benchmarks/inference-server/vllm/README.md  101
benchmarks/infra/accelerator-cluster/README.md  27
benchmarks/infra/accelerator-cluster/stage-1/README.md  73
benchmarks/infra/accelerator-cluster/stage-2/README.md  86
benchmarks/orchestration/README.md  10
best-practices/README.md  9
best-practices/gke-batch-refarch/README.md  541
best-practices/gke-batch-refarch/jobset/requirements.txt  10
best-practices/gke-batch-refarch/platform/monitoring/kueue-dashboard.json  1372
best-practices/hotswap.md  132
best-practices/ml-platform/README.md  5
best-practices/startup-latency.md  174
charts/tpu-dra-driver/README.md  8
contributing.md  27
gke-batch-refarch/README.md  1
infrastructure/README.md  91
jupyter-on-gke/README.md  3
modules/cloudsql/README.md  13
modules/custom-metrics-stackdriver-adapter/README.md  53
modules/gcs/README.md  3
modules/gke-autopilot-private-cluster/README.md  33
modules/gke-autopilot-public-cluster/README.md  27
modules/gke-standard-private-cluster/README.md  44
modules/gke-standard-public-cluster/README.md  38
modules/inference-service/README.md  2
modules/jetstream-maxtext-deployment/README.md  153
modules/jupyter/jupyter_image/notebook_image/README.md  8
modules/jupyter/jupyter_image/notebook_image/requirements.txt  5
modules/kuberay-cluster/kuberay_image/requirements.txt  9
modules/prometheus-adapter/README.md  15
ray-on-gke/README.md  98
ray-on-gke/examples/notebooks/gpt-j-online.ipynb  285
ray-on-gke/examples/notebooks/ray-dist-mnist.ipynb  274
ray-on-gke/examples/notebooks/ray-fine-tune-hugging-face.ipynb  431
ray-on-gke/examples/notebooks/ray_basic.ipynb  207
ray-on-gke/examples/notebooks/ray_mnist.ipynb  580
ray-on-gke/examples/notebooks/stable-diffusion-tpu.ipynb  358
ray-on-gke/examples/notebooks/stable_diffusion.ipynb  202
ray-on-gke/examples/tfvars/README.md  10
ray-on-gke/examples/tfvars/raycluster-with-gcsfuse-volumes.tfvars  34
ray-on-gke/examples/tfvars/raycluster-with-grafana-dashboard.tfvars  32
ray-on-gke/examples/tfvars/raycluster-with-iap-auth.tfvars  49
ray-on-gke/examples/tfvars/raycluster-with-workload-identity.tfvars  35
ray-on-gke/examples/tfvars/simple-raycluster-with-existing-gke-cluster.tfvars  32
ray-on-gke/examples/tfvars/simple-raycluster-with-new-gke-cluster.tfvars  32
ray-on-gke/guides/observability/README.md  41
ray-on-gke/guides/raytrain-with-gcsfusecsi/README.md  84
ray-on-gke/guides/tpu/README.md  80
ray-on-gke/templates/README.md  41
ray-on-gke/templates/tfvars_examples/README.md  10
ray-on-gke/tpu/kuberay-tpu-webhook/README.md  1
ray-on-gke/tpu/kuberay-tpu-webhook/Troubleshooting.md  45
ray-on-gke/tpu/kuberay-tpu-webhook/go.mod  68
ray-on-gke/tpu/kuberay-tpu-webhook/samples/tpu-test.py  20
security_test/README.md  73
security_test/allowlist/category/cluster/continuous-image-puller/capabilities.json  17
security_test/allowlist/category/cluster/continuous-image-puller/distroless.json  34
security_test/allowlist/category/cluster/continuous-image-puller/imagedigest.json  34
security_test/allowlist/category/cluster/continuous-image-puller/imagefreshness.json  34
security_test/allowlist/category/cluster/continuous-image-puller/imagepath.json  34
security_test/allowlist/category/cluster/continuous-image-puller/readonlyrootfs.json  17
security_test/allowlist/category/cluster/continuous-image-puller/sbom.json  34
security_test/allowlist/category/cluster/continuous-image-puller/seccompprofile.json  13
security_test/allowlist/category/cluster/hub/capabilities.json  17
security_test/allowlist/category/cluster/hub/distroless.json  18
security_test/allowlist/category/cluster/hub/imagedigest.json  18
security_test/allowlist/category/cluster/hub/imagefreshness.json  18
security_test/allowlist/category/cluster/hub/imagepath.json  18
security_test/allowlist/category/cluster/hub/rbac.json  200
security_test/allowlist/category/cluster/hub/readonlyrootfs.json  17
security_test/allowlist/category/cluster/hub/seccompprofile.json  13
security_test/allowlist/category/cluster/kuberay-operator-leader-election/rbac.json  129
security_test/allowlist/category/cluster/kuberay-operator/rbac.json  1454
security_test/allowlist/category/cluster/mistral-7b-instruct/allowprivilegeescalation.json  17
security_test/allowlist/category/cluster/mistral-7b-instruct/capabilities.json  17
security_test/allowlist/category/cluster/mistral-7b-instruct/distroless.json  34
security_test/allowlist/category/cluster/mistral-7b-instruct/imagedigest.json  34
security_test/allowlist/category/cluster/mistral-7b-instruct/imagefreshness.json  34
security_test/allowlist/category/cluster/mistral-7b-instruct/imagepath.json  34
security_test/allowlist/category/cluster/mistral-7b-instruct/readonlyrootfs.json  17
security_test/allowlist/category/cluster/mistral-7b-instruct/rootless.json  17
security_test/allowlist/category/cluster/mistral-7b-instruct/sbom.json  34
security_test/allowlist/category/cluster/mistral-7b-instruct/seccompprofile.json  13
security_test/allowlist/category/cluster/proxy/capabilities.json  17
security_test/allowlist/category/cluster/proxy/distroless.json  18
security_test/allowlist/category/cluster/proxy/imagedigest.json  18
security_test/allowlist/category/cluster/proxy/imagefreshness.json  18
security_test/allowlist/category/cluster/proxy/imagepath.json  18
security_test/allowlist/category/cluster/proxy/readonlyrootfs.json  17
security_test/allowlist/category/cluster/proxy/sbom.json  18
security_test/allowlist/category/cluster/proxy/seccompprofile.json  13
security_test/allowlist/category/cluster/rag-frontend/allowprivilegeescalation.json  32
security_test/allowlist/category/cluster/rag-frontend/capabilities.json  32
security_test/allowlist/category/cluster/rag-frontend/distroless.json  18
security_test/allowlist/category/cluster/rag-frontend/imagedigest.json  18
security_test/allowlist/category/cluster/rag-frontend/imagefreshness.json  18
security_test/allowlist/category/cluster/rag-frontend/imagepath.json  34
security_test/allowlist/category/cluster/rag-frontend/readonlyrootfs.json  32
security_test/allowlist/category/cluster/rag-frontend/rootless.json  32
security_test/allowlist/category/cluster/rag-frontend/sbom.json  34
security_test/allowlist/category/cluster/rag-frontend/seccompprofile.json  13
security_test/allowlist/category/cluster/ray-cluster-kuberay/rbac.json  58
security_test/allowlist/category/cluster/rayjob-editor-role/rbac.json  146
security_test/allowlist/category/cluster/rayjob-viewer-role/rbac.json  74
security_test/allowlist/category/cluster/rayservice-editor-role/rbac.json  146
security_test/allowlist/category/cluster/rayservice-viewer-role/rbac.json  74
security_test/allowlist/category/helm/iap/defaultnamespace.json  21
security_test/allowlist/category/helm/kuberay-tpu-webhook/allowprivilegeescalation.json  17
security_test/allowlist/category/helm/kuberay-tpu-webhook/capabilities.json  17
security_test/allowlist/category/helm/kuberay-tpu-webhook/imagedigest.json  18
security_test/allowlist/category/helm/kuberay-tpu-webhook/imagefreshness.json  18
security_test/allowlist/category/helm/kuberay-tpu-webhook/imagepath.json  18
security_test/allowlist/category/helm/kuberay-tpu-webhook/readonlyrootfs.json  17
security_test/allowlist/category/helm/kuberay-tpu-webhook/rootless.json  17
security_test/allowlist/category/helm/kuberay-tpu-webhook/seccompprofile.json  13
slurm-on-gke/README.md  318
tools/dcgm-on-gke/grafana/gke-dcgm-grafana-dashboard.json  1147
tools/dcgm-on-gke/grafana/proxy/README.md  28
tools/dcgm-on-gke/grafana/proxy/requirements.txt  1
tools/gke-disk-image-builder/README.md  198
tools/gke-disk-image-builder/go.mod  36
tools/saxml-on-gke/httpserver/README.md  277
tools/saxml-on-gke/httpserver/requirements-build.txt  3
tools/saxml-on-gke/httpserver/requirements.txt  3
tpu-provisioner/README.md  111
tpu-provisioner/admission_controller/README.md  54
tpu-provisioner/admission_controller/certificates/README.md  4
tpu-provisioner/admission_controller/requirements.txt  15
tpu-provisioner/examples/jobset.yaml  45
tpu-provisioner/go.mod  86
tpu-provisioner/internal/auth/gcp/README.md  2
tutorials-and-examples/cloudshell-tutorial.md  96
tutorials-and-examples/flyte/README.md  3
tutorials-and-examples/genAI-LLM/e2e-genai-langchain-app/README.md  5
tutorials-and-examples/genAI-LLM/e2e-genai-langchain-app/src/backend/requirements.txt  1932
tutorials-and-examples/genAI-LLM/e2e-genai-langchain-app/src/frontend/package-lock.json  8496
tutorials-and-examples/genAI-LLM/e2e-genai-langchain-app/src/frontend/package.json  33
tutorials-and-examples/genAI-LLM/e2e-genai-langchain-app/src/frontend/tsconfig.json  17
tutorials-and-examples/genAI-LLM/finetuning-gemma-2b-on-l4/README.md  3
tutorials-and-examples/gpu-examples/a100-jax/README.md  4
tutorials-and-examples/gpu-examples/online-serving-single-gpu/README.md  2
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/client/tfserve-requirements.txt  3
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/client/triton-requirements.txt  8
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/tfserve-model-repository/mnist/1/keras_metadata.pb  10
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/tfserve-model-repository/mnist/1/saved_model.pb  5116
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/triton-model-repository/mnist/1/model.savedmodel/keras_metadata.pb  10
tutorials-and-examples/gpu-examples/online-serving-single-gpu/src/triton-model-repository/mnist/1/model.savedmodel/saved_model.pb  5116
tutorials-and-examples/gpu-examples/training-single-gpu/README.md  2
tutorials-and-examples/gpu-examples/training-single-gpu/src/tensorflow-mnist-example/requirements.txt  1
tutorials-and-examples/hf-tgi/README.md  3
tutorials-and-examples/inference-servers/checkpoints/README.md  3
tutorials-and-examples/inference-servers/jetstream/README.md  6
tutorials-and-examples/inference-servers/jetstream/maxtext/single-host-inference/README.md  188
tutorials-and-examples/inference-servers/jetstream/pytorch/single-host-inference/README.md  270
tutorials-and-examples/inference-servers/maxdiffusion/README.md  6
tutorials-and-examples/kserve/README.md  200
tutorials-and-examples/langchain-chatbot/README.md  3
tutorials-and-examples/llamaindex/rag/README.md  3
tutorials-and-examples/metaflow/README.md  3
tutorials-and-examples/mlflow/finetune-gemma/README.md  3
tutorials-and-examples/models-as-oci/README.md  3
tutorials-and-examples/nvidia-bionemo/README.md  5
tutorials-and-examples/nvidia-bionemo/fine-tuning/README.md  165
tutorials-and-examples/nvidia-bionemo/pretraining/README.md  105
tutorials-and-examples/nvidia-bionemo/requirements.txt  2
tutorials-and-examples/nvidia-nim/README.md  135
tutorials-and-examples/nvidia-nim/blueprints/README.md  16
tutorials-and-examples/nvidia-nim/blueprints/digitalhuman/README.md  256
tutorials-and-examples/nvidia-nim/blueprints/digitalhuman/https.md  121
tutorials-and-examples/nvidia-nim/blueprints/drugdiscovery/README.md  152
tutorials-and-examples/skypilot/README.md  3
tutorials-and-examples/skypilot/dws-and-kueue/README.md  3
tutorials-and-examples/storage/hyperdisk-ml/README.md  189
tutorials-and-examples/storage/parallelstore-backup-and-recovery/README.md  88
tutorials-and-examples/tpu-examples/single-host-inference/jax/requirements.txt  8
tutorials-and-examples/tpu-examples/single-host-inference/jax/stable-diffusion/README.md  4
tutorials-and-examples/tpu-examples/single-host-inference/pt/densenet161/requirements.txt  1
tutorials-and-examples/tpu-examples/single-host-inference/tf/resnet50/requirements.txt  4
tutorials-and-examples/tpu-examples/training/gpt/fsdp_config.json  8
tutorials-and-examples/tpu-examples/training/gpt/my_config_2.json  31
tutorials-and-examples/tpu-examples/training/mnist-single-tpu/README.md  4
tutorials-and-examples/tpu-examples/training/mnist-single-tpu/src/tensorflow-mnist-example/requirements.txt  1
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/README.md  198
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/README.md  26
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/package-lock.json  5680
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/package.json  34
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/public/vercel.svg  1
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/tmp/test.txt  1
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/demo-website/tsconfig.json  26
tutorials-and-examples/vector-databases/NEXT-2024-Weaviate-Demo/first_99_objects.json  1091
tutorials-and-examples/vector-databases/readme.md  1
tutorials-and-examples/workflow-orchestration/dws-examples/README.md  15
tutorials-and-examples/workflow-orchestration/dws-multiclusters-example/README.md  53
tutorials-and-examples/workflow-orchestration/indexed-job/README.md  142
tutorials-and-examples/workflow-orchestration/jobset/pytorch/README.md  55