huggingface / text-generation-inference
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
File size (lines) | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Overall | 15% | 33% | 33% | 10% | 6%
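
The bucketing behind these percentages can be sketched as follows. This is a minimal illustration, not the report tool's actual implementation. It assumes each percentage is the share of all lines that live in files of that size class (weighted by lines, not by file count), and that shares are truncated to whole percentages, which would be consistent with rows that sum to slightly less than 100%. The names `BUCKETS`, `bucket_for`, and `size_distribution` are hypothetical.

```python
# Bucket labels and lower bounds, taken from the size classes in the report.
BUCKETS = [
    ("1001+", 1001),
    ("501-1000", 501),
    ("201-500", 201),
    ("101-200", 101),
    ("1-100", 1),
]


def bucket_for(line_count):
    """Return the size-class label for a file with `line_count` lines."""
    for label, lower_bound in BUCKETS:
        if line_count >= lower_bound:
            return label
    return None  # files with no lines fall outside every bucket


def size_distribution(line_counts):
    """Map each bucket label to the whole-number percentage of lines it holds.

    Assumption: the share is weighted by lines and truncated, not rounded.
    """
    totals = {label: 0 for label, _ in BUCKETS}
    for count in line_counts:
        label = bucket_for(count)
        if label is not None:
            totals[label] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0 for label in totals}
    return {label: 100 * lines // grand_total for label, lines in totals.items()}


# Toy input (not taken from the repository): three files of 1500, 300 and 200 lines.
print(size_distribution([1500, 300, 200]))
```

With the toy input, the 1500-line file alone accounts for 75% of all lines, which shows how a single very large file can dominate a component's distribution.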


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 12% | 38% | 35% | 8% | 5%
rs | 36% | 18% | 31% | 7% | 5%
cu | 0% | 32% | 39% | 21% | 6%
cuh | 0% | 22% | 21% | 28% | 27%
proto | 0% | 0% | 100% | 0% | 0%
nix | 0% | 0% | 0% | 66% | 33%
cpp | 0% | 0% | 0% | 84% | 15%
hpp | 0% | 0% | 0% | 88% | 11%
toml | 0% | 0% | 0% | 15% | 84%
js | 0% | 0% | 0% | 0% | 100%
cmake | 0% | 0% | 0% | 0% | 100%
h | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
Component (primary) | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
server | 12% | 34% | 35% | 11% | 5%
router | 66% | 9% | 14% | 4% | 5%
backends | 8% | 39% | 37% | 7% | 7%
launcher | 93% | 0% | 0% | 0% | 6%
clients | 0% | 56% | 27% | 0% | 16%
proto | 0% | 0% | 100% | 0% | 0%
benchmark | 0% | 0% | 40% | 51% | 8%
load_tests | 0% | 0% | 58% | 0% | 41%
ROOT | 0% | 0% | 0% | 81% | 18%
nix | 0% | 0% | 0% | 54% | 45%
Longest Files (Top 50)
File | # lines | # units
flash_causal_lm.py
in backends/gaudi/server/text_generation_server/models
2113 26
server.rs
in router/src
2105 1
flash_causal_lm.py
in server/text_generation_server/models
2009 19
main.rs
in launcher/src
1815 35
__init__.py
in server/text_generation_server/models
1742 2
validation.rs
in router/src
1186 9
lib.rs
in router/src
1176 17
flash_llama4_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
1116 53
mpt_modeling.py
in server/text_generation_server/models/custom_modeling
1105 35
idefics_modeling.py
in server/text_generation_server/models/custom_modeling
1048 32
__init__.py
in backends/gaudi/server/text_generation_server/models
984 2
t5_modeling.py
in server/text_generation_server/models/custom_modeling
934 32
vlm_causal_lm.py
in server/text_generation_server/models
931 28
flash_vlm_causal_lm.py
in backends/gaudi/server/text_generation_server/models
856 30
quantize.py
in backends/gaudi/server/text_generation_server/layers/gptq
855 32
quantize.py
in server/text_generation_server/layers/gptq
855 32
mllama.py
in server/text_generation_server/models/custom_modeling
826 31
seq2seq_lm.py
in server/text_generation_server/models
751 10
flash_mllama.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
748 31
qwen2_5_vl.py
in server/text_generation_server/models/custom_modeling
748 27
radix.rs
in backends/v3/src
741 30
seq2seq_lm.py
in backends/gaudi/server/text_generation_server/models
737 10
qwen2_5_vl.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
724 25
flash_gemma3_modeling.py
in server/text_generation_server/models/custom_modeling
724 21
causal_lm.py
in server/text_generation_server/models
713 10
idefics_causal_lm.py
in server/text_generation_server/models
708 10
idefics2.py
in server/text_generation_server/models/custom_modeling
678 30
idefics2.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
677 30
queue.rs
in backends/v3/src
670 5
mamba.py
in server/text_generation_server/models
667 13
bloom_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
652 22
bloom_modeling.py
in server/text_generation_server/models/custom_modeling
652 22
flash_attn_triton.py
in server/text_generation_server/layers/attention
649 11
chat.rs
in router/src
641 7
tokens.py
in backends/gaudi/server/text_generation_server/utils
634 27
flash_dbrx_modeling.py
in server/text_generation_server/models/custom_modeling
632 24
flash_gemma3_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
629 20
backend.rs
in backends/llamacpp/src
614 12
flash_dbrx_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
614 24
opt_modeling.py
in server/text_generation_server/models/custom_modeling
610 18
flash_deepseek_v3_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
604 16
flash_rw_modeling.py
in server/text_generation_server/models/custom_modeling
583 18
flash_llama_modeling.py
in server/text_generation_server/models/custom_modeling
581 14
flash_rw_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
578 18
flash_deepseek_v3_modeling.py
in server/text_generation_server/models/custom_modeling
569 13
neox_modeling.py
in server/text_generation_server/models/custom_modeling
562 23
flash_deepseek_v2_modeling.py
in server/text_generation_server/models/custom_modeling
561 13
flash_llama_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
555 14
mllama_causal_lm.py
in backends/gaudi/server/text_generation_server/models
546 11
q_matrix.cu
in server/exllamav2_kernels/exllamav2_kernels/cuda
544 -
Files With Most Units (Top 50)
File | # lines | # units
flash_llama4_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
1116 53
generator.py
in backends/neuron/server/text_generation_server
501 43
logits_process.py
in backends/gaudi/server/text_generation_server/utils
402 37
logits_process.py
in server/text_generation_server/utils
412 37
weights.py
in backends/gaudi/server/text_generation_server/utils
295 36
main.rs
in launcher/src
1815 35
mpt_modeling.py
in server/text_generation_server/models/custom_modeling
1105 35
weights.py
in server/text_generation_server/utils
290 33
quantize.py
in backends/gaudi/server/text_generation_server/layers/gptq
855 32
idefics_modeling.py
in server/text_generation_server/models/custom_modeling
1048 32
t5_modeling.py
in server/text_generation_server/models/custom_modeling
934 32
quantize.py
in server/text_generation_server/layers/gptq
855 32
flash_mllama.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
748 31
mllama.py
in server/text_generation_server/models/custom_modeling
826 31
idefics2.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
677 30
flash_vlm_causal_lm.py
in backends/gaudi/server/text_generation_server/models
856 30
radix.rs
in backends/v3/src
741 30
idefics2.py
in server/text_generation_server/models/custom_modeling
678 30
vlm_causal_lm.py
in server/text_generation_server/models
931 28
tokens.py
in backends/gaudi/server/text_generation_server/utils
634 27
qwen2_5_vl.py
in server/text_generation_server/models/custom_modeling
748 27
clip.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
466 26
flash_causal_lm.py
in backends/gaudi/server/text_generation_server/models
2113 26
clip.py
in server/text_generation_server/models/custom_modeling
466 26
qwen2_5_vl.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
724 25
flash_dbrx_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
614 24
rotary.py
in backends/gaudi/server/text_generation_server/layers
507 24
flash_dbrx_modeling.py
in server/text_generation_server/models/custom_modeling
632 24
rotary.py
in server/text_generation_server/layers
494 24
idefics3.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
467 23
fp8.py
in backends/gaudi/server/text_generation_server/layers
528 23
tokens.py
in server/text_generation_server/utils
530 23
neox_modeling.py
in server/text_generation_server/models/custom_modeling
562 23
idefics3.py
in server/text_generation_server/models/custom_modeling
468 23
bloom_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
652 22
bloom_modeling.py
in server/text_generation_server/models/custom_modeling
652 22
flash_gemma3_modeling.py
in server/text_generation_server/models/custom_modeling
724 21
flash_gemma3_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
629 20
siglip.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
297 20
siglip.py
in server/text_generation_server/models/custom_modeling
297 20
flash_causal_lm.py
in server/text_generation_server/models
2009 19
flash_rw_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
578 18
lora.py
in backends/gaudi/server/text_generation_server/adapters
370 18
flash_rw_modeling.py
in server/text_generation_server/models/custom_modeling
583 18
opt_modeling.py
in server/text_generation_server/models/custom_modeling
610 18
qwen2_vl.py
in server/text_generation_server/models/custom_modeling
454 18
lora.py
in server/text_generation_server/adapters
385 18
flash_mixtral_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
422 17
lib.rs
in router/src
1176 17
flash_mixtral_modeling.py
in server/text_generation_server/models/custom_modeling
434 17
Files With Long Lines (Top 50)

There are 100 files with lines longer than 120 characters. In total, there are 244 long lines.
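
A count like this can be reproduced with a short script. The sketch below is an illustration only, not the report tool's method. It assumes a line is "long" when it has strictly more than 120 characters, excluding the line terminator, and the function names are hypothetical.

```python
LONG_LINE_THRESHOLD = 120  # characters, as stated in the report


def count_long_lines(text, threshold=LONG_LINE_THRESHOLD):
    """Count the lines in `text` that are strictly longer than `threshold`."""
    return sum(1 for line in text.splitlines() if len(line) > threshold)


def summarize_long_lines(files, threshold=LONG_LINE_THRESHOLD):
    """Summarize long lines across `files`, a mapping of path -> file text.

    Returns (number of files with long lines, total long lines, ranking),
    where ranking lists (path, long-line count) pairs, highest count first.
    """
    per_file = {path: count_long_lines(text, threshold) for path, text in files.items()}
    offenders = {path: n for path, n in per_file.items() if n > 0}
    ranking = sorted(offenders.items(), key=lambda item: item[1], reverse=True)
    return len(offenders), sum(offenders.values()), ranking


# Toy in-memory input (not taken from the repository).
sample = {
    "a.py": "x" * 121 + "\n" + "y" * 120,  # one long line, one exactly at the limit
    "b.py": "short line\n",                 # no long lines
    "c.py": ("z" * 200 + "\n") * 3,         # three long lines
}
print(summarize_long_lines(sample))
```

To run this against a real checkout, the `files` mapping could be built by reading each source file from disk. Which files are included and how tabs are counted are choices the report does not specify.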

File | # lines | # units | # long lines
fused_bloom_attention_cuda.cu
in server/custom_kernels/custom_kernels
219 - 14
fused_attention_cuda.cu
in server/custom_kernels/custom_kernels
219 - 14
q_matrix.cu
in server/exllamav2_kernels/exllamav2_kernels/cuda
544 - 11
q4_matmul.cu
in server/exllama_kernels/exllama_kernels/cuda_func
218 - 10
server.rs
in router/src
2105 1 9
mpt_modeling.py
in server/text_generation_server/models/custom_modeling
1105 35 8
main.rs
in launcher/src
1815 35 6
matrix.cuh
in server/exllama_kernels/exllama_kernels
250 - 6
__init__.py
in server/text_generation_server/models
1742 2 6
__init__.py
in server/text_generation_server/layers/gptq
401 10 6
trtllm.cmake
in backends/trtllm/cmake
40 - 5
exllama_ext.cpp
in server/exllama_kernels/exllama_kernels
198 3 5
qdq_5.cuh
in server/exllamav2_kernels/exllamav2_kernels/cuda/quant
184 - 5
gptq.py
in server/text_generation_server/layers/marlin
390 14 4
matrix_view.cuh
in server/exllamav2_kernels/exllamav2_kernels/cuda
104 - 4
q_gemm_kernel.cuh
in server/exllamav2_kernels/exllamav2_kernels/cuda
507 - 4
ext.cpp
in server/exllamav2_kernels/exllamav2_kernels
115 - 4
main.rs
in backends/trtllm/src
296 - 3
main.rs
in backends/v2/src
203 - 3
flash_phi_moe_modeling.py
in backends/gaudi/server/text_generation_server/models/custom_modeling
134 2 3
__init__.py
in backends/gaudi/server/text_generation_server/models
984 2 3
__init__.py
in backends/gaudi/server/text_generation_server/layers/gptq
371 11 3
backend.rs
in backends/v3/src
450 7 3
main.rs
in backends/v3/src
217 - 3
lib.rs
in router/src
1176 17 3
q4_matrix.cu
in server/exllama_kernels/exllama_kernels/cuda_func
166 - 3
flash_phi_moe_modeling.py
in server/text_generation_server/models/custom_modeling
134 2 3
qdq_3.cuh
in server/exllamav2_kernels/exllamav2_kernels/cuda/quant
146 - 3
q_gemm.cu
in server/exllamav2_kernels/exllamav2_kernels/cuda
198 - 3
backend.rs
in backends/v2/src
402 7 2
bnb.py
in backends/gaudi/server/text_generation_server/layers
93 8 2
mlp.py
in backends/gaudi/server/text_generation_server/layers
214 10 2
cli.py
in backends/gaudi/server/text_generation_server
297 3 2
queue.rs
in backends/v3/src
670 5 2
usage_stats.rs
in router/src
369 5 2
flash_cohere_modeling.py
in server/text_generation_server/models/custom_modeling
448 15 2
flash_causal_lm.py
in server/text_generation_server/models
2009 19 2
bnb.py
in server/text_generation_server/layers
93 8 2
w8a8_int.py
in server/text_generation_server/layers/compressed_tensors
196 10 2
mlp.py
in server/text_generation_server/layers
214 10 2
kv_cache.py
in server/text_generation_server/layers/attention
265 9 2
cli.py
in server/text_generation_server
301 3 2
util.cuh
in server/exllamav2_kernels/exllamav2_kernels/cuda
45 - 2
generation.rs
in benchmark/src
186 1 2
nix
121 - 2
long.js
in load_tests
50 2 1
common.js
in load_tests
50 2 1
backend.hpp
in backends/trtllm/csrc
133 4 1
hardware.hpp
in backends/trtllm/csrc
38 2 1
mod.rs
in backends/v2/src/client
52 2 1