huggingface / transformers-research-projects
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
Lines per file: 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Share: 6% | 28% | 47% | 10% | 6%
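A minimal sketch of how such a distribution could be computed. The bucket boundaries come from the legend above. The function names `bucket_for` and `size_distribution` are hypothetical, and the sketch assumes each percentage is the share of total lines of code that sits in files of that bucket. The report does not state its weighting, so treat that as an assumption.

```python
from collections import Counter

# Bucket lower bounds and labels, taken from the report's legend.
BUCKETS = [
    (1001, "1001+"),
    (501, "501-1000"),
    (201, "201-500"),
    (101, "101-200"),
    (1, "1-100"),
]


def bucket_for(line_count):
    """Return the legend label for a file with the given number of lines."""
    for lower_bound, label in BUCKETS:
        if line_count >= lower_bound:
            return label
    return None  # empty files fall outside every bucket


def size_distribution(line_counts):
    """Return {label: percent of total lines} for a list of per-file line counts.

    Assumption: shares are weighted by lines of code, not by file count.
    """
    lines_per_bucket = Counter()
    for count in line_counts:
        label = bucket_for(count)
        if label is not None:
            lines_per_bucket[label] += count
    total = sum(lines_per_bucket.values())
    return {
        label: (100.0 * lines_per_bucket[label] / total if total else 0.0)
        for _, label in BUCKETS
    }
```

Calling `size_distribution` with the line counts of every file in one folder or with one extension would give a row in the same shape as the per-extension and per-component breakdowns in this report.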


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 6% | 27% | 48% | 11% | 6%
ipynb | 0% | 100% | 0% | 0% | 0%
yaml | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
visual_bert | 54% | 0% | 35% | 9% | 0%
lxmert | 54% | 0% | 35% | 9% | 0%
movement-pruning | 0% | 90% | 0% | 3% | 6%
xtreme-s | 0% | 100% | 0% | 0% | 0%
distillation | 0% | 38% | 36% | 0% | 25%
rag-end2end-retriever | 0% | 36% | 32% | 22% | 8%
pplm | 0% | 61% | 37% | 0% | <1%
deebert | 0% | 63% | 26% | 10% | <1%
self-training-text-classification | 0% | 68% | 31% | 0% | 0%
bert-loses-patience | 0% | 60% | 0% | 39% | <1%
bertabs | 0% | 56% | 22% | 10% | 10%
longform-qa | 0% | 67% | 32% | 0% | 0%
onnx | 0% | 71% | 0% | 18% | 9%
luke | 0% | 89% | 0% | 0% | 10%
robust-speech-event | 0% | 49% | 43% | 0% | 7%
rag | 0% | 28% | 31% | 17% | 21%
jax-projects | 0% | 0% | 90% | 4% | 5%
seq2seq-distillation | 0% | 0% | 71% | 21% | 6%
tapex | 0% | 0% | 89% | 10% | 0%
quantization-qdqbert | 0% | 0% | 87% | 9% | 3%
performer | 0% | 0% | 100% | 0% | 0%
wav2vec2 | 0% | 0% | 86% | 13% | 0%
synthid_text | 0% | 0% | 100% | 0% | 0%
bertology | 0% | 0% | 100% | 0% | 0%
information-gain-filtration | 0% | 0% | 99% | 0% | <1%
mm-imdb | 0% | 0% | 81% | 18% | 0%
layoutlmv3 | 0% | 0% | 100% | 0% | 0%
mlm_wwm | 0% | 0% | 72% | 27% | 0%
codeparrot | 0% | 0% | 26% | 58% | 14%
zero-shot-distillation | 0% | 0% | 100% | 0% | 0%
vqgan-clip | 0% | 0% | 64% | 0% | 35%
adversarial | 0% | 0% | 55% | 44% | 0%
decision_transformer | 0% | 0% | 0% | 100% | 0%
fsner | 0% | 0% | 0% | 0% | 100%
token-healing | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | # lines | # units
modeling_frcnn.py (in visual_bert) | 1337 | 85
[name missing] | 1337 | 85
masked_run_squad.py (in movement-pruning) | 935 | 8
masked_run_glue.py (in movement-pruning) | 798 | 7
run_xtreme_s.py (in xtreme-s) | 729 | 5
[name missing] | 700 | 6
finetune_rag.py (in rag-end2end-retriever) | 689 | 24
[name missing] | 646 | 9
Saving_PruneBERT.ipynb (in movement-pruning) | 645 | -
modeling_bert_masked.py (in movement-pruning/emmental) | 623 | 34
[name missing] | 618 | 6
finetuning.py (in self-training-text-classification) | 617 | 4
run_glue_with_pabee.py (in bert-loses-patience) | 611 | 5
[name missing] | 605 | 49
eli5_utils.py (in longform-qa) | 582 | 35
generation_onnx.py (in onnx/summarization/bart_onnx) | 566 | 33
[name missing] | 565 | 2
run_speech_recognition_ctc_bnb.py (in robust-speech-event) | 555 | 5
[name missing] | 550 | 25
[name missing] | 497 | 3
run_quant_qa.py (in quantization-qdqbert) | 495 | 3
[name missing] | 493 | 3
run_clm_mp.py (in jax-projects/model_parallel) | 479 | 6
[name missing] | 478 | 3
utils.py (in seq2seq-distillation) | 472 | 51
run_mmimdb.py (in mm-imdb) | 461 | 5
visualizing_image.py (in visual_bert) | 451 | 11
[name missing] | 451 | 11
[name missing] | 439 | 13
[name missing] | 437 | 20
utils.py (in visual_bert) | 426 | 26
distiller.py (in distillation) | 426 | 11
utils.py (in lxmert) | 426 | 26
run_wav2vec2_pretrain_flax.py (in jax-projects/wav2vec2) | 425 | 7
run_mlm_flax_stream.py (in jax-projects/dataset-streaming) | 416 | 12
run_funsd_cord.py (in layoutlmv3) | 407 | 3
run_hybrid_clip.py (in jax-projects/hybrid_clip) | 406 | 12
[name missing] | 401 | 15
detector_training.py (in synthid_text) | 391 | 2
finetune.py (in seq2seq-distillation) | 387 | 22
_test_seq2seq_examples.py (in seq2seq-distillation) | 384 | 14
[name missing] | 358 | 4
lightning_base.py (in rag-end2end-retriever) | 343 | 22
run_asr.py (in wav2vec2) | 337 | 7
[name missing] | 335 | 21
[name missing] | 334 | 3
[name missing] | 333 | 18
lightning_base.py (in seq2seq-distillation) | 329 | 20
run_bertology.py (in bertology) | 327 | 6
run_mlm_wwm.py (in mlm_wwm) | 321 | 5
Files With Most Units (Top 50)
File | # lines | # units
modeling_frcnn.py (in visual_bert) | 1337 | 85
[name missing] | 1337 | 85
utils.py (in seq2seq-distillation) | 472 | 51
[name missing] | 605 | 49
eli5_utils.py (in longform-qa) | 582 | 35
modeling_bert_masked.py (in movement-pruning/emmental) | 623 | 34
generation_onnx.py (in onnx/summarization/bart_onnx) | 566 | 33
utils.py (in visual_bert) | 426 | 26
utils.py (in lxmert) | 426 | 26
[name missing] | 550 | 25
finetune_rag.py (in rag-end2end-retriever) | 689 | 24
finetune.py (in seq2seq-distillation) | 387 | 22
lightning_base.py (in rag-end2end-retriever) | 343 | 22
[name missing] | 335 | 21
lightning_base.py (in seq2seq-distillation) | 329 | 20
[name missing] | 187 | 20
utils_rag.py (in rag-end2end-retriever) | 187 | 20
[name missing] | 437 | 20
bigbird_flax.py (in jax-projects/big_bird) | 252 | 18
[name missing] | 333 | 18
[name missing] | 255 | 16
utils_hans.py (in adversarial) | 207 | 15
[name missing] | 401 | 15
_test_seq2seq_examples.py (in seq2seq-distillation) | 384 | 14
quant_trainer.py (in quantization-qdqbert) | 202 | 14
minhash_deduplication.py (in codeparrot/scripts) | 131 | 13
VQGAN_CLIP.py (in vqgan-clip) | 220 | 13
[name missing] | 439 | 13
run_hybrid_clip.py (in jax-projects/hybrid_clip) | 406 | 12
run_mlm_flax_stream.py (in jax-projects/dataset-streaming) | 416 | 12
[name missing] | 155 | 12
preprocessing.py (in codeparrot/scripts) | 149 | 12
distillation.py (in seq2seq-distillation) | 248 | 11
igf.py (in information-gain-filtration/igf) | 206 | 11
distributed_ray_retriever.py (in rag-end2end-retriever) | 103 | 11
visualizing_image.py (in visual_bert) | 451 | 11
distiller.py (in distillation) | 426 | 11
[name missing] | 451 | 11
[name missing] | 240 | 11
modeling_hybrid_clip.py (in jax-projects/hybrid_clip) | 261 | 10
utils.py (in synthid_text) | 274 | 10
codeparrot_training.py (in codeparrot/scripts) | 264 | 10
[name missing] | 82 | 9
modeling_pabee_albert.py (in bert-loses-patience/pabee) | 198 | 9
modeling_pabee_bert.py (in bert-loses-patience/pabee) | 196 | 9
lm_seqs_dataset.py (in distillation) | 91 | 9
[name missing] | 646 | 9
utils_mmimdb.py (in mm-imdb) | 104 | 9
[name missing] | 75 | 9
callbacks.py (in seq2seq-distillation) | 89 | 8
Files With Long Lines (Top 7)

There are 7 files with lines longer than 120 characters. In total, there are 19 long lines.

File | # lines | # units | # long lines
Saving_PruneBERT.ipynb (in movement-pruning) | 645 | - | 13
run_clm_mp.py (in jax-projects/model_parallel) | 479 | 6 | 1
modeling_frcnn.py (in visual_bert) | 1337 | 85 | 1
utils.py (in visual_bert) | 426 | 26 | 1
[name missing] | 1337 | 85 | 1
utils.py (in lxmert) | 426 | 26 | 1
[name missing] | 38 | 1 | 1
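
The long-line count can be approximated with a short script. This is only a sketch under stated assumptions: the 120-character threshold is the one quoted in this section, the helper names `count_long_lines` and `files_with_long_lines` are hypothetical, and the tool that produced this report may measure line length differently (for example, in how it treats tabs or trailing whitespace).

```python
def count_long_lines(text, limit=120):
    """Count lines in `text` that are strictly longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def files_with_long_lines(sources, limit=120):
    """Map file name -> number of long lines, keeping only files that have any.

    `sources` is a dict of {file name: file contents}; reading the files
    from disk is left to the caller.
    """
    result = {}
    for name, text in sources.items():
        n = count_long_lines(text, limit)
        if n > 0:
            result[name] = n
    return result
```

Summing the values of the returned dictionary gives the total number of long lines, and its length gives the number of affected files, which are the two figures quoted at the top of this section.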