pytorch / fairseq
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show the distribution of file sizes.
  • Files are classified in five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
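The size categories listed above can be expressed as a small lookup. The following Python sketch is illustrative only: the names `BUCKETS` and `classify` are assumptions and not part of the report's tooling, and the report does not state how blank lines or comments are counted. The sample sizes are taken from the file tables in this report.

```python
# Inclusive upper bounds for each category, as listed in the intro.
BUCKETS = [
    (100, "very small"),
    (200, "small"),
    (500, "medium size"),
    (1000, "long"),
]


def classify(lines_of_code: int) -> str:
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a file must have at least one line of code")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    # Anything above 1000 lines falls into the last category.
    return "very long"


# Line counts as reported for a few fairseq files.
sizes = {
    "emformer.py": 1323,
    "lightconv.py": 789,
    "indexed_dataset.py": 453,
    "data_cfg.py": 190,
    "model_camembert.py": 37,
}

categories = {name: classify(n) for name, n in sizes.items()}
for name, label in categories.items():
    print(f"{name}: {label}")
```

The bounds are inclusive, so a file with exactly 100 lines is still "very small" and one with 101 lines is "small".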
File Size Overall
  • There are 453 files with 65,518 lines of code.
    • 5 very long files (5,678 lines of code)
    • 18 long files (10,620 lines of code)
    • 84 medium size files (26,313 lines of code)
    • 91 small files (12,880 lines of code)
    • 255 very small files (10,027 lines of code)
Share of lines of code per category:
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
8% | 16% | 40% | 19% | 15%
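The percentages above appear to follow from the per-category line counts: each figure is a category's lines of code divided by the 65,518 total. The sketch below reproduces them under the assumption that the report truncates rather than rounds (an inference from the numbers, which sum to 98%, not something the report states). The variable names are illustrative.

```python
# Lines of code per category, as listed under "File Size Overall".
lines_per_category = {
    "1001+": 5_678,
    "501-1000": 10_620,
    "201-500": 26_313,
    "101-200": 12_880,
    "1-100": 10_027,
}

total = sum(lines_per_category.values())

# Integer division truncates toward zero; this is an assumed
# convention that happens to match the figures shown in the report.
shares = {
    category: count * 100 // total
    for category, count in lines_per_category.items()
}

print(total)
print(" | ".join(f"{share}%" for share in shares.values()))
```

With rounding instead of truncation, the 101-200 category (about 19.7%) would show as 20%, which is why truncation is the more likely convention here.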
File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 9% | 16% | 40% | 18% | 14%
cu | 0% | 0% | 60% | 31% | 7%
cpp | 0% | 0% | 0% | 53% | 46%
pyx | 0% | 0% | 0% | 100% | 0%
yaml | 0% | 0% | 0% | 0% | 100%
cuh | 0% | 0% | 0% | 0% | 100%
lua | 0% | 0% | 0% | 0% | 100%
h | 0% | 0% | 0% | 0% | 100%
cfg | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
fairseq/models | 12% | 27% | 42% | 12% | 4%
fairseq | 19% | 31% | 33% | 10% | 4%
fairseq/dataclass | 69% | 0% | 24% | 0% | 5%
fairseq/data | 9% | 4% | 40% | 27% | 17%
fairseq/modules | 0% | 14% | 25% | 27% | 32%
fairseq/model_parallel | 0% | 32% | 37% | 15% | 14%
fairseq/distributed | 0% | 65% | 0% | 0% | 34%
fairseq/tasks | 0% | 7% | 73% | 15% | 3%
fairseq_cli | 0% | 0% | 85% | 6% | 7%
fairseq/optim | 0% | 0% | 23% | 35% | 41%
fairseq/logging | 0% | 0% | 81% | 18% | <1%
fairseq/criterions | 0% | 0% | 20% | 56% | 22%
fairseq/clib | 0% | 0% | 34% | 35% | 29%
ROOT | 0% | 0% | 79% | 0% | 20%
scripts | 0% | 0% | 0% | 32% | 67%
fairseq/scoring | 0% | 0% | 0% | 42% | 57%
fairseq/config | 0% | 0% | 0% | 0% | 100%
fairseq/benchmark | 0% | 0% | 0% | 0% | 100%
scripts/constraints | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Location | # lines | # units
emformer.py | fairseq/models/speech_to_text/modules | 1323 | 46
trainer.py | fairseq | 1219 | 57
wav2vec2.py | fairseq/models/wav2vec | 1057 | 31
configs.py | fairseq/dataclass | 1050 | 11
multilingual_data_manager.py | fairseq/data/multilingual | 1029 | 41
lightconv.py | fairseq/models | 789 | 29
sequence_generator.py | fairseq | 694 | 24
checkpoint_utils.py | fairseq | 672 | 16
model.py | fairseq/model_parallel/models/pipeline_parallel_transformer | 625 | 28
utils.py | fairseq/distributed | 608 | 36
utils.py | fairseq | 608 | 60
lstm.py | fairseq/models | 596 | 26
xm_transformer.py | fairseq/models/speech_to_text | 584 | 28
s2s_transformer.py | fairseq/models/speech_to_speech | 583 | 21
wav2vec2_asr.py | fairseq/models/wav2vec | 582 | 29
model.py | fairseq/models/roberta | 557 | 29
kmeans_attention.py | fairseq/modules | 549 | 53
fconv.py | fairseq/models | 544 | 31
wav2vec.py | fairseq/models/wav2vec | 540 | 21
fconv_self_att.py | fairseq/models | 536 | 26
iterators.py | fairseq/data | 532 | 53
multihead_attention.py | fairseq/modules | 516 | 13
online_backtranslation.py | fairseq/tasks | 505 | 30
multilingual_language_modeling.py | fairseq/tasks | 495 | 13
transformer_lm.py | fairseq/models | 485 | 22
search.py | fairseq | 474 | 27
speech_to_text_dataset.py | fairseq/data/audio | 463 | 28
hubert.py | fairseq/models/hubert | 454 | 15
indexed_dataset.py | fairseq/data | 453 | 63
s2t_transformer.py | fairseq/models/speech_to_text | 451 | 26
layers.py | fairseq/model_parallel/models/pipeline_parallel_transformer | 439 | 21
berard.py | fairseq/models/speech_to_text | 432 | 18
transformer_layer.py | fairseq/modules | 427 | 23
data_utils.py | fairseq/data | 427 | 20
levenshtein_transformer.py | fairseq/models/nat | 418 | 15
train.py | fairseq_cli | 413 | 9
text_to_speech.py | fairseq/tasks | 401 | 22
speech_to_speech.py | fairseq/tasks | 400 | 20
semisupervised_translation.py | fairseq/tasks | 399 | 11
translation.py | fairseq/tasks | 397 | 13
fp16_optimizer.py | fairseq/optim | 387 | 43
fairseq_task.py | fairseq/tasks | 380 | 42
convtransformer.py | fairseq/models/speech_to_text | 374 | 17
tts_transformer.py | fairseq/models/text_to_speech | 374 | 18
utils.py | fairseq/dataclass | 373 | 16
fastspeech2.py | fairseq/models/text_to_speech | 368 | 23
multilingual_translation.py | fairseq/tasks | 364 | 23
nonautoregressive_transformer.py | fairseq/models/nat | 359 | 19
sampled_multi_dataset.py | fairseq/data/multilingual | 356 | 24
transformer_decoder.py | fairseq/models/transformer | 349 | 15
Files With Most Units (Top 20)
File | Location | # lines | # units
indexed_dataset.py | fairseq/data | 453 | 63
utils.py | fairseq | 608 | 60
trainer.py | fairseq | 1219 | 57
kmeans_attention.py | fairseq/modules | 549 | 53
iterators.py | fairseq/data | 532 | 53
progress_bar.py | fairseq/logging | 346 | 53
meters.py | fairseq/logging | 233 | 48
fairseq_model.py | fairseq/models | 311 | 47
token_generation_constraints.py | fairseq | 263 | 47
emformer.py | fairseq/models/speech_to_text/modules | 1323 | 46
fp16_optimizer.py | fairseq/optim | 387 | 43
fairseq_task.py | fairseq/tasks | 380 | 42
multilingual_data_manager.py | fairseq/data/multilingual | 1029 | 41
data_cfg.py | fairseq/data/audio | 190 | 38
utils.py | fairseq/distributed | 608 | 36
wav2vec2.py | fairseq/models/wav2vec | 1057 | 31
fconv.py | fairseq/models | 544 | 31
dictionary.py | fairseq/data | 310 | 31
online_backtranslation.py | fairseq/tasks | 505 | 30
huffman_mmap_indexed_dataset.py | fairseq/data/huffman | 194 | 30
Files With Long Lines (Top 18)

There are 18 files with lines longer than 120 characters. In total, there are 65 long lines.
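The long-line count above can be approximated with a few lines of Python. This is a minimal sketch, not the report's actual method: the function name `count_long_lines` is hypothetical, and it assumes that line length is measured in characters excluding the line terminator, with tabs counted as one character each.

```python
def count_long_lines(text: str, limit: int = 120) -> int:
    """Count lines in `text` that are longer than `limit` characters."""
    # splitlines() drops the line terminators, so they do not
    # contribute to the measured length.
    return sum(1 for line in text.splitlines() if len(line) > limit)


# A line of exactly 120 characters is not "longer than 120".
sample = "a" * 120 + "\n" + "b" * 121 + "\n" + "short line\n"
long_line_count = count_long_lines(sample)
print(long_line_count)
```

Applied to each source file in turn, a check like this could feed the "# long lines" column of the table that follows.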

File | Location | # lines | # units | # long lines
transformer_legacy.py | fairseq/models/transformer | 216 | 15 | 17
lightconv.py | fairseq/models | 789 | 29 | 12
model.py | fairseq/model_parallel/models/pipeline_parallel_transformer | 625 | 28 | 9
configs.py | fairseq/dataclass | 1050 | 11 | 4
trainer.py | fairseq | 1219 | 57 | 4
wav2vec.py | fairseq/models/wav2vec | 540 | 21 | 3
transformer_lm.py | fairseq/models | 485 | 22 | 2
token_block_utils_fast.pyx | fairseq/data | 154 | 1 | 2
generate.py | fairseq_cli | 334 | 4 | 2
average_checkpoints.py | scripts | 121 | 3 | 2
__init__.py | fairseq/tasks | 81 | 4 | 1
sentence_prediction.py | fairseq/tasks | 224 | 9 | 1
fp16_optimizer.py | fairseq/optim | 387 | 43 | 1
transformer_config.py | fairseq/models/transformer | 265 | 5 | 1
distributed_fairseq_model.py | fairseq/models | 108 | 1 | 1
enc_dec.py | fairseq/models/roberta | 158 | 6 | 1
model_camembert.py | fairseq/models/roberta | 37 | 2 | 1
dynamic_convolution.py | fairseq/modules | 231 | 12 | 1