amazon-research / BartGraphSumm
File Size

The distribution of file sizes, measured in lines of code.

Intro
  • File size measurements show how lines of code are distributed across files of different sizes.
  • Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
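The size categories above can be sketched as a small classifier. This is a minimal illustration, not the report tool's implementation: the names `SIZE_CATEGORIES`, `classify`, and `summarize` are hypothetical, and how the tool counts a "line of code" (for example, whether blank lines and comments are excluded) is not stated here, so the sketch simply takes line counts as given.

```python
from collections import defaultdict

# Inclusive upper bounds and labels, copied from the categories listed above.
SIZE_CATEGORIES = [
    (100, "very small"),
    (200, "small"),
    (500, "medium size"),
    (1000, "long"),
]


def classify(lines_of_code):
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a file needs at least one line of code to be classified")
    for upper_bound, label in SIZE_CATEGORIES:
        if lines_of_code <= upper_bound:
            return label
    return "very long"  # 1001 lines or more


def summarize(file_sizes):
    """Map each category to [file count, total lines of code]."""
    summary = defaultdict(lambda: [0, 0])
    for lines_of_code in file_sizes.values():
        entry = summary[classify(lines_of_code)]
        entry[0] += 1
        entry[1] += lines_of_code
    return dict(summary)


# Line counts taken from the file tables in this report.
sample = {
    "transformer.py": 870,
    "options.py": 506,
    "fconv.py": 455,
    "bmuf.py": 150,
    "model_camembert.py": 30,
}

for name, size in sample.items():
    print(f"{name}: {size} lines -> {classify(size)}")
print(summarize(sample))
```

Keeping the bounds in one ordered list means the thresholds can be changed in a single place without touching the classification logic.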
File Size Overall
  • There are 317 files with 33,649 lines of code.
    • 0 very long files (0 lines of code)
    • 7 long files (4,467 lines of code)
    • 44 medium size files (12,540 lines of code)
    • 59 small files (8,418 lines of code)
    • 207 very small files (8,224 lines of code)
Share of lines of code per file size category:
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 13% | 37% | 25% | 24%
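The overall percentages appear to be each category's share of the total lines of code. That reading is an assumption, as is the rounding rule, but it reproduces the figures reported above. A short check using the line counts from the overall summary (the function name `shares` is hypothetical):

```python
# Lines of code per category, from the "File Size Overall" summary.
lines_per_category = {
    "1001+": 0,
    "501-1000": 4467,
    "201-500": 12540,
    "101-200": 8418,
    "1-100": 8224,
}


def shares(counts):
    """Return each category's share of the total, as a whole-number percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


print(sum(lines_per_category.values()))  # total lines of code
print(shares(lines_per_category))
```

Because each share is rounded independently, the percentages need not add up to exactly 100.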
File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 14% | 37% | 24% | 24%
cu | 0% | 0% | 64% | 35% | 0%
cpp | 0% | 0% | 0% | 64% | 35%
pyx | 0% | 0% | 0% | 74% | 25%
cuh | 0% | 0% | 0% | 0% | 100%
lua | 0% | 0% | 0% | 0% | 100%
h | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
src/fairseq/fairseq/models | 0% | 31% | 41% | 18% | 8%
src/fairseq/fairseq | 0% | 42% | 35% | 8% | 12%
src/fairseq/fairseq/modules | 0% | 0% | 31% | 30% | 38%
src/fairseq/fairseq/data | 0% | 0% | 36% | 26% | 36%
src/fairseq/fairseq/tasks | 0% | 0% | 47% | 47% | 5%
src/fairseq/fairseq_cli | 0% | 0% | 60% | 27% | 12%
src | 0% | 0% | 66% | 15% | 18%
src/fairseq/fairseq/logging | 0% | 0% | 79% | 20% | <1%
src/fairseq/fairseq/optim | 0% | 0% | 18% | 36% | 44%
src/fairseq/fairseq/clib | 0% | 0% | 43% | 45% | 11%
src/fairseq/fairseq/model_parallel | 0% | 0% | 28% | 21% | 49%
src/fairseq/fairseq/criterions | 0% | 0% | 0% | 25% | 74%
src/fairseq/scripts | 0% | 0% | 0% | 31% | 68%
src/fairseq | 0% | 0% | 0% | 69% | 30%
src/fairseq/fairseq/benchmark | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Location | # lines | # units
transformer.py | src/fairseq/fairseq/models | 870 | 37
trainer.py | src/fairseq/fairseq | 736 | 42
sequence_generator.py | src/fairseq/fairseq | 660 | 30
wav2vec.py | src/fairseq/fairseq/models | 629 | 25
lightconv.py | src/fairseq/fairseq/models | 554 | 29
lstm.py | src/fairseq/fairseq/models | 512 | 27
options.py | src/fairseq/fairseq | 506 | 20
fconv.py | src/fairseq/fairseq/models | 455 | 31
fconv_self_att.py | src/fairseq/fairseq/models | 448 | 26
indexed_dataset.py | src/fairseq/fairseq/data | 398 | 62
levenshtein_transformer.py | src/fairseq/fairseq/models/nat | 381 | 15
multihead_attention.py | src/fairseq/fairseq/modules | 380 | 11
utils.py | src/fairseq/fairseq | 361 | 38
graph_construction.py | src | 355 | 9
checkpoint_utils.py | src/fairseq/fairseq | 353 | 12
longformer_multihead_attention.py | src/fairseq/fairseq/modules | 339 | 12
nonautoregressive_transformer.py | src/fairseq/fairseq/models/nat | 324 | 19
semisupervised_translation.py | src/fairseq/fairseq/tasks | 316 | 11
fp16_optimizer.py | src/fairseq/fairseq/optim | 315 | 37
prepare_data.py | src | 315 | 9
model.py | src/fairseq/fairseq/models/bart | 307 | 19
translation.py | src/fairseq/fairseq/tasks | 305 | 14
transformer_layer.py | src/fairseq/fairseq/modules | 302 | 16
dictionary.py | src/fairseq/fairseq/data | 298 | 30
preprocess.py | src/fairseq/fairseq_cli | 294 | 7
model.py | src/fairseq/fairseq/models/roberta | 292 | 23
iterators.py | src/fairseq/fairseq/data | 286 | 40
lightconv_cuda_kernel.cu | src/fairseq/fairseq/modules/lightconv_layer | 285 | -
edit_dist.cu | src/fairseq/fairseq/clib/libnat_cuda | 284 | -
denoising_dataset.py | src/fairseq/fairseq/data | 281 | 18
train.py | src/fairseq/fairseq_cli | 279 | 10
fairseq_model.py | src/fairseq/fairseq/models | 270 | 46
multilingual_masked_lm.py | src/fairseq/fairseq/tasks | 257 | 10
multilingual_translation.py | src/fairseq/fairseq/tasks | 256 | 19
progress_bar.py | src/fairseq/fairseq/logging | 256 | 38
transformer_lm.py | src/fairseq/fairseq/models | 248 | 13
multihead_attention.py | src/fairseq/fairseq/model_parallel/modules | 244 | 6
masked_lm.py | src/fairseq/fairseq/models | 241 | 13
fairseq_task.py | src/fairseq/fairseq/tasks | 238 | 21
iterative_refinement_generator.py | src/fairseq/fairseq | 235 | 4
distributed_utils.py | src/fairseq/fairseq | 232 | 13
language_pair_dataset.py | src/fairseq/fairseq/data | 228 | 13
file_utils.py | src/fairseq/fairseq | 220 | 13
insertion_transformer.py | src/fairseq/fairseq/models/nat | 215 | 16
levenshtein_utils.py | src/fairseq/fairseq/models/nat | 211 | 9
block_pair_dataset.py | src/fairseq/fairseq/data/legacy | 210 | 13
lightweight_convolution.py | src/fairseq/fairseq/modules | 209 | 14
meters.py | src/fairseq/fairseq/logging | 208 | 42
generate.py | src/fairseq/fairseq_cli | 207 | 3
hub_utils.py | src/fairseq/fairseq | 201 | 23
Files With Most Units (Top 20)
File | Location | # lines | # units
indexed_dataset.py | src/fairseq/fairseq/data | 398 | 62
fairseq_model.py | src/fairseq/fairseq/models | 270 | 46
trainer.py | src/fairseq/fairseq | 736 | 42
meters.py | src/fairseq/fairseq/logging | 208 | 42
iterators.py | src/fairseq/fairseq/data | 286 | 40
utils.py | src/fairseq/fairseq | 361 | 38
progress_bar.py | src/fairseq/fairseq/logging | 256 | 38
fp16_optimizer.py | src/fairseq/fairseq/optim | 315 | 37
transformer.py | src/fairseq/fairseq/models | 870 | 37
fconv.py | src/fairseq/fairseq/models | 455 | 31
sequence_generator.py | src/fairseq/fairseq | 660 | 30
dictionary.py | src/fairseq/fairseq/data | 298 | 30
lightconv.py | src/fairseq/fairseq/models | 554 | 29
lstm.py | src/fairseq/fairseq/models | 512 | 27
fconv_self_att.py | src/fairseq/fairseq/models | 448 | 26
wav2vec.py | src/fairseq/fairseq/models | 629 | 25
hub_utils.py | src/fairseq/fairseq | 201 | 23
bmuf.py | src/fairseq/fairseq/optim | 150 | 23
model.py | src/fairseq/fairseq/models/roberta | 292 | 23
fairseq_task.py | src/fairseq/fairseq/tasks | 238 | 21
Files With Long Lines (Top 20)

There are 28 files with lines longer than 120 characters. In total, there are 81 long lines.

File | Location | # lines | # units | # long lines
lightconv.py | src/fairseq/fairseq/models | 554 | 29 | 14
transformer.py | src/fairseq/fairseq/models | 870 | 37 | 12
prepare_data.py | src | 315 | 9 | 6
transformer_lm.py | src/fairseq/fairseq/models | 248 | 13 | 5
average_checkpoints.py | src/fairseq/scripts | 111 | 3 | 5
graph_construction.py | src | 355 | 9 | 4
adaptive_softmax.py | src/fairseq/fairseq/modules | 138 | 10 | 3
fconv.py | src/fairseq/fairseq/models | 455 | 31 | 3
preprocess.py | src/fairseq/fairseq_cli | 294 | 7 | 3
translation.py | src/fairseq/fairseq/tasks | 305 | 14 | 2
hub_interface.py | src/fairseq/fairseq/models/bart | 132 | 10 | 2
model_camembert.py | src/fairseq/fairseq/models/roberta | 30 | 2 | 2
model.py | src/fairseq/fairseq/models/roberta | 292 | 23 | 2
token_block_utils_fast.pyx | src/fairseq/fairseq/data | 153 | 1 | 2
generate.py | src/fairseq/fairseq_cli | 207 | 3 | 2
bart_decode_parallel.py | src | 84 | 2 | 2
sentence_prediction.py | src/fairseq/fairseq/tasks | 193 | 10 | 1
semisupervised_translation.py | src/fairseq/fairseq/tasks | 316 | 11 | 1
longformer_multihead_attention.py | src/fairseq/fairseq/modules | 339 | 12 | 1
lightconv_cuda_kernel.cu | src/fairseq/fairseq/modules/lightconv_layer | 285 | - | 1
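The long-line count above uses a 120-character threshold. A rough way to reproduce such a count locally is sketched below. The helper names `count_long_lines` and `files_with_long_lines` are hypothetical, and the report tool may measure line length differently (for example in how it treats tabs or trailing whitespace), so results need not match the table exactly.

```python
from pathlib import Path

LONG_LINE_LIMIT = 120  # threshold stated in this report


def count_long_lines(source, limit=LONG_LINE_LIMIT):
    """Count the lines in a source string that are longer than `limit` characters."""
    return sum(1 for line in source.splitlines() if len(line) > limit)


def files_with_long_lines(root, pattern="*.py", limit=LONG_LINE_LIMIT):
    """Map each matching file under `root` to its number of long lines (zero counts omitted)."""
    results = {}
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        count = count_long_lines(text, limit)
        if count:
            results[str(path)] = count
    return results
```

Calling `files_with_long_lines("src")` from a checkout of the repository would list the Python files to look at first. The `src` root is taken from the paths in the table, and the `*.py` default pattern is an illustrative choice.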