amazon-research / fact-check-summarization

File Size

The distribution of size of files (measured in lines of code).

Intro

File size measurements show how files are distributed by size.
Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
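The classification above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's own implementation; the names `BUCKETS` and `classify` are hypothetical, and the sample line counts are taken from the "Longest Files" table in this report.

```python
# Upper bounds (inclusive) of each size category, as listed in the intro.
# BUCKETS and classify are illustrative names, not part of the report tool.
BUCKETS = [
    (100, "very small"),
    (200, "small"),
    (500, "medium"),
    (1000, "long"),
]


def classify(lines_of_code: int) -> str:
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a file needs at least one line of code to be classified")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    return "very long"  # anything above 1000 lines


# Line counts copied from the "Longest Files" table.
sample = {
    "sm_inference_asum.py": 1130,
    "evaluate_hypo.py": 792,
    "sequence_generator.py": 499,
    "meters.py": 199,
}
categories = {name: classify(count) for name, count in sample.items()}
```

With these inputs, `categories` maps `sm_inference_asum.py` to "very long" and `meters.py` to "small".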


File Size Overall

There are 289 files with 33,591 lines of code.

1 very long file (1,130 lines of code)
6 long files (3,792 lines of code)
40 medium size files (11,886 lines of code)
70 small files (9,965 lines of code)
172 very small files (6,818 lines of code)
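To see how much of the code base each category holds, the counts above can be turned into percentages. The sketch below only reuses the numbers from this list; the variable names (`overall`, `line_share`) are illustrative.

```python
# (number of files, lines of code) per category, copied from the list above.
overall = {
    "very long": (1, 1130),
    "long": (6, 3792),
    "medium": (40, 11886),
    "small": (70, 9965),
    "very small": (172, 6818),
}

total_files = sum(files for files, _ in overall.values())
total_lines = sum(lines for _, lines in overall.values())

# Percentage of all lines of code that falls into each category.
line_share = {
    label: round(100 * lines / total_lines, 1)
    for label, (_, lines) in overall.items()
}

for label, share in line_share.items():
    print(f"{label}: {share}% of lines")
```

The totals agree with the headline figures (289 files, 33,591 lines of code). Medium size files hold the largest share of lines, even though very small files are by far the most numerous.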

Legend: 1001+ | 501-1000 | 201-500 | 101-200 | 1-100


File Size per Extension

[Chart not reproduced: file size distribution per extension, by size category (1001+, 501-1000, 201-500, 101-200, 1-100).]

File Size per Logical Decomposition

primary: [chart not reproduced: file size distribution by size category (1001+, 501-1000, 201-500, 101-200, 1-100).]

Longest Files (Top 50)

File	# lines	# units
sm_inference_asum.py in preprocess	1130	16
evaluate_hypo.py in preprocess	792	13
transformer.py in fairseq/models	744	39
wav2vec.py in fairseq/models	629	25
lightconv.py in fairseq/models	554	29
trainer.py in fairseq	543	29
data_prepro_clean.py in preprocess	530	16
sequence_generator.py in fairseq	499	22
options.py in fairseq	482	21
lstm.py in fairseq/models	459	23
fconv.py in fairseq/models	455	31
fconv_self_att.py in fairseq/models	448	26
indexed_dataset.py in fairseq/data	398	62
levenshtein_transformer.py in fairseq/models/nat	381	15
multihead_attention.py in fairseq/modules	370	10
preprocess.py in fairseq_cli	368	8
checkpoint_utils.py in fairseq	360	13
utils.py in fairseq	354	38
translation.py in fairseq/tasks	331	14
nonautoregressive_transformer.py in fairseq/models/nat	324	19
semisupervised_translation.py in fairseq/tasks	316	11
fp16_optimizer.py in fairseq/optim	298	37
lightconv_cuda_kernel.cu in fairseq/modules/lightconv_layer	285	-
edit_dist.cu in fairseq/clib/libnat_cuda	284	-
dictionary.py in fairseq/data	282	29
denoising_dataset.py in fairseq/data	281	18
fairseq_model.py in fairseq/models	265	45
model.py in fairseq/models/roberta	265	22
multilingual_masked_lm.py in fairseq/tasks	257	10
progress_bar.py in fairseq/logging	257	38
multilingual_translation.py in fairseq/tasks	256	19
masked_lm.py in fairseq/models	250	13
train.py in fairseq_cli	244	8
transformer_layer.py in fairseq/modules	243	9
transformer_lm.py in fairseq/models	237	12
iterative_refinement_generator.py in fairseq	235	4
translation_with_unlikelihood.py in fairseq/tasks	234	12
fairseq_task.py in fairseq/tasks	232	21
language_pair_dataset.py in fairseq/data	229	13
model.py in fairseq/models/bart	226	12
iterators.py in fairseq/data	225	37
insertion_transformer.py in fairseq/models/nat	215	16
levenshtein_utils.py in fairseq/models/nat	211	9
file_utils.py in fairseq	210	12
block_pair_dataset.py in fairseq/data/legacy	210	13
lightweight_convolution.py in fairseq/modules	209	14
dynamic_convolution.py in fairseq/modules	201	12
meters.py in fairseq/logging	199	40
hub_utils.py in fairseq	197	23
downsampled_multihead_attention.py in fairseq/modules	197	8

Files With Most Units (Top 20)

File	# lines	# units
indexed_dataset.py in fairseq/data	398	62
fairseq_model.py in fairseq/models	265	45
meters.py in fairseq/logging	199	40
transformer.py in fairseq/models	744	39
utils.py in fairseq	354	38
progress_bar.py in fairseq/logging	257	38
fp16_optimizer.py in fairseq/optim	298	37
iterators.py in fairseq/data	225	37
fconv.py in fairseq/models	455	31
trainer.py in fairseq	543	29
lightconv.py in fairseq/models	554	29
dictionary.py in fairseq/data	282	29
fconv_self_att.py in fairseq/models	448	26
wav2vec.py in fairseq/models	629	25
hub_utils.py in fairseq	197	23
bmuf.py in fairseq/optim	145	23
lstm.py in fairseq/models	459	23
sequence_generator.py in fairseq	499	22
model.py in fairseq/models/roberta	265	22
options.py in fairseq	482	21

Files With Long Lines (Top 20)

There are 25 files with lines longer than 120 characters. In total, there are 79 long lines.

File	# lines	# units	# long lines
lightconv.py in fairseq/models	554	29	14
transformer.py in fairseq/models	744	39	12
data_prepro_clean.py in preprocess	530	16	9
transformer_lm.py in fairseq/models	237	12	5
average_checkpoints.py in scripts	105	3	5
evaluate_hypo.py in preprocess	792	13	4
fconv.py in fairseq/models	455	31	3
label_smoothed_cross_entropy_with_multitask.py in fairseq/criterions	193	8	3
preprocess.py in fairseq_cli	368	8	3
model.py in fairseq/models/bart	226	12	2
model.py in fairseq/models/roberta	265	22	2
token_block_utils_fast.pyx in fairseq/data	153	1	2
round_robin_zip_datasets.py in fairseq/data	69	10	2
sm_inference_asum.py in preprocess	1130	16	2
semisupervised_translation.py in fairseq/tasks	316	11	1
lightconv_cuda_kernel.cu in fairseq/modules/lightconv_layer	285	-	1
cuda_utils.cu in fairseq/modules	169	-	1
dynamic_convolution.py in fairseq/modules	201	12	1
wav2vec.py in fairseq/models	629	25	1
hub_interface.py in fairseq/models/bart	164	10	1
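
The "# long lines" column can be approximated with a simple count of lines longer than 120 characters. The sketch below is an assumption about how such a count might be done, not the report tool's actual method (which may treat tabs, trailing whitespace, or comments differently); `count_long_lines` and `MAX_LINE_LENGTH` are hypothetical names.

```python
MAX_LINE_LENGTH = 120  # threshold stated in this report


def count_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> int:
    """Count the lines in `text` that are strictly longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


# Small self-contained example: one line at the limit, one line over it.
example = "\n".join([
    "x" * 120,   # exactly 120 characters: not counted
    "y" * 121,   # 121 characters: counted
    "short line",
])
long_lines = count_long_lines(example)
```

To apply this to a real file, read its contents first, for example with `pathlib.Path(path).read_text()`, and pass the result to `count_long_lines`.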