facebookresearch / UNLU
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show how file sizes are distributed.
  • Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
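The classification above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's implementation: the names `CATEGORIES`, `classify`, and `distribution` are hypothetical, and only the category bounds come from the text.

```python
from collections import Counter

# Upper bound (inclusive, in lines of code) and label for each bounded category,
# taken from the list above. Anything larger counts as "very long".
CATEGORIES = [
    (100, "very small"),
    (200, "small"),
    (500, "medium"),
    (1000, "long"),
]


def classify(lines_of_code: int) -> str:
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("expected at least one line of code")
    for upper_bound, label in CATEGORIES:
        if lines_of_code <= upper_bound:
            return label
    return "very long"


def distribution(line_counts):
    """Count how many files fall into each category."""
    return Counter(classify(n) for n in line_counts)
```

For example, `classify(1016)` returns `"very long"` and `classify(940)` returns `"long"`, matching how the report categorizes the two largest files.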
File Size Overall
  • There are 54 files with 9,713 lines of code.
    • 1 very long file (1,016 lines of code)
    • 4 long files (2,997 lines of code)
    • 13 medium size files (3,964 lines of code)
    • 4 small files (477 lines of code)
    • 32 very small files (1,259 lines of code)
Share of lines of code per category: 1001+: 10% | 501-1000: 30% | 201-500: 40% | 101-200: 4% | 1-100: 12%
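The percentages can be reproduced from the line counts listed above. The sketch below assumes the shares are truncated rather than rounded; that is an inference from the numbers (they add up to 96%, not 100%), not something the report states. The variable names are illustrative.

```python
# Lines of code per size category, copied from the summary above.
lines_per_category = {
    "1001+": 1016,
    "501-1000": 2997,
    "201-500": 3964,
    "101-200": 477,
    "1-100": 1259,
}

total = sum(lines_per_category.values())  # matches the reported 9,713 lines

# Assumption: integer division (truncation) mimics how the report derives
# its percentages. Plain rounding would give 10, 31, 41, 5, 13 instead.
shares = {label: 100 * n // total for label, n in lines_per_category.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Under that assumption the computed shares are 10%, 30%, 40%, 4%, and 12%, the same values the report shows.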
File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 10% | 31% | 39% | 5% | 12%
yaml | 0% | 0% | 100% | 0% | 0%
sed | 0% | 0% | 0% | 0% | 100%
bash | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
anli/src | 26% | 49% | 6% | 6% | 10%
infersent_comp | 0% | 60% | 25% | 0% | 14%
infersent_comp/encoder | 0% | 91% | 0% | 0% | 8%
codes | 0% | 0% | 75% | 12% | 11%
codes/rnn_training | 0% | 0% | 100% | 0% | 0%
anli | 0% | 0% | 99% | 0% | <1%
dataset_utils | 0% | 0% | 100% | 0% | 0%
ROOT | 0% | 0% | 78% | 0% | 21%
utils | 0% | 0% | 0% | 0% | 100%
infersent_comp/dataset | 0% | 0% | 0% | 0% | 100%
dataset_utils/mnli | 0% | 0% | 0% | 0% | 100%
dataset_utils/snli | 0% | 0% | 0% | 0% | 100%
dataset_utils/ocnli | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Location | # lines | # units
train_with_scramble.py | anli/src/nli | 1016 | 17
training.py | anli/src/nli | 940 | 17
train_with_confidence.py | anli/src/nli | 933 | 18
models.py | infersent_comp | 562 | 39
models.py | infersent_comp/encoder | 562 | 39
word_randomization.py | codes | 475 | 23
train_nli_ray.py | codes/rnn_training | 462 | 4
train_nli_w2v.py | codes/rnn_training | 427 | 4
run_causal_lm.py | anli | 332 | 3
rnn_models.py | codes | 331 | 11
data_iterator.py | dataset_utils | 315 | 5
min_tree.py | codes | 282 | 14
plot.py | codes | 243 | 7
train_nli.py | infersent_comp | 238 | 2
evaluation.py | anli/src/nli | 237 | 2
config.yaml | root | 210 | -
random_eval.py | codes | 207 | 4
Non_transformers_probe.py | codes/rnn_training | 205 | 6
models.py | codes | 133 | 16
build_data.py | anli/src/dataset_tools | 119 | 6
rnn.py | codes | 113 | 8
inference_debug.py | anli/src/nli | 112 | 3
mnli_preprocess.py | dataset_utils/mnli | 84 | 6
table.py | codes | 79 | -
data.py | infersent_comp | 78 | 6
format_convert.py | anli/src/dataset_tools | 74 | 4
list_dict_data_tool.py | anli/src/utils | 71 | 5
common.py | codes | 57 | 9
common.py | anli/src/utils | 56 | 9
main.py | root | 56 | 2
bleu_extract.py | codes | 55 | 1
mutils.py | infersent_comp | 53 | 3
snli_preprocess.py | dataset_utils/snli | 53 | 6
tokenizer.sed | infersent_comp/dataset | 52 | -
extract_features.py | infersent_comp/encoder | 51 | -
prepare_nli_lm_data.py | anli/src | 50 | -
reformat.py | codes | 50 | -
save_tool.py | anli/src/utils | 49 | 5
optim.py | utils | 45 | 4
batchbuilder.py | anli/src/flint/data_utils | 42 | 4
fields.py | anli/src/flint/data_utils | 41 | 6
ocnli_preprocess.py | dataset_utils/ocnli | 41 | 2
get_data.bash | infersent_comp/dataset | 37 | -
eval_metric.py | utils | 29 | 2
transformer_utils.py | utils | 26 | 2
prepare_lm_data.py | anli/src | 17 | -
config.py | anli/src | 6 | -
__init__.py | anli | 1 | -
__init__.py | anli/src/nli | 1 | -
__init__.py | anli/src/utils | 1 | -
Files With Most Units (Top 20)
File | Location | # lines | # units
models.py | infersent_comp | 562 | 39
models.py | infersent_comp/encoder | 562 | 39
word_randomization.py | codes | 475 | 23
train_with_confidence.py | anli/src/nli | 933 | 18
train_with_scramble.py | anli/src/nli | 1016 | 17
training.py | anli/src/nli | 940 | 17
models.py | codes | 133 | 16
min_tree.py | codes | 282 | 14
rnn_models.py | codes | 331 | 11
common.py | anli/src/utils | 56 | 9
common.py | codes | 57 | 9
rnn.py | codes | 113 | 8
plot.py | codes | 243 | 7
build_data.py | anli/src/dataset_tools | 119 | 6
fields.py | anli/src/flint/data_utils | 41 | 6
data.py | infersent_comp | 78 | 6
Non_transformers_probe.py | codes/rnn_training | 205 | 6
snli_preprocess.py | dataset_utils/snli | 53 | 6
mnli_preprocess.py | dataset_utils/mnli | 84 | 6
save_tool.py | anli/src/utils | 49 | 5
Files With Long Lines (Top 9)

There are 9 files with lines longer than 120 characters. In total, there are 20 long lines.

File | Location | # lines | # units | # long lines
Non_transformers_probe.py | codes/rnn_training | 205 | 6 | 8
run_causal_lm.py | anli | 332 | 3 | 4
train_with_scramble.py | anli/src/nli | 1016 | 17 | 2
inference_debug.py | anli/src/nli | 112 | 3 | 1
training.py | anli/src/nli | 940 | 17 | 1
batchbuilder.py | anli/src/flint/data_utils | 42 | 4 | 1
optim.py | utils | 45 | 4 | 1
train_nli.py | infersent_comp | 238 | 2 | 1
rnn.py | codes | 113 | 8 | 1
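
A comparable long-line check can be sketched as follows. This is an illustration only: the 120-character threshold comes from the report, while the function names, the `*.py` default pattern, and the choice to measure raw line length (tabs and trailing whitespace included) are assumptions about how such a count might be done.

```python
from pathlib import Path

MAX_LINE_LENGTH = 120  # threshold stated in the report


def count_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> int:
    """Count lines strictly longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def files_with_long_lines(root: str, pattern: str = "*.py") -> dict:
    """Map each matching file under `root` to its long-line count (zero counts omitted)."""
    result = {}
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        n = count_long_lines(text)
        if n:
            result[str(path)] = n
    return result
```

Calling `files_with_long_lines(".")` from a repository checkout would list the offending files. The counts may differ from the table above if the report tool measures line length differently.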