facebookresearch / muss
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show how file sizes are distributed across the codebase.
  • Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such gargantuan proportions that it is hard to work with.
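The size categories listed above can be expressed as a small lookup. This is a minimal sketch, not the analysis tool's implementation: the function name `classify_file_size` and the `SIZE_CATEGORIES` table are hypothetical, and how the tool counts "lines of code" (for example, whether blank or comment lines are included) is not stated here.

```python
# Upper bound (inclusive, in lines of code) and label for each category,
# taken from the category list above. Names are illustrative only.
SIZE_CATEGORIES = [
    (100, "very small"),
    (200, "small"),
    (500, "medium size"),
    (1000, "long"),
]


def classify_file_size(lines_of_code: int) -> str:
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("lines_of_code must be at least 1")
    for upper_bound, label in SIZE_CATEGORIES:
        if lines_of_code <= upper_bound:
            return label
    # Anything above 1000 lines falls into the last category (1001+).
    return "very long"


print(classify_file_size(390))   # medium size
print(classify_file_size(101))   # small
print(classify_file_size(1001))  # very long
```

The boundaries are inclusive on the upper end, so a file with exactly 100 lines is still "very small" and one with 101 lines is "small".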
File Size Overall
  • There are 33 files with 3,783 lines of code.
    • 0 very long files (0 lines of code)
    • 0 long files (0 lines of code)
    • 6 medium size files (1,899 lines of code)
    • 9 small files (1,178 lines of code)
    • 18 very small files (706 lines of code)
Share of lines of code per size category:
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 0% | 50% | 31% | 18%
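The percentage shares follow from the line counts in the overall breakdown. The sketch below recomputes them. It assumes shares are truncated to whole percentages, which reproduces the reported figures (they sum to 99%), but the tool's actual rounding rule is not documented here. The name `percentage_shares` is illustrative.

```python
# Lines of code per size category, copied from the overall breakdown.
LINES_PER_CATEGORY = {
    "1001+": 0,
    "501-1000": 0,
    "201-500": 1899,
    "101-200": 1178,
    "1-100": 706,
}


def percentage_shares(lines_per_category):
    """Return each category's share of the total as a whole percentage.

    Assumption: shares are rounded down (integer division), which matches
    the figures in this report.
    """
    total = sum(lines_per_category.values())
    if total == 0:
        return {category: 0 for category in lines_per_category}
    return {
        category: 100 * lines // total
        for category, lines in lines_per_category.items()
    }


shares = percentage_shares(LINES_PER_CATEGORY)
print(" | ".join(f"{share}%" for share in shares.values()))
# prints: 0% | 0% | 50% | 31% | 18%
```

For example, the medium size files hold 1,899 of 3,783 lines, which is about 50.2% and is reported as 50%.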


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 0% | 50% | 31% | 18%
cfg | 0% | 0% | 0% | 0% | 100%
es | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
muss/mining | 0% | 0% | 78% | 10% | 10%
muss/utils | 0% | 0% | 59% | 34% | 6%
muss | 0% | 0% | 34% | 40% | 25%
scripts | 0% | 0% | 56% | 29% | 13%
muss/fairseq | 0% | 0% | 42% | 39% | 18%
muss/resources | 0% | 0% | 0% | 59% | 40%
muss/evaluation | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 33)
File | # lines | # units
helpers.py (in muss/utils) | 390 | 46
nn_search.py (in muss/mining) | 374 | 27
preprocessors.py (in muss) | 355 | 63
training.py (in muss/mining) | 354 | 13
mine_sequences.py (in scripts) | 220 | -
base.py (in muss/fairseq) | 206 | 6
main.py (in muss/fairseq) | 190 | 19
preprocessing.py (in muss) | 154 | 19
text.py (in muss) | 150 | 23
datasets.py (in muss/resources) | 142 | 14
resources.py (in muss/utils) | 121 | 12
train_paper_models.py (in scripts) | 114 | 3
slurm.py (in muss) | 103 | 9
submitit.py (in muss/utils) | 103 | 9
filtering.py (in muss/mining) | 101 | 12
preprocessing.py (in muss/mining) | 96 | 11
roberta.py (in muss/fairseq) | 87 | 2
feature_extraction.py (in muss) | 58 | 11
simplify.py (in muss) | 56 | 5
prepare.py (in muss/resources) | 56 | 4
laser.py (in muss) | 51 | 3
simplifiers.py (in muss) | 47 | 4
training.py (in muss/utils) | 42 | 4
kenlm.py (in muss) | 42 | 4
paths.py (in muss/resources) | 39 | 4
general.py (in muss/evaluation) | 32 | 2
utils.py (in muss/evaluation) | 22 | 4
simplify.py (in scripts) | 20 | -
create_base_index.py (in scripts) | 19 | -
setup.py (in root) | 17 | -
train_model.py (in scripts) | 11 | -
setup.cfg (in root) | 7 | -
examples.es (in scripts) | 4 | -
Files With Most Units (Top 20)
File | # lines | # units
preprocessors.py (in muss) | 355 | 63
helpers.py (in muss/utils) | 390 | 46
nn_search.py (in muss/mining) | 374 | 27
text.py (in muss) | 150 | 23
main.py (in muss/fairseq) | 190 | 19
preprocessing.py (in muss) | 154 | 19
datasets.py (in muss/resources) | 142 | 14
training.py (in muss/mining) | 354 | 13
resources.py (in muss/utils) | 121 | 12
filtering.py (in muss/mining) | 101 | 12
feature_extraction.py (in muss) | 58 | 11
preprocessing.py (in muss/mining) | 96 | 11
slurm.py (in muss) | 103 | 9
submitit.py (in muss/utils) | 103 | 9
base.py (in muss/fairseq) | 206 | 6
simplify.py (in muss) | 56 | 5
paths.py (in muss/resources) | 39 | 4
prepare.py (in muss/resources) | 56 | 4
simplifiers.py (in muss) | 47 | 4
training.py (in muss/utils) | 42 | 4
Files With Long Lines (Top 12)

There are 12 files with lines longer than 120 characters. In total, there are 23 long lines.
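A count like this can be approximated with a short script. The sketch below is not the analysis tool's implementation: the helper names `count_long_lines` and `long_lines_per_file` are hypothetical, and it assumes line length is measured in characters without the trailing newline, using the 120-character threshold stated above.

```python
from pathlib import Path

MAX_LINE_LENGTH = 120  # threshold stated in the report


def count_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> int:
    """Count lines strictly longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_lines_per_file(root: str, pattern: str = "*.py") -> dict:
    """Map each matching file under `root` to its number of long lines.

    Files without any long line are left out of the result.
    """
    counts = {}
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        long_lines = count_long_lines(text)
        if long_lines:
            counts[str(path)] = long_lines
    return counts
```

Running `long_lines_per_file(".")` from a checkout of the repository would list the Python files that contain long lines. The result may differ from the table below if the tool measures line length differently.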

File | # lines | # units | # long lines
train_paper_models.py (in scripts) | 114 | 3 | 4
base.py (in muss/fairseq) | 206 | 6 | 3
preprocessors.py (in muss) | 355 | 63 | 3
roberta.py (in muss/fairseq) | 87 | 2 | 2
training.py (in muss/mining) | 354 | 13 | 2
kenlm.py (in muss) | 42 | 4 | 2
mine_sequences.py (in scripts) | 220 | - | 2
simplify.py (in muss) | 56 | 5 | 1
prepare.py (in muss/resources) | 56 | 4 | 1
nn_search.py (in muss/mining) | 374 | 27 | 1
filtering.py (in muss/mining) | 101 | 12 | 1
preprocessing.py (in muss/mining) | 96 | 11 | 1