facebookresearch / DisCo
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show how the sizes of code units (methods, functions, ...) are distributed.
  • Units are classified into five categories based on their size in lines of code: 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • Aim to keep units small (fewer than 20 lines). Long units may become "bloaters": code that has grown so large that it is hard to work with.
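The size categories can be expressed as a small lookup. The following is a minimal sketch, assuming the five bucket boundaries shown in this report's legend (1-10, 11-20, 21-50, 51-100, 101+); the function name `classify_unit` and the label strings are illustrative and not part of the reporting tool.

```python
# Upper bound (inclusive) and label for each bucket, following the legend
# used in this report. Names here are illustrative assumptions.
BUCKETS = [
    (10, "very small"),
    (20, "small"),
    (50, "medium"),
    (100, "long"),
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size category for a unit with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    return "very long"  # 101 lines or more
```

For example, `classify_unit(148)` returns `"very long"`, which is the category a 148-line `main()` falls into.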
Unit Size Overall
  • There are 1,034 units with 11,348 lines of code in units (79.7% of code).
    • 11 very long units (1,508 lines of code)
    • 23 long units (1,480 lines of code)
    • 103 medium size units (3,200 lines of code)
    • 138 small units (1,973 lines of code)
    • 759 very small units (3,187 lines of code)
Share of lines of code by unit size (lines per unit):
101+ | 51-100 | 21-50 | 11-20 | 1-10
13% | 13% | 28% | 17% | 28%
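The percentages follow from the line counts listed under "Unit Size Overall". The sketch below reproduces the arithmetic, assuming each share is a category's lines of code divided by the total lines of code in units, rounded to the nearest whole percent; the rounding rule is an assumption, since the report does not state it.

```python
# Lines of code per size category, copied from the overall counts above.
loc_by_category = {
    "101+": 1508,
    "51-100": 1480,
    "21-50": 3200,
    "11-20": 1973,
    "1-10": 3187,
}

# Total lines of code in units (the report gives 11,348).
total_loc = sum(loc_by_category.values())

# Share of each category, rounded to a whole percent (assumed rounding rule).
shares = {
    category: round(100 * loc / total_loc)
    for category, loc in loc_by_category.items()
}

for category, percent in shares.items():
    print(f"{category}: {percent}%")
```

Because each share is rounded independently, the five percentages add up to 99% rather than 100%.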
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 13% | 13% | 28% | 16% | 28%
cpp | 0% | 0% | 25% | 48% | 26%
lua | 0% | 0% | 34% | 55% | 10%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
ROOT | 50% | 14% | 19% | 5% | 10%
fairseq_cli | 50% | 18% | 20% | 4% | 7%
fairseq | 11% | 20% | 24% | 16% | 27%
fairseq/modules | 17% | 0% | 35% | 20% | 26%
scripts | 0% | 43% | 38% | 13% | 4%
fairseq/models | 0% | 8% | 50% | 14% | 27%
fairseq/optim | 0% | 11% | 16% | 28% | 43%
fairseq/tasks | 0% | 22% | 32% | 13% | 31%
fairseq/strategies | 0% | 35% | 20% | 7% | 37%
fairseq/data | 0% | 0% | 22% | 23% | 54%
fairseq/clib | 0% | 0% | 25% | 48% | 26%
fairseq/criterions | 0% | 0% | 0% | 74% | 25%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
(missing) | (missing) | 179 | 38 | 24
def main() | preprocess.py | 148 | 35 | 1
def main() | fairseq_cli/preprocess.py | 148 | 35 | 1
def forward() | fairseq/modules/masked_multihead_attention.py | 141 | 37 | 10
def forward() | fairseq/modules/multihead_attention.py | 134 | 37 | 9
def main() | fairseq_cli/eval_lm.py | 133 | 38 | 1
def main() | eval_lm.py | 133 | 38 | 1
def main() | generate.py | 132 | 44 | 1
def main() | fairseq_cli/generate.py | 132 | 44 | 1
def train_step() | fairseq/trainer.py | 120 | 40 | 4
def main() | generate_disco.py | 108 | 33 | 1
def main() | fairseq_cli/interactive.py | 97 | 31 | 1
def main() | interactive.py | 97 | 31 | 1
def generate() | fairseq/sequence_scorer.py | 82 | 19 | 4
def main() | scripts/rm_pt.py | 82 | 1 | 0
def _register_grad_hook() | fairseq/legacy_distributed_data_parallel.py | 70 | 21 | 1
def load_dataset() | fairseq/tasks/translation_self.py | 70 | 19 | 4
def main() | scripts/spm_encode.py | 70 | 19 | 0
def add_args() | fairseq/models/bert_seq2seq.py | 67 | 1 | 1
def _upgrade_state_dict() | fairseq/checkpoint_utils.py | 66 | 4 | 1
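
The three figures reported per unit (lines, McCabe index, parameter count) can be approximated for Python code with the standard `ast` module. The following is a rough sketch, not the method used to produce this report: the report does not state its counting rules, so the set of decision nodes, the use of physical line spans (blank and comment lines included), and the name `unit_metrics` are all assumptions. Nested functions are also counted into their enclosing function, which is a simplification. It requires Python 3.8 or later for `end_lineno`.

```python
import ast

# Node types treated as decision points. This set is an assumption of the
# sketch; real tools may count more or fewer constructs.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                  ast.IfExp, ast.comprehension)


def unit_metrics(source: str):
    """Yield (name, lines, mccabe, params) for each function in `source`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Physical line span of the definition, first to last line.
        lines = node.end_lineno - node.lineno + 1
        # Cyclomatic complexity: start at 1, add one per decision point.
        mccabe = 1
        for child in ast.walk(node):
            if isinstance(child, DECISION_NODES):
                mccabe += 1
            elif isinstance(child, ast.BoolOp):
                # `a and b and c` contributes two extra branches.
                mccabe += len(child.values) - 1
        args = node.args
        params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
        params += (args.vararg is not None) + (args.kwarg is not None)
        yield node.name, lines, mccabe, params


# Hypothetical sample input, used only to exercise the sketch.
SAMPLE = '''
def clamp(value, low, high):
    if value < low:
        return low
    if value > high and high is not None:
        return high
    return value
'''

print(list(unit_metrics(SAMPLE)))  # [('clamp', 6, 4, 3)]
```

Sorting the yielded tuples by line count in descending order and keeping the first 20 gives a list shaped like the one above, although the exact numbers may differ from the report's because of the assumptions noted.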