amazon-research / fact-check-summarization
Unit Size

The distribution of unit sizes (measured in lines of code).

Intro
  • Unit size measurements show how the sizes of units of code (methods, functions...) are distributed.
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim to keep units small (20 lines or fewer). Long units may become "bloaters": code that has grown so large that it is hard to work with.
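The size classification can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's own implementation: it assumes the five ranges shown in the report's legend (1-10, 11-20, 21-50, 51-100, 101+), and the names `BUCKETS`, `classify` and `loc_share_by_bucket` are hypothetical.

```python
from collections import Counter

# Upper bound (inclusive) and label for each size range, taken from the
# report's legend. Anything above the last bound counts as "101+".
BUCKETS = [
    (10, "1-10"),
    (20, "11-20"),
    (50, "21-50"),
    (100, "51-100"),
]


def classify(lines_of_code: int) -> str:
    """Return the size range that a unit with this many lines falls into."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    return "101+"


def loc_share_by_bucket(unit_sizes):
    """Share of all unit lines of code (in percent) that sits in each range."""
    loc_per_bucket = Counter()
    for size in unit_sizes:
        loc_per_bucket[classify(size)] += size
    total = sum(loc_per_bucket.values())
    return {label: 100 * loc / total for label, loc in loc_per_bucket.items()}
```

For example, `classify(304)` returns `"101+"` and `classify(20)` returns `"11-20"`. Note that the shares are weighted by lines of code, not by the number of units, which is why a handful of very long units can account for a large slice of the distribution.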
Unit Size Overall
  • There are 2,116 units containing 23,901 lines of code (71.2% of all code).
    • 19 very long units (2,709 lines of code)
    • 46 long units (3,047 lines of code)
    • 241 medium size units (7,620 lines of code)
    • 274 small units (3,913 lines of code)
    • 1,536 very small units (6,612 lines of code)
101+ | 51-100 | 21-50 | 11-20 | 1-10
11% | 12% | 31% | 16% | 27%
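The percentages can be reproduced from the line counts listed under "Unit Size Overall". The sketch below assumes the report truncates each share to a whole percent instead of rounding; that is an inference from the numbers (rounding would give 13% for the 51-100 range), not something the report states. The variable names are illustrative.

```python
# Lines of code per size range, copied from the "Unit Size Overall" list.
loc_per_range = {
    "101+": 2709,
    "51-100": 3047,
    "21-50": 7620,
    "11-20": 3913,
    "1-10": 6612,
}

total_loc = sum(loc_per_range.values())  # 23901, the total given in the report

# Integer division truncates each share to a whole percent (assumed behavior).
share_percent = {
    size_range: 100 * loc // total_loc
    for size_range, loc in loc_per_range.items()
}

print(" | ".join(f"{pct}%" for pct in share_percent.values()))
# prints: 11% | 12% | 31% | 16% | 27%
```

Because every share is truncated, the displayed percentages add up to 97% and not 100%.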
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 11% | 12% | 32% | 16% | 27%
cpp | 0% | 0% | 12% | 37% | 49%
lua | 0% | 0% | 34% | 55% | 10%
pyx | 0% | 0% | 0% | 0% | 100%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
preprocess | 40% | 23% | 25% | 5% | 4%
fairseq_cli | 60% | 10% | 14% | 6% | 8%
fairseq/tasks | 14% | 18% | 35% | 15% | 16%
fairseq/models | 4% | 11% | 42% | 15% | 26%
fairseq | 8% | 17% | 26% | 18% | 28%
scripts | 0% | 43% | 35% | 13% | 7%
fairseq/optim | 0% | 13% | 19% | 29% | 38%
fairseq/data | 0% | 4% | 23% | 19% | 53%
fairseq/criterions | 0% | 5% | 48% | 19% | 26%
fairseq/modules | 0% | 0% | 48% | 21% | 29%
fairseq/benchmark | 0% | 0% | 13% | 25% | 60%
fairseq/logging | 0% | 0% | 5% | 11% | 82%
fairseq/clib | 0% | 0% | 18% | 44% | 37%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def main() | fairseq_cli/preprocess.py | 304 | 68 | 1
def make_unlikelihood_dataset() | preprocess/evaluate_hypo.py | 187 | 52 | 1
def add_args() | fairseq/models/wav2vec.py | 168 | 1 | 1
def _main() | fairseq_cli/generate.py | 150 | 45 | 2
def main() | fairseq_cli/eval_lm.py | 148 | 39 | 1
def _run_q_gen_process_local() | preprocess/sm_inference_asum.py | 144 | 30 | 11
def evaluate_hypo() | preprocess/evaluate_hypo.py | 143 | 29 | 7
def _run_qa_gen_process_local_batch_lines() | preprocess/sm_inference_asum.py | 143 | 37 | 10
def generate() | fairseq/iterative_refinement_generator.py | 141 | 38 | 4
def load_dataset() | fairseq/tasks/semisupervised_translation.py | 139 | 23 | 4
def _run_qa_gen_process_local() | preprocess/sm_inference_asum.py | 133 | 26 | 10
def load_dataset() | fairseq/tasks/multilingual_masked_lm.py | 122 | 17 | 5
def select_unlikelihood_hypos_lm_score() | preprocess/evaluate_hypo.py | 120 | 43 | 1
def __init__() | fairseq/models/wav2vec.py | 117 | 18 | 2
def _run_qa_eval_gen_process_local() | preprocess/sm_inference_asum.py | 116 | 24 | 12
def _run_qa_eval_process_local() | preprocess/sm_inference_asum.py | 112 | 25 | 12
def load_dataset() | fairseq/tasks/multilingual_denoising.py | 111 | 19 | 5
def train_step() | fairseq/trainer.py | 108 | 32 | 3
def main() | fairseq_cli/interactive.py | 103 | 31 | 1
def generate() | fairseq/sequence_scorer.py | 93 | 22 | 4