amazon-research / BartGraphSumm
Unit Size

The distribution of unit sizes (measured in lines of code).

Intro
  • Unit size measurements show how the code is distributed across units (methods, functions, ...) of different sizes.
  • Units are classified into five categories based on their size in lines of code: 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (< 20 lines). Long units may become "bloaters": code that has grown to such gargantuan proportions that it is hard to work with.
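The classification described above can be sketched in a few lines of Python. This is only an illustration, not the report tool's own implementation: the names `BUCKETS`, `classify_unit`, and `bucket_counts` are hypothetical, and the bucket boundaries are taken from the size ranges used in the report's legend (1-10, 11-20, 21-50, 51-100, 101+).

```python
from collections import Counter

# Upper bound (inclusive) and label for each size bucket, smallest first.
# Boundaries follow the ranges shown in the report's legend.
BUCKETS = [
    (10, "1-10"),
    (20, "11-20"),
    (50, "21-50"),
    (100, "51-100"),
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size bucket label for a unit with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    return "101+"  # anything above 100 lines is a very long unit


def bucket_counts(unit_sizes):
    """Count how many units fall into each size bucket."""
    return Counter(classify_unit(size) for size in unit_sizes)
```

For example, `classify_unit(239)` returns `"101+"`, which is where a 239-line function would land.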
Unit Size Overall
  • There are 2,260 units with 23,736 lines of code in units (70.5% of code).
    • 11 very long units (1,610 lines of code)
    • 47 long units (3,187 lines of code)
    • 236 medium size units (7,319 lines of code)
    • 299 small units (4,291 lines of code)
    • 1,667 very small units (7,329 lines of code)
Unit size | 101+ | 51-100 | 21-50 | 11-20 | 1-10
Share of code in units | 6% | 13% | 30% | 18% | 30%
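The percentages in this breakdown follow from the line counts listed under "Unit Size Overall". The sketch below recomputes them. It assumes the report truncates each share to a whole percent instead of rounding it. That assumption is not stated in the report, but it reproduces the published figures. The variable names are illustrative.

```python
# Lines of code per size bucket, copied from the "Unit Size Overall" list.
lines_per_bucket = {
    "101+": 1610,
    "51-100": 3187,
    "21-50": 7319,
    "11-20": 4291,
    "1-10": 7329,
}

# Total lines of code in units; the report gives 23,736.
total_lines = sum(lines_per_bucket.values())

# Share of each bucket as a whole percent, truncated (assumption: the report
# truncates rather than rounds, e.g. 6.78% is shown as 6%).
shares = {
    label: 100 * lines // total_lines
    for label, lines in lines_per_bucket.items()
}

for label, share in shares.items():
    print(f"{label}: {share}%")
```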
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 6% | 13% | 30% | 17% | 30%
cpp | 0% | 0% | 12% | 37% | 49%
lua | 0% | 0% | 34% | 55% | 10%
pyx | 0% | 0% | 0% | 0% | 100%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
src/fairseq/fairseq_cli | 57% | 11% | 15% | 6% | 8%
src/fairseq/fairseq/tasks | 15% | 20% | 28% | 17% | 17%
src/fairseq/fairseq/models | 4% | 14% | 38% | 15% | 27%
src/fairseq/fairseq | 8% | 17% | 29% | 15% | 28%
src | 0% | 46% | 31% | 15% | 5%
src/fairseq/scripts | 0% | 44% | 37% | 13% | 3%
src/fairseq/fairseq/optim | 0% | 13% | 19% | 28% | 37%
src/fairseq/fairseq/data | 0% | 5% | 21% | 19% | 53%
src/fairseq/fairseq/criterions | 0% | 8% | 41% | 19% | 31%
src/fairseq/fairseq/modules | 0% | 0% | 43% | 27% | 29%
src/fairseq/fairseq/benchmark | 0% | 0% | 34% | 8% | 58%
src/fairseq/fairseq/model_parallel | 0% | 0% | 13% | 21% | 64%
src/fairseq/fairseq/logging | 0% | 0% | 5% | 11% | 83%
src/fairseq/fairseq/clib | 0% | 0% | 18% | 44% | 37%
src/fairseq | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | # lines | McCabe index | # params
def main() in src/fairseq/fairseq_cli/preprocess.py | 239 | 54 | 1
def _main() in src/fairseq/fairseq_cli/generate.py | 179 | 53 | 2
def add_args() in src/fairseq/fairseq/models/wav2vec.py | 168 | 1 | 1
def main() in src/fairseq/fairseq_cli/eval_lm.py | 151 | 41 | 2
def generate() in src/fairseq/fairseq/iterative_refinement_generator.py | 141 | 38 | 4
def load_dataset() in src/fairseq/fairseq/tasks/semisupervised_translation.py | 139 | 23 | 4
def train_step() in src/fairseq/fairseq/trainer.py | 137 | 48 | 3
def load_dataset() in src/fairseq/fairseq/tasks/multilingual_masked_lm.py | 122 | 17 | 5
def __init__() in src/fairseq/fairseq/models/wav2vec.py | 117 | 18 | 2
def load_dataset() in src/fairseq/fairseq/tasks/multilingual_denoising.py | 113 | 20 | 5
def main() in src/fairseq/fairseq_cli/interactive.py | 104 | 32 | 1
def cluster_docs() in src/graph_utils.py | 98 | 27 | 7
def generate() in src/fairseq/fairseq/sequence_scorer.py | 93 | 22 | 4
def load_dataset() in src/fairseq/fairseq/tasks/sentence_prediction.py | 92 | 11 | 4
def build_model() in src/fairseq/fairseq/models/multilingual_transformer.py | 91 | 25 | 3
def construct_graph() in src/graph_construction.py | 91 | 21 | 3
def __init__() in src/fairseq/fairseq/models/transformer.py | 89 | 16 | 5
def build_model() in src/fairseq/fairseq/models/lstm.py | 85 | 14 | 3
def main() in src/fairseq/scripts/rm_pt.py | 82 | 1 | 0
def load_dataset() in src/fairseq/fairseq/tasks/sentence_ranking.py | 79 | 11 | 4
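A ranking like the one above can be approximated for a single Python file with the standard `ast` module. The sketch below is an approximation under stated assumptions, not the method used to produce this report: it counts physical lines from the `def` line to the end of the function, so blank lines and comments are included, and its numbers will generally differ from the report's lines-of-code figures. It does not compute the McCabe index. The helper names `unit_sizes` and `longest_units` are hypothetical. It requires Python 3.8 or later for `end_lineno`.

```python
import ast


def unit_sizes(source: str):
    """Yield (name, line_count, param_count) for every function in source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Physical lines spanned by the function, def line included.
            line_count = node.end_lineno - node.lineno + 1
            args = node.args
            param_count = (
                len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            )
            if args.vararg is not None:  # *args
                param_count += 1
            if args.kwarg is not None:  # **kwargs
                param_count += 1
            yield node.name, line_count, param_count


def longest_units(source: str, top: int = 20):
    """Return the `top` longest functions, longest first."""
    return sorted(unit_sizes(source), key=lambda unit: unit[1], reverse=True)[:top]
```

Calling `longest_units(open(path).read())` on a module would list its longest functions first, which is a quick way to spot candidates for splitting before running a full analysis.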