awslabs / pptod
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show the distribution of size of units of code (methods, functions...).
  • Units are classified in five categories based on their size in lines of code: 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (20 lines or fewer). Long units may become "bloaters": code that has grown so large that it is hard to work with.
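The size classification can be illustrated with a minimal sketch. It assumes the five ranges shown in the report's legend (1-10, 11-20, 21-50, 51-100, 101+); the function name `size_category` and the constants are hypothetical and are not taken from the reporting tool.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets (assumed from the legend).
UPPER_BOUNDS = [10, 20, 50, 100]
LABELS = ["1-10", "11-20", "21-50", "51-100", "101+"]


def size_category(lines_of_code: int) -> str:
    """Return the size bucket label for a unit with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    # bisect_left finds the first bound that is >= lines_of_code,
    # so a unit with exactly 20 lines still lands in "11-20".
    return LABELS[bisect_left(UPPER_BOUNDS, lines_of_code)]
```

For example, `size_category(229)` falls into "101+" and `size_category(49)` into "21-50".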
Unit Size Overall
  • There are 308 units with 5,875 lines of code in units (72.1% of code).
    • 5 very long units (706 lines of code)
    • 9 long units (615 lines of code)
    • 79 medium size units (2,501 lines of code)
    • 82 small units (1,290 lines of code)
    • 133 very small units (763 lines of code)
Share of lines of code by unit size:
101+: 12% | 51-100: 10% | 21-50: 42% | 11-20: 21% | 1-10: 12%
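The percentages for the overall breakdown can be reproduced from the lines-of-code counts listed above. The sketch below is an illustration, not the tool's code: the helper `category_shares` is hypothetical, and the use of truncation (rather than rounding) is an assumption inferred from the fact that it matches the reported figures.

```python
# Lines of code per size category, as listed in "Unit Size Overall".
lines_per_category = {
    "101+": 706,
    "51-100": 615,
    "21-50": 2501,
    "11-20": 1290,
    "1-10": 763,
}


def category_shares(lines_by_category):
    """Return each category's share of the total, as a truncated integer percentage."""
    total = sum(lines_by_category.values())
    # Integer division truncates; this is an assumption that happens to
    # reproduce the reported values, which sum to 97% rather than 100%.
    return {label: 100 * loc // total for label, loc in lines_by_category.items()}


shares = category_shares(lines_per_category)
```

With these inputs the counts sum to 5,875 lines, and `shares` holds 12, 10, 42, 21 and 12 for the five categories, from largest to smallest unit size.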
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 12% | 10% | 42% | 21% | 12%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
E2E_TOD | 24% | 17% | 36% | 10% | 10%
DST | 14% | 15% | 44% | 16% | 8%
data/multiwoz | 0% | 6% | 53% | 20% | 18%
data/pre-training_corpora | 0% | 0% | 31% | 57% | 10%
Pretraining | 0% | 0% | 82% | 0% | 17%
IC | 0% | 0% | 51% | 21% | 27%
E2E_TOD/modelling | 0% | 0% | 42% | 45% | 12%
DST/modelling | 0% | 0% | 42% | 45% | 12%
IC/modelling | 0% | 0% | 50% | 23% | 26%
Pretraining/modelling | 0% | 0% | 0% | 54% | 45%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def clean_slot_values() | E2E_TOD/clean_dataset.py | 229 | 125 | 4
def batch_generate() | E2E_TOD/inference_utlis.py | 142 | 36 | 7
(not shown) | (not shown) | 121 | 74 | 7
def __init__() | E2E_TOD/dataclass.py | 112 | 10 | 10
def __init__() | DST/dataclass.py | 102 | 10 | 9
def flatten_data() | E2E_TOD/dataclass.py | 82 | 8 | 2
def parse_one_eva_instance() | E2E_TOD/dataclass.py | 82 | 8 | 6
def queryJsons() | E2E_TOD/db_ops.py | 74 | 43 | 5
def queryJsons() | data/multiwoz/utlis/db_ops.py | 74 | 43 | 5
def domain_eval() | E2E_TOD/eval.py | 70 | 31 | 3
def e2e_batch_generate() | E2E_TOD/e2e_inference_utlis.py | 65 | 18 | 4
def flatten_data() | DST/dataclass.py | 60 | 7 | 2
def compute_jacc() | DST/compute_joint_acc.py | 55 | 23 | 3
def _get_metric_results() | E2E_TOD/eval.py | 53 | 13 | 4
def get_optimizers() | DST/learn.py | 49 | 12 | 5
def clean_text() | E2E_TOD/clean_dataset.py | 48 | 1 | 2
def bspan_to_constraint_dict() | E2E_TOD/reader.py | 46 | 24 | 3
def bspan_to_constraint_dict() | data/multiwoz/utlis/reader.py | 46 | 24 | 3
(not shown) | (not shown) | 45 | 6 | 7
def parse_one_dataset() | Pretraining/dataclass.py | 45 | 12 | 7
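As a rough illustration of where "# lines" and "# params" figures like these can come from, the sketch below measures Python functions with the standard `ast` module. The helper `unit_sizes` is hypothetical, and its counting rules are assumptions: it counts every source line from `def` to the end of the body (including blank lines and comments) and counts `self`, `*args` and `**kwargs` as parameters. The report's tool may count differently, and the McCabe index is not computed here.

```python
import ast


def unit_sizes(source: str):
    """Yield (name, line_count, param_count) for every function in the source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Span from the "def" line to the last line of the body (Python 3.8+).
            n_lines = node.end_lineno - node.lineno + 1
            args = node.args
            n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            # *args and **kwargs each count as one parameter in this sketch.
            n_params += (args.vararg is not None) + (args.kwarg is not None)
            yield node.name, n_lines, n_params


example = '''
def add(a, b):
    total = a + b
    return total
'''

results = list(unit_sizes(example))
```

Here `results` contains one entry for `add`, spanning 3 lines with 2 parameters.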