facebookresearch / cc_net
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show the distribution of the size of units of code (methods, functions, and so on).
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (< 20 lines). Long units may become "bloaters": code that has grown to such proportions that it is hard to work with.
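The size categories can be sketched as a small lookup. This is only an illustration: the function name `classify_unit` and the constants are hypothetical, and the bucket boundaries are taken from the legend used in this report (1-10, 11-20, 21-50, 51-100, 101+).

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets, as in the report's legend.
# Anything above the last bound falls into the "very long" bucket.
UPPER_BOUNDS = [10, 20, 50, 100]
LABELS = ["very small", "small", "medium size", "long", "very long"]


def classify_unit(lines_of_code: int) -> str:
    """Return the size category for a unit with the given number of lines."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    # bisect_left returns the index of the first bound >= lines_of_code,
    # so a unit of exactly 20 lines is still "small".
    return LABELS[bisect_left(UPPER_BOUNDS, lines_of_code)]


if __name__ == "__main__":
    for size in (8, 20, 46, 75, 120):
        print(size, classify_unit(size))
```

For instance, a 75-line unit lands in the "long" category and a 46-line unit in "medium size".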
Unit Size Overall
  • There are 305 units with 2,633 lines of code in units (69.6% of code).
    • 0 very long units (0 lines of code)
    • 3 long units (205 lines of code)
    • 20 medium size units (608 lines of code)
    • 53 small units (770 lines of code)
    • 229 very small units (1,050 lines of code)
Share of lines of code per size category:
101+ | 51-100 | 21-50 | 11-20 | 1-10
0% | 7% | 23% | 29% | 39%
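The percentages can be reproduced from the line counts listed above. The sketch below is not the reporting tool's code; the names `LINES_PER_CATEGORY` and `share_of_lines` are hypothetical. The exact shares come out slightly higher than the displayed figures, which appears consistent with the report truncating to whole percents rather than rounding (an assumption).

```python
# Lines of code per size category, copied from the overall summary.
LINES_PER_CATEGORY = {
    "101+": 0,
    "51-100": 205,
    "21-50": 608,
    "11-20": 770,
    "1-10": 1050,
}


def share_of_lines(lines_per_category):
    """Return each category's share of the total lines of code, in percent."""
    total = sum(lines_per_category.values())
    if total == 0:
        return {category: 0.0 for category in lines_per_category}
    return {
        category: 100.0 * lines / total
        for category, lines in lines_per_category.items()
    }


shares = share_of_lines(LINES_PER_CATEGORY)

if __name__ == "__main__":
    for category, pct in shares.items():
        # Print the exact share next to the truncated whole percent.
        print(f"{category:>6}: {pct:5.1f}% (truncated: {int(pct)}%)")
```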
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 0% | 7% | 23% | 29% | 39%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
cc_net | 0% | 8% | 22% | 29% | 40%
cc_net/tools | 0% | 0% | 33% | 29% | 37%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def _mine_shard() | cc_net/mine.py | 75 | 11 | 4
def normalize_spacing_for_tok() | cc_net/text_normalizer.py | 66 | 1 | 2
def get_parser() | cc_net/jsonql.py | 64 | 2 | 0
def mine() | cc_net/mine.py | 46 | 19 | 1
def regroup() | cc_net/mine.py | 46 | 14 | 2
def _validate_test() | cc_net/mine.py | 45 | 8 | 3
def describe() | cc_net/jsonql.py | 40 | 26 | 4
def _dl_shard() | cc_net/tools/dl_cc_100.py | 38 | 11 | 2
def display_stats() | cc_net/jsonql.py | 34 | 14 | 5
def merge() | cc_net/jsonql.py | 30 | 2 | 4
def open_read() | cc_net/jsonql.py | 30 | 14 | 1
def compare_load() | cc_net/flat_hash_set.py | 30 | 6 | 1
def do() | cc_net/tools/expand_corpus.py | 29 | 12 | 2
def parse_doc() | cc_net/process_wet_file.py | 28 | 7 | 2
def move_segments() | cc_net/mine.py | 26 | 6 | 2
def log_error() | cc_net/jsonql.py | 25 | 5 | 2
def finalize_doc() | cc_net/dedup.py | 25 | 7 | 3
def do() | cc_net/split_by_lang.py | 25 | 12 | 2
def main() | cc_net/mine.py | 24 | 11 | 2
def get_cirrus_urls() | cc_net/get_wiki_cirrus.py | 22 | 5 | 1
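A comparable ranking could be approximated for Python files with the standard `ast` module. This is a rough sketch, not the method used to produce the table above: the helper `unit_sizes` is hypothetical, it counts the full line span of each function (including blank lines and comments, which a lines-of-code metric may exclude), and it counts every declared parameter, so its numbers may differ from the reported ones.

```python
import ast


def unit_sizes(source: str):
    """Return (name, line_span, n_params) per function, longest first."""
    tree = ast.parse(source)
    units = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on Python 3.8 and later.
            span = node.end_lineno - node.lineno + 1
            args = node.args
            n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            # *args and **kwargs each count as one parameter here.
            n_params += (args.vararg is not None) + (args.kwarg is not None)
            units.append((node.name, span, n_params))
    return sorted(units, key=lambda unit: unit[1], reverse=True)


# Small made-up sample; not taken from the cc_net sources.
SAMPLE = '''
def short(a):
    return a + 1

def longer(a, b, *rest):
    total = a + b
    for value in rest:
        total += value
    return total
'''

if __name__ == "__main__":
    for name, span, n_params in unit_sizes(SAMPLE):
        print(name, span, n_params)
```

To scan a repository, the same function could be applied to the text of each `.py` file and the results merged before sorting.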