tensorflow / datasets
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show how the sizes of units of code (methods, functions...) are distributed.
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (< 20 lines). Long units may become "bloaters": code that has grown so large that it is hard to work with.
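The classification can be sketched in a few lines of Python. This is an illustrative sketch only, not the report tool's implementation: the boundaries follow the five ranges shown in the report's legend (1-10, 11-20, 21-50, 51-100, 101+), and the names `BUCKETS` and `classify_unit` are hypothetical.

```python
# Inclusive upper bounds for each size category, taken from the report's legend.
# The names BUCKETS and classify_unit are hypothetical, chosen for illustration.
BUCKETS = [
    (10, "very small"),
    (20, "small"),
    (50, "medium size"),
    (100, "long"),
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size category for a unit with the given number of lines."""
    if lines_of_code < 1:
        raise ValueError("a unit has at least one line of code")
    for upper_bound, label in BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    # Anything above 100 lines falls into the last category.
    return "very long"
```

For example, `classify_unit(20)` falls in the small category and `classify_unit(21)` in the medium size one, so every size maps to exactly one category.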
Unit Size Overall
  • There are 2,817 units with 34,142 lines of code in units (64.9% of code).
    • 7 very long units (3,000 lines of code)
    • 33 long units (2,107 lines of code)
    • 368 medium size units (10,922 lines of code)
    • 692 small units (10,045 lines of code)
    • 1,717 very small units (8,068 lines of code)
101+: 8% | 51-100: 6% | 21-50: 31% | 11-20: 29% | 1-10: 23%
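The percentages can be reproduced from the line counts listed under "Unit Size Overall". The sketch below is a minimal illustration, with hypothetical names (`LINES_PER_CATEGORY`, `category_shares`). It assumes each share is that category's lines divided by the total lines in units. The reported figures match these shares when truncated to whole percentages rather than rounded, which would also explain why they add up to 97%. That is an observation from the numbers, not documented behavior of the tool.

```python
# Lines of code per size category, copied from the "Unit Size Overall" list.
# The names below are hypothetical and used only for this illustration.
LINES_PER_CATEGORY = {
    "101+": 3000,
    "51-100": 2107,
    "21-50": 10922,
    "11-20": 10045,
    "1-10": 8068,
}


def category_shares(lines_per_category):
    """Return each category's share of the total lines, in percent (float)."""
    total = sum(lines_per_category.values())
    return {
        name: 100.0 * count / total
        for name, count in lines_per_category.items()
    }


total_lines = sum(LINES_PER_CATEGORY.values())
shares = category_shares(LINES_PER_CATEGORY)

# Truncate to whole percentages (assumption: this mirrors the report's display).
truncated = {name: int(share) for name, share in shares.items()}
```

The category totals add up to the 34,142 lines of code in units stated above.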
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 8% | 6% | 31% | 29% | 23%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
tensorflow_datasets/structured | 50% | 7% | 19% | 14% | 7%
tensorflow_datasets/text | 18% | 8% | 31% | 30% | 11%
tensorflow_datasets/scripts | 15% | 0% | 14% | 40% | 29%
tensorflow_datasets/vision_language | 17% | 17% | 49% | 7% | 9%
tensorflow_datasets/audio | 6% | 3% | 33% | 39% | 17%
tensorflow_datasets/image_classification | 0% | 6% | 34% | 44% | 14%
tensorflow_datasets/core | 0% | 2% | 17% | 26% | 53%
tensorflow_datasets/object_detection | 0% | 13% | 59% | 17% | 9%
tensorflow_datasets/text_simplification | 0% | 66% | 23% | 5% | 4%
tensorflow_datasets/question_answering | 0% | 15% | 51% | 22% | 10%
tensorflow_datasets/robomimic | 0% | 47% | 0% | 32% | 19%
tensorflow_datasets/testing | 0% | 2% | 30% | 23% | 42%
tensorflow_datasets/translate | 0% | 7% | 29% | 43% | 18%
tensorflow_datasets/d4rl | 0% | 16% | 43% | 16% | 23%
tensorflow_datasets/image | 0% | 0% | 63% | 25% | 11%
tensorflow_datasets/summarization | 0% | 0% | 37% | 42% | 19%
tensorflow_datasets/rl_unplugged | 0% | 0% | 47% | 11% | 41%
tensorflow_datasets/video | 0% | 0% | 48% | 35% | 15%
tensorflow_datasets/graphs | 0% | 0% | 51% | 37% | 11%
tensorflow_datasets/time_series | 0% | 0% | 73% | 0% | 26%
tensorflow_datasets/ranking | 0% | 0% | 28% | 38% | 33%
tensorflow_datasets/rlds | 0% | 0% | 15% | 46% | 37%
tensorflow_datasets | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def _info() | tensorflow_datasets/structured/covid19/covid19.py | 1410 | 1 | 1
def _split_generators() | tensorflow_datasets/text/gem/gem.py | 535 | 26 | 2
def _generate_examples() | tensorflow_datasets/text/gem/gem.py | 500 | 106 | 5
def _info() | tensorflow_datasets/structured/kddcup99/kddcup99.py | 219 | 1 | 1
def register_subparser() | tensorflow_datasets/scripts/cli/build.py | 122 | 2 | 1
def _generate_examples() | tensorflow_datasets/vision_language/wit_kaggle/wit_kaggle.py | 108 | 22 | 5
def _build_pcollection() | tensorflow_datasets/audio/nsynth.py | 106 | 6 | 5
def _info() | tensorflow_datasets/structured/movielens.py | 95 | 4 | 1
def _generate_examples() | tensorflow_datasets/question_answering/trivia_qa.py | 91 | 14 | 4
def download_and_prepare() | tensorflow_datasets/core/dataset_builder.py | 86 | 14 | 4
def _get_page_content() | tensorflow_datasets/text/c4.py | 84 | 20 | 4
def _generate_examples() | tensorflow_datasets/text_simplification/wiki_auto/wiki_auto.py | 80 | 17 | 3
def _split_generators() | tensorflow_datasets/text/c4.py | 78 | 12 | 3
def _get_features() | tensorflow_datasets/robomimic/robomimic_ph/robomimic_ph.py | 74 | 6 | 1
def _info() | tensorflow_datasets/text_simplification/wiki_auto/wiki_auto.py | 73 | 5 | 1
def parse_100k_ratings_data() | tensorflow_datasets/structured/movielens_parsing.py | 71 | 9 | 1
def _split_generators() | tensorflow_datasets/text/glue.py | 68 | 9 | 2
def _generate_examples() | tensorflow_datasets/object_detection/wider_face.py | 66 | 10 | 3
def _generate_examples() | tensorflow_datasets/object_detection/coco.py | 66 | 8 | 6
def _sample_negative_patches() | tensorflow_datasets/image_classification/cbis_ddsm.py | 66 | 10 | 9
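
A listing of longest units like the one above can be approximated for Python source with the standard library's `ast` module. The sketch below is an assumption-laden illustration, not the method used to produce this report. It counts every line from the `def` line to the end of the body (blank lines and comments included), counts all declared parameters including `self`, and does not compute the McCabe index. The names `unit_sizes` and `longest_units` are hypothetical. It requires Python 3.8 or later for `end_lineno`.

```python
import ast


def unit_sizes(source: str):
    """Yield (name, line_count, param_count) for every function in `source`.

    Assumption: line_count spans from the `def` line to the last line of the
    body, blank lines and comments included. The report's tool may count
    lines differently.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            line_count = node.end_lineno - node.lineno + 1
            args = node.args
            param_count = (
                len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            )
            if args.vararg is not None:  # *args counts as one parameter
                param_count += 1
            if args.kwarg is not None:  # **kwargs counts as one parameter
                param_count += 1
            yield node.name, line_count, param_count


def longest_units(source: str, top: int = 20):
    """Return the `top` longest units, longest first."""
    return sorted(unit_sizes(source), key=lambda unit: unit[1], reverse=True)[:top]


# A small made-up module used only to exercise the functions above.
EXAMPLE = '''
def short(a):
    return a

def longer(a, b, *rest):
    total = a + b
    for value in rest:
        total += value
    return total
'''

result = longest_units(EXAMPLE, top=2)
```

Running this over each `.py` file in a repository and merging the results would give a rough ranking. The counts are likely to differ from the report's wherever the tool excludes blank lines, comments, or docstrings.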