tensorflow / data-validation
Unit Size

The distribution of unit sizes (measured in lines of code).

Intro
  • Unit size measurements show how the sizes of code units (methods, functions, ...) are distributed.
  • Units are classified into five categories based on their size in lines of code: 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • Aim to keep units small (20 lines or fewer). Long units may become "bloaters": code that has grown so large that it is hard to work with.
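The size classification can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis tool's actual implementation: the function name `size_category` is hypothetical, and the bucket boundaries are taken from the report's legend (1-10, 11-20, 21-50, 51-100, 101+).

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets, as shown in the report legend.
# Anything above the last bound falls into the open-ended "101+" bucket.
_UPPER_BOUNDS = [10, 20, 50, 100]
_LABELS = ["1-10", "11-20", "21-50", "51-100", "101+"]


def size_category(lines_of_code: int) -> str:
    """Return the size bucket label for a unit (hypothetical helper)."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    # bisect_left finds the first bound that is >= lines_of_code,
    # which is exactly the index of the matching bucket.
    return _LABELS[bisect_left(_UPPER_BOUNDS, lines_of_code)]


if __name__ == "__main__":
    # Line counts taken from the "Longest Units" table in this report.
    for loc in (191, 96, 56):
        print(loc, size_category(loc))
```

Keeping the boundaries in a sorted list means a change to the thresholds only touches the two constants, not the lookup logic.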
Unit Size Overall
  • There are 814 units with 9,621 lines of code in units (66.6% of code).
    • 3 very long units (491 lines of code)
    • 19 long units (1,258 lines of code)
    • 106 medium size units (3,337 lines of code)
    • 135 small units (1,965 lines of code)
    • 551 very small units (2,570 lines of code)
5% | 13% | 34% | 20% | 26%
Legend (left to right): 101+ | 51-100 | 21-50 | 11-20 | 1-10
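The percentage bar above can be reproduced from the per-bucket line counts listed in this section. The sketch below assumes each share is the bucket's lines of code divided by the total lines in units, truncated to a whole percent. Truncation is an inference from the displayed values (they sum to 98%), not something the report states, and the variable names are illustrative.

```python
# Lines of code per size bucket, copied from the "Unit Size Overall" list.
loc_per_bucket = {
    "101+": 491,
    "51-100": 1258,
    "21-50": 3337,
    "11-20": 1965,
    "1-10": 2570,
}

total_loc = sum(loc_per_bucket.values())

# Integer division truncates toward zero; this is an assumed display rule.
shares = {label: 100 * loc // total_loc for label, loc in loc_per_bucket.items()}

if __name__ == "__main__":
    print(total_loc)
    print(" | ".join(f"{pct}%" for pct in shares.values()))
```

The total matches the 9,621 lines of code in units reported above.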
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
cc | 12% | 18% | 35% | 17% | 15%
py | 0% | 9% | 34% | 22% | 33%
h | 0% | 0% | 0% | 0% | 100%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
tensorflow_data_validation/anomalies | 12% | 17% | 36% | 17% | 16%
tensorflow_data_validation/statistics | 0% | 11% | 36% | 24% | 27%
tensorflow_data_validation/utils | 0% | 9% | 33% | 16% | 41%
tensorflow_data_validation/pywrap | 0% | 100% | 0% | 0% | 0%
tensorflow_data_validation/skew | 0% | 0% | 34% | 56% | 9%
tensorflow_data_validation/arrow | 0% | 0% | 35% | 24% | 39%
tensorflow_data_validation/tools | 0% | 0% | 69% | 0% | 30%
tensorflow_data_validation/api | 0% | 0% | 12% | 6% | 80%
tensorflow_data_validation/coders | 0% | 0% | 0% | 67% | 32%
ROOT | 0% | 0% | 0% | 23% | 76%
tensorflow_data_validation | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
void Schema::UpdateFeatureInternal() | tensorflow_data_validation/anomalies/schema.cc | 191 | 43 | 5
std::vector Schema::UpdateFeatureSelf() | tensorflow_data_validation/anomalies/schema.cc | 176 | 39 | 1
std::vector UpdateBoolDomain() | tensorflow_data_validation/anomalies/bool_domain_util.cc | 124 | 21 | 2
def expand() | tensorflow_data_validation/statistics/generators/top_k_uniques_stats_generator.py | 96 | 7 | 2
def get_schema_dataframe() | tensorflow_data_validation/utils/display_util.py | 84 | 32 | 1
tensorflow::Status ValidateFeatureStatisticsWithSerializedInputs() | tensorflow_data_validation/anomalies/feature_statistics_validator.cc | 83 | 15 | 10
def _make_numeric_stats_proto() | tensorflow_data_validation/statistics/generators/basic_stats_generator.py | 73 | 11 | 5
std::vector UpdateNumExamplesComparatorDirect() | tensorflow_data_validation/anomalies/dataset_constraints_util.cc | 73 | 13 | 3
tensorflow::Status Schema::UpdateFeature() | tensorflow_data_validation/anomalies/schema.cc | 68 | 12 | 5
std::vector UpdateImageDomain() | tensorflow_data_validation/anomalies/image_domain_util.cc | 66 | 8 | 2
std::vector UpdateNaturalLanguageDomain() | tensorflow_data_validation/anomalies/natural_language_domain_util.cc | 66 | 16 | 2
tensorflow::Status ValidateFeatureStatistics() | tensorflow_data_validation/anomalies/feature_statistics_validator.cc | 66 | 6 | 10
void VerifyTokenConstraints() | tensorflow_data_validation/anomalies/natural_language_domain_util.cc | 63 | 9 | 4
def __init__() | tensorflow_data_validation/statistics/stats_options.py | 62 | 5 | 29
def get_generators() | tensorflow_data_validation/statistics/stats_impl.py | 61 | 22 | 2
def _calculate_mi() | tensorflow_data_validation/statistics/generators/sklearn_mutual_information.py | 60 | 18 | 5
def _mi_for_arrays() | tensorflow_data_validation/utils/mutual_information_util.py | 60 | 25 | 8
UpdateSummary UpdateStringDomain() | tensorflow_data_validation/anomalies/string_domain_util.cc | 59 | 6 | 4
UpdateSummary UpdateIntDomain() | tensorflow_data_validation/anomalies/int_domain_util.cc | 57 | 12 | 2
def _make_feature_stats_proto() | tensorflow_data_validation/statistics/generators/basic_stats_generator.py | 56 | 13 | 14