tensorflow / data-validation
File Size

The distribution of file sizes, measured in lines of code.

Intro
  • File size measurements show how the code is distributed across files of different sizes.
  • Files are classified in five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such gargantuan proportions that it is hard to work with.
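The size categories listed above can be sketched as a small lookup function. This is only an illustration of the thresholds stated in the list; the function name `classify_file_size` is hypothetical and is not taken from the tool that produced this report.

```python
def classify_file_size(lines_of_code: int) -> str:
    """Map a line count to one of the five size categories in this report."""
    if lines_of_code < 1:
        # The report's smallest category starts at 1 line of code.
        raise ValueError("lines_of_code must be at least 1")
    if lines_of_code <= 100:
        return "very small"
    if lines_of_code <= 200:
        return "small"
    if lines_of_code <= 500:
        return "medium size"
    if lines_of_code <= 1000:
        return "long"
    return "very long"


# Line counts taken from the tables in this report.
print(classify_file_size(1125))  # schema.cc
print(classify_file_size(316))   # validation_api.py
print(classify_file_size(86))    # variance_util.py
```

The boundaries are inclusive on the upper end, so a file with exactly 100 lines is still "very small" and one with exactly 1000 lines is still "long".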
File Size Overall
  • There are 116 files with 14,451 lines of code.
    • 1 very long file (1,125 lines of code)
    • 4 long files (2,420 lines of code)
    • 16 medium size files (4,900 lines of code)
    • 26 small files (3,791 lines of code)
    • 69 very small files (2,215 lines of code)
1001+: 7% | 501-1000: 16% | 201-500: 33% | 101-200: 26% | 1-100: 15%
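The percentages in the bar above can be reproduced from the line counts in the overall list. The sketch below assumes each share is the category's lines of code divided by the total and then truncated to a whole percent; that truncation is an inference from the displayed values (which sum to 97%, not 100%), not a documented behavior of the report generator.

```python
# Lines of code per size category, copied from the "File Size Overall" list.
LINES_PER_CATEGORY = {
    "1001+": 1125,
    "501-1000": 2420,
    "201-500": 4900,
    "101-200": 3791,
    "1-100": 2215,
}


def category_shares(lines_per_category: dict) -> dict:
    """Return each category's share of total lines, truncated to a whole percent."""
    total = sum(lines_per_category.values())
    # Integer floor division truncates, e.g. 7.78% becomes 7 (assumed behavior).
    return {name: (100 * loc) // total for name, loc in lines_per_category.items()}


total_lines = sum(LINES_PER_CATEGORY.values())
shares = category_shares(LINES_PER_CATEGORY)
print(total_lines)
print(shares)
```

With the counts from this report, the total comes to 14,451 lines and the truncated shares match the 7%, 16%, 33%, 26% and 15% shown in the bar.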


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
cc | 24% | 11% | 34% | 19% | 10%
py | 0% | 21% | 38% | 29% | 10%
h | 0% | 0% | 0% | 40% | 59%
proto | 0% | 0% | 0% | 0% | 100%
bzl | 0% | 0% | 0% | 0% | 100%
yaml | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
tensorflow_data_validation/anomalies | 20% | 9% | 29% | 22% | 18%
tensorflow_data_validation/statistics | 0% | 36% | 35% | 23% | 4%
tensorflow_data_validation/utils | 0% | 0% | 42% | 44% | 12%
tensorflow_data_validation/api | 0% | 0% | 73% | 0% | 26%
tensorflow_data_validation/skew | 0% | 0% | 78% | 0% | 21%
tensorflow_data_validation/arrow | 0% | 0% | 0% | 81% | 18%
ROOT | 0% | 0% | 0% | 87% | 12%
tensorflow_data_validation | 0% | 0% | 0% | 0% | 100%
tensorflow_data_validation/pywrap | 0% | 0% | 0% | 0% | 100%
tensorflow_data_validation/tools | 0% | 0% | 0% | 0% | 100%
g3doc/api_docs | 0% | 0% | 0% | 0% | 100%
tensorflow_data_validation/coders | 0% | 0% | 0% | 0% | 100%
g3doc | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | # lines | # units
schema.cc in tensorflow_data_validation/anomalies | 1125 | 34
basic_stats_generator.py in tensorflow_data_validation/statistics/generators | 740 | 37
lift_stats_generator.py in tensorflow_data_validation/statistics/generators | 613 | 34
feature_util.cc in tensorflow_data_validation/anomalies | 552 | 27
stats_impl.py in tensorflow_data_validation/statistics | 515 | 38
statistics_view.cc in tensorflow_data_validation/anomalies | 477 | 35
natural_language_stats_generator.py in tensorflow_data_validation/statistics/generators | 464 | 23
schema_anomalies.cc in tensorflow_data_validation/anomalies | 396 | 23
mutual_information_util.py in tensorflow_data_validation/utils | 349 | 19
display_util.py in tensorflow_data_validation/utils | 347 | 16
mutual_information.py in tensorflow_data_validation/statistics/generators | 335 | 19
validation_api.py in tensorflow_data_validation/api | 316 | 20
stats_options.py in tensorflow_data_validation/statistics | 300 | 36
stats_util.py in tensorflow_data_validation/utils | 293 | 37
partitioned_stats_generator.py in tensorflow_data_validation/statistics/generators | 259 | 24
bool_domain_util.cc in tensorflow_data_validation/anomalies | 259 | 9
feature_statistics_validator.cc in tensorflow_data_validation/anomalies | 258 | 6
sklearn_mutual_information.py in tensorflow_data_validation/statistics/generators | 223 | 7
metrics.cc in tensorflow_data_validation/anomalies | 215 | 12
top_k_uniques_stats_generator.py in tensorflow_data_validation/statistics/generators | 205 | 6
feature_skew_detector.py in tensorflow_data_validation/skew | 204 | 11
time_stats_generator.py in tensorflow_data_validation/statistics/generators | 197 | 11
natural_language_domain_util.cc in tensorflow_data_validation/anomalies | 186 | 3
top_k_uniques_sketch_stats_generator.py in tensorflow_data_validation/statistics/generators | 179 | 11
arrow_util.py in tensorflow_data_validation/arrow | 176 | 12
stats_gen_lib.py in tensorflow_data_validation/utils | 173 | 5
string_domain_util.cc in tensorflow_data_validation/anomalies | 172 | 8
schema_util.py in tensorflow_data_validation/utils | 170 | 13
validation_lib.py in tensorflow_data_validation/utils | 164 | 6
float_domain_util.cc in tensorflow_data_validation/anomalies | 164 | 4
image_stats_generator.py in tensorflow_data_validation/statistics/generators | 159 | 14
top_k_uniques_stats_util.py in tensorflow_data_validation/utils | 157 | 7
cross_feature_stats_generator.py in tensorflow_data_validation/statistics/generators | 155 | 9
int_domain_util.cc in tensorflow_data_validation/anomalies | 148 | 3
setup.py in root | 146 | 12
slicing_util.py in tensorflow_data_validation/utils | 143 | 10
top_k_uniques_combiner_stats_generator.py in tensorflow_data_validation/statistics/generators | 141 | 10
quantiles_util.py in tensorflow_data_validation/utils | 129 | 5
stats_generator.py in tensorflow_data_validation/statistics/generators | 128 | 33
dataset_constraints_util.cc in tensorflow_data_validation/anomalies | 128 | 3
natural_language_domain_inferring_stats_generator.py in tensorflow_data_validation/statistics/generators | 122 | 10
path.cc in tensorflow_data_validation/anomalies | 119 | 13
schema.h in tensorflow_data_validation/anomalies | 118 | -
statistics_view.h in tensorflow_data_validation/anomalies | 108 | 4
sparse_feature_stats_generator.py in tensorflow_data_validation/statistics/generators | 104 | 5
feature_partition_util.py in tensorflow_data_validation/utils | 103 | 8
schema_anomalies.h in tensorflow_data_validation/anomalies | 102 | 4
statistics_view_test_util.cc in tensorflow_data_validation/anomalies | 98 | 4
image_domain_util.cc in tensorflow_data_validation/anomalies | 89 | 1
variance_util.py in tensorflow_data_validation/utils | 86 | 8
Files With Most Units (Top 20)
File | # lines | # units
stats_impl.py in tensorflow_data_validation/statistics | 515 | 38
basic_stats_generator.py in tensorflow_data_validation/statistics/generators | 740 | 37
stats_util.py in tensorflow_data_validation/utils | 293 | 37
stats_options.py in tensorflow_data_validation/statistics | 300 | 36
statistics_view.cc in tensorflow_data_validation/anomalies | 477 | 35
lift_stats_generator.py in tensorflow_data_validation/statistics/generators | 613 | 34
schema.cc in tensorflow_data_validation/anomalies | 1125 | 34
stats_generator.py in tensorflow_data_validation/statistics/generators | 128 | 33
feature_util.cc in tensorflow_data_validation/anomalies | 552 | 27
partitioned_stats_generator.py in tensorflow_data_validation/statistics/generators | 259 | 24
natural_language_stats_generator.py in tensorflow_data_validation/statistics/generators | 464 | 23
schema_anomalies.cc in tensorflow_data_validation/anomalies | 396 | 23
validation_api.py in tensorflow_data_validation/api | 316 | 20
mutual_information.py in tensorflow_data_validation/statistics/generators | 335 | 19
mutual_information_util.py in tensorflow_data_validation/utils | 349 | 19
types.py in tensorflow_data_validation | 85 | 18
display_util.py in tensorflow_data_validation/utils | 347 | 16
image_stats_generator.py in tensorflow_data_validation/statistics/generators | 159 | 14
schema_util.py in tensorflow_data_validation/utils | 170 | 13
path.cc in tensorflow_data_validation/anomalies | 119 | 13
Files With Long Lines (Top 2)

There are 2 files with lines longer than 120 characters. In total, there are 2 long lines.

File | # lines | # units | # long lines
validation_api.py in tensorflow_data_validation/api | 316 | 20 | 1
__init__.py in tensorflow_data_validation | 38 | - | 1
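A long-line count like the one in this table can be approximated with a few lines of Python. This is a minimal sketch that assumes a "long line" is any line with more than 120 characters, not counting the line terminator; how the report treats tabs or trailing whitespace is not stated, and the function name `count_long_lines` is hypothetical.

```python
def count_long_lines(text: str, max_length: int = 120) -> int:
    """Count lines whose length exceeds max_length characters."""
    # splitlines() drops the line terminators, so they do not count toward length.
    return sum(1 for line in text.splitlines() if len(line) > max_length)


# Small self-contained example: one 121-character line and one 120-character line.
sample = "a" * 121 + "\n" + "b" * 120 + "\n"
print(count_long_lines(sample))
```

To check a real file, its contents could be read with `pathlib.Path(path).read_text()` and passed to the same function.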