tensorflow / data-validation
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
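The procedure described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration and not the tool's actual implementation: the function names `clean` and `find_duplicate_windows` are hypothetical, the cleaning rules only cover Python-style `#` comments and import statements, and stripping surrounding whitespace before comparing lines is an assumption.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source):
    """Drop empty lines, '#' comments, and import statements (simplified)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: whitespace is ignored when comparing
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files, min_lines=MIN_LINES):
    """Map each repeated block of `min_lines` cleaned lines to its locations.

    `files` maps a file name to its source text. Each location is a
    (file name, index into the cleaned lines) pair.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_lines + 1):
            window = tuple(lines[start:start + min_lines])
            seen[window].append((name, start))
    return {block: locs for block, locs in seen.items() if len(locs) > 1}
```

A real detector would also merge overlapping windows into maximal duplicates and map positions back to original line numbers; this sketch only reports fixed-size windows.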
Duplication Overall
  • 2% duplication:
    • 12,486 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 339 duplicated lines
  • 30 duplicates
system: 2% (339 lines)
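As a quick check of the figures above, the overall percentage is the number of duplicated lines divided by the number of cleaned lines. The sketch below uses only the two counts from this report; the helper name `duplication_percent` is hypothetical, and the guess that the report truncates the value to a whole percent is an assumption.

```python
def duplication_percent(duplicated_lines, cleaned_lines):
    """Return duplicated lines as a percentage of cleaned lines."""
    return 100.0 * duplicated_lines / cleaned_lines


# Counts taken from the report above.
percent = duplication_percent(339, 12_486)

# Assumption: the report shows the value truncated to a whole percent.
shown = int(percent)

# The intro recommends staying under 5% duplication.
within_target = percent < 5
```

The unrounded value is a little above 2.7%, which is still well under the 5% guideline.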
Duplication per Extension
  • py: 2% (209 lines)
  • cc: 1% (63 lines)
  • proto: 32% (60 lines)
  • h: 1% (7 lines)
Duplication per Component (primary)
  • tensorflow_data_validation/anomalies: 2% (116 lines)
  • tensorflow_data_validation/statistics: 2% (110 lines)
  • tensorflow_data_validation/utils: 4% (99 lines)
  • tensorflow_data_validation/skew: 5% (14 lines)
  • g3doc: 0% (0 lines)
  • g3doc/api_docs: 0% (0 lines)
  • tensorflow_data_validation/pywrap: 0% (0 lines)
  • tensorflow_data_validation/arrow: 0% (0 lines)
  • tensorflow_data_validation: 0% (0 lines)
  • tensorflow_data_validation/api: 0% (0 lines)
  • tensorflow_data_validation/tools: 0% (0 lines)
  • tensorflow_data_validation/coders: 0% (0 lines)
  • ROOT: 0% (0 lines)

Duplication Between Components (50+ lines)

  • tensorflow_data_validation/anomalies -- tensorflow_data_validation/skew: 54 lines

Longest Duplicates
The list of 20 longest duplicates.
Each entry gives Size x #, followed by the folder, file, and lines of each instance.
  • 15 x 2
    • tensorflow_data_validation/anomalies/proto, feature_statistics_to_proto.proto, 2:18 (23%)
    • tensorflow_data_validation/anomalies/proto, validation_config.proto, 2:18 (45%)
  • 14 x 2
    • tensorflow_data_validation/anomalies/proto, validation_metadata.proto, 2:16 (40%)
    • tensorflow_data_validation/skew/protos, feature_skew_results.proto, 2:15 (25%)
  • 14 x 2
    • tensorflow_data_validation/anomalies/proto, validation_config.proto, 1:14 (42%)
    • tensorflow_data_validation/anomalies/proto, validation_metadata.proto, 1:14 (40%)
  • 13 x 2
    • tensorflow_data_validation/anomalies/proto, feature_statistics_to_proto.proto, 2:14 (20%)
    • tensorflow_data_validation/anomalies/proto, validation_metadata.proto, 2:14 (37%)
  • 13 x 2
    • tensorflow_data_validation/anomalies/proto, validation_config.proto, 2:14 (39%)
    • tensorflow_data_validation/skew/protos, feature_skew_results.proto, 2:14 (23%)
  • 13 x 2
    • tensorflow_data_validation/anomalies/proto, feature_statistics_to_proto.proto, 2:14 (20%)
    • tensorflow_data_validation/skew/protos, feature_skew_results.proto, 2:14 (23%)
  • 10 x 2
    • tensorflow_data_validation/anomalies, schema_anomalies.cc, 57:66 (3%)
    • tensorflow_data_validation/anomalies, schema_util.cc, 24:33 (58%)
  • 9 x 2
    • tensorflow_data_validation/utils, top_k_uniques_stats_util.py, 51:59 (6%)
    • tensorflow_data_validation/utils, top_k_uniques_stats_util.py, 105:113 (6%)
  • 9 x 2
    • tensorflow_data_validation/statistics/generators, image_stats_generator.py, 265:275 (6%)
    • tensorflow_data_validation/statistics/generators, natural_language_domain_inferring_sta..., 189:199 (8%)
  • 8 x 2
    • tensorflow_data_validation/utils, mutual_information_util.py, 99:106 (2%)
    • tensorflow_data_validation/utils, mutual_information_util.py, 170:177 (2%)
  • 8 x 2
    • tensorflow_data_validation/statistics/generators, top_k_uniques_combiner_stats_generato..., 96:103 (6%)
    • tensorflow_data_validation/statistics/generators, top_k_uniques_stats_generator.py, 149:156 (4%)
  • 7 x 2
    • tensorflow_data_validation/anomalies, feature_statistics_validator.cc, 77:84 (3%)
    • tensorflow_data_validation/anomalies, feature_statistics_validator.cc, 183:191 (3%)
  • 7 x 2
    • tensorflow_data_validation/utils, display_util.py, 271:277 (2%)
    • tensorflow_data_validation/utils, display_util.py, 333:339 (2%)
  • 7 x 2
    • tensorflow_data_validation/anomalies, float_domain_util.cc, 81:91 (6%)
    • tensorflow_data_validation/anomalies, int_domain_util.cc, 109:119 (6%)
  • 7 x 2
    • tensorflow_data_validation/anomalies, feature_statistics_validator.cc, 175:181 (3%)
    • tensorflow_data_validation/anomalies, feature_statistics_validator.h, 98:104 (17%)
  • 7 x 2
    • tensorflow_data_validation/utils, display_util.py, 333:339 (2%)
    • tensorflow_data_validation/utils, display_util.py, 439:445 (2%)
  • 7 x 2
    • tensorflow_data_validation/utils, display_util.py, 271:277 (2%)
    • tensorflow_data_validation/utils, display_util.py, 439:445 (2%)
  • 7 x 2
    • tensorflow_data_validation/statistics/generators, natural_language_domain_inferring_sta..., 214:229 (6%)
    • tensorflow_data_validation/statistics/generators, natural_language_stats_generator.py, 582:597 (1%)
  • 7 x 2
    • tensorflow_data_validati...generators/constituents, count_missing_generator.py, 55:62 (21%)
    • tensorflow_data_validati...generators/constituents, length_diff_generator.py, 58:65 (12%)
  • 7 x 2
    • tensorflow_data_validation/utils, validation_lib.py, 117:123 (5%)
    • tensorflow_data_validation/utils, validation_lib.py, 241:247 (5%)
Duplicated Units
The list of top 6 duplicated units.
Each entry gives Size x #, followed by the folder, file, and lines of each instance.
  • 11 x 2
    • tensorflow_data_validation/anomalies, schema_anomalies.cc, 57:68
    • tensorflow_data_validation/anomalies, schema_util.cc, 24:35
  • 8 x 2
    • tensorflow_data_validation/utils, top_k_uniques_stats_util.py, 0:0
    • tensorflow_data_validation/utils, top_k_uniques_stats_util.py, 0:0
  • 7 x 2
    • tensorflow_data_validation/utils, display_util.py, 0:0
    • tensorflow_data_validation/utils, display_util.py, 0:0
  • 11 x 2
    • tensorflow_data_validation/statistics/generators, natural_language_domain_inferring_sta..., 0:0
    • tensorflow_data_validation/statistics/generators, natural_language_stats_generator.py, 0:0
  • 7 x 2
    • tensorflow_data_validation/statistics/generators, stats_generator.py, 0:0
    • tensorflow_data_validation/statistics/generators, stats_generator.py, 0:0
  • 9 x 2
    • tensorflow_data_validation/statistics/generators, stats_generator.py, 0:0
    • tensorflow_data_validation/statistics/generators, stats_generator.py, 0:0