tensorflow / text
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured by finding places in the code where 6 or more consecutive lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
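The cleaning and matching steps described above can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the function names `clean` and `duplicated_lines`, the prefix list used to drop comments and imports, and the sliding-window matching strategy are all assumptions made for the example. Only the 6-line threshold comes from the report.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report

# Assumption: a crude, Python-oriented notion of "comments and imports".
DROP_PREFIXES = ("#", "import ", "from ")


def clean(source):
    """Return (original_line_number, stripped_text) for lines that survive cleaning."""
    kept = []
    for number, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith(DROP_PREFIXES):
            continue  # empty line, comment, or import
        kept.append((number, text))
    return kept


def duplicated_lines(files, min_block=MIN_BLOCK):
    """files maps a name to its source text.

    Returns the set of (name, original_line_number) pairs that belong to a
    run of at least `min_block` cleaned lines occurring more than once.
    """
    cleaned = {name: clean(src) for name, src in files.items()}

    # Index every window of `min_block` consecutive cleaned lines.
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for start in range(len(lines) - min_block + 1):
            key = tuple(text for _, text in lines[start:start + min_block])
            windows[key].append((name, start))

    # Any window seen at two or more positions marks its lines as duplicated.
    duplicated = set()
    for occurrences in windows.values():
        if len(occurrences) < 2:
            continue
        for name, start in occurrences:
            for number, _ in cleaned[name][start:start + min_block]:
                duplicated.add((name, number))
    return duplicated
```

Dividing the number of duplicated lines by the total number of cleaned lines then gives a duplication percentage. A real tool would also merge overlapping windows into maximal blocks and handle language-specific comment syntax, which this sketch leaves out.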
Duplication Overall
  • 12% duplication:
    • 11,397 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 1,405 duplicated lines
  • 99 duplicates
system: 12% (1,405 lines)
Duplication per Extension
cc: 13% (756 lines)
py: 14% (565 lines)
h: 4% (72 lines)
bzl: 9% (12 lines)
Duplication per Component (primary)
tensorflow_text/core/kernels: 10% (706 lines)
tensorflow_text/python/ops: 7% (178 lines)
tensorflow_text/tools/wordpiece_vocab: 20% (160 lines)
tensorflow_text/core/ops: 16% (122 lines)
oss_scripts/pip_package: 86% (102 lines)
tensorflow_text/python/benchmarks: 22% (99 lines)
tensorflow_text/python/numpy: 30% (26 lines)
tensorflow_text: 6% (12 lines)
tensorflow_text/core/pybinds: 0% (0 lines)
tensorflow_text/tools: 0% (0 lines)
tensorflow_text/python/metrics: 0% (0 lines)
tensorflow_text/python: 0% (0 lines)
tensorflow_text/python/keras: 0% (0 lines)
oss_scripts: 0% (0 lines)
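As a quick sanity check on the figures above, the short script below recomputes the overall percentage from the reported totals and confirms that the per-extension and per-component line counts each add up to the overall duplicated-line count. The numbers are copied from this report. The variable and function names are illustrative only.

```python
def duplication_percent(duplicated, cleaned):
    """Share of cleaned lines that are duplicated, as a percentage."""
    return 100.0 * duplicated / cleaned


CLEANED_LINES = 11_397
DUPLICATED_LINES = 1_405

by_extension = {"cc": 756, "py": 565, "h": 72, "bzl": 12}

by_component = {
    "tensorflow_text/core/kernels": 706,
    "tensorflow_text/python/ops": 178,
    "tensorflow_text/tools/wordpiece_vocab": 160,
    "tensorflow_text/core/ops": 122,
    "oss_scripts/pip_package": 102,
    "tensorflow_text/python/benchmarks": 99,
    "tensorflow_text/python/numpy": 26,
    "tensorflow_text": 12,
}

overall = duplication_percent(DUPLICATED_LINES, CLEANED_LINES)
print(f"overall duplication: {overall:.1f}%")

# Both breakdowns partition the same set of duplicated lines.
print(sum(by_extension.values()) == DUPLICATED_LINES)
print(sum(by_component.values()) == DUPLICATED_LINES)
```

The overall value rounds to the 12% shown in the report, and both breakdowns sum to 1,405 lines.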
Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folder | File | Lines
-----|---|--------|------|------
30 | 2 | tensorflow_text/python/ops | greedy_constrained_sequence_op.py | 43:164 (71%)
   |   | tensorflow_text/python/ops | viterbi_constrained_sequence_op.py | 44:187 (71%)
28 | 2 | tensorflow_text/core/kernels | sentencepiece_kernels.cc | 268:301 (5%)
   |   | tensorflow_text/core/kernels | sentencepiece_kernels.cc | 409:442 (5%)
27 | 2 | oss_scripts/pip_package | setup.nightly.py | 38:75 (47%)
   |   | oss_scripts/pip_package | setup.py | 38:75 (45%)
25 | 2 | tensorflow_text/core/kernels | normalize_kernels.cc | 107:131 (9%)
   |   | tensorflow_text/core/kernels | normalize_kernels.cc | 238:262 (9%)
24 | 2 | tensorflow_text/tools/wordpiece_vocab | generate_vocab.py | 51:74 (16%)
   |   | tensorflow_text/tools/wordpiece_vocab | wordpiece_tokenizer_learner.py | 33:56 (42%)
22 | 2 | tensorflow_text/core/kernels | split_merge_tokenize_kernel.cc | 175:208 (16%)
   |   | tensorflow_text/core/kernels | tokenizer_from_logits_kernel.cc | 192:225 (15%)
22 | 2 | tensorflow_text/core/kernels | split_merge_tokenize_kernel.cc | 29:64 (16%)
   |   | tensorflow_text/core/kernels | tokenizer_from_logits_kernel.cc | 29:64 (15%)
20 | 2 | tensorflow_text/tools/wordpiece_vocab | generate_vocab.py | 130:155 (13%)
   |   | tensorflow_text/tools/wordpiece_vocab | generate_word_counts.py | 77:102 (32%)
20 | 2 | tensorflow_text/core/kernels | unicode_script_tokenize_kernel.cc | 143:173 (21%)
   |   | tensorflow_text/core/kernels | whitespace_tokenize_kernel.cc | 117:147 (27%)
18 | 2 | tensorflow_text/tools/wordpiece_vocab | generate_vocab.py | 93:111 (12%)
   |   | tensorflow_text/tools/wordpiece_vocab | measure_wordpiece_stats.py | 74:92 (28%)
18 | 2 | tensorflow_text/python/ops | fast_wordpiece_tokenizer.py | 211:230 (19%)
   |   | tensorflow_text/python/ops | wordpiece_tokenizer.py | 266:285 (13%)
17 | 2 | tensorflow_text/core/ops | sentencepiece_ops.cc | 51:68 (12%)
   |   | tensorflow_text/core/ops | sentencepiece_ops.cc | 88:106 (12%)
17 | 2 | oss_scripts/pip_package | setup.nightly.py | 87:103 (29%)
   |   | oss_scripts/pip_package | setup.py | 90:106 (28%)
16 | 2 | tensorflow_text/core/kernels | unicode_script_tokenize_kernel.cc | 58:86 (17%)
   |   | tensorflow_text/core/kernels | whitespace_tokenize_kernel.cc | 54:82 (21%)
13 | 2 | tensorflow_text/core/kernels | split_merge_tokenize_kernel.cc | 94:111 (9%)
   |   | tensorflow_text/core/kernels | tokenizer_from_logits_kernel.cc | 101:118 (9%)
13 | 2 | tensorflow_text/core/kernels | sentence_fragmenter.h | 147:162 (14%)
   |   | tensorflow_text/core/kernels | sentence_fragmenter_v2.h | 129:144 (19%)
12 | 2 | tensorflow_text/python/ops | unicode_script_tokenizer.py | 166:178 (15%)
   |   | tensorflow_text/python/ops | whitespace_tokenizer.py | 119:131 (21%)
12 | 2 | tensorflow_text/core/kernels | split_merge_tokenize_kernel.cc | 157:173 (9%)
   |   | tensorflow_text/core/kernels | tokenizer_from_logits_kernel.cc | 174:190 (8%)
12 | 2 | tensorflow_text/core/ops | sentencepiece_ops.cc | 39:50 (8%)
   |   | tensorflow_text/core/ops | sentencepiece_ops.cc | 74:85 (8%)
12 | 2 | tensorflow_text/tools/wordpiece_vocab | generate_vocab.py | 39:50 (8%)
   |   | tensorflow_text/tools/wordpiece_vocab | generate_word_counts.py | 36:47 (19%)
Duplicated Units
The list of top 3 duplicated units.
Size | # | Folder | File | Lines
-----|---|--------|------|------
13 | 2 | tensorflow_text/core/kernels | tokenizer_from_logits_kernel.cc | 45:58
   |   | tensorflow_text/core/kernels | split_merge_tokenize_kernel.cc | 45:58
9 | 2 | tensorflow_text/python/benchmarks | ops_benchmarks.py | 0:0
  |   | tensorflow_text/python/benchmarks | ops_benchmarks.py | 0:0
7 | 2 | tensorflow_text/core/kernels | regex_split.cc | 71:78
  |   | tensorflow_text/core/kernels | regex_split.cc | 80:87