duplicated block id: 1
size: 30 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/greedy_constrained_sequence_op.py (43:164)
- tensorflow_text/python/ops/viterbi_constrained_sequence_op.py (44:187)

duplicated block id: 2
size: 28 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (268:301)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (409:442)

duplicated block id: 3
size: 27 cleaned lines of code
in 2 files:
- oss_scripts/pip_package/setup.nightly.py (38:75)
- oss_scripts/pip_package/setup.py (38:75)

duplicated block id: 4
size: 25 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/normalize_kernels.cc (107:131)
- tensorflow_text/core/kernels/normalize_kernels.cc (238:262)

duplicated block id: 5
size: 24 cleaned lines of code
in 2 files:
- tensorflow_text/tools/wordpiece_vocab/generate_vocab.py (51:74)
- tensorflow_text/tools/wordpiece_vocab/wordpiece_tokenizer_learner.py (33:56)

duplicated block id: 6
size: 22 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/split_merge_tokenize_kernel.cc (175:208)
- tensorflow_text/core/kernels/tokenizer_from_logits_kernel.cc (192:225)

duplicated block id: 7
size: 22 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/split_merge_tokenize_kernel.cc (29:64)
- tensorflow_text/core/kernels/tokenizer_from_logits_kernel.cc (29:64)

duplicated block id: 8
size: 20 cleaned lines of code
in 2 files:
- tensorflow_text/tools/wordpiece_vocab/generate_vocab.py (130:155)
- tensorflow_text/tools/wordpiece_vocab/generate_word_counts.py (77:102)

duplicated block id: 9
size: 20 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/unicode_script_tokenize_kernel.cc (143:173)
- tensorflow_text/core/kernels/whitespace_tokenize_kernel.cc (117:147)

duplicated block id: 10
size: 18 cleaned lines of code
in 2 files:
- tensorflow_text/tools/wordpiece_vocab/generate_vocab.py (93:111)
- tensorflow_text/tools/wordpiece_vocab/measure_wordpiece_stats.py (74:92)

duplicated block id: 11
size: 18 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/fast_wordpiece_tokenizer.py (211:230)
- tensorflow_text/python/ops/wordpiece_tokenizer.py (266:285)

duplicated block id: 12
size: 17 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/sentencepiece_ops.cc (51:68)
- tensorflow_text/core/ops/sentencepiece_ops.cc (88:106)

duplicated block id: 13
size: 17 cleaned lines of code
in 2 files:
- oss_scripts/pip_package/setup.nightly.py (87:103)
- oss_scripts/pip_package/setup.py (90:106)

duplicated block id: 14
size: 16 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/unicode_script_tokenize_kernel.cc (58:86)
- tensorflow_text/core/kernels/whitespace_tokenize_kernel.cc (54:82)

duplicated block id: 15
size: 13 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/split_merge_tokenize_kernel.cc (94:111)
- tensorflow_text/core/kernels/tokenizer_from_logits_kernel.cc (101:118)

duplicated block id: 16
size: 13 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_fragmenter.h (147:162)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.h (129:144)

duplicated block id: 17
size: 12 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/unicode_script_tokenizer.py (166:178)
- tensorflow_text/python/ops/whitespace_tokenizer.py (119:131)

duplicated block id: 18
size: 12 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/split_merge_tokenize_kernel.cc (157:173)
- tensorflow_text/core/kernels/tokenizer_from_logits_kernel.cc (174:190)

duplicated block id: 19
size: 12 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/sentencepiece_ops.cc (39:50)
- tensorflow_text/core/ops/sentencepiece_ops.cc (74:85)

duplicated block id: 20
size: 12 cleaned lines of code
in 2 files:
- tensorflow_text/tools/wordpiece_vocab/generate_vocab.py (39:50)
- tensorflow_text/tools/wordpiece_vocab/generate_word_counts.py (36:47)

duplicated block id: 21
size: 11 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/unicode_script_tokenize_op.cc (42:53)
- tensorflow_text/core/ops/whitespace_tokenize_op.cc (41:52)

duplicated block id: 22
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (319:328)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (458:467)

duplicated block id: 23
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (159:172)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (187:200)

duplicated block id: 24
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (667:678)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (699:710)

duplicated block id: 25
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (132:145)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (159:172)

duplicated block id: 26
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/fast_wordpiece_tokenizer_kernel_template.h (57:83)
- tensorflow_text/core/kernels/fast_wordpiece_tokenizer_kernel_template.h (281:307)

duplicated block id: 27
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/benchmark_utils.py (189:200)
- tensorflow_text/python/benchmarks/benchmark_utils.py (251:262)

duplicated block id: 28
size: 10 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (132:145)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (187:200)

duplicated block id: 29
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (412:421)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (561:570)

duplicated block id: 30
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (559:568)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (699:708)

duplicated block id: 31
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (175:184)
- tensorflow_text/python/benchmarks/ops_benchmarks.py (187:196)

duplicated block id: 32
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (643:652)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (699:708)

duplicated block id: 33
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/normalize_kernels.cc (43:54)
- tensorflow_text/core/kernels/normalize_kernels.cc (97:107)

duplicated block id: 34
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (643:652)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (667:676)

duplicated block id: 35
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/normalize_kernels.cc (94:104)
- tensorflow_text/core/kernels/normalize_kernels.cc (219:229)

duplicated block id: 36
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (271:280)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (561:570)

duplicated block id: 37
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (100:112)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (161:173)

duplicated block id: 38
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (559:568)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (667:676)

duplicated block id: 39
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (96:106)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (220:230)

duplicated block id: 40
size: 9 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (559:568)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (643:652)

duplicated block id: 41
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/fast_wordpiece_tokenizer_kernel_template.h (57:78)
- tensorflow_text/core/kernels/whitespace_tokenizer_kernel_template.h (57:76)

duplicated block id: 42
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/unicode_script_tokenize_op.cc (33:40)
- tensorflow_text/core/ops/whitespace_tokenize_op.cc (33:40)

duplicated block id: 43
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/fast_wordpiece_tokenizer_kernel_template.h (281:302)
- tensorflow_text/core/kernels/whitespace_tokenizer_kernel_template.h (57:76)

duplicated block id: 44
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (412:419)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (645:652)

duplicated block id: 45
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (412:419)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (669:676)

duplicated block id: 46
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/split_merge_tokenize_kernel.cc (69:76)
- tensorflow_text/core/kernels/tokenizer_from_logits_kernel.cc (74:81)

duplicated block id: 47
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (310:317)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (449:456)

duplicated block id: 48
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/trimmer_ops.py (177:184)
- tensorflow_text/python/ops/trimmer_ops.py (263:270)

duplicated block id: 49
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/pointer_ops.py (195:202)
- tensorflow_text/python/ops/pointer_ops.py (505:512)

duplicated block id: 50
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_fragmenter.cc (316:328)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.cc (579:591)

duplicated block id: 51
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (33:40)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (42:49)

duplicated block id: 52
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (100:111)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (189:200)

duplicated block id: 53
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (65:76)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (134:145)

duplicated block id: 54
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/ragged_tensor_to_tensor_tflite.cc (198:207)
- tensorflow_text/core/kernels/ragged_tensor_to_tensor_tflite.cc (229:238)

duplicated block id: 55
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (271:278)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (669:676)

duplicated block id: 56
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (65:76)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (100:111)

duplicated block id: 57
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (65:76)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (189:200)

duplicated block id: 58
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (271:278)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (645:652)

duplicated block id: 59
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (65:76)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (161:172)

duplicated block id: 60
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (271:278)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (701:708)

duplicated block id: 61
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/constrained_sequence.cc (173:184)
- tensorflow_text/core/kernels/constrained_sequence.cc (257:269)

duplicated block id: 62
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (412:419)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (701:708)

duplicated block id: 63
size: 8 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (100:111)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (134:145)

duplicated block id: 64
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/trimmer_ops.py (264:270)
- tensorflow_text/python/ops/trimmer_ops.py (344:350)

duplicated block id: 65
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/sentencepiece_tokenizer.py (196:203)
- tensorflow_text/python/ops/whitespace_tokenizer.py (122:129)

duplicated block id: 66
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (189:195)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (224:230)

duplicated block id: 67
size: 7 cleaned lines of code
in 2 files:
- oss_scripts/pip_package/setup.nightly.py (79:85)
- oss_scripts/pip_package/setup.py (82:88)

duplicated block id: 68
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (134:140)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (224:230)

duplicated block id: 69
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_kernels_v2.cc (63:69)
- tensorflow_text/core/kernels/whitespace_tokenize_kernel.cc (130:136)

duplicated block id: 70
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_kernels_v2.cc (63:69)
- tensorflow_text/core/kernels/unicode_script_tokenize_kernel.cc (156:162)

duplicated block id: 71
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/numpy/viterbi_decode.py (86:94)
- tensorflow_text/python/numpy/viterbi_decode.py (130:138)

duplicated block id: 72
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (161:167)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (224:230)

duplicated block id: 73
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/sentencepiece_ops.cc (152:158)
- tensorflow_text/core/ops/sentencepiece_ops.cc (164:170)

duplicated block id: 74
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_fragmenter.cc (109:116)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.cc (414:421)

duplicated block id: 75
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/normalize_kernels.cc (43:51)
- tensorflow_text/core/kernels/normalize_kernels.cc (222:229)

duplicated block id: 76
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (60:66)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (136:142)

duplicated block id: 77
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/sentencepiece_tokenizer.py (196:203)
- tensorflow_text/python/ops/unicode_script_tokenizer.py (169:176)

duplicated block id: 78
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/trimmer_ops.py (178:184)
- tensorflow_text/python/ops/trimmer_ops.py (344:350)

duplicated block id: 79
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (65:71)
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (224:230)

duplicated block id: 80
size: 7 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (111:117)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.cc (74:80)

duplicated block id: 81
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (69:74)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (240:245)

duplicated block id: 82
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/ops/pointer_ops.py (111:116)
- tensorflow_text/python/ops/pointer_ops.py (411:416)

duplicated block id: 83
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/ngrams_kernel_template.h (157:162)
- tensorflow_text/core/kernels/ngrams_kernel_template.h (172:177)

duplicated block id: 84
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/wordpiece_tokenizer.cc (162:167)
- tensorflow_text/core/kernels/wordpiece_tokenizer.cc (212:217)

duplicated block id: 85
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (76:81)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.cc (47:52)

duplicated block id: 86
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (69:74)
- tensorflow_text/python/benchmarks/ops_benchmarks.py (167:172)

duplicated block id: 87
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (333:338)
- tensorflow_text/core/kernels/sentencepiece_kernels.cc (472:477)

duplicated block id: 88
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/normalize_kernels.cc (146:151)
- tensorflow_text/core/kernels/normalize_kernels.cc (279:285)

duplicated block id: 89
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (167:172)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (240:245)

duplicated block id: 90
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (167:172)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (144:149)

duplicated block id: 91
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/sentence_breaking_utils.cc (200:205)
- tensorflow_text/core/kernels/sentence_fragmenter_v2.cc (136:141)

duplicated block id: 92
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/numpy/viterbi_decode.py (109:115)
- tensorflow_text/python/numpy/viterbi_decode.py (158:164)

duplicated block id: 93
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (144:149)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (240:245)

duplicated block id: 94
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/kernels/wordpiece_tokenizer.cc (211:216)
- tensorflow_text/core/kernels/wordpiece_tokenizer.h (42:47)

duplicated block id: 95
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/tftext.bzl (51:56)
- tensorflow_text/tftext.bzl (68:73)

duplicated block id: 96
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/tools/wordpiece_vocab/generate_vocab.py (204:210)
- tensorflow_text/tools/wordpiece_vocab/generate_word_counts.py (111:117)

duplicated block id: 97
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (102:108)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (134:140)

duplicated block id: 98
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/python/benchmarks/ops_benchmarks.py (69:74)
- tensorflow_text/python/benchmarks/tokenizers_benchmarks.py (144:149)

duplicated block id: 99
size: 6 cleaned lines of code
in 2 files:
- tensorflow_text/core/ops/split_merge_tokenize_op.cc (87:92)
- tensorflow_text/core/ops/wordpiece_op.cc (106:112)