facebookresearch / TransCoder
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
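The two steps described above (clean the code, then look for runs of 6 or more identical lines) can be sketched as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicate_windows` are hypothetical, the cleaning rules handle only Python-style `#` comments and imports, and the reported positions are indices into the cleaned lines rather than original line numbers.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines and import lines (simplified assumption)."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(("import ", "from ")):
            continue
        kept.append(stripped)
    return kept


def find_duplicate_windows(files: dict[str, str], min_lines: int = MIN_LINES):
    """Return every window of `min_lines` cleaned lines that occurs more than once.

    The result maps each window (a tuple of lines) to a list of
    (file name, start index in the cleaned lines) pairs.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_lines + 1):
            window = tuple(lines[start:start + min_lines])
            seen[window].append((name, start))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}


# Hypothetical inputs: two small files that share six identical lines.
files = {
    "a.py": "import os\n# setup\nx = 1\ny = 2\nz = 3\nw = 4\nv = 5\nu = 6\n",
    "b.py": "\nx = 1\ny = 2\nz = 3\nw = 4\nv = 5\nu = 6\nt = 7\n",
}
dups = find_duplicate_windows(files)
```

A real detector would also merge overlapping windows into maximal duplicated blocks and map positions back to original line numbers; the sketch only shows the core idea of hashing fixed-size windows of cleaned lines.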
Duplication Overall
  • 12% duplication:
    • 65,079 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 7,849 duplicated lines
  • 177,934 duplicates
system: 12% (7,849 lines)
Duplication per Extension
py: 22% (5,570 lines)
java: 5% (1,457 lines)
cpp: 6% (822 lines)
Duplication per Component (primary)
data/evaluation/geeks_for_geeks_successful_test_scripts/python: 27% (5,304 lines)
data/evaluation/geeks_for_geeks_successful_test_scripts/java: 5% (1,457 lines)
data/evaluation/geeks_for_geeks_successful_test_scripts/cpp: 6% (822 lines)
preprocessing/src: 5% (91 lines)
XLM/src/data: 10% (74 lines)
XLM/src: 4% (65 lines)
XLM/src/evaluation: 5% (24 lines)
XLM/src/model: 1% (12 lines)
ROOT: 0% (0 lines)
XLM: 0% (0 lines)
preprocessing: 0% (0 lines)
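As a quick consistency check, the figures above can be recomputed from the reported line counts. The sketch below uses only numbers that appear in this report; the variable names and the `percent` helper are illustrative choices, not part of the analysis tool.

```python
# Line counts copied from the report above.
cleaned_lines = 65_079
duplicated_lines = 7_849

per_extension = {"py": 5_570, "java": 1_457, "cpp": 822}

per_component = {
    "data/evaluation/geeks_for_geeks_successful_test_scripts/python": 5_304,
    "data/evaluation/geeks_for_geeks_successful_test_scripts/java": 1_457,
    "data/evaluation/geeks_for_geeks_successful_test_scripts/cpp": 822,
    "preprocessing/src": 91,
    "XLM/src/data": 74,
    "XLM/src": 65,
    "XLM/src/evaluation": 24,
    "XLM/src/model": 12,
    "ROOT": 0,
    "XLM": 0,
    "preprocessing": 0,
}


def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)


print(percent(duplicated_lines, cleaned_lines))         # 12
print(sum(per_extension.values()) == duplicated_lines)  # True
print(sum(per_component.values()) == duplicated_lines)  # True
```

Both breakdowns add up to the 7,849 duplicated lines reported for the whole system, and nearly all of them (7,583 lines) sit in the three geeks_for_geeks evaluation script folders.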
Longest Duplicates
The 20 longest duplicates, out of 177,934 duplicates in total.
| Size | # | Folder | File | Lines |
|---|---|---|---|---|
| 84 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | SORT_EVEN_PLACED_ELEMENTS_INCREASING_... | 12:105 (100%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SPLIT_ARRAY_ADD_FIRST_PART_END.java | 12:106 (100%) |
| 51 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 25:84 (82%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 36:96 (72%) |
| 45 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 30:74 (72%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 33:77 (70%) |
| 40 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 26:74 (76%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 22:70 (81%) |
| 32 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | MOVE_VE_ELEMENTS_END_ORDER_EXTRA_SPAC... | 26:57 (68%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | MOVE_ZEROES_END_ARRAY.py | 20:51 (78%) |
| 28 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 21:48 (73%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 27:54 (63%) |
| 28 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | PRODUCT_NODES_K_TH_LEVEL_TREE_REPRESE... | 32:63 (68%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SUM_NODES_K_TH_LEVEL_TREE_REPRESENTED... | 32:63 (68%) |
| 21 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 35:59 (63%) |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 35:59 (63%) |
| 19 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | MAXIMUM_CONSECUTIVE_REPEATING_CHARACT... | 20:42 (59%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | MAXIMUM_CONSECUTIVE_REPEATING_CHARACT... | 20:42 (59%) |
| 19 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | PRODUCT_NODES_K_TH_LEVEL_TREE_REPRESE... | 35:57 (61%) |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | SUM_NODES_K_TH_LEVEL_TREE_REPRESENTED... | 35:57 (61%) |
| 19 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 13:35 (76%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 23:45 (57%) |
| 18 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | DYNAMIC_PROGRAMMING_SET_3_LONGEST_INC... | 13:35 (40%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_INCREASING_SUBSEQUENCE_1.java | 13:35 (40%) |
| 18 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | MAXIMUM_CONSECUTIVE_REPEATING_CHARACT... | 32:58 (51%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | MAXIMUM_CONSECUTIVE_REPEATING_CHARACT... | 30:56 (54%) |
| 18 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_IF_A_NUMBER_IS_POWER_OF_ANOTHER... | 33:54 (52%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_IF_A_NUMBER_IS_POWER_OF_ANOTHER... | 32:53 (54%) |
| 18 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | PRODUCT_NODES_K_TH_LEVEL_TREE_REPRESE... | 24:41 (58%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | SUM_NODES_K_TH_LEVEL_TREE_REPRESENTED... | 24:41 (58%) |
| 18 | x 2 | data/evaluation/geeks_fo...ful_test_scripts/python | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 19:36 (69%) |
| | | data/evaluation/geeks_fo...ful_test_scripts/python | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 13:30 (90%) |
| 17 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | CHANGE_BITS_CAN_MADE_ONE_FLIP.java | 28:48 (62%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHANGE_BITS_CAN_MADE_ONE_FLIP_1.java | 25:45 (68%) |
| 17 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 25:45 (70%) |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 20:40 (85%) |
| 16 | x 2 | preprocessing/src | code_tokenizer.py | 573:588 (2%) |
| | | preprocessing/src | code_tokenizer.py | 704:719 (2%) |
| 15 | x 2 | XLM/src/data | dataset.py | 218:238 (5%) |
| | | XLM/src/data | dataset.py | 440:460 (5%) |
Duplicated Units
The top 20 duplicated units, out of 38 unit duplicates in total.
| Size | # | Folder | File | Lines |
|---|---|---|---|---|
| 78 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | SORT_EVEN_PLACED_ELEMENTS_INCREASING_... | 28:106 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SPLIT_ARRAY_ADD_FIRST_PART_END.java | 29:107 |
| 54 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 31:85 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 42:97 |
| 54 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 30:84 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 33:87 |
| 43 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 32:75 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 28:71 |
| 32 | x 11 | data/evaluation/geeks_fo...ssful_test_scripts/java | COUNT_DISTINCT_OCCURRENCES_AS_A_SUBSE... | 40:72 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | WAYS_TRANSFORMING_ONE_STRING_REMOVING... | 50:82 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | WILDCARD_CHARACTER_MATCHING.java | 24:56 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_STRING_CAN_OBTAINED_ROTATING_AN... | 26:58 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SPACE_OPTIMIZED_SOLUTION_LCS.java | 35:67 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_ONE_EXTRA_CHARACTER_STRING_1.java | 31:63 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | MAXIMUM_LENGTH_PREFIX_ONE_STRING_OCCU... | 27:59 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_COMMON_SUBSTRING_SPACE_OPTIMI... | 44:76 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | FIND_NUMBER_TIMES_STRING_OCCURS_GIVEN... | 39:71 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_POSSIBLE_TRANSFORM_ONE_STRING_A... | 45:77 |
| | | ... | ... | ... |
| 32 | x 4 | data/evaluation/geeks_fo...ssful_test_scripts/java | SUM_TWO_LARGE_NUMBERS.java | 46:78 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SUM_TWO_LARGE_NUMBERS_1.java | 44:76 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | MULTIPLY_LARGE_NUMBERS_REPRESENTED_AS... | 49:81 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | PROGRAM_CENSOR_WORD_ASTERISKS_SENTENC... | 32:64 |
| 32 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | PROGRAM_COUNT_OCCURRENCE_GIVEN_CHARAC... | 26:58 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_OCCURRENCES_CHARACTER_APPEAR_TO... | 30:62 |
| 32 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | PRODUCT_NODES_K_TH_LEVEL_TREE_REPRESE... | 32:64 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | SUM_NODES_K_TH_LEVEL_TREE_REPRESENTED... | 32:64 |
| 25 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 35:60 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_WHETHER_AN_ARRAY_IS_SUBSET_OF_AN... | 35:60 |
| 23 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | SUM_NODES_K_TH_LEVEL_TREE_REPRESENTED... | 35:58 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | PRODUCT_NODES_K_TH_LEVEL_TREE_REPRESE... | 35:58 |
| 22 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | PANGRAM_CHECKING.cpp | 35:57 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_EXPRESSION_DUPLICATE_PARENTHESIS... | 40:62 |
| 21 | x 63 | data/evaluation/geeks_fo...ssful_test_scripts/java | COUNT_CHARACTERS_STRING_DISTANCE_ENGL... | 27:48 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_PALINDROME_SUBSEQUENCE_SPACE.... | 41:62 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_EVEN_LENGTH_SUBSTRING_SUM_FIR... | 39:60 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_LARGE_NUMBER_DIVISIBLE_9_NOT.java | 25:46 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | NUMBER_DIGITS_REMOVED_MAKE_NUMBER_DIV... | 31:52 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_GIVEN_STRING_ROTATION_PALINDROM... | 23:44 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | COUNT_PALINDROMIC_SUBSEQUENCE_GIVEN_S... | 38:59 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | CHECK_LARGE_NUMBER_DIVISIBLE_3_NOT.java | 25:46 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_PREFIX_ALSO_SUFFIX_1.java | 42:63 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_REPEATING_SUBSEQUENCE.java | 32:53 |
| | | ... | ... | ... |
| 21 | x 16 | data/evaluation/geeks_fo...ssful_test_scripts/java | PRINT_A_CLOSEST_STRING_THAT_DOES_NOT_... | 31:52 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LEXICOGRAPHICALLY_NEXT_STRING.java | 25:46 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LEXICOGRAPHICALLY_MINIMUM_STRING_ROTA... | 29:50 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | DECODE_MEDIAN_STRING_ORIGINAL_STRING.... | 40:61 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LEXICOGRAPHICAL_MAXIMUM_SUBSTRING_STR... | 28:49 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | BINARY_REPRESENTATION_OF_NEXT_NUMBER.... | 36:57 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | REMOVE_BRACKETS_ALGEBRAIC_STRING_CONT... | 45:66 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | PRINT_WORDS_STRING_REVERSE_ORDER.java | 34:55 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | LEXICOGRAPHICAL_CONCATENATION_SUBSTRI... | 36:57 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | NTH_EVEN_LENGTH_PALINDROME.java | 24:45 |
| | | ... | ... | ... |
| 21 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 25:46 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | WRITE_ONE_LINE_C_FUNCTION_TO_FIND_WHE... | 20:41 |
| 15 | x 2 | data/evaluation/geeks_fo...ssful_test_scripts/java | LONGEST_INCREASING_SUBSEQUENCE_1.java | 13:28 |
| | | data/evaluation/geeks_fo...ssful_test_scripts/java | DYNAMIC_PROGRAMMING_SET_3_LONGEST_INC... | 13:28 |
| 15 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 36:51 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_THE_MINIMUM_DISTANCE_BETWEEN_TWO... | 45:60 |
| 15 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | GIVEN_TWO_STRINGS_FIND_FIRST_STRING_S... | 26:41 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | GIVEN_TWO_STRINGS_FIND_FIRST_STRING_S... | 27:42 |
| 14 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 35:49 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | DISTRIBUTING_ITEMS_PERSON_CANNOT_TAKE... | 28:42 |
| 13 | x 13 | data/evaluation/geeks_fo...essful_test_scripts/cpp | CHECK_POSSIBLE_TRANSFORM_ONE_STRING_A... | 48:61 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | LONGEST_COMMON_SUBSTRING_SPACE_OPTIMI... | 47:60 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_NUMBER_TIMES_STRING_OCCURS_GIVEN... | 45:58 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | SPACE_OPTIMIZED_SOLUTION_LCS.cpp | 38:51 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | PRINT_SHORTEST_COMMON_SUPERSEQUENCE.cpp | 64:77 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | FIND_ONE_EXTRA_CHARACTER_STRING_1.cpp | 34:47 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | SUM_TWO_LARGE_NUMBERS.cpp | 45:58 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | WAYS_TRANSFORMING_ONE_STRING_REMOVING... | 44:57 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | SUM_TWO_LARGE_NUMBERS_1.cpp | 44:57 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | COUNT_DISTINCT_OCCURRENCES_AS_A_SUBSE... | 42:55 |
| | | ... | ... | ... |
| 13 | x 2 | data/evaluation/geeks_fo...essful_test_scripts/cpp | CHECK_OCCURRENCES_CHARACTER_APPEAR_TO... | 33:46 |
| | | data/evaluation/geeks_fo...essful_test_scripts/cpp | PROGRAM_COUNT_OCCURRENCE_GIVEN_CHARAC... | 27:40 |