awslabs / aws-cv-unique-information
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured as places in the code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
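The measurement described above can be sketched roughly as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, and the cleaning rules (dropping blank lines, `#` comment lines, and import statements) are a simplified assumption about what "cleaned" means for Python files.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements (simplified)."""
    kept = []
    for raw in source.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith("import ") or stripped.startswith("from "):
            continue
        # Keep indentation so blocks must match exactly, ignoring trailing spaces.
        kept.append(raw.rstrip())
    return kept


def find_duplicates(files: dict[str, str], min_lines: int = MIN_LINES):
    """Return {block: [(file, start_index), ...]} for blocks seen more than once.

    start_index is the position within the cleaned lines of that file.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        # Slide a window of min_lines over the cleaned lines.
        for i in range(len(lines) - min_lines + 1):
            block = tuple(lines[i:i + min_lines])
            seen[block].append((name, i))
    return {block: locs for block, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into longer duplicates and map positions back to original line numbers; this sketch only reports fixed-size windows.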
Duplication Overall
  • 31% duplication:
    • 3,061 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 961 duplicated lines
  • 110 duplicates
system: 31% (961 lines)
Duplication per Extension
py: 31% (961 lines)
Duplication per Component (primary)
scripts: 46% (729 lines)
modules: 15% (185 lines)
methods: 14% (33 lines)
archive: 16% (14 lines)
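As a quick consistency check on the figures in this report, the per-component duplicated-line counts should add up to the overall count, and the overall percentage should follow from the two totals. The variable names below are illustrative only; the numbers are taken from the report.

```python
# Figures taken from the report.
cleaned_lines = 3061
duplicated_lines = 961
per_component = {"scripts": 729, "modules": 185, "methods": 33, "archive": 14}

# Overall duplication as a percentage of cleaned lines.
overall_pct = 100 * duplicated_lines / cleaned_lines

# Duplicated lines summed across the primary components.
component_total = sum(per_component.values())

print(f"overall duplication: {overall_pct:.1f}%")
print(f"sum over components: {component_total} (reported total: {duplicated_lines})")
```

The component counts sum to 961, which matches the reported total, and the ratio rounds to the reported 31%.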
Longest Duplicates
The list of 20 longest duplicates.
| Size | # | Folders | Files | Lines |
|---|---|---|---|---|
| 21 | 2 | sample_info/scripts; sample_info/scripts | data_summarization.py; ground_truth_effects.py | 106:132 (17%); 81:107 (19%) |
| 21 | 2 | sample_info/modules; sample_info/modules | data_utils.py; data_utils.py | 387:413 (5%); 435:461 (5%) |
| 17 | 2 | sample_info/scripts; sample_info/scripts | compute_informativeness.py; prepare_informativeness_orders_for_da... | 69:90 (16%); 69:92 (19%) |
| 16 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; compute_influence_functions_brute_for... | 124:144 (16%); 140:160 (14%) |
| 16 | 2 | sample_info/scripts; sample_info/scripts | generate_commands.py; generate_commands.py | 440:455 (2%); 533:548 (2%) |
| 14 | 2 | sample_info/scripts; sample_info/scripts | generate_commands.py; generate_commands.py | 471:484 (2%); 564:577 (2%) |
| 14 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; compute_influence_functions_brute_for... | 73:91 (14%); 72:90 (12%) |
| 14 | 2 | sample_info/archive; sample_info/scripts | total_gradient.py; train_classifier.py | 26:39 (16%); 23:36 (18%) |
| 14 | 2 | sample_info/scripts; sample_info/scripts | data_summarization.py; train_classifier.py | 118:134 (11%); 74:90 (18%) |
| 13 | 2 | sample_info/scripts; sample_info/scripts | data_summarization.py; train_classifier.py | 144:157 (10%); 98:111 (17%) |
| 13 | 2 | sample_info/scripts; sample_info/scripts | compute_informativeness.py; prepare_informativeness_orders_for_da... | 34:49 (12%); 36:51 (14%) |
| 13 | 2 | sample_info/modules; sample_info/modules | data_utils.py; data_utils.py | 446:461 (3%); 489:504 (3%) |
| 13 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; compute_informativeness.py | 63:80 (13%); 69:86 (12%) |
| 13 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; prepare_informativeness_orders_for_da... | 63:80 (13%); 69:88 (14%) |
| 13 | 2 | sample_info/modules; sample_info/modules | sgd.py; sgd.py | 46:60 (18%); 103:117 (18%) |
| 13 | 2 | sample_info/modules; sample_info/modules | data_utils.py; data_utils.py | 398:413 (3%); 489:504 (3%) |
| 12 | 2 | sample_info/scripts; sample_info/scripts | ground_truth_effects.py; train_classifier.py | 93:107 (11%); 74:88 (16%) |
| 12 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; ground_truth_effects.py | 39:50 (12%); 45:56 (11%) |
| 12 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; compute_influence_functions_brute_for... | 41:55 (12%); 40:54 (11%) |
| 11 | 2 | sample_info/scripts; sample_info/scripts | compute_influence_functions.py; prepare_informativeness_orders_for_da... | 42:55 (11%); 37:50 (12%) |