facebookresearch / nbref
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (under 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
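The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the function names (`clean`, `duplicated_line_indices`, `duplication_percent`) are hypothetical, the cleaning step only handles `#` comments and import statements, and stripping surrounding whitespace before comparing lines is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements (simplified)."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()  # assumption: compare lines without indentation
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(("import ", "from ")):
            continue
        kept.append(stripped)
    return kept


def duplicated_line_indices(files: dict[str, str],
                            min_block: int = MIN_BLOCK) -> dict[str, set[int]]:
    """Return, per file, the indices of cleaned lines that belong to a
    window of `min_block` consecutive lines occurring more than once."""
    cleaned = {name: clean(src) for name, src in files.items()}
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for start in range(len(lines) - min_block + 1):
            key = tuple(lines[start:start + min_block])
            windows[key].append((name, start))
    duplicated = {name: set() for name in files}
    for occurrences in windows.values():
        if len(occurrences) > 1:
            for name, start in occurrences:
                duplicated[name].update(range(start, start + min_block))
    return duplicated


def duplication_percent(files: dict[str, str]) -> float:
    """Duplicated cleaned lines as a percentage of all cleaned lines."""
    total = sum(len(clean(src)) for src in files.values())
    dup = sum(len(idx) for idx in duplicated_line_indices(files).values())
    return 100.0 * dup / total if total else 0.0
```

Sliding a fixed window and merging overlapping hits means that longer duplicates (for example 46 lines) are covered by many overlapping 6-line windows. The sketch does not treat self-overlapping repeats inside a single file specially, which a real tool would likely need to do.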
Duplication Overall
  • 35% duplication:
    • 4,406 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 1,565 duplicated lines
  • 75 duplicates
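As a quick check of the headline figure, the percentage can be recomputed from the two line counts above. The helper name `percent_duplicated` is hypothetical, and whether the report truncates or rounds the value is an assumption; the reported 35% matches the truncated result.

```python
def percent_duplicated(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that are duplicated, in percent."""
    return 100 * duplicated_lines / cleaned_lines


ratio = percent_duplicated(1565, 4406)  # figures taken from the report
print(f"{ratio:.2f}%")   # 35.52%
print(f"{int(ratio)}%")  # 35% when truncated, as shown in the report
```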
system: 35% (1,565 lines)
Duplication per Extension
py: 35% (1,565 lines)
Duplication per Component (primary)
baseline_model/data_utils: 42% (748 lines)
preprocess: 37% (528 lines)
baseline_model: 34% (228 lines)
baseline_model/modules: 14% (61 lines)
preprocess/cram_vul_dataset: 0% (0 lines)
Longest Duplicates
The list of 20 longest duplicates.
Size  #    Folder                     File                               Lines
46    x 2  baseline_model/data_utils  train.py                           167:288 (36%)
           baseline_model/data_utils  train_gnn.py                       281:403 (22%)
42    x 2  baseline_model/data_utils  train.py                           34:98 (33%)
           baseline_model/data_utils  train_vul.py                       74:137 (24%)
38    x 2  baseline_model/data_utils  train_sim.py                       136:188 (32%)
           baseline_model/data_utils  train_vul.py                       262:316 (22%)
36    x 2  baseline_model/data_utils  train_sim.py                       80:130 (31%)
           baseline_model/data_utils  train_vul.py                       168:256 (21%)
34    x 2  preprocess                 asm_mips.py                        485:524 (6%)
           preprocess                 asm_obj.py                         420:460 (9%)
33    x 2  preprocess                 asm_mips.py                        412:454 (6%)
           preprocess                 asm_obj.py                         350:391 (9%)
29    x 2  baseline_model/data_utils  train_gnn.py                       31:72 (14%)
           baseline_model/data_utils  train_tree_encoder.py              30:65 (6%)
29    x 2  baseline_model/data_utils  train_gnn.py                       31:72 (14%)
           baseline_model/data_utils  train_vul.py                       74:114 (17%)
29    x 2  baseline_model/data_utils  train.py                           34:75 (22%)
           baseline_model/data_utils  train_tree_encoder.py              30:65 (6%)
29    x 2  baseline_model/data_utils  train.py                           34:75 (22%)
           baseline_model/data_utils  train_gnn.py                       31:72 (14%)
29    x 2  baseline_model/data_utils  train_tree_encoder.py              30:65 (6%)
           baseline_model/data_utils  train_vul.py                       74:114 (17%)
28    x 2  preprocess                 asm_mips.py                        40:77 (5%)
           preprocess                 asm_obj.py                         80:117 (7%)
27    x 2  preprocess                 asm_mips.py                        93:132 (4%)
           preprocess                 asm_obj.py                         135:179 (7%)
23    x 2  baseline_model/data_utils  train_tree_encoder_v2.py           42:77 (9%)
           baseline_model/data_utils  train_tree_encoder_v2.py           231:259 (9%)
23    x 2  baseline_model             run_tree_transformer.py            95:122 (14%)
           baseline_model             run_tree_transformer_multi_gpu.py  103:128 (12%)
21    x 2  preprocess                 asm_mips.py                        337:366 (3%)
           preprocess                 asm_obj.py                         304:333 (5%)
20    x 2  preprocess                 sim_preprocess.py                  89:113 (17%)
           preprocess                 vul_preprocess.py                  94:120 (18%)
20    x 2  preprocess                 asm_mips.py                        14:37 (3%)
           preprocess                 asm_obj.py                         56:79 (5%)
18    x 2  baseline_model/data_utils  train_sim.py                       52:74 (15%)
           baseline_model/data_utils  train_vul.py                       139:163 (10%)
18    x 2  baseline_model/data_utils  train_tree_encoder_v2.py           79:100 (7%)
           baseline_model/data_utils  train_tree_encoder_v2.py           268:289 (7%)
Duplicated Units
The list of top 12 duplicated units.
Size  #    Folder                     File                               Lines
24    x 2  baseline_model/data_utils  train_gnn.py                       0:0
           baseline_model/data_utils  train.py                           0:0
17    x 3  baseline_model/data_utils  train_gnn.py                       0:0
           baseline_model/data_utils  train.py                           0:0
           baseline_model/data_utils  train_vul.py                       0:0
12    x 2  baseline_model/data_utils  train_sim.py                       0:0
           baseline_model/data_utils  train_vul.py                       0:0
11    x 2  baseline_model/data_utils  train_sim.py                       0:0
           baseline_model/data_utils  train_vul.py                       0:0
11    x 2  baseline_model/data_utils  train_sim.py                       0:0
           baseline_model/data_utils  train_vul.py                       0:0
12    x 2  preprocess                 asm_mips.py                        0:0
           preprocess                 asm_obj.py                         0:0
8     x 2  baseline_model             run_tree_transformer.py            0:0
           baseline_model             run_tree_transformer_multi_gpu.py  0:0
8     x 4  baseline_model/data_utils  train_gnn.py                       0:0
           baseline_model/data_utils  train_tree_encoder.py              0:0
           baseline_model/data_utils  train.py                           0:0
           baseline_model/data_utils  train_vul.py                       0:0
7     x 2  preprocess                 vul_preprocess.py                  0:0
           preprocess                 sim_preprocess.py                  0:0
8     x 2  baseline_model/data_utils  ggnn_utils.py                      0:0
           baseline_model/modules     encoder_decoder_layers.py          0:0
7     x 2  preprocess                 asm_mips.py                        0:0
           preprocess                 asm_obj.py                         0:0
9     x 2  preprocess                 asm_mips.py                        0:0
           preprocess                 asm_obj.py                         0:0