awslabs / pptod
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
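The rule described in the bullets above (clean the code, then look for runs of 6 or more identical lines) can be sketched as follows. This is a minimal illustration, not the analysis tool's actual implementation: the helper names `clean` and `find_duplicate_blocks` are hypothetical, and the cleaning step only approximates "frequently duplicated constructs" by dropping `#` comments and import statements.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comments, and import statements (a simplification)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def find_duplicate_blocks(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Return every window of `min_block` cleaned lines that occurs more than once.

    Keys are the windows (tuples of lines); values are lists of
    (file name, index into that file's cleaned lines).
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            window = tuple(lines[start:start + min_block])
            seen[window].append((name, start))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into maximal blocks and map positions back to original line numbers; the sketch stops at detecting the repeated windows.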
Duplication Overall
  • 52% duplication:
    • 7,766 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 4,092 duplicated lines
  • 174 duplicates
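The headline percentage follows directly from the two line counts above. The short check below reproduces it and compares it with the <5% target from the intro. That the report truncates rather than rounds the percentage is an assumption inferred from the numbers, and the function name is hypothetical.

```python
def duplication_ratio(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicated block."""
    return duplicated_lines / cleaned_lines


CLEANED = 7766     # cleaned lines of code, from the report
DUPLICATED = 4092  # duplicated lines, from the report
TARGET = 0.05      # the "<5%" guideline from the intro

ratio = duplication_ratio(DUPLICATED, CLEANED)
shown_percent = int(ratio * 100)  # assumption: the report truncates the value

# Simplification: holds the total line count fixed, although removing
# duplicates would also shrink the cleaned total.
max_duplicated_at_target = int(TARGET * CLEANED)

print(f"ratio = {ratio:.4f}, shown as {shown_percent}%")
print(f"at most {max_duplicated_at_target} duplicated lines would meet the target")
```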
  • system: 52% (4,092 lines)
Duplication per Extension
  • py: 52% (4,092 lines)
Duplication per Component (primary)
  • E2E_TOD: 55% (1,562 lines)
  • data/multiwoz: 87% (1,116 lines)
  • DST: 64% (747 lines)
  • IC: 53% (238 lines)
  • Pretraining: 31% (128 lines)
  • data/pre-training_corpora: 8% (121 lines)
  • E2E_TOD/modelling: 100% (84 lines)
  • DST/modelling: 100% (84 lines)
  • Pretraining/modelling: 26% (6 lines)
  • IC/modelling: 11% (6 lines)

Duplication Between Components (50+ lines)

Duplicated lines per pair of components:
  • E2E_TOD -- data/multiwoz: 1,732
  • E2E_TOD -- IC: 305
  • E2E_TOD -- Pretraining: 251
  • DST -- E2E_TOD: 1,376
  • DST -- data/multiwoz: 221
  • DST -- IC: 315
  • DST -- Pretraining: 255
  • IC -- Pretraining: 272
  • DST/modelling -- E2E_TOD/modelling: 168
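To see which components are most entangled, the pairwise figures can be loaded into a small structure and ranked. The sketch below uses only the edge weights reported in this section; the function names are hypothetical. Summing weights per component is a rough indicator only, since the same duplicated lines may be counted in several pairs.

```python
from collections import Counter

# Edge weights exactly as reported for pairs with 50+ duplicated lines.
SHARED_LINES = {
    ("E2E_TOD", "data/multiwoz"): 1732,
    ("E2E_TOD", "IC"): 305,
    ("E2E_TOD", "Pretraining"): 251,
    ("DST", "E2E_TOD"): 1376,
    ("DST", "data/multiwoz"): 221,
    ("DST", "IC"): 315,
    ("DST", "Pretraining"): 255,
    ("IC", "Pretraining"): 272,
    ("DST/modelling", "E2E_TOD/modelling"): 168,
}


def rank_pairs(edges):
    """Component pairs sorted by reported duplicated lines, largest first."""
    return sorted(edges.items(), key=lambda item: item[1], reverse=True)


def totals_per_component(edges):
    """Sum of edge weights touching each component (a rough indicator only)."""
    totals = Counter()
    for (left, right), weight in edges.items():
        totals[left] += weight
        totals[right] += weight
    return totals


for (left, right), weight in rank_pairs(SHARED_LINES)[:3]:
    print(f"{left} -- {right}: {weight}")
```

The top of the ranking suggests where extracting a shared module would remove the most duplication first.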
Longest Duplicates
The 20 longest of the 174 duplicates.
Size | Copies | Folders | Files | Lines (% of file)
203 | 2 | E2E_TOD, data/multiwoz/utlis | db_ops.py, db_ops.py | 4:264 (100%), 4:264 (87%)
185 | 2 | E2E_TOD, data/multiwoz/utlis | utils.py, utils.py | 7:226 (95%), 7:243 (95%)
173 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 16:238 (49%), 18:240 (47%)
168 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 248:447 (48%), 247:441 (46%)
99 | 2 | E2E_TOD, data/multiwoz/utlis | ontology.py, ontology.py | 1:178 (100%), 1:178 (100%)
99 | 2 | DST, data/multiwoz/utlis | ontology.py, ontology.py | 1:178 (100%), 1:178 (100%)
99 | 2 | DST, E2E_TOD | ontology.py, ontology.py | 1:178 (100%), 1:178 (100%)
84 | 2 | DST/modelling, E2E_TOD/modelling | T5Model.py, T5Model.py | 7:101 (100%), 7:101 (100%)
54 | 2 | DST, E2E_TOD | dataclass.py, dataclass.py | 325:386 (13%), 368:429 (10%)
51 | 2 | data/multiwoz/utlis, data/multiwoz/utlis | postprocessing_dataset.py, processing_funcs.py | 5:61 (26%), 4:60 (34%)
50 | 2 | DST, E2E_TOD | dataclass.py, dataclass.py | 131:184 (12%), 143:196 (9%)
44 | 2 | DST, E2E_TOD | dataclass.py, dataclass.py | 230:275 (10%), 247:292 (8%)
39 | 2 | E2E_TOD, Pretraining | dataclass.py, dataclass.py | 371:414 (7%), 248:291 (16%)
39 | 2 | DST, Pretraining | dataclass.py, dataclass.py | 328:371 (9%), 248:291 (16%)
37 | 2 | data/multiwoz/utlis, data/multiwoz/utlis | postprocessing_dataset.py, processing_funcs.py | 63:102 (19%), 62:101 (25%)
33 | 2 | DST, E2E_TOD | dataclass.py, dataclass.py | 186:221 (8%), 198:232 (6%)
33 | 2 | DST, E2E_TOD | learn.py, learn.py | 197:232 (12%), 139:174 (15%)
33 | 2 | IC, IC | inference.py, learn.py | 82:114 (32%), 166:198 (16%)
30 | 2 | IC, Pretraining | learn.py, pretrain.py | 122:153 (14%), 110:140 (16%)
29 | 2 | DST, Pretraining | learn.py, pretrain.py | 202:232 (10%), 111:140 (16%)
Duplicated Units
The top 20 of the 39 duplicated units.
Size | Copies | Folders | Files | Lines
81 | 2 | E2E_TOD, data/multiwoz/utlis | db_ops.py, db_ops.py | 0:0, 0:0
46 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
40 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
37 | 2 | E2E_TOD, data/multiwoz/utlis | db_ops.py, db_ops.py | 0:0, 0:0
35 | 2 | E2E_TOD/modelling, DST/modelling | T5Model.py, T5Model.py | 0:0, 0:0
31 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
34 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
28 | 2 | E2E_TOD, data/multiwoz/utlis | db_ops.py, db_ops.py | 0:0, 0:0
36 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
24 | 2 | data/multiwoz/utlis, data/multiwoz/utlis | postprocessing_dataset.py, processing_funcs.py | 0:0, 0:0
23 | 2 | E2E_TOD, data/multiwoz/utlis | db_ops.py, db_ops.py | 0:0, 0:0
19 | 4 | E2E_TOD, E2E_TOD/modelling, DST, DST/modelling | dataclass.py, T5Model.py, dataclass.py, T5Model.py | 0:0, 0:0, 0:0, 0:0
19 | 2 | E2E_TOD, data/multiwoz/utlis | utils.py, utils.py | 0:0, 0:0
19 | 2 | data/multiwoz/utlis, data/multiwoz/utlis | postprocessing_dataset.py, processing_funcs.py | 0:0, 0:0
21 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
17 | 2 | E2E_TOD/modelling, DST/modelling | T5Model.py, T5Model.py | 0:0, 0:0
19 | 2 | E2E_TOD, data/multiwoz/utlis | reader.py, reader.py | 0:0, 0:0
16 | 2 | E2E_TOD, DST | dataclass.py, dataclass.py | 0:0, 0:0
15 | 2 | E2E_TOD, DST | dataclass.py, dataclass.py | 0:0, 0:0
15 | 2 | E2E_TOD, DST | dataclass.py, dataclass.py | 0:0, 0:0