aws-samples / sagemaker-huggingface-nlp
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • You should aim to keep duplicated code as low as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
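The steps described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is an illustrative approximation, not the analyzer's actual implementation: the function names `clean_lines` and `find_duplicates` are hypothetical, and the cleaning rules are simplified to Python-style comments and imports.

```python
MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean_lines(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements.

    Assumption: a simplified stand-in for the report's cleaning step.
    """
    cleaned = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        cleaned.append(line)
    return cleaned


def find_duplicates(a: list[str], b: list[str], min_block: int = MIN_BLOCK):
    """Return (start_a, start_b, length) for each maximal run of identical
    consecutive lines shared by a and b that is at least min_block long."""
    matches = []
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous line of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                # The run is maximal if either file ends or the next lines differ.
                run_ends = i == len(a) or j == len(b) or a[i] != b[j]
                if run_ends and curr[j] >= min_block:
                    n = curr[j]
                    matches.append((i - n, j - n, n))
        prev = curr
    return matches


if __name__ == "__main__":
    shared = "\n".join(f"step_{k}()" for k in range(7))
    file_a = "import os\n# setup\n" + shared + "\nx = 1\n"
    file_b = "import sys\n\ny = 2\n" + shared + "\n"
    print(find_duplicates(clean_lines(file_a), clean_lines(file_b)))
```

The indices returned refer to positions in the cleaned line lists, not to original file line numbers; a real tool would also keep a mapping back to the source lines in order to report ranges.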
Duplication Overall
  • 75% duplication:
    • 258 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 196 duplicated lines
  • 5 duplicates
system: 75% (196 lines)
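As a quick check of the overall figure, the percentage is the number of duplicated lines divided by the number of cleaned lines. The sketch below uses the two counts from the report; the helper name `duplication_percent` is hypothetical, and the observation that the report appears to round down (rather than to the nearest integer) is an assumption based only on these numbers.

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that are part of a duplicate, in percent."""
    if cleaned_lines <= 0:
        raise ValueError("cleaned_lines must be positive")
    return 100.0 * duplicated_lines / cleaned_lines


# Counts taken from the report: 196 duplicated lines out of 258 cleaned lines.
overall = duplication_percent(196, 258)
print(f"{overall:.2f}%")   # exact ratio, two decimals
print(f"{int(overall)}%")  # truncated value, which matches the reported figure
```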
Duplication per Extension
py: 75% (196 lines)
Duplication per Component (primary)
ROOT: 80% (98 lines)
scripts: 71% (98 lines)

Duplication Between Components (50+ lines)

ROOT -- scripts: 196 lines
Longest Duplicates
The list of 5 longest duplicates.
Size  #    Folder   File                  Lines
51    x 2  ROOT     distilbertqatrain.py  41:117 (42%)
           scripts  train.py              27:103 (45%)
18    x 2  ROOT     distilbertqatrain.py  179:203 (14%)
           scripts  train.py              160:184 (15%)
16    x 2  ROOT     distilbertqatrain.py  149:173 (13%)
           scripts  train.py              130:154 (14%)
7     x 2  ROOT     distilbertqatrain.py  15:23 (5%)
           scripts  train.py              16:24 (6%)
6     x 2  ROOT     distilbertqatrain.py  132:138 (4%)
           scripts  train.py              115:121 (5%)
Duplicated Units
The list of top 3 duplicated units.
Size  #    Folder   File                  Lines
25    x 2  ROOT     distilbertqatrain.py  0:0
           scripts  train.py              0:0
18    x 2  ROOT     distilbertqatrain.py  0:0
           scripts  train.py              0:0
11    x 2  ROOT     distilbertqatrain.py  0:0
           scripts  train.py              0:0