aws-samples / sagemaker-end-to-end-workshop
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
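The cleaning and matching steps described above can be sketched roughly as follows. This is a minimal illustration and not the analyzer's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, the cleaning rules cover only Python-style `#` comments and import statements, and stripping indentation before comparing lines is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines and import statements (assumed rules)."""
    cleaned = []
    for line in source.splitlines():
        stripped = line.strip()  # assumption: indentation is ignored when comparing
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(("import ", "from ")):
            continue
        cleaned.append(stripped)
    return cleaned


def find_duplicates(files: dict[str, str], min_block: int = MIN_BLOCK) -> dict:
    """Map each window of min_block cleaned lines to the places it occurs,
    keeping only windows that occur in more than one place."""
    windows = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            key = tuple(lines[start:start + min_block])
            windows[key].append((name, start))
    return {key: places for key, places in windows.items() if len(places) > 1}
```

Calling `find_duplicates({"a.py": text_a, "b.py": text_b})` returns each repeated six-line window together with the file name and the cleaned-line offset of every occurrence. A real tool would additionally merge overlapping windows into the longer duplicates listed later in this report.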
Duplication Overall
  • 66% duplication:
    • 785 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 519 duplicated lines
  • 18 duplicates
system: 66% (519 lines)
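The overall figure follows from the two line counts above. A short sketch of the arithmetic, assuming the report rounds to the nearest whole percent (the helper name `duplication_percent` is hypothetical):

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> int:
    """Share of cleaned lines that are duplicated, as a whole percent."""
    return round(100 * duplicated_lines / cleaned_lines)


TARGET_PERCENT = 5  # the "<5%" guideline from the intro

overall = duplication_percent(519, 785)  # line counts taken from this report
print(f"{overall}% duplicated, target is below {TARGET_PERCENT}%")
```

With 519 duplicated lines out of 785 cleaned lines, the ratio is about 0.661, far above the 5% guideline.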
Duplication per Extension
py: 67% (519 lines)
Duplication per Component (primary)
4-Deployment/RealTime/config: 100% (171 lines)
4-Deployment/Batch/config: 88% (167 lines)
6-Pipelines/config: 49% (120 lines)
3-Evaluation/solutions: 68% (32 lines)
5-Monitoring/config: 30% (29 lines)
5-Monitoring: 0% (0 lines)
6-Pipelines/modelbuild: 0% (0 lines)
ROOT: 0% (0 lines)

Duplication Between Components (50+ lines)

  • 4-Deployment/Batch/config -- 4-Deployment/RealTime/config: 334 lines
  • 4-Deployment/Batch/config -- 6-Pipelines/config: 168 lines
  • 4-Deployment/RealTime/config -- 6-Pipelines/config: 176 lines
  • 4-Deployment/RealTime/config -- 5-Monitoring/config: 58 lines
  • 3-Evaluation/solutions -- 6-Pipelines/config: 64 lines

Longest Duplicates
The list of 18 longest duplicates.
Size  #    Folders                       Files                         Lines
88    x 2  4-Deployment/RealTime/config  xgboost_customer_churn.py     14:137 (98%)
           6-Pipelines/config            xgboost_customer_churn.py     14:137 (98%)
78    x 2  4-Deployment/Batch/config     xgboost_customer_churn.py     17:130 (76%)
           6-Pipelines/config            xgboost_customer_churn.py     14:127 (87%)
78    x 2  4-Deployment/Batch/config     xgboost_customer_churn.py     17:130 (76%)
           4-Deployment/RealTime/config  xgboost_customer_churn.py     14:127 (87%)
52    x 2  4-Deployment/Batch/config     solution_lab2.py              73:136 (60%)
           4-Deployment/RealTime/config  solution_lab2.py              66:129 (63%)
26    x 2  3-Evaluation/solutions        evaluate_with_experiments.py  46:78 (55%)
           6-Pipelines/config            evaluate.py                   43:75 (54%)
15    x 2  4-Deployment/RealTime/config  xgboost_customer_churn.py     116:138 (16%)
           5-Monitoring/config           inference.py                  8:30 (44%)
14    x 2  4-Deployment/Batch/config     solution_lab2.py              21:44 (16%)
           4-Deployment/RealTime/config  solution_lab2.py              21:44 (17%)
14    x 2  5-Monitoring/config           inference.py                  8:29 (41%)
           6-Pipelines/config            xgboost_customer_churn.py     116:137 (15%)
10    x 2  4-Deployment/Batch/config     solution_lab2.py              59:70 (11%)
           4-Deployment/RealTime/config  solution_lab2.py              53:64 (12%)
8     x 2  4-Deployment/Batch/config     solution_lab2.py              59:66 (9%)
           5-Monitoring/config           solution_lab4.py              44:51 (13%)
8     x 2  4-Deployment/RealTime/config  solution_lab2.py              53:60 (9%)
           5-Monitoring/config           solution_lab4.py              44:51 (13%)
7     x 2  4-Deployment/Batch/config     xgboost_customer_churn.py     137:143 (6%)
           4-Deployment/RealTime/config  xgboost_customer_churn.py     132:138 (7%)
7     x 2  4-Deployment/Batch/config     xgboost_customer_churn.py     137:143 (6%)
           5-Monitoring/config           inference.py                  24:30 (20%)
6     x 2  4-Deployment/Batch/config     solution_lab2.py              50:55 (6%)
           4-Deployment/RealTime/config  solution_lab2.py              46:51 (7%)
6     x 2  4-Deployment/Batch/config     xgboost_customer_churn.py     137:142 (5%)
           6-Pipelines/config            xgboost_customer_churn.py     132:137 (6%)
6     x 2  3-Evaluation/solutions        evaluate_with_experiments.py  35:42 (12%)
           6-Pipelines/config            evaluate.py                   32:39 (12%)
6     x 2  4-Deployment/RealTime/config  solution_lab2.py              46:51 (7%)
           5-Monitoring/config           solution_lab4.py              36:41 (9%)
6     x 2  4-Deployment/Batch/config     solution_lab2.py              50:55 (6%)
           5-Monitoring/config           solution_lab4.py              36:41 (9%)
Duplicated Units
The list of top 4 duplicated units.
Size  #    Folders                       Files                         Lines
43    x 3  4-Deployment/Batch/config     xgboost_customer_churn.py     0:0
           4-Deployment/RealTime/config  xgboost_customer_churn.py     0:0
           6-Pipelines/config            xgboost_customer_churn.py     0:0
19    x 3  4-Deployment/Batch/config     xgboost_customer_churn.py     0:0
           4-Deployment/RealTime/config  xgboost_customer_churn.py     0:0
           6-Pipelines/config            xgboost_customer_churn.py     0:0
22    x 2  4-Deployment/RealTime/config  xgboost_customer_churn.py     0:0
           5-Monitoring/config           inference.py                  0:0
9     x 3  4-Deployment/Batch/config     xgboost_customer_churn.py     0:0
           4-Deployment/RealTime/config  xgboost_customer_churn.py     0:0
           6-Pipelines/config            xgboost_customer_churn.py     0:0