awslabs / mlmax
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
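The rule described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, the cleaning step only handles Python-style `#` comments and imports, and ignoring indentation is an assumption.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comments, and import statements.

    Assumption: indentation is ignored and only Python-style
    comments/imports are recognized. A real tool is likely more thorough.
    """
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicates(files: dict[str, str], min_lines: int = MIN_LINES):
    """Map each window of `min_lines` cleaned lines to the places it occurs.

    Only windows that occur in more than one place are returned. Positions
    are indexes into the cleaned lines, not original line numbers.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - min_lines + 1):
            window = tuple(lines[i:i + min_lines])
            seen[window].append((name, i))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```

The sketch reports fixed-size windows only. Merging overlapping windows into the longest possible duplicate block, as the line ranges later in this report suggest the tool does, is left out for brevity.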
Duplication Overall
  • 41% duplication:
    • 3,725 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 1,534 duplicated lines
  • 266 duplicates
system: 41% (1,534 lines)
Duplication per Extension
yaml: 57% (1,015 lines)
py: 26% (519 lines)
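As a quick consistency check on the figures above, the overall percentage and the per-extension line counts can be recomputed directly. The variable names are illustrative. The last step computes each extension's share of all duplicated lines, which is a different quantity from the per-extension percentages reported above (those presumably relate duplicated lines to each extension's own cleaned lines).

```python
# Figures copied from the report.
cleaned_lines = 3725
duplicated_lines = 1534
by_extension = {"yaml": 1015, "py": 519}

# Overall duplication: duplicated lines as a share of cleaned lines.
overall_pct = round(100 * duplicated_lines / cleaned_lines)

# The per-extension duplicated lines should add up to the system total.
extension_total = sum(by_extension.values())

# Share of all duplicated lines contributed by each extension.
share = {ext: round(100 * n / duplicated_lines) for ext, n in by_extension.items()}

print(overall_pct)                          # 41
print(extension_total == duplicated_lines)  # True
print(share)                                # {'yaml': 66, 'py': 34}
```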
Duplication per Component (primary)
modules/pipeline/templates: 77% (464 lines)
modules/data/templates: 65% (212 lines)
modules/pipeline: 24% (194 lines)
modules/monitoring/templates: 70% (173 lines)
modules/environment: 27% (166 lines)
modules/monitoring: 54% (133 lines)
modules/data: 51% (124 lines)
src/mlmax: 15% (68 lines)
ROOT: 0% (0 lines)
modules/environment/util: 0% (0 lines)
modules/data/src: 0% (0 lines)

Duplication Between Components (50+ lines)

modules/data/templates -- modules/pipeline/templates: 498 lines
modules/monitoring/templates -- modules/pipeline/templates: 498 lines
modules/data/templates -- modules/monitoring/templates: 298 lines
modules/data -- modules/monitoring: 229 lines
modules/monitoring -- modules/pipeline: 168 lines
modules/data -- modules/pipeline: 139 lines
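The component pairs above form a small weighted, undirected graph. One way to work with it is as a plain edge list that can be filtered by the 50-line threshold and ranked by weight. This is only a sketch: the helper name `pairs_at_least` is hypothetical, and the weights are copied from the graph labels on the assumption that they count duplicated lines.

```python
# (component A, component B, weight) copied from the graph labels.
edges = [
    ("modules/data/templates", "modules/pipeline/templates", 498),
    ("modules/data/templates", "modules/monitoring/templates", 298),
    ("modules/monitoring/templates", "modules/pipeline/templates", 498),
    ("modules/data", "modules/monitoring", 229),
    ("modules/data", "modules/pipeline", 139),
    ("modules/monitoring", "modules/pipeline", 168),
]


def pairs_at_least(edges, threshold=50):
    """Return pairs whose weight meets the threshold, heaviest first."""
    kept = [e for e in edges if e[2] >= threshold]
    return sorted(kept, key=lambda e: -e[2])


for a, b, weight in pairs_at_least(edges):
    print(f"{a} -- {b}: {weight}")
```

All six pairs pass the default threshold. Raising it, for example `pairs_at_least(edges, 200)`, narrows the list to the template folders and the `modules/data` and `modules/monitoring` pair.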

Longest Duplicates
The list of 20 longest duplicates.
Size  #   Folder                        File                          Lines
99    2   modules/data/templates        roles.yaml                    1:104 (73%)
          modules/pipeline/templates    roles.yaml                    1:104 (93%)
34    2   modules/data/templates        my_data_pipeline.yaml         63:97 (36%)
          modules/monitoring/templates  my_monitor_pipeline.yaml      82:116 (30%)
32    2   modules/pipeline/templates    my_inference_pipeline.yaml    165:196 (16%)
          modules/pipeline/templates    my_training_pipeline.yaml     236:267 (12%)
31    2   modules/monitoring/templates  roles.yaml                    74:124 (29%)
          modules/pipeline/templates    roles.yaml                    74:124 (29%)
30    2   modules/pipeline/templates    my_inference_pipeline.yaml    20:49 (15%)
          modules/pipeline/templates    my_training_pipeline.yaml     19:48 (11%)
29    2   modules/data/templates        roles.yaml                    74:104 (21%)
          modules/monitoring/templates  roles.yaml                    74:104 (27%)
23    2   modules/monitoring/templates  roles.yaml                    2:27 (21%)
          modules/pipeline/templates    roles.yaml                    2:27 (21%)
23    2   modules/data/templates        roles.yaml                    2:27 (17%)
          modules/monitoring/templates  roles.yaml                    2:27 (21%)
20    2   modules/data/templates        my_data_pipeline.yaml         73:92 (21%)
          modules/pipeline/templates    my_training_pipeline.yaml     248:267 (7%)
20    2   modules/monitoring/templates  my_monitor_pipeline.yaml      92:111 (17%)
          modules/pipeline/templates    my_inference_pipeline.yaml    177:196 (10%)
20    2   modules/pipeline              inference_pipeline_define.py  41:68 (16%)
          modules/pipeline              training_pipeline_define.py   52:78 (13%)
20    2   modules/data/templates        my_data_pipeline.yaml         73:92 (21%)
          modules/pipeline/templates    my_inference_pipeline.yaml    177:196 (10%)
20    2   modules/monitoring/templates  my_monitor_pipeline.yaml      92:111 (17%)
          modules/pipeline/templates    my_training_pipeline.yaml     248:267 (7%)
19    2   modules/pipeline/templates    my_inference_pipeline.yaml    80:98 (9%)
          modules/pipeline/templates    my_training_pipeline.yaml     86:104 (7%)
16    2   modules/data                  data_pipeline_run.py          62:82 (31%)
          modules/monitoring            monitor_pipeline_run.py       73:93 (26%)
15    2   modules/pipeline/templates    my_training_pipeline.yaml     90:104 (5%)
          modules/pipeline/templates    my_training_pipeline.yaml     236:250 (5%)
15    2   modules/data/templates        my_data_pipeline.yaml         1:17 (15%)
          modules/monitoring/templates  my_monitor_pipeline.yaml      1:17 (13%)
15    2   modules/data/templates        my_data_pipeline.yaml         1:17 (15%)
          modules/pipeline/templates    my_training_pipeline.yaml     1:17 (5%)
15    2   modules/data/templates        my_data_pipeline.yaml         1:17 (15%)
          modules/pipeline/templates    my_inference_pipeline.yaml    1:18 (7%)
15    2   modules/pipeline/templates    my_inference_pipeline.yaml    84:98 (7%)
          modules/pipeline/templates    my_training_pipeline.yaml     236:250 (5%)
Duplicated Units
The list of top 2 duplicated units.
Size  #   Folder                        File                          Lines
7     2   modules/monitoring            monitor_pipeline_create.py    0:0
          modules/monitoring            monitor_pipeline_define.py    0:0
7     2   modules/monitoring            monitor_pipeline_run.py       0:0
          modules/data                  data_pipeline_run.py          0:0