aws-samples / amazon-sagemaker-mlops-with-featurestore-and-datawrangler
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured by finding places in code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%): a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
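The cleaning and matching steps described above can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the function names (`clean`, `find_duplicate_windows`, `duplication_ratio`) are hypothetical, cleaning is reduced to dropping blank lines, `#` comments, and import statements, and lines are compared after stripping whitespace, which is a simplification of "exactly the same".

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements (simplified)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files: dict[str, str], size: int = MIN_BLOCK):
    """Map each window of `size` cleaned lines to every place it occurs.

    Only windows that occur at more than one location are returned.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - size + 1):
            window = tuple(lines[start:start + size])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}


def duplication_ratio(files: dict[str, str], size: int = MIN_BLOCK) -> float:
    """Share of cleaned lines covered by at least one duplicated window."""
    total = sum(len(clean(src)) for src in files.values())
    duplicated = set()
    for locs in find_duplicate_windows(files, size).values():
        for name, start in locs:
            duplicated.update((name, start + i) for i in range(size))
    return len(duplicated) / total if total else 0.0
```

Calling `duplication_ratio({"a.py": text_a, "b.py": text_b})` returns a fraction between 0 and 1. A production tool would also merge overlapping windows into maximal blocks before reporting them, which is how a single 75-line duplicate can appear as one entry rather than 70 overlapping 6-line windows.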
Duplication Overall
  • 23% duplication:
    • 4,204 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 1,001 duplicated lines
  • 100 duplicates
system: 23% (1,001 lines)
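The overall figure follows from the two counts: 1,001 duplicated lines out of 4,204 cleaned lines is about 23.8%, which the report displays as 23%. That the display truncates rather than rounds is an assumption; the report does not state its rounding rule. The sketch below, with hypothetical helper names, reproduces the arithmetic and estimates how many duplicated lines would fit under the 5% target if the cleaned-line total stayed fixed.

```python
import math


def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Duplicated lines as a percentage of cleaned lines."""
    return 100.0 * duplicated_lines / cleaned_lines


def lines_allowed(cleaned_lines: int, target_percent: float) -> int:
    """Largest duplicated-line count strictly below the target percentage.

    Assumes the cleaned-line total does not change, which is a simplification:
    removing duplicates normally shrinks the total as well.
    """
    return math.ceil(cleaned_lines * target_percent / 100.0) - 1


pct = duplication_percent(1001, 4204)
print(f"{pct:.1f}%")            # 23.8%
print(math.floor(pct))          # 23, matching the displayed figure
print(lines_allowed(4204, 5))   # 210
```

Under these assumptions, roughly 790 of the 1,001 duplicated lines would need to be removed or consolidated to reach the target.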
Duplication per Extension
py: 24% (1,001 lines)
Duplication per Component (primary)
repos/serving/lambdas: 46% (174 lines)
repos/serving/infra: 14% (114 lines)
repos/features_ingestion_pipeline/infra: 45% (110 lines)
repos/build_pipeline/infra: 57% (110 lines)
repos/build_pipeline/pipelines: 17% (97 lines)
infra: 10% (91 lines)
demo-workspace/utils: 53% (76 lines)
repos/build_pipeline/scripts: 75% (55 lines)
demo-workspace/scripts: 79% (55 lines)
demo-workspace/tensorflow: 15% (45 lines)
repos/serving/pipelines: 23% (37 lines)
repos/features_ingestion_pipeline/pipelines: 26% (19 lines)
ROOT: 6% (6 lines)
repos/features_ingestion_pipeline: 17% (6 lines)
repos/build_pipeline: 16% (6 lines)
repos/build_pipeline/lambdas: 0% (0 lines)
repos/serving: 0% (0 lines)
repos/serving/scripts: 0% (0 lines)
lambdas/functions/auto_approval: 0% (0 lines)

Duplication Between Components (50+ lines)

  • repos/build_pipeline/infra -- repos/features_ingestion_pipeline/infra: 194 lines
  • repos/build_pipeline/infra -- repos/serving/infra: 186 lines
  • repos/features_ingestion_pipeline/infra -- repos/serving/infra: 174 lines
  • demo-workspace/scripts -- repos/build_pipeline/scripts: 110 lines
  • demo-workspace/utils -- repos/serving/lambdas: 92 lines
  • demo-workspace/tensorflow -- repos/serving/pipelines: 74 lines

Longest Duplicates
The list of 20 longest duplicates.
Size x #, followed by the folder, file, and lines of each copy:
  • 75 x 2
    • repos/features_ingestion_pipeline/infra, sm_pipeline_utils.py, 11:123 (100%)
    • repos/serving/infra, sm_pipeline_utils.py, 11:123 (100%)
  • 49 x 2
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 11:90 (65%)
    • repos/serving/infra, sm_pipeline_utils.py, 11:90 (65%)
  • 49 x 2
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 11:90 (65%)
    • repos/features_ingestion_pipeline/infra, sm_pipeline_utils.py, 11:90 (65%)
  • 47 x 2
    • repos/serving/lambdas/functions/read-ddb, lambda_function.py, 9:85 (100%)
    • repos/serving/lambdas/functions/read-sm, lambda_function.py, 9:86 (100%)
  • 46 x 2
    • demo-workspace/utils, create_dataset.py, 7:52 (80%)
    • repos/serving/lambdas/functions/xgboost_inference, lambda_function.py, 23:68 (41%)
  • 25 x 2
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 92:123 (33%)
    • repos/features_ingestion_pipeline/infra, sm_pipeline_utils.py, 92:123 (33%)
  • 25 x 2
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 92:123 (33%)
    • repos/serving/infra, sm_pipeline_utils.py, 92:123 (33%)
  • 25 x 2
    • demo-workspace/scripts, xgboost_starter_script.py, 29:59 (44%)
    • repos/build_pipeline/scripts, xgboost_starter_script.py, 32:62 (41%)
  • 19 x 2
    • demo-workspace/utils, parse_flow.py, 6:28 (100%)
    • repos/features_ingestion_pipeline/pipelines, parse_flow.py, 6:28 (100%)
  • 17 x 2
    • repos/serving/lambdas/fu...rocessing-job-execution, lambda_function.py, 8:35 (27%)
    • repos/serving/lambdas/fu...essing-job-status-check, lambda_function.py, 8:36 (30%)
  • 13 x 2
    • demo-workspace/scripts, xgboost_starter_script.py, 65:82 (23%)
    • repos/build_pipeline/scripts, xgboost_starter_script.py, 68:85 (21%)
  • 11 x 2
    • demo-workspace/utils, feature_store_utils.py, 6:18 (22%)
    • repos/features_ingestion_pipeline/infra, feature_store_utils.py, 9:21 (18%)
  • 10 x 2
    • demo-workspace/tensorflow, tf_pipeline.py, 89:98 (4%)
    • repos/serving/pipelines, batch_transform_serving_pipeline.py, 96:105 (6%)
  • 9 x 2
    • demo-workspace/scripts, create_dataset.py, 13:26 (69%)
    • repos/build_pipeline/scripts, create_dataset.py, 13:27 (69%)
  • 8 x 2
    • infra, mlops_featurestore_construct.py, 238:246 (2%)
    • infra, mlops_featurestore_construct.py, 256:263 (2%)
  • 8 x 2
    • repos/build_pipeline/infra, build_model_stack.py, 35:42 (6%)
    • repos/features_ingestion_pipeline/infra, features_ingestion_stack.py, 33:40 (7%)
  • 8 x 2
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 370:377 (1%)
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 424:432 (1%)
  • 8 x 2
    • repos/build_pipeline/infra, build_model_stack.py, 90:97 (6%)
    • repos/features_ingestion_pipeline/infra, features_ingestion_stack.py, 121:128 (7%)
  • 8 x 2
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 370:377 (1%)
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 465:473 (1%)
  • 8 x 2
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 424:432 (1%)
    • repos/build_pipeline/pipelines, xgboost_pipeline.py, 465:473 (1%)
Duplicated Units
The list of top 4 duplicated units.
Size x #, followed by the folder, file, and lines of each copy:
  • 44 x 2
    • repos/serving/lambdas/functions/read-ddb, lambda_function.py, 0:0
    • repos/serving/lambdas/functions/read-sm, lambda_function.py, 0:0
  • 26 x 3
    • repos/features_ingestion_pipeline/infra, sm_pipeline_utils.py, 0:0
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 0:0
    • repos/serving/infra, sm_pipeline_utils.py, 0:0
  • 9 x 2
    • repos/serving/lambdas/fu...rocessing-job-execution, lambda_function.py, 0:0
    • repos/serving/lambdas/fu...essing-job-status-check, lambda_function.py, 0:0
  • 6 x 3
    • repos/features_ingestion_pipeline/infra, sm_pipeline_utils.py, 0:0
    • repos/build_pipeline/infra, sm_pipeline_utils.py, 0:0
    • repos/serving/infra, sm_pipeline_utils.py, 0:0