aws / sagemaker-spark
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (under 5%): a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
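The measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's actual implementation: the function names (`clean`, `duplicated_line_indexes`, `duplication_percentage`) are hypothetical, and the cleaning rules are simplified assumptions that only handle `#` comments and Python-style imports.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements.

    Simplified assumption: a real cleaner would be language-aware.
    """
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def duplicated_line_indexes(lines: list[str], min_block: int = MIN_BLOCK) -> set[int]:
    """Return indexes of cleaned lines that sit inside a run of `min_block`
    consecutive lines occurring at least twice."""
    windows = defaultdict(list)
    for start in range(len(lines) - min_block + 1):
        windows[tuple(lines[start:start + min_block])].append(start)

    duplicated = set()
    for starts in windows.values():
        if len(starts) > 1:
            for start in starts:
                duplicated.update(range(start, start + min_block))
    return duplicated


def duplication_percentage(source: str) -> float:
    """Share of cleaned lines that belong to a duplicated block, in percent."""
    lines = clean(source)
    if not lines:
        return 0.0
    return 100.0 * len(duplicated_line_indexes(lines)) / len(lines)
```

Sliding a fixed six-line window and taking the union of matching windows also covers longer duplicates, since a 30-line repeat is just a chain of overlapping six-line repeats. The sketch does not try to separate overlapping occurrences inside a single repetitive run, which a production tool would likely handle.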
Duplication Overall
  • 31% duplication:
    • 6,680 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,098 duplicated lines
  • 421 duplicates
system: 31% (2,098 lines)
Duplication per Extension
py: 31% (1,057 lines)
scala: 28% (826 lines)
proto: 83% (172 lines)
yml: 48% (43 lines)
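The overall figure can be cross-checked against the per-extension counts. The short sketch below uses only numbers reported on this page; the variable names are illustrative.

```python
# Figures copied from this report.
cleaned_lines = 6680
duplicated_by_extension = {"py": 1057, "scala": 826, "proto": 172, "yml": 43}

# The per-extension counts should add up to the reported total of 2,098.
duplicated_total = sum(duplicated_by_extension.values())

# Overall duplication is duplicated lines divided by cleaned lines.
overall_percent = 100 * duplicated_total / cleaned_lines

print(duplicated_total)        # 2098
print(round(overall_percent))  # 31
```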
Duplication per Component (primary)
sagemaker-spark-sdk/src/main/scala: 28% (826 lines)
sagemaker-pyspark-sdk/src/sagemaker_pyspark/algorithms: 35% (774 lines)
sagemaker-pyspark-sdk/src/sagemaker_pyspark: 23% (194 lines)
protobuf: 83% (172 lines)
sagemaker-pyspark-sdk/src/sagemaker_pyspark/transformation: 33% (89 lines)
ROOT: 48% (43 lines)
sagemaker-spark-sdk/project: 0% (0 lines)
sagemaker-spark-sdk: 0% (0 lines)
sagemaker-pyspark-sdk: 0% (0 lines)
sagemaker-pyspark-sdk/src: 0% (0 lines)

Duplication Between Components (50+ lines)

  • sagemaker-pyspark-sdk/src/sagemaker_pyspark and sagemaker-pyspark-sdk/src/sagemaker_pyspark/algorithms: 439 lines
  • sagemaker-pyspark-sdk/src/sagemaker_pyspark and sagemaker-spark-sdk/src/main/scala: 72 lines

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
57 | 2 | sagemaker-spark-sdk/src/...ker/sparksdk/algorithms, sagemaker-spark-sdk/src/...ker/sparksdk/algorithms | LinearLearnerSageMakerEstimator.scala, LinearLearnerSageMakerEstimator.scala | 597:653 (9%), 754:810 (9%)
55 | 2 | protobuf, protobuf | record-2.5.proto, record.proto | 7:72 (53%), 8:73 (52%)
38 | 2 | sagemaker-spark-sdk/src/...ker/sparksdk/algorithms, sagemaker-spark-sdk/src/...ker/sparksdk/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.scala | 646:683 (11%), 1056:1093 (6%)
33 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 472:505 (8%), 1138:1171 (4%)
33 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., FactorizationMachinesSageMakerEstimat... | 472:505 (8%), 653:686 (8%)
33 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 653:686 (8%), 1138:1171 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | LinearLearnerSageMakerEstimator.py, LinearLearnerSageMakerEstimator.py | 882:917 (4%), 1088:1123 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 603:638 (7%), 882:917 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 603:638 (7%), 650:685 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 603:638 (7%), 1088:1123 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 421:456 (7%), 1088:1123 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 421:456 (7%), 882:917 (4%)
32 | 2 | sagemaker-spark-sdk/src/...ker/sparksdk/algorithms, sagemaker-spark-sdk/src/...ker/sparksdk/algorithms | FactorizationMachinesSageMakerEstimat..., FactorizationMachinesSageMakerEstimat... | 386:417 (9%), 536:567 (9%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., LinearLearnerSageMakerEstimator.py | 421:456 (7%), 650:685 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | LinearLearnerSageMakerEstimator.py, LinearLearnerSageMakerEstimator.py | 650:685 (4%), 882:917 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | LinearLearnerSageMakerEstimator.py, LinearLearnerSageMakerEstimator.py | 650:685 (4%), 1088:1123 (4%)
32 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., FactorizationMachinesSageMakerEstimat... | 421:456 (7%), 603:638 (7%)
31 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., KMeansSageMakerEstimator.py | 472:502 (7%), 227:257 (14%)
31 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | FactorizationMachinesSageMakerEstimat..., XGBoostSageMakerEstimator.py | 653:683 (7%), 390:420 (6%)
31 | 2 | sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms, sagemaker-pyspark-sdk/sr...aker_pyspark/algorithms | LDASageMakerEstimator.py, PCASageMakerEstimator.py | 155:189 (20%), 142:176 (21%)