aws-samples / amazon-eks-apache-spark-etl-sample
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
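The two steps described above (clean the code, then look for runs of 6 or more identical lines) can be sketched as follows. This is a minimal illustration, not the report tool's actual algorithm, which the report does not describe. The function names, the comment prefixes (`#`, `//`), and the rule that an import is any line starting with `import ` are all assumptions made for this example.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report


def clean(lines, comment_prefixes=("#", "//")):
    """Drop empty lines, whole-line comments, and import statements.

    Returns (original_line_number, stripped_text) pairs.
    The prefixes and the import rule are assumptions for this sketch.
    """
    kept = []
    for number, text in enumerate(lines, start=1):
        stripped = text.strip()
        if not stripped:
            continue
        if stripped.startswith(comment_prefixes):
            continue
        if stripped.startswith("import "):
            continue
        kept.append((number, stripped))
    return kept


def duplicated_line_numbers(lines, min_block=MIN_BLOCK):
    """Return the original line numbers covered by any window of
    `min_block` consecutive cleaned lines that occurs more than once."""
    cleaned = clean(lines)
    windows = defaultdict(list)
    for start in range(len(cleaned) - min_block + 1):
        key = tuple(text for _, text in cleaned[start:start + min_block])
        windows[key].append(start)
    duplicated = set()
    for starts in windows.values():
        if len(starts) > 1:
            for start in starts:
                for number, _ in cleaned[start:start + min_block]:
                    duplicated.add(number)
    return duplicated


def duplication_percent(lines):
    """Duplicated cleaned lines as a percentage of all cleaned lines."""
    cleaned = clean(lines)
    if not cleaned:
        return 0.0
    return 100.0 * len(duplicated_line_numbers(lines)) / len(cleaned)
```

Because each duplicated line is collected into a set, a line that takes part in several overlapping duplicates is counted once.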
Duplication Overall
  • 13% duplication:
    • 501 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 70 duplicated lines
  • 6 duplicates
system: 13% (70 lines)
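The overall figure follows directly from the two line counts. The short calculation below reproduces it and estimates how far the codebase is from the <5% target. That the tool truncates rather than rounds the percentage is an assumption inferred from the numbers (70 / 501 is about 13.97%), and the estimate treats the cleaned-line total as fixed, which is only approximate because removing duplicates also shrinks that total.

```python
import math

cleaned_lines = 501      # from the report
duplicated_lines = 70    # from the report
target_percent = 5       # upper bound recommended in the intro

exact_percent = 100 * duplicated_lines / cleaned_lines

# Assumption: the report truncates the percentage instead of rounding it.
reported_percent = math.floor(exact_percent)

# Largest duplicated-line count that stays under the target,
# holding the cleaned-line total at 501 (a rough simplification).
max_lines_for_target = math.floor(cleaned_lines * target_percent / 100)
lines_to_remove = duplicated_lines - max_lines_for_target

print(f"exact: {exact_percent:.2f}%, reported: {reported_percent}%")
print(f"duplicated lines to remove for <{target_percent}%: {lines_to_remove}")
```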
Duplication per Extension
yaml: 18% (34 lines)
yml: 14% (24 lines)
scala: 9% (12 lines)
Duplication per Component (primary)
kubernetes: 16% (58 lines)
spark-application/src/main/scala: 9% (12 lines)
spark-application/project: 0% (0 lines)
spark-application: 0% (0 lines)
Longest Duplicates
The list of 6 longest duplicates.
Size  #    Folder                            File                    Lines (1st copy)  Lines (2nd copy)
12    x 2  kubernetes                        eksctl.yaml             76:87 (8%)        110:121 (8%)
10    x 2  kubernetes                        eksctl.yaml             112:121 (7%)      150:159 (7%)
10    x 2  kubernetes                        eksctl.yaml             78:87 (7%)        150:159 (7%)
6     x 2  kubernetes                        cluster_autoscaler.yml  68:73 (3%)        103:108 (3%)
6     x 2  kubernetes                        cluster_autoscaler.yml  94:100 (3%)       112:118 (3%)
6     x 2  spark-application/src/main/scala  ValueZones.scala        76:81 (4%)        93:98 (4%)
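The eksctl.yaml duplicates overlap one another, so adding up their sizes overstates the number of duplicated lines. A sketch of the reconciliation, assuming that a line covered by several duplicates is counted once: merging the listed ranges yields 34 distinct lines, which matches the 34 lines reported for the yaml extension. The helper name `merged_length` is hypothetical, and the ranges are raw file line numbers, so the match is exact only where a range contains no blank or comment lines.

```python
def merged_length(ranges):
    """Count distinct lines covered by inclusive (start, end) ranges."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(ranges):
        if current_end is None or start > current_end + 1:
            # Close the previous merged range, if any, and open a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged range.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


# Line ranges listed for eksctl.yaml in the table of longest duplicates.
eksctl_ranges = [
    (76, 87), (110, 121),
    (112, 121), (150, 159),
    (78, 87), (150, 159),
]

raw_sum = sum(end - start + 1 for start, end in eksctl_ranges)
eksctl_duplicated = merged_length(eksctl_ranges)
print(raw_sum, eksctl_duplicated)
```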