aws-samples / aws-glue-samples
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (under 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
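The procedure described above can be sketched in Python as follows. This is a minimal illustration, not the analysis tool's actual implementation. The names `clean` and `find_duplicate_windows` are hypothetical, the cleaning step only handles `#` comments and Python-style imports, and surrounding whitespace is ignored when comparing lines, which is an assumption on top of "exactly the same".

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str, comment_prefix: str = "#") -> list[str]:
    """Drop blank lines, whole-line comments, and import statements."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: indentation is not significant
        if not line or line.startswith(comment_prefix):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Map each window of `min_block` cleaned lines that occurs more than
    once to the list of (file name, cleaned-line offset) where it starts."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            window = tuple(lines[start:start + min_block])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into a single longer duplicate and map the offsets back to original line numbers; the sketch stops at detecting the minimum-size matches.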
Duplication Overall
  • 30% duplication:
    • 3,646 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 1,102 duplicated lines
  • 172 duplicates
system: 30% (1,102 lines)
Duplication per Extension
scala: 41% (484 lines)
yaml: 94% (336 lines)
java: 38% (156 lines)
py: 7% (126 lines)
Duplication per Component (primary)
utilities/sagemaker_notebook_automation: 94% (336 lines)
GlueCustomConnectors/localValidation: 58% (282 lines)
GlueCustomConnectors/gluescripts/withoutConnection: 89% (187 lines)
GlueCustomConnectors/gluescripts/withConnection: 92% (179 lines)
GlueCustomConnectors/development/Spark: 12% (90 lines)
utilities/use_only_IAM_access_controls: 17% (16 lines)
utilities/Hive_metastore_migration/src: <1% (12 lines)
utilities/Crawler_undo_redo/src: 0% (0 lines)
GlueCustomConnectors/development/Athena: 0% (0 lines)
GlueCustomConnectors/glueJobValidation: 0% (0 lines)
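
As a quick consistency check on the figures above, the overall percentage and the two breakdowns can be recomputed. The numbers are copied from this report; the variable names are only illustrative.

```python
cleaned_lines = 3646      # lines of cleaned code
duplicated_lines = 1102   # duplicated lines overall

# Overall duplication ratio, reported above as 30%.
overall_pct = 100 * duplicated_lines / cleaned_lines

by_extension = {"scala": 484, "yaml": 336, "java": 156, "py": 126}
by_component = {
    "utilities/sagemaker_notebook_automation": 336,
    "GlueCustomConnectors/localValidation": 282,
    "GlueCustomConnectors/gluescripts/withoutConnection": 187,
    "GlueCustomConnectors/gluescripts/withConnection": 179,
    "GlueCustomConnectors/development/Spark": 90,
    "utilities/use_only_IAM_access_controls": 16,
    "utilities/Hive_metastore_migration/src": 12,
}

print(f"overall: {overall_pct:.1f}%")
print("extensions sum to total:", sum(by_extension.values()) == duplicated_lines)
print("components sum to total:", sum(by_component.values()) == duplicated_lines)
```

Both breakdowns add up to the 1,102 duplicated lines, so each duplicated line is attributed to exactly one extension and one component.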

Duplication Between Components (50+ lines)

GlueCustomConnectors/gluescripts/withConnection -- GlueCustomConnectors/gluescripts/withoutConnection: 346 duplicated lines

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folder | File 1 | Lines 1 | File 2 | Lines 2
63 | x 2 | utilities/sagemaker_notebook_automation | glue_sagemaker_notebook.yaml | 1:66 (35%) | glue_sagemaker_notebook_cn.yaml | 1:66 (35%)
37 | x 2 | utilities/sagemaker_notebook_automation | glue_sagemaker_notebook_cn.yaml | 107:165 (20%) | glue_sagemaker_notebook_cn.yaml | 168:226 (20%)
37 | x 2 | utilities/sagemaker_notebook_automation | glue_sagemaker_notebook.yaml | 107:165 (20%) | glue_sagemaker_notebook.yaml | 168:226 (20%)
27 | x 2 | utilities/sagemaker_notebook_automation | glue_sagemaker_notebook.yaml | 91:122 (15%) | glue_sagemaker_notebook_cn.yaml | 91:122 (15%)
25 | x 2 | GlueCustomConnectors/gluescripts/withoutConnection | JDBCSalesforce.java | 5:30 (100%) | JDBCSalesforce.scala | 5:30 (71%)
22 | x 2 | utilities/sagemaker_notebook_automation | glue_sagemaker_notebook.yaml | 150:183 (12%) | glue_sagemaker_notebook_cn.yaml | 150:183 (12%)
22 | x 2 | GlueCustomConnectors/gluescripts/withoutConnection | SparkSnowflake.java | 5:27 (100%) | SparkSnowflake.scala | 5:27 (84%)
22 | x 2 | GlueCustomConnectors/gluescripts/withConnection | JDBCSalesforce.java | 5:27 (100%) | JDBCSalesforce.scala | 5:27 (73%)
21 | x 2 | GlueCustomConnectors/gluescripts/withConnection | SparkSnowflake.java | 5:26 (100%) | SparkSnowflake.scala | 5:26 (84%)
19 | x 2 | GlueCustomConnectors/gluescripts/withoutConnection | AthenaCloudwatch.java | 5:24 (100%) | AthenaCloudwatch.scala | 5:24 (100%)
19 | x 2 | GlueCustomConnectors/localValidation | DataSchemaTest.scala | 36:59 (59%) | DbtableQueryTest.scala | 65:89 (19%)
19 | x 2 | GlueCustomConnectors/gluescripts/withConnection | AthenaCloudwatch.java | 5:24 (100%) | AthenaCloudwatch.scala | 5:24 (100%)
16 | x 2 | GlueCustomConnectors/localValidation | DataSourceTest.scala | 45:66 (44%) | JDBCUrlTest.scala | 47:67 (42%)
14 | x 2 | GlueCustomConnectors/localValidation | DbtableQueryTest.scala | 71:89 (14%) | DbtableQueryTest.scala | 136:154 (14%)
14 | x 2 | GlueCustomConnectors/localValidation | CatalogConnectionTest.scala | 57:75 (38%) | DataSourceTest.scala | 47:66 (38%)
14 | x 2 | GlueCustomConnectors/localValidation | DataSourceTest.scala | 47:66 (38%) | SecretsManagerTest.scala | 48:66 (40%)
14 | x 2 | GlueCustomConnectors/localValidation | DbtableQueryTest.scala | 71:89 (14%) | JDBCUrlTest.scala | 49:67 (36%)
14 | x 2 | GlueCustomConnectors/localValidation | ColumnPartitioningTest.scala | 48:66 (36%) | JDBCUrlTest.scala | 49:67 (36%)
14 | x 2 | GlueCustomConnectors/localValidation | ColumnPartitioningTest.scala | 48:66 (36%) | SecretsManagerTest.scala | 48:66 (40%)
14 | x 2 | GlueCustomConnectors/localValidation | DbtableQueryTest.scala | 71:89 (14%) | SecretsManagerTest.scala | 48:66 (40%)
Duplicated Units
The only duplicated unit found.
Size | # | Folder | File | Lines
18 | x 4 | GlueCustomConnectors/gluescripts/withConnection | AthenaCloudwatch.scala | 6:25
 | | GlueCustomConnectors/gluescripts/withConnection | AthenaCloudwatch.java | 6:25
 | | GlueCustomConnectors/gluescripts/withoutConnection | AthenaCloudwatch.scala | 6:25
 | | GlueCustomConnectors/gluescripts/withoutConnection | AthenaCloudwatch.java | 6:25