awslabs / aws-data-wrangler
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured by finding places in code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
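The cleaning and matching steps described above can be sketched roughly as follows. This is an illustrative approximation, not the report tool's actual implementation: the function names `clean` and `find_duplicate_windows` are hypothetical, the cleaning rules only cover Python-style comments and imports, and stripping indentation before comparing lines is an assumption.

```python
import re
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the intro: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, comment-only lines, and import lines.

    Rough stand-in for the cleaning step; indentation is stripped
    (an assumption, the real tool may compare lines differently).
    """
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if re.match(r"(import|from)\s", stripped):
            continue
        kept.append(stripped)
    return kept


def find_duplicate_windows(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Return every window of `min_block` cleaned lines that occurs more than once.

    Keys are the windows (tuples of lines); values are lists of
    (file name, index into that file's cleaned lines).
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            window = tuple(lines[start:start + min_block])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into maximal blocks (which is how a 56-line duplicate would be reported as one entry); this sketch only finds the fixed-size windows.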
Duplication Overall
  • 22% duplication:
    • 12,519 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,764 duplicated lines
  • 423 duplicates
system: 22% (2,764 lines)
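The overall percentage follows directly from the two line counts reported above. A minimal sketch of the arithmetic, where `duplication_percent` is a hypothetical helper name and the 5% budget is the target mentioned in the intro:

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicate, as a percentage."""
    if cleaned_lines == 0:
        return 0.0
    return 100.0 * duplicated_lines / cleaned_lines


# Figures taken from the report: 2,764 duplicated lines out of 12,519 cleaned lines.
overall = duplication_percent(2764, 12519)
print(f"{overall:.1f}%")  # prints 22.1%

# Number of duplicated lines that would fit under the 5% target.
budget = int(0.05 * 12519)  # 625 lines
```

Under these figures, reaching the 5% target would mean bringing the 2,764 duplicated lines down to roughly 625.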
Duplication per Extension
py: 22% (2,764 lines)
Duplication per Component (primary)
awswrangler/s3: 24% (957 lines)
awswrangler/catalog: 41% (770 lines)
awswrangler: 16% (587 lines)
awswrangler/athena: 26% (347 lines)
awswrangler/quicksight: 10% (85 lines)
awswrangler/lakeformation: 5% (18 lines)
awswrangler/data_api: 0% (0 lines)
awswrangler/opensearch: 0% (0 lines)
awswrangler/dynamodb: 0% (0 lines)
ROOT: 0% (0 lines)

Duplication Between Components (50+ lines)

awswrangler/catalog and awswrangler/s3: 210 lines

Longest Duplicates
The list of 20 longest duplicates.
Size  #  Folders              Files              Lines
56    2  awswrangler/s3       _write_text.py     415:478 (9%)
         awswrangler/s3       _write_text.py     867:931 (9%)
28    2  awswrangler/s3       _write_parquet.py  201:228 (5%)
         awswrangler/s3       _write_text.py     669:696 (4%)
26    2  awswrangler/s3       _write_parquet.py  603:628 (5%)
         awswrangler/s3       _write_parquet.py  666:691 (5%)
26    2  awswrangler/catalog  _create.py         298:323 (3%)
         awswrangler/catalog  _create.py         378:403 (3%)
26    2  awswrangler/s3       _write_text.py     958:983 (4%)
         awswrangler/s3       _write_text.py     1025:1050 (4%)
25    2  awswrangler/catalog  _create.py         948:972 (3%)
         awswrangler/catalog  _create.py         1116:1140 (3%)
22    2  awswrangler/catalog  _create.py         344:365 (3%)
         awswrangler/catalog  _create.py         422:443 (3%)
22    2  awswrangler          postgresql.py      233:303 (12%)
         awswrangler          redshift.py        643:718 (2%)
20    2  awswrangler          mysql.py           172:236 (10%)
         awswrangler          postgresql.py      167:231 (11%)
20    2  awswrangler/s3       _write_text.py     642:661 (3%)
         awswrangler/s3       _write_text.py     1062:1081 (3%)
20    2  awswrangler/catalog  _definitions.py    170:189 (7%)
         awswrangler/catalog  _definitions.py    248:267 (7%)
20    2  awswrangler          postgresql.py      167:231 (11%)
         awswrangler          redshift.py        572:641 (2%)
20    2  awswrangler/catalog  _definitions.py    131:150 (7%)
         awswrangler/catalog  _definitions.py    212:231 (7%)
20    2  awswrangler          mysql.py           172:236 (10%)
         awswrangler          redshift.py        572:641 (2%)
19    2  awswrangler/s3       _read_text.py      167:290 (7%)
         awswrangler/s3       _read_text.py      315:438 (7%)
18    2  awswrangler          postgresql.py      167:227 (10%)
         awswrangler          sqlserver.py       188:247 (9%)
18    2  awswrangler          redshift.py        572:637 (2%)
         awswrangler          sqlserver.py       188:247 (9%)
18    2  awswrangler          mysql.py           172:232 (9%)
         awswrangler          sqlserver.py       188:247 (9%)
17    2  awswrangler/athena   _read.py           612:628 (2%)
         awswrangler/athena   _read.py           926:942 (2%)
17    2  awswrangler/s3       _write_text.py     91:107 (2%)
         awswrangler/s3       _write_text.py     681:697 (2%)
Duplicated Units
The list of top 7 duplicated units.
Size  #  Folders              Files              Lines
14    2  awswrangler/s3       _read_text.py      0:0
         awswrangler/s3       _read_text.py      0:0
8     2  awswrangler          postgresql.py      0:0
         awswrangler          redshift.py        0:0
8     2  awswrangler/athena   _utils.py          0:0
         awswrangler/athena   _utils.py          0:0
7     2  awswrangler/catalog  _create.py         0:0
         awswrangler/catalog  _create.py         0:0
7     2  awswrangler/catalog  _create.py         0:0
         awswrangler/catalog  _create.py         0:0
6     2  awswrangler          cloudwatch.py      0:0
         awswrangler          cloudwatch.py      0:0
6     2  awswrangler/catalog  _get.py            0:0
         awswrangler/catalog  _get.py            0:0