aws-samples / amazon-textract-transformer-pipeline
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
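The steps described in the bullets above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the analysis tool's actual algorithm: the helper names `clean` and `duplicated_windows` are hypothetical, only `#` comments and `import`/`from` lines are treated as removable, and lines are compared after stripping surrounding whitespace, which is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold quoted in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comments, and import statements (a simplification)."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(("import ", "from ")):
            continue
        kept.append(stripped)
    return kept


def duplicated_windows(files: dict[str, str], size: int = MIN_BLOCK):
    """Return every window of `size` cleaned lines that occurs more than once.

    Keys are the windows (tuples of lines); values list (file name, start index)
    for each occurrence, where the index refers to the cleaned line list.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - size + 1):
            seen[tuple(lines[start:start + size])].append((name, start))
    return {block: places for block, places in seen.items() if len(places) > 1}
```

A real tool would additionally merge overlapping windows into the longest matching block, which is why the report lists a single 167-line duplicate rather than many 6-line ones.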
Duplication Overall
  • 14% duplication:
    • 6,719 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 953 duplicated lines
  • 39 duplicates
system: 14% (953 lines)
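As a quick check of the headline figure, the percentage follows directly from the two line counts above. The snippet also works out how many duplicated lines would fit under the <5% target mentioned in the intro; the variable names are illustrative only.

```python
cleaned_lines = 6719      # cleaned lines of code, from the report
duplicated_lines = 953    # duplicated lines, from the report

pct = 100 * duplicated_lines / cleaned_lines
print(f"{pct:.1f}%")  # prints "14.2%", shown rounded as 14% in the report

# Largest whole number of duplicated lines that stays below 5% of the cleaned code.
budget = cleaned_lines * 5 // 100
print(budget, duplicated_lines - budget)
```

Under these numbers, roughly two thirds of the currently duplicated lines would have to go to reach the target.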
Duplication per Extension
py: 15% (901 lines)
html: 5% (52 lines)
Duplication per Component (primary)
notebooks/util: 24% (274 lines)
pipeline/postprocessing: 40% (250 lines)
notebooks/src: 12% (181 lines)
pipeline/ocr: 8% (92 lines)
pipeline: 19% (64 lines)
notebooks/review: 9% (35 lines)
pipeline/review: 5% (19 lines)
notebooks/annotation: 3% (17 lines)
pipeline/enrichment: 11% (12 lines)
ROOT: 3% (9 lines)
pipeline/fn-trigger: 0% (0 lines)
annotation/fn-SMGT-Post: 0% (0 lines)
annotation: 0% (0 lines)
annotation/fn-SMGT-Pre: 0% (0 lines)
notebooks/preproc: 0% (0 lines)
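The per-component percentages above are relative to each component's own size, so they do not show which components contribute most to the system-wide total. A small sketch, using only the line counts listed above (components with 0 lines omitted), computes each component's share of the 953 duplicated lines:

```python
# Duplicated lines per component, copied from the list above.
duplicated = {
    "notebooks/util": 274,
    "pipeline/postprocessing": 250,
    "notebooks/src": 181,
    "pipeline/ocr": 92,
    "pipeline": 64,
    "notebooks/review": 35,
    "pipeline/review": 19,
    "notebooks/annotation": 17,
    "pipeline/enrichment": 12,
    "ROOT": 9,
}

total = sum(duplicated.values())  # matches the 953 lines reported overall

# Share of the system-wide duplication attributable to each component.
shares = {name: 100 * n / total for name, n in duplicated.items()}
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<26}{share:5.1f}%")
```

The component counts add up exactly to the overall figure, and the first two components together account for a little over half of all duplicated lines.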

Duplication Between Components (50+ lines)

[Graph: notebooks/util -- pipeline/postprocessing (488 lines)]
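The figure of 488 appears to count both copies of each shared block: the three duplicates between these two components listed under Longest Duplicates (boxes.py, deser.py, config.py) have sizes 167, 39, and 38, and twice their sum is 488. The sketch below checks that arithmetic and emits a minimal Graphviz DOT description of the single edge. The function `to_dot` and the use of a `label` attribute are assumptions for illustration; the attributes in the report's own DOT file are not shown here.

```python
# Sizes of the duplicates shared by the two components, from "Longest Duplicates".
shared_blocks = {"boxes.py": 167, "deser.py": 39, "config.py": 38}

# Each duplicated block exists once in each component, so both copies are counted.
edge_weight = 2 * sum(shared_blocks.values())


def to_dot(a: str, b: str, weight: int) -> str:
    """Build an undirected DOT graph with one weighted edge (hypothetical layout)."""
    return "\n".join([
        "graph G {",
        f'    "{a}" -- "{b}" [label="{weight}"];',
        "}",
    ])


dot = to_dot("notebooks/util", "pipeline/postprocessing", edge_weight)
print(dot)
```

The resulting text can be pasted into any Graphviz renderer to reproduce a two-node version of the graph.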

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
167 | x 2 | notebooks/util/postproc; pipeline/postprocessing/fn-postprocess/util | boxes.py; boxes.py | 11:229 (100%); 11:229 (100%)
39 | x 2 | notebooks/util/postproc; pipeline/postprocessing/fn-postprocess/util | deser.py; deser.py | 11:72 (100%); 11:72 (100%)
38 | x 2 | notebooks/util/postproc; pipeline/postprocessing/fn-postprocess/util | config.py; config.py | 13:79 (100%); 13:79 (100%)
17 | x 2 | notebooks/annotation; notebooks/review | ocr-bbox-and-validation.liquid.tpl.html; fields-validation.liquid.html | 3:21 (3%); 3:29 (4%)
15 | x 2 | notebooks/util; notebooks/util | smgt.py; smgt.py | 31:45 (7%); 49:63 (7%)
15 | x 2 | notebooks/src/code/data; notebooks/src/code/data | ner.py; ner.py | 169:183 (4%); 231:245 (4%)
12 | x 2 | pipeline; pipeline | __init__.py; __init__.py | 77:88 (4%); 107:118 (4%)
12 | x 2 | pipeline; pipeline | __init__.py; __init__.py | 77:88 (4%); 92:103 (4%)
12 | x 2 | notebooks/src/code/data; notebooks/src/code/data | geometry.py; ner.py | 159:170 (8%); 169:180 (3%)
12 | x 2 | notebooks/src/code/data; notebooks/src/code/data | geometry.py; ner.py | 159:170 (8%); 231:242 (3%)
12 | x 2 | pipeline; pipeline | __init__.py; __init__.py | 92:103 (4%); 107:118 (4%)
10 | x 2 | notebooks/src/code/data; notebooks/src/code/data | mlm.py; ner.py | 105:115 (6%); 148:158 (3%)
10 | x 2 | notebooks/src/code/data; notebooks/src/code/data | mlm.py; ner.py | 152:162 (6%); 307:329 (3%)
9 | x 2 | ROOT; pipeline | cdk_demo_stack.py; __init__.py | 64:72 (4%); 92:100 (3%)
9 | x 2 | ROOT; pipeline | cdk_demo_stack.py; __init__.py | 64:72 (4%); 77:85 (3%)
9 | x 2 | ROOT; pipeline | cdk_demo_stack.py; __init__.py | 64:72 (4%); 107:115 (3%)
9 | x 2 | notebooks/src/code/data; notebooks/src/code/data | mlm.py; ner.py | 69:82 (5%); 80:93 (2%)
9 | x 2 | pipeline/ocr/sfn_semaphore; pipeline/ocr/sfn_semaphore | __init__.py; __init__.py | 298:306 (1%); 573:606 (1%)
9 | x 2 | pipeline/ocr/sfn_semaphore; pipeline/ocr/sfn_semaphore | __init__.py; __init__.py | 50:58 (1%); 186:194 (1%)
9 | x 2 | notebooks/review; notebooks/review | fields-validation.liquid.html; fields-validation.liquid.html | 88:96 (2%); 133:141 (2%)
Duplicated Units
The list of top 10 duplicated units.
Size | # | Folders | Files | Lines
41 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | boxes.py; boxes.py | 0:0; 0:0
14 | x 2 | notebooks/src/code/data; notebooks/src/code/data | ner.py; ner.py | 0:0; 0:0
12 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | boxes.py; boxes.py | 0:0; 0:0
15 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | deser.py; deser.py | 0:0; 0:0
9 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | deser.py; deser.py | 0:0; 0:0
9 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | boxes.py; boxes.py | 0:0; 0:0
9 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | deser.py; deser.py | 0:0; 0:0
7 | x 2 | pipeline/postprocessing/fn-postprocess/util; notebooks/util/postproc | config.py; config.py | 0:0; 0:0
6 | x 2 | pipeline/ocr/sfn_semaphore; pipeline/ocr/sfn_semaphore | __init__.py; __init__.py | 0:0; 0:0
6 | x 2 | notebooks/src/code/data; notebooks/src/code/data | base.py; mlm.py | 0:0; 0:0