aws-samples / amazon-textract-serverless-large-scale-document-processing
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
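The rule described in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the analyzer's actual implementation: the function names `clean` and `find_duplicate_blocks` are hypothetical, and the cleaning step is simplified to dropping blank lines, `#` comments, and import statements.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[tuple[int, str]]:
    """Return (original_line_number, stripped_text) for lines that survive cleaning."""
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        # Simplified cleaning (an assumption): drop blanks, '#' comments, imports.
        if not text or text.startswith("#") or text.startswith(("import ", "from ")):
            continue
        kept.append((number, text))
    return kept


def find_duplicate_blocks(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Map each window of min_block cleaned lines to every place it occurs.

    Only windows that occur more than once are returned. Each location is
    (file_name, first_original_line, last_original_line).
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - min_block + 1):
            window = tuple(text for _, text in lines[i:i + min_block])
            start = lines[i][0]
            end = lines[i + min_block - 1][0]
            seen[window].append((name, start, end))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into one longer duplicate; this sketch reports each 6-line window separately to keep the idea visible.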
Duplication Overall
  • 87% duplication:
    • 3,091 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,710 duplicated lines
  • 220 duplicates
system: 87% (2,710 lines)
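The overall figure can be checked from the two line counts reported above. The sketch below is only arithmetic on those numbers; the function name `duplication_percent` and the use of the 5% target as a line budget are illustrative assumptions.

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to some duplicate, in percent."""
    return 100.0 * duplicated_lines / cleaned_lines


CLEANED_LINES = 3091      # from the report
DUPLICATED_LINES = 2710   # from the report
TARGET_PERCENT = 5.0      # the "<5%" guideline from the intro

percent = duplication_percent(DUPLICATED_LINES, CLEANED_LINES)

# 2710 / 3091 is about 87.67%, so the displayed 87% is consistent with
# truncating the value rather than rounding it to the nearest integer.
print(f"{percent:.2f}%")

# Largest number of duplicated lines that would still meet the 5% target
# for a code base of this size (illustrative use of the guideline).
budget = int(CLEANED_LINES * TARGET_PERCENT / 100)
print(budget)
```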
Duplication per Extension
py: 94% (2,710 lines)
Duplication per Component (primary)
src: 89% (1,368 lines)
textract-pipeline/lambda/textractor: 100% (600 lines)
textract-pipeline/lambda/helper: 100% (289 lines)
textract-pipeline/lambda/asyncprocessor: 100% (161 lines)
textract-pipeline/lambda/jobresultprocessor: 100% (78 lines)
textract-pipeline/lambda/syncprocessor: 100% (71 lines)
textract-pipeline/lambda/documentprocessor: 100% (69 lines)
textract-pipeline/lambda/s3batchprocessor: 100% (46 lines)
textract-pipeline/lambda/s3processor: 100% (28 lines)
textract-pipeline/lib: 0% (0 lines)

Duplication Between Components (50+ lines)

Graph edges (component pair and edge label):
  • src -- textract-pipeline/lambda/textractor: 1200
  • src -- textract-pipeline/lambda/helper: 578
  • src -- textract-pipeline/lambda/asyncprocessor: 342
  • src -- textract-pipeline/lambda/jobresultprocessor: 170
  • src -- textract-pipeline/lambda/syncprocessor: 162
  • src -- textract-pipeline/lambda/documentprocessor: 138
  • src -- textract-pipeline/lambda/s3batchprocessor: 108
  • src -- textract-pipeline/lambda/s3processor: 72

Longest Duplicates
The 20 longest duplicates, out of 220 in total.
Size | # | Folder 1 | Folder 2 | File 1 | File 2 | Lines 1 | Lines 2
521 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 3:652 (100%) | 3:652 (100%)
179 | 2 | src | textract-pipeline/lambda/helper/python | helper.py | helper.py | 8:228 (100%) | 8:228 (100%)
161 | 2 | src | textract-pipeline/lambda/asyncprocessor | asyncproc.py | lambda_function.py | 7:216 (100%) | 7:216 (100%)
110 | 2 | src | textract-pipeline/lambda/helper/python | datastore.py | datastore.py | 6:150 (100%) | 6:150 (100%)
79 | 2 | src | textract-pipeline/lambda/textractor/python | og.py | og.py | 6:107 (100%) | 6:107 (100%)
78 | 2 | src | textract-pipeline/lambda/jobresultprocessor | jobresultsproc.py | lambda_function.py | 9:117 (100%) | 9:117 (100%)
71 | 2 | src | textract-pipeline/lambda/syncprocessor | syncproc.py | lambda_function.py | 9:104 (100%) | 9:104 (100%)
69 | 2 | src | textract-pipeline/lambda/documentprocessor | docproc.py | lambda_function.py | 5:103 (100%) | 5:103 (100%)
46 | 2 | src | textract-pipeline/lambda/s3batchprocessor | s3batchproc.py | lambda_function.py | 8:73 (100%) | 8:73 (100%)
28 | 2 | src | textract-pipeline/lambda/s3processor | s3proc.py | lambda_function.py | 8:48 (100%) | 8:48 (100%)
22 | 2 | src | src | trp.py | trp.py | 194:222 (4%) | 246:274 (4%)
22 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 194:222 (4%) | 246:274 (4%)
22 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 246:274 (4%) | 194:222 (4%)
22 | 2 | textract-pipeline/lambda/textractor/python | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 194:222 (4%) | 246:274 (4%)
15 | 2 | textract-pipeline/lambda/textractor/python | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 256:274 (2%) | 394:412 (2%)
15 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 204:222 (2%) | 394:412 (2%)
15 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 394:412 (2%) | 256:274 (2%)
15 | 2 | src | src | trp.py | trp.py | 204:222 (2%) | 394:412 (2%)
15 | 2 | src | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 394:412 (2%) | 204:222 (2%)
15 | 2 | textract-pipeline/lambda/textractor/python | textract-pipeline/lambda/textractor/python | trp.py | trp.py | 204:222 (2%) | 394:412 (2%)
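Most of the longest duplicates pair a file under src with a file under textract-pipeline/lambda at 100%, which suggests whole-file copies. One way to confirm this on a checkout is to group files by content hash. The sketch below is an assumption-laden illustration: the function name `group_identical_files` is hypothetical, and a byte-level hash is stricter than the report's comparison, which ignores comments, blank lines, and imports.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_files(root: Path, pattern: str = "*.py") -> list[list[Path]]:
    """Group files under root whose bytes are identical (same SHA-256 digest)."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by two or more files.
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    # Run from the repository root; prints one line per group of identical files.
    for group in group_identical_files(Path(".")):
        print(" == ".join(str(p) for p in group))
```

Files that differ only in comments or imports will not be grouped by this check, even though the report counts them as 100% duplicated.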
Duplicated Units
The top 20 duplicated units, out of 57 unit duplicates in total.
Size | # | Folder 1 | Folder 2 | File 1 | File 2 | Lines 1 | Lines 2
38 | 2 | textract-pipeline/lambda/asyncprocessor | src | lambda_function.py | asyncproc.py | 0:0 | 0:0
37 | 2 | textract-pipeline/lambda/asyncprocessor | src | lambda_function.py | asyncproc.py | 0:0 | 0:0
36 | 2 | textract-pipeline/lambda/asyncprocessor | src | lambda_function.py | asyncproc.py | 0:0 | 0:0
32 | 2 | textract-pipeline/lambda/jobresultprocessor | src | lambda_function.py | jobresultsproc.py | 0:0 | 0:0
30 | 2 | textract-pipeline/lambda/s3batchprocessor | src | lambda_function.py | s3batchproc.py | 0:0 | 0:0
26 | 2 | textract-pipeline/lambda/helper/python | src | helper.py | helper.py | 0:0 | 0:0
26 | 2 | textract-pipeline/lambda/syncprocessor | src | lambda_function.py | syncproc.py | 0:0 | 0:0
26 | 2 | textract-pipeline/lambda/jobresultprocessor | src | lambda_function.py | jobresultsproc.py | 0:0 | 0:0
23 | 2 | textract-pipeline/lambda/helper/python | src | datastore.py | datastore.py | 0:0 | 0:0
23 | 2 | textract-pipeline/lambda/textractor/python | src | trp.py | trp.py | 0:0 | 0:0
23 | 2 | textract-pipeline/lambda/textractor/python | src | trp.py | trp.py | 0:0 | 0:0
22 | 2 | textract-pipeline/lambda/documentprocessor | src | lambda_function.py | docproc.py | 0:0 | 0:0
22 | 2 | textract-pipeline/lambda/documentprocessor | src | lambda_function.py | docproc.py | 0:0 | 0:0
20 | 2 | textract-pipeline/lambda/helper/python | src | datastore.py | datastore.py | 0:0 | 0:0
20 | 2 | textract-pipeline/lambda/textractor/python | src | trp.py | trp.py | 0:0 | 0:0
19 | 2 | textract-pipeline/lambda/helper/python | src | datastore.py | datastore.py | 0:0 | 0:0
19 | 2 | textract-pipeline/lambda/helper/python | src | datastore.py | datastore.py | 0:0 | 0:0
19 | 2 | textract-pipeline/lambda/s3processor | src | lambda_function.py | s3proc.py | 0:0 | 0:0
19 | 2 | textract-pipeline/lambda/textractor/python | src | trp.py | trp.py | 0:0 | 0:0
19 | 2 | textract-pipeline/lambda/textractor/python | src | og.py | og.py | 0:0 | 0:0