aws-samples / document-processing-pipeline-for-regulated-industries
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show how the lines of code are distributed across files of different sizes.
  • Files are classified into five categories based on their size in lines of code: 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), and 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
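The size categories above can be sketched as a small lookup. This is a minimal illustration, not the report tool's actual implementation; the function name `size_category` is hypothetical, and only the thresholds come from the list above.

```python
def size_category(lines_of_code: int) -> str:
    """Map a file's line count to one of the report's five size categories.

    Hypothetical helper: the thresholds are taken from the report,
    the function itself is an assumption for illustration.
    """
    if lines_of_code < 1:
        raise ValueError("a file must have at least one line of code")
    if lines_of_code <= 100:
        return "very small"
    if lines_of_code <= 200:
        return "small"
    if lines_of_code <= 500:
        return "medium size"
    if lines_of_code <= 1000:
        return "long"
    return "very long"


# Line counts taken from the "Longest Files" table in this report.
for name, loc in [("trp.py", 524), ("datastore.py", 270), ("helper.py", 199)]:
    print(f"{name}: {size_category(loc)}")
```

With these thresholds, trp.py (524 lines) falls in the "long" category, which matches the single long file reported below.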
File Size Overall
  • There are 20 files with 2,569 lines of code.
    • 0 very long files (0 lines of code)
    • 1 long file (524 lines of code)
    • 2 medium size files (522 lines of code)
    • 4 small files (652 lines of code)
    • 13 very small files (871 lines of code)
Share of lines of code per size category: 1001+: 0% | 501-1000: 20% | 201-500: 20% | 101-200: 25% | 1-100: 33%
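The percentages for the overall distribution can be reproduced from the line counts listed under "File Size Overall". The sketch below assumes each share is the category's lines of code divided by the total and then truncated to a whole percent; truncation is an inference from the published figures (which sum to 98%), not documented behavior of the report tool. The names `LINES_PER_CATEGORY` and `truncated_shares` are hypothetical.

```python
# Lines of code per size category, as listed under "File Size Overall".
LINES_PER_CATEGORY = {
    "1001+": 0,
    "501-1000": 524,
    "201-500": 522,
    "101-200": 652,
    "1-100": 871,
}


def truncated_shares(lines_per_category: dict[str, int]) -> dict[str, int]:
    """Return each category's share of total lines as a truncated whole percent.

    Assumption: the report truncates rather than rounds.
    """
    total = sum(lines_per_category.values())
    return {
        label: lines * 100 // total
        for label, lines in lines_per_category.items()
    }


shares = truncated_shares(LINES_PER_CATEGORY)
print(" | ".join(f"{label}: {pct}%" for label, pct in shares.items()))
```

Under the truncation assumption this yields 0%, 20%, 20%, 25%, and 33%, in line with the figures shown in the report.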


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 24% | 12% | 24% | 38%
ts | 0% | 0% | 61% | 30% | 8%
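A per-extension breakdown like the one above can be sketched by grouping files by suffix before computing shares. This is an illustrative sketch only: the helper names `bucket_label` and `shares_by_extension` are hypothetical, and truncating to whole percents is an assumption inferred from the published figures. The sample data uses the three TypeScript files from the "Longest Files" table.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# (lower bound, label) pairs, checked from the largest bucket down.
BUCKETS = [(1001, "1001+"), (501, "501-1000"), (201, "201-500"),
           (101, "101-200"), (1, "1-100")]


def bucket_label(lines_of_code: int) -> str:
    """Return the size bucket label for a line count."""
    for lower_bound, label in BUCKETS:
        if lines_of_code >= lower_bound:
            return label
    raise ValueError("a file must have at least one line of code")


def shares_by_extension(files):
    """Compute truncated percentage shares per bucket for each extension."""
    lines = defaultdict(lambda: defaultdict(int))
    for path, loc in files:
        extension = PurePosixPath(path).suffix.lstrip(".")
        lines[extension][bucket_label(loc)] += loc
    result = {}
    for extension, per_bucket in lines.items():
        total = sum(per_bucket.values())
        result[extension] = {
            label: per_bucket.get(label, 0) * 100 // total
            for _, label in BUCKETS
        }
    return result


# The three .ts files and their line counts from the "Longest Files" table.
TS_FILES = [
    ("infrastructure/lib/textract-pipeline-stack.ts", 252),
    ("infrastructure/lib/metadata-stack.ts", 126),
    ("infrastructure/lib/analytics-stack.ts", 33),
]

print(shares_by_extension(TS_FILES)["ts"])
```

For the TypeScript files this gives 61%, 30%, and 8% for the 201-500, 101-200, and 1-100 buckets, consistent with the ts row.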
File Size per Logical Decomposition
Logical decomposition: primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
code/lambda_layer/pipeline | 0% | 51% | 0% | 34% | 13%
code/lambda_layer/metadata-services | 0% | 0% | 84% | 0% | 15%
infrastructure/lib | 0% | 0% | 61% | 30% | 8%
code/comprehend_sync | 0% | 0% | 0% | 100% | 0%
code/metadata | 0% | 0% | 0% | 0% | 100%
code/textract_async | 0% | 0% | 0% | 0% | 100%
code/document_registrar | 0% | 0% | 0% | 0% | 100%
code/extension_detector | 0% | 0% | 0% | 0% | 100%
code/textract_sync | 0% | 0% | 0% | 0% | 100%
code/document_classifier | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 20)
File | # lines | # units
trp.py (in code/lambda_layer/pipeline/python) | 524 | 103
datastore.py (in code/lambda_layer/metadata-services/python) | 270 | 12
textract-pipeline-stack.ts (in infrastructure/lib) | 252 | 1
helper.py (in code/lambda_layer/pipeline/python) | 199 | 19
comprehend_processor.py (in code/comprehend_sync) | 173 | 7
metadata.py (in code/lambda_layer/pipeline/python) | 154 | 22
metadata-stack.ts (in infrastructure/lib) | 126 | -
textract_processor.py (in code/textract_async) | 96 | 3
document_registrar.py (in code/document_registrar) | 87 | 3
og.py (in code/lambda_layer/pipeline/python) | 86 | 8
extension_detector.py (in code/extension_detector) | 82 | 3
textract_processor.py (in code/textract_sync) | 78 | 4
document_classifier.py (in code/document_classifier) | 73 | 3
pipeline.py (in code/metadata) | 68 | 3
textract_starter.py (in code/textract_async) | 66 | 3
lineage.py (in code/metadata) | 64 | 2
es.py (in code/lambda_layer/pipeline/python) | 48 | 4
helper.py (in code/lambda_layer/metadata-services/python) | 48 | 3
registry.py (in code/metadata) | 42 | 2
analytics-stack.ts (in infrastructure/lib) | 33 | -
Files With Most Units (Top 18)
File | # lines | # units
trp.py (in code/lambda_layer/pipeline/python) | 524 | 103
metadata.py (in code/lambda_layer/pipeline/python) | 154 | 22
helper.py (in code/lambda_layer/pipeline/python) | 199 | 19
datastore.py (in code/lambda_layer/metadata-services/python) | 270 | 12
og.py (in code/lambda_layer/pipeline/python) | 86 | 8
comprehend_processor.py (in code/comprehend_sync) | 173 | 7
es.py (in code/lambda_layer/pipeline/python) | 48 | 4
textract_processor.py (in code/textract_sync) | 78 | 4
helper.py (in code/lambda_layer/metadata-services/python) | 48 | 3
document_classifier.py (in code/document_classifier) | 73 | 3
textract_processor.py (in code/textract_async) | 96 | 3
textract_starter.py (in code/textract_async) | 66 | 3
extension_detector.py (in code/extension_detector) | 82 | 3
document_registrar.py (in code/document_registrar) | 87 | 3
pipeline.py (in code/metadata) | 68 | 3
registry.py (in code/metadata) | 42 | 2
lineage.py (in code/metadata) | 64 | 2
textract-pipeline-stack.ts (in infrastructure/lib) | 252 | 1
Files With Long Lines (Top 10)

There are 10 files with lines longer than 120 characters. In total, there are 26 long lines.
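The long-line count can be approximated with a short check against the 120-character limit mentioned above. This is a rough sketch: the function name `count_long_lines` is hypothetical, and how the report tool treats tabs, trailing whitespace, or comment lines is not stated, so results may differ slightly from the table.

```python
def count_long_lines(text: str, limit: int = 120) -> int:
    """Count lines whose length exceeds `limit` characters.

    Assumption: length is measured in characters, excluding the line
    terminator; the report tool's exact rule is not documented here.
    """
    return sum(1 for line in text.splitlines() if len(line) > limit)


# Small synthetic sample: one line of 121 chars, one of exactly 120, one of 200.
sample = "\n".join(["a" * 121, "b" * 120, "c" * 200])
print(count_long_lines(sample))
```

A line of exactly 120 characters is not counted, since the report speaks of lines longer than 120 characters.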

File | # lines | # units | # long lines
textract-pipeline-stack.ts (in infrastructure/lib) | 252 | 1 | 5
comprehend_processor.py (in code/comprehend_sync) | 173 | 7 | 4
datastore.py (in code/lambda_layer/metadata-services/python) | 270 | 12 | 3
textract_processor.py (in code/textract_async) | 96 | 3 | 3
lineage.py (in code/metadata) | 64 | 2 | 3
trp.py (in code/lambda_layer/pipeline/python) | 524 | 103 | 2
extension_detector.py (in code/extension_detector) | 82 | 3 | 2
registry.py (in code/metadata) | 42 | 2 | 2
textract_starter.py (in code/textract_async) | 66 | 3 | 1
pipeline.py (in code/metadata) | 68 | 3 | 1