awslabs / aws-data-wrangler
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show the distribution of size of units of code (methods, functions...).
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim to keep units small (< 20 lines). Long units may become "bloaters": code that has grown so large that it is hard to work with.
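The classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's implementation: the bucket boundaries are taken from the legend of this report, while the names `SIZE_BUCKETS` and `classify_unit` are hypothetical and introduced only for this example.

```python
from collections import Counter

# Upper bound (inclusive) and label of each size bucket, as listed in the
# report legend. Anything above the last bound falls into "101+".
SIZE_BUCKETS = [
    (10, "1-10"),     # very small units
    (20, "11-20"),    # small units
    (50, "21-50"),    # medium size units
    (100, "51-100"),  # long units
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size bucket label for a unit with the given line count."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    for upper_bound, label in SIZE_BUCKETS:
        if lines_of_code <= upper_bound:
            return label
    return "101+"  # very long units


# Line counts of the five longest units listed in this report.
sizes = [280, 61, 58, 49, 43]
distribution = Counter(classify_unit(n) for n in sizes)
print(distribution)
```

Keeping the boundaries in one list makes it easy to try a stricter threshold without touching the classification logic.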
Unit Size Overall
  • There are 557 units with 4,796 lines of code in units (36.7% of code).
    • 1 very long unit (280 lines of code)
    • 2 long units (119 lines of code)
    • 33 medium size units (939 lines of code)
    • 82 small units (1,195 lines of code)
    • 439 very small units (2,263 lines of code)
Share of lines of code per unit size bucket:
101+ | 51-100 | 21-50 | 11-20 | 1-10
5% | 2% | 19% | 24% | 47%
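The percentages above can be re-derived from the lines-of-code counts in the "Unit Size Overall" list. The short check below assumes the report truncates each share to a whole percent; that is an inference from the published numbers (rounding would give 6% rather than 5% for the 101+ bucket), not something the report states.

```python
# Lines of code per size bucket, copied from the "Unit Size Overall" list.
loc_per_bucket = {
    "101+": 280,
    "51-100": 119,
    "21-50": 939,
    "11-20": 1195,
    "1-10": 2263,
}

total = sum(loc_per_bucket.values())

# Integer division truncates each share to a whole percent (assumed behavior).
shares = {label: loc * 100 // total for label, loc in loc_per_bucket.items()}

print(total)
print(" | ".join(f"{label}: {pct}%" for label, pct in shares.items()))
```

Because each share is truncated, the displayed percentages do not necessarily add up to 100%.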
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 5% | 2% | 19% | 24% | 47%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
awswrangler | 15% | 3% | 13% | 22% | 44%
awswrangler/s3 | 0% | 4% | 26% | 27% | 41%
awswrangler/catalog | 0% | 0% | 36% | 8% | 54%
awswrangler/data_api | 0% | 0% | 49% | 18% | 31%
awswrangler/athena | 0% | 0% | 15% | 46% | 38%
awswrangler/opensearch | 0% | 0% | 0% | 37% | 62%
awswrangler/lakeformation | 0% | 0% | 0% | 53% | 46%
awswrangler/quicksight | 0% | 0% | 0% | 8% | 91%
awswrangler/dynamodb | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def _build_cluster_args() | awswrangler/emr.py | 280 | 50 | 1
def create_cluster() | awswrangler/emr.py | 61 | 1 | 0
def _fetch() | awswrangler/s3/_fs.py | 58 | 21 | 3
def close() | awswrangler/s3/_fs.py | 49 | 6 | 1
def flush() | awswrangler/s3/_fs.py | 43 | 9 | 2
def _execute_statement() | awswrangler/data_api/rds.py | 41 | 5 | 3
def athena2pyarrow() | awswrangler/_data_types.py | 39 | 21 | 1
def _fetch_range_proxy() | awswrangler/s3/_fs.py | 39 | 4 | 3
def to_csv() | awswrangler/s3/_write_text.py | 35 | 1 | 0
def to_parquet() | awswrangler/s3/_write_parquet.py | 35 | 1 | 0
def to_json() | awswrangler/s3/_write_text.py | 34 | 1 | 0
def _create_csv_table() | awswrangler/catalog/_create.py | 30 | 1 | 0
def create_csv_table() | awswrangler/catalog/_create.py | 29 | 1 | 0
def _create_json_table() | awswrangler/catalog/_create.py | 28 | 1 | 0
def copy() | awswrangler/redshift.py | 28 | 1 | 0
def athena2pandas() | awswrangler/_data_types.py | 27 | 15 | 1
def athena2quicksight() | awswrangler/_data_types.py | 27 | 13 | 1
def store_parquet_metadata() | awswrangler/s3/_write_parquet.py | 27 | 1 | 0
def _apply_index() | awswrangler/s3/_read_parquet.py | 27 | 18 | 3
def create_json_table() | awswrangler/catalog/_create.py | 27 | 1 | 0
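A ranking like the one above can be approximated for any Python source with the standard library `ast` module. The sketch below is an assumption-laden illustration and not the method used to produce this report: it counts physical lines from the `def` line to the last line of the body, including blank lines and comments, so its numbers may differ from the report's. The helper name `function_lengths` and the sample source are hypothetical.

```python
import ast
from typing import List, Tuple


def function_lengths(source: str) -> List[Tuple[str, int]]:
    """Return (function name, physical line count) pairs, longest first."""
    tree = ast.parse(source)
    lengths = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on AST nodes since Python 3.8.
            lengths.append((node.name, node.end_lineno - node.lineno + 1))
    return sorted(lengths, key=lambda item: item[1], reverse=True)


# Hypothetical sample source used only to demonstrate the helper.
sample = '''
def short():
    return 1

def longer(x):
    y = x + 1
    z = y * 2
    return z
'''

ranking = function_lengths(sample)
print(ranking)
```

To apply it to a real file, read the file contents into a string and pass them to `function_lengths`; nested functions and methods are included because `ast.walk` visits every node.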