awslabs / ml-io
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show the distribution of size of units of code (methods, functions...).
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (< 20 lines). Long units may become "bloaters": code that has grown to such proportions that it is hard to work with.
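The size bands used in this report's charts (1-10, 11-20, 21-50, 51-100, 101+) can be expressed as a small classifier. The following is a minimal Python sketch, not the analysis tool's actual implementation; the names `BANDS`, `classify_unit` and `count_by_band` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Inclusive upper bounds for each size band, in lines of code.
# The band labels follow the report; the data structure is illustrative.
BANDS = [
    (10, "very small"),
    (20, "small"),
    (50, "medium size"),
    (100, "long"),
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size band for a unit with the given number of lines."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    for upper_bound, label in BANDS:
        if lines_of_code <= upper_bound:
            return label
    return "very long"  # 101 lines or more


def count_by_band(unit_sizes):
    """Count how many units fall into each size band."""
    return Counter(classify_unit(n) for n in unit_sizes)


# A few unit sizes taken from the "Longest Units" list in this report.
print(count_by_band([400, 175, 99, 73, 49]))
```

Keeping the thresholds in one ordered list means a different banding scheme only requires changing `BANDS`, not the classification logic.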
Unit Size Overall
  • There are 643 units with 7,463 lines of code in units (45.9% of code).
    • 5 very long units (986 lines of code)
    • 12 long units (890 lines of code)
    • 62 medium size units (2,011 lines of code)
    • 82 small units (1,185 lines of code)
    • 482 very small units (2,391 lines of code)
101+: 13% | 51-100: 11% | 21-50: 26% | 11-20: 15% | 1-10: 32%
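The percentage split above can be reproduced from the line counts listed under "Unit Size Overall". A minimal Python sketch follows. That the percentages are truncated rather than rounded is an inference from the numbers (for example, 890 / 7,463 is about 11.9%, shown as 11%), not something the report states.

```python
# Lines of code per size band, taken from the "Unit Size Overall" figures.
lines_per_band = {
    "101+": 986,
    "51-100": 890,
    "21-50": 2011,
    "11-20": 1185,
    "1-10": 2391,
}

total = sum(lines_per_band.values())

# Integer division drops the fractional part, which matches the
# reported figures (assumption: the tool truncates instead of rounding).
shares = {band: 100 * loc // total for band, loc in lines_per_band.items()}

print(total)  # 7463
print(" | ".join(f"{pct}%" for pct in shares.values()))
# 13% | 11% | 26% | 15% | 32%
```

Because of the truncation, the five shares add up to 97% rather than 100%.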
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
cc | 13% | 15% | 31% | 16% | 23%
h | 14% | 2% | 15% | 14% | 52%
c | 0% | 0% | 100% | 0% | 0%
py | 0% | 0% | 0% | 13% | 86%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
src/mlio-py/mlio | 26% | 15% | 21% | 14% | 21%
src/mlio/record_readers | 36% | 0% | 14% | 19% | 29%
src/mlio | 0% | 18% | 46% | 10% | 24%
src/mlio/streams | 0% | 0% | 36% | 29% | 34%
src/mlio/instance_readers | 0% | 0% | 29% | 39% | 31%
src/mlio/memory | 0% | 0% | 28% | 39% | 31%
src/mlio/data_stores | 0% | 0% | 17% | 38% | 43%
src/mlio/util | 0% | 0% | 0% | 34% | 65%
src/mlio/integ | 0% | 0% | 0% | 48% | 51%
src/mlio/detail | 0% | 0% | 0% | 23% | 76%
include/mlio/util | 0% | 0% | 0% | 22% | 77%
include/mlio | 0% | 0% | 0% | 0% | 100%
include/mlio/memory | 0% | 0% | 0% | 0% | 100%
include/mlio/streams | 0% | 0% | 0% | 0% | 100%
include/mlio/detail | 0% | 0% | 0% | 0% | 100%
include/mlio/data_stores | 0% | 0% | 0% | 0% | 100%
src/mlio-py | 0% | 0% | 0% | 0% | 100%
include/mlio/record_readers | 0% | 0% | 0% | 0% | 100%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
void register_data_readers() | src/mlio-py/mlio/core/data_reader.cc | 400 | 10 | 1
void register_data_stores() | src/mlio-py/mlio/core/data_store.cc | 175 | 4 | 1
XXH_FORCE_INLINE void XXH3_accumulate_512() | src/mlio-py/mlio/contrib/insights/hll/xxh3.h | 167 | 14 | 4
std::optional Csv_record_reader::read_line() | src/mlio/record_readers/csv_record_reader.cc | 128 | 30 | 2
XXH_FORCE_INLINE void XXH3_scrambleAcc() | src/mlio-py/mlio/contrib/insights/hll/xxh3.h | 116 | 6 | 2
std::optional get_data_type() | src/mlio-py/mlio/core/py_buffer.cc | 99 | 38 | 1
bool Csv_reader::Decoder::decode() | src/mlio/csv_reader.cc | 98 | 26 | 2
void Column_analyzer::analyze() | src/mlio-py/mlio/contrib/insights/column_analyzer.cc | 89 | 22 | 1
bool Csv_record_tokenizer::next() | src/mlio/csv_record_tokenizer.cc | 76 | 16 | 0
void register_exceptions() | src/mlio-py/mlio/core/error.cc | 74 | 15 | 1
void register_tensors() | src/mlio-py/mlio/core/tensor.cc | 73 | 1 | 1
bool Recordio_protobuf_reader::Decoder::decode_feature() | src/mlio/recordio_protobuf_reader.cc | 73 | 19 | 1
py::buffer_info to_py_buffer() | src/mlio-py/mlio/core/py_device_array.cc | 68 | 14 | 1
void Parallel_data_reader::init_graph() | src/mlio/parallel_data_reader.cc | 66 | 8 | 0
bool Image_reader::decode_core() | src/mlio/image_reader.cc | 66 | 15 | 2
XXH_FORCE_INLINE XXH_errorcode XXH3_update() | src/mlio-py/mlio/contrib/insights/hll/xxh3.h | 56 | 8 | 4
void register_schema() | src/mlio-py/mlio/core/schema.cc | 52 | 1 | 1
(unit name missing) | (file missing) | 50 | 8 | 0
bool Recordio_protobuf_reader::Decoder::decode_feature() | src/mlio/recordio_protobuf_reader.cc | 49 | 16 | 2
Memory_slice Default_chunk_reader::read_chunk() | src/mlio/record_readers/detail/default_chunk_reader.cc | 49 | 13 | 1