Petastorm
Conditional Complexity

Intro
  • Conditional complexity (also called cyclomatic complexity) measures the complexity of software as the number of linearly independent paths through a program function. A higher value often means higher maintenance and testing costs (infosecinstitute.com).
  • Conditional complexity is calculated by counting all conditions in the program that can affect the execution path (e.g. if statements, loops, switches, and/or operators, try and catch blocks).
  • Conditional complexity is measured at the unit level (methods, functions...).
  • Units are classified in five categories based on the measured McCabe index: 1-5 (very simple units), 6-10 (simple units), 11-25 (medium complex units), 26-50 (complex units), 51+ (very complex units).
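The counting rule described above can be sketched in a few lines of Python. This is a rough approximation under stated assumptions, not the method the report's tool uses: the set of node types counted as decision points, and the helper names `mccabe_index` and `classify`, are illustrative choices. The bands in `classify` follow the five ranges in the report's legend and totals.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools differ in exactly what they count).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def mccabe_index(source: str) -> int:
    """Approximate McCabe index: 1 plus the number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" holds three values and adds two extra branches.
            score += len(node.values) - 1
    return score


def classify(index: int) -> str:
    """Map a McCabe index onto the five bands used in this report."""
    if index <= 5:
        return "very simple"
    if index <= 10:
        return "simple"
    if index <= 25:
        return "medium complex"
    if index <= 50:
        return "complex"
    return "very complex"


# Hypothetical unit used only for illustration.
SAMPLE = '''
def pick(items, limit):
    out = []
    for item in items:
        if item and item < limit:
            out.append(item)
    return out
'''

index = mccabe_index(SAMPLE)
print(index, classify(index))  # 4 very simple
```

The sample scores 4: one for the function itself, one for the loop, one for the `if`, and one for the `and` operator.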
Conditional Complexity Overall
  • There are 467 units containing 4,204 lines of code (81.9% of all code).
    • 0 very complex units (0 lines of code)
    • 0 complex units (0 lines of code)
    • 23 medium complex units (682 lines of code)
    • 52 simple units (1,184 lines of code)
    • 392 very simple units (2,338 lines of code)
Share of lines of code per McCabe index range:
51+ | 26-50 | 11-25 | 6-10 | 1-5
0% | 0% | 16% | 28% | 55%
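The percentages above can be reproduced from the lines-of-code totals. The sketch below is a minimal check, assuming that each percentage is a share of lines of code and that the report truncates rather than rounds (2,338 / 4,204 is about 55.6%, shown as 55%). The pairing of category names with index ranges is inferred from the order of the list and the legend.

```python
# Lines of code per category, copied from the totals above.
loc_by_category = {
    "51+ (very complex)": 0,
    "26-50 (complex)": 0,
    "11-25 (medium complex)": 682,
    "6-10 (simple)": 1184,
    "1-5 (very simple)": 2338,
}

total_loc = sum(loc_by_category.values())


def share_percent(loc: int, total: int) -> int:
    """Whole-number percentage, truncated (assumed to match the report)."""
    return loc * 100 // total


shares = [share_percent(loc, total_loc) for loc in loc_by_category.values()]
summary = " | ".join(f"{share}%" for share in shares)
print(total_loc)  # 4204
print(summary)    # 0% | 0% | 16% | 28% | 55%
```

Because of truncation, the shares in a row do not always add up to 100%.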
Conditional Complexity per Extension
Extension | 51+ | 26-50 | 11-25 | 6-10 | 1-5
py | 0% | 0% | 16% | 28% | 55%
Conditional Complexity per Logical Component
primary logical decomposition
Component | 51+ | 26-50 | 11-25 | 6-10 | 1-5
petastorm | 0% | 0% | 27% | 25% | 46%
petastorm/workers_pool | 0% | 0% | 16% | 37% | 46%
petastorm/etl | 0% | 0% | 9% | 33% | 56%
petastorm/reader_impl | 0% | 0% | 15% | 11% | 72%
petastorm/gcsfs_helpers | 0% | 0% | 55% | 0% | 44%
examples/mnist | 0% | 0% | 0% | 44% | 55%
petastorm/spark | 0% | 0% | 0% | 27% | 72%
examples/spark_dataset_converter | 0% | 0% | 0% | 40% | 59%
petastorm/benchmark | 0% | 0% | 0% | 29% | 70%
examples/imagenet | 0% | 0% | 0% | 66% | 33%
petastorm/tools | 0% | 0% | 0% | 30% | 69%
petastorm/hdfs | 0% | 0% | 0% | 16% | 83%
examples/hello_world | 0% | 0% | 0% | 0% | 100%
petastorm/pyarrow_helpers | 0% | 0% | 0% | 0% | 100%
Most Complex Units
Top 50 most complex units
Unit | File | # lines | McCabe index | # params
def __init__() | petastorm/reader.py | 53 | 20 | 17
def __init__() | petastorm/fs_utils.py | 67 | 19 | 6
def _sanitize_field_tf_types() | petastorm/tf_utils.py | 21 | 17 | 1
def form_ngram() | petastorm/ngram.py | 26 | 16 | 3
def get_results() | petastorm/workers_pool/process_pool.py | 32 | 16 | 1
def _validate_ngram() | petastorm/ngram.py | 16 | 15 | 5
def generate_petastorm_metadata() | petastorm/etl/petastorm_generate_metadata.py | 37 | 15 | 5
def _numpy_and_codec_from_arrow_type() | petastorm/unischema.py | 33 | 15 | 1
def _load_rows_with_predicate() | petastorm/py_dict_reader_worker.py | 33 | 14 | 5
def get_results() | petastorm/workers_pool/thread_pool.py | 20 | 13 | 1
def from_arrow_schema() | petastorm/unischema.py | 35 | 13 | 3
def _add_many() | petastorm/reader_impl/pytorch_shuffling_buffer.py | 31 | 12 | 2
def _sanitize_pytorch_types() | petastorm/pytorch.py | 18 | 12 | 1
def _iter_impl() | petastorm/pytorch.py | 25 | 12 | 1
def _ventilate() | petastorm/workers_pool/ventilator.py | 17 | 12 | 1
def process() | petastorm/arrow_reader_worker.py | 29 | 11 | 4
def walk() | petastorm/gcsfs_helpers/gcsfs_wrapper.py | 24 | 11 | 2
def __init__() | petastorm/weighted_sampling_reader.py | 21 | 11 | 3
def encode() | petastorm/codecs.py | 23 | 11 | 3
def make_reader() | petastorm/reader.py | 63 | 11 | 21
def add_to_dataset_metadata() | petastorm/utils.py | 25 | 11 | 3
def transform_schema() | petastorm/transform.py | 20 | 11 | 2
def create_schema_view() | petastorm/unischema.py | 13 | 11 | 2
def read_next() | petastorm/arrow_reader_worker.py | 34 | 10 | 4
def copy_dataset() | petastorm/tools/copy_dataset.py | 28 | 10 | 9
def process() | petastorm/py_dict_reader_worker.py | 25 | 10 | 4
def _iter_impl() | petastorm/pytorch.py | 28 | 10 | 1
def run() | petastorm/workers_pool/thread_pool.py | 22 | 10 | 1
def hdfs_connect_namenode() | petastorm/hdfs/namenode.py | 11 | 10 | 4
def match_unischema_fields() | petastorm/unischema.py | 19 | 10 | 2
def main() | examples/mnist/pytorch_example.py | 47 | 9 | 0
def load_row_groups() | petastorm/etl/dataset_metadata.py | 25 | 9 | 1
def read_next() | petastorm/py_dict_reader_worker.py | 17 | 9 | 4
def make_batch_reader() | petastorm/reader.py | 54 | 9 | 19
def get_results() | petastorm/workers_pool/dummy_pool.py | 14 | 9 | 1
def _keep_retrying_while_zmq_again() | petastorm/workers_pool/process_pool.py | 17 | 9 | 3
def dict_to_spark_row() | petastorm/unischema.py | 26 | 9 | 2
def _convert_precision() | petastorm/spark/spark_dataset_converter.py | 17 | 9 | 2
def convert_fields() | petastorm/ngram.py | 8 | 8 | 3
def _split_row_groups() | petastorm/etl/dataset_metadata.py | 21 | 8 | 1
def build_index() | petastorm/etl/rowgroup_indexers.py | 13 | 8 | 3
def _partition_row_groups() | petastorm/reader.py | 11 | 8 | 6
def decimal_friendly_collate() | petastorm/pytorch.py | 12 | 8 | 1
def decode_row() | petastorm/utils.py | 20 | 8 | 2
def get_filesystem_and_path_or_paths() | petastorm/fs_utils.py | 18 | 8 | 2
def __init__() | petastorm/workers_pool/ventilator.py | 23 | 8 | 7
def _worker_bootstrap() | petastorm/workers_pool/process_pool.py | 53 | 8 | 8
def run() | examples/spark_dataset_converter/pytorch_converter_example.py | 30 | 7 | 1
def run() | examples/spark_dataset_converter/tensorflow_converter_example.py | 33 | 7 | 1
def imagenet_directory_to_petastorm_dataset() | examples/imagenet/generate_petastorm_imagenet.py | 35 | 7 | 5