awslabs / unsupervised-qa
Unit Size

The distribution of size of units (measured in lines of code).

Intro
  • Unit size measurements show the distribution of size of units of code (methods, functions...).
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (< 20 lines). Long units may become "bloaters": code that has grown so large that it is hard to work with.
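The size buckets used in this report's figures (1-10, 11-20, 21-50, 51-100, 101+) can be sketched as a small lookup. This is only an illustration: the function name `classify_unit` and the label strings are hypothetical and are not taken from the tool that produced this report.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets, as shown in the report's legend.
UPPER_BOUNDS = [10, 20, 50, 100]

# Hypothetical labels, one per bucket; anything above 100 lines is "very long".
LABELS = [
    "very small (1-10)",
    "small (11-20)",
    "medium (21-50)",
    "long (51-100)",
    "very long (101+)",
]


def classify_unit(lines_of_code: int) -> str:
    """Return the size bucket for a unit with the given number of lines."""
    if lines_of_code < 1:
        raise ValueError("a unit must have at least one line of code")
    # bisect_left finds the first bucket whose upper bound is >= lines_of_code.
    return LABELS[bisect_left(UPPER_BOUNDS, lines_of_code)]
```

For example, `classify_unit(111)` places the longest unit listed in this report in the "very long (101+)" bucket.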
Unit Size Overall
  • There are 120 units with 1,199 lines of code in units (71.0% of code).
    • 1 very long unit (111 lines of code)
    • 2 long units (119 lines of code)
    • 11 medium size units (301 lines of code)
    • 15 small units (201 lines of code)
    • 91 very small units (467 lines of code)
Unit size (lines of code) | 101+ | 51-100 | 21-50 | 11-20 | 1-10
Share of code in units | 9% | 9% | 25% | 16% | 38%
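The percentages above can be reproduced from the line counts listed for each category. The sketch below assumes that each share is the bucket's lines divided by the total lines in units, rounded down to a whole percent. The rounding rule is inferred from the displayed figures and is not documented behavior of the reporting tool, and `bucket_shares` is a hypothetical helper name.

```python
# Lines of code per size bucket, copied from the "Unit Size Overall" list.
LINES_PER_BUCKET = {
    "101+": 111,
    "51-100": 119,
    "21-50": 301,
    "11-20": 201,
    "1-10": 467,
}


def bucket_shares(lines_per_bucket: dict) -> dict:
    """Return each bucket's share of the total as a whole percent, rounded down."""
    total = sum(lines_per_bucket.values())
    # Integer floor division matches the truncated figures shown in the report.
    return {bucket: 100 * lines // total for bucket, lines in lines_per_bucket.items()}


shares = bucket_shares(LINES_PER_BUCKET)
```

Because each share is rounded down, the displayed percentages do not have to add up to exactly 100%.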
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 9% | 9% | 25% | 16% | 38%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
distant_supervision | 11% | 6% | 19% | 17% | 45%
spark_scripts | 0% | 24% | 48% | 15% | 12%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def _create_es_index() | distant_supervision/ds_es_client.py | 111 | 3 | 1
def _obtain_retrieved_sentences_for_single_article() | distant_supervision/synthetic_data_creator.py | 65 | 19 | 6
def main() | spark_scripts/create_ds_synthetic_dataset.py | 54 | 6 | 1
def _run_stat() | spark_scripts/stat_for_ner_category_to_wh_words.py | 40 | 9 | 5
def _construct_dataset_sample() | distant_supervision/synthetic_data_creator.py | 40 | 3 | 10
def _get_entity2qpa_list() | distant_supervision/entity_to_queries_mapper.py | 32 | 9 | 3
def __init__() | distant_supervision/synthetic_data_creator.py | 27 | 1 | 13
def _compute_answer_start() | distant_supervision/synthetic_data_creator.py | 25 | 6 | 5
def run_job() | distant_supervision/synthetic_data_creator.py | 24 | 5 | 5
def main() | spark_scripts/write_sentence_level_es_index.py | 23 | 2 | 1
def main() | spark_scripts/create_squad_ner_dataset.py | 23 | 5 | 1
def _index_rdd_partition() | distant_supervision/ds_es_client.py | 23 | 3 | 2
def main() | spark_scripts/tokenize_and_ner_inputs.py | 22 | 2 | 1
def _process_row() | distant_supervision/input_parser.py | 22 | 5 | 2
def main() | spark_scripts/stat_for_ner_category_to_wh_words.py | 19 | 2 | 1
def _make_styled_questions() | distant_supervision/synthetic_data_creator.py | 17 | 2 | 6
def _create_jsonl_training_files() | spark_scripts/create_ds_synthetic_dataset.py | 15 | 3 | 2
def _get_valid_context_sentences() | distant_supervision/entity_to_queries_mapper.py | 15 | 4 | 3
def is_similar() | distant_supervision/text_preprocessor.py | 14 | 3 | 6
def _process_row() | distant_supervision/squad_ner_creator.py | 14 | 4 | 2
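A listing like the one above could be approximated for a single Python file with the standard `ast` module. This is a rough sketch and not the method used to produce this report. It measures each function as the span from its `def` line to its last line, including blank lines and comments, and counts only named parameters (not `*args` or `**kwargs`), so its numbers may differ from those shown here. It does not compute the McCabe index. The function name `longest_units` is hypothetical, and `end_lineno` requires Python 3.8 or later.

```python
import ast


def longest_units(source: str, top: int = 20) -> list:
    """Return (name, line_span, param_count) for the longest functions in source."""
    tree = ast.parse(source)
    units = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Span from the "def" line to the last line of the body, inclusive.
            line_span = node.end_lineno - node.lineno + 1
            # Named parameters only; *args and **kwargs are not counted.
            param_count = (
                len(node.args.posonlyargs)
                + len(node.args.args)
                + len(node.args.kwonlyargs)
            )
            units.append((node.name, line_span, param_count))
    # Longest units first; keep only the requested number of entries.
    return sorted(units, key=lambda unit: unit[1], reverse=True)[:top]
```

Reading a file's text and passing it to `longest_units` returns the longest functions first, which mirrors the ordering of the table above.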