awslabs / sagemaker-debugger
Unit Size

The distribution of unit sizes (measured in lines of code).

Intro
  • Unit size measurements show the distribution of size of units of code (methods, functions...).
  • Units are classified in five categories based on their size (lines of code): 1-10 (very small units), 11-20 (small units), 21-50 (medium size units), 51-100 (long units), 101+ (very long units).
  • You should aim at keeping units small (20 lines or fewer). Long units may become "bloaters": code that has grown so large that it is hard to work with.
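As a rough illustration of how such a classification could be computed, the sketch below measures Python functions with the standard `ast` module and assigns each one to the buckets used in this report's legend (1-10, 11-20, 21-50, 51-100, 101+). The helper names `size_bucket` and `unit_sizes` are hypothetical, and the sketch counts physical lines (including blank lines and comments), which may differ from how the report counts lines of code.

```python
import ast

# Upper bound and label of each bucket, following the report's legend.
BUCKETS = [(10, "1-10"), (20, "11-20"), (50, "21-50"), (100, "51-100")]


def size_bucket(n_lines):
    """Return the legend bucket for a unit with n_lines lines."""
    for upper, label in BUCKETS:
        if n_lines <= upper:
            return label
    return "101+"


def unit_sizes(source):
    """Return (name, physical line count) for every function in source.

    Assumption: a unit's size is end_lineno - lineno + 1, so blank lines
    and comments inside the function are counted too (Python 3.8+).
    """
    tree = ast.parse(source)
    return [
        (node.name, node.end_lineno - node.lineno + 1)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]


# A tiny made-up module: one 2-line function and one 13-line function.
SAMPLE = (
    "def short(x):\n"
    "    return x + 1\n"
    "\n"
    "def longer():\n" + "    x = 1\n" * 12
)

for name, lines in unit_sizes(SAMPLE):
    print(name, lines, size_bucket(lines))
```

Running the same helpers over every `.py` file in a repository and tallying the buckets would give a distribution comparable in shape, though not necessarily in exact numbers, to the one reported here.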
Unit Size Overall
  • There are 1,172 units with 10,947 lines of code in units (70.7% of code).
    • 3 very long units (444 lines of code)
    • 14 long units (875 lines of code)
    • 98 medium size units (2,849 lines of code)
    • 212 small units (3,022 lines of code)
    • 845 very small units (3,757 lines of code)
101+ | 51-100 | 21-50 | 11-20 | 1-10
4% | 7% | 26% | 27% | 34%
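The percentages can be reproduced from the line counts listed under "Unit Size Overall". The sketch below is a worked check, not the report's own code. It assumes each share is the bucket's lines divided by the 10,947 lines in units, truncated to a whole percent. Truncation is an inference from the figures: it matches all five values, whereas rounding would give 8% and 28% for two of them.

```python
# (bucket label, lines of code) copied from the "Unit Size Overall" list.
LINES_PER_BUCKET = [
    ("101+", 444),
    ("51-100", 875),
    ("21-50", 2849),
    ("11-20", 3022),
    ("1-10", 3757),
]


def bucket_shares(buckets):
    """Return (label, whole-percent share) pairs, truncating fractions."""
    total = sum(lines for _, lines in buckets)
    return [(label, lines * 100 // total) for label, lines in buckets]


shares = bucket_shares(LINES_PER_BUCKET)
print(" | ".join(f"{pct}%" for _, pct in shares))
# prints: 4% | 7% | 26% | 27% | 34%
```

Because of the truncation, the five shares add up to 98% rather than 100%.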
Unit Size per Extension
Extension | 101+ | 51-100 | 21-50 | 11-20 | 1-10
py | 4% | 7% | 26% | 27% | 34%
Unit Size per Logical Component
primary logical decomposition
Component | 101+ | 51-100 | 21-50 | 11-20 | 1-10
smdebug/profiler | 13% | 13% | 28% | 22% | 22%
smdebug/core | 0% | 4% | 18% | 33% | 44%
smdebug/pytorch | 0% | 29% | 23% | 12% | 34%
smdebug/xgboost | 0% | 13% | 28% | 23% | 33%
smdebug/rules | 0% | 15% | 44% | 17% | 22%
smdebug/tensorflow | 0% | 0% | 31% | 33% | 34%
smdebug/trials | 0% | 0% | 32% | 20% | 47%
smdebug/mxnet | 0% | 0% | 37% | 31% | 31%
ROOT | 0% | 0% | 29% | 54% | 16%
smdebug/analysis | 0% | 0% | 0% | 90% | 9%
smdebug | 0% | 0% | 0% | 13% | 86%
Longest Units
Top 20 longest units
Unit | File | # lines | McCabe index | # params
def merge_timeline() | smdebug/profiler/analysis/utils/merge_timelines.py | 188 | 32 | 5
def load_config() | smdebug/profiler/profiler_config_parser.py | 141 | 13 | 1
def _read_event() | smdebug/profiler/trace_event_file_parser.py | 115 | 38 | 3
def plot_detailed_profiler_data() | smdebug/profiler/analysis/notebook_utils/timeline_charts.py | 86 | 10 | 2
def forward_pre_hook() | smdebug/pytorch/hook.py | 82 | 25 | 3
def preprocess_system_metrics() | smdebug/profiler/analysis/notebook_utils/heatmap.py | 70 | 24 | 3
def is_checkpoint_updated() | smdebug/core/state_store.py | 64 | 19 | 1
def get_training_phase_intervals() | smdebug/profiler/analysis/utils/pandas_data_analysis.py | 63 | 13 | 2
def parse_tree_model() | smdebug/xgboost/utils.py | 62 | 9 | 3
def get_utilization_stats() | smdebug/profiler/analysis/utils/pandas_data_analysis.py | 62 | 15 | 4
def get_step_statistics() | smdebug/profiler/analysis/utils/pandas_data_analysis.py | 60 | 5 | 2
def _get_sm_tj_jobs_with_prefix() | smdebug/rules/action/stop_training_action.py | 59 | 11 | 1
(missing) | (missing) | 55 | 10 | 1
def write_event() | smdebug/core/tfevent/timeline_file_writer.py | 55 | 22 | 2
def create_plot() | smdebug/profiler/analysis/notebook_utils/heatmap.py | 53 | 12 | 1
def list_prefix() | smdebug/core/access_layer/s3handler.py | 53 | 18 | 1
def get_framework_metrics_by_timesteps() | smdebug/profiler/analysis/utils/profiler_data_to_pandas.py | 51 | 21 | 3
def update_data() | smdebug/profiler/analysis/notebook_utils/timeline_charts.py | 49 | 14 | 2
def get_device_usage_stats() | smdebug/profiler/analysis/utils/pandas_data_analysis.py | 48 | 14 | 3
def _register_actions() | smdebug/rules/action/action.py | 48 | 13 | 3
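The McCabe index listed for each unit is a cyclomatic complexity measure: roughly, one plus the number of decision points in the unit. The sketch below shows one common way to approximate it for Python code with the standard `ast` module. It is an illustration under stated assumptions, not the method used to produce the figures above. The function name `approx_mccabe` and the choice of which node types count as decision points are both assumptions, and different tools count slightly differently.

```python
import ast

# Node types counted as one decision point each (an assumed, common choice).
_DECISION_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp,
)


def approx_mccabe(func_source):
    """Approximate cyclomatic complexity of the code in func_source."""
    complexity = 1  # a single straight-line path
    for node in ast.walk(ast.parse(func_source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # one for the loop, one for each "if" filter
            complexity += 1 + len(node.ifs)
    return complexity


# A made-up function: one loop, one "if", one "and".
EXAMPLE = '''
def pick(values, limit):
    out = []
    for v in values:
        if v > limit and v % 2 == 0:
            out.append(v)
    return out
'''

print(approx_mccabe(EXAMPLE))  # prints: 4
```

By this measure, a unit such as `_read_event()` with an index of 38 has many independent paths through it, which is usually a stronger signal for refactoring than its line count alone.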