aws / sagemaker-spark-container
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
1001+: 0% | 501-1000: 0% | 201-500: 19% | 101-200: 20% | 1-100: 60%
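As an illustration of how such a distribution can be derived, the sketch below assigns each file to a size bucket and reports the share of all lines of code that falls into each bucket. Weighting by lines of code rather than by file count is an assumption; it is consistent with the figures in this report (the single 387-line file in the 201-500 bucket is about 19% of all lines). The names BUCKETS, bucket_label, and size_distribution are hypothetical and not part of the reporting tool.

```python
# Size buckets as (label, inclusive lower bound), largest first.
BUCKETS = [
    ("1001+", 1001),
    ("501-1000", 501),
    ("201-500", 201),
    ("101-200", 101),
    ("1-100", 1),
]


def bucket_label(loc):
    """Return the bucket label for a file with `loc` lines of code (loc >= 1)."""
    for label, lower in BUCKETS:
        if loc >= lower:
            return label
    raise ValueError("loc must be at least 1")


def size_distribution(file_locs):
    """Return the percentage of total lines of code held by each bucket.

    Assumption: each bucket is weighted by lines of code, not by file count.
    """
    totals = {label: 0 for label, _ in BUCKETS}
    for loc in file_locs:
        totals[bucket_label(loc)] += loc
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: 100.0 * value / grand_total for label, value in totals.items()}


if __name__ == "__main__":
    # Line counts of the Python files listed in this report.
    py_files = [387, 155, 136, 108, 85, 75, 56, 41, 34, 33, 33, 33, 20, 14, 13, 10, 3, 1]
    for label, pct in size_distribution(py_files).items():
        print(f"{label}: {pct:.1f}%")
```

Applied to the line counts of the Python files in this report, the sketch yields roughly 31%, 32%, and 36% for the 201-500, 101-200, and 1-100 buckets.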


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 0% | 31% | 32% | 36%
xml | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
Logical decomposition: primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
src | 0% | 0% | 33% | 34% | 32%
spark | 0% | 0% | 0% | 0% | 100%
smsparkbuild | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 41)
File | Location | # lines | # units
bootstrapper.py | src/smspark | 387 | 21
job.py | src/smspark | 155 | 12
cli.py | src/smspark | 136 | 5
config.py | src/smspark | 108 | 8
status.py | src/smspark | 85 | 11
spark_event_logs_publisher.py | src/smspark | 75 | 8
history_server_utils.py | src/smspark | 56 | 4
hdfs-site.xml | spark/processing/3.5/py3/hadoop-config | 51 | -
hdfs-site.xml | spark/processing/3.4/py3/hadoop-config | 51 | -
hdfs-site.xml | spark/processing/3.0/py3/hadoop-config | 51 | -
hdfs-site.xml | spark/processing/3.1/py3/hadoop-config | 51 | -
hdfs-site.xml | spark/processing/3.2/py3/hadoop-config | 51 | -
hdfs-site.xml | spark/processing/3.3/py3/hadoop-config | 51 | -
errors.py | src/smspark | 41 | 5
spark_executor_logs_watcher.py | src/smspark | 34 | 3
nginx_utils.py | src/smspark | 33 | 3
setup.py | smsparkbuild/py37 | 33 | -
setup.py | smsparkbuild/py39 | 33 | -
yarn-site.xml | spark/processing/3.5/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/3.4/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/3.0/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/3.1/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/2.4/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/3.2/py3/hadoop-config | 32 | -
yarn-site.xml | spark/processing/3.3/py3/hadoop-config | 32 | -
core-site.xml | spark/processing/3.5/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/3.4/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/3.0/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/3.1/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/2.4/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/3.2/py3/hadoop-config | 29 | -
core-site.xml | spark/processing/3.3/py3/hadoop-config | 29 | -
history_server_cli.py | src/smspark | 20 | 1
hdfs-site.xml | spark/processing/2.4/py3/hadoop-config | 16 | -
config_path_utils.py | src/smspark | 14 | 1
waiter.py | src/smspark | 13 | 1
constants.py | src/smspark | 10 | -
defaults.py | src/smspark | 3 | -
pyproject.toml | smsparkbuild/py37 | 3 | -
pyproject.toml | smsparkbuild/py39 | 3 | -
__init__.py | src/smspark | 1 | -
Files With Most Units (Top 13)
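File | Location | # lines | # units
bootstrapper.py | src/smspark | 387 | 21
job.py | src/smspark | 155 | 12
status.py | src/smspark | 85 | 11
spark_event_logs_publisher.py | src/smspark | 75 | 8
config.py | src/smspark | 108 | 8
errors.py | src/smspark | 41 | 5
cli.py | src/smspark | 136 | 5
history_server_utils.py | src/smspark | 56 | 4
spark_executor_logs_watcher.py | src/smspark | 34 | 3
nginx_utils.py | src/smspark | 33 | 3
config_path_utils.py | src/smspark | 14 | 1
waiter.py | src/smspark | 13 | 1
history_server_cli.py | src/smspark | 20 | 1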
Files With Long Lines (Top 10)

There are 10 files with lines longer than 120 characters. In total, there are 10 long lines.
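A minimal sketch of how such a count could be reproduced locally is shown below. The 120-character threshold comes from the report; the function names, the file patterns, and the choice to count characters excluding the line terminator are assumptions, and the reporting tool may measure line length differently.

```python
from pathlib import Path


def count_long_lines(text, limit=120):
    """Count lines in `text` that are longer than `limit` characters.

    Line terminators are not counted toward the length (an assumption).
    """
    return sum(1 for line in text.splitlines() if len(line) > limit)


def files_with_long_lines(root, limit=120, patterns=("*.py", "*.xml", "*.toml")):
    """Map each matching file under `root` to its number of long lines.

    Only files with at least one long line are included in the result.
    """
    result = {}
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            n_long = count_long_lines(text, limit)
            if n_long:
                result[str(path)] = n_long
    return result


if __name__ == "__main__":
    # Scan the current directory and list the files with long lines.
    for name, n_long in sorted(files_with_long_lines(".").items()):
        print(f"{name}: {n_long} long line(s)")
```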

File | Location | # lines | # units | # long lines
config.py | src/smspark | 108 | 8 | 1
job.py | src/smspark | 155 | 12 | 1
bootstrapper.py | src/smspark | 387 | 21 | 1
yarn-site.xml | spark/processing/3.5/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/3.4/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/3.0/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/3.1/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/2.4/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/3.2/py3/hadoop-config | 32 | - | 1
yarn-site.xml | spark/processing/3.3/py3/hadoop-config | 32 | - | 1
Correlations

File Size vs. Commits (all time): 41 points

File | commits (all time) | lines of code
spark/processing/3.5/py3/hadoop-config/core-site.xml | 1 | 29
spark/processing/3.5/py3/hadoop-config/hdfs-site.xml | 1 | 51
spark/processing/3.5/py3/hadoop-config/yarn-site.xml | 1 | 32
src/smspark/bootstrapper.py | 15 | 387
smsparkbuild/py39/setup.py | 2 | 33
src/smspark/constants.py | 3 | 10
src/smspark/config_path_utils.py | 2 | 14
src/smspark/job.py | 7 | 155
spark/processing/2.4/py3/hadoop-config/core-site.xml | 2 | 29
smsparkbuild/py37/pyproject.toml | 1 | 3
src/smspark/cli.py | 4 | 136
src/smspark/config.py | 4 | 108
src/smspark/errors.py | 4 | 41
src/smspark/history_server_cli.py | 4 | 20
src/smspark/history_server_utils.py | 4 | 56
src/smspark/nginx_utils.py | 4 | 33
src/smspark/spark_event_logs_publisher.py | 4 | 75
src/smspark/waiter.py | 4 | 13
spark/processing/3.0/py3/hadoop-config/hdfs-site.xml | 2 | 51
src/smspark/defaults.py | 3 | 3
src/smspark/spark_executor_logs_watcher.py | 3 | 34
src/smspark/status.py | 3 | 85
spark/processing/2.4/py3/hadoop-config/hdfs-site.xml | 1 | 16
src/smspark/__init__.py | 1 | 1

lines of code
min: 1.0 | average: 48.59 | 25th percentile: 29.0 | median: 32.0 | 75th percentile: 51.0 | max: 387.0
commits (all time)
min: 1.0 | average: 2.51 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.5 | max: 15.0
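
Summary rows of this kind (min, average, quartiles, max) can be computed with Python's standard library, as in the sketch below. The function name summarize is hypothetical, and the "inclusive" quantile method is an assumption: the report does not state which percentile definition it uses, so quartiles computed this way may differ slightly from the values shown.

```python
import statistics


def summarize(values):
    """Return min, average, quartiles, and max for a list of numbers.

    Requires at least two values. Quartiles use the "inclusive" method,
    which is an assumption about how percentiles are defined.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "average": statistics.mean(values),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": max(values),
    }


if __name__ == "__main__":
    # Small illustrative sample, not data from the report.
    sample = [1, 2, 3, 4, 5]
    print(" | ".join(f"{key}: {value}" for key, value in summarize(sample).items()))
```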

File Size vs. Contributors (all time): 41 points

File | contributors (all time) | lines of code
spark/processing/3.5/py3/hadoop-config/core-site.xml | 1 | 29
spark/processing/3.5/py3/hadoop-config/hdfs-site.xml | 1 | 51
spark/processing/3.5/py3/hadoop-config/yarn-site.xml | 1 | 32
src/smspark/bootstrapper.py | 10 | 387
smsparkbuild/py39/setup.py | 2 | 33
src/smspark/constants.py | 3 | 10
src/smspark/config_path_utils.py | 1 | 14
src/smspark/job.py | 6 | 155
spark/processing/2.4/py3/hadoop-config/core-site.xml | 2 | 29
smsparkbuild/py37/pyproject.toml | 1 | 3
src/smspark/cli.py | 4 | 136
src/smspark/config.py | 4 | 108
src/smspark/errors.py | 4 | 41
src/smspark/history_server_cli.py | 4 | 20
src/smspark/history_server_utils.py | 4 | 56
src/smspark/nginx_utils.py | 4 | 33
src/smspark/spark_event_logs_publisher.py | 4 | 75
src/smspark/waiter.py | 4 | 13
src/smspark/defaults.py | 3 | 3
src/smspark/spark_executor_logs_watcher.py | 3 | 34
src/smspark/status.py | 3 | 85
spark/processing/2.4/py3/hadoop-config/hdfs-site.xml | 1 | 16
src/smspark/__init__.py | 1 | 1

lines of code
min: 1.0 | average: 48.59 | 25th percentile: 29.0 | median: 32.0 | 75th percentile: 51.0 | max: 387.0
contributors (all time)
min: 1.0 | average: 2.32 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.5 | max: 10.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 0 points

No data for "commits (90d)" vs. "lines of code".

File Size vs. Contributors (90 days): 0 points

No data for "contributors (90d)" vs. "lines of code".