deepseek-ai / smallpond
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
Lines per file | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Share | 40% | 18% | 28% | 3% | 9%
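As a rough illustration of how such a distribution can be derived, the sketch below assigns each file to one of the five size buckets and reports each bucket's share of the total lines of code. The bucket labels come from the report; weighting by lines of code (rather than by file count) and truncating the percentages are assumptions, although the line counts in the Longest Files list are consistent with that reading. The names `bucket_for`, `size_distribution`, and `LINE_COUNTS` are hypothetical and exist only for this example.

```python
# Bucket labels from the report, paired with their lower bounds, largest first.
BUCKETS = [
    ("1001+", 1001),
    ("501-1000", 501),
    ("201-500", 201),
    ("101-200", 101),
    ("1-100", 1),
]


def bucket_for(line_count):
    """Return the bucket label for a file with the given number of lines."""
    for label, lower_bound in BUCKETS:
        if line_count >= lower_bound:
            return label
    raise ValueError("line_count must be at least 1")


def size_distribution(line_counts):
    """Return {label: percent of all lines held by files in that bucket}."""
    totals = {label: 0 for label, _ in BUCKETS}
    for count in line_counts:
        totals[bucket_for(count)] += count
    grand_total = sum(totals.values())
    return {label: 100.0 * lines / grand_total for label, lines in totals.items()}


# Line counts copied from the "Longest Files (Top 35)" list.
# Assumption: that list covers every file in the repository.
LINE_COUNTS = [
    2475, 1198, 971, 700, 366, 342, 322, 296, 293, 263, 225, 224, 217,
    179, 139, 97, 93, 92, 79, 75, 72, 66, 63, 60, 34, 33, 29, 26, 21, 14,
    1, 1, 1, 1, 1,
]

# Truncate (not round) to whole percentages.
DISTRIBUTION = {label: int(pct) for label, pct in size_distribution(LINE_COUNTS).items()}
print(DISTRIBUTION)
```

Under these assumptions the truncated shares come out as 40, 18, 28, 3, and 9 percent, matching the overall figures in the report.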


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 40% | 18% | 28% | 3% | 8%
toml | 0% | 0% | 0% | 0% | 100%
in | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
smallpond | 43% | 19% | 26% | 3% | 6%
benchmarks | 0% | 0% | 56% | 0% | 43%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 35)
File | Folder | # lines | # units
task.py | smallpond/execution | 2475 | 241
node.py | smallpond/logical | 1198 | 106
scheduler.py | smallpond/execution | 971 | 93
dataset.py | smallpond/logical | 700 | 65
driver.py | smallpond/execution | 366 | 13
workqueue.py | smallpond/execution | 342 | 43
gray_sort_benchmark.py | benchmarks | 322 | 8
dataframe.py | smallpond | 296 | 38
arrow.py | smallpond/io | 293 | 17
session.py | smallpond | 263 | 13
executor.py | smallpond/execution | 225 | 25
planner.py | smallpond/logical | 224 | 16
manager.py | smallpond/execution | 217 | 4
udf.py | smallpond/logical | 179 | 21
utility.py | smallpond | 139 | 17
warc.py | smallpond/contrib | 97 | 5
urls_sort_benchmark.py | benchmarks | 93 | 3
filesystem.py | smallpond/io | 92 | 6
hash_partition_benchmark.py | benchmarks | 79 | 2
(not captured) | (not captured) | 75 | -
file_io_benchmark.py | benchmarks | 72 | 2
common.py | smallpond | 66 | 10
worker.py | smallpond | 63 | -
base.py | smallpond/platform | 60 | 11
__init__.py | smallpond | 34 | 1
optimizer.py | smallpond/logical | 33 | 4
mpi.py | smallpond/platform | 29 | 2
log_dataset.py | smallpond/contrib | 26 | 4
__init__.py | smallpond/platform | 21 | 1
copy_table.py | smallpond/contrib | 14 | 2
__init__.py | smallpond/io | 1 | -
__init__.py | smallpond/logical | 1 | -
__init__.py | smallpond/contrib | 1 | -
__init__.py | smallpond/execution | 1 | -
(not captured) | (empty) | 1 | -
Files With Most Units (Top 28)
File | Folder | # lines | # units
task.py | smallpond/execution | 2475 | 241
node.py | smallpond/logical | 1198 | 106
scheduler.py | smallpond/execution | 971 | 93
dataset.py | smallpond/logical | 700 | 65
workqueue.py | smallpond/execution | 342 | 43
dataframe.py | smallpond | 296 | 38
executor.py | smallpond/execution | 225 | 25
udf.py | smallpond/logical | 179 | 21
arrow.py | smallpond/io | 293 | 17
utility.py | smallpond | 139 | 17
planner.py | smallpond/logical | 224 | 16
session.py | smallpond | 263 | 13
driver.py | smallpond/execution | 366 | 13
base.py | smallpond/platform | 60 | 11
common.py | smallpond | 66 | 10
gray_sort_benchmark.py | benchmarks | 322 | 8
filesystem.py | smallpond/io | 92 | 6
warc.py | smallpond/contrib | 97 | 5
optimizer.py | smallpond/logical | 33 | 4
log_dataset.py | smallpond/contrib | 26 | 4
manager.py | smallpond/execution | 217 | 4
urls_sort_benchmark.py | benchmarks | 93 | 3
hash_partition_benchmark.py | benchmarks | 79 | 2
file_io_benchmark.py | benchmarks | 72 | 2
mpi.py | smallpond/platform | 29 | 2
copy_table.py | smallpond/contrib | 14 | 2
__init__.py | smallpond | 34 | 1
__init__.py | smallpond/platform | 21 | 1
Files With Long Lines (Top 18)

There are 18 files with lines longer than 120 characters. In total, there are 158 long lines.
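A minimal sketch of how the two figures above (files containing long lines, and total long lines) could be counted. The 120-character threshold comes from the report; the helper names and the in-memory `SOURCES` sample are hypothetical, and the report's own tool may count characters differently (for example, with respect to tabs or trailing whitespace).

```python
def count_long_lines(text, limit=120):
    """Count the lines in `text` that have more than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(sources, limit=120):
    """Return (number of files with at least one long line, total long lines)."""
    per_file = {name: count_long_lines(text, limit) for name, text in sources.items()}
    flagged = {name: n for name, n in per_file.items() if n > 0}
    return len(flagged), sum(flagged.values())


# Hypothetical in-memory sources; a real run would read the files from disk.
SOURCES = {
    "short.py": "x = 1\ny = 2\n",
    # One 132-character line (long) and one line of exactly 120 characters (not long).
    "long.py": "a = 1\n# " + "x" * 130 + "\n# " + "y" * 118 + "\n",
}

files_with_long_lines, total_long_lines = long_line_report(SOURCES)
print(files_with_long_lines, total_long_lines)
```

A line of exactly 120 characters is not counted here, since the report describes lines "longer than 120 characters".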

File | Folder | # lines | # units | # long lines
task.py | smallpond/execution | 2475 | 241 | 50
scheduler.py | smallpond/execution | 971 | 93 | 26
node.py | smallpond/logical | 1198 | 106 | 18
planner.py | smallpond/logical | 224 | 16 | 16
dataset.py | smallpond/logical | 700 | 65 | 14
arrow.py | smallpond/io | 293 | 17 | 7
gray_sort_benchmark.py | benchmarks | 322 | 8 | 6
executor.py | smallpond/execution | 225 | 25 | 6
workqueue.py | smallpond/execution | 342 | 43 | 3
warc.py | smallpond/contrib | 97 | 5 | 2
manager.py | smallpond/execution | 217 | 4 | 2
driver.py | smallpond/execution | 366 | 13 | 2
filesystem.py | smallpond/io | 92 | 6 | 1
optimizer.py | smallpond/logical | 33 | 4 | 1
session.py | smallpond | 263 | 13 | 1
copy_table.py | smallpond/contrib | 14 | 2 | 1
dataframe.py | smallpond | 296 | 38 | 1
common.py | smallpond | 66 | 10 | 1
Correlations

File Size vs. Commits (all time): 25 points

Listed points:
File | commits (all time) | lines of code
benchmarks/file_io_benchmark.py | 1 | 72
benchmarks/gray_sort_benchmark.py | 1 | 322
benchmarks/hash_partition_benchmark.py | 1 | 79
benchmarks/urls_sort_benchmark.py | 1 | 93
smallpond/common.py | 1 | 66
smallpond/contrib/copy_table.py | 1 | 14
smallpond/contrib/log_dataset.py | 1 | 26
smallpond/dataframe.py | 1 | 296
smallpond/execution/driver.py | 1 | 366
smallpond/execution/executor.py | 1 | 225
smallpond/execution/manager.py | 1 | 217
smallpond/execution/scheduler.py | 1 | 971
smallpond/execution/task.py | 1 | 2475
smallpond/execution/workqueue.py | 1 | 342
smallpond/logical/dataset.py | 1 | 700
smallpond/logical/node.py | 1 | 1198
smallpond/logical/optimizer.py | 1 | 33
smallpond/logical/udf.py | 1 | 179
smallpond/session.py | 1 | 263
smallpond/utility.py | 1 | 139

lines of code: min: 14.0 | average: 353.8 | 25th percentile: 75.5 | median: 217.0 | 75th percentile: 332.0 | max: 2475.0
commits (all time): min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0
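The lines-of-code summary can be reproduced with Python's `statistics` module (3.8+), as sketched below. The input is partly an assumption: only 20 of the 25 points are listed, so the sketch adds five further line counts taken from the Longest Files list (293, 224, 97, 92, 63), chosen because they make the result agree with the reported figures. The percentile method (the module's default "exclusive" interpolation) is likewise an assumption about how the report computes percentiles, and `summarize` is a hypothetical helper.

```python
import statistics

# Line counts of the 20 points listed for the scatter plot.
LISTED = [
    72, 322, 79, 93, 66, 14, 26, 296, 366, 225,
    217, 971, 2475, 342, 700, 1198, 33, 179, 263, 139,
]

# Assumption: five further line counts from the Longest Files list,
# picked because they reproduce the reported summary for 25 points.
ASSUMED_EXTRA = [293, 224, 97, 92, 63]


def summarize(values):
    """Return min, average, quartiles, and max for a list of numbers."""
    # statistics.quantiles sorts its input and defaults to method="exclusive".
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "average": statistics.mean(values),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": max(values),
    }


SUMMARY = summarize(LISTED + ASSUMED_EXTRA)
for name, value in SUMMARY.items():
    print(f"{name}: {value}")
```

With these 25 values the sketch yields an average of 353.8, quartiles of 75.5, 217.0, and 332.0, and extremes of 14 and 2475, in line with the reported summary.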

File Size vs. Contributors (all time): 25 points

Each listed file has 1 contributor (all time). The listed files and their lines of code are the same as under File Size vs. Commits (all time).

lines of code: min: 14.0 | average: 353.8 | 25th percentile: 75.5 | median: 217.0 | 75th percentile: 332.0 | max: 2475.0
contributors (all time): min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 25 points

Each listed file has 1 commit (90d). The listed files and their lines of code are the same as under File Size vs. Commits (all time).

lines of code: min: 14.0 | average: 353.8 | 25th percentile: 75.5 | median: 217.0 | 75th percentile: 332.0 | max: 2475.0
commits (90d): min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0

File Size vs. Contributors (90 days): 25 points

Each listed file has 1 contributor (90d). The listed files and their lines of code are the same as under File Size vs. Commits (all time).

lines of code: min: 14.0 | average: 353.8 | 25th percentile: 75.5 | median: 217.0 | 75th percentile: 332.0 | max: 2475.0
contributors (90d): min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0