apache / datafusion-ray
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
1001+: 35% | 501-1000: 0% | 201-500: 32% | 101-200: 10% | 1-100: 21%
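The bucket percentages in this report can be approximated with a short script. The following is a minimal sketch, assuming each percentage is a bucket's share of total lines of code rather than its share of the file count (this assumption looks consistent with the overall figures, but the report does not state it). The function names and the sample file names are hypothetical.

```python
# Bucket labels and bounds follow the legend; weighting by lines of code is an assumption.
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_for(lines):
    """Return the legend label for a file with the given number of lines."""
    for label, low, high in BUCKETS:
        if lines >= low and (high is None or lines <= high):
            return label
    raise ValueError(f"no bucket for {lines} lines")


def size_distribution(file_sizes):
    """Map each bucket label to its percentage share of all lines of code."""
    totals = {label: 0 for label, _, _ in BUCKETS}
    for lines in file_sizes.values():
        totals[bucket_for(lines)] += lines
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no lines of code to distribute")
    return {label: 100.0 * total / grand_total for label, total in totals.items()}


# Hypothetical file names and sizes, chosen only to exercise the buckets.
sample = {"big.py": 1500, "medium.py": 300, "small.py": 150, "tiny.py": 50}
for label, share in size_distribution(sample).items():
    print(f"{label}: {share:.1f}%")
```

Weighting by lines rather than by file count means a single very large file can dominate a bucket, which is why one or two files can account for a large share of the 1001+ bucket.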


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 49% | 0% | 31% | 13% | 5%
proto | 67% | 0% | 30% | 0% | 2%
rs | 0% | 0% | 50% | 17% | 31%
sql | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
datafusion_ray | 77% | 0% | 21% | 0% | <1%
src | 31% | 0% | 41% | 9% | 17%
k8s | 0% | 0% | 71% | 16% | 11%
tpch | 0% | 0% | 0% | 20% | 79%
dev | 0% | 0% | 0% | 65% | 34%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
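File | Folder | # lines | # units
friendly.py | datafusion_ray | 1579 | 1
datafusion.proto | src/proto | 1068 | -
datafusion_common.proto | src/proto | 487 | -
core.py | datafusion_ray | 436 | 24
dataframe.rs | src | 369 | 21
bench_toolbox.py | k8s | 323 | 8
processor_service.rs | src | 297 | 2
util.rs | src | 278 | 5
cmds.py | k8s | 234 | 7
codec.rs | src | 188 | 4
tpcbench.py | tpch | 170 | 2
stage_reader.rs | src | 148 | 8
pricing.py | k8s | 132 | 2
generate-changelog.py | dev/release | 120 | 3
Cargo.toml | root | 100 | -
(not shown) | (not shown) | 93 | 7
spark_tpcbench.py | k8s | 90 | 1
context.rs | src | 89 | -
flight.rs | src | 86 | -
(not shown) | (not shown) | 79 | 8
stage.rs | src | 77 | 9
physical.rs | src | 62 | 4
max_rows.rs | src | 52 | 7
pyproject.toml | root | 45 | -
q2.sql | tpch/queries | 43 | -
datafusion_ray.proto | src/proto | 39 | -
q7.sql | tpch/queries | 39 | -
q21.sql | tpch/queries | 39 | -
q8.sql | tpch/queries | 37 | -
q20.sql | tpch/queries | 37 | -
q22.sql | tpch/queries | 37 | -
check-rat-report.py | dev/release | 36 | -
q19.sql | tpch/queries | 35 | -
q9.sql | tpch/queries | 32 | -
q18.sql | tpch/queries | 32 | -
q15.sql | tpch/queries | 31 | -
q10.sql | tpch/queries | 31 | -
q16.sql | tpch/queries | 30 | -
build.rs | root | 29 | 1
q12.sql | tpch/queries | 28 | -
q11.sql | tpch/queries | 27 | -
create_license.py | dev | 27 | -
q5.sql | tpch/queries | 24 | -
(not shown) | (not shown) | 23 | 2
q3.sql | tpch/queries | 22 | -
q1.sql | tpch/queries | 21 | -
q4.sql | tpch/queries | 21 | -
q13.sql | tpch/queries | 20 | -
lib.rs | src | 19 | -
q17.sql | tpch/queries | 17 | -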
Files With Most Units (Top 20)
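File | Folder | # lines | # units
core.py | datafusion_ray | 436 | 24
dataframe.rs | src | 369 | 21
stage.rs | src | 77 | 9
(not shown) | (not shown) | 79 | 8
stage_reader.rs | src | 148 | 8
bench_toolbox.py | k8s | 323 | 8
(not shown) | (not shown) | 93 | 7
max_rows.rs | src | 52 | 7
cmds.py | k8s | 234 | 7
util.rs | src | 278 | 5
codec.rs | src | 188 | 4
physical.rs | src | 62 | 4
generate-changelog.py | dev/release | 120 | 3
processor_service.rs | src | 297 | 2
pricing.py | k8s | 132 | 2
(not shown) | (not shown) | 23 | 2
tpcbench.py | tpch | 170 | 2
spark_tpcbench.py | k8s | 90 | 1
friendly.py | datafusion_ray | 1579 | 1
build.rs | root | 29 | 1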
Files With Long Lines (Top 5)

There are 5 files with lines longer than 120 characters. In total, there are 12 long lines.

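File | Folder | # lines | # units | # long lines
bench_toolbox.py | k8s | 323 | 8 | 5
cmds.py | k8s | 234 | 7 | 3
datafusion.proto | src/proto | 1068 | - | 2
stage_reader.rs | src | 148 | 8 | 1
spark_tpcbench.py | k8s | 90 | 1 | 1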
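The long-line counts in this section can be reproduced in spirit with a few lines of code. This is a minimal sketch, assuming a line counts as long when it has more than 120 characters excluding the line terminator; the report does not say how tabs or trailing whitespace are treated, and the function names and sample contents are hypothetical.

```python
def count_long_lines(text, limit=120):
    """Count lines whose length (without the newline) exceeds the limit."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(sources, limit=120):
    """Return {file name: long-line count} for files with at least one long line."""
    counts = {name: count_long_lines(text, limit) for name, text in sources.items()}
    return {name: count for name, count in counts.items() if count > 0}


# Hypothetical in-memory sources; a real run would read the files from disk.
sources = {
    "wide.py": "x" * 121 + "\n" + "y" * 120 + "\n",
    "narrow.py": "print('ok')\n",
}
report = long_line_report(sources)
print(f"{len(report)} file(s), {sum(report.values())} long line(s)")
```

A line of exactly 120 characters is not counted, matching the report's wording "longer than 120 characters".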
Correlations

File Size vs. Commits (all time): 55 points

File | commits (all time) | lines of code
Cargo.toml | 16 | 100
datafusion_ray/core.py | 8 | 436
dev/release/check-rat-report.py | 2 | 36
dev/release/generate-changelog.py | 2 | 120
k8s/bench_toolbox.py | 1 | 323
k8s/cmds.py | 1 | 234
k8s/pricing.py | 1 | 132
k8s/spark_tpcbench.py | 1 | 90
src/codec.rs | 3 | 188
tpch/tpcbench.py | 11 | 170
src/dataframe.rs | 7 | 369
datafusion_ray/util.py | 2 | 4
src/lib.rs | 8 | 19
src/util.rs | 6 | 278
src/context.rs | 10 | 89
src/processor_service.rs | 2 | 297
dev/create_license.py | 1 | 27
pyproject.toml | 8 | 45
datafusion_ray/__init__.py | 7 | 8
src/physical.rs | 2 | 62
src/proto/datafusion_ray.proto | 6 | 39
src/stage.rs | 1 | 77
src/stage_reader.rs | 1 | 148
tpch/queries/q1.sql | 1 | 21
tpch/queries/q14.sql | 1 | 13
tpch/queries/q18.sql | 1 | 32
tpch/queries/q2.sql | 1 | 43
tpch/queries/q6.sql | 1 | 9
src/proto/datafusion.proto | 3 | 1068
src/proto/datafusion_common.proto | 2 | 487
datafusion_ray/friendly.py | 1 | 1579
src/flight.rs | 1 | 86
src/max_rows.rs | 1 | 52
build.rs | 3 | 29

lines of code: min: 2.0 | average: 134.8 | 25th percentile: 27.0 | median: 39.0 | 75th percentile: 120.0 | max: 1579.0
commits (all time): min: 1.0 | average: 2.64 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 16.0
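Summary statistics like the ones above (min, average, quartiles, max) can be computed with Python's standard library. This is a minimal sketch under two assumptions: the percentile method used by the report is unknown, so the "inclusive" method of `statistics.quantiles` is an arbitrary choice, and the sample uses only five of the file sizes listed in this report, so its output will not match the report's figures. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(values):
    """Return min, average, quartiles, and max for a list of numbers."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the data
    # as the whole population (an assumption about the report's method).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "average": statistics.mean(ordered),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": ordered[-1],
    }


# Lines of code for build.rs, flight.rs, codec.rs, util.rs, and core.py.
sizes = [29, 86, 188, 278, 436]
for name, value in summarize(sizes).items():
    print(f"{name}: {value}")
```

The large gap between the report's median (39.0) and average (134.8) lines of code reflects a skewed distribution: most files are small and a few are very large.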

File Size vs. Contributors (all time): 55 points

File | contributors (all time) | lines of code
Cargo.toml | 7 | 100
datafusion_ray/core.py | 4 | 436
dev/release/check-rat-report.py | 2 | 36
dev/release/generate-changelog.py | 2 | 120
k8s/bench_toolbox.py | 1 | 323
k8s/cmds.py | 1 | 234
k8s/pricing.py | 1 | 132
k8s/spark_tpcbench.py | 1 | 90
src/codec.rs | 2 | 188
tpch/tpcbench.py | 4 | 170
src/dataframe.rs | 3 | 369
datafusion_ray/util.py | 1 | 4
src/lib.rs | 5 | 19
src/util.rs | 3 | 278
src/context.rs | 5 | 89
src/processor_service.rs | 1 | 297
dev/create_license.py | 1 | 27
pyproject.toml | 5 | 45
datafusion_ray/__init__.py | 5 | 8
src/physical.rs | 2 | 62
src/proto/datafusion_ray.proto | 4 | 39
src/stage.rs | 1 | 77
src/stage_reader.rs | 1 | 148
tpch/queries/q1.sql | 1 | 21
tpch/queries/q14.sql | 1 | 13
tpch/queries/q18.sql | 1 | 32
tpch/queries/q2.sql | 1 | 43
tpch/queries/q6.sql | 1 | 9
src/proto/datafusion.proto | 3 | 1068
src/proto/datafusion_common.proto | 2 | 487
datafusion_ray/friendly.py | 1 | 1579
src/flight.rs | 1 | 86
src/max_rows.rs | 1 | 52
build.rs | 2 | 29
src/proto/mod.rs | 2 | 2

lines of code: min: 2.0 | average: 134.8 | 25th percentile: 27.0 | median: 39.0 | 75th percentile: 120.0 | max: 1579.0
contributors (all time): min: 1.0 | average: 1.8 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 7.0
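The report plots file size against activity but does not state a correlation coefficient. One way a reader could quantify the relationship is a Pearson coefficient over the (x, y) pairs. This is a minimal sketch, not the report's method; the function name `pearson` is hypothetical, and the sample uses only five of the (contributors, lines of code) pairs listed above, so it says nothing reliable about the full data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# (contributors, lines of code) for Cargo.toml, core.py, friendly.py,
# lib.rs, and codec.rs, taken from the data points in this section.
contributors = [7, 4, 1, 5, 2]
lines_of_code = [100, 436, 1579, 19, 188]
print(f"r = {pearson(contributors, lines_of_code):.2f}")
```

With medians of 1.0 for both commits and contributors, most points sit at x = 1, so any coefficient computed from this data would be driven by a small number of frequently edited files.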

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 53 points

File | commits (90d) | lines of code
Cargo.toml | 7 | 100
datafusion_ray/core.py | 8 | 436
dev/release/check-rat-report.py | 1 | 36
dev/release/generate-changelog.py | 1 | 120
k8s/bench_toolbox.py | 1 | 323
k8s/cmds.py | 1 | 234
k8s/pricing.py | 1 | 132
k8s/spark_tpcbench.py | 1 | 90
src/codec.rs | 3 | 188
tpch/tpcbench.py | 8 | 170
src/dataframe.rs | 7 | 369
datafusion_ray/util.py | 2 | 4
src/lib.rs | 4 | 19
src/util.rs | 6 | 278
src/context.rs | 3 | 89
src/processor_service.rs | 2 | 297
dev/create_license.py | 1 | 27
pyproject.toml | 3 | 45
datafusion_ray/__init__.py | 4 | 8
src/physical.rs | 2 | 62
src/proto/datafusion_ray.proto | 2 | 39
src/stage.rs | 1 | 77
src/stage_reader.rs | 1 | 148
tpch/queries/q1.sql | 1 | 21
tpch/queries/q14.sql | 1 | 13
tpch/queries/q2.sql | 1 | 43
tpch/queries/q6.sql | 1 | 9
src/proto/datafusion.proto | 1 | 1068
src/proto/datafusion_common.proto | 1 | 487
datafusion_ray/friendly.py | 1 | 1579
src/flight.rs | 1 | 86
src/max_rows.rs | 1 | 52

lines of code: min: 4.0 | average: 139.3 | 25th percentile: 27.0 | median: 39.0 | 75th percentile: 126.0 | max: 1579.0
commits (90d): min: 1.0 | average: 1.89 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 8.0

File Size vs. Contributors (90 days): 53 points

File | contributors (90d) | lines of code
Cargo.toml | 3 | 100
datafusion_ray/core.py | 4 | 436
dev/release/check-rat-report.py | 1 | 36
dev/release/generate-changelog.py | 1 | 120
k8s/bench_toolbox.py | 1 | 323
k8s/cmds.py | 1 | 234
k8s/pricing.py | 1 | 132
k8s/spark_tpcbench.py | 1 | 90
src/codec.rs | 2 | 188
tpch/tpcbench.py | 3 | 170
src/dataframe.rs | 3 | 369
datafusion_ray/util.py | 1 | 4
src/lib.rs | 3 | 19
src/util.rs | 3 | 278
src/context.rs | 2 | 89
src/processor_service.rs | 1 | 297
dev/create_license.py | 1 | 27
pyproject.toml | 2 | 45
datafusion_ray/__init__.py | 3 | 8
src/physical.rs | 2 | 62
src/proto/datafusion_ray.proto | 2 | 39
src/stage.rs | 1 | 77
src/stage_reader.rs | 1 | 148
tpch/queries/q1.sql | 1 | 21
tpch/queries/q14.sql | 1 | 13
tpch/queries/q2.sql | 1 | 43
tpch/queries/q6.sql | 1 | 9
src/proto/datafusion.proto | 1 | 1068
src/proto/datafusion_common.proto | 1 | 487
datafusion_ray/friendly.py | 1 | 1579
src/flight.rs | 1 | 86
src/max_rows.rs | 1 | 52

lines of code: min: 4.0 | average: 139.3 | 25th percentile: 27.0 | median: 39.0 | 75th percentile: 126.0 | max: 1579.0
contributors (90d): min: 1.0 | average: 1.38 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 4.0