apache / datafusion-python
File Change Frequency

File change frequency (churn) shows how file updates are distributed across the codebase. One update is a day with at least one commit that touches the file.
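As a rough illustration of this definition, the sketch below counts, for each file, the number of distinct days on which at least one commit touched it. The `change_days` function and the `(date, paths)` input shape are assumptions made for this example and are not the report tool's actual implementation.

```python
from collections import defaultdict
from datetime import date


def change_days(commits):
    """Count, per file, the distinct days with at least one commit.

    `commits` is an iterable of (commit_date, paths) pairs. This input
    shape is a hypothetical one chosen for the sketch.
    """
    days_by_file = defaultdict(set)
    for commit_date, paths in commits:
        for path in paths:
            # A set keeps each day once, however many commits landed on it.
            days_by_file[path].add(commit_date)
    return {path: len(days) for path, days in days_by_file.items()}


# Hypothetical history: two commits on the same day count as one update.
history = [
    (date(2025, 5, 5), ["src/expr.rs", "src/common.rs"]),
    (date(2025, 5, 5), ["src/expr.rs"]),
    (date(2025, 4, 25), ["src/expr.rs"]),
]
churn = change_days(history)
print(churn)
```

Counting days rather than commits means a burst of small commits on one day weighs the same as a single large commit.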

Overview
File Change Frequency Overall
  • There are 139 files with 15,248 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 4 files changed 51-100 times (2,324 lines of code)
    • 9 files changed 21-50 times (4,331 lines of code)
    • 30 files changed 6-20 times (3,470 lines of code)
    • 96 files changed 1-5 times (5,123 lines of code)
Distribution by number of changes: 101+: 0% | 51-100: 15% | 21-50: 28% | 6-20: 22% | 1-5: 33%
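The percentages in the bar appear to be each bucket's share of the total lines of code, truncated to whole percent. That reading is an inference from the figures, not something the report states. A minimal sketch that reproduces the bar from the bucket totals listed above (the `loc_shares` helper is hypothetical):

```python
# (bucket label, number of files, lines of code), copied from the overview list.
BUCKETS = [
    ("101+", 0, 0),
    ("51-100", 4, 2324),
    ("21-50", 9, 4331),
    ("6-20", 30, 3470),
    ("1-5", 96, 5123),
]


def loc_shares(buckets):
    """Return each bucket's share of total lines of code as a whole percent.

    Integer division truncates, which is an assumption that happens to
    match the figures shown in the report.
    """
    total = sum(loc for _, _, loc in buckets)
    return [loc * 100 // total for _, _, loc in buckets]


shares = loc_shares(BUCKETS)
bar = " | ".join(f"{share}%" for share in shares)
print(bar)
```

Because each share is truncated, the percentages need not add up to 100.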

Contributors Count Frequency Overall
  • There are 139 files with 15,248 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 6 files changed by 11-25 contributors (2,556 lines of code)
    • 19 files changed by 6-10 contributors (5,575 lines of code)
    • 67 files changed by 2-5 contributors (4,652 lines of code)
    • 47 files changed by 1 contributor (2,465 lines of code)
Distribution by number of contributors: 26+: 0% | 11-25: 16% | 6-10: 36% | 2-5: 30% | 1: 16%

File Change Frequency per File Extension
rs, py, rst, md, sql, sh, yaml, toml, gitignore, txt, svg, html, css, dockerfile, dockerignore, bat, json, gitmodules
File Change Frequency per Extension
The number of recorded file updates
Extension | 101+ | 51-100 | 21-50 | 6-20 | 1-5
rs | 0% | 23% | 17% | 21% | 38%
toml | 0% | 29% | 70% | 0% | 0%
py | 0% | 0% | 55% | 31% | 13%
sql | 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
The number of recorded file updates
Component | 101+ | 51-100 | 21-50 | 6-20 | 1-5
src | 0% | 23% | 17% | 21% | 38%
ROOT | 0% | 28% | 69% | 0% | 1%
python | 0% | 0% | 70% | 14% | 14%
benchmarks | 0% | 0% | 0% | 50% | 49%
dev | 0% | 0% | 0% | 63% | 36%
Most Frequently Changed Files (Top 50)


Columns: File (name, then location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
Cargo.toml
in root
57 - 2022-07-22 2025-04-24 77 15 agrove@apache.org timsaucer@gmail.com
context.rs
in src
929 28 2022-07-22 2025-03-13 63 23 agrove@apache.org 54253219+jsai28@users.norep...
functions.rs
in src
719 25 2022-07-22 2025-04-25 58 20 agrove@apache.org chenkovsky@qq.com
dataframe.rs
in src
619 52 2022-07-22 2025-05-05 54 18 agrove@apache.org kosiew@gmail.com
expr.rs
in src
756 19 2023-02-15 2025-05-05 48 10 jdye64@gmail.com chenkovsky@qq.com
functions.py
in python/datafusion
1099 243 2024-05-14 2025-04-25 30 9 michael-j-ward@users.norepl... chenkovsky@qq.com
lib.rs
in src
94 2 2022-07-22 2025-03-30 29 13 agrove@apache.org chenkovsky@qq.com
pyproject.toml
in root
138 - 2022-07-22 2025-03-17 27 14 agrove@apache.org 67336892+spaarsh@users.nore...
data_type.rs
in src/common
731 7 2023-02-13 2025-02-01 26 6 jdye64@gmail.com timsaucer@gmail.com
expr.py
in python/datafusion
653 143 2024-05-14 2025-05-05 25 9 michael-j-ward@users.norepl... chenkovsky@qq.com
context.py
in python/datafusion
423 69 2024-07-18 2025-03-17 23 9 timsaucer@gmail.com 67336892+spaarsh@users.nore...
udaf.rs
in src
148 11 2022-07-22 2025-02-20 22 9 agrove@apache.org kevinjqliu@users.noreply.gi...
dataframe.py
in python/datafusion
289 58 2024-07-18 2025-03-17 21 6 timsaucer@gmail.com 67336892+spaarsh@users.nore...
dataset_exec.rs
in src
173 6 2022-07-26 2025-04-24 20 7 84413234+kylebrooks-8451@us... timsaucer@gmail.com
logical.rs
in src/sql
190 8 2023-02-13 2025-05-05 20 7 jdye64@gmail.com chenkovsky@qq.com
udf.rs
in src
78 5 2022-07-22 2025-02-20 18 7 agrove@apache.org kevinjqliu@users.noreply.gi...
__init__.py
in python/datafusion
76 4 2024-05-14 2025-04-27 17 7 michael-j-ward@users.norepl... 37878412+deanm0000@users.no...
substrait.rs
in src
112 3 2023-01-20 2025-02-20 14 6 jdye64@gmail.com kevinjqliu@users.noreply.gi...
utils.rs
in src
63 - 2022-07-22 2025-03-22 12 8 agrove@apache.org timsaucer@gmail.com
udf.py
in python/datafusion
294 34 2024-07-18 2025-03-15 12 6 timsaucer@gmail.com kosiew@gmail.com
dataset.rs
in src
22 - 2022-07-26 2025-02-20 11 5 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
pyarrow_filter_expression.rs
in src
143 2 2022-07-26 2025-02-20 11 6 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
physical_plan.rs
in src
46 - 2023-01-27 2025-02-20 10 5 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
catalog.rs
in src
102 9 2022-07-22 2025-02-01 10 6 agrove@apache.org timsaucer@gmail.com
table_scan.rs
in src/expr
106 11 2023-02-15 2025-02-20 10 5 jdye64@gmail.com kevinjqliu@users.noreply.gi...
window.rs
in src/expr
239 10 2023-10-21 2025-03-22 10 5 jdye64@gmail.com timsaucer@gmail.com
projection.rs
in src/expr
81 9 2023-02-16 2025-02-20 9 4 jdye64@gmail.com kevinjqliu@users.noreply.gi...
literal.rs
in src/expr
105 3 2023-02-20 2025-02-20 9 7 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
generate-changelog.py
in dev/release
106 3 2023-05-23 2025-03-17 9 4 andygrove73@gmail.com 67336892+spaarsh@users.nore...
aggregate.rs
in src/expr
115 11 2023-02-19 2025-03-22 9 6 andygrove73@gmail.com timsaucer@gmail.com
common.rs
in src
6 - 2023-02-15 2025-05-05 8 4 jdye64@gmail.com chenkovsky@qq.com
substrait.py
in python/datafusion
60 9 2024-05-14 2025-03-12 8 2 michael-j-ward@users.norepl... timsaucer@gmail.com
errors.rs
in src
66 5 2022-07-26 2025-02-20 8 5 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
config.rs
in src
25 1 2022-09-27 2025-02-20 7 7 me@francis.run kevinjqliu@users.noreply.gi...
common.py
in python/datafusion
36 - 2024-05-14 2025-05-05 7 3 michael-j-ward@users.norepl... chenkovsky@qq.com
sort.rs
in src/expr
66 9 2023-02-19 2025-02-20 7 4 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
join-datafusion.py
in benchmarks/db-benchmark
251 1 2023-05-04 2025-03-17 7 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
groupby-datafusion.py
in benchmarks/db-benchmark
474 2 2023-05-04 2025-03-17 7 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
udwf.rs
in src
43 2 2024-09-30 2025-02-20 6 3 timsaucer@gmail.com kevinjqliu@users.noreply.gi...
location.py
in python/datafusion/input
44 2 2024-05-14 2025-03-12 6 4 michael-j-ward@users.norepl... timsaucer@gmail.com
limit.rs
in src/expr
50 7 2023-02-19 2025-02-20 6 5 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
tpch.py
in benchmarks/tpch
52 1 2023-05-03 2025-03-17 6 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
schema.rs
in src/common
246 11 2023-05-30 2025-05-05 6 4 jdye64@gmail.com chenkovsky@qq.com
object_store.py
in python/datafusion
7 - 2024-05-14 2025-03-12 5 3 michael-j-ward@users.norepl... timsaucer@gmail.com
signature.rs
in src/expr
19 - 2023-02-23 2024-09-07 5 4 jdye64@gmail.com emgeee@users.noreply.github...
record_batch.py
in python/datafusion
26 7 2024-07-18 2025-03-12 5 2 timsaucer@gmail.com timsaucer@gmail.com
catalog.py
in python/datafusion
28 9 2024-07-18 2025-03-17 5 3 timsaucer@gmail.com 67336892+spaarsh@users.nore...
analyze.rs
in src/expr
51 8 2023-02-22 2025-02-20 5 3 jdye64@gmail.com kevinjqliu@users.noreply.gi...
aggregate_expr.rs
in src/expr
51 7 2023-02-20 2025-03-22 5 4 andygrove73@gmail.com timsaucer@gmail.com
filter.rs
in src/expr
53 8 2023-02-19 2025-02-20 5 4 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
Files With Most Contributors (Top 50)
Based on the number of unique email addresses found in commits.
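A minimal sketch of this counting rule, assuming commits are available as `(author_email, paths)` pairs. That input shape, the `contributors_per_file` name, and the example addresses are hypothetical and not taken from the report tool.

```python
from collections import defaultdict


def contributors_per_file(commits):
    """Count unique author email addresses per file.

    `commits` is an iterable of (author_email, paths) pairs. Addresses
    are compared as exact strings, so one person committing under two
    addresses is counted twice.
    """
    emails_by_file = defaultdict(set)
    for email, paths in commits:
        for path in paths:
            emails_by_file[path].add(email)
    return {path: len(emails) for path, emails in emails_by_file.items()}


# Hypothetical commits with placeholder addresses.
commits = [
    ("alice@example.org", ["src/lib.rs", "Cargo.toml"]),
    ("bob@example.org", ["src/lib.rs"]),
    ("alice@example.org", ["src/lib.rs"]),
]
counts = contributors_per_file(commits)
print(counts)
```

Since the count is keyed on email addresses rather than people, it can overstate the number of distinct contributors.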


Columns: File (name, then location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
context.rs
in src
929 28 2022-07-22 2025-03-13 63 23 agrove@apache.org 54253219+jsai28@users.norep...
functions.rs
in src
719 25 2022-07-22 2025-04-25 58 20 agrove@apache.org chenkovsky@qq.com
dataframe.rs
in src
619 52 2022-07-22 2025-05-05 54 18 agrove@apache.org kosiew@gmail.com
Cargo.toml
in root
57 - 2022-07-22 2025-04-24 77 15 agrove@apache.org timsaucer@gmail.com
pyproject.toml
in root
138 - 2022-07-22 2025-03-17 27 14 agrove@apache.org 67336892+spaarsh@users.nore...
lib.rs
in src
94 2 2022-07-22 2025-03-30 29 13 agrove@apache.org chenkovsky@qq.com
expr.rs
in src
756 19 2023-02-15 2025-05-05 48 10 jdye64@gmail.com chenkovsky@qq.com
functions.py
in python/datafusion
1099 243 2024-05-14 2025-04-25 30 9 michael-j-ward@users.norepl... chenkovsky@qq.com
expr.py
in python/datafusion
653 143 2024-05-14 2025-05-05 25 9 michael-j-ward@users.norepl... chenkovsky@qq.com
context.py
in python/datafusion
423 69 2024-07-18 2025-03-17 23 9 timsaucer@gmail.com 67336892+spaarsh@users.nore...
udaf.rs
in src
148 11 2022-07-22 2025-02-20 22 9 agrove@apache.org kevinjqliu@users.noreply.gi...
utils.rs
in src
63 - 2022-07-22 2025-03-22 12 8 agrove@apache.org timsaucer@gmail.com
logical.rs
in src/sql
190 8 2023-02-13 2025-05-05 20 7 jdye64@gmail.com chenkovsky@qq.com
dataset_exec.rs
in src
173 6 2022-07-26 2025-04-24 20 7 84413234+kylebrooks-8451@us... timsaucer@gmail.com
udf.rs
in src
78 5 2022-07-22 2025-02-20 18 7 agrove@apache.org kevinjqliu@users.noreply.gi...
__init__.py
in python/datafusion
76 4 2024-05-14 2025-04-27 17 7 michael-j-ward@users.norepl... 37878412+deanm0000@users.no...
literal.rs
in src/expr
105 3 2023-02-20 2025-02-20 9 7 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
config.rs
in src
25 1 2022-09-27 2025-02-20 7 7 me@francis.run kevinjqliu@users.noreply.gi...
data_type.rs
in src/common
731 7 2023-02-13 2025-02-01 26 6 jdye64@gmail.com timsaucer@gmail.com
dataframe.py
in python/datafusion
289 58 2024-07-18 2025-03-17 21 6 timsaucer@gmail.com 67336892+spaarsh@users.nore...
substrait.rs
in src
112 3 2023-01-20 2025-02-20 14 6 jdye64@gmail.com kevinjqliu@users.noreply.gi...
udf.py
in python/datafusion
294 34 2024-07-18 2025-03-15 12 6 timsaucer@gmail.com kosiew@gmail.com
pyarrow_filter_expression.rs
in src
143 2 2022-07-26 2025-02-20 11 6 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
catalog.rs
in src
102 9 2022-07-22 2025-02-01 10 6 agrove@apache.org timsaucer@gmail.com
aggregate.rs
in src/expr
115 11 2023-02-19 2025-03-22 9 6 andygrove73@gmail.com timsaucer@gmail.com
dataset.rs
in src
22 - 2022-07-26 2025-02-20 11 5 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
window.rs
in src/expr
239 10 2023-10-21 2025-03-22 10 5 jdye64@gmail.com timsaucer@gmail.com
table_scan.rs
in src/expr
106 11 2023-02-15 2025-02-20 10 5 jdye64@gmail.com kevinjqliu@users.noreply.gi...
physical_plan.rs
in src
46 - 2023-01-27 2025-02-20 10 5 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
errors.rs
in src
66 5 2022-07-26 2025-02-20 8 5 84413234+kylebrooks-8451@us... kevinjqliu@users.noreply.gi...
limit.rs
in src/expr
50 7 2023-02-19 2025-02-20 6 5 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
store.rs
in src
200 5 2022-11-08 2024-10-06 5 5 wseaton@users.noreply.githu... mesejoleon@gmail.com
projection.rs
in src/expr
81 9 2023-02-16 2025-02-20 9 4 jdye64@gmail.com kevinjqliu@users.noreply.gi...
generate-changelog.py
in dev/release
106 3 2023-05-23 2025-03-17 9 4 andygrove73@gmail.com 67336892+spaarsh@users.nore...
common.rs
in src
6 - 2023-02-15 2025-05-05 8 4 jdye64@gmail.com chenkovsky@qq.com
sort.rs
in src/expr
66 9 2023-02-19 2025-02-20 7 4 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
schema.rs
in src/common
246 11 2023-05-30 2025-05-05 6 4 jdye64@gmail.com chenkovsky@qq.com
location.py
in python/datafusion/input
44 2 2024-05-14 2025-03-12 6 4 michael-j-ward@users.norepl... timsaucer@gmail.com
record_batch.rs
in src
74 7 2023-02-16 2025-02-01 5 4 andygrove73@gmail.com timsaucer@gmail.com
aggregate_expr.rs
in src/expr
51 7 2023-02-20 2025-03-22 5 4 andygrove73@gmail.com timsaucer@gmail.com
join.rs
in src/expr
144 21 2023-03-01 2025-02-20 5 4 14581281+iajoiner@users.nor... kevinjqliu@users.noreply.gi...
signature.rs
in src/expr
19 - 2023-02-23 2024-09-07 5 4 jdye64@gmail.com emgeee@users.noreply.github...
filter.rs
in src/expr
53 8 2023-02-19 2025-02-20 5 4 andygrove73@gmail.com kevinjqliu@users.noreply.gi...
union.rs
in src/expr
56 8 2023-02-28 2025-02-20 4 4 14581281+iajoiner@users.nor... kevinjqliu@users.noreply.gi...
distinct.rs
in src/expr
62 7 2023-03-14 2025-02-20 4 4 jdye64@gmail.com kevinjqliu@users.noreply.gi...
subquery_alias.rs
in src/expr
55 9 2023-03-13 2025-02-20 4 4 jdye64@gmail.com kevinjqliu@users.noreply.gi...
groupby-datafusion.py
in benchmarks/db-benchmark
474 2 2023-05-04 2025-03-17 7 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
join-datafusion.py
in benchmarks/db-benchmark
251 1 2023-05-04 2025-03-17 7 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
common.py
in python/datafusion
36 - 2024-05-14 2025-05-05 7 3 michael-j-ward@users.norepl... chenkovsky@qq.com
tpch.py
in benchmarks/tpch
52 1 2023-05-03 2025-03-17 6 3 andygrove73@gmail.com 67336892+spaarsh@users.nore...
Files With Least Contributors (Top 50)
Based on the number of unique email addresses found in commits.


Columns: File (name, then location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
statement.rs
in src/expr
370 30 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
html_formatter.py
in python/datafusion
278 32 2025-04-21 2025-05-05 2 1 kosiew@gmail.com kosiew@gmail.com
138 6 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
138 6 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
create_index.rs
in src/expr
90 6 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
drop_catalog_schema.rs
in src/expr
80 11 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
recursive_query.rs
in src/expr
75 11 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
drop_view.rs
in src/expr
67 10 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
65 6 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
65 6 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
drop_function.rs
in src/expr
62 10 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
describe_table.rs
in src/expr
58 7 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
plan.py
in python/datafusion
50 18 2024-10-04 2025-03-12 4 1 timsaucer@gmail.com timsaucer@gmail.com
q2.sql
in benchmarks/tpch/queries
43 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
dialect.rs
in src/unparser
43 - 2025-03-30 2025-03-30 1 1 chenkovsky@qq.com chenkovsky@qq.com
q7.sql
in benchmarks/tpch/queries
39 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q21.sql
in benchmarks/tpch/queries
39 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q8.sql
in benchmarks/tpch/queries
37 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q20.sql
in benchmarks/tpch/queries
37 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q22.sql
in benchmarks/tpch/queries
37 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
mod.rs
in src/unparser
37 - 2025-03-30 2025-03-30 1 1 chenkovsky@qq.com chenkovsky@qq.com
q19.sql
in benchmarks/tpch/queries
35 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
copy_to.rs
in src/expr
34 4 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
function.rs
in src/common
33 - 2023-05-30 2023-05-30 1 1 jdye64@gmail.com jdye64@gmail.com
q9.sql
in benchmarks/tpch/queries
32 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q18.sql
in benchmarks/tpch/queries
32 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
unparser.py
in python/datafusion
32 9 2025-03-30 2025-03-30 1 1 chenkovsky@qq.com chenkovsky@qq.com
q15.sql
in benchmarks/tpch/queries
31 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q10.sql
in benchmarks/tpch/queries
31 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q16.sql
in benchmarks/tpch/queries
30 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
29 3 2023-02-23 2023-02-23 1 1 jdye64@gmail.com jdye64@gmail.com
q12.sql
in benchmarks/tpch/queries
28 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q11.sql
in benchmarks/tpch/queries
27 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
values.rs
in src/expr
27 3 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
27 - 2022-07-22 2022-07-22 1 1 agrove@apache.org agrove@apache.org
dml.rs
in src/expr
26 3 2025-05-05 2025-05-05 1 1 chenkovsky@qq.com chenkovsky@qq.com
q5.sql
in benchmarks/tpch/queries
24 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q3.sql
in benchmarks/tpch/queries
22 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q1.sql
in benchmarks/tpch/queries
21 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q4.sql
in benchmarks/tpch/queries
21 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q13.sql
in benchmarks/tpch/queries
20 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q17.sql
in benchmarks/tpch/queries
17 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
q14.sql
in benchmarks/tpch/queries
13 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
col.py
in python/datafusion
11 2 2025-04-27 2025-04-27 1 1 37878412+deanm0000@users.no... 37878412+deanm0000@users.no...
q6.sql
in benchmarks/tpch/queries
9 - 2023-05-03 2023-05-03 1 1 andygrove73@gmail.com andygrove73@gmail.com
build.rs
in root
3 1 2023-04-19 2023-04-19 1 1 jdyer@nvidia.com jdyer@nvidia.com
sql.rs
in src
2 - 2023-02-13 2023-02-13 1 1 jdye64@gmail.com jdye64@gmail.com
bool_expr.rs
in src/expr
264 20 2023-02-23 2024-09-07 2 2 jdye64@gmail.com emgeee@users.noreply.github...
like.rs
in src/expr
151 24 2023-02-23 2024-09-07 2 2 jdye64@gmail.com emgeee@users.noreply.github...
create_tables.sql
in benchmarks/tpch
133 - 2023-05-03 2024-10-04 2 2 andygrove73@gmail.com michael-j-ward@users.norepl...
Correlations

File Size vs. Number of Changes: 139 points

[Scatter plot: lines of code (x-axis, 0 to 1099) vs. # changes (y-axis, up to 77), one point per file]
# changes: min: 1.0 | average: 7.32 | 25th percentile: 1.0 | median: 3.0 | 75th percentile: 7.0 | max: 77.0
lines of code: min: 2.0 | average: 109.7 | 25th percentile: 29.0 | median: 50.0 | 75th percentile: 94.0 | max: 1099.0

Number of Contributors vs. Number of Changes: 139 points

[Scatter plot: # contributors (x-axis, 0 to 23) vs. # changes (y-axis, up to 77), one point per file]
# changes: min: 1.0 | average: 7.32 | 25th percentile: 1.0 | median: 3.0 | 75th percentile: 7.0 | max: 77.0
# contributors: min: 1.0 | average: 3.58 | 25th percentile: 1.0 | median: 3.0 | 75th percentile: 4.0 | max: 23.0
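The report shows this relationship only as a plot. One way to put a number on it is a Pearson correlation coefficient, sketched below. The `pearson` helper is hypothetical, and the sample uses only eight named files from the "Files With Most Contributors" table, so its result describes that small, top-heavy sample and not all 139 files.

```python
from math import sqrt

# (file, # contributors, # changes), copied from the contributors table.
SAMPLE = [
    ("Cargo.toml", 15, 77),
    ("src/lib.rs", 13, 29),
    ("src/expr.rs", 10, 48),
    ("python/datafusion/functions.py", 9, 30),
    ("python/datafusion/expr.py", 9, 25),
    ("python/datafusion/context.py", 9, 23),
    ("src/udaf.rs", 9, 22),
    ("src/utils.rs", 8, 12),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


contributors = [c for _, c, _ in SAMPLE]
changes = [d for _, _, d in SAMPLE]
r = pearson(contributors, changes)
print(f"r = {r:.2f}")
```

A value near 1 would suggest that files touched by more people also change on more days. To judge the relationship for the whole repository, the full per-file data would be needed.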

Number of Contributors vs. File Size: 139 points

[Scatter plot: # contributors (x-axis, 0 to 23) vs. lines of code (y-axis, up to 1099), one point per file]
lines of code: min: 2.0 | average: 109.7 | 25th percentile: 29.0 | median: 50.0 | 75th percentile: 94.0 | max: 1099.0
# contributors: min: 1.0 | average: 3.58 | 25th percentile: 1.0 | median: 3.0 | 75th percentile: 4.0 | max: 23.0