apache / spark
File Change Frequency

File change frequency (churn) shows how often files are updated. An update is counted as a day with at least one commit touching the file.
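As an illustration of this definition, the sketch below counts distinct commit days per file. It is a minimal example, not the report generator's actual implementation: the function name `change_days` and the `@`-prefixed date marker are assumptions, and the input is assumed to look like the output of `git log --name-only --date=short --format="@%ad"`.

```python
from collections import defaultdict


def change_days(log_text: str) -> dict[str, int]:
    """Count, per file, the number of distinct days with at least one commit.

    Assumed input shape: one "@YYYY-MM-DD" line per commit,
    followed by the paths that commit touched.
    """
    days_by_file: dict[str, set[str]] = defaultdict(set)
    current_day = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between commits
        if line.startswith("@"):
            current_day = line[1:]  # start of a new commit
        elif current_day is not None:
            days_by_file[line].add(current_day)  # a set ignores repeated days
    return {path: len(days) for path, days in days_by_file.items()}


# Hypothetical sample: two commits on the same day, one on another day.
sample = """\
@2025-05-06
src/Foo.scala
src/Bar.scala

@2025-05-06
src/Foo.scala

@2025-05-05
src/Foo.scala
"""
counts = change_days(sample)
```

Because days are stored in a set, several commits to the same file on one day count as a single change.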

Overview
File Change Frequency Overall
  • There are 4,066 files with 637,106 lines of code.
    • 113 files changed more than 100 times (126,025 lines of code)
    • 242 files changed 51-100 times (110,988 lines of code)
    • 629 files changed 21-50 times (145,487 lines of code)
    • 1,279 files changed 6-20 times (158,399 lines of code)
    • 1,803 files changed 1-5 times (96,207 lines of code)
Share of lines of code per bucket: 101+: 19% | 51-100: 17% | 21-50: 22% | 6-20: 24% | 1-5: 15%

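The percentages in the overview can be reproduced from the listed line counts. The sketch below assigns a change count to a bucket and computes each bucket's share of the total lines of code. The names `bucket`, `LOC_PER_BUCKET`, and `shares` are illustrative. Rounding down is an assumption, but it matches the figures shown and would explain why they sum to 97% rather than 100%.

```python
def bucket(changes: int) -> str:
    """Map a file's change count to the report's frequency bucket."""
    if changes > 100:
        return "101+"
    if changes > 50:
        return "51-100"
    if changes > 20:
        return "21-50"
    if changes > 5:
        return "6-20"
    return "1-5"


# Lines of code per bucket, copied from the overview list.
LOC_PER_BUCKET = {
    "101+": 126_025,
    "51-100": 110_988,
    "21-50": 145_487,
    "6-20": 158_399,
    "1-5": 96_207,
}

total_loc = sum(LOC_PER_BUCKET.values())

# Integer division truncates each share to a whole percent (assumed rounding).
shares = {name: loc * 100 // total_loc for name, loc in LOC_PER_BUCKET.items()}
```

The same arithmetic applied to the contributor buckets gives the 35% | 26% | 15% | 17% | 4% split.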
Contributors Count Frequency Overall
  • There are 4,066 files with 637,106 lines of code.
    • 437 files changed by more than 25 contributors (228,135 lines of code)
    • 779 files changed by 11-25 contributors (168,471 lines of code)
    • 702 files changed by 6-10 contributors (97,250 lines of code)
    • 1,417 files changed by 2-5 contributors (112,557 lines of code)
    • 731 files changed by 1 contributor (30,693 lines of code)
Share of lines of code per bucket: 26+: 35% | 11-25: 26% | 6-10: 15% | 2-5: 17% | 1: 4%

File Change Frequency per File Extension
scala, q, py, java, txt, json, sql, md, xml, r, yaml, rst, sh, js, properties, proto, css, pyi, cmd, html, ipynb, xsd, orc, gitignore, avsc, rb, bat, cfg, g4, ini, sbt, in, thrift, c, toml, rmd, ps1, gitattributes, bash
The number of recorded file updates
Extension | 101+ | 51-100 | 21-50 | 6-20 | 1-5
scala | 22% | 16% | 22% | 23% | 15%
py | 24% | 23% | 27% | 18% | 6%
sbt | 100% | 0% | 0% | 0% | 0%
pyi | 0% | 55% | 13% | 29% | 1%
java | 0% | 6% | 22% | 40% | 29%
g4 | 0% | 77% | 22% | 0% | 0%
css | 0% | 44% | 0% | 39% | 15%
xml | 0% | 96% | 0% | 0% | 3%
js | 0% | 0% | 54% | 40% | 5%
proto | 0% | 0% | 16% | 38% | 45%
html | 0% | 0% | 0% | 46% | 53%
bash | 0% | 0% | 0% | 100% | 0%
toml | 0% | 0% | 0% | 100% | 0%
in | 0% | 0% | 0% | 100% | 0%
yaml | 0% | 0% | 0% | 0% | 100%
ps1 | 0% | 0% | 0% | 0% | 100%
c | 0% | 0% | 0% | 0% | 100%
cfg | 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary (file change frequency)
The number of recorded file updates
Component | 101+ | 51-100 | 21-50 | 6-20 | 1-5
sql | 21% | 13% | 18% | 24% | 20%
core | 28% | 16% | 24% | 22% | 7%
python | 20% | 28% | 26% | 20% | 5%
mllib | 7% | 29% | 33% | 24% | 4%
dev | 52% | 14% | 5% | 16% | 10%
project | 96% | 0% | 3% | 0% | 0%
resource-managers | 11% | 19% | 33% | 23% | 12%
streaming | 7% | 21% | 36% | 26% | 8%
common | 0% | 4% | 28% | 37% | 28%
launcher | 0% | 27% | 31% | 33% | 8%
ROOT | 0% | 100% | 0% | 0% | 0%
repl | 0% | 50% | 41% | 0% | 8%
mllib-local | 0% | 0% | 64% | 35% | 0%
graphx | 0% | 0% | 38% | 54% | 6%
tools | 0% | 0% | 100% | 0% | 0%
connector | 0% | 0% | 0% | 56% | 43%
build | 0% | 0% | 0% | 78% | 21%
licenses-binary | 0% | 0% | 0% | 0% | 100%
hadoop-cloud | 0% | 0% | 0% | 0% | 100%
connect-examples | 0% | 0% | 0% | 0% | 100%
R | 0% | 0% | 0% | 0% | 100%
Most Frequently Changed Files (Top 50)


Each entry lists the file name, its folder, and then: # lines, # units, created, last modified, # changes (days), # contributors, first contributor, latest contributor.
SparkBuild.scala
in project
1460 12 2011-07-15 2025-05-06 818 222 ismael@juma.me.uk wenchen@databricks.com
Analyzer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
2889 137 2014-03-21 2025-05-06 742 196 michael@databricks.com mihailo.aleksic@databricks.com
SQLConf.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/internal
5815 42 2017-03-14 2025-05-06 729 225 rxin@databricks.com gurwls223@apache.org
AstBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser
4715 310 2016-03-31 2025-04-24 543 150 hvanhovell@questtec.nl wenghy02@gmail.com
SparkContext.scala
in core/src/main/scala/org/apache/spark
1923 124 2013-05-12 2025-02-24 534 237 kayo@yahoo-inc.com sririshindra@gmail.com
Optimizer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer
1683 92 2014-03-21 2025-04-29 486 148 michael@databricks.com pavle.martinovic@databricks...
202 3 2014-06-02 2025-03-19 462 174 pwendell@gmail.com ruifengz@apache.org
Utils.scala
in core/src/main/scala/org/apache/spark/util
2154 163 2013-09-01 2025-04-04 456 205 matei@eecs.berkeley.edu vinod.kc.in@gmail.com
dataframe.py
in python/pyspark/sql
851 173 2015-02-10 2025-04-10 383 142 michael@databricks.com ruifengz@apache.org
CheckAnalysis.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
920 23 2015-02-24 2025-04-30 356 111 michael@databricks.com gengliang@apache.org
DAGScheduler.scala
in core/src/main/scala/org/apache/spark/scheduler
2113 101 2013-05-12 2025-01-07 354 155 kayo@yahoo-inc.com m.zhang@databricks.com
FunctionRegistry.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
961 36 2014-03-21 2025-04-17 353 143 michael@databricks.com roreeves@linkedin.com
SparkStrategies.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution
806 25 2014-03-21 2025-04-18 345 115 michael@databricks.com kabhwan.opensource@gmail.com
QueryCompilationErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
3781 539 2021-01-16 2025-04-09 310 96 yuminkim@gmail.com chenhao.li@databricks.com
modules.py
in dev/sparktestsupport
1409 7 2015-06-28 2025-04-29 289 85 joshrosen@databricks.com yangjie01@baidu.com
package.scala
in core/src/main/scala/org/apache/spark/internal/config
2456 - 2016-03-07 2025-04-25 285 124 vanzin@cloudera.com yao@apache.org
QueryExecutionErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
2609 354 2021-01-16 2025-04-27 284 109 yuminkim@gmail.com nija@databricks.com
BlockManager.scala
in core/src/main/scala/org/apache/spark/storage
1544 83 2013-05-12 2024-11-15 283 130 kayo@yahoo-inc.com xumovens@gmail.com
Executor.scala
in core/src/main/scala/org/apache/spark/executor
952 28 2013-05-12 2025-04-13 278 138 kayo@yahoo-inc.com roreeves@linkedin.com
Cast.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1905 81 2014-03-21 2025-03-24 267 90 michael@databricks.com harsh.motwani@databricks.com
basicLogicalOperators.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical
1503 96 2016-04-23 2025-04-29 264 103 rxin@databricks.com a.guo@databricks.com
RDD.scala
in core/src/main/scala/org/apache/spark/rdd
1080 101 2013-09-01 2024-12-04 250 137 matei@eecs.berkeley.edu 1754789345@qq.com
CodeGenerator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen
1210 69 2014-07-30 2025-03-20 247 80 michael@databricks.com chengpan@apache.org
SparkSqlParser.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution
1040 46 2016-03-28 2025-04-24 234 94 hvanhovell@questtec.nl wenghy02@gmail.com
SparkSubmit.scala
in core/src/main/scala/org/apache/spark/deploy
883 22 2014-03-29 2025-02-21 226 110 sandy@cloudera.com gurwls223@apache.org
SessionCatalog.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog
1490 140 2016-03-17 2025-03-14 224 83 andrew@databricks.com allison.wang@databricks.com
HiveMetastoreCatalog.scala
in sql/hive/src/main/scala/org/apache/spark/sql/hive
327 10 2014-03-21 2025-01-24 223 79 michael@databricks.com herman@databricks.com
DataSourceV2Strategy.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2
551 13 2017-09-15 2025-04-15 216 63 wenchen@databricks.com gengliang@apache.org
CoarseGrainedSchedulerBackend.scala
in core/src/main/scala/org/apache/spark/scheduler/cluster
722 40 2013-09-07 2024-07-11 215 114 alig@cs.berkeley.edu rui.wang@databricks.com
Master.scala
in core/src/main/scala/org/apache/spark/deploy/master
1095 40 2013-09-01 2025-02-12 213 98 matei@eecs.berkeley.edu dongjoon@apache.org
HiveClientImpl.scala
in sql/hive/src/main/scala/org/apache/spark/sql/hive/client
1102 70 2016-01-30 2025-03-10 210 80 rxin@databricks.com yao@apache.org
worker.py
in python/pyspark
1728 32 2013-01-01 2025-05-06 208 77 joshrosen@eecs.berkeley.edu gurwls223@apache.org
collectionOperations.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
4533 177 2015-07-21 2024-12-28 203 80 ski.rodriguez@gmail.com yangjie01@baidu.com
tables.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/command
992 49 2016-04-13 2025-01-24 202 74 andrew@databricks.com herman@databricks.com
datetimeExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
3066 150 2015-08-20 2025-04-19 201 75 rxin@databricks.com vinod.kc.in@gmail.com
TaskSetManager.scala
in core/src/main/scala/org/apache/spark/scheduler
1010 43 2013-09-25 2025-01-07 200 113 kayousterhout@gmail.com m.zhang@databricks.com
session.py
in python/pyspark/sql
921 78 2016-04-28 2025-02-20 196 64 andrew@databricks.com gurwls223@apache.org
dataframe.py
in python/pyspark/sql/connect
1945 163 2022-10-06 2025-02-27 194 25 gurwls223@apache.org wenghy02@gmail.com
stringExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
2968 154 2015-08-20 2025-02-13 191 81 tarek.auel@googlemail.com dejan.krakovic@databricks.com
TaskSchedulerImpl.scala
in core/src/main/scala/org/apache/spark/scheduler
886 54 2013-12-20 2025-04-25 190 104 kayousterhout@gmail.com yao@apache.org
DataSourceStrategy.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources
613 26 2015-07-21 2025-04-15 189 82 rxin@databricks.com amanda.liu@databricks.com
readwriter.py
in python/pyspark/sql
860 64 2015-05-19 2025-01-09 189 84 davies@databricks.com gurwls223@apache.org
Expression.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
827 48 2014-03-21 2025-05-07 187 76 michael@databricks.com buyingyi@gmail.com
SparkEnv.scala
in core/src/main/scala/org/apache/spark
414 15 2013-09-01 2025-02-07 186 96 matei@eecs.berkeley.edu ueshin@databricks.com
types.py
in python/pyspark/sql
1984 215 2015-02-10 2025-02-20 186 69 davies@databricks.com ruifengz@apache.org
ResolveSessionCatalog.scala
in sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis
613 16 2019-10-04 2025-02-13 183 53 wenchen@databricks.com dejan.krakovic@databricks.com
classification.py
in python/pyspark/ml
2173 245 2015-01-29 2025-04-28 183 51 meng@databricks.com weichen.xu@databricks.com
StagePage.scala
in core/src/main/scala/org/apache/spark/ui/jobs
471 10 2013-08-15 2024-02-23 180 88 pwendell@gmail.com hiufkwok@gmail.com
Worker.scala
in core/src/main/scala/org/apache/spark/deploy/worker
801 30 2013-09-01 2024-08-21 178 90 matei@eecs.berkeley.edu dongjoon@apache.org
DateTimeUtils.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util
432 45 2015-06-23 2025-04-19 174 69 davies@databricks.com vinod.kc.in@gmail.com
Files With Most Contributors (Top 50)
Based on the number of unique email addresses found in commits.
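A minimal sketch of this counting rule follows. The function name `contributors_per_file` and the `(email, paths)` input shape are assumptions for illustration, and lower-casing the address before comparison is an added assumption, not something the report states.

```python
from collections import defaultdict


def contributors_per_file(commits):
    """Count unique author email addresses per file.

    `commits` is an iterable of (author_email, [paths]) pairs (assumed shape).
    """
    emails_by_file = defaultdict(set)
    for email, paths in commits:
        key = email.strip().lower()  # assumption: compare case-insensitively
        for path in paths:
            emails_by_file[path].add(key)
    return {path: len(emails) for path, emails in emails_by_file.items()}


# Hypothetical commits: two authors touch Foo.scala, one touches Bar.scala.
example = [
    ("alice@example.com", ["src/Foo.scala", "src/Bar.scala"]),
    ("bob@example.com", ["src/Foo.scala"]),
    ("Alice@example.com", ["src/Foo.scala"]),
]
contributor_counts = contributors_per_file(example)
```

Since the key is the email address, a person who commits under two different addresses is counted as two contributors.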


Each entry lists the file name, its folder, and then: # lines, # units, created, last modified, # changes (days), # contributors, first contributor, latest contributor.
SparkContext.scala
in core/src/main/scala/org/apache/spark
1923 124 2013-05-12 2025-02-24 534 237 kayo@yahoo-inc.com sririshindra@gmail.com
SQLConf.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/internal
5815 42 2017-03-14 2025-05-06 729 225 rxin@databricks.com gurwls223@apache.org
SparkBuild.scala
in project
1460 12 2011-07-15 2025-05-06 818 222 ismael@juma.me.uk wenchen@databricks.com
Utils.scala
in core/src/main/scala/org/apache/spark/util
2154 163 2013-09-01 2025-04-04 456 205 matei@eecs.berkeley.edu vinod.kc.in@gmail.com
Analyzer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
2889 137 2014-03-21 2025-05-06 742 196 michael@databricks.com mihailo.aleksic@databricks.com
202 3 2014-06-02 2025-03-19 462 174 pwendell@gmail.com ruifengz@apache.org
DAGScheduler.scala
in core/src/main/scala/org/apache/spark/scheduler
2113 101 2013-05-12 2025-01-07 354 155 kayo@yahoo-inc.com m.zhang@databricks.com
AstBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser
4715 310 2016-03-31 2025-04-24 543 150 hvanhovell@questtec.nl wenghy02@gmail.com
Optimizer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer
1683 92 2014-03-21 2025-04-29 486 148 michael@databricks.com pavle.martinovic@databricks...
FunctionRegistry.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
961 36 2014-03-21 2025-04-17 353 143 michael@databricks.com roreeves@linkedin.com
dataframe.py
in python/pyspark/sql
851 173 2015-02-10 2025-04-10 383 142 michael@databricks.com ruifengz@apache.org
Executor.scala
in core/src/main/scala/org/apache/spark/executor
952 28 2013-05-12 2025-04-13 278 138 kayo@yahoo-inc.com roreeves@linkedin.com
RDD.scala
in core/src/main/scala/org/apache/spark/rdd
1080 101 2013-09-01 2024-12-04 250 137 matei@eecs.berkeley.edu 1754789345@qq.com
BlockManager.scala
in core/src/main/scala/org/apache/spark/storage
1544 83 2013-05-12 2024-11-15 283 130 kayo@yahoo-inc.com xumovens@gmail.com
package.scala
in core/src/main/scala/org/apache/spark/internal/config
2456 - 2016-03-07 2025-04-25 285 124 vanzin@cloudera.com yao@apache.org
SparkStrategies.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution
806 25 2014-03-21 2025-04-18 345 115 michael@databricks.com kabhwan.opensource@gmail.com
CoarseGrainedSchedulerBackend.scala
in core/src/main/scala/org/apache/spark/scheduler/cluster
722 40 2013-09-07 2024-07-11 215 114 alig@cs.berkeley.edu rui.wang@databricks.com
TaskSetManager.scala
in core/src/main/scala/org/apache/spark/scheduler
1010 43 2013-09-25 2025-01-07 200 113 kayousterhout@gmail.com m.zhang@databricks.com
CheckAnalysis.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
920 23 2015-02-24 2025-04-30 356 111 michael@databricks.com gengliang@apache.org
SparkSubmit.scala
in core/src/main/scala/org/apache/spark/deploy
883 22 2014-03-29 2025-02-21 226 110 sandy@cloudera.com gurwls223@apache.org
QueryExecutionErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
2609 354 2021-01-16 2025-04-27 284 109 yuminkim@gmail.com nija@databricks.com
TaskSchedulerImpl.scala
in core/src/main/scala/org/apache/spark/scheduler
886 54 2013-12-20 2025-04-25 190 104 kayousterhout@gmail.com yao@apache.org
basicLogicalOperators.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical
1503 96 2016-04-23 2025-04-29 264 103 rxin@databricks.com a.guo@databricks.com
Master.scala
in core/src/main/scala/org/apache/spark/deploy/master
1095 40 2013-09-01 2025-02-12 213 98 matei@eecs.berkeley.edu dongjoon@apache.org
QueryCompilationErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
3781 539 2021-01-16 2025-04-09 310 96 yuminkim@gmail.com chenhao.li@databricks.com
SparkEnv.scala
in core/src/main/scala/org/apache/spark
414 15 2013-09-01 2025-02-07 186 96 matei@eecs.berkeley.edu ueshin@databricks.com
CoarseGrainedExecutorBackend.scala
in core/src/main/scala/org/apache/spark/executor
486 14 2013-09-07 2025-01-24 172 95 alig@cs.berkeley.edu yangjie01@baidu.com
SparkSqlParser.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution
1040 46 2016-03-28 2025-04-24 234 94 hvanhovell@questtec.nl wenghy02@gmail.com
Cast.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1905 81 2014-03-21 2025-03-24 267 90 michael@databricks.com harsh.motwani@databricks.com
Worker.scala
in core/src/main/scala/org/apache/spark/deploy/worker
801 30 2013-09-01 2024-08-21 178 90 matei@eecs.berkeley.edu dongjoon@apache.org
StagePage.scala
in core/src/main/scala/org/apache/spark/ui/jobs
471 10 2013-08-15 2024-02-23 180 88 pwendell@gmail.com hiufkwok@gmail.com
FsHistoryProvider.scala
in core/src/main/scala/org/apache/spark/deploy/history
1222 57 2014-06-23 2025-02-12 153 87 vanzin@cloudera.com cnauroth@apache.org
SparkConf.scala
in core/src/main/scala/org/apache/spark
505 54 2013-12-24 2025-01-16 151 86 prashant.s@imaginea.com bo.zhang@databricks.com
modules.py
in dev/sparktestsupport
1409 7 2015-06-28 2025-04-29 289 85 joshrosen@databricks.com yangjie01@baidu.com
MapOutputTracker.scala
in core/src/main/scala/org/apache/spark
1131 76 2013-09-01 2024-05-24 146 85 matei@eecs.berkeley.edu yi.wu@databricks.com
readwriter.py
in python/pyspark/sql
860 64 2015-05-19 2025-01-09 189 84 davies@databricks.com gurwls223@apache.org
SessionCatalog.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog
1490 140 2016-03-17 2025-03-14 224 83 andrew@databricks.com allison.wang@databricks.com
DataSourceStrategy.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources
613 26 2015-07-21 2025-04-15 189 82 rxin@databricks.com amanda.liu@databricks.com
stringExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
2968 154 2015-08-20 2025-02-13 191 81 tarek.auel@googlemail.com dejan.krakovic@databricks.com
CodeGenerator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen
1210 69 2014-07-30 2025-03-20 247 80 michael@databricks.com chengpan@apache.org
HiveClientImpl.scala
in sql/hive/src/main/scala/org/apache/spark/sql/hive/client
1102 70 2016-01-30 2025-03-10 210 80 rxin@databricks.com yao@apache.org
collectionOperations.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
4533 177 2015-07-21 2024-12-28 203 80 ski.rodriguez@gmail.com yangjie01@baidu.com
HiveMetastoreCatalog.scala
in sql/hive/src/main/scala/org/apache/spark/sql/hive
327 10 2014-03-21 2025-01-24 223 79 michael@databricks.com herman@databricks.com
PairRDDFunctions.scala
in core/src/main/scala/org/apache/spark/rdd
541 67 2013-09-01 2024-05-04 149 79 matei@eecs.berkeley.edu daniel.tenedorio@databricks...
PythonRDD.scala
in core/src/main/scala/org/apache/spark/api/python
687 51 2013-09-01 2025-04-22 173 78 matei@eecs.berkeley.edu gurwls223@apache.org
KryoSerializer.scala
in core/src/main/scala/org/apache/spark/serializer
575 30 2013-09-01 2025-04-15 126 78 matei@eecs.berkeley.edu yao@apache.org
UIUtils.scala
in core/src/main/scala/org/apache/spark/ui
589 27 2013-09-01 2025-03-19 118 78 matei@eecs.berkeley.edu yao@apache.org
worker.py
in python/pyspark
1728 32 2013-01-01 2025-05-06 208 77 joshrosen@eecs.berkeley.edu gurwls223@apache.org
Expression.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
827 48 2014-03-21 2025-05-07 187 76 michael@databricks.com buyingyi@gmail.com
unresolved.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
641 40 2014-03-21 2025-05-07 148 76 michael@databricks.com buyingyi@gmail.com
Files With Fewest Contributors (Top 50)
Based on the number of unique email addresses found in commits.


Each entry lists the file name, its folder, and then: # lines, # units, created, last modified, # changes (days), # contributors, first contributor, latest contributor.
LazyTry.scala
in core/src/main/scala/org/apache/spark/util
9 1
ExpressionResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
527 28 2024-12-18 2025-05-05 6 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
KeyValueGroupedDataset.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
452 15 2025-01-24 2025-01-24 1 1 herman@databricks.com herman@databricks.com
SparkConnectServerPage.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ui
428 12 2024-08-02 2024-08-02 1 1 gurwls223@apache.org gurwls223@apache.org
NameScope.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
406 19 2024-12-18 2025-05-05 4 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
V2ExpressionBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util
402 9 2025-02-20 2025-02-20 1 1 gengliang@apache.org gengliang@apache.org
LICENSE-javassist.html
in licenses-binary
369 - 2018-07-01 2018-07-01 1 1 srowen@gmail.com srowen@gmail.com
GcmTransportCipher.java
in common/network-common/src/main/java/org/apache/spark/network/crypto
332 20 2024-06-21 2024-06-21 1 1 steve.weis@databricks.com steve.weis@databricks.com
LiteralValueProtoConverter.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect/common
316 9 2024-08-02 2024-08-02 1 1 gurwls223@apache.org gurwls223@apache.org
SparkSession.scala
in sql/api/src/main/scala/org/apache/spark/sql
313 19 2025-01-24 2025-02-06 2 1 herman@databricks.com herman@databricks.com
Catalog.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect
304 41 2025-01-24 2025-01-24 1 1 herman@databricks.com herman@databricks.com
SQLContext.scala
in sql/api/src/main/scala/org/apache/spark/sql
296 58 2025-01-24 2025-02-17 3 1 herman@databricks.com herman@databricks.com
TransformWithStateInPySparkPythonRunner.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
291 11 2025-04-18 2025-04-18 1 1 kabhwan.opensource@gmail.com kabhwan.opensource@gmail.com
CtrTransportCipher.java
in common/network-common/src/main/java/org/apache/spark/network/crypto
284 22 2024-06-21 2024-06-21 1 1 steve.weis@databricks.com steve.weis@databricks.com
SetOperationLikeResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
283 16 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
SortMergeJoinEvaluatorFactory.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/joins
259 1 2023-07-12 2023-07-12 1 1 vinod.kc.in@gmail.com vinod.kc.in@gmail.com
ResolutionValidator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
239 24 2024-12-18 2025-04-08 3 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
SQLContext.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
232 17 2025-01-24 2025-02-17 2 1 herman@databricks.com herman@databricks.com
KeyValueGroupedDataset.scala
in sql/api/src/main/scala/org/apache/spark/sql
228 19 2025-01-24 2025-01-24 1 1 herman@databricks.com herman@databricks.com
avroSqlFunctions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
221 5 2024-12-06 2024-12-06 1 1 panbingkun@apache.org panbingkun@apache.org
VariantShreddingWriter.java
in common/variant/src/main/java/org/apache/spark/types/variant
221 3 2024-11-13 2025-02-28 3 1 david.cashman@databricks.com david.cashman@databricks.com
ArrowVectorReader.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow
219 20 2024-08-02 2025-02-05 2 1 gurwls223@apache.org gurwls223@apache.org
HiveFunctionRegistryUtils.java
in sql/hive/src/main/java/org/apache/hadoop/hive/ql/exec
218 5 2025-03-12 2025-03-12 1 1 chengpan@apache.org chengpan@apache.org
SortResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
211 7 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
catalog.proto
in sql/connect/common/src/main/protobuf/spark/connect
211 - 2024-08-02 2024-08-02 1 1 gurwls223@apache.org gurwls223@apache.org
ExpressionIdAssigner.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
210 11 2025-02-04 2025-04-08 2 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ExecutePlanResponseReattachableIterator.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client
210 10 2024-08-02 2024-08-02 1 1 gurwls223@apache.org gurwls223@apache.org
tblib.py
in python/pyspark/errors/exceptions
200 12 2025-03-27 2025-03-27 1 1 wenghy02@gmail.com wenghy02@gmail.com
RelationResolution.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
196 6 2024-10-15 2024-10-15 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
AggregateResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
192 7 2025-04-08 2025-05-05 2 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
progress.scala
in sql/api/src/main/scala/org/apache/spark/sql/streaming
192 10 2024-09-10 2024-09-10 1 1 herman@databricks.com herman@databricks.com
DataFrameNaFunctions.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
185 14 2025-01-24 2025-01-24 1 1 herman@databricks.com herman@databricks.com
JoinResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
180 7 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ConfigEntry.scala
in common/utils/src/main/scala/org/apache/spark/internal/config
176 8 2024-12-04 2024-12-04 1 1 herman@databricks.com herman@databricks.com
FlatMapGroupsInPandasWithStateExec.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
171 1 2025-02-11 2025-02-11 1 1 bo.gao@databricks.com bo.gao@databricks.com
SparkAsyncProfiler.scala
in connector/profiler/src/main/scala/org/apache/spark/profiler
164 7 2025-01-15 2025-01-16 2 1 chengpan@apache.org chengpan@apache.org
SQLContext.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect
162 11 2025-01-24 2025-02-17 2 1 herman@databricks.com herman@databricks.com
AggregateExpressionResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
154 6 2025-02-04 2025-04-08 2 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ExpressionResolutionValidator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
150 16 2024-12-18 2025-04-08 3 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ConnectRepl.scala
in sql/connect/client/jvm/src/main/scala/org/apache/spark/sql/application
145 2 2025-01-31 2025-01-31 1 1 herman@databricks.com herman@databricks.com
StreamingQueryListener.scala
in sql/api/src/main/scala/org/apache/spark/sql/streaming
142 14 2024-09-10 2025-01-24 2 1 herman@databricks.com herman@databricks.com
UDFRegistration.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
142 5 2025-01-24 2025-01-24 1 1 herman@databricks.com herman@databricks.com
StreamingConf.scala
in streaming/src/main/scala/org/apache/spark/streaming
140 - 2020-03-10 2020-03-23 2 1 beliefer@163.com beliefer@163.com
ShowTablesExtendedExec.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2
139 4 2023-11-16 2023-11-16 1 1 pbk1982@gmail.com pbk1982@gmail.com
IndentingXMLStreamWriter.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml
138 34 2024-08-13 2024-08-13 1 1 alden.lau@databricks.com alden.lau@databricks.com
GrpcRetryHandler.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client
137 10 2024-08-02 2024-08-02 1 1 gurwls223@apache.org gurwls223@apache.org
progress.py
in python/pyspark/sql/connect/shell
136 11 2024-04-04 2024-04-09 2 1 martin.grund@databricks.com martin.grund@databricks.com
AvroFileFormat.scala
in sql/core/src/main/scala/org/apache/spark/sql/avro
135 6 2024-11-05 2024-11-05 1 1 eric.marnadi@databricks.com eric.marnadi@databricks.com
ArrayExpressionUtils.java
in sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions
134 15 2024-09-03 2024-10-29 2 1 panbingkun@baidu.com panbingkun@baidu.com
interface.scala
in sql/api/src/main/scala/org/apache/spark/sql/catalog
134 4 2024-09-06 2024-09-06 1 1 herman@databricks.com herman@databricks.com
Correlations

File Size vs. Number of Changes: 4071 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 162 lines of code y: 33 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 261 lines of code y: 39 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 100 lines of code y: 11 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 641 lines of code y: 148 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 827 lines of code y: 187 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 424 lines of code y: 132 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 847 lines of code y: 145 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 38 lines of code y: 3 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 150 lines of code y: 69 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 135 lines of code y: 3 # changes mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 209 lines of code y: 67 # changes mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala x: 225 lines of code y: 24 # changes mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 270 lines of code y: 71 # changes mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 312 lines of code y: 54 # changes mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 929 lines of code y: 170 # changes mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 255 lines of code y: 54 # changes 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 346 lines of code y: 64 # changes mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 201 lines of code y: 48 # changes mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 455 lines of code y: 67 # changes mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 518 lines of code y: 70 # changes mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 237 lines of code y: 67 # changes mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 967 lines of code y: 80 # changes mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 594 lines of code y: 136 # changes mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 219 lines of code y: 58 # changes mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala x: 351 lines of code y: 30 # changes mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala x: 827 lines of code y: 55 # changes mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 331 lines of code y: 37 # changes mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 21 lines of code y: 4 # changes python/pyspark/testing/connectutils.py x: 177 lines of code y: 49 # changes sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 325 lines of code y: 9 # changes sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 187 lines of code y: 8 # changes sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLHandler.scala x: 323 lines of code y: 16 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/LiteralFunctionResolution.scala x: 23 lines of code y: 3 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 179 lines of code 
y: 52 # changes sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectAnalyzeHandler.scala x: 186 lines of code y: 4 # changes core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 79 lines of code y: 6 # changes python/pyspark/errors/exceptions/captured.py x: 284 lines of code y: 22 # changes python/pyspark/pandas/config.py x: 363 lines of code y: 28 # changes python/pyspark/pandas/groupby.py x: 1800 lines of code y: 97 # changes python/pyspark/pandas/namespace.py x: 1517 lines of code y: 82 # changes python/pyspark/pandas/series.py x: 2215 lines of code y: 118 # changes python/pyspark/pandas/utils.py x: 657 lines of code y: 49 # changes python/pyspark/testing/pandasutils.py x: 486 lines of code y: 28 # changes python/pyspark/testing/utils.py x: 560 lines of code y: 59 # changes python/pyspark/sql/conversion.py x: 415 lines of code y: 2 # changes python/pyspark/sql/pandas/serializers.py x: 884 lines of code y: 52 # changes python/pyspark/worker.py x: 1728 lines of code y: 208 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 5815 lines of code y: 729 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala x: 97 lines of code y: 41 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraints.scala x: 174 lines of code y: 4 # changes python/pyspark/sql/datasource.py x: 256 lines of code y: 25 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 2889 lines of code y: 742 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 617 lines of code y: 172 # changes common/utils/src/main/scala/org/apache/spark/util/SparkStringUtils.scala x: 8 lines of code y: 2 # changes common/utils/src/main/scala/org/apache/spark/util/SparkTestUtils.scala x: 69 lines of code y: 2 # changes 
connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala x: 105 lines of code y: 12 # changes
connector/avro/src/main/scala/org/apache/spark/sql/avro/CatalystDataToAvro.scala x: 39 lines of code y: 4 # changes
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 58 lines of code y: 4 # changes
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenSparkConf.scala x: 87 lines of code y: 2 # changes
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala x: 230 lines of code y: 5 # changes
project/SparkBuild.scala x: 1460 lines of code y: 818 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 474 lines of code y: 107 # changes
python/pyspark/sql/pandas/types.py x: 920 lines of code y: 40 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateResolver.scala x: 192 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 527 lines of code y: 6 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/FunctionResolver.scala x: 102 lines of code y: 4 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 406 lines of code y: 4 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala x: 514 lines of code y: 24 # changes
mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala x: 436 lines of code y: 58 # changes
mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 533 lines of code y: 60 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala x: 150 lines of code y: 20 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 108 lines of code y: 41 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala x: 248 lines of code y: 41 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/IDF.scala x: 157 lines of code y: 35 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 122 lines of code y: 22 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 161 lines of code y: 21 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MinMaxScaler.scala x: 161 lines of code y: 39 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 378 lines of code y: 41 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala x: 133 lines of code y: 38 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala x: 384 lines of code y: 62 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/RobustScaler.scala x: 180 lines of code y: 19 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/StandardScaler.scala x: 217 lines of code y: 52 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 405 lines of code y: 73 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/TargetEncoder.scala x: 301 lines of code y: 9 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala x: 308 lines of code y: 16 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/VarianceThresholdSelector.scala x: 134 lines of code y: 13 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala x: 370 lines of code y: 49 # changes
mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 1062 lines of code y: 104 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 345 lines of code y: 73 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 478 lines of code y: 26 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 199 lines of code y: 42 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 443 lines of code y: 18 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala x: 224 lines of code y: 10 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala x: 483 lines of code y: 169 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 348 lines of code y: 12 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ColumnDefaultValue.java x: 42 lines of code y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 920 lines of code y: 356 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala x: 102 lines of code y: 43 # changes
python/pyspark/sql/connect/functions/builtin.py x: 2417 lines of code y: 58 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 998 lines of code y: 141 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala x: 112 lines of code y: 66 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveTableSpec.scala x: 84 lines of code y: 6 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala x: 164 lines of code y: 45 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala x: 223 lines of code y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 1148 lines of code y: 135 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateTableExec.scala x: 42 lines of code y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 89 lines of code y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 555 lines of code y: 100 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 333 lines of code y: 93 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala x: 176 lines of code y: 27 # changes
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala x: 344 lines of code y: 51 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala x: 214 lines of code y: 29 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala x: 250 lines of code y: 58 # changes
mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala x: 263 lines of code y: 32 # changes
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala x: 304 lines of code y: 63 # changes
mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 278 lines of code y: 37 # changes
mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala x: 524 lines of code y: 55 # changes
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala x: 202 lines of code y: 9 # changes
dev/sparktestsupport/modules.py x: 1409 lines of code y: 289 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 441 lines of code y: 122 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 1683 lines of code y: 486 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala x: 81 lines of code y: 56 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/UnionLoopExec.scala x: 152 lines of code y: 4 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala x: 30 lines of code y: 20 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 1503 lines of code y: 264 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 305 lines of code y: 80 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala x: 558 lines of code y: 134 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala x: 464 lines of code y: 61 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonArrowOutput.scala x: 243 lines of code y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala x: 528 lines of code y: 64 # changes
python/pyspark/pandas/accessors.py x: 434 lines of code y: 33 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 1547 lines of code y: 144 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala x: 399 lines of code y: 142 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 3468 lines of code y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala x: 1498 lines of code y: 8 # changes
sql/core/src/main/protobuf/org/apache/spark/sql/execution/streaming/StateMessage.proto x: 219 lines of code y: 7 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 754 lines of code y: 4 # changes
common/utils/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala x: 272 lines of code y: 4 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala x: 539 lines of code y: 38 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala x: 367 lines of code y: 20 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/Catalog.scala x: 568 lines of code y: 5 # changes
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 733 lines of code y: 94 # changes
python/pyspark/ml/classification.py x: 2173 lines of code y: 183 # changes
python/pyspark/ml/connect/readwrite.py x: 290 lines of code y: 12 # changes
python/pyspark/ml/feature.py x: 3621 lines of code y: 171 # changes
python/pyspark/ml/regression.py x: 1554 lines of code y: 134 # changes
python/pyspark/ml/util.py x: 714 lines of code y: 77 # changes
python/pyspark/sql/connect/client/core.py x: 1449 lines of code y: 84 # changes
python/pyspark/sql/connect/group.py x: 488 lines of code y: 40 # changes
python/pyspark/sql/connect/plan.py x: 2133 lines of code y: 135 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 693 lines of code y: 81 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreConf.scala x: 37 lines of code y: 22 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreRDD.scala x: 96 lines of code y: 22 # changes
project/plugins.sbt x: 14 lines of code y: 154 # changes
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 123 lines of code y: 9 # changes
sql/api/src/main/scala/org/apache/spark/sql/errors/DataTypeErrors.scala x: 224 lines of code y: 20 # changes
sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala x: 210 lines of code y: 16 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 2609 lines of code y: 284 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala x: 369 lines of code y: 89 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 1425 lines of code y: 93 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreChangelog.scala x: 413 lines of code y: 18 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionValidation.scala x: 95 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala x: 368 lines of code y: 72 # changes
python/pyspark/ml/connect/functions.py x: 47 lines of code y: 7 # changes
core/src/main/scala/org/apache/spark/deploy/ExternalShuffleService.scala x: 129 lines of code y: 33 # changes
core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala x: 416 lines of code y: 20 # changes
core/src/main/scala/org/apache/spark/internal/config/package.scala x: 2456 lines of code y: 285 # changes
core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala x: 181 lines of code y: 41 # changes
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 886 lines of code y: 190 # changes
core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala x: 79 lines of code y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala x: 637 lines of code y: 117 # changes
python/pyspark/sql/streaming/list_state_client.py x: 164 lines of code y: 6 # changes
python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 1116 lines of code y: 6 # changes
python/pyspark/sql/streaming/stateful_processor_api_client.py x: 392 lines of code y: 15 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala x: 116 lines of code y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 458 lines of code y: 9 # changes
python/pyspark/sql/functions/__init__.py x: 466 lines of code y: 5 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolverGuard.scala x: 369 lines of code y: 5 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala x: 28 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ToStringBase.scala x: 422 lines of code y: 12 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala x: 404 lines of code y: 3 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOutputWriter.scala x: 56 lines of code y: 2 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 314 lines of code y: 3 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 553 lines of code y: 92 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala x: 225 lines of code y: 39 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala x: 55 lines of code y: 22 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala x: 366 lines of code y: 36 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala x: 406 lines of code y: 40 # changes
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 2122 lines of code y: 81 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 144 lines of code y: 5 # changes
sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 681 lines of code y: 26 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/StructField.scala x: 148 lines of code y: 8 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala x: 406 lines of code y: 14 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SQLFunction.scala x: 204 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AbstractSqlParser.scala x: 85 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 4715 lines of code y: 543 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserInterface.scala x: 23 lines of code y: 12 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala x: 217 lines of code y: 35 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 1040 lines of code y: 234 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 279 lines of code y: 4 # changes
python/pyspark/ml/connect/tuning.py x: 318 lines of code y: 10 # changes
python/run-tests.py x: 287 lines of code y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 991 lines of code y: 147 # changes
mllib/src/main/scala/org/apache/spark/ml/Estimator.scala x: 29 lines of code y: 15 # changes
mllib/src/main/scala/org/apache/spark/ml/Model.scala x: 13 lines of code y: 12 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala x: 477 lines of code y: 25 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 417 lines of code y: 67 # changes
core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 687 lines of code y: 173 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala x: 339 lines of code y: 6 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 467 lines of code y: 84 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 793 lines of code y: 39 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 1059 lines of code y: 79 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala x: 314 lines of code y: 63 # changes
sql/connect/common/src/main/protobuf/spark/connect/ml.proto x: 114 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala x: 393 lines of code y: 52 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/NormalizePlan.scala x: 145 lines of code y: 11 # changes
core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala x: 274 lines of code y: 15 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 873 lines of code y: 28 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 3066 lines of code y: 201 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 432 lines of code y: 174 # changes
python/pyspark/sql/streaming/stateful_processor.py x: 141 lines of code y: 14 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala x: 439 lines of code y: 69 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala x: 190 lines of code y: 28 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/RelationalGroupedDataset.scala x: 473 lines of code y: 2 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 806 lines of code y: 345 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkPythonRunner.scala x: 291 lines of code y: 1 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 961 lines of code y: 353 # changes
python/pyspark/core/context.py x: 784 lines of code y: 10 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala x: 166 lines of code y: 57 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 440 lines of code y: 140 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/InsertAdaptiveSparkPlan.scala x: 102 lines of code y: 37 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala x: 54 lines of code y: 11 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PlanDynamicPruningFilters.scala x: 55 lines of code y: 19 # changes
mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala x: 71 lines of code y: 8 # changes
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala x: 140 lines of code y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala x: 127 lines of code y: 54 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala x: 110 lines of code y: 44 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala x: 107 lines of code y: 30 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala x: 285 lines of code y: 52 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/Resolver.scala x: 421 lines of code y: 7 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java x: 490 lines of code y: 16 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala x: 551 lines of code y: 216 # changes
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 818 lines of code y: 82 # changes
python/pyspark/sql/classic/column.py x: 490 lines of code y: 11 # changes
python/pyspark/sql/column.py x: 317 lines of code y: 92 # changes
python/pyspark/sql/connect/column.py x: 482 lines of code y: 73 # changes
python/pyspark/sql/connect/expressions.py x: 1039 lines of code y: 57 # changes
python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 1764 lines of code y: 55 # changes
sql/api/src/main/scala/org/apache/spark/sql/Column.scala x: 274 lines of code y: 8 # changes
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala x: 412 lines of code y: 4 # changes
sql/api/src/main/scala/org/apache/spark/sql/internal/columnNodes.scala x: 408 lines of code y: 11 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 1057 lines of code y: 167 # changes
sql/connect/common/src/main/protobuf/spark/connect/expressions.proto x: 412 lines of code y: 9 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 1025 lines of code y: 3 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 598 lines of code y: 14 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/columnNodeSupport.scala x: 248 lines of code y: 6 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/columnNodeSupport.scala x: 265 lines of code y: 2 # changes
python/pyspark/accumulators.py x: 173 lines of code y: 68 # changes
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java x: 396 lines of code y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xmlExpressions.scala x: 204 lines of code y: 21 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 613 lines of code y: 189 # changes
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 575 lines of code y: 126 # changes
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala x: 399 lines of code y: 64 # changes
core/src/main/scala/org/apache/spark/api/python/PythonWorkerUtils.scala x: 118 lines of code y: 6 # changes
core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala x: 123 lines of code y: 19 # changes
core/src/main/scala/org/apache/spark/api/r/RAuthHelper.scala x: 16 lines of code y: 2 # changes
core/src/main/scala/org/apache/spark/api/r/RRDD.scala x: 119 lines of code y: 28 # changes
core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala x: 102 lines of code y: 7 # changes
python/pyspark/core/broadcast.py x: 166 lines of code y: 3 # changes
python/pyspark/daemon.py x: 171 lines of code y: 44 # changes
python/pyspark/sql/connect/streaming/worker/listener_worker.py x: 72 lines of code y: 12 # changes
python/pyspark/sql/streaming/python_streaming_source_runner.py x: 161 lines of code y: 12 # changes
python/pyspark/sql/worker/commit_data_source_write.py x: 79 lines of code y: 8 # changes
python/pyspark/sql/worker/create_data_source.py x: 121 lines of code y: 12 # changes
python/pyspark/sql/worker/lookup_data_sources.py x: 69 lines of code y: 5 # changes
python/pyspark/sql/worker/plan_data_source_read.py x: 301 lines of code y: 18 # changes
python/pyspark/sql/worker/write_into_data_source.py x: 179 lines of code y: 14 # changes
python/pyspark/taskcontext.py x: 147 lines of code y: 30 # changes
sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala x: 139 lines of code y: 51 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/PythonStreamingSourceRunner.scala x: 184 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 813 lines of code y: 151 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala x: 77 lines of code y: 40 # changes
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 x: 607 lines of code y: 35 # changes
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 424 lines of code y: 53 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 783 lines of code y: 114 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/UserDefinedPythonDataSource.scala x: 437 lines of code y: 9 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchExec.scala x: 56 lines of code y: 14 # changes
python/pyspark/errors/exceptions/connect.py x: 332 lines of code y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala x: 258 lines of code y: 10 # changes
core/src/main/scala/org/apache/spark/executor/Executor.scala x: 952 lines of code y: 278 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala x: 426 lines of code y: 14 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala x: 133 lines of code y: 42 # changes
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 861 lines of code y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/StreamingQueryManager.scala x: 259 lines of code y: 2 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 701 lines of code y: 106 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala x: 446 lines of code y: 64 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 801 lines of code y: 73 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala x: 755 lines of code y: 48 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala x: 265 lines of code y: 10 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala x: 167 lines of code y: 15 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalogCapability.java x: 9 lines of code y: 4 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/StagingTableCatalog.java x: 83 lines of code y: 12 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala x: 663 lines of code y: 63 # changes
python/pyspark/sql/dataframe.py x: 851 lines of code y: 383 # changes
python/pyspark/ml/connect/base.py x: 132 lines of code y: 8 # changes
python/pyspark/ml/connect/evaluation.py x: 142 lines of code y: 8 # changes
python/pyspark/ml/connect/feature.py x: 216 lines of code y: 6 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyCharTypePaddingHelper.scala x: 168 lines of code y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala x: 210 lines of code y: 18 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 783 lines of code y: 48 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala x: 284 lines of code y: 38 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala x: 190 lines of code y: 49 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 3781 lines of code y: 310 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/DataFrameReader.scala x: 224 lines of code y: 3 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamWriter.scala x: 323 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateExpressionResolver.scala x: 154 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionIdAssigner.scala x: 210 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/PruneMetadataColumns.scala x: 77 lines of code y: 1 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolutionValidator.scala x: 239 lines of code y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/SetOperationLikeResolver.scala x: 283 lines of code y: 1 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala x: 146 lines of code y: 17 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowSerializer.scala x: 499 lines of code y: 6 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 743 lines of code y: 36 # changes
core/src/main/scala/org/apache/spark/util/Utils.scala x: 2154 lines of code y: 456 # changes
python/pyspark/testing/sqlutils.py x: 132 lines of code y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala x: 456 lines of code y: 57 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala x: 178 lines of code y: 35 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala x: 239 lines of code y: 36 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 462 lines of code y: 66 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 704 lines of code y: 62 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala x: 470 lines of code y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/DescribeRelationJsonCommand.scala x: 262 lines of code y: 8 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala x: 827 lines of code y: 58 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala x: 125 lines of code y: 35 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala x: 249 lines of code y: 145 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionHelper.scala x: 535 lines of code y: 7 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala x: 400 lines of code y: 87 # changes
sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js x: 264 lines of code y: 20 # changes
dev/merge_spark_pr.py x: 512 lines of code y: 77 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala x: 95 lines of code y: 31 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala x: 243 lines of code y: 51 # changes
python/pyspark/errors/exceptions/base.py x: 150 lines of code y: 26 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java x: 486 lines of code y: 47 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala x: 592 lines of code y: 113 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/PythonScan.scala x: 49 lines of code y: 6 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificInternalRow.scala x: 221 lines of code y: 13 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 1454 lines of code y: 20 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 374 lines of code y: 119 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 675 lines of code y: 58 # changes
launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java x: 256 lines of code y: 51 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java x: 78 lines of code y: 28 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala x: 369 lines of code y: 23 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RatePerMicroBatchStream.scala x: 134 lines of code y: 4 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 1905 lines of code y: 267 # changes
python/pyspark/sql/connect/session.py x: 838 lines of code y: 129 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala x: 72 lines of code y: 27 # changes
python/pyspark/ml/tuning.py x: 1133 lines of code y: 83 # changes
core/src/main/scala/org/apache/spark/util/Distribution.scala x: 40 lines of code y: 16 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala x: 399 lines of code y: 35 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 1210 lines of code y: 247 # changes
mllib/src/main/scala/org/apache/spark/ml/functions.scala x: 46 lines of code y: 10 # changes
mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala x: 397 lines of code y: 63 # changes
python/pyspark/ml/clustering.py x: 1000 lines of code y: 94 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala x: 180 lines of code y: 50 # changes
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 578 lines of code y: 32 # changes
mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 603 lines of code y: 64 # changes
project/MimaExcludes.scala x: 202 lines of code y: 462 # changes
core/src/main/resources/org/apache/spark/ui/static/webui.css x: 403 lines of code y: 56 # changes
core/src/main/scala/org/apache/spark/ui/UIUtils.scala x: 589 lines of code y: 118 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala x: 486 lines of code y: 130 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala x: 432 lines of code y: 28 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java x: 313 lines of code y: 36 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala x: 230 lines of code y: 44 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ValidateSubqueryExpression.scala x: 301 lines of code y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 1490 lines of code y: 224 # changes
python/pyspark/sql/plot/core.py x: 297 lines of code y: 21 # changes
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDataToCatalyst.scala x: 131 lines of code y: 11 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala x: 549 lines of code y: 34 # changes
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 778 lines of code y: 17 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala x: 337 lines of code y: 75 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala x: 201 lines of code y: 13 # changes
python/pyspark/errors/exceptions/__init__.py x: 13 lines of code y: 4 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala x: 388 lines of code y: 31 # changes
core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala x: 165 lines of code y: 17 # changes
sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 1882 lines of code y: 25 # changes
connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaExceptions.scala x: 162 lines of code y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala x: 436 lines of code y: 98 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamingQueryCache.scala x: 237 lines of code y: 5 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 1102 lines of code y: 210 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala x: 248 lines of code y: 7 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala x: 432 lines of code y: 10 # changes
core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java x: 315 lines of code y: 41 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala x: 695 lines of code y: 96 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala x: 233 lines of code y: 45 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala x: 544 lines of code y: 9 # changes
core/src/main/scala/org/apache/spark/util/ThreadUtils.scala x: 247 lines of code y: 33 # changes
core/src/main/scala/org/apache/spark/BarrierCoordinator.scala x: 149 lines of code y: 14 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala x: 454 lines of code y: 44 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonUDTFExec.scala x: 56 lines of code y: 8 # changes
core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala x: 67 lines of code y: 19 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala x: 311 lines of code y: 71 # changes
mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala x: 544 lines of code y: 27 # changes
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala x: 196 lines of code y: 31 # changes
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala x: 370 lines of code y: 42 # changes
mllib/src/main/scala/org/apache/spark/mllib/feature/IDF.scala x: 150 lines of code y: 18 # changes
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala x: 560 lines of code y: 64 # changes
mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala x: 184 lines of code y: 42 # changes
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala x: 143 lines of code y: 32 # changes
mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala x: 76 lines of code y: 22 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala x: 324 lines of code y: 6 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/RowEncoder.scala x: 81 lines of code y: 13 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala x: 407 lines of code y: 18 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala x: 418 lines of code y: 23 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala x: 323 lines of code y: 17 # changes
python/pyspark/sql/connect/udf.py x: 226 lines of code y: 35 # changes
python/pyspark/sql/udf.py x: 464 lines of code y: 77 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala x: 200 lines of code y: 17 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 1277 lines of code y: 95 # changes
mllib/src/main/scala/org/apache/spark/ml/stat/FValueTest.scala x: 87 lines of code y: 11 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala x: 394 lines of code y: 33 # changes
python/pyspark/ml/base.py x: 174 lines of code y: 23 # changes
python/pyspark/sql/utils.py x: 289 lines of code y: 97 # changes
python/pyspark/pandas/base.py x: 601 lines of code y: 59 # changes
core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 1419 lines of code y: 134 # changes
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala x: 581 lines of code y: 4 # changes
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala x: 707 lines of code y: 26 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala x: 189 lines of code y: 77 # changes
common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java x: 450 lines of code y: 8 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkShreddingUtils.scala x: 672 lines of code y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/cteOperators.scala x: 108 lines of code y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala x: 209 lines of code y: 26 # changes
python/pyspark/sql/connect/conversion.py x: 35 lines of code y: 34 # changes
python/pyspark/sql/connect/dataframe.py x: 1945 lines of code y: 194 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/SparkSession.scala x: 665 lines of code y: 5 # changes
python/pyspark/logger/logger.py x: 113 lines of code y: 5 # changes
python/pyspark/sql/connect/proto/relations_pb2.pyi x: 3616 lines of code y: 91 # changes
sql/connect/common/src/main/protobuf/spark/connect/relations.proto x: 984 lines of code y: 8 # changes
core/src/main/scala/org/apache/spark/SparkContext.scala x: 1923 lines of code y: 534 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SqlScriptingLogicalPlans.scala x: 227 lines of code y: 7 # changes
python/pyspark/pandas/sql_formatter.py x: 124 lines of code y: 16 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala x: 421 lines of code y: 63 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanner.scala x: 60 lines of code y: 31 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ResolveWriteToStream.scala x: 99 lines of code y: 13 # changes
python/pyspark/ml/pipeline.py x: 253 lines of code y: 48 # changes
core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala x: 883 lines of code y: 226 # changes
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala x: 510 lines of code y: 118 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala x: 1126 lines of code y: 16 # changes
python/pyspark/sql/session.py x: 921 lines of code y: 196 # changes
python/pyspark/sql/types.py x: 1984 lines of code y: 186 # changes
python/pyspark/errors/__init__.py x: 70 lines of code y: 21 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala x: 458 lines of code y: 119 # changes
sql/api/src/main/scala/org/apache/spark/sql/streaming/ValueState.scala x: 10 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Mode.scala x: 268 lines of code y: 13 # changes
python/packaging/connect/setup.py x: 84 lines of code y: 19 # changes
python/pyspark/shell.py x: 88 lines of code y: 94 # changes
python/pyspark/testing/__init__.py x: 40 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala x: 198 lines of code y: 37 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/hints.scala x: 100 lines of code y: 25 # changes
sql/api/src/main/scala/org/apache/spark/sql/internal/SqlApiConfHelper.scala x: 17 lines of code y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCoercion.scala x: 339 lines of code y: 10 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala x: 14 lines of code y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala x: 2968 lines of code y: 191 # changes
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala x: 613 lines of code y: 183 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala x: 104 lines of code y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala x: 611 lines of code y: 120 # changes
core/src/main/scala/org/apache/spark/scheduler/JobResult.scala x: 8 lines of code y: 11 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala x: 240 lines of code y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala x: 219 lines of code y: 41 # changes
core/src/main/scala/org/apache/spark/deploy/master/Master.scala x: 1095 lines of code y: 213 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtils.java x: 259 lines of code y: 16 # changes
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala x: 1222 lines of code y: 153 # changes
python/pyspark/sql/streaming/readwriter.py x: 612 lines of code y: 23 # changes
mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala x: 245 lines of code y: 48 # changes
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java x: 252 lines of code y: 22 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TTLState.scala x: 271 lines of code y: 9 # changes
core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala x: 260 lines of code y: 99 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/CoGroupedArrowPythonRunner.scala x: 97 lines of code y: 17 # changes
python/pyspark/ml/stat.py x: 185 lines of code y: 33 # changes
python/pyspark/sql/profiler.py x: 308 lines of code y: 8 # changes
python/pyspark/pandas/plot/core.py x: 424 lines of code y: 34 # changes
launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java x: 240 lines of code y: 21 # changes
core/src/main/scala/org/apache/spark/SparkEnv.scala x: 414 lines of code y: 186 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java x: 400 lines of code y: 36 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java x: 168 lines of code y: 11 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonFunction.scala x: 213 lines of code y: 36 # changes
core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala x: 122 lines of code y: 32 # changes
python/pyspark/ml/fpm.py x: 243 lines of code y: 32 # changes
python/pyspark/sql/pandas/conversion.py x: 610 lines of code y: 56 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala x: 379 lines of code y: 36 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/r/ArrowRRunner.scala x: 147 lines of code y: 10 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTable.scala x: 80 lines of code y: 18 # changes
python/pyspark/sql/classic/dataframe.py x: 1539 lines of code y: 24 # changes
python/pyspark/sql/udtf.py x: 275 lines of code y: 29 # changes
python/pyspark/sql/connect/proto/commands_pb2.pyi x: 2053 lines of code y: 45 # changes
sql/connect/common/src/main/protobuf/spark/connect/commands.proto x: 448 lines of code y: 3 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala x: 135 lines of code y: 63 # changes
connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReaderAdmin.scala x: 434 lines of code y: 13 # changes
connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReaderConsumer.scala x: 466 lines of code y: 15 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala x: 347 lines of code y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala x: 417 lines of code y: 47 # changes
sql/connect/client/jvm/src/main/scala/org/apache/spark/sql/application/ConnectRepl.scala x: 145 lines of code y: 1 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala x: 138 lines of code y: 39 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala x: 213 lines of code y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala x: 237 lines of code y: 78 # changes
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala x: 257 lines of code y: 91 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala x: 171 lines of code y: 64 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala x: 234 lines of code y: 142 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala x: 120 lines of code y: 57 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala x: 479 lines of code y: 47 # changes
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala x: 673 lines of code y: 98 # changes
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala x: 815 lines of code y: 23 # changes
python/pyspark/java_gateway.py x: 105 lines of code y: 88 # changes
repl/src/main/scala/org/apache/spark/repl/Main.scala x: 88 lines of code y: 21 # changes
sql/core/src/main/scala/org/apache/spark/sql/artifact/ArtifactManager.scala x: 377 lines of code y: 14 # changes
sql/core/src/main/scala/org/apache/spark/sql/classic/KeyValueGroupedDataset.scala x: 452 lines of code y: 1 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala x: 220 lines of code y: 88 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala x: 378 lines of code y: 131 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/analysis/DetectAmbiguousSelfJoin.scala x: 110 lines of code y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala x: 106 lines of code y: 34 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala x: 12 lines of code y: 21 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala x: 155 lines of code y: 75 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala x: 734 lines of code y: 163 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala x: 992 lines of code y: 202 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala x: 315 lines of code y: 72 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetUtils.scala x: 366 lines of code y: 25 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala x: 134 lines of code y: 30 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala x: 215 lines of code y: 76 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala x: 140 lines of code y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala x: 114 lines of code y: 74 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala x: 82 lines of code y: 37 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala x: 20 lines of code y: 167 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala x: 327 lines of code y: 223 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala x: 176 lines of code y: 116 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala x: 39 lines of code y: 23 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala x: 358 lines of code y: 3 # changes
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala x: 486 lines of code y: 172 # changes
python/pyspark/ml/evaluation.py x: 579 lines of code y: 55 # changes
python/pyspark/pandas/indexing.py x: 1210 lines of code y: 41 # changes
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala x: 176 lines of code y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala x: 757 lines of code y: 68 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/DecorrelateInnerQuery.scala x: 529 lines of code y: 30 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala x: 272 lines of code y: 78 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala x: 167 lines of code y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala x: 111 lines of code y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala x: 1531 lines of code y: 86 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala x: 280 lines of code y: 21 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala x: 176 lines of code y: 22 # changes
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java x: 979 lines of code y: 33 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala x: 441 lines of code y: 49 # changes
python/pyspark/sql/connect/proto/types_pb2.pyi x: 914 lines of code y: 13 # changes
python/pyspark/ml/recommendation.py x: 328 lines of code y: 48 # changes
python/pyspark/sql/connect/proto/common_pb2.pyi x: 566 lines of code y: 6 # changes
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java x: 333 lines of code y: 6 # changes
python/pyspark/sql/avro/functions.py x: 108 lines of code y: 24 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/execution/ExecuteGrpcResponseSender.scala x: 253 lines of code y: 5 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StatePartitionReader.scala x: 232 lines of code y: 18 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateSchemaCompatibilityChecker.scala x: 278 lines of code y: 17 # changes
core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala x: 298 lines of code y: 94 # changes
core/src/main/scala/org/apache/spark/SparkConf.scala x: 505 lines of code y: 151 # changes
python/pyspark/version.py x: 1 lines of code y: 16 # changes
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationAwareUTF8String.java x: 921 lines of code y: 22 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala x: 108 lines of code y: 28 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala x: 250 lines of code y: 24 # changes
python/pyspark/ml/connect/io_utils.py x: 197 lines of code y: 8 # changes
python/pyspark/cloudpickle/cloudpickle.py x: 793 lines of code y: 8 # changes
core/src/main/scala/org/apache/spark/TaskContext.scala x: 99 lines of code y: 65 # changes
core/src/main/scala/org/apache/spark/scheduler/Task.scala x: 132 lines of code y: 94 # changes
python/pyspark/sql/plot/plotly.py x: 197 lines of code y: 11 # changes
python/pyspark/sql/connect/proto/base_pb2.pyi x: 3038 lines of code y: 52 # changes
sql/connect/common/src/main/protobuf/spark/connect/base.proto x: 921 lines of code y: 5 # changes
python/pyspark/sql/variant_utils.py x: 615 lines of code y: 8 # changes
core/src/main/scala/org/apache/spark/deploy/history/EventLogFileWriters.scala x: 273 lines of code y: 10 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/MutableColumnarRow.java x: 239 lines of code y: 15 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala x: 518 lines of code y: 84 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/UpCastRule.scala x: 41 lines of code y: 10 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HistogramNumeric.scala x: 196 lines of code y: 6 # changes
python/pyspark/core/rdd.py x: 1451 lines of code y: 7 # changes
python/pyspark/sql/readwriter.py x: 860 lines of code y: 189 # changes
core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala x: 147 lines of code y: 59 # changes
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java x: 275 lines of code y: 35 # changes
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala x: 2113 lines of code y: 354 # changes
core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala x: 143 lines of code y: 22 # changes
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala x: 1010 lines of code y: 200 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala x: 265 lines of code y: 23 # changes
streaming/src/main/scala/org/apache/spark/streaming/receiver/BlockGenerator.scala x: 182 lines of code y: 25 # changes
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala x: 712 lines of code y: 65 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala x: 4533 lines of code y: 203 # changes
python/pyspark/ml/functions.py x: 301 lines of code y: 23 # changes
python/pyspark/resource/profile.py x: 176 lines of code y: 14 # changes
python/pyspark/sql/pandas/utils.py x: 87 lines of code y: 23 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java x: 248 lines of code y: 11 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala x: 369 lines of code y: 30 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala x: 429 lines of code y: 37 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala x: 497 lines of code y: 65 # changes
graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala x: 54 lines of code y: 24 # changes
mllib/src/main/scala/org/apache/spark/mllib/tree/model/DecisionTreeModel.scala x: 219 lines of code y: 43 # changes
mllib/src/main/scala/org/apache/spark/mllib/tree/model/treeEnsembleModels.scala x: 269 lines of code y: 38 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala x: 95 lines of code y: 27 # changes
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala x: 654 lines of code y: 103 # changes
streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala x: 541 lines of code y: 60 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala x: 132 lines of code y: 26 # changes
core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala x: 363 lines of code y: 125 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlInferSchema.scala x: 432 lines of code y: 19 # changes
python/pyspark/sql/connect/readwriter.py x: 851 lines of code y: 41 # changes
python/pyspark/sql/catalog.py x: 323 lines of code y: 67 # changes
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java x: 1654 lines of code y: 34 # changes
python/pyspark/ml/torch/distributor.py x: 630 lines of code y: 31 # changes
core/src/main/scala/org/apache/spark/rdd/RDD.scala x: 1080 lines of code y: 250 # changes
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala x: 146 lines of code y: 37 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java x: 599 lines of code y: 38 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationExec.scala x: 239 lines of code y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala x: 388 lines of code y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala x: 223 lines of code y: 37 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala x: 322 lines of code y: 26 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StateTypesEncoderUtils.scala x: 179 lines of code y: 10 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala x: 1071 lines of code y: 164 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala x: 803 lines of code y: 85 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala x: 229 lines of code y: 90 # changes
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala x: 203 lines of code y: 69 # changes
core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala x: 327 lines of code y: 21 # changes
core/src/main/scala/org/apache/spark/storage/BlockManager.scala x: 1544 lines of code y: 283 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala x: 533 lines of code y: 71 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala x: 308 lines of code y: 66 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala x: 989 lines of code y: 64 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala x: 975 lines of code y: 85 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/JoinEstimation.scala x: 242 lines of code y: 17 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala x: 642 lines of code y: 87 # changes
streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala x: 94 lines of code y: 26 # changes
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala x: 199 lines of code y: 58 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala x: 749 lines of code y: 40 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala x: 750 lines of code y: 80 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala x: 510 lines of code y: 115 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/package.scala x: 74 lines of code y: 29 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala x: 51 lines of code y: 19 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala x: 223 lines of code y: 27 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala x: 68 lines of code y: 15 # changes
python/pyspark/sql/group.py x: 90 lines of code y: 60 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala x: 269 lines of code y: 124 # changes
core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala x: 252 lines of code y: 55 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala x: 562 lines of code y: 76 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerListener.scala x: 382 lines of code y: 2 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/Binarizer.scala x: 171 lines of code y: 31 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala x: 109 lines of code y: 38 # changes
mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala x: 136 lines of code y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala x: 138 lines of code y: 27 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuery.scala x: 119 lines of code y: 23 # changes
python/pyspark/pandas/generic.py x: 991 lines of code y: 69 # changes
python/pyspark/pandas/internal.py x: 800 lines of code y: 45 # changes
python/pyspark/pandas/resample.py x: 378 lines of code y: 13 # changes
python/pyspark/pandas/window.py x: 539 lines of code y: 29 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala x: 8 lines of code y: 21 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala x: 8 lines of code y: 29 # changes
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java x: 642 lines of code y: 25 # changes
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java x: 1414 lines of code y: 77 # changes
python/pyspark/sql/connect/proto/base_pb2_grpc.py x: 454 lines of code y: 12 # changes
python/pyspark/profiler.py x: 354 lines of code y: 20 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/GroupStateImpl.scala x: 194 lines of code y: 17 # changes
python/pyspark/sql/connect/streaming/readwriter.py x: 606 lines of code y: 22 # changes python/pyspark/sql/connect/streaming/query.py x: 340 lines of code y: 19 # changes repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala x: 107 lines of code y: 62 # changes python/pyspark/ml/linalg/__init__.py x: 779 lines of code y: 25 # changes python/pyspark/mllib/linalg/__init__.py x: 908 lines of code y: 38 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala x: 340 lines of code y: 54 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala x: 140 lines of code y: 57 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/datasketchesAggregates.scala x: 217 lines of code y: 10 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala x: 323 lines of code y: 36 # changes connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala x: 558 lines of code y: 9 # changes python/pyspark/sql/connect/catalog.py x: 278 lines of code y: 24 # changes python/pyspark/mllib/classification.py x: 398 lines of code y: 65 # changes python/pyspark/mllib/feature.py x: 346 lines of code y: 57 # changes python/pyspark/mllib/regression.py x: 371 lines of code y: 62 # changes core/src/main/scala/org/apache/spark/input/PortableDataStream.scala x: 127 lines of code y: 20 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala x: 283 lines of code y: 50 # changes core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java x: 295 lines of code y: 11 # changes core/src/main/scala/org/apache/spark/TestUtils.scala x: 309 lines of code y: 61 # changes core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala x: 201 lines of code y: 49 # changes core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala x: 152 lines of code y: 81 # changes 
core/src/main/scala/org/apache/spark/util/collection/AppendOnlyMap.scala x: 215 lines of code y: 20 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/MountVolumesFeatureStep.scala x: 111 lines of code y: 22 # changes resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala x: 1227 lines of code y: 107 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala x: 471 lines of code y: 27 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Implicits.scala x: 180 lines of code y: 36 # changes core/src/main/scala/org/apache/spark/scheduler/DAGSchedulerEvent.scala x: 83 lines of code y: 66 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala x: 268 lines of code y: 39 # changes core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala x: 313 lines of code y: 100 # changes core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala x: 215 lines of code y: 69 # changes sql/api/src/main/scala/org/apache/spark/sql/catalyst/analysis/noSuchItemsExceptions.scala x: 171 lines of code y: 8 # changes sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkIntervalUtils.scala x: 430 lines of code y: 5 # changes sql/api/src/main/scala/org/apache/spark/sql/types/Decimal.scala x: 495 lines of code y: 7 # changes core/src/main/scala/org/apache/spark/storage/DiskStore.scala x: 269 lines of code y: 76 # changes sql/connect/common/src/main/scala/org/apache/spark/sql/connect/common/UdfUtils.scala x: 498 lines of code y: 3 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryErrorsBase.scala x: 32 lines of code y: 26 # changes python/pyspark/pandas/plot/matplotlib.py x: 597 lines of code y: 21 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala x: 307 lines of code y: 12 # 
changes core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala x: 1103 lines of code y: 94 # changes core/src/main/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriter.java x: 423 lines of code y: 42 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala x: 426 lines of code y: 50 # changes core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala x: 801 lines of code y: 178 # changes core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala x: 207 lines of code y: 90 # changes core/src/main/resources/org/apache/spark/ui/static/stagepage.js x: 1051 lines of code y: 36 # changes core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala x: 329 lines of code y: 49 # changes core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala x: 439 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/ui/JettyUtils.scala x: 464 lines of code y: 83 # changes core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala x: 253 lines of code y: 28 # changes mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala x: 155 lines of code y: 41 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala x: 142 lines of code y: 35 # changes core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala x: 359 lines of code y: 29 # changes core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala x: 91 lines of code y: 40 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala x: 290 lines of code y: 28 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala x: 48 lines of code y: 23 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala x: 224 lines of code y: 30 # changes 
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerPage.scala x: 428 lines of code y: 1 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnType.scala x: 624 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala x: 33 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/ui/jobs/JobsTab.scala x: 35 lines of code y: 18 # changes python/pyspark/sql/window.py x: 45 lines of code y: 39 # changes python/pyspark/sql/streaming/listener.py x: 642 lines of code y: 14 # changes python/pyspark/sql/connect/types.py x: 295 lines of code y: 35 # changes dev/create-release/translate-contributors.py x: 201 lines of code y: 10 # changes core/src/main/scala/org/apache/spark/ui/storage/StoragePage.scala x: 194 lines of code y: 23 # changes mllib/src/main/scala/org/apache/spark/ml/r/GeneralizedLinearRegressionWrapper.scala x: 156 lines of code y: 24 # changes mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala x: 88 lines of code y: 31 # changes mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala x: 272 lines of code y: 60 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansModel.scala x: 163 lines of code y: 47 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala x: 540 lines of code y: 54 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala x: 265 lines of code y: 34 # changes mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala x: 203 lines of code y: 39 # changes mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala x: 199 lines of code y: 39 # changes mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala x: 235 lines of code y: 53 # changes core/src/main/scala/org/apache/spark/SecurityManager.scala x: 251 lines of code y: 54 # changes 
mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala x: 183 lines of code y: 32 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala x: 204 lines of code y: 61 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/RawInputDStream.scala x: 69 lines of code y: 29 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala x: 120 lines of code y: 53 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala x: 54 lines of code y: 39 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala x: 76 lines of code y: 33 # changes core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala x: 722 lines of code y: 215 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala x: 68 lines of code y: 40 # changes streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala x: 149 lines of code y: 47 # changes python/pyspark/pandas/datetimes.py x: 190 lines of code y: 16 # changes python/pyspark/pandas/spark/accessors.py x: 238 lines of code y: 30 # changes python/pyspark/ml/param/__init__.py x: 326 lines of code y: 45 # changes python/pyspark/pandas/strings.py x: 309 lines of code y: 24 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala x: 256 lines of code y: 34 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCScanBuilder.scala x: 137 lines of code y: 23 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala x: 98 lines of code y: 62 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala x: 235 lines of code y: 39 # changes 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala x: 732 lines of code y: 82 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala x: 673 lines of code y: 13 # changes common/network-common/src/main/java/org/apache/spark/network/crypto/GcmTransportCipher.java x: 332 lines of code y: 1 # changes core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala x: 46 lines of code y: 28 # changes core/src/main/scala/org/apache/spark/deploy/worker/WorkerWatcher.scala x: 49 lines of code y: 30 # changes streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala x: 298 lines of code y: 101 # changes streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala x: 469 lines of code y: 135 # changes streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisorImpl.scala x: 153 lines of code y: 36 # changes streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala x: 457 lines of code y: 58 # changes core/src/main/scala/org/apache/spark/status/api/v1/api.scala x: 497 lines of code y: 61 # changes sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala x: 190 lines of code y: 64 # changes core/src/main/scala/org/apache/spark/io/CompressionCodec.scala x: 150 lines of code y: 55 # changes python/pyspark/sql/context.py x: 292 lines of code y: 115 # changes core/src/main/scala/org/apache/spark/status/AppStatusListener.scala x: 1100 lines of code y: 72 # changes core/src/main/scala/org/apache/spark/status/AppStatusStore.scala x: 749 lines of code y: 57 # changes core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala x: 382 lines of code y: 78 # changes python/pyspark/pandas/indexes/multi.py x: 527 lines of code y: 41 # changes core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala x: 78 lines of code y: 53 # changes 
core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala x: 328 lines of code y: 129 # changes core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala x: 440 lines of code y: 22 # changes core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala x: 336 lines of code y: 25 # changes core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala x: 203 lines of code y: 85 # changes core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala x: 617 lines of code y: 41 # changes core/src/main/scala/org/apache/spark/MapOutputTracker.scala x: 1131 lines of code y: 146 # changes connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/KafkaDataConsumer.scala x: 447 lines of code y: 11 # changes connector/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisBackedBlockRDD.scala x: 208 lines of code y: 5 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala x: 875 lines of code y: 86 # changes common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java x: 478 lines of code y: 21 # changes common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java x: 417 lines of code y: 39 # changes core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java x: 590 lines of code y: 55 # changes core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java x: 582 lines of code y: 72 # changes sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/CLIService.java x: 415 lines of code y: 12 # changes sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/OperationManager.java x: 233 lines of code y: 12 # changes sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java x: 637 lines of code y: 14 # changes mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala x: 165 lines 
of code y: 28 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala x: 346 lines of code y: 35 # changes core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala x: 255 lines of code y: 72 # changes core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala x: 253 lines of code y: 90 # changes core/src/main/scala/org/apache/spark/ui/WebUI.scala x: 156 lines of code y: 51 # changes core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala x: 534 lines of code y: 77 # changes core/src/main/scala/org/apache/spark/ui/SparkUI.scala x: 178 lines of code y: 85 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala x: 682 lines of code y: 48 # changes core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala x: 68 lines of code y: 37 # changes core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala x: 147 lines of code y: 45 # changes core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala x: 174 lines of code y: 56 # changes core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala x: 541 lines of code y: 149 # changes core/src/main/scala/org/apache/spark/resource/ResourceUtils.scala x: 356 lines of code y: 25 # changes core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala x: 540 lines of code y: 43 # changes core/src/main/scala/org/apache/spark/scheduler/ReplayListenerBus.scala x: 82 lines of code y: 25 # changes python/pyspark/errors/error_classes.py x: 9 lines of code y: 64 # changes core/src/main/scala/org/apache/spark/storage/BlockId.scala x: 208 lines of code y: 42 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala x: 195 lines of code y: 73 # changes core/src/main/scala/org/apache/spark/scheduler/TaskResultGetter.scala x: 121 lines of code y: 49 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala x: 348 lines of code y: 27 # 
changes mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala x: 326 lines of code y: 83 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala x: 315 lines of code y: 31 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/StreamingJoinHelper.scala x: 180 lines of code y: 22 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala x: 403 lines of code y: 44 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala x: 232 lines of code y: 26 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala x: 343 lines of code y: 80 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/InputDStream.scala x: 53 lines of code y: 35 # changes python/pyspark/__init__.py x: 72 lines of code y: 82 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpectsInputTypes.scala x: 30 lines of code y: 18 # changes common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java x: 287 lines of code y: 45 # changes python/pyspark/mllib/clustering.py x: 449 lines of code y: 77 # changes python/pyspark/mllib/evaluation.py x: 254 lines of code y: 36 # changes python/pyspark/serializers.py x: 373 lines of code y: 139 # changes python/pyspark/streaming/dstream.py x: 489 lines of code y: 31 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedUnsafeProjection.scala x: 215 lines of code y: 26 # changes sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala x: 99 lines of code y: 96 # changes sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java x: 296 lines of code y: 43 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala x: 178 lines of code y: 36 # changes 
core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerWebUI.scala x: 31 lines of code y: 58 # changes core/src/main/scala/org/apache/spark/ui/PagedTable.scala x: 272 lines of code y: 18 # changes core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala x: 50 lines of code y: 35 # changes core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala x: 519 lines of code y: 64 # changes core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala x: 472 lines of code y: 48 # changes core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala x: 471 lines of code y: 180 # changes core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala x: 294 lines of code y: 106 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala x: 495 lines of code y: 35 # changes dev/run-tests.py x: 431 lines of code y: 147 # changes core/src/main/resources/org/apache/spark/ui/static/executorspage.js x: 710 lines of code y: 33 # changes core/src/main/resources/org/apache/spark/ui/static/historypage.js x: 209 lines of code y: 33 # changes core/src/main/scala/org/apache/spark/deploy/master/WorkerInfo.scala x: 124 lines of code y: 37 # changes sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java x: 709 lines of code y: 29 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala x: 795 lines of code y: 90 # changes mllib/src/main/scala/org/apache/spark/mllib/evaluation/RegressionMetrics.scala x: 65 lines of code y: 23 # changes sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java x: 481 lines of code y: 17 # changes core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala x: 31 lines of code y: 62 # changes core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala x: 86 lines of code y: 45 # changes core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala x: 93 lines of code y: 67 # 
changes streaming/src/main/scala/org/apache/spark/streaming/scheduler/BatchInfo.scala x: 19 lines of code y: 19 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala x: 62 lines of code y: 50 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/javaCode.scala x: 195 lines of code y: 22 # changes core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto x: 740 lines of code y: 23 # changes common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java x: 287 lines of code y: 6 # changes core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala x: 421 lines of code y: 88 # changes core/src/main/scala/org/apache/spark/status/protobuf/StageDataWrapperSerializer.scala x: 659 lines of code y: 10 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala x: 640 lines of code y: 9 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala x: 694 lines of code y: 84 # changes core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala x: 103 lines of code y: 35 # changes mllib/src/main/scala/org/apache/spark/ml/attribute/attributes.scala x: 355 lines of code y: 16 # changes mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala x: 189 lines of code y: 35 # changes mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala x: 1122 lines of code y: 133 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala x: 773 lines of code y: 61 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala x: 1053 lines of code y: 63 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala x: 135 lines of code y: 41 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala x: 174 lines of code y: 40 # changes 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala x: 89 lines of code y: 55 # changes common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java x: 231 lines of code y: 25 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala x: 123 lines of code y: 43 # changes common/utils/src/main/scala/org/apache/spark/util/ClosureCleaner.scala x: 618 lines of code y: 2 # changes core/src/main/scala/org/apache/spark/status/LiveEntity.scala x: 817 lines of code y: 44 # changes python/pyspark/ml/param/_shared_params_code_gen.py x: 308 lines of code y: 50 # changes python/pyspark/ml/param/shared.py x: 451 lines of code y: 53 # changes python/pyspark/shuffle.py x: 446 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/scheduler/SchedulerBackend.scala x: 29 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala x: 48 lines of code y: 47 # changes mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala x: 110 lines of code y: 53 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala x: 183 lines of code y: 59 # changes streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala x: 380 lines of code y: 67 # changes streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala x: 274 lines of code y: 90 # changes python/pyspark/sql/connect/proto/catalog_pb2.pyi x: 913 lines of code y: 9 # changes project/MimaBuild.scala x: 59 lines of code y: 38 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala x: 1142 lines of code y: 64 # changes core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala x: 62 lines of code y: 89 # changes core/src/main/scala/org/apache/spark/package.scala x: 12 lines of code y: 31 # changes 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala x: 114 lines of code y: 79 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala x: 255 lines of code y: 59 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala x: 505 lines of code y: 30 # changes core/src/main/scala/org/apache/spark/scheduler/ActiveJob.scala x: 18 lines of code y: 14 # changes core/src/main/scala/org/apache/spark/storage/BlockManagerMessages.scala x: 82 lines of code y: 43 # changes core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala x: 47 lines of code y: 58 # changes core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala x: 10 lines of code y: 50 # changes python/pyspark/storagelevel.py x: 61 lines of code y: 28 # changes mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala x: 654 lines of code y: 18 # changes mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala x: 425 lines of code y: 17 # changes mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala x: 90 lines of code y: 35 # changes mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala x: 149 lines of code y: 52 # changes mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala x: 297 lines of code y: 41 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala x: 568 lines of code y: 31 # changes mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDs.scala x: 506 lines of code y: 16 # changes mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala x: 221 lines of code y: 69 # changes core/src/main/scala/org/apache/spark/scheduler/Stage.scala x: 53 lines of code y: 55 # changes core/src/main/scala/org/apache/spark/storage/RDDInfo.scala x: 48 lines of code y: 21 # changes common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java x: 370 lines of code y: 10 # 
changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala x: 514 lines of code y: 21 # changes core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala x: 70 lines of code y: 45 # changes core/src/main/scala/org/apache/spark/TaskEndReason.scala x: 138 lines of code y: 63 # changes core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala x: 20 lines of code y: 26 # changes core/src/main/scala/org/apache/spark/rdd/BlockRDD.scala x: 51 lines of code y: 37 # changes core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala x: 122 lines of code y: 69 # changes sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java x: 390 lines of code y: 5 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala x: 25 lines of code y: 69 # changes core/src/main/scala/org/apache/spark/rdd/EmptyRDD.scala x: 10 lines of code y: 17 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TypedAggregateExpression.scala x: 219 lines of code y: 25 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/command/cache.scala x: 9 lines of code y: 26 # changes core/src/main/scala/org/apache/spark/scheduler/Schedulable.scala x: 23 lines of code y: 28 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/UnionDStream.scala x: 29 lines of code y: 23 # changes core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala x: 89 lines of code y: 48 # changes graphx/src/main/scala/org/apache/spark/graphx/impl/EdgePartition.scala x: 338 lines of code y: 16 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala x: 298 lines of code y: 58 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateMutableProjection.scala x: 102 lines of code y: 49 # changes licenses-binary/LICENSE-javassist.html x: 369 lines of 
code y: 1 # changes core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala x: 82 lines of code y: 48 # changes core/src/main/scala/org/apache/spark/deploy/master/ApplicationState.scala x: 5 lines of code y: 25 # changes core/src/main/scala/org/apache/spark/Aggregator.scala x: 32 lines of code y: 43 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSQLParser.scala x: 1040 lines of code y: 3 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala x: 1053 lines of code y: 21 # changes
# changes (y axis)
min: 1.0 | average: 19.31 | 25th percentile: 2.0 | median: 7.0 | 75th percentile: 20.0 | max: 818.0
lines of code (x axis)
min: 1.0 | average: 157.16 | 25th percentile: 24.0 | median: 68.0 | 75th percentile: 167.0 | max: 5815.0
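
The point records and axis summaries above can be recomputed from the raw plot text. The following is a minimal sketch, not the report's own tooling: it assumes each point has the shape `<path> x: <N> lines of code y: <M> # changes`, as in the records above. The names `parse_points` and `summarize` are hypothetical. The report does not say how it computes percentiles, so only min, average, median, and max are shown.

```python
import re
from statistics import mean, median

# Assumed record shape, taken from the plot text above:
#   <path> x: <N> lines of code y: <M> # changes
POINT = re.compile(r"(\S+) x: (\d+) lines of code y: (\d+) # changes")


def parse_points(text):
    """Return (path, lines_of_code, changes) tuples found in the text."""
    return [(path, int(loc), int(ch)) for path, loc, ch in POINT.findall(text)]


def summarize(values):
    """Min, average, median, and max of a list of numbers."""
    return {
        "min": min(values),
        "average": round(mean(values), 2),
        "median": median(values),
        "max": max(values),
    }


# Three records copied from the plot data above.
sample = (
    "python/pyspark/sql/group.py x: 90 lines of code y: 60 # changes "
    "core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala "
    "x: 801 lines of code y: 178 # changes "
    "core/src/main/scala/org/apache/spark/rdd/EmptyRDD.scala "
    "x: 10 lines of code y: 17 # changes"
)

points = parse_points(sample)
changes_summary = summarize([ch for _, _, ch in points])
loc_summary = summarize([loc for _, loc, _ in points])
```

Run over the full set of points, the same two summaries should approximate the figures listed for each axis, apart from any difference in how the report rounds values.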

Number of Contributors vs. Number of Changes: 4071 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 25 # contributors y: 33 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 27 # contributors y: 39 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 8 # contributors y: 11 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 76 # contributors y: 148 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 76 # contributors y: 187 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 62 # contributors y: 132 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 74 # contributors y: 145 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 3 # contributors y: 3 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 36 # contributors y: 69 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 2 # contributors y: 3 # changes mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 26 # contributors y: 67 # changes mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala x: 8 # contributors y: 24 # changes mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 31 # contributors y: 71 # changes mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 18 # contributors y: 54 # changes mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 52 # contributors y: 170 # changes mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 25 # contributors y: 54 # changes 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 30 # contributors y: 64 # changes
mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 21 # contributors y: 48 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala x: 28 # contributors y: 65 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 27 # contributors y: 67 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 31 # contributors y: 80 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 51 # contributors y: 136 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 29 # contributors y: 58 # changes
mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala x: 16 # contributors y: 30 # changes
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala x: 29 # contributors y: 55 # changes
mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 16 # contributors y: 37 # changes
mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 3 # contributors y: 4 # changes
python/pyspark/testing/connectutils.py x: 14 # contributors y: 49 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 8 # contributors y: 9 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 4 # contributors y: 8 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLHandler.scala x: 3 # contributors y: 16 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 32 # contributors y: 52 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectAnalyzeHandler.scala x: 4 # contributors y: 4 # changes
python/pyspark/errors/exceptions/captured.py x: 6 # contributors y: 22 # changes
python/pyspark/pandas/config.py x: 12 # contributors y: 28 # changes
python/pyspark/pandas/groupby.py x: 20 # contributors y: 97 # changes
python/pyspark/pandas/namespace.py x: 21 # contributors y: 82 # changes
python/pyspark/pandas/series.py x: 19 # contributors y: 118 # changes
python/pyspark/pandas/utils.py x: 10 # contributors y: 49 # changes
python/pyspark/testing/pandasutils.py x: 10 # contributors y: 28 # changes
python/pyspark/testing/utils.py x: 16 # contributors y: 59 # changes
python/pyspark/sql/pandas/serializers.py x: 20 # contributors y: 52 # changes
python/pyspark/worker.py x: 77 # contributors y: 208 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 225 # contributors y: 729 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala x: 17 # contributors y: 41 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraints.scala x: 2 # contributors y: 4 # changes
python/pyspark/sql/datasource.py x: 6 # contributors y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 196 # contributors y: 742 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 69 # contributors y: 172 # changes
connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala x: 12 # contributors y: 12 # changes
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala x: 5 # contributors y: 5 # changes
project/SparkBuild.scala x: 222 # contributors y: 818 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 51 # contributors y: 107 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateResolver.scala x: 1 # contributors y: 2 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 1 # contributors y: 6 # changes
mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala x: 27 # contributors y: 58 # changes
mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 32 # contributors y: 60 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala x: 12 # contributors y: 20 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 20 # contributors y: 41 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/IDF.scala x: 21 # contributors y: 35 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 13 # contributors y: 22 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 11 # contributors y: 21 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/MinMaxScaler.scala x: 22 # contributors y: 39 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 23 # contributors y: 41 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala x: 22 # contributors y: 38 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala x: 31 # contributors y: 62 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/RobustScaler.scala x: 7 # contributors y: 19 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/StandardScaler.scala x: 23 # contributors y: 52 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 35 # contributors y: 73 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala x: 8 # contributors y: 16 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/VarianceThresholdSelector.scala x: 6 # contributors y: 13 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala x: 24 # contributors y: 49 # changes
mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 48 # contributors y: 104 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 29 # contributors y: 73 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 9 # contributors y: 26 # changes
mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 26 # contributors y: 42 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 9 # contributors y: 18 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala x: 59 # contributors y: 169 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 7 # contributors y: 12 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 111 # contributors y: 356 # changes
python/pyspark/sql/connect/functions/builtin.py x: 23 # contributors y: 58 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 64 # contributors y: 141 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala x: 23 # contributors y: 66 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala x: 17 # contributors y: 45 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala x: 11 # contributors y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 45 # contributors y: 135 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateTableExec.scala x: 10 # contributors y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 11 # contributors y: 13 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 47 # contributors y: 100 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 54 # contributors y: 93 # changes
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala x: 17 # contributors y: 27 # changes
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala x: 26 # contributors y: 51 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala x: 14 # contributors y: 29 # changes
mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala x: 19 # contributors y: 32 # changes
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala x: 32 # contributors y: 63 # changes
mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala x: 25 # contributors y: 55 # changes
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala x: 7 # contributors y: 9 # changes
dev/sparktestsupport/modules.py x: 85 # contributors y: 289 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 59 # contributors y: 122 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 148 # contributors y: 486 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala x: 37 # contributors y: 56 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 103 # contributors y: 264 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 45 # contributors y: 80 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala x: 58 # contributors y: 134 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala x: 28 # contributors y: 61 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonArrowOutput.scala x: 7 # contributors y: 13 # changes
python/pyspark/pandas/accessors.py x: 9 # contributors y: 33 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 58 # contributors y: 144 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala x: 72 # contributors y: 142 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 20 # contributors y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala x: 10 # contributors y: 38 # changes
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 47 # contributors y: 94 # changes
python/pyspark/ml/classification.py x: 51 # contributors y: 183 # changes
python/pyspark/ml/connect/readwrite.py x: 4 # contributors y: 12 # changes
python/pyspark/ml/regression.py x: 43 # contributors y: 134 # changes
python/pyspark/ml/util.py x: 30 # contributors y: 77 # changes
python/pyspark/ml/wrapper.py x: 24 # contributors y: 58 # changes
python/pyspark/sql/connect/client/core.py x: 23 # contributors y: 84 # changes
python/pyspark/sql/connect/group.py x: 9 # contributors y: 40 # changes
python/pyspark/sql/connect/plan.py x: 31 # contributors y: 135 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 32 # contributors y: 81 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreConf.scala x: 16 # contributors y: 22 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreRDD.scala x: 17 # contributors y: 22 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala x: 14 # contributors y: 22 # changes
project/plugins.sbt x: 66 # contributors y: 154 # changes
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 9 # contributors y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 109 # contributors y: 284 # changes
python/pyspark/sql/pandas/functions.py x: 15 # contributors y: 35 # changes
python/pyspark/sql/pandas/group_ops.py x: 13 # contributors y: 40 # changes
python/pyspark/util.py x: 26 # contributors y: 58 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala x: 40 # contributors y: 89 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalog.scala x: 15 # contributors y: 28 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 29 # contributors y: 93 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreChangelog.scala x: 14 # contributors y: 18 # changes
python/pyspark/ml/connect/functions.py x: 2 # contributors y: 7 # changes
core/src/main/scala/org/apache/spark/deploy/ExternalShuffleService.scala x: 31 # contributors y: 33 # changes
core/src/main/scala/org/apache/spark/internal/config/package.scala x: 124 # contributors y: 285 # changes
core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala x: 32 # contributors y: 41 # changes
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 104 # contributors y: 190 # changes
core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala x: 35 # contributors y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala x: 41 # contributors y: 117 # changes
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala x: 4 # contributors y: 3 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 6 # contributors y: 9 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ToStringBase.scala x: 10 # contributors y: 12 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala x: 9 # contributors y: 11 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/HiveResult.scala x: 16 # contributors y: 40 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 53 # contributors y: 92 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala x: 24 # contributors y: 39 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala x: 18 # contributors y: 36 # changes
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 38 # contributors y: 81 # changes
sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 18 # contributors y: 26 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala x: 13 # contributors y: 14 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 150 # contributors y: 543 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserInterface.scala x: 11 # contributors y: 12 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala x: 26 # contributors y: 35 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 94 # contributors y: 234 # changes
python/pyspark/ml/connect/tuning.py x: 6 # contributors y: 10 # changes
python/run-tests.py x: 30 # contributors y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 67 # contributors y: 147 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala x: 5 # contributors y: 25 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 36 # contributors y: 67 # changes
core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 78 # contributors y: 173 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StreamStreamJoinStatePartitionReader.scala x: 5 # contributors y: 8 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 38 # contributors y: 84 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala x: 19 # contributors y: 38 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 32 # contributors y: 79 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala x: 26 # contributors y: 63 # changes
sql/connect/common/src/main/protobuf/spark/connect/ml.proto x: 3 # contributors y: 7 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala x: 33 # contributors y: 52 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/NormalizePlan.scala x: 2 # contributors y: 11 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 11 # contributors y: 28 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 75 # contributors y: 201 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 69 # contributors y: 174 # changes
python/pyspark/sql/streaming/stateful_processor.py x: 4 # contributors y: 14 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala x: 39 # contributors y: 69 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala x: 16 # contributors y: 28 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 115 # contributors y: 345 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala x: 5 # contributors y: 11 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 143 # contributors y: 353 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala x: 31 # contributors y: 57 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 73 # contributors y: 140 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/InsertAdaptiveSparkPlan.scala x: 21 # contributors y: 37 # changes
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala x: 30 # contributors y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala x: 27 # contributors y: 54 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala x: 22 # contributors y: 30 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala x: 30 # contributors y: 52 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala x: 63 # contributors y: 216 # changes
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 46 # contributors y: 82 # changes
python/pyspark/sql/column.py x: 43 # contributors y: 92 # changes
python/pyspark/sql/connect/column.py x: 9 # contributors y: 73 # changes
python/pyspark/sql/connect/expressions.py x: 8 # contributors y: 57 # changes
python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 14 # contributors y: 55 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 71 # contributors y: 167 # changes
python/pyspark/accumulators.py x: 41 # contributors y: 68 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 82 # contributors y: 189 # changes
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 78 # contributors y: 126 # changes
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala x: 45 # contributors y: 64 # changes
core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala x: 10 # contributors y: 19 # changes
python/pyspark/daemon.py x: 28 # contributors y: 44 # changes
python/pyspark/taskcontext.py x: 18 # contributors y: 30 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 61 # contributors y: 151 # changes
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 26 # contributors y: 53 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 56 # contributors y: 114 # changes
core/src/main/scala/org/apache/spark/executor/Executor.scala x: 138 # contributors y: 278 # changes
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 20 # contributors y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 46 # contributors y: 106 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala x: 33 # contributors y: 64 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 37 # contributors y: 73 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala x: 29 # contributors y: 63 # changes
python/pyspark/sql/dataframe.py x: 142 # contributors y: 383 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala x: 12 # contributors y: 18 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala x: 23 # contributors y: 38 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala x: 29 # contributors y: 49 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 96 # contributors y: 310 # changes
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/parsers.scala x: 9 # contributors y: 15 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 13 # contributors y: 36 # changes
core/src/main/scala/org/apache/spark/util/Utils.scala x: 205 # contributors y: 456 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala x: 22 # contributors y: 35 # changes
python/pyspark/testing/sqlutils.py x: 10 # contributors y: 25 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 42 # contributors y: 66 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 28 # contributors y: 62 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala x: 27 # contributors y: 45 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala x: 41 # contributors y: 58 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala x: 20 # contributors y: 35 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala x: 62 # contributors y: 145 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala x: 48 # contributors y: 87 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala x: 29 # contributors y: 53 # changes
dev/merge_spark_pr.py x: 36 # contributors y: 77 # changes
python/pyspark/errors/exceptions/base.py x: 7 # contributors y: 26 # changes
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java x: 27 # contributors y: 47 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 59 # contributors y: 119 # changes
launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java x: 21 # contributors y: 51 # changes
sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala x: 14 # contributors y: 23 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala x: 37 # contributors y: 47 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 90 # contributors y: 267 # changes
python/pyspark/sql/connect/session.py x: 24 # contributors y: 129 # changes
python/pyspark/ml/tuning.py x: 45 # contributors y: 83 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala x: 28 # contributors y: 52 # changes
common/utils/src/main/scala/org/apache/spark/internal/Logging.scala x: 9 # contributors y: 22 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 80 # contributors y: 247 # changes
python/pyspark/ml/clustering.py x: 33 # contributors y: 94 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala x: 31 # contributors y: 50 # changes
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 17 # contributors y: 32 # changes
project/MimaExcludes.scala x: 174 # contributors y: 462 # changes
core/src/main/resources/org/apache/spark/ui/static/webui.css x: 39 # contributors y: 56 # changes
core/src/main/scala/org/apache/spark/ui/UIUtils.scala x: 78 # contributors y: 118 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala x: 57 # contributors y: 130 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala x: 22 # contributors y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala x: 22 # contributors y: 28 # changes
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java x: 12 # contributors y: 36 # changes
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala x: 14 # contributors y: 44 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 83 # contributors y: 224 # changes
python/pyspark/sql/plot/core.py x: 2 # contributors y: 21 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala x: 19 # contributors y: 34 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala x: 45 # contributors y: 75 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala x: 23 # contributors y: 31 # changes
core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala x: 15 # contributors y: 17 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala x: 53 # contributors y: 98 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 80 # contributors y: 210 # changes
core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java x: 30 # contributors y: 41 # changes
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java x: 28 # contributors y: 49 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala x: 49 # contributors y: 96 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala x: 24 # contributors y: 44 # changes
core/src/main/scala/org/apache/spark/BarrierTaskContext.scala x: 18 # contributors y: 33 # changes
core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala x: 17 # contributors y: 19 # changes
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala x: 20 # contributors y: 27 # changes
mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala x: 13 # contributors y: 27 # changes
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala x: 25 # contributors y: 42 # changes
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala x: 34 # contributors y: 64 # changes
mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala x: 31 # contributors y: 42 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala x: 12 # contributors y: 23 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala x: 13 # contributors y: 17 # changes
python/pyspark/sql/connect/udf.py x: 5 # contributors y: 35 # changes
python/pyspark/sql/udf.py x: 26 # contributors y: 77 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 48 # contributors y: 95 # changes
python/pyspark/sql/utils.py x: 31 # contributors y: 97 # changes
python/pyspark/pandas/base.py x: 13 # contributors y: 59 # changes
core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 69 # contributors y: 134 # changes
python/packaging/classic/setup.py x: 6 # contributors y: 19 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala x: 32 # contributors y: 77 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala x: 19 # contributors y: 26 # changes
python/pyspark/sql/connect/conversion.py x: 10 # contributors y: 34 # changes
python/pyspark/sql/connect/dataframe.py x: 25 # contributors y: 194 # changes
python/pyspark/sql/connect/proto/relations_pb2.pyi x: 22 # contributors y: 91 # changes
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreter.scala x: 5 # contributors y: 17 # changes
core/src/main/scala/org/apache/spark/SparkContext.scala x: 237 # contributors y: 534 # changes
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingForeachBatchHelper.scala x: 6 # contributors y: 6 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanner.scala x: 20 # contributors y: 31 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala x: 5 # contributors y: 13 # changes
python/pyspark/ml/pipeline.py x: 24 # contributors y: 48 # changes
core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala x: 110 # contributors y: 226 # changes
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala x: 73 # contributors y: 118 # changes
python/pyspark/sql/session.py x: 64 # contributors y: 196 # changes
python/pyspark/sql/types.py x: 69 # contributors y: 186 # changes
python/pyspark/errors/utils.py x: 4 # contributors y: 19 # changes
python/pyspark/errors/__init__.py x: 4 # contributors y: 21 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala x: 47 # contributors y: 119 # changes
python/pyspark/shell.py x: 59 # contributors y: 94 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala x: 25 # contributors y: 37 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/hints.scala x: 17 # contributors y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCoercion.scala x: 3 # contributors y: 10 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala x: 18 # contributors y: 25 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala x: 81 # contributors y: 191 # changes
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala x: 53 # contributors y: 183 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala x: 31 # contributors y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala x: 65 # contributors y: 120 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala x: 27 # contributors y: 33 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala x: 23 # contributors y: 35 # changes
core/src/main/scala/org/apache/spark/deploy/master/Master.scala x: 98 # contributors y: 213 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala x: 27 # contributors y: 38 # changes
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala x: 87 # contributors y: 153 # changes
python/pyspark/sql/streaming/readwriter.py x: 9 # contributors y: 23 # changes
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java x: 20 # contributors y: 22 # changes
core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala x: 58 # contributors y: 99 # changes
python/pyspark/pandas/plot/core.py x: 12 # contributors y: 34 # changes
core/src/main/scala/org/apache/spark/SparkEnv.scala x: 96 # contributors y: 186 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonFunction.scala x: 15 # contributors y: 36 # changes
python/pyspark/sql/pandas/conversion.py x: 20 # contributors y: 56 # changes
sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala x: 34 # contributors y: 54 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTable.scala x: 8 # contributors y: 18 # changes
mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala x: 24 # contributors y: 31 # changes
python/pyspark/sql/udtf.py x: 11 # contributors y: 29 # changes
python/pyspark/sql/connect/proto/commands_pb2.pyi x: 18 # contributors y: 45 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala x: 36 # contributors y: 63 # changes
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala x: 33 # contributors y: 58 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala x: 23 # contributors y: 47 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala x: 21 # contributors y: 39 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala x: 30 # contributors y: 50 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala x: 42 # contributors y: 78 # changes
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala x: 52 # contributors y: 91 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala x: 38 # contributors y: 64 # changes
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala x: 55 # contributors y: 142 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala x: 32 # contributors y: 47 # changes
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala x: 39 # contributors y: 98 # changes
python/pyspark/java_gateway.py x: 52 # contributors y: 88 # changes
repl/src/main/scala/org/apache/spark/repl/Main.scala x: 21 # contributors y: 21 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala x: 41 # contributors y: 88 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala x: 56 # contributors y: 131 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala x: 33 # contributors y: 75 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala x: 55 # contributors y: 163 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala x: 74 # contributors y: 202 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala x: 44 # contributors y: 72 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetUtils.scala x: 15 # contributors y: 25 # changes
sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala x: 34 # contributors y: 58 # changes sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala x: 34 # contributors y: 74 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala x: 30 # contributors y: 37 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala x: 55 # contributors y: 167 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala x: 79 # contributors y: 223 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala x: 60 # contributors y: 116 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala x: 16 # contributors y: 23 # changes core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala x: 95 # contributors y: 172 # changes python/pyspark/ml/evaluation.py x: 28 # contributors y: 55 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala x: 34 # contributors y: 68 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala x: 46 # contributors y: 78 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala x: 15 # contributors y: 20 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala x: 48 # contributors y: 86 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala x: 25 # contributors y: 40 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala x: 18 # contributors y: 49 # changes python/pyspark/ml/recommendation.py x: 26 # contributors y: 48 # changes core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala x: 63 # contributors y: 94 # changes core/src/main/scala/org/apache/spark/SparkConf.scala 
x: 86 # contributors y: 151 # changes python/pyspark/version.py x: 12 # contributors y: 16 # changes common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationAwareUTF8String.java x: 7 # contributors y: 22 # changes core/src/main/scala/org/apache/spark/TaskContext.scala x: 43 # contributors y: 65 # changes core/src/main/scala/org/apache/spark/TaskContextImpl.scala x: 29 # contributors y: 45 # changes core/src/main/scala/org/apache/spark/scheduler/Task.scala x: 58 # contributors y: 94 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala x: 32 # contributors y: 84 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CharVarcharUtils.scala x: 10 # contributors y: 21 # changes python/pyspark/sql/readwriter.py x: 84 # contributors y: 189 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/python/BatchEvalPythonExec.scala x: 20 # contributors y: 24 # changes core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala x: 39 # contributors y: 59 # changes core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala x: 155 # contributors y: 354 # changes core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala x: 113 # contributors y: 200 # changes resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala x: 40 # contributors y: 65 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala x: 80 # contributors y: 203 # changes python/pyspark/ml/functions.py x: 11 # contributors y: 23 # changes mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala x: 42 # contributors y: 70 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala x: 20 # contributors y: 37 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala x: 29 # contributors y: 65 # changes graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala x: 
22 # contributors y: 24 # changes core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala x: 65 # contributors y: 103 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala x: 37 # contributors y: 60 # changes core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala x: 75 # contributors y: 125 # changes python/pyspark/sql/connect/readwriter.py x: 11 # contributors y: 41 # changes python/pyspark/ml/torch/distributor.py x: 9 # contributors y: 31 # changes core/src/main/scala/org/apache/spark/rdd/RDD.scala x: 137 # contributors y: 250 # changes core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala x: 24 # contributors y: 37 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala x: 26 # contributors y: 37 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceManager.scala x: 3 # contributors y: 18 # changes core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala x: 25 # contributors y: 25 # changes core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala x: 33 # contributors y: 35 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala x: 70 # contributors y: 164 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala x: 42 # contributors y: 85 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala x: 38 # contributors y: 90 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala x: 21 # contributors y: 32 # changes core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala x: 48 # contributors y: 69 # changes core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala x: 14 # contributors y: 28 # changes core/src/main/scala/org/apache/spark/rdd/PipedRDD.scala x: 37 # contributors y: 44 # 
changes core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala x: 18 # contributors y: 21 # changes core/src/main/scala/org/apache/spark/storage/BlockManager.scala x: 130 # contributors y: 283 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala x: 33 # contributors y: 71 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala x: 37 # contributors y: 66 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala x: 37 # contributors y: 64 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala x: 49 # contributors y: 85 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala x: 42 # contributors y: 87 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala x: 24 # contributors y: 26 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala x: 22 # contributors y: 80 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala x: 62 # contributors y: 115 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/client/package.scala x: 15 # contributors y: 29 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala x: 21 # contributors y: 23 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala x: 61 # contributors y: 124 # changes core/src/main/scala/org/apache/spark/executor/ExecutorExitCode.scala x: 14 # contributors y: 13 # changes core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala x: 40 # contributors y: 55 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala x: 43 # contributors y: 76 # changes 
sql/core/src/main/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuery.scala x: 19 # contributors y: 23 # changes python/pyspark/pandas/data_type_ops/datetime_ops.py x: 7 # contributors y: 25 # changes python/pyspark/pandas/generic.py x: 16 # contributors y: 69 # changes python/pyspark/pandas/internal.py x: 10 # contributors y: 45 # changes python/pyspark/pandas/window.py x: 10 # contributors y: 29 # changes common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java x: 46 # contributors y: 77 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala x: 31 # contributors y: 54 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala x: 45 # contributors y: 57 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala x: 17 # contributors y: 36 # changes python/pyspark/mllib/classification.py x: 38 # contributors y: 65 # changes python/pyspark/mllib/feature.py x: 27 # contributors y: 57 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileDataSourceV2.scala x: 8 # contributors y: 22 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala x: 17 # contributors y: 48 # changes core/src/main/scala/org/apache/spark/TestUtils.scala x: 35 # contributors y: 61 # changes core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala x: 33 # contributors y: 49 # changes core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala x: 57 # contributors y: 81 # changes core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala x: 16 # contributors y: 17 # changes resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala x: 56 # contributors y: 107 # changes sql/gen-sql-api-docs.py x: 7 # contributors y: 6 # changes core/src/main/scala/org/apache/spark/scheduler/DAGSchedulerEvent.scala x: 49 # contributors y: 
66 # changes core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala x: 58 # contributors y: 100 # changes core/src/main/scala/org/apache/spark/storage/DiskStore.scala x: 51 # contributors y: 76 # changes core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala x: 90 # contributors y: 178 # changes core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala x: 46 # contributors y: 90 # changes core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala x: 39 # contributors y: 49 # changes core/src/main/scala/org/apache/spark/ui/JettyUtils.scala x: 55 # contributors y: 83 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala x: 28 # contributors y: 41 # changes core/src/main/scala/org/apache/spark/FutureAction.scala x: 34 # contributors y: 40 # changes core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala x: 27 # contributors y: 30 # changes python/pyspark/sql/connect/types.py x: 11 # contributors y: 35 # changes core/src/main/scala/org/apache/spark/util/logging/FileAppender.scala x: 15 # contributors y: 16 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansModel.scala x: 30 # contributors y: 47 # changes mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala x: 41 # contributors y: 53 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala x: 25 # contributors y: 32 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/DStreamCheckpointData.scala x: 19 # contributors y: 21 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala x: 45 # contributors y: 61 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala x: 39 # contributors y: 53 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala x: 21 # contributors 
y: 26 # changes sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala x: 29 # contributors y: 39 # changes core/src/main/scala/org/apache/spark/Partitioner.scala x: 47 # contributors y: 61 # changes core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala x: 114 # contributors y: 215 # changes streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala x: 29 # contributors y: 47 # changes streaming/src/main/scala/org/apache/spark/streaming/util/RawTextSender.scala x: 23 # contributors y: 23 # changes streaming/src/main/scala/org/apache/spark/streaming/util/RecurringTimer.scala x: 18 # contributors y: 18 # changes python/pyspark/pandas/spark/accessors.py x: 12 # contributors y: 30 # changes python/pyspark/pandas/indexes/base.py x: 14 # contributors y: 63 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala x: 17 # contributors y: 34 # changes core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala x: 44 # contributors y: 62 # changes streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala x: 66 # contributors y: 101 # changes streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala x: 70 # contributors y: 135 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala x: 26 # contributors y: 30 # changes core/src/main/scala/org/apache/spark/status/api/v1/api.scala x: 42 # contributors y: 61 # changes core/src/main/scala/org/apache/spark/io/CompressionCodec.scala x: 41 # contributors y: 55 # changes python/pyspark/sql/context.py x: 51 # contributors y: 115 # changes core/src/main/scala/org/apache/spark/status/AppStatusListener.scala x: 42 # contributors y: 72 # changes core/src/main/scala/org/apache/spark/status/AppStatusStore.scala x: 30 # contributors y: 57 # changes core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala x: 39 # 
contributors y: 78 # changes python/pyspark/pandas/data_type_ops/base.py x: 9 # contributors y: 37 # changes python/pyspark/pandas/indexes/multi.py x: 12 # contributors y: 41 # changes core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala x: 42 # contributors y: 53 # changes core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala x: 68 # contributors y: 129 # changes core/src/main/scala/org/apache/spark/deploy/worker/CommandUtils.scala x: 28 # contributors y: 34 # changes core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala x: 56 # contributors y: 85 # changes core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala x: 26 # contributors y: 41 # changes core/src/main/scala/org/apache/spark/MapOutputTracker.scala x: 85 # contributors y: 146 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala x: 40 # contributors y: 86 # changes core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala x: 49 # contributors y: 72 # changes core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala x: 29 # contributors y: 33 # changes core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala x: 58 # contributors y: 90 # changes core/src/main/scala/org/apache/spark/ui/WebUI.scala x: 37 # contributors y: 51 # changes core/src/main/scala/org/apache/spark/deploy/Client.scala x: 38 # contributors y: 51 # changes core/src/main/scala/org/apache/spark/ui/SparkUI.scala x: 60 # contributors y: 85 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala x: 25 # contributors y: 48 # changes core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala x: 29 # contributors y: 37 # changes core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala x: 79 # contributors y: 149 # changes core/src/main/scala/org/apache/spark/util/SizeEstimator.scala x: 40 # contributors y: 53 # changes python/pyspark/errors/error_classes.py x: 18 # 
contributors y: 64 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala x: 26 # contributors y: 28 # changes core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala x: 33 # contributors y: 37 # changes mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala x: 47 # contributors y: 83 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala x: 53 # contributors y: 80 # changes python/pyspark/__init__.py x: 54 # contributors y: 82 # changes python/pyspark/mllib/clustering.py x: 41 # contributors y: 77 # changes python/pyspark/mllib/util.py x: 19 # contributors y: 39 # changes python/pyspark/serializers.py x: 58 # contributors y: 139 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala x: 15 # contributors y: 43 # changes resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala x: 28 # contributors y: 37 # changes sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala x: 30 # contributors y: 96 # changes core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala x: 24 # contributors y: 35 # changes core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala x: 88 # contributors y: 180 # changes core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala x: 63 # contributors y: 106 # changes dev/run-tests.py x: 52 # contributors y: 147 # changes core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala x: 41 # contributors y: 62 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcScan.scala x: 13 # contributors y: 24 # changes core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala x: 46 # contributors y: 67 # changes core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala x: 31 # contributors y: 40 # changes core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
x: 51 # contributors y: 88 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala x: 46 # contributors y: 84 # changes core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala x: 24 # contributors y: 24 # changes core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala x: 54 # contributors y: 97 # changes core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala x: 30 # contributors y: 35 # changes mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala x: 59 # contributors y: 133 # changes mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala x: 38 # contributors y: 61 # changes streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala x: 32 # contributors y: 50 # changes core/src/main/scala/org/apache/spark/util/Clock.scala x: 10 # contributors y: 9 # changes core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala x: 35 # contributors y: 45 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala x: 23 # contributors y: 43 # changes core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala x: 33 # contributors y: 47 # changes streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala x: 45 # contributors y: 90 # changes core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala x: 53 # contributors y: 89 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala x: 21 # contributors y: 79 # changes core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala x: 44 # contributors y: 58 # changes python/pyspark/storagelevel.py x: 23 # contributors y: 28 # changes mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala x: 16 # contributors y: 35 # changes mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala x: 22 # contributors y: 52 # changes 
mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala x: 18 # contributors y: 41 # changes mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala x: 36 # contributors y: 54 # changes mllib/src/main/scala/org/apache/spark/mllib/optimization/Gradient.scala x: 28 # contributors y: 30 # changes mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala x: 44 # contributors y: 69 # changes core/src/main/scala/org/apache/spark/scheduler/Stage.scala x: 33 # contributors y: 55 # changes core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala x: 34 # contributors y: 45 # changes core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala x: 40 # contributors y: 69 # changes sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala x: 33 # contributors y: 69 # changes sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MariaDBConnectionProvider.scala x: 1 # contributors y: 8 # changes core/src/main/resources/org/apache/spark/ui/static/sorttable.js x: 13 # contributors y: 12 # changes core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala x: 34 # contributors y: 48 # changes sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala x: 22 # contributors y: 58 # changes core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala x: 32 # contributors y: 45 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/ConstantInputDStream.scala x: 19 # contributors y: 19 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/QueueInputDStream.scala x: 22 # contributors y: 22 # changes core/src/main/scala/org/apache/spark/util/CompletionIterator.scala x: 14 # contributors y: 12 # changes streaming/src/main/scala/org/apache/spark/streaming/dstream/StateDStream.scala x: 25 # contributors y: 28 # changes core/src/main/scala/org/apache/spark/deploy/master/RecoveryState.scala x: 16 # contributors y: 16 # 
changes
# changes (y axis): min: 1.0 | average: 19.31 | 25th percentile: 2.0 | median: 7.0 | 75th percentile: 20.0 | max: 818.0
# contributors (x axis): min: 1.0 | average: 10.87 | 25th percentile: 2.0 | median: 5.0 | 75th percentile: 13.0 | max: 237.0

Number of Contributors vs. File Size: 4071 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 25 # contributors y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 27 # contributors y: 261 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 8 # contributors y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 76 # contributors y: 641 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 76 # contributors y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 62 # contributors y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 74 # contributors y: 847 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 3 # contributors y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 36 # contributors y: 150 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 2 # contributors y: 135 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 26 # contributors y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala x: 8 # contributors y: 225 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 31 # contributors y: 270 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 18 # contributors y: 312 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 52 # contributors y: 929 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 25 # contributors y: 255 lines 
of code mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 30 # contributors y: 346 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 21 # contributors y: 201 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 26 # contributors y: 455 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 36 # contributors y: 518 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala x: 28 # contributors y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 27 # contributors y: 237 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 31 # contributors y: 967 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 51 # contributors y: 594 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 29 # contributors y: 219 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala x: 16 # contributors y: 351 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala x: 29 # contributors y: 827 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 16 # contributors y: 331 lines of code mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 3 # contributors y: 21 lines of code python/pyspark/testing/connectutils.py x: 14 # contributors y: 177 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 8 # contributors y: 325 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 4 # contributors y: 187 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLHandler.scala x: 3 # contributors y: 323 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 32 # contributors y: 179 lines of code core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 3 # contributors y: 79 lines of code python/pyspark/errors/exceptions/captured.py x: 6 # contributors y: 284 lines of code python/pyspark/pandas/config.py x: 12 # contributors y: 363 lines of code python/pyspark/pandas/groupby.py x: 20 # contributors y: 1800 lines of code python/pyspark/pandas/namespace.py x: 21 # contributors y: 1517 lines of code python/pyspark/pandas/series.py x: 19 # contributors y: 2215 lines of code python/pyspark/pandas/utils.py x: 10 # contributors y: 657 lines of code python/pyspark/testing/pandasutils.py x: 10 # contributors y: 486 lines of code python/pyspark/testing/utils.py x: 16 # contributors y: 560 lines of code python/pyspark/sql/conversion.py x: 2 # contributors y: 415 lines of code python/pyspark/sql/pandas/serializers.py x: 20 # contributors y: 884 lines of code python/pyspark/worker.py x: 77 # contributors y: 1728 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 225 # contributors y: 5815 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala x: 17 # contributors y: 97 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraints.scala x: 2 # contributors y: 174 lines of code python/pyspark/sql/datasource.py x: 6 # contributors y: 256 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 196 # contributors y: 2889 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 69 # contributors y: 617 lines of code common/utils/src/main/scala/org/apache/spark/util/SparkStringUtils.scala x: 2 # contributors y: 8 lines of code common/utils/src/main/scala/org/apache/spark/util/SparkTestUtils.scala x: 2 # contributors y: 69 lines of code 
connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala x: 12 # contributors y: 105 lines of code
connector/avro/src/main/scala/org/apache/spark/sql/avro/CatalystDataToAvro.scala x: 4 # contributors y: 39 lines of code
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 4 # contributors y: 58 lines of code
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaDelegationTokenProvider.scala x: 3 # contributors y: 62 lines of code
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaRedactionUtil.scala x: 2 # contributors y: 31 lines of code
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala x: 5 # contributors y: 230 lines of code
project/SparkBuild.scala x: 222 # contributors y: 1460 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 51 # contributors y: 474 lines of code
python/pyspark/sql/pandas/types.py x: 17 # contributors y: 920 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateResolver.scala x: 1 # contributors y: 192 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolutionContext.scala x: 1 # contributors y: 34 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 1 # contributors y: 527 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/FunctionResolver.scala x: 1 # contributors y: 102 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 1 # contributors y: 406 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala x: 6 # contributors y: 514 lines of code
mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala x: 27 # contributors y: 436 lines of code
mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 32 # contributors y: 533 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala x: 12 # contributors y: 150 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 20 # contributors y: 108 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/IDF.scala x: 21 # contributors y: 157 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 13 # contributors y: 122 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 11 # contributors y: 161 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/MinMaxScaler.scala x: 22 # contributors y: 161 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 23 # contributors y: 378 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala x: 22 # contributors y: 133 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala x: 31 # contributors y: 384 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/RobustScaler.scala x: 7 # contributors y: 180 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/StandardScaler.scala x: 23 # contributors y: 217 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 35 # contributors y: 405 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/TargetEncoder.scala x: 4 # contributors y: 301 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala x: 8 # contributors y: 308 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/VarianceThresholdSelector.scala x: 6 # contributors y: 134 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala x: 24 # contributors y: 370 lines of code
mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 48 # contributors y: 1062 lines of code
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 29 # contributors y: 345 lines of code
mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 9 # contributors y: 478 lines of code
mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 26 # contributors y: 199 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 9 # contributors y: 443 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala x: 59 # contributors y: 483 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingQueryCheckpointMetadata.scala x: 1 # contributors y: 21 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 7 # contributors y: 348 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/ColumnDefinition.scala x: 4 # contributors y: 177 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 111 # contributors y: 920 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala x: 26 # contributors y: 102 lines of code
python/pyspark/sql/connect/functions/builtin.py x: 23 # contributors y: 2417 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 64 # contributors y: 998 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala x: 23 # contributors y: 112 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveTableSpec.scala x: 5 # contributors y: 84 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala x: 17 # contributors y: 164 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala x: 11 # contributors y: 223 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 45 # contributors y: 1148 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateTableExec.scala x: 10 # contributors y: 42 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 11 # contributors y: 89 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 47 # contributors y: 555 lines of code
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 54 # contributors y: 333 lines of code
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala x: 26 # contributors y: 344 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala x: 14 # contributors y: 214 lines of code
mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala x: 32 # contributors y: 250 lines of code
mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala x: 19 # contributors y: 263 lines of code
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala x: 32 # contributors y: 304 lines of code
mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 22 # contributors y: 278 lines of code
mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala x: 25 # contributors y: 524 lines of code
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala x: 7 # contributors y: 202 lines of code
dev/sparktestsupport/modules.py x: 85 # contributors y: 1409 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 59 # contributors y: 441 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 148 # contributors y: 1683 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala x: 37 # contributors y: 81 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/UnionLoopExec.scala x: 2 # contributors y: 152 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala x: 12 # contributors y: 30 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 103 # contributors y: 1503 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 45 # contributors y: 305 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala x: 58 # contributors y: 558 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala x: 28 # contributors y: 464 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonArrowOutput.scala x: 7 # contributors y: 243 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala x: 31 # contributors y: 528 lines of code
python/pyspark/pandas/accessors.py x: 9 # contributors y: 434 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 58 # contributors y: 1547 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala x: 72 # contributors y: 399 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 20 # contributors y: 3468 lines of code
sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala x: 7 # contributors y: 1498 lines of code
sql/core/src/main/protobuf/org/apache/spark/sql/execution/streaming/StateMessage.proto x: 4 # contributors y: 219 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 2 # contributors y: 754 lines of code
common/utils/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala x: 2 # contributors y: 272 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala x: 10 # contributors y: 539 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala x: 6 # contributors y: 367 lines of code
sql/core/src/main/scala/org/apache/spark/sql/classic/Catalog.scala x: 4 # contributors y: 568 lines of code
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 47 # contributors y: 733 lines of code
python/pyspark/ml/classification.py x: 51 # contributors y: 2173 lines of code
python/pyspark/ml/connect/readwrite.py x: 4 # contributors y: 290 lines of code
python/pyspark/ml/feature.py x: 69 # contributors y: 3621 lines of code
python/pyspark/ml/regression.py x: 43 # contributors y: 1554 lines of code
python/pyspark/ml/util.py x: 30 # contributors y: 714 lines of code
python/pyspark/ml/wrapper.py x: 24 # contributors y: 245 lines of code
python/pyspark/sql/connect/client/core.py x: 23 # contributors y: 1449 lines of code
python/pyspark/sql/connect/group.py x: 9 # contributors y: 488 lines of code
python/pyspark/sql/connect/plan.py x: 31 # contributors y: 2133 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 32 # contributors y: 693 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreConf.scala x: 16 # contributors y: 37 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala x: 14 # contributors y: 361 lines of code
project/plugins.sbt x: 66 # contributors y: 14 lines of code
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 9 # contributors y: 123 lines of code
sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala x: 7 # contributors y: 210 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 109 # contributors y: 2609 lines of code
python/pyspark/sql/pandas/functions.py x: 15 # contributors y: 160 lines of code
python/pyspark/sql/pandas/group_ops.py x: 13 # contributors y: 252 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala x: 40 # contributors y: 369 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalog.scala x: 15 # contributors y: 360 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 29 # contributors y: 1425 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreChangelog.scala x: 14 # contributors y: 413 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala x: 35 # contributors y: 368 lines of code
python/pyspark/ml/connect/functions.py x: 2 # contributors y: 47 lines of code
python/pyspark/sql/connect/tvf.py x: 2 # contributors y: 101 lines of code
core/src/main/scala/org/apache/spark/deploy/ExternalShuffleService.scala x: 31 # contributors y: 129 lines of code
core/src/main/scala/org/apache/spark/internal/config/package.scala x: 124 # contributors y: 2456 lines of code
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 104 # contributors y: 886 lines of code
core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala x: 35 # contributors y: 79 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala x: 41 # contributors y: 637 lines of code
python/pyspark/sql/streaming/list_state_client.py x: 3 # contributors y: 164 lines of code
python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 4 # contributors y: 1116 lines of code
python/pyspark/sql/streaming/stateful_processor_api_client.py x: 6 # contributors y: 392 lines of code
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala x: 4 # contributors y: 116 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 6 # contributors y: 458 lines of code
python/pyspark/sql/functions/__init__.py x: 3 # contributors y: 466 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolverGuard.scala x: 3 # contributors y: 369 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala x: 7 # contributors y: 28 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ToStringBase.scala x: 10 # contributors y: 422 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala x: 9 # contributors y: 96 lines of code
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala x: 3 # contributors y: 404 lines of code
sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 3 # contributors y: 314 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/HiveResult.scala x: 16 # contributors y: 110 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 53 # contributors y: 553 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala x: 24 # contributors y: 225 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala x: 14 # contributors y: 55 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala x: 18 # contributors y: 366 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala x: 16 # contributors y: 406 lines of code
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 38 # contributors y: 2122 lines of code
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 5 # contributors y: 144 lines of code
sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 18 # contributors y: 681 lines of code
sql/api/src/main/scala/org/apache/spark/sql/types/StructField.scala x: 7 # contributors y: 148 lines of code
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala x: 13 # contributors y: 406 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SQLFunction.scala x: 2 # contributors y: 204 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AbstractSqlParser.scala x: 7 # contributors y: 85 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 150 # contributors y: 4715 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserInterface.scala x: 11 # contributors y: 23 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 94 # contributors y: 1040 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 2 # contributors y: 279 lines of code
python/pyspark/ml/connect/tuning.py x: 6 # contributors y: 318 lines of code
python/run-tests.py x: 30 # contributors y: 287 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 67 # contributors y: 991 lines of code
mllib/src/main/scala/org/apache/spark/ml/Model.scala x: 6 # contributors y: 13 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala x: 5 # contributors y: 477 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 36 # contributors y: 417 lines of code
core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 78 # contributors y: 687 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala x: 5 # contributors y: 339 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StreamStreamJoinStatePartitionReader.scala x: 5 # contributors y: 123 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkExec.scala x: 2 # contributors y: 403 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 38 # contributors y: 467 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala x: 19 # contributors y: 543 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateVariableUtils.scala x: 4 # contributors y: 154 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 23 # contributors y: 793 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 32 # contributors y: 1059 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala x: 26 # contributors y: 314 lines of code
python/pyspark/sql/connect/proto/ml_pb2.pyi x: 2 # contributors y: 465 lines of code
sql/connect/common/src/main/protobuf/spark/connect/ml.proto x: 3 # contributors y: 114 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala x: 33 # contributors y: 393 lines of code
core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala x: 11 # contributors y: 274 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 11 # contributors y: 873 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 75 # contributors y: 3066 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 69 # contributors y: 432 lines of code
python/pyspark/sql/pandas/_typing/__init__.pyi x: 7 # contributors y: 322 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala x: 39 # contributors y: 439 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala x: 16 # contributors y: 190 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 115 # contributors y: 806 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkDeserializer.scala x: 1 # contributors y: 48 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkPythonRunner.scala x: 1 # contributors y: 291 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala x: 5 # contributors y: 101 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 143 # contributors y: 961 lines of code
python/pyspark/core/context.py x: 5 # contributors y: 784 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala x: 31 # contributors y: 166 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 73 # contributors y: 440 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/InsertAdaptiveSparkPlan.scala x: 21 # contributors y: 102 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala x: 9 # contributors y: 54 lines of code
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala x: 30 # contributors y: 140 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/binaryfile/BinaryFileFormat.scala x: 11 # contributors y: 133 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala x: 27 # contributors y: 127 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala x: 22 # contributors y: 107 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/xml/XmlFileFormat.scala x: 6 # contributors y: 106 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala x: 31 # contributors y: 516 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala x: 63 # contributors y: 551 lines of code
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 46 # contributors y: 818 lines of code
python/pyspark/sql/classic/column.py x: 5 # contributors y: 490 lines of code
python/pyspark/sql/column.py x: 43 # contributors y: 317 lines of code
python/pyspark/sql/connect/expressions.py x: 8 # contributors y: 1039 lines of code
python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 14 # contributors y: 1764 lines of code
sql/api/src/main/scala/org/apache/spark/sql/Column.scala x: 4 # contributors y: 274 lines of code
sql/api/src/main/scala/org/apache/spark/sql/internal/columnNodes.scala x: 4 # contributors y: 408 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 71 # contributors y: 1057 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 2 # contributors y: 1025 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 6 # contributors y: 598 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/columnNodeSupport.scala x: 3 # contributors y: 248 lines of code
python/pyspark/accumulators.py x: 41 # contributors y: 173 lines of code
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java x: 5 # contributors y: 396 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XmlExpressionEvalUtils.scala x: 3 # contributors y: 159 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xmlExpressions.scala x: 12 # contributors y: 204 lines of code
core/src/main/scala/org/apache/spark/internal/config/Python.scala x: 6 # contributors y: 86 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 82 # contributors y: 613 lines of code
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 78 # contributors y: 575 lines of code
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala x: 45 # contributors y: 399 lines of code
core/src/main/scala/org/apache/spark/api/python/PythonWorkerUtils.scala x: 3 # contributors y: 118 lines of code
core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala x: 10 # contributors y: 123 lines of code
core/src/main/scala/org/apache/spark/api/r/RRDD.scala x: 18 # contributors y: 119 lines of code
python/pyspark/daemon.py x: 28 # contributors y: 171 lines of code
python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py x: 5 # contributors y: 59 lines of code
python/pyspark/sql/connect/streaming/worker/listener_worker.py x: 4 # contributors y: 72 lines of code
python/pyspark/sql/streaming/python_streaming_source_runner.py x: 5 # contributors y: 161 lines of code
python/pyspark/sql/worker/plan_data_source_read.py x: 7 # contributors y: 301 lines of code
python/pyspark/sql/worker/write_into_data_source.py x: 6 # contributors y: 179 lines of code
python/pyspark/taskcontext.py x: 18 # contributors y: 147 lines of code
sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala x: 14 # contributors y: 139 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 61 # contributors y: 813 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala x: 27 # contributors y: 77 lines of code
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 x: 25 # contributors y: 607 lines of code
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 26 # contributors y: 424 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 56 # contributors y: 783 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/UserDefinedPythonDataSource.scala x: 5 # contributors y: 437 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchEvaluatorFactory.scala x: 6 # contributors y: 68 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchExec.scala x: 10 # contributors y: 56 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala x: 7 # contributors y: 258 lines of code
core/src/main/scala/org/apache/spark/executor/Executor.scala x: 138 # contributors y: 952 lines of code
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala x: 11 # contributors y: 426 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala x: 30 # contributors y: 133 lines of code
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 20 # contributors y: 861 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 46 # contributors y: 701 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala x: 33 # contributors y: 446 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 37 # contributors y: 801 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala x: 21 # contributors y: 755 lines of code
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalogCapability.java x: 4 # contributors y: 9 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala x: 29 # contributors y: 663 lines of code
python/pyspark/sql/dataframe.py x: 142 # contributors y: 851 lines of code
python/pyspark/ml/connect/feature.py x: 3 # contributors y: 216 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala x: 12 # contributors y: 210 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 21 # contributors y: 783 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala x: 23 # contributors y: 284 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala x: 29 # contributors y: 190 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 96 # contributors y: 3781 lines of code
sql/core/src/main/scala/org/apache/spark/sql/classic/DataFrameReader.scala x: 2 # contributors y: 224 lines of code
sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamWriter.scala x: 2 # contributors y: 323 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateExpressionResolver.scala x: 1 # contributors y: 154 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/BinaryArithmeticResolver.scala x: 1 # contributors y: 90 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionIdAssigner.scala x: 1 # contributors y: 210 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/JoinResolver.scala x: 1 # contributors y: 180 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ProjectResolver.scala x: 1 # contributors y: 119 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolutionValidator.scala x: 1 # contributors y: 239 lines of code
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/parsers.scala x: 9 # contributors y: 313 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala x: 11 # contributors y: 319 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala x: 10 # contributors y: 146 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 13 # contributors y: 743 lines of code
core/src/main/scala/org/apache/spark/util/Utils.scala x: 205 # contributors y: 2154 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala x: 22 # contributors y: 210 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala x: 25 # contributors y: 456 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala x: 18 # contributors y: 239 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 42 # contributors y: 462 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 28 # contributors y: 704 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala x: 27 # contributors y: 470 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala x: 41 # contributors y: 827 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala x: 20 # contributors y: 125 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala x: 62 # contributors y: 249 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionHelper.scala x: 6 # contributors y: 535 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala x: 48 # contributors y: 400 lines of code
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java x: 27 # contributors y: 486 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala x: 56 # contributors y: 592 lines of code
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 12 # contributors y: 1454 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 59 # contributors y: 374 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 29 # contributors y: 675 lines of code
launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java x: 21 # contributors y: 256 lines of code
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java x: 15 # contributors y: 78 lines of code
sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala x: 14 # contributors y: 369 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala x: 8 # contributors y: 87 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala x: 37 # contributors y: 202 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 90 # contributors y: 1905 lines of code
python/pyspark/sql/connect/session.py x: 24 # contributors y: 838 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala x: 17 # contributors y: 72 lines of code
python/pyspark/ml/tuning.py x: 45 # contributors y: 1133 lines of code
python/pyspark/sql/connect/client/reattach.py x: 9 # contributors y: 218 lines of code
core/src/main/scala/org/apache/spark/util/Distribution.scala x: 13 # contributors y: 40 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala x: 25 # contributors y: 399 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala x: 16 # contributors y: 476 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 80 # contributors y: 1210 lines of code
python/pyspark/ml/clustering.py x: 33 # contributors y: 1000 lines of code
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 17 # contributors y: 578 lines of code
mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 29 # contributors y: 603 lines of code
project/MimaExcludes.scala x: 174 # contributors y: 202 lines of code
core/src/main/resources/org/apache/spark/ui/static/webui.css x: 39 # contributors y: 403 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala x: 57 # contributors y: 486 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala x: 22 # contributors y: 193 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala x: 22 # contributors y: 432 lines of code
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java x: 12 # contributors y: 313 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala x: 14 # contributors y: 230 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ValidateSubqueryExpression.scala x: 2 # contributors y: 301 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 83 # contributors y: 1490 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala x: 15 # contributors y: 190 lines of code
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 14 # contributors y: 778 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala x: 45 # contributors y: 337 lines of code
sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala x: 11 # contributors y: 201 lines of code
core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala x: 15 # contributors y: 165 lines of code
sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 14 # contributors y: 1882 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala x: 53 # contributors y: 436 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectListenerBusListener.scala x: 4 # contributors y: 103 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectSessionManager.scala x: 6 # contributors y: 216 lines of code
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamingQueryCache.scala x: 4 # contributors y: 237 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 80 # contributors y: 1102 lines of code
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala x: 7 # contributors y: 432 lines of code
core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java x: 30 # contributors y: 315 lines of code
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java x: 28 # contributors y: 260 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala x: 49 # contributors y: 695 lines of code
core/src/main/scala/org/apache/spark/util/ThreadUtils.scala x: 25 # contributors y: 247 lines of code
core/src/main/scala/org/apache/spark/BarrierCoordinator.scala x: 11 # contributors y: 149 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala x: 24 # contributors y: 454 lines of code
core/src/main/scala/org/apache/spark/BarrierTaskContext.scala x: 18 # contributors y: 165 lines of code
core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala x: 17 # contributors y: 67 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala x: 28 # contributors y: 191 lines of code
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala x: 35 # contributors y: 311 lines of code
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala x: 20 # contributors y: 362 lines of code
mllib/src/main/scala/org/apache/spark/ml/optim/loss/RDDLossFunction.scala x: 5 # contributors y: 32 lines of code
mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala x: 13 # contributors y: 544 lines of code
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala x: 19 # contributors y: 196 lines of code
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala x: 25 # contributors y: 370 lines of code
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala x: 34 # contributors y: 560 lines of code
mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala x: 31 # contributors y: 184 lines of code
mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala x: 19 # contributors y: 143 lines of code
mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala x: 12 # contributors y: 76 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala x: 7 # contributors y: 120 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala x: 10 # contributors y: 407 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala x: 12 # contributors y: 418 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala x: 13 # contributors y: 323 lines of code python/pyspark/sql/connect/udf.py x: 5 # contributors y: 226 lines of code python/pyspark/sql/udf.py x: 26 # contributors y: 464 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 48 # contributors y: 1277 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala x: 21 # contributors y: 394 lines of code python/pyspark/ml/base.py x: 12 # contributors y: 174 lines of code python/pyspark/sql/utils.py x: 31 # contributors y: 289 lines of code python/pyspark/pandas/base.py x: 13 # contributors y: 601 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PushVariantIntoScan.scala x: 2 # contributors y: 233 lines of code core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 69 # contributors y: 1419 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala x: 3 # contributors y: 581 lines of code sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala x: 7 # contributors y: 707 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala x: 32 # contributors y: 189 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowDeserializer.scala x: 4 # contributors y: 496 lines of code 
scalastyle-config.xml x: 29 # contributors y: 337 lines of code common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java x: 4 # contributors y: 450 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkShreddingUtils.scala x: 3 # contributors y: 672 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala x: 19 # contributors y: 209 lines of code python/pyspark/sql/connect/dataframe.py x: 25 # contributors y: 1945 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/SparkSession.scala x: 4 # contributors y: 665 lines of code python/pyspark/sql/connect/proto/relations_pb2.pyi x: 22 # contributors y: 3616 lines of code sql/connect/common/src/main/protobuf/spark/connect/relations.proto x: 7 # contributors y: 984 lines of code core/src/main/scala/org/apache/spark/SparkContext.scala x: 237 # contributors y: 1923 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala x: 28 # contributors y: 421 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanner.scala x: 20 # contributors y: 60 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanPartitioningAndOrdering.scala x: 7 # contributors y: 48 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ResolveWriteToStream.scala x: 11 # contributors y: 99 lines of code python/pyspark/ml/pipeline.py x: 24 # contributors y: 253 lines of code core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala x: 110 # contributors y: 883 lines of code core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala x: 73 # contributors y: 510 lines of code launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java x: 17 # contributors y: 270 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala x: 6 
# contributors y: 1126 lines of code python/pyspark/sql/session.py x: 64 # contributors y: 921 lines of code python/pyspark/sql/types.py x: 69 # contributors y: 1984 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala x: 47 # contributors y: 458 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Mode.scala x: 8 # contributors y: 268 lines of code python/pyspark/shell.py x: 59 # contributors y: 88 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HintErrorLogger.scala x: 6 # contributors y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala x: 25 # contributors y: 198 lines of code launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java x: 8 # contributors y: 30 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala x: 18 # contributors y: 14 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/maskExpressions.scala x: 9 # contributors y: 262 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala x: 81 # contributors y: 2968 lines of code sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala x: 53 # contributors y: 613 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala x: 31 # contributors y: 104 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala x: 65 # contributors y: 611 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/PartitionedFileUtil.scala x: 8 # contributors y: 60 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala x: 20 # contributors y: 147 lines of code core/src/main/scala/org/apache/spark/scheduler/JobResult.scala x: 11 # contributors y: 8 
lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala x: 31 # contributors y: 240 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala x: 23 # contributors y: 166 lines of code core/src/main/scala/org/apache/spark/deploy/master/Master.scala x: 98 # contributors y: 1095 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtils.java x: 10 # contributors y: 259 lines of code core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala x: 87 # contributors y: 1222 lines of code python/pyspark/sql/streaming/readwriter.py x: 9 # contributors y: 612 lines of code mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala x: 21 # contributors y: 245 lines of code common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java x: 20 # contributors y: 252 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TTLState.scala x: 5 # contributors y: 271 lines of code core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala x: 58 # contributors y: 260 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/CoGroupedArrowPythonRunner.scala x: 7 # contributors y: 97 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDFRunner.scala x: 10 # contributors y: 170 lines of code launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java x: 13 # contributors y: 240 lines of code core/src/main/scala/org/apache/spark/SparkEnv.scala x: 96 # contributors y: 414 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java x: 23 # contributors y: 400 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java x: 9 # contributors y: 168 lines of code sql/api/src/main/scala/org/apache/spark/sql/SparkSession.scala x: 1 # contributors y: 313 
lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonFunction.scala x: 15 # contributors y: 213 lines of code core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala x: 16 # contributors y: 122 lines of code python/pyspark/ml/fpm.py x: 17 # contributors y: 243 lines of code python/pyspark/pandas/typedef/typehints.py x: 15 # contributors y: 409 lines of code python/pyspark/sql/pandas/conversion.py x: 20 # contributors y: 610 lines of code sql/api/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala x: 10 # contributors y: 221 lines of code sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala x: 34 # contributors y: 188 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala x: 16 # contributors y: 379 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/r/ArrowRRunner.scala x: 6 # contributors y: 147 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala x: 24 # contributors y: 141 lines of code python/pyspark/sql/classic/dataframe.py x: 10 # contributors y: 1539 lines of code python/pyspark/sql/udtf.py x: 11 # contributors y: 275 lines of code python/pyspark/sql/connect/proto/commands_pb2.pyi x: 18 # contributors y: 2053 lines of code sql/connect/common/src/main/protobuf/spark/connect/commands.proto x: 3 # contributors y: 448 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala x: 36 # contributors y: 135 lines of code connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReaderConsumer.scala x: 7 # contributors y: 466 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala x: 33 # contributors y: 347 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala x: 23 # contributors y: 417 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala x: 30 # contributors y: 213 lines of code common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java x: 13 # contributors y: 81 lines of code sql/core/src/main/scala/org/apache/spark/sql/ExperimentalMethods.scala x: 9 # contributors y: 17 lines of code sql/core/src/main/scala/org/apache/spark/sql/SparkSessionExtensions.scala x: 17 # contributors y: 153 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala x: 42 # contributors y: 237 lines of code sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala x: 52 # contributors y: 257 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala x: 38 # contributors y: 171 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala x: 55 # contributors y: 234 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala x: 32 # contributors y: 479 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala x: 39 # contributors y: 673 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala x: 18 # contributors y: 815 lines of code connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala x: 8 # contributors y: 284 lines of code python/pyspark/java_gateway.py x: 52 # contributors y: 105 lines of code repl/src/main/scala/org/apache/spark/repl/Main.scala x: 21 # contributors y: 88 lines of code sql/core/src/main/scala/org/apache/spark/sql/artifact/ArtifactManager.scala x: 7 # contributors y: 377 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/KeyValueGroupedDataset.scala x: 1 # contributors y: 452 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala x: 41 # contributors y: 220 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala x: 56 # contributors y: 378 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala x: 13 # contributors y: 64 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala x: 13 # contributors y: 12 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala x: 27 # contributors y: 375 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala x: 15 # contributors y: 60 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala x: 33 # contributors y: 155 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala x: 55 # contributors y: 734 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala x: 74 # contributors y: 992 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FallBackFileSourceV2.scala x: 8 # contributors y: 21 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala x: 44 # contributors y: 315 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoDataSourceCommand.scala x: 9 # contributors y: 25 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SaveIntoDataSourceCommand.scala x: 11 # contributors y: 51 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala x: 36 # contributors y: 345 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala x: 32 # contributors y: 215 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala x: 34 # contributors y: 140 lines of code sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala x: 34 
# contributors y: 114 lines of code sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala x: 14 # contributors y: 95 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala x: 30 # contributors y: 82 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala x: 55 # contributors y: 20 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala x: 79 # contributors y: 327 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala x: 60 # contributors y: 176 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala x: 2 # contributors y: 358 lines of code core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala x: 95 # contributors y: 486 lines of code python/pyspark/ml/evaluation.py x: 28 # contributors y: 579 lines of code python/pyspark/pandas/indexing.py x: 13 # contributors y: 1210 lines of code connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala x: 8 # contributors y: 176 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala x: 34 # contributors y: 757 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/DecorrelateInnerQuery.scala x: 11 # contributors y: 529 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala x: 4 # contributors y: 369 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala x: 46 # contributors y: 272 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala x: 15 # contributors y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala x: 20 # contributors y: 211 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala x: 48 # contributors y: 1531 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala x: 12 # contributors y: 280 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala x: 25 # contributors y: 315 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/urlExpressions.scala x: 13 # contributors y: 205 lines of code common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java x: 10 # contributors y: 979 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala x: 18 # contributors y: 441 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/common/DataTypeProtoConverter.scala x: 3 # contributors y: 257 lines of code python/pyspark/sql/connect/proto/types_pb2.pyi x: 7 # contributors y: 914 lines of code python/pyspark/ml/recommendation.py x: 26 # contributors y: 328 lines of code python/pyspark/sql/connect/proto/common_pb2.pyi x: 5 # contributors y: 566 lines of code common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java x: 9 # contributors y: 323 lines of code common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java x: 4 # contributors y: 333 lines of code common/utils/src/main/scala/org/apache/spark/SparkException.scala x: 10 # contributors y: 448 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala x: 26 # contributors y: 251 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StatePartitionReader.scala x: 8 # contributors y: 232 lines of code core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala x: 63 # contributors y: 298 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/parquet/ParquetPartitionReaderFactory.scala x: 18 # contributors y: 266 lines of code connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufSerializer.scala x: 7 # contributors y: 286 lines of code core/src/main/scala/org/apache/spark/SparkConf.scala x: 86 # contributors y: 505 lines of code python/pyspark/version.py x: 12 # contributors y: 1 lines of code common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationAwareUTF8String.java x: 7 # contributors y: 921 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala x: 10 # contributors y: 108 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala x: 15 # contributors y: 250 lines of code core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala x: 24 # contributors y: 137 lines of code python/pyspark/cloudpickle/cloudpickle.py x: 3 # contributors y: 793 lines of code core/src/main/scala/org/apache/spark/TaskContext.scala x: 43 # contributors y: 99 lines of code core/src/main/scala/org/apache/spark/scheduler/Task.scala x: 58 # contributors y: 132 lines of code python/pyspark/sql/streaming/state.py x: 5 # contributors y: 187 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/parameters.scala x: 8 # contributors y: 121 lines of code python/pyspark/sql/connect/proto/base_pb2.pyi x: 20 # contributors y: 3038 lines of code sql/connect/common/src/main/protobuf/spark/connect/base.proto x: 5 # contributors y: 921 lines of code python/pyspark/sql/variant_utils.py x: 5 # contributors y: 615 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/MutableColumnarRow.java x: 9 # contributors y: 239 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala x: 32 # contributors y: 518 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CharVarcharUtils.scala x: 10 # contributors y: 231 lines of code core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala x: 13 # contributors y: 226 lines of code python/pyspark/core/rdd.py x: 5 # contributors y: 1451 lines of code python/pyspark/sql/readwriter.py x: 84 # contributors y: 860 lines of code core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala x: 39 # contributors y: 147 lines of code core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java x: 21 # contributors y: 275 lines of code core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala x: 155 # contributors y: 2113 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSet.scala x: 20 # contributors y: 21 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala x: 113 # contributors y: 1010 lines of code core/src/main/scala/org/apache/spark/util/collection/Spillable.scala x: 16 # contributors y: 66 lines of code resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala x: 40 # contributors y: 712 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala x: 80 # contributors y: 4533 lines of code python/pyspark/sql/pandas/map_ops.py x: 10 # contributors y: 83 lines of code mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala x: 42 # contributors y: 521 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java x: 6 # contributors y: 248 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala x: 20 # contributors y: 369 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala x: 20 # contributors y: 429 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala x: 
29 # contributors y: 497 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala x: 28 # contributors y: 526 lines of code graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala x: 22 # contributors y: 54 lines of code core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala x: 65 # contributors y: 654 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala x: 37 # contributors y: 541 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala x: 19 # contributors y: 132 lines of code core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala x: 75 # contributors y: 363 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala x: 34 # contributors y: 210 lines of code python/pyspark/sql/connect/readwriter.py x: 11 # contributors y: 851 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/RowBasedKeyValueBatch.java x: 13 # contributors y: 96 lines of code python/pyspark/sql/catalog.py x: 28 # contributors y: 323 lines of code common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java x: 20 # contributors y: 1654 lines of code python/pyspark/ml/torch/distributor.py x: 9 # contributors y: 630 lines of code core/src/main/scala/org/apache/spark/rdd/RDD.scala x: 137 # contributors y: 1080 lines of code core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala x: 31 # contributors y: 472 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ConstantColumnVector.java x: 6 # contributors y: 198 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java x: 27 # contributors y: 599 lines of code core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala x: 11 # contributors y: 243 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala x: 9 # contributors y: 388 lines of code core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala x: 25 # contributors y: 61 lines of code core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala x: 33 # contributors y: 117 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala x: 70 # contributors y: 1071 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala x: 42 # contributors y: 803 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowEvaluatorFactoryBase.scala x: 3 # contributors y: 193 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala x: 38 # contributors y: 229 lines of code core/src/main/scala/org/apache/spark/api/r/SerDe.scala x: 17 # contributors y: 362 lines of code core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala x: 48 # contributors y: 203 lines of code core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala x: 14 # contributors y: 201 lines of code core/src/main/scala/org/apache/spark/rdd/PipedRDD.scala x: 37 # contributors y: 176 lines of code core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala x: 18 # contributors y: 327 lines of code core/src/main/scala/org/apache/spark/storage/BlockManager.scala x: 130 # contributors y: 1544 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala x: 33 # contributors y: 533 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala x: 37 # contributors y: 308 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala x: 37 # contributors y: 989 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala x: 49 # contributors y: 975 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala x: 42 # contributors y: 642 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala x: 24 # contributors y: 94 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java x: 10 # contributors y: 280 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala x: 22 # contributors y: 750 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala x: 62 # contributors y: 510 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/DerbyDialect.scala x: 12 # contributors y: 53 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala x: 21 # contributors y: 222 lines of code python/pyspark/sql/group.py x: 28 # contributors y: 90 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala x: 61 # contributors y: 269 lines of code core/src/main/scala/org/apache/spark/executor/ExecutorExitCode.scala x: 14 # contributors y: 37 lines of code core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala x: 40 # contributors y: 252 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectHashAggregateExec.scala x: 18 # contributors y: 85 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala x: 43 # contributors y: 562 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerListener.scala x: 2 # contributors y: 382 lines of code python/pyspark/pandas/generic.py x: 16 # contributors y: 991 lines of code python/pyspark/pandas/internal.py x: 10 # contributors y: 
800 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala x: 22 # contributors y: 77 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala x: 10 # contributors y: 8 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala x: 21 # contributors y: 8 lines of code common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java x: 8 # contributors y: 642 lines of code common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java x: 46 # contributors y: 1414 lines of code python/pyspark/sql/connect/proto/base_pb2_grpc.py x: 8 # contributors y: 454 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SchemaPruning.scala x: 12 # contributors y: 128 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/GroupStateImpl.scala x: 10 # contributors y: 194 lines of code python/pyspark/sql/connect/streaming/readwriter.py x: 10 # contributors y: 606 lines of code python/pyspark/ml/linalg/__init__.py x: 15 # contributors y: 779 lines of code python/pyspark/mllib/linalg/__init__.py x: 25 # contributors y: 908 lines of code core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala x: 31 # contributors y: 340 lines of code core/src/main/scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala x: 45 # contributors y: 140 lines of code core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala x: 20 # contributors y: 174 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala x: 17 # contributors y: 323 lines of code connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala x: 9 # contributors y: 558 lines of code python/pyspark/mllib/classification.py x: 38 # contributors y: 398 lines of code python/pyspark/mllib/feature.py x: 27 
# contributors y: 346 lines of code python/pyspark/mllib/regression.py x: 37 # contributors y: 371 lines of code core/src/main/scala/org/apache/spark/Dependency.scala x: 26 # contributors y: 126 lines of code core/src/main/scala/org/apache/spark/input/PortableDataStream.scala x: 17 # contributors y: 127 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala x: 28 # contributors y: 283 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala x: 13 # contributors y: 139 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala x: 16 # contributors y: 151 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala x: 17 # contributors y: 410 lines of code core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java x: 9 # contributors y: 295 lines of code core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala x: 33 # contributors y: 201 lines of code core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala x: 57 # contributors y: 152 lines of code core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala x: 30 # contributors y: 177 lines of code core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala x: 16 # contributors y: 73 lines of code python/pyspark/pandas/supported_api_gen.py x: 9 # contributors y: 195 lines of code resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala x: 56 # contributors y: 1227 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Implicits.scala x: 21 # contributors y: 180 lines of code core/src/main/scala/org/apache/spark/scheduler/DAGSchedulerEvent.scala x: 49 # contributors y: 83 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala x: 8 # contributors y: 203 lines of code core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala x: 58 # contributors y: 313 lines of code core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala x: 48 # contributors y: 215 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerStorageEndpoint.scala x: 9 # contributors y: 81 lines of code common/utils/src/main/scala/org/apache/spark/util/MavenUtils.scala x: 8 # contributors y: 426 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkIntervalUtils.scala x: 4 # contributors y: 430 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/Decimal.scala x: 7 # contributors y: 495 lines of code core/src/main/scala/org/apache/spark/storage/DiskStore.scala x: 51 # contributors y: 269 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/common/UdfUtils.scala x: 2 # contributors y: 498 lines of code core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala x: 59 # contributors y: 1103 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala x: 31 # contributors y: 426 lines of code core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala x: 90 # contributors y: 801 lines of code core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala x: 46 # contributors y: 207 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala x: 27 # contributors y: 228 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/csv/CSVScan.scala x: 14 # contributors y: 70 lines of code core/src/main/resources/org/apache/spark/ui/static/stagepage.js x: 20 # contributors y: 1051 lines of code core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala x: 39 # contributors y: 329 lines of code 
core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala x: 21 # contributors y: 439 lines of code core/src/main/scala/org/apache/spark/ui/JettyUtils.scala x: 55 # contributors y: 464 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/Interaction.scala x: 17 # contributors y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala x: 22 # contributors y: 155 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala x: 28 # contributors y: 248 lines of code core/src/main/scala/org/apache/spark/internal/config/History.scala x: 12 # contributors y: 248 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala x: 19 # contributors y: 290 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala x: 15 # contributors y: 119 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerPage.scala x: 1 # contributors y: 428 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnType.scala x: 22 # contributors y: 624 lines of code core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala x: 27 # contributors y: 33 lines of code core/src/main/scala/org/apache/spark/ui/jobs/JobsTab.scala x: 15 # contributors y: 35 lines of code python/pyspark/sql/window.py x: 20 # contributors y: 45 lines of code python/pyspark/sql/streaming/listener.py x: 3 # contributors y: 642 lines of code python/pyspark/conf.py x: 23 # contributors y: 126 lines of code mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala x: 35 # contributors y: 272 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala x: 21 # contributors y: 123 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala x: 26 # 
contributors y: 540 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala x: 26 # contributors y: 265 lines of code mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala x: 24 # contributors y: 422 lines of code mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala x: 41 # contributors y: 235 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/DStreamCheckpointData.scala x: 19 # contributors y: 101 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala x: 45 # contributors y: 204 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/RawInputDStream.scala x: 25 # contributors y: 69 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala x: 39 # contributors y: 120 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala x: 29 # contributors y: 54 lines of code core/src/main/scala/org/apache/spark/Partitioner.scala x: 47 # contributors y: 242 lines of code core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala x: 114 # contributors y: 722 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala x: 21 # contributors y: 68 lines of code streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala x: 29 # contributors y: 149 lines of code streaming/src/main/scala/org/apache/spark/streaming/util/RawTextSender.scala x: 23 # contributors y: 50 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcSerializer.scala x: 8 # contributors y: 159 lines of code python/pyspark/pandas/indexes/base.py x: 14 # contributors y: 988 lines of code python/pyspark/pandas/strings.py x: 10 # contributors y: 309 lines of code 
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java x: 30 # contributors y: 471 lines of code core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala x: 44 # contributors y: 98 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala x: 23 # contributors y: 235 lines of code resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala x: 45 # contributors y: 732 lines of code resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala x: 23 # contributors y: 263 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala x: 12 # contributors y: 673 lines of code common/network-common/src/main/java/org/apache/spark/network/crypto/GcmTransportCipher.java x: 1 # contributors y: 332 lines of code core/src/main/scala/org/apache/spark/rdd/CartesianRDD.scala x: 24 # contributors y: 67 lines of code streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala x: 66 # contributors y: 298 lines of code streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala x: 70 # contributors y: 469 lines of code streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala x: 29 # contributors y: 457 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala x: 29 # contributors y: 173 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala x: 26 # contributors y: 90 lines of code core/src/main/scala/org/apache/spark/status/api/v1/api.scala x: 42 # contributors y: 497 lines of code core/src/main/scala/org/apache/spark/io/CompressionCodec.scala x: 41 # contributors y: 150 lines of code python/pyspark/sql/context.py x: 51 # contributors y: 292 lines of code 
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitOptionParser.java x: 9 # contributors y: 143 lines of code core/src/main/scala/org/apache/spark/status/AppStatusListener.scala x: 42 # contributors y: 1100 lines of code core/src/main/scala/org/apache/spark/status/AppStatusStore.scala x: 30 # contributors y: 749 lines of code core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala x: 39 # contributors y: 382 lines of code python/pyspark/pandas/data_type_ops/base.py x: 9 # contributors y: 366 lines of code python/pyspark/pandas/indexes/multi.py x: 12 # contributors y: 527 lines of code core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala x: 42 # contributors y: 78 lines of code core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala x: 68 # contributors y: 328 lines of code core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala x: 24 # contributors y: 38 lines of code core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala x: 16 # contributors y: 440 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala x: 14 # contributors y: 336 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala x: 56 # contributors y: 203 lines of code core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala x: 26 # contributors y: 617 lines of code core/src/main/scala/org/apache/spark/MapOutputTracker.scala x: 85 # contributors y: 1131 lines of code core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala x: 18 # contributors y: 198 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala x: 40 # contributors y: 875 lines of code common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java x: 13 # contributors y: 167 lines of code common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java x: 15 # contributors 
y: 478 lines of code common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java x: 16 # contributors y: 272 lines of code connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java x: 3 # contributors y: 293 lines of code core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java x: 31 # contributors y: 582 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java x: 8 # contributors y: 637 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java x: 9 # contributors y: 395 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java x: 5 # contributors y: 10 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala x: 22 # contributors y: 346 lines of code core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala x: 49 # contributors y: 255 lines of code core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala x: 29 # contributors y: 121 lines of code core/src/main/scala/org/apache/spark/ui/WebUI.scala x: 37 # contributors y: 156 lines of code core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala x: 46 # contributors y: 534 lines of code core/src/main/scala/org/apache/spark/util/logging/RollingFileAppender.scala x: 14 # contributors y: 129 lines of code core/src/main/scala/org/apache/spark/deploy/Client.scala x: 38 # contributors y: 219 lines of code core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala x: 20 # contributors y: 230 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala x: 25 # contributors y: 682 lines of code core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala x: 27 # contributors y: 147 lines of code core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala x: 
45 # contributors y: 174 lines of code core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala x: 79 # contributors y: 541 lines of code core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala x: 24 # contributors y: 540 lines of code core/src/main/scala/org/apache/spark/scheduler/ReplayListenerBus.scala x: 19 # contributors y: 82 lines of code core/src/main/scala/org/apache/spark/shuffle/ShuffleBlockPusher.scala x: 10 # contributors y: 323 lines of code core/src/main/scala/org/apache/spark/util/SizeEstimator.scala x: 40 # contributors y: 231 lines of code core/src/main/scala/org/apache/spark/storage/BlockId.scala x: 31 # contributors y: 208 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatDataWriter.scala x: 12 # contributors y: 410 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala x: 44 # contributors y: 195 lines of code core/src/main/scala/org/apache/spark/ContextCleaner.scala x: 27 # contributors y: 194 lines of code core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala x: 33 # contributors y: 78 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala x: 47 # contributors y: 326 lines of code mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala x: 16 # contributors y: 161 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala x: 19 # contributors y: 315 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala x: 53 # contributors y: 343 lines of code python/pyspark/__init__.py x: 54 # contributors y: 72 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala x: 31 # contributors y: 307 lines of code common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java x: 29 # contributors y: 287 lines of code python/pyspark/mllib/clustering.py x: 41 # 
contributors y: 449 lines of code python/pyspark/mllib/recommendation.py x: 28 # contributors y: 136 lines of code python/pyspark/mllib/stat/KernelDensity.py x: 7 # contributors y: 19 lines of code python/pyspark/mllib/tree.py x: 22 # contributors y: 321 lines of code python/pyspark/serializers.py x: 58 # contributors y: 373 lines of code python/pyspark/streaming/dstream.py x: 19 # contributors y: 489 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala x: 28 # contributors y: 149 lines of code sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala x: 30 # contributors y: 99 lines of code sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala x: 12 # contributors y: 480 lines of code core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerWebUI.scala x: 39 # contributors y: 31 lines of code core/src/main/scala/org/apache/spark/metrics/sink/MetricsServlet.scala x: 22 # contributors y: 33 lines of code core/src/main/scala/org/apache/spark/ui/PagedTable.scala x: 15 # contributors y: 272 lines of code core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala x: 45 # contributors y: 519 lines of code core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala x: 37 # contributors y: 472 lines of code core/src/main/scala/org/apache/spark/ui/jobs/PoolPage.scala x: 25 # contributors y: 40 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala x: 88 # contributors y: 471 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala x: 63 # contributors y: 294 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StagesTab.scala x: 17 # contributors y: 40 lines of code core/src/main/scala/org/apache/spark/ui/storage/RDDPage.scala x: 37 # contributors y: 209 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala x: 24 # contributors y: 495 lines of code 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerPage.scala x: 23 # contributors y: 346 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerSessionPage.scala x: 20 # contributors y: 84 lines of code streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala x: 29 # contributors y: 426 lines of code dev/run-tests.py x: 52 # contributors y: 431 lines of code launcher/src/main/java/org/apache/spark/launcher/LauncherServer.java x: 14 # contributors y: 271 lines of code core/src/main/resources/org/apache/spark/ui/static/executorspage.js x: 23 # contributors y: 710 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java x: 28 # contributors y: 310 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java x: 16 # contributors y: 709 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala x: 40 # contributors y: 795 lines of code mllib/src/main/scala/org/apache/spark/mllib/evaluation/RegressionMetrics.scala x: 18 # contributors y: 65 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java x: 15 # contributors y: 481 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala x: 41 # contributors y: 31 lines of code core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala x: 46 # contributors y: 93 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala x: 13 # contributors y: 302 lines of code core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala x: 31 # contributors y: 84 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala x: 15 # contributors y: 14 lines of code 
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto x: 8 # contributors y: 740 lines of code common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java x: 5 # contributors y: 287 lines of code core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala x: 51 # contributors y: 421 lines of code core/src/main/scala/org/apache/spark/status/protobuf/StageDataWrapperSerializer.scala x: 4 # contributors y: 659 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala x: 4 # contributors y: 640 lines of code streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala x: 17 # contributors y: 195 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/RegressionModel.scala x: 14 # contributors y: 22 lines of code core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala x: 54 # contributors y: 286 lines of code mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala x: 59 # contributors y: 1122 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala x: 38 # contributors y: 773 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala x: 16 # contributors y: 231 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala x: 34 # contributors y: 1053 lines of code core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala x: 35 # contributors y: 221 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala x: 16 # contributors y: 214 lines of code common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java x: 19 # contributors y: 231 lines of code core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala x: 28 # contributors y: 45 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateSafeProjection.scala x: 23 # contributors 
y: 156 lines of code common/utils/src/main/scala/org/apache/spark/util/ClosureCleaner.scala x: 2 # contributors y: 618 lines of code core/src/main/scala/org/apache/spark/status/LiveEntity.scala x: 26 # contributors y: 817 lines of code python/pyspark/join.py x: 19 # contributors y: 66 lines of code python/pyspark/shuffle.py x: 14 # contributors y: 446 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/FlatMapValuedDStream.scala x: 19 # contributors y: 15 lines of code core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala x: 33 # contributors y: 48 lines of code mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala x: 25 # contributors y: 110 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala x: 34 # contributors y: 183 lines of code streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala x: 45 # contributors y: 274 lines of code python/pyspark/sql/connect/proto/catalog_pb2.pyi x: 4 # contributors y: 913 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala x: 33 # contributors y: 1142 lines of code core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala x: 53 # contributors y: 62 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala x: 18 # contributors y: 35 lines of code core/src/main/scala/org/apache/spark/package.scala x: 25 # contributors y: 12 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala x: 34 # contributors y: 255 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinEvaluatorFactory.scala x: 1 # contributors y: 259 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala x: 18 # contributors y: 505 lines of code core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala x: 44 # 
contributors y: 47 lines of code core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala x: 31 # contributors y: 10 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala x: 12 # contributors y: 654 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala x: 18 # contributors y: 297 lines of code mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala x: 36 # contributors y: 211 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala x: 22 # contributors y: 568 lines of code mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDs.scala x: 11 # contributors y: 506 lines of code mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala x: 44 # contributors y: 221 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala x: 25 # contributors y: 155 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/LabeledPoint.scala x: 19 # contributors y: 41 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/LinearRegression.scala x: 30 # contributors y: 62 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/RidgeRegression.scala x: 26 # contributors y: 62 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala x: 23 # contributors y: 82 lines of code core/src/main/scala/org/apache/spark/status/storeTypes.scala x: 19 # contributors y: 469 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala x: 13 # contributors y: 514 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala x: 34 # contributors y: 70 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala x: 26 # contributors y: 172 lines of code core/src/main/scala/org/apache/spark/TaskEndReason.scala x: 38 # contributors y: 138 lines of code 
core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala x: 40 # contributors y: 122 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala x: 33 # contributors y: 25 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java x: 18 # contributors y: 215 lines of code core/src/main/resources/org/apache/spark/ui/static/sorttable.js x: 13 # contributors y: 352 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/UnionDStream.scala x: 21 # contributors y: 29 lines of code graphx/src/main/scala/org/apache/spark/graphx/impl/EdgePartition.scala x: 12 # contributors y: 338 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala x: 22 # contributors y: 298 lines of code core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala x: 23 # contributors y: 28 lines of code core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala x: 32 # contributors y: 67 lines of code licenses-binary/LICENSE-javassist.html x: 1 # contributors y: 369 lines of code core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala x: 32 # contributors y: 82 lines of code core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala x: 26 # contributors y: 37 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSQLParser.scala x: 2 # contributors y: 1040 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/PluggableInputDStream.scala x: 17 # contributors y: 12 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala x: 9 # contributors y: 1053 lines of code core/src/main/scala/org/apache/spark/deploy/master/RecoveryState.scala x: 16 # contributors y: 5 lines of code
Scatter plot axes: x = # contributors, y = lines of code (per file)
lines of code: min: 1.0 | average: 157.16 | 25th percentile: 24.0 | median: 68.0 | 75th percentile: 167.0 | max: 5815.0
# contributors: min: 1.0 | average: 10.87 | 25th percentile: 2.0 | median: 5.0 | 75th percentile: 13.0 | max: 237.0