apache / spark
File Size

The distribution of file sizes (measured in lines of code).
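As a rough illustration of how such a distribution could be computed, the sketch below assigns each file to one of the five size buckets used in this report and returns each bucket's share of the total lines of code. The function names and the input shape (a mapping from path to line count) are assumptions made for illustration. Weighting buckets by lines of code rather than by file count is also an assumption, since the report does not state which weighting it uses.

```python
# Size buckets as (label, lower bound, upper bound or None for open-ended).
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_for(loc):
    """Return the bucket label for a file with `loc` lines, or None if empty."""
    for label, low, high in BUCKETS:
        if loc >= low and (high is None or loc <= high):
            return label
    return None


def size_distribution(file_locs):
    """Map each bucket label to its percentage of the total lines of code.

    `file_locs` is a hypothetical input: {path: lines_of_code}.
    """
    totals = {label: 0 for label, _, _ in BUCKETS}
    for loc in file_locs.values():
        label = bucket_for(loc)
        if label is not None:
            totals[label] += loc  # assumption: weight by lines, not file count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: 100.0 * lines / grand_total for label, lines in totals.items()}


example = size_distribution({"a.scala": 1500, "b.py": 500})
```

To weight by file count instead, the line marked as an assumption could add 1 rather than `loc`.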

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
21% | 18% | 29% | 16% | 14%


File Size per Extension
extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
scala | 18% | 17% | 29% | 17% | 15%
py | 33% | 23% | 25% | 11% | 5%
pyi | 76% | 15% | 5% | 1% | 1%
java | 7% | 10% | 37% | 17% | 27%
g4 | 77% | 22% | 0% | 0% | 0%
js | 24% | 16% | 44% | 7% | 6%
proto | 0% | 58% | 28% | 9% | 4%
css | 0% | 0% | 67% | 11% | 21%
html | 0% | 0% | 53% | 37% | 9%
xml | 0% | 0% | 96% | 0% | 3%
bash | 0% | 0% | 0% | 100% | 0%
yaml | 0% | 0% | 0% | 0% | 100%
ps1 | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
in | 0% | 0% | 0% | 0% | 100%
c | 0% | 0% | 0% | 0% | 100%
sbt | 0% | 0% | 0% | 0% | 100%
cfg | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
sql | 20% | 17% | 30% | 15% | 15%
python | 39% | 22% | 22% | 10% | 4%
core | 21% | 17% | 26% | 18% | 16%
common | 10% | 13% | 37% | 14% | 24%
mllib | 4% | 21% | 34% | 23% | 16%
project | 84% | 0% | 11% | 0% | 4%
dev | 40% | 14% | 24% | 8% | 12%
resource-managers | 11% | 19% | 23% | 24% | 20%
mllib-local | 0% | 94% | 0% | 0% | 5%
connector | 0% | 5% | 45% | 21% | 26%
streaming | 0% | 5% | 37% | 28% | 28%
launcher | 0% | 0% | 58% | 20% | 21%
graphx | 0% | 0% | 26% | 33% | 39%
licenses-binary | 0% | 0% | 100% | 0% | 0%
ROOT | 0% | 0% | 100% | 0% | 0%
build | 0% | 0% | 0% | 78% | 21%
repl | 0% | 0% | 0% | 50% | 49%
hadoop-cloud | 0% | 0% | 0% | 0% | 100%
connect-examples | 0% | 0% | 0% | 0% | 100%
tools | 0% | 0% | 0% | 0% | 100%
R | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | # lines | # units
SQLConf.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/internal
5815 42
AstBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser
4715 310
collectionOperations.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
4533 177
QueryCompilationErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
3781 539
feature.py
in python/pyspark/ml
3621 471
relations_pb2.pyi
in python/pyspark/sql/connect/proto
3616 471
SparkConnectPlanner.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner
3468 181
datetimeExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
3066 150
base_pb2.pyi
in python/pyspark/sql/connect/proto
3038 375
stringExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
2968 154
Analyzer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
2889 137
QueryExecutionErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
2609 354
package.scala
in core/src/main/scala/org/apache/spark/internal/config
2456 -
builtin.py
in python/pyspark/sql/connect/functions
2417 474
series.py
in python/pyspark/pandas
2215 143
classification.py
in python/pyspark/ml
2173 245
Utils.scala
in core/src/main/scala/org/apache/spark/util
2154 163
plan.py
in python/pyspark/sql/connect
2133 223
SqlBaseParser.g4
in sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser
2122 -
DAGScheduler.scala
in core/src/main/scala/org/apache/spark/scheduler
2113 101
commands_pb2.pyi
in python/pyspark/sql/connect/proto
2053 227
types.py
in python/pyspark/sql
1984 215
dataframe.py
in python/pyspark/sql/connect
1945 163
SparkContext.scala
in core/src/main/scala/org/apache/spark
1923 124
Cast.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1905 81
functions.scala
in sql/api/src/main/scala/org/apache/spark/sql
1882 74
groupby.py
in python/pyspark/pandas
1800 83
expressions_pb2.pyi
in python/pyspark/sql/connect/proto
1764 204
worker.py
in python/pyspark
1728 32
Optimizer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer
1683 92
RemoteBlockPushResolver.java
in common/network-shuffle/src/main/java/org/apache/spark/network/shuffle
1654 102
regression.py
in python/pyspark/ml
1554 213
objects.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects
1547 68
BlockManager.scala
in core/src/main/scala/org/apache/spark/storage
1544 83
dataframe.py
in python/pyspark/sql/classic
1539 182
mathExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1531 101
namespace.py
in python/pyspark/pandas
1517 45
basicLogicalOperators.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical
1503 96
Dataset.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
1498 123
SessionCatalog.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog
1490 140
1460 12
ParquetVectorUpdaterFactory.java
in sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet
1454 148
rdd.py
in python/pyspark/core
1451 181
core.py
in python/pyspark/sql/connect/client
1449 104
RocksDB.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state
1425 74
JsonProtocol.scala
in core/src/main/scala/org/apache/spark/util
1419 94
UTF8String.java
in common/unsafe/src/main/java/org/apache/spark/unsafe/types
1414 110
modules.py
in dev/sparktestsupport
1409 7
HiveShim.scala
in sql/hive/src/main/scala/org/apache/spark/sql/hive/client
1277 63
Client.scala
in resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn
1227 42
Files With Most Units (Top 50)
File | # lines | # units
QueryCompilationErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
3781 539
builtin.py
in python/pyspark/sql/connect/functions
2417 474
relations_pb2.pyi
in python/pyspark/sql/connect/proto
3616 471
feature.py
in python/pyspark/ml
3621 471
base_pb2.pyi
in python/pyspark/sql/connect/proto
3038 375
QueryExecutionErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
2609 354
AstBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser
4715 310
classification.py
in python/pyspark/ml
2173 245
commands_pb2.pyi
in python/pyspark/sql/connect/proto
2053 227
plan.py
in python/pyspark/sql/connect
2133 223
types.py
in python/pyspark/sql
1984 215
regression.py
in python/pyspark/ml
1554 213
expressions_pb2.pyi
in python/pyspark/sql/connect/proto
1764 204
dataframe.py
in python/pyspark/sql/classic
1539 182
SparkConnectPlanner.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner
3468 181
rdd.py
in python/pyspark/core
1451 181
collectionOperations.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
4533 177
dataframe.py
in python/pyspark/sql
851 173
dataframe.py
in python/pyspark/sql/connect
1945 163
Utils.scala
in core/src/main/scala/org/apache/spark/util
2154 163
stringExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
2968 154
datetimeExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
3066 150
ParquetVectorUpdaterFactory.java
in sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet
1454 148
StateMessage_pb2.pyi
in python/pyspark/sql/streaming/proto
1116 148
series.py
in python/pyspark/pandas
2215 143
SessionCatalog.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog
1490 140
clustering.py
in python/pyspark/ml
1000 138
Analyzer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
2889 137
ColumnType.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/columnar
624 124
SparkContext.scala
in core/src/main/scala/org/apache/spark
1923 124
Dataset.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
1498 123
catalog_pb2.pyi
in python/pyspark/sql/connect/proto
913 119
v2Commands.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical
1148 111
UTF8String.java
in common/unsafe/src/main/java/org/apache/spark/unsafe/types
1414 110
QueryParsingErrors.scala
in sql/api/src/main/scala/org/apache/spark/sql/errors
681 109
__init__.py
in python/pyspark/mllib/linalg
908 109
listener.py
in python/pyspark/sql/streaming
642 106
types_pb2.pyi
in python/pyspark/sql/connect/proto
914 105
core.py
in python/pyspark/sql/connect/client
1449 104
RemoteBlockPushResolver.java
in common/network-shuffle/src/main/java/org/apache/spark/network/shuffle
1654 102
mathExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1531 101
DAGScheduler.scala
in core/src/main/scala/org/apache/spark/scheduler
2113 101
RDD.scala
in core/src/main/scala/org/apache/spark/rdd
1080 101
expressions.py
in python/pyspark/sql/connect
1039 99
basicLogicalOperators.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical
1503 96
__init__.py
in python/pyspark/ml/linalg
779 94
PythonMLLibAPI.scala
in mllib/src/main/scala/org/apache/spark/mllib/api/python
1122 94
JsonProtocol.scala
in core/src/main/scala/org/apache/spark/util
1419 94
Optimizer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer
1683 92
Dataset.scala
in sql/connect/common/src/main/scala/org/apache/spark/sql/connect
1025 92
Files With Long Lines (Top 50)

There are 402 files with lines longer than 120 characters. In total, there are 919 long lines.
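The two totals in this summary (files containing at least one long line, and long lines overall) can be derived with a simple scan. The sketch below is a minimal illustration, not the tool's actual implementation: the function names and the in-memory `sources` mapping are hypothetical, and it assumes "longer than 120 characters" means a strict `>` comparison on the line length without its line terminator.

```python
def count_long_lines(text, limit=120):
    """Count lines in `text` whose length exceeds `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(sources, limit=120):
    """Summarize long lines across files.

    `sources` is a hypothetical input: {path: file contents as a string}.
    Returns (number of files with long lines, total long lines, ranking),
    where the ranking lists (path, count) pairs, most long lines first.
    """
    per_file = {path: count_long_lines(text, limit) for path, text in sources.items()}
    per_file = {path: n for path, n in per_file.items() if n > 0}
    ranking = sorted(per_file.items(), key=lambda item: (-item[1], item[0]))
    return len(per_file), sum(per_file.values()), ranking


sample_sources = {
    "a.scala": "x" * 121 + "\n" + "y" * 120,  # one long line, one at the limit
    "b.py": "short\n",                        # no long lines
    "c.js": ("z" * 130 + "\n") * 3,           # three long lines
}
files_with_long, total_long, ranking = long_line_report(sample_sources)
```

Reading real files from disk would only require building `sources` from the repository tree; that part is omitted to keep the example self-contained.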

File | # lines | # units | # long lines
stagepage.js
in core/src/main/resources/org/apache/spark/ui/static
1051 59 45
datetimeExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
3066 150 35
StreamingQueryStatisticsPage.scala
in sql/core/src/main/scala/org/apache/spark/sql/streaming/ui
480 11 24
sharedParams.scala
in mllib/src/main/scala/org/apache/spark/ml/param/shared
149 32 22
misc.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
436 19 19
stringExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
2968 154 17
SqlBaseParser.g4
in sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser
2122 - 17
202 3 16
executorspage-template.html
in core/src/main/resources/org/apache/spark/ui/static
140 - 15
executorspage.js
in core/src/main/resources/org/apache/spark/ui/static
710 48 14
LogDivertAppender.java
in sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation
227 28 12
windowExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
803 31 10
generators.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
562 29 10
linearRegression.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate
307 10 10
UDFRegistration.scala
in sql/core/src/main/scala/org/apache/spark/sql/classic
142 5 10
ScalaUDF.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1053 6 9
SparkConnectPlanner.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner
3468 181 9
shared.py
in python/pyspark/ml/param
451 78 9
variantExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant
743 29 8
KryoSerializer.scala
in core/src/main/scala/org/apache/spark/serializer
575 30 8
Mode.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate
268 16 7
xpath.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml
176 9 7
DataSourceV2Strategy.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2
551 13 7
TimeWindow.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
214 8 6
SerializerBuildHelper.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst
418 36 5
regexpExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
975 36 5
SessionWindow.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
64 1 5
WriteToDataSourceV2Exec.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2
528 24 5
1460 12 5
DeserializerBuildHelper.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst
407 25 4
maskExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
262 8 4
jsonExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
458 17 4
AstBuilder.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser
4715 310 4
QueryCompilationErrors.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/errors
3781 539 4
SparkOptimizer.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution
81 - 4
IncrementalExecution.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/streaming
467 9 4
MicroBatchExecution.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/streaming
701 21 4
V2ScanRelationPushDown.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2
441 15 4
StateDataSource.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state
443 11 4
ParquetFilters.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet
704 16 4
StreamingPage.scala
in streaming/src/main/scala/org/apache/spark/streaming/ui
426 13 4
RewriteMergeIntoTable.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
369 14 3
mathExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
1531 101 3
ToStringBase.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
422 6 3
higherOrderFunctions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
989 57 3
TryEval.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
239 15 3
xmlExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
204 9 3
intervalExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
749 55 3
avroSqlFunctions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
221 5 3
timeExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
458 17 3
Correlations

File Size vs. Commits (all time): 4071 points
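One way to quantify the relationship between file size and commit count is a Pearson correlation coefficient over the (commits, lines of code) pairs. The sketch below is an assumption about how a reader might analyze this data, not something the report itself computes. The `pearson` helper is hypothetical, and the six sample points are a small hand-picked subset of the 4071 points, so the resulting value says nothing reliable about the full data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# (commits all time, lines of code) for a few files taken from this report.
points = {
    "SQLConf.scala": (921, 5815),
    "Analyzer.scala": (925, 2889),
    "SparkBuild.scala": (1203, 1460),
    "plugins.sbt": (174, 14),
    "BitSet.scala": (44, 162),
    "TreePatternBits.scala": (3, 38),
}
commits = [c for c, _ in points.values()]
loc = [l for _, l in points.values()]
r = pearson(commits, loc)
```

A rank-based measure such as Spearman's coefficient may suit this data better, since a few very large or very frequently changed files can dominate a Pearson estimate.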

[Scatter plot: one point per file; x-axis: commits (all time), y-axis: lines of code.]
y: 366 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala x: 43 commits (all time) y: 406 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 84 commits (all time) y: 2122 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 5 commits (all time) y: 144 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 29 commits (all time) y: 681 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StructField.scala x: 8 commits (all time) y: 148 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AbstractSqlParser.scala x: 8 commits (all time) y: 85 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 630 commits (all time) y: 4715 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 279 commits (all time) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 5 commits (all time) y: 279 lines of code python/run-tests.py x: 60 commits (all time) y: 287 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 156 commits (all time) y: 991 lines of code mllib/src/main/scala/org/apache/spark/ml/Estimator.scala x: 15 commits (all time) y: 29 lines of code mllib/src/main/scala/org/apache/spark/ml/Model.scala x: 12 commits (all time) y: 13 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala x: 26 commits (all time) y: 477 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 68 commits (all time) y: 417 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 218 commits (all time) y: 687 lines of code 
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala x: 6 commits (all time) y: 339 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StreamStreamJoinStatePartitionReader.scala x: 8 commits (all time) y: 123 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkExec.scala x: 2 commits (all time) y: 403 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 85 commits (all time) y: 467 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala x: 39 commits (all time) y: 543 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 39 commits (all time) y: 793 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 85 commits (all time) y: 1059 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala x: 68 commits (all time) y: 314 lines of code sql/connect/common/src/main/protobuf/spark/connect/ml.proto x: 7 commits (all time) y: 114 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala x: 55 commits (all time) y: 393 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/NormalizePlan.scala x: 11 commits (all time) y: 145 lines of code core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala x: 16 commits (all time) y: 274 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 32 commits (all time) y: 873 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 219 commits (all time) y: 3066 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 188 
commits (all time) y: 432 lines of code python/pyspark/sql/pandas/_typing/__init__.pyi x: 13 commits (all time) y: 322 lines of code python/pyspark/sql/streaming/stateful_processor.py x: 14 commits (all time) y: 141 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala x: 70 commits (all time) y: 439 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala x: 28 commits (all time) y: 190 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/RelationalGroupedDataset.scala x: 3 commits (all time) y: 473 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 390 commits (all time) y: 806 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkPythonRunner.scala x: 1 commits (all time) y: 291 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 416 commits (all time) y: 961 lines of code python/pyspark/core/context.py x: 11 commits (all time) y: 784 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala x: 63 commits (all time) y: 166 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 147 commits (all time) y: 440 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/InsertAdaptiveSparkPlan.scala x: 37 commits (all time) y: 102 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala x: 11 commits (all time) y: 54 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PlanDynamicPruningFilters.scala x: 19 commits (all time) y: 55 lines of code mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala x: 48 commits (all time) y: 140 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala x: 54 commits (all time) y: 127 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala x: 46 commits (all time) y: 110 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala x: 30 commits (all time) y: 107 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala x: 52 commits (all time) y: 285 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/Resolver.scala x: 7 commits (all time) y: 421 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java x: 16 commits (all time) y: 490 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala x: 60 commits (all time) y: 516 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala x: 239 commits (all time) y: 551 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 83 commits (all time) y: 818 lines of code python/pyspark/sql/classic/column.py x: 12 commits (all time) y: 490 lines of code python/pyspark/sql/column.py x: 98 commits (all time) y: 317 lines of code python/pyspark/sql/connect/column.py x: 99 commits (all time) y: 482 lines of code python/pyspark/sql/connect/expressions.py x: 62 commits (all time) y: 1039 lines of code python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 59 commits (all time) y: 1764 lines of code sql/api/src/main/scala/org/apache/spark/sql/Column.scala x: 8 commits (all time) y: 274 lines of code sql/api/src/main/scala/org/apache/spark/sql/internal/columnNodes.scala x: 11 commits (all time) y: 408 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 180 commits (all time) y: 1057 lines of code 
sql/connect/common/src/main/protobuf/spark/connect/expressions.proto x: 9 commits (all time) y: 412 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 3 commits (all time) y: 1025 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 17 commits (all time) y: 598 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/columnNodeSupport.scala x: 2 commits (all time) y: 265 lines of code python/pyspark/accumulators.py x: 84 commits (all time) y: 173 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xmlExpressions.scala x: 23 commits (all time) y: 204 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlGenerator.scala x: 10 commits (all time) y: 289 lines of code core/src/main/scala/org/apache/spark/internal/config/Python.scala x: 10 commits (all time) y: 86 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 198 commits (all time) y: 613 lines of code core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 157 commits (all time) y: 575 lines of code core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala x: 73 commits (all time) y: 399 lines of code core/src/main/scala/org/apache/spark/api/python/PythonWorkerUtils.scala x: 6 commits (all time) y: 118 lines of code core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala x: 19 commits (all time) y: 123 lines of code core/src/main/scala/org/apache/spark/api/r/RRDD.scala x: 29 commits (all time) y: 119 lines of code python/pyspark/core/broadcast.py x: 3 commits (all time) y: 166 lines of code python/pyspark/daemon.py x: 49 commits (all time) y: 171 lines of code python/pyspark/sql/streaming/python_streaming_source_runner.py x: 12 commits (all time) y: 161 lines of code python/pyspark/sql/worker/plan_data_source_read.py x: 18 commits (all time) y: 301 
lines of code python/pyspark/sql/worker/python_streaming_sink_runner.py x: 10 commits (all time) y: 107 lines of code python/pyspark/sql/worker/write_into_data_source.py x: 14 commits (all time) y: 179 lines of code python/pyspark/taskcontext.py x: 33 commits (all time) y: 147 lines of code python/pyspark/worker_util.py x: 10 commits (all time) y: 117 lines of code sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala x: 53 commits (all time) y: 139 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 161 commits (all time) y: 813 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala x: 40 commits (all time) y: 77 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 x: 35 commits (all time) y: 607 lines of code launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 54 commits (all time) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 121 commits (all time) y: 783 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/UserDefinedPythonDataSource.scala x: 10 commits (all time) y: 437 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchEvaluatorFactory.scala x: 10 commits (all time) y: 68 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchExec.scala x: 15 commits (all time) y: 56 lines of code python/pyspark/errors/exceptions/connect.py x: 27 commits (all time) y: 332 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala x: 10 commits (all time) y: 258 lines of code core/src/main/scala/org/apache/spark/executor/Executor.scala x: 348 commits (all time) y: 952 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala x: 14 commits (all 
time) y: 426 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala x: 42 commits (all time) y: 133 lines of code common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 65 commits (all time) y: 861 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 112 commits (all time) y: 701 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 74 commits (all time) y: 801 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala x: 48 commits (all time) y: 755 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala x: 16 commits (all time) y: 167 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala x: 69 commits (all time) y: 663 lines of code python/pyspark/sql/dataframe.py x: 441 commits (all time) y: 851 lines of code python/pyspark/ml/connect/feature.py x: 6 commits (all time) y: 216 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 50 commits (all time) y: 783 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 382 commits (all time) y: 3781 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/DataFrameReader.scala x: 3 commits (all time) y: 224 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamWriter.scala x: 2 commits (all time) y: 323 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateExpressionResolver.scala x: 2 commits (all time) y: 154 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionIdAssigner.scala x: 2 commits (all time) y: 210 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/JoinResolver.scala x: 1 commits (all time) y: 180 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ProjectResolver.scala x: 2 commits (all time) y: 119 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolutionValidator.scala x: 3 commits (all time) y: 239 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala x: 7 commits (all time) y: 279 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/parsers.scala x: 15 commits (all time) y: 313 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala x: 17 commits (all time) y: 146 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowSerializer.scala x: 6 commits (all time) y: 499 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 38 commits (all time) y: 743 lines of code core/src/main/scala/org/apache/spark/util/Utils.scala x: 545 commits (all time) y: 2154 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala x: 59 commits (all time) y: 456 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala x: 38 commits (all time) y: 178 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala x: 36 commits (all time) y: 239 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 68 commits (all time) y: 462 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 64 commits (all time) y: 704 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala x: 47 
commits (all time) y: 470 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V1FallbackWriters.scala x: 16 commits (all time) y: 49 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala x: 59 commits (all time) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala x: 37 commits (all time) y: 125 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala x: 150 commits (all time) y: 249 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionHelper.scala x: 7 commits (all time) y: 535 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala x: 91 commits (all time) y: 400 lines of code sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js x: 22 commits (all time) y: 264 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala x: 56 commits (all time) y: 187 lines of code dev/merge_spark_pr.py x: 91 commits (all time) y: 512 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala x: 32 commits (all time) y: 95 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala x: 51 commits (all time) y: 243 lines of code python/pyspark/errors/exceptions/base.py x: 29 commits (all time) y: 150 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java x: 49 commits (all time) y: 486 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala x: 121 commits (all time) y: 592 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/PythonScan.scala x: 6 commits (all time) y: 49 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificInternalRow.scala x: 13 commits (all time) y: 221 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 21 commits (all time) y: 1454 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 122 commits (all time) y: 374 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 60 commits (all time) y: 675 lines of code launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java x: 53 commits (all time) y: 256 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java x: 28 commits (all time) y: 78 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala x: 23 commits (all time) y: 369 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala x: 51 commits (all time) y: 202 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 295 commits (all time) y: 1905 lines of code python/pyspark/sql/connect/session.py x: 151 commits (all time) y: 838 lines of code python/pyspark/ml/tuning.py x: 84 commits (all time) y: 1133 lines of code core/src/main/scala/org/apache/spark/util/Distribution.scala x: 19 commits (all time) y: 40 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala x: 52 commits (all time) y: 181 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala x: 35 commits (all time) y: 399 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala x: 29 commits (all time) y: 476 lines of code common/utils/src/main/scala/org/apache/spark/internal/Logging.scala x: 25 commits (all time) y: 415 
lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 275 commits (all time) y: 1210 lines of code mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala x: 71 commits (all time) y: 397 lines of code python/pyspark/ml/clustering.py x: 100 commits (all time) y: 1000 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 32 commits (all time) y: 578 lines of code mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 66 commits (all time) y: 603 lines of code project/MimaExcludes.scala x: 530 commits (all time) y: 202 lines of code core/src/main/resources/org/apache/spark/ui/static/webui.css x: 61 commits (all time) y: 403 lines of code core/src/main/scala/org/apache/spark/ui/UIUtils.scala x: 129 commits (all time) y: 589 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala x: 140 commits (all time) y: 486 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala x: 45 commits (all time) y: 193 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala x: 28 commits (all time) y: 432 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java x: 38 commits (all time) y: 313 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/DB2Dialect.scala x: 41 commits (all time) y: 161 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ValidateSubqueryExpression.scala x: 2 commits (all time) y: 301 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 244 commits (all time) y: 1490 lines of code python/pyspark/sql/plot/core.py x: 25 commits (all time) y: 297 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala x: 35 commits (all time) y: 549 lines of code 
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 17 commits (all time) y: 778 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala x: 79 commits (all time) y: 337 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala x: 13 commits (all time) y: 201 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala x: 33 commits (all time) y: 388 lines of code sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 29 commits (all time) y: 1882 lines of code connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaExceptions.scala x: 9 commits (all time) y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala x: 103 commits (all time) y: 436 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala x: 8 commits (all time) y: 181 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectListenerBusListener.scala x: 5 commits (all time) y: 103 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 222 commits (all time) y: 1102 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala x: 7 commits (all time) y: 248 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala x: 10 commits (all time) y: 432 lines of code core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java x: 42 commits (all time) y: 315 lines of code core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java x: 49 commits (all time) y: 260 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala x: 99 commits (all time) y: 695 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala x: 48 commits (all time) y: 233 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala x: 11 commits (all time) y: 544 lines of code core/src/main/scala/org/apache/spark/util/ThreadUtils.scala x: 34 commits (all time) y: 247 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala x: 45 commits (all time) y: 454 lines of code core/src/main/scala/org/apache/spark/BarrierTaskContext.scala x: 35 commits (all time) y: 165 lines of code core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala x: 20 commits (all time) y: 67 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala x: 53 commits (all time) y: 191 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala x: 75 commits (all time) y: 311 lines of code mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala x: 27 commits (all time) y: 362 lines of code mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala x: 29 commits (all time) y: 544 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala x: 31 commits (all time) y: 196 lines of code mllib/src/main/scala/org/apache/spark/mllib/evaluation/MultilabelMetrics.scala x: 17 commits (all time) y: 137 lines of code mllib/src/main/scala/org/apache/spark/mllib/feature/IDF.scala x: 18 commits (all time) y: 150 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala x: 66 commits (all time) y: 560 lines of code mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala x: 32 commits (all time) y: 143 lines of code mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala x: 23 commits (all time) y: 76 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala x: 19 
commits (all time) y: 407 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala x: 18 commits (all time) y: 323 lines of code python/pyspark/sql/udf.py x: 82 commits (all time) y: 464 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala x: 17 commits (all time) y: 200 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 100 commits (all time) y: 1277 lines of code mllib/src/main/scala/org/apache/spark/ml/stat/ANOVATest.scala x: 9 commits (all time) y: 143 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala x: 33 commits (all time) y: 394 lines of code python/pyspark/sql/utils.py x: 101 commits (all time) y: 289 lines of code python/pyspark/pandas/base.py x: 63 commits (all time) y: 601 lines of code core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 142 commits (all time) y: 1419 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala x: 4 commits (all time) y: 581 lines of code python/packaging/classic/setup.py x: 21 commits (all time) y: 298 lines of code sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala x: 27 commits (all time) y: 707 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala x: 83 commits (all time) y: 189 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowDeserializer.scala x: 4 commits (all time) y: 496 lines of code scalastyle-config.xml x: 74 commits (all time) y: 337 lines of code common/variant/src/main/java/org/apache/spark/types/variant/Variant.java x: 9 commits (all time) y: 250 lines of code common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java x: 8 commits (all time) y: 450 lines of code 
[Scatter plot: one point per source file; x: commits (all time), y: lines of code]
core/src/main/scala/org/apache/spark/status/protobuf/StageDataWrapperSerializer.scala x: 10 commits (all time) y: 659 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala x: 10 commits (all time) y: 640 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala x: 86 commits (all time) y: 694 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/RegressionModel.scala x: 20 commits (all time) y: 22 lines of code core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala x: 126 commits (all time) y: 286 lines of code mllib/src/main/scala/org/apache/spark/ml/attribute/attributes.scala x: 16 commits (all time) y: 355 lines of code mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala x: 154 commits (all time) y: 1122 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala x: 66 commits (all time) y: 773 lines of code streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala x: 66 commits (all time) y: 186 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala x: 67 commits (all time) y: 1053 lines of code core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala x: 34 commits (all time) y: 199 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NoSuchItemException.scala x: 30 commits (all time) y: 66 lines of code core/src/main/scala/org/apache/spark/storage/StorageUtils.scala x: 50 commits (all time) y: 126 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala x: 63 commits (all time) y: 89 lines of code common/utils/src/main/scala/org/apache/spark/util/ClosureCleaner.scala x: 2 commits (all time) y: 618 lines of code core/src/main/scala/org/apache/spark/status/LiveEntity.scala x: 45 commits (all time) y: 817 lines of code core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala 
x: 47 commits (all time) y: 120 lines of code python/pyspark/ml/param/_shared_params_code_gen.py x: 55 commits (all time) y: 308 lines of code python/pyspark/ml/param/shared.py x: 57 commits (all time) y: 451 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala x: 36 commits (all time) y: 370 lines of code core/src/main/scala/org/apache/spark/scheduler/SchedulerBackend.scala x: 34 commits (all time) y: 29 lines of code core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala x: 60 commits (all time) y: 48 lines of code core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala x: 58 commits (all time) y: 110 lines of code core/src/main/scala/org/apache/spark/rdd/SubtractedRDD.scala x: 53 commits (all time) y: 87 lines of code mllib/src/main/scala/org/apache/spark/mllib/util/LinearDataGenerator.scala x: 36 commits (all time) y: 107 lines of code streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala x: 83 commits (all time) y: 380 lines of code streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala x: 138 commits (all time) y: 274 lines of code python/pyspark/sql/connect/proto/catalog_pb2.pyi x: 10 commits (all time) y: 913 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala x: 65 commits (all time) y: 1142 lines of code core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala x: 118 commits (all time) y: 62 lines of code core/src/main/scala/org/apache/spark/package.scala x: 42 commits (all time) y: 12 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala x: 93 commits (all time) y: 114 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala x: 60 commits (all time) y: 255 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala x: 32 commits (all time) y: 505 lines of code core/src/main/scala/org/apache/spark/scheduler/ActiveJob.scala x: 17 commits (all time) y: 18 lines of code core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala x: 76 commits (all time) y: 47 lines of code core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala x: 78 commits (all time) y: 10 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala x: 19 commits (all time) y: 654 lines of code mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala x: 17 commits (all time) y: 425 lines of code mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala x: 36 commits (all time) y: 90 lines of code mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala x: 55 commits (all time) y: 149 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala x: 41 commits (all time) y: 297 lines of code mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala x: 63 commits (all time) y: 211 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala x: 31 commits (all time) y: 568 lines of code mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDs.scala x: 16 commits (all time) y: 506 lines of code mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala x: 90 commits (all time) y: 221 lines of code core/src/main/scala/org/apache/spark/scheduler/Stage.scala x: 63 commits (all time) y: 53 lines of code common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java x: 10 commits (all time) y: 370 lines of code core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala x: 67 commits (all time) y: 78 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala x: 23 commits (all time) y: 514 lines of code 
core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala x: 56 commits (all time) y: 70 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala x: 31 commits (all time) y: 118 lines of code core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala x: 34 commits (all time) y: 20 lines of code core/src/main/scala/org/apache/spark/rdd/BlockRDD.scala x: 48 commits (all time) y: 51 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java x: 5 commits (all time) y: 390 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala x: 78 commits (all time) y: 25 lines of code core/src/main/resources/org/apache/spark/ui/static/sorttable.js x: 15 commits (all time) y: 352 lines of code core/src/main/scala/org/apache/spark/scheduler/SparkListenerBus.scala x: 75 commits (all time) y: 80 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/UnionDStream.scala x: 30 commits (all time) y: 29 lines of code graphx/src/main/scala/org/apache/spark/graphx/impl/EdgePartition.scala x: 29 commits (all time) y: 338 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala x: 63 commits (all time) y: 298 lines of code core/src/main/scala/org/apache/spark/rdd/ZippedPartitionsRDD.scala x: 48 commits (all time) y: 114 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/WindowedDStream.scala x: 45 commits (all time) y: 35 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerSource.scala x: 43 commits (all time) y: 33 lines of code core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala x: 51 commits (all time) y: 37 lines of code core/src/main/scala/org/apache/spark/deploy/master/ApplicationState.scala x: 35 commits (all time) y: 5 lines of code core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala 
x: 31 commits (all time) y: 19 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSQLParser.scala x: 3 commits (all time) y: 1040 lines of code core/src/main/scala/org/apache/spark/scheduler/DAGSchedulerSource.scala x: 46 commits (all time) y: 25 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala x: 23 commits (all time) y: 1053 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala x: 54 commits (all time) y: 337 lines of code
lines of code (y axis, up to 5815.0)
min: 1.0 | average: 157.16 | 25th percentile: 24.0 | median: 68.0 | 75th percentile: 167.0 | max: 5815.0
commits (all time) (x axis, 0 to 1203.0)
min: 1.0 | average: 21.43 | 25th percentile: 3.0 | median: 7.0 | 75th percentile: 21.0 | max: 1203.0

File Size vs. Contributors (all time): 4071 points

[Scatter plot: x = contributors (all time), y = lines of code, one point per file]
time) y: 144 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 18 contributors (all time) y: 681 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StructField.scala x: 7 contributors (all time) y: 148 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala x: 13 contributors (all time) y: 406 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SQLFunction.scala x: 2 contributors (all time) y: 204 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AbstractSqlParser.scala x: 7 contributors (all time) y: 85 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 150 contributors (all time) y: 4715 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserInterface.scala x: 11 contributors (all time) y: 23 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 94 contributors (all time) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 2 contributors (all time) y: 279 lines of code python/pyspark/ml/connect/tuning.py x: 6 contributors (all time) y: 318 lines of code python/run-tests.py x: 30 contributors (all time) y: 287 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 67 contributors (all time) y: 991 lines of code mllib/src/main/scala/org/apache/spark/ml/Model.scala x: 6 contributors (all time) y: 13 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala x: 5 contributors (all time) y: 477 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 36 contributors (all time) y: 417 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 78 contributors (all time) y: 687 lines 
of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala x: 5 contributors (all time) y: 339 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StreamStreamJoinStatePartitionReader.scala x: 5 contributors (all time) y: 123 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkExec.scala x: 2 contributors (all time) y: 403 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 38 contributors (all time) y: 467 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala x: 19 contributors (all time) y: 543 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateVariableUtils.scala x: 4 contributors (all time) y: 154 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 23 contributors (all time) y: 793 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 32 contributors (all time) y: 1059 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala x: 26 contributors (all time) y: 314 lines of code python/pyspark/sql/connect/proto/ml_pb2.pyi x: 2 contributors (all time) y: 465 lines of code sql/connect/common/src/main/protobuf/spark/connect/ml.proto x: 3 contributors (all time) y: 114 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala x: 33 contributors (all time) y: 393 lines of code core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala x: 11 contributors (all time) y: 274 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 11 contributors (all time) y: 873 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 75 contributors (all time) y: 3066 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 69 contributors (all time) y: 432 lines of code python/pyspark/sql/pandas/_typing/__init__.pyi x: 7 contributors (all time) y: 322 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala x: 39 contributors (all time) y: 439 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala x: 16 contributors (all time) y: 190 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 115 contributors (all time) y: 806 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkDeserializer.scala x: 1 contributors (all time) y: 48 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkPythonRunner.scala x: 1 contributors (all time) y: 291 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala x: 5 contributors (all time) y: 101 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 143 contributors (all time) y: 961 lines of code python/pyspark/core/context.py x: 5 contributors (all time) y: 784 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala x: 31 contributors (all time) y: 166 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 73 contributors (all time) y: 440 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/InsertAdaptiveSparkPlan.scala x: 21 contributors (all time) y: 102 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala x: 9 contributors (all time) y: 54 lines of code mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala x: 30 contributors (all time) y: 140 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/binaryfile/BinaryFileFormat.scala x: 11 contributors (all time) y: 133 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala x: 27 contributors (all time) y: 127 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala x: 22 contributors (all time) y: 107 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/xml/XmlFileFormat.scala x: 6 contributors (all time) y: 106 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala x: 31 contributors (all time) y: 516 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala x: 63 contributors (all time) y: 551 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 46 contributors (all time) y: 818 lines of code python/pyspark/sql/classic/column.py x: 5 contributors (all time) y: 490 lines of code python/pyspark/sql/column.py x: 43 contributors (all time) y: 317 lines of code python/pyspark/sql/connect/expressions.py x: 8 contributors (all time) y: 1039 lines of code python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 14 contributors (all time) y: 1764 lines of code sql/api/src/main/scala/org/apache/spark/sql/Column.scala x: 4 contributors (all time) y: 274 lines of code sql/api/src/main/scala/org/apache/spark/sql/internal/columnNodes.scala x: 4 contributors (all time) y: 408 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 71 contributors (all time) y: 1057 lines 
of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 2 contributors (all time) y: 1025 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 6 contributors (all time) y: 598 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/columnNodeSupport.scala x: 3 contributors (all time) y: 248 lines of code python/pyspark/accumulators.py x: 41 contributors (all time) y: 173 lines of code common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java x: 5 contributors (all time) y: 396 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XmlExpressionEvalUtils.scala x: 3 contributors (all time) y: 159 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xmlExpressions.scala x: 12 contributors (all time) y: 204 lines of code core/src/main/scala/org/apache/spark/internal/config/Python.scala x: 6 contributors (all time) y: 86 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 82 contributors (all time) y: 613 lines of code core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 78 contributors (all time) y: 575 lines of code core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala x: 45 contributors (all time) y: 399 lines of code core/src/main/scala/org/apache/spark/api/python/PythonWorkerUtils.scala x: 3 contributors (all time) y: 118 lines of code core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala x: 10 contributors (all time) y: 123 lines of code core/src/main/scala/org/apache/spark/api/r/RRDD.scala x: 18 contributors (all time) y: 119 lines of code python/pyspark/daemon.py x: 28 contributors (all time) y: 171 lines of code python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py x: 5 contributors (all time) y: 59 lines of code 
python/pyspark/sql/connect/streaming/worker/listener_worker.py x: 4 contributors (all time) y: 72 lines of code python/pyspark/sql/streaming/python_streaming_source_runner.py x: 5 contributors (all time) y: 161 lines of code python/pyspark/sql/worker/plan_data_source_read.py x: 7 contributors (all time) y: 301 lines of code python/pyspark/sql/worker/write_into_data_source.py x: 6 contributors (all time) y: 179 lines of code python/pyspark/taskcontext.py x: 18 contributors (all time) y: 147 lines of code sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala x: 14 contributors (all time) y: 139 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 61 contributors (all time) y: 813 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala x: 27 contributors (all time) y: 77 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 x: 25 contributors (all time) y: 607 lines of code launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 26 contributors (all time) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 56 contributors (all time) y: 783 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/UserDefinedPythonDataSource.scala x: 5 contributors (all time) y: 437 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchEvaluatorFactory.scala x: 6 contributors (all time) y: 68 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchExec.scala x: 10 contributors (all time) y: 56 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala x: 7 contributors (all time) y: 258 lines of code core/src/main/scala/org/apache/spark/executor/Executor.scala x: 138 contributors (all time) y: 952 lines of code 
sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala x: 11 contributors (all time) y: 426 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala x: 30 contributors (all time) y: 133 lines of code common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 20 contributors (all time) y: 861 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 46 contributors (all time) y: 701 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala x: 33 contributors (all time) y: 446 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 37 contributors (all time) y: 801 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala x: 21 contributors (all time) y: 755 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalogCapability.java x: 4 contributors (all time) y: 9 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala x: 29 contributors (all time) y: 663 lines of code python/pyspark/sql/dataframe.py x: 142 contributors (all time) y: 851 lines of code python/pyspark/ml/connect/feature.py x: 3 contributors (all time) y: 216 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala x: 12 contributors (all time) y: 210 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 21 contributors (all time) y: 783 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala x: 23 contributors (all time) y: 284 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala x: 29 contributors (all time) y: 190 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 96 contributors (all time) y: 3781 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/DataFrameReader.scala x: 2 contributors (all time) y: 224 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/DataStreamWriter.scala x: 2 contributors (all time) y: 323 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateExpressionResolver.scala x: 1 contributors (all time) y: 154 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/BinaryArithmeticResolver.scala x: 1 contributors (all time) y: 90 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionIdAssigner.scala x: 1 contributors (all time) y: 210 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/JoinResolver.scala x: 1 contributors (all time) y: 180 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ProjectResolver.scala x: 1 contributors (all time) y: 119 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolutionValidator.scala x: 1 contributors (all time) y: 239 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/parsers.scala x: 9 contributors (all time) y: 313 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala x: 11 contributors (all time) y: 319 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala x: 10 contributors (all time) y: 146 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 13 contributors (all time) y: 743 lines of code core/src/main/scala/org/apache/spark/util/Utils.scala x: 205 contributors (all time) y: 2154 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala x: 22 contributors (all time) y: 210 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala x: 25 contributors (all time) y: 456 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala x: 18 contributors (all time) y: 239 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 42 contributors (all time) y: 462 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 28 contributors (all time) y: 704 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala x: 27 contributors (all time) y: 470 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala x: 41 contributors (all time) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala x: 20 contributors (all time) y: 125 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala x: 62 contributors (all time) y: 249 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionHelper.scala x: 6 contributors (all time) y: 535 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala x: 48 contributors (all time) y: 400 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java x: 27 contributors (all time) y: 486 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala x: 56 contributors (all time) y: 592 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 12 contributors (all time) y: 1454 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 59 contributors (all time) y: 374 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 29 contributors (all time) y: 675 lines of code launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java x: 21 contributors (all time) y: 256 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java x: 15 contributors (all time) y: 78 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala x: 14 contributors (all time) y: 369 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala x: 8 contributors (all time) y: 87 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala x: 37 contributors (all time) y: 202 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 90 contributors (all time) y: 1905 lines of code python/pyspark/sql/connect/session.py x: 24 contributors (all time) y: 838 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala x: 17 contributors (all time) y: 72 lines of code python/pyspark/ml/tuning.py x: 45 contributors (all time) y: 1133 lines of code python/pyspark/sql/connect/client/reattach.py x: 9 contributors (all time) y: 218 lines of code core/src/main/scala/org/apache/spark/util/Distribution.scala x: 13 contributors (all time) y: 40 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala x: 25 contributors (all time) y: 399 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala x: 16 contributors (all time) y: 476 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 80 contributors 
(all time) y: 1210 lines of code python/pyspark/ml/clustering.py x: 33 contributors (all time) y: 1000 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 17 contributors (all time) y: 578 lines of code mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 29 contributors (all time) y: 603 lines of code project/MimaExcludes.scala x: 174 contributors (all time) y: 202 lines of code core/src/main/resources/org/apache/spark/ui/static/webui.css x: 39 contributors (all time) y: 403 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala x: 57 contributors (all time) y: 486 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala x: 22 contributors (all time) y: 193 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala x: 22 contributors (all time) y: 432 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java x: 12 contributors (all time) y: 313 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala x: 14 contributors (all time) y: 230 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ValidateSubqueryExpression.scala x: 2 contributors (all time) y: 301 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 83 contributors (all time) y: 1490 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala x: 15 contributors (all time) y: 190 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 14 contributors (all time) y: 778 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala x: 45 contributors (all time) y: 337 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala x: 11 contributors (all time) y: 201 
lines of code core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala x: 15 contributors (all time) y: 165 lines of code sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 14 contributors (all time) y: 1882 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala x: 53 contributors (all time) y: 436 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectListenerBusListener.scala x: 4 contributors (all time) y: 103 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectSessionManager.scala x: 6 contributors (all time) y: 216 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamingQueryCache.scala x: 4 contributors (all time) y: 237 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 80 contributors (all time) y: 1102 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala x: 7 contributors (all time) y: 432 lines of code core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java x: 30 contributors (all time) y: 315 lines of code core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java x: 28 contributors (all time) y: 260 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala x: 49 contributors (all time) y: 695 lines of code core/src/main/scala/org/apache/spark/util/ThreadUtils.scala x: 25 contributors (all time) y: 247 lines of code core/src/main/scala/org/apache/spark/BarrierCoordinator.scala x: 11 contributors (all time) y: 149 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala x: 24 contributors (all time) y: 454 lines of code core/src/main/scala/org/apache/spark/BarrierTaskContext.scala x: 18 contributors (all time) y: 165 lines of code 
core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala x: 17 contributors (all time) y: 67 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala x: 28 contributors (all time) y: 191 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala x: 35 contributors (all time) y: 311 lines of code mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala x: 20 contributors (all time) y: 362 lines of code mllib/src/main/scala/org/apache/spark/ml/optim/loss/RDDLossFunction.scala x: 5 contributors (all time) y: 32 lines of code mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala x: 13 contributors (all time) y: 544 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala x: 19 contributors (all time) y: 196 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala x: 25 contributors (all time) y: 370 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala x: 34 contributors (all time) y: 560 lines of code mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala x: 31 contributors (all time) y: 184 lines of code mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala x: 19 contributors (all time) y: 143 lines of code mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala x: 12 contributors (all time) y: 76 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala x: 7 contributors (all time) y: 120 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala x: 10 contributors (all time) y: 407 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala x: 12 contributors (all time) y: 418 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala x: 13 contributors (all time) y: 
323 lines of code python/pyspark/sql/connect/udf.py x: 5 contributors (all time) y: 226 lines of code python/pyspark/sql/udf.py x: 26 contributors (all time) y: 464 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 48 contributors (all time) y: 1277 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala x: 21 contributors (all time) y: 394 lines of code python/pyspark/ml/base.py x: 12 contributors (all time) y: 174 lines of code python/pyspark/sql/utils.py x: 31 contributors (all time) y: 289 lines of code python/pyspark/pandas/base.py x: 13 contributors (all time) y: 601 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PushVariantIntoScan.scala x: 2 contributors (all time) y: 233 lines of code core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 69 contributors (all time) y: 1419 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala x: 3 contributors (all time) y: 581 lines of code sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala x: 7 contributors (all time) y: 707 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala x: 32 contributors (all time) y: 189 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowDeserializer.scala x: 4 contributors (all time) y: 496 lines of code scalastyle-config.xml x: 29 contributors (all time) y: 337 lines of code common/variant/src/main/java/org/apache/spark/types/variant/VariantBuilder.java x: 4 contributors (all time) y: 450 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkShreddingUtils.scala x: 3 contributors (all time) y: 672 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala x: 19 contributors (all time) y: 
[Scatter plot: one point per source file. x axis: contributors (all time); y axis: lines of code.]
x: 42 contributors (all time) y: 497 lines of code core/src/main/scala/org/apache/spark/io/CompressionCodec.scala x: 41 contributors (all time) y: 150 lines of code python/pyspark/sql/context.py x: 51 contributors (all time) y: 292 lines of code launcher/src/main/java/org/apache/spark/launcher/SparkSubmitOptionParser.java x: 9 contributors (all time) y: 143 lines of code core/src/main/scala/org/apache/spark/status/AppStatusListener.scala x: 42 contributors (all time) y: 1100 lines of code core/src/main/scala/org/apache/spark/status/AppStatusStore.scala x: 30 contributors (all time) y: 749 lines of code core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala x: 39 contributors (all time) y: 382 lines of code python/pyspark/pandas/data_type_ops/base.py x: 9 contributors (all time) y: 366 lines of code python/pyspark/pandas/indexes/multi.py x: 12 contributors (all time) y: 527 lines of code core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala x: 42 contributors (all time) y: 78 lines of code core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala x: 68 contributors (all time) y: 328 lines of code core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala x: 24 contributors (all time) y: 38 lines of code core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala x: 16 contributors (all time) y: 440 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala x: 14 contributors (all time) y: 336 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala x: 56 contributors (all time) y: 203 lines of code core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala x: 26 contributors (all time) y: 617 lines of code core/src/main/scala/org/apache/spark/MapOutputTracker.scala x: 85 contributors (all time) y: 1131 lines of code core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala x: 18 contributors (all time) y: 198 
lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala x: 40 contributors (all time) y: 875 lines of code common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java x: 13 contributors (all time) y: 167 lines of code common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java x: 15 contributors (all time) y: 478 lines of code common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java x: 16 contributors (all time) y: 272 lines of code connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java x: 3 contributors (all time) y: 293 lines of code core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java x: 31 contributors (all time) y: 582 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java x: 8 contributors (all time) y: 637 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java x: 9 contributors (all time) y: 395 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java x: 5 contributors (all time) y: 10 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala x: 22 contributors (all time) y: 346 lines of code core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala x: 49 contributors (all time) y: 255 lines of code core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala x: 29 contributors (all time) y: 121 lines of code core/src/main/scala/org/apache/spark/ui/WebUI.scala x: 37 contributors (all time) y: 156 lines of code core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala x: 46 contributors (all time) y: 534 lines of code core/src/main/scala/org/apache/spark/util/logging/RollingFileAppender.scala x: 14 contributors (all time) y: 129 lines 
of code core/src/main/scala/org/apache/spark/deploy/Client.scala x: 38 contributors (all time) y: 219 lines of code core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala x: 20 contributors (all time) y: 230 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala x: 25 contributors (all time) y: 682 lines of code core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala x: 27 contributors (all time) y: 147 lines of code core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala x: 45 contributors (all time) y: 174 lines of code core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala x: 79 contributors (all time) y: 541 lines of code core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala x: 24 contributors (all time) y: 540 lines of code core/src/main/scala/org/apache/spark/scheduler/ReplayListenerBus.scala x: 19 contributors (all time) y: 82 lines of code core/src/main/scala/org/apache/spark/shuffle/ShuffleBlockPusher.scala x: 10 contributors (all time) y: 323 lines of code core/src/main/scala/org/apache/spark/util/SizeEstimator.scala x: 40 contributors (all time) y: 231 lines of code core/src/main/scala/org/apache/spark/storage/BlockId.scala x: 31 contributors (all time) y: 208 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatDataWriter.scala x: 12 contributors (all time) y: 410 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala x: 44 contributors (all time) y: 195 lines of code core/src/main/scala/org/apache/spark/ContextCleaner.scala x: 27 contributors (all time) y: 194 lines of code core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala x: 33 contributors (all time) y: 78 lines of code mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala x: 47 contributors (all time) y: 326 lines of code 
mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala x: 16 contributors (all time) y: 161 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala x: 19 contributors (all time) y: 315 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala x: 53 contributors (all time) y: 343 lines of code python/pyspark/__init__.py x: 54 contributors (all time) y: 72 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala x: 31 contributors (all time) y: 307 lines of code common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java x: 29 contributors (all time) y: 287 lines of code python/pyspark/mllib/clustering.py x: 41 contributors (all time) y: 449 lines of code python/pyspark/mllib/recommendation.py x: 28 contributors (all time) y: 136 lines of code python/pyspark/mllib/stat/KernelDensity.py x: 7 contributors (all time) y: 19 lines of code python/pyspark/mllib/tree.py x: 22 contributors (all time) y: 321 lines of code python/pyspark/serializers.py x: 58 contributors (all time) y: 373 lines of code python/pyspark/streaming/dstream.py x: 19 contributors (all time) y: 489 lines of code resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala x: 28 contributors (all time) y: 149 lines of code sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala x: 30 contributors (all time) y: 99 lines of code sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala x: 12 contributors (all time) y: 480 lines of code core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerWebUI.scala x: 39 contributors (all time) y: 31 lines of code core/src/main/scala/org/apache/spark/metrics/sink/MetricsServlet.scala x: 22 contributors (all time) y: 33 lines of code core/src/main/scala/org/apache/spark/ui/PagedTable.scala x: 15 
contributors (all time) y: 272 lines of code core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala x: 45 contributors (all time) y: 519 lines of code core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala x: 37 contributors (all time) y: 472 lines of code core/src/main/scala/org/apache/spark/ui/jobs/PoolPage.scala x: 25 contributors (all time) y: 40 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala x: 88 contributors (all time) y: 471 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala x: 63 contributors (all time) y: 294 lines of code core/src/main/scala/org/apache/spark/ui/jobs/StagesTab.scala x: 17 contributors (all time) y: 40 lines of code core/src/main/scala/org/apache/spark/ui/storage/RDDPage.scala x: 37 contributors (all time) y: 209 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala x: 24 contributors (all time) y: 495 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerPage.scala x: 23 contributors (all time) y: 346 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerSessionPage.scala x: 20 contributors (all time) y: 84 lines of code streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala x: 29 contributors (all time) y: 426 lines of code dev/run-tests.py x: 52 contributors (all time) y: 431 lines of code launcher/src/main/java/org/apache/spark/launcher/LauncherServer.java x: 14 contributors (all time) y: 271 lines of code core/src/main/resources/org/apache/spark/ui/static/executorspage.js x: 23 contributors (all time) y: 710 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java x: 28 contributors (all time) y: 310 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java x: 16 contributors (all time) y: 
709 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala x: 40 contributors (all time) y: 795 lines of code mllib/src/main/scala/org/apache/spark/mllib/evaluation/RegressionMetrics.scala x: 18 contributors (all time) y: 65 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java x: 15 contributors (all time) y: 481 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala x: 41 contributors (all time) y: 31 lines of code core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala x: 46 contributors (all time) y: 93 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala x: 13 contributors (all time) y: 302 lines of code core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala x: 31 contributors (all time) y: 84 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala x: 15 contributors (all time) y: 14 lines of code core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto x: 8 contributors (all time) y: 740 lines of code common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java x: 5 contributors (all time) y: 287 lines of code core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala x: 51 contributors (all time) y: 421 lines of code core/src/main/scala/org/apache/spark/status/protobuf/StageDataWrapperSerializer.scala x: 4 contributors (all time) y: 659 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala x: 4 contributors (all time) y: 640 lines of code streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala x: 17 contributors (all time) y: 195 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/RegressionModel.scala x: 14 contributors (all time) y: 22 lines of code 
core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala x: 54 contributors (all time) y: 286 lines of code mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala x: 59 contributors (all time) y: 1122 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala x: 38 contributors (all time) y: 773 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala x: 16 contributors (all time) y: 231 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala x: 34 contributors (all time) y: 1053 lines of code core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala x: 35 contributors (all time) y: 221 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala x: 16 contributors (all time) y: 214 lines of code common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java x: 19 contributors (all time) y: 231 lines of code core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala x: 28 contributors (all time) y: 45 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateSafeProjection.scala x: 23 contributors (all time) y: 156 lines of code common/utils/src/main/scala/org/apache/spark/util/ClosureCleaner.scala x: 2 contributors (all time) y: 618 lines of code core/src/main/scala/org/apache/spark/status/LiveEntity.scala x: 26 contributors (all time) y: 817 lines of code python/pyspark/join.py x: 19 contributors (all time) y: 66 lines of code python/pyspark/shuffle.py x: 14 contributors (all time) y: 446 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/FlatMapValuedDStream.scala x: 19 contributors (all time) y: 15 lines of code core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperPersistenceEngine.scala x: 33 contributors (all time) y: 48 lines of code mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala x: 25 
contributors (all time) y: 110 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala x: 34 contributors (all time) y: 183 lines of code streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala x: 45 contributors (all time) y: 274 lines of code python/pyspark/sql/connect/proto/catalog_pb2.pyi x: 4 contributors (all time) y: 913 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala x: 33 contributors (all time) y: 1142 lines of code core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala x: 53 contributors (all time) y: 62 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala x: 18 contributors (all time) y: 35 lines of code core/src/main/scala/org/apache/spark/package.scala x: 25 contributors (all time) y: 12 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala x: 34 contributors (all time) y: 255 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinEvaluatorFactory.scala x: 1 contributors (all time) y: 259 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala x: 18 contributors (all time) y: 505 lines of code core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala x: 44 contributors (all time) y: 47 lines of code core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala x: 31 contributors (all time) y: 10 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala x: 12 contributors (all time) y: 654 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala x: 18 contributors (all time) y: 297 lines of code mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala x: 36 contributors (all time) y: 211 lines of code mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala x: 22 contributors 
(all time) y: 568 lines of code mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDs.scala x: 11 contributors (all time) y: 506 lines of code mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala x: 44 contributors (all time) y: 221 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala x: 25 contributors (all time) y: 155 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/LabeledPoint.scala x: 19 contributors (all time) y: 41 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/LinearRegression.scala x: 30 contributors (all time) y: 62 lines of code mllib/src/main/scala/org/apache/spark/mllib/regression/RidgeRegression.scala x: 26 contributors (all time) y: 62 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala x: 23 contributors (all time) y: 82 lines of code core/src/main/scala/org/apache/spark/status/storeTypes.scala x: 19 contributors (all time) y: 469 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala x: 13 contributors (all time) y: 514 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala x: 34 contributors (all time) y: 70 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala x: 26 contributors (all time) y: 172 lines of code core/src/main/scala/org/apache/spark/TaskEndReason.scala x: 38 contributors (all time) y: 138 lines of code core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala x: 40 contributors (all time) y: 122 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala x: 33 contributors (all time) y: 25 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/UnsafeKVExternalSorter.java x: 18 contributors (all time) y: 215 lines of code core/src/main/resources/org/apache/spark/ui/static/sorttable.js x: 13 contributors (all 
time) y: 352 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/UnionDStream.scala x: 21 contributors (all time) y: 29 lines of code graphx/src/main/scala/org/apache/spark/graphx/impl/EdgePartition.scala x: 12 contributors (all time) y: 338 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala x: 22 contributors (all time) y: 298 lines of code core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala x: 23 contributors (all time) y: 28 lines of code core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala x: 32 contributors (all time) y: 67 lines of code licenses-binary/LICENSE-javassist.html x: 1 contributors (all time) y: 369 lines of code core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala x: 32 contributors (all time) y: 82 lines of code core/src/main/scala/org/apache/spark/rdd/RDDCheckpointData.scala x: 26 contributors (all time) y: 37 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSQLParser.scala x: 2 contributors (all time) y: 1040 lines of code streaming/src/main/scala/org/apache/spark/streaming/dstream/PluggableInputDStream.scala x: 17 contributors (all time) y: 12 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala x: 9 contributors (all time) y: 1053 lines of code core/src/main/scala/org/apache/spark/deploy/master/RecoveryState.scala x: 16 contributors (all time) y: 5 lines of code
lines of code
min: 1.0 | average: 157.16 | 25th percentile: 24.0 | median: 68.0 | 75th percentile: 167.0 | max: 5815.0
contributors (all time)
min: 1.0 | average: 10.87 | 25th percentile: 2.0 | median: 5.0 | 75th percentile: 13.0 | max: 237.0
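
The point labels in these scatter plots follow a regular shape (path, x value with its axis label, y value in lines of code), so they can be read back into structured data. The following is a minimal sketch, assuming that record shape holds throughout. The names `POINT`, `parse_points`, and `summarize` are hypothetical helpers and not part of the report tool, and no attempt is made to reproduce the report's percentile method.

```python
import re
from statistics import mean, median

# Assumed record shape, taken from the point labels in this report:
#   <path> x: <int> <x-axis label> y: <int> lines of code
POINT = re.compile(
    r"(?P<path>\S+) x: (?P<x>\d+) "
    r"(?P<label>contributors \(all time\)|commits \(30d\)) "
    r"y: (?P<loc>\d+) lines of code"
)


def parse_points(text):
    """Return (path, x, lines_of_code) tuples; tolerates records wrapped across lines."""
    flat = " ".join(text.split())  # collapse line breaks that fall inside a record
    return [(m["path"], int(m["x"]), int(m["loc"])) for m in POINT.finditer(flat)]


def summarize(values):
    """Minimum, arithmetic mean, median, and maximum of a list of numbers."""
    return {
        "min": min(values),
        "average": round(mean(values), 2),
        "median": median(values),
        "max": max(values),
    }


# Three labels copied from the report; the first is deliberately wrapped mid-record.
sample = """core/src/main/scala/org/apache/spark/storage/DiskStore.scala x: 51 contributors (all time) y: 269
lines of code core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala x: 27 contributors (all time) y: 33 lines of code
project/plugins.sbt x: 2 commits (30d) y: 14 lines of code"""

points = parse_points(sample)
stats = summarize([loc for _, _, loc in points])
```

Running `summarize` over every parsed point, rather than this three-record sample, is what would be compared against the min, average, median, and max figures listed per axis.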

File Size vs. Commits (30 days): 440 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 1 commits (30d) y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 3 commits (30d) y: 261 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 1 commits (30d) y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 1 commits (30d) y: 641 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 2 commits (30d) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 1 commits (30d) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 1 commits (30d) y: 847 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 1 commits (30d) y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 2 commits (30d) y: 150 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 3 commits (30d) y: 135 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 4 commits (30d) y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala x: 5 commits (30d) y: 225 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 1 commits (30d) y: 270 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 5 commits (30d) y: 312 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 5 commits (30d) y: 929 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 5 commits (30d) y: 255 lines of code 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 1 commits (30d) y: 346 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 1 commits (30d) y: 201 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 5 commits (30d) y: 455 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 5 commits (30d) y: 518 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 1 commits (30d) y: 237 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 5 commits (30d) y: 967 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 5 commits (30d) y: 594 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 1 commits (30d) y: 219 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 4 commits (30d) y: 331 lines of code mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 1 commits (30d) y: 21 lines of code python/pyspark/testing/connectutils.py x: 8 commits (30d) y: 177 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 3 commits (30d) y: 325 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 5 commits (30d) y: 187 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/LiteralFunctionResolution.scala x: 2 commits (30d) y: 23 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 2 commits (30d) y: 179 lines of code core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 1 commits (30d) y: 79 lines of code python/pyspark/errors/exceptions/captured.py x: 1 commits (30d) y: 284 lines of code python/pyspark/pandas/config.py x: 2 commits (30d) y: 363 lines of code 
python/pyspark/pandas/groupby.py x: 2 commits (30d) y: 1800 lines of code python/pyspark/pandas/namespace.py x: 1 commits (30d) y: 1517 lines of code python/pyspark/pandas/series.py x: 1 commits (30d) y: 2215 lines of code python/pyspark/pandas/utils.py x: 1 commits (30d) y: 657 lines of code python/pyspark/testing/pandasutils.py x: 1 commits (30d) y: 486 lines of code python/pyspark/testing/utils.py x: 2 commits (30d) y: 560 lines of code python/pyspark/sql/pandas/serializers.py x: 5 commits (30d) y: 884 lines of code python/pyspark/worker.py x: 4 commits (30d) y: 1728 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 14 commits (30d) y: 5815 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraints.scala x: 6 commits (30d) y: 174 lines of code python/pyspark/sql/datasource.py x: 2 commits (30d) y: 256 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 9 commits (30d) y: 2889 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 1 commits (30d) y: 617 lines of code connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 1 commits (30d) y: 58 lines of code project/SparkBuild.scala x: 3 commits (30d) y: 1460 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 1 commits (30d) y: 474 lines of code python/pyspark/sql/pandas/types.py x: 1 commits (30d) y: 920 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AggregateResolver.scala x: 2 commits (30d) y: 192 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 3 commits (30d) y: 527 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/FunctionResolver.scala x: 2 commits (30d) y: 102 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 2 commits (30d) y: 406 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala x: 1 commits (30d) y: 514 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala x: 4 commits (30d) y: 436 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 5 commits (30d) y: 533 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala x: 4 commits (30d) y: 150 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 4 commits (30d) y: 108 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala x: 4 commits (30d) y: 248 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 4 commits (30d) y: 122 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 4 commits (30d) y: 161 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 4 commits (30d) y: 378 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 5 commits (30d) y: 405 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/TargetEncoder.scala x: 4 commits (30d) y: 301 lines of code mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 5 commits (30d) y: 1062 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 4 commits (30d) y: 345 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 4 commits (30d) y: 478 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 4 commits (30d) y: 199 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 3 commits (30d) y: 443 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 3 commits (30d) y: 348 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 4 commits (30d) y: 920 lines of code python/pyspark/sql/connect/functions/builtin.py x: 3 commits (30d) y: 2417 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 2 commits (30d) y: 998 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala x: 2 commits (30d) y: 223 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 3 commits (30d) y: 1148 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 2 commits (30d) y: 89 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 1 commits (30d) y: 333 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala x: 3 commits (30d) y: 250 lines of code mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala x: 4 commits (30d) y: 263 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala x: 3 commits (30d) y: 304 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 3 commits (30d) y: 278 lines of code mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala x: 3 commits (30d) y: 524 lines of code dev/sparktestsupport/modules.py x: 4 commits (30d) y: 1409 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 1 commits (30d) y: 441 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 2 commits (30d) y: 1683 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ViewResolver.scala x: 3 commits (30d) 
y: 91 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 1 commits (30d) y: 1503 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 2 commits (30d) y: 305 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonArrowOutput.scala x: 2 commits (30d) y: 243 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala x: 2 commits (30d) y: 528 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 1 commits (30d) y: 1547 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 6 commits (30d) y: 3468 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala x: 3 commits (30d) y: 1498 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 4 commits (30d) y: 754 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala x: 1 commits (30d) y: 367 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/Catalog.scala x: 3 commits (30d) y: 568 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 3 commits (30d) y: 733 lines of code python/pyspark/ml/classification.py x: 2 commits (30d) y: 2173 lines of code python/pyspark/ml/connect/readwrite.py x: 2 commits (30d) y: 290 lines of code python/pyspark/ml/feature.py x: 1 commits (30d) y: 3621 lines of code python/pyspark/ml/util.py x: 2 commits (30d) y: 714 lines of code python/pyspark/sql/connect/client/core.py x: 5 commits (30d) y: 1449 lines of code python/pyspark/sql/connect/group.py x: 4 commits (30d) y: 488 lines of code python/pyspark/sql/connect/plan.py x: 3 commits (30d) y: 2133 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 3 commits (30d) y: 693 lines of code project/plugins.sbt x: 2 commits (30d) y: 14 lines of code common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 1 commits (30d) y: 123 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 2 commits (30d) y: 2609 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 5 commits (30d) y: 1425 lines of code python/pyspark/ml/connect/functions.py x: 3 commits (30d) y: 47 lines of code python/pyspark/sql/connect/tvf.py x: 3 commits (30d) y: 101 lines of code core/src/main/scala/org/apache/spark/internal/config/package.scala x: 1 commits (30d) y: 2456 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 1 commits (30d) y: 886 lines of code python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 1 commits (30d) y: 1116 lines of code python/pyspark/sql/streaming/stateful_processor_api_client.py x: 3 commits (30d) y: 392 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala x: 2 commits (30d) y: 116 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 7 commits (30d) y: 458 lines of code python/pyspark/sql/functions/__init__.py x: 2 commits (30d) y: 466 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolverGuard.scala x: 2 commits (30d) y: 369 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala x: 1 commits (30d) y: 404 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 1 commits (30d) y: 314 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 1 commits (30d) y: 553 lines of code 
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 4 commits (30d) y: 2122 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 1 commits (30d) y: 144 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 3 commits (30d) y: 681 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 7 commits (30d) y: 4715 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 3 commits (30d) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala x: 2 commits (30d) y: 417 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 4 commits (30d) y: 687 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 3 commits (30d) y: 467 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala x: 1 commits (30d) y: 543 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 1 commits (30d) y: 793 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 1 commits (30d) y: 1059 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 2 commits (30d) y: 873 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 1 commits (30d) y: 3066 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 3 commits (30d) y: 432 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 1 commits (30d) y: 806 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableInfo.java x: 2 commits (30d) y: 58 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 3 commits (30d) y: 961 lines of code python/pyspark/core/context.py x: 3 commits (30d) y: 784 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 2 commits (30d) y: 818 lines of code python/pyspark/sql/connect/expressions.py x: 1 commits (30d) y: 1039 lines of code python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 1 commits (30d) y: 1764 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 2 commits (30d) y: 1057 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 1 commits (30d) y: 1025 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 2 commits (30d) y: 613 lines of code core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 2 commits (30d) y: 575 lines of code python/pyspark/errors/exceptions/connect.py x: 2 commits (30d) y: 332 lines of code core/src/main/scala/org/apache/spark/executor/Executor.scala x: 1 commits (30d) y: 952 lines of code common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 2 commits (30d) y: 861 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 1 commits (30d) y: 701 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala x: 2 commits (30d) y: 755 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 2 commits (30d) y: 783 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 1 commits (30d) y: 3781 lines of code
lines of code (y-axis)
min: 4.0 | average: 408.8 | 25th percentile: 95.25 | median: 218.0 | 75th percentile: 445.25 | max: 5815.0
commits (30d) (x-axis)
min: 1.0 | average: 1.79 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 14.0

File Size vs. Contributors (30 days): 440 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 1 contributors (30d) y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 3 contributors (30d) y: 261 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 1 contributors (30d) y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 1 contributors (30d) y: 641 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 2 contributors (30d) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 1 contributors (30d) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 1 contributors (30d) y: 847 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 1 contributors (30d) y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 2 contributors (30d) y: 150 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 2 contributors (30d) y: 135 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 2 contributors (30d) y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 1 contributors (30d) y: 270 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 2 contributors (30d) y: 312 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 2 contributors (30d) y: 929 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 2 contributors (30d) y: 255 lines of code 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 1 contributors (30d) y: 346 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 1 contributors (30d) y: 201 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 2 contributors (30d) y: 455 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 2 contributors (30d) y: 518 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 1 contributors (30d) y: 237 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 2 contributors (30d) y: 967 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 2 contributors (30d) y: 594 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 1 contributors (30d) y: 219 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 2 contributors (30d) y: 331 lines of code mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 1 contributors (30d) y: 21 lines of code python/pyspark/testing/connectutils.py x: 4 contributors (30d) y: 177 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 3 contributors (30d) y: 187 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/LiteralFunctionResolution.scala x: 2 contributors (30d) y: 23 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 2 contributors (30d) y: 179 lines of code core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 1 contributors (30d) y: 79 lines of code python/pyspark/errors/exceptions/captured.py x: 1 contributors (30d) y: 284 lines of code python/pyspark/pandas/groupby.py x: 2 contributors (30d) y: 1800 lines of code python/pyspark/pandas/namespace.py x: 1 
contributors (30d) y: 1517 lines of code python/pyspark/pandas/series.py x: 1 contributors (30d) y: 2215 lines of code python/pyspark/pandas/utils.py x: 1 contributors (30d) y: 657 lines of code python/pyspark/testing/pandasutils.py x: 1 contributors (30d) y: 486 lines of code python/pyspark/testing/utils.py x: 2 contributors (30d) y: 560 lines of code python/pyspark/sql/pandas/serializers.py x: 3 contributors (30d) y: 884 lines of code python/pyspark/worker.py x: 2 contributors (30d) y: 1728 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 11 contributors (30d) y: 5815 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 7 contributors (30d) y: 2889 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 1 contributors (30d) y: 617 lines of code connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 1 contributors (30d) y: 58 lines of code project/SparkBuild.scala x: 2 contributors (30d) y: 1460 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 1 contributors (30d) y: 474 lines of code python/pyspark/sql/pandas/types.py x: 1 contributors (30d) y: 920 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 1 contributors (30d) y: 527 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 1 contributors (30d) y: 406 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala x: 1 contributors (30d) y: 514 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 2 contributors (30d) y: 533 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 2 contributors (30d) y: 108 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala 
x: 2 contributors (30d) y: 248 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 2 contributors (30d) y: 378 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 3 contributors (30d) y: 405 lines of code mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 2 contributors (30d) y: 1062 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 2 contributors (30d) y: 345 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 2 contributors (30d) y: 478 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 2 contributors (30d) y: 199 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 3 contributors (30d) y: 443 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 3 contributors (30d) y: 348 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 3 contributors (30d) y: 920 lines of code python/pyspark/sql/connect/functions/builtin.py x: 2 contributors (30d) y: 2417 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 1 contributors (30d) y: 998 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 2 contributors (30d) y: 1148 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 2 contributors (30d) y: 89 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 1 contributors (30d) y: 555 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 1 contributors (30d) y: 333 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala x: 3 
contributors (30d) y: 214 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 2 contributors (30d) y: 278 lines of code dev/sparktestsupport/modules.py x: 3 contributors (30d) y: 1409 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 1 contributors (30d) y: 441 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 2 contributors (30d) y: 1683 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ViewResolver.scala x: 3 contributors (30d) y: 91 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 1 contributors (30d) y: 1503 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 1 contributors (30d) y: 1547 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 4 contributors (30d) y: 3468 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala x: 2 contributors (30d) y: 1498 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 2 contributors (30d) y: 754 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala x: 3 contributors (30d) y: 539 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala x: 1 contributors (30d) y: 367 lines of code python/pyspark/ml/classification.py x: 1 contributors (30d) y: 2173 lines of code python/pyspark/ml/feature.py x: 1 contributors (30d) y: 3621 lines of code python/pyspark/ml/util.py x: 1 contributors (30d) y: 714 lines of code python/pyspark/sql/connect/client/core.py x: 3 contributors (30d) y: 1449 lines of code python/pyspark/sql/connect/group.py x: 2 contributors 
(30d) y: 488 lines of code python/pyspark/sql/connect/plan.py x: 2 contributors (30d) y: 2133 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 3 contributors (30d) y: 693 lines of code common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 1 contributors (30d) y: 123 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 2 contributors (30d) y: 2609 lines of code python/pyspark/sql/pandas/functions.py x: 1 contributors (30d) y: 160 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 4 contributors (30d) y: 1425 lines of code core/src/main/scala/org/apache/spark/internal/config/package.scala x: 1 contributors (30d) y: 2456 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 1 contributors (30d) y: 886 lines of code python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 1 contributors (30d) y: 1116 lines of code python/pyspark/sql/streaming/stateful_processor_api_client.py x: 2 contributors (30d) y: 392 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 5 contributors (30d) y: 458 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 1 contributors (30d) y: 314 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 4 contributors (30d) y: 2122 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 2 contributors (30d) y: 681 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 4 contributors (30d) y: 4715 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 3 contributors (30d) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 2 contributors 
(30d) y: 991 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 1 contributors (30d) y: 687 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 1 contributors (30d) y: 793 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 1 contributors (30d) y: 1059 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 1 contributors (30d) y: 3066 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 3 contributors (30d) y: 432 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 1 contributors (30d) y: 806 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableInfo.java x: 2 contributors (30d) y: 58 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 3 contributors (30d) y: 961 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/AnalyzerBridgeState.scala x: 2 contributors (30d) y: 22 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/Resolver.scala x: 2 contributors (30d) y: 421 lines of code core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 2 contributors (30d) y: 818 lines of code python/pyspark/sql/connect/expressions.py x: 1 contributors (30d) y: 1039 lines of code python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 1 contributors (30d) y: 1764 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 1 contributors (30d) y: 1025 lines of code core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 1 contributors (30d) y: 575 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 x: 2 contributors 
(30d) y: 607 lines of code core/src/main/scala/org/apache/spark/executor/Executor.scala x: 1 contributors (30d) y: 952 lines of code common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 2 contributors (30d) y: 861 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 2 contributors (30d) y: 783 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 1 contributors (30d) y: 3781 lines of code
lines of code (y-axis)
min: 4.0 | average: 408.8 | 25th percentile: 95.25 | median: 218.0 | 75th percentile: 445.25 | max: 5815.0
contributors (30d) (x-axis)
min: 1.0 | average: 1.38 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 11.0

File Size vs. Commits (90 days): 798 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 1 commits (90d) y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 5 commits (90d) y: 261 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 3 commits (90d) y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 2 commits (90d) y: 641 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 3 commits (90d) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 2 commits (90d) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 1 commits (90d) y: 847 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 1 commits (90d) y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 3 commits (90d) y: 150 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 3 commits (90d) y: 135 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 5 commits (90d) y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala x: 7 commits (90d) y: 225 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 2 commits (90d) y: 270 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 7 commits (90d) y: 312 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 7 commits (90d) y: 929 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 7 commits (90d) y: 255 lines of code 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 2 commits (90d) y: 346 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 2 commits (90d) y: 201 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 6 commits (90d) y: 455 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 6 commits (90d) y: 518 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 2 commits (90d) y: 237 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 7 commits (90d) y: 967 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 7 commits (90d) y: 594 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 2 commits (90d) y: 219 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala x: 1 commits (90d) y: 351 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 4 commits (90d) y: 331 lines of code mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 1 commits (90d) y: 21 lines of code python/pyspark/testing/connectutils.py x: 9 commits (90d) y: 177 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 5 commits (90d) y: 325 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 6 commits (90d) y: 187 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLException.scala x: 3 commits (90d) y: 28 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLHandler.scala x: 7 commits (90d) y: 323 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 2 commits (90d) y: 179 lines of code 
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectAnalyzeHandler.scala x: 1 commits (90d) y: 186 lines of code core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 1 commits (90d) y: 79 lines of code python/pyspark/errors/exceptions/captured.py x: 2 commits (90d) y: 284 lines of code python/pyspark/pandas/groupby.py x: 2 commits (90d) y: 1800 lines of code python/pyspark/pandas/namespace.py x: 3 commits (90d) y: 1517 lines of code python/pyspark/pandas/series.py x: 1 commits (90d) y: 2215 lines of code python/pyspark/pandas/utils.py x: 1 commits (90d) y: 657 lines of code python/pyspark/testing/pandasutils.py x: 1 commits (90d) y: 486 lines of code python/pyspark/testing/utils.py x: 4 commits (90d) y: 560 lines of code python/pyspark/sql/pandas/serializers.py x: 9 commits (90d) y: 884 lines of code python/pyspark/worker.py x: 7 commits (90d) y: 1728 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 47 commits (90d) y: 5815 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala x: 4 commits (90d) y: 97 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraints.scala x: 6 commits (90d) y: 174 lines of code python/pyspark/sql/datasource.py x: 4 commits (90d) y: 256 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 25 commits (90d) y: 2889 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 4 commits (90d) y: 617 lines of code connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala x: 1 commits (90d) y: 105 lines of code connector/avro/src/main/scala/org/apache/spark/sql/avro/SchemaOfAvro.scala x: 2 commits (90d) y: 41 lines of code connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 1 commits (90d) y: 58 lines of code 
connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala x: 1 commits (90d) y: 230 lines of code project/SparkBuild.scala x: 10 commits (90d) y: 1460 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 2 commits (90d) y: 474 lines of code python/pyspark/sql/pandas/types.py x: 2 commits (90d) y: 920 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 3 commits (90d) y: 527 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/FunctionResolver.scala x: 2 commits (90d) y: 102 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 2 commits (90d) y: 406 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala x: 9 commits (90d) y: 514 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 6 commits (90d) y: 533 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala x: 5 commits (90d) y: 150 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala x: 5 commits (90d) y: 108 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala x: 5 commits (90d) y: 248 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 5 commits (90d) y: 122 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 5 commits (90d) y: 161 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 5 commits (90d) y: 378 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 6 commits (90d) y: 405 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/TargetEncoder.scala x: 5 commits (90d) y: 301 lines of code mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 6 commits (90d) y: 1062 
lines of code mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 6 commits (90d) y: 345 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 6 commits (90d) y: 478 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/IsotonicRegression.scala x: 5 commits (90d) y: 199 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSource.scala x: 3 commits (90d) y: 443 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala x: 1 commits (90d) y: 224 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/OperatorStateMetadata.scala x: 3 commits (90d) y: 348 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/ColumnDefinition.scala x: 3 commits (90d) y: 177 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 9 commits (90d) y: 920 lines of code python/pyspark/sql/connect/functions/builtin.py x: 5 commits (90d) y: 2417 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 4 commits (90d) y: 998 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala x: 3 commits (90d) y: 223 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 5 commits (90d) y: 1148 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 3 commits (90d) y: 89 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 3 commits (90d) y: 555 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 1 commits (90d) y: 333 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala x: 4 
commits (90d) y: 250 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala x: 4 commits (90d) y: 304 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 4 commits (90d) y: 278 lines of code mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala x: 4 commits (90d) y: 524 lines of code dev/sparktestsupport/modules.py x: 10 commits (90d) y: 1409 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 1 commits (90d) y: 441 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 5 commits (90d) y: 1683 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala x: 2 commits (90d) y: 81 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/UnionLoopExec.scala x: 4 commits (90d) y: 152 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 6 commits (90d) y: 1503 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 2 commits (90d) y: 305 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala x: 2 commits (90d) y: 558 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala x: 4 commits (90d) y: 528 lines of code python/pyspark/pandas/accessors.py x: 1 commits (90d) y: 434 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 1 commits (90d) y: 1547 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala x: 3 commits (90d) y: 399 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 13 commits (90d) y: 3468 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 4 commits (90d) y: 754 lines of code common/utils/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala x: 3 commits (90d) y: 272 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala x: 4 commits (90d) y: 367 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 7 commits (90d) y: 733 lines of code python/pyspark/ml/classification.py x: 7 commits (90d) y: 2173 lines of code python/pyspark/ml/connect/readwrite.py x: 5 commits (90d) y: 290 lines of code python/pyspark/ml/feature.py x: 3 commits (90d) y: 3621 lines of code python/pyspark/ml/regression.py x: 2 commits (90d) y: 1554 lines of code python/pyspark/ml/util.py x: 9 commits (90d) y: 714 lines of code python/pyspark/sql/connect/client/core.py x: 13 commits (90d) y: 1449 lines of code python/pyspark/sql/connect/group.py x: 5 commits (90d) y: 488 lines of code python/pyspark/sql/connect/plan.py x: 4 commits (90d) y: 2133 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 8 commits (90d) y: 693 lines of code project/plugins.sbt x: 4 commits (90d) y: 14 lines of code common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 1 commits (90d) y: 123 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 5 commits (90d) y: 2609 lines of code python/pyspark/util.py x: 5 commits (90d) y: 439 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala x: 1 commits (90d) y: 369 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 14 commits (90d) y: 1425 lines of code python/pyspark/ml/connect/functions.py x: 4 commits (90d) y: 47 lines of code core/src/main/scala/org/apache/spark/internal/config/package.scala x: 4 commits 
(90d) y: 2456 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 3 commits (90d) y: 886 lines of code python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 1 commits (90d) y: 1116 lines of code python/pyspark/sql/streaming/stateful_processor_api_client.py x: 4 commits (90d) y: 392 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala x: 2 commits (90d) y: 116 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 11 commits (90d) y: 458 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ResolverGuard.scala x: 3 commits (90d) y: 369 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 1 commits (90d) y: 314 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 1 commits (90d) y: 553 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 11 commits (90d) y: 2122 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 1 commits (90d) y: 144 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 3 commits (90d) y: 681 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StructField.scala x: 2 commits (90d) y: 148 lines of code sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala x: 1 commits (90d) y: 406 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AbstractSqlParser.scala x: 4 commits (90d) y: 85 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 17 commits (90d) y: 4715 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 6 commits (90d) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 1 commits 
(90d) y: 279 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 2 commits (90d) y: 991 lines of code mllib/src/main/scala/org/apache/spark/ml/Model.scala x: 2 commits (90d) y: 13 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 4 commits (90d) y: 687 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala x: 2 commits (90d) y: 339 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 5 commits (90d) y: 467 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 4 commits (90d) y: 793 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 4 commits (90d) y: 1059 lines of code python/pyspark/sql/connect/proto/ml_pb2.pyi x: 3 commits (90d) y: 465 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 2 commits (90d) y: 873 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 3 commits (90d) y: 3066 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 5 commits (90d) y: 432 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 3 commits (90d) y: 806 lines of code sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableInfo.java x: 3 commits (90d) y: 58 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 9 commits (90d) y: 961 lines of code python/pyspark/core/context.py x: 3 commits (90d) y: 784 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 2 commits (90d) y: 440 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala x: 3 
commits (90d) y: 285 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/BridgedRelationMetadataProvider.scala x: 2 commits (90d) y: 48 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/Resolver.scala x: 4 commits (90d) y: 421 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala x: 1 commits (90d) y: 516 lines of code python/pyspark/sql/connect/column.py x: 2 commits (90d) y: 482 lines of code python/pyspark/sql/connect/expressions.py x: 2 commits (90d) y: 1039 lines of code python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 2 commits (90d) y: 1764 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 2 commits (90d) y: 1057 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 1 commits (90d) y: 1025 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 11 commits (90d) y: 598 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/columnNodeSupport.scala x: 3 commits (90d) y: 248 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/columnNodeSupport.scala x: 1 commits (90d) y: 265 lines of code core/src/main/scala/org/apache/spark/internal/config/Python.scala x: 5 commits (90d) y: 86 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala x: 3 commits (90d) y: 613 lines of code core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 2 commits (90d) y: 575 lines of code python/pyspark/sql/worker/data_source_pushdown_filters.py x: 4 commits (90d) y: 186 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala x: 2 commits (90d) y: 813 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala x: 1 commits (90d) y: 783 lines of 
code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/python/UserDefinedPythonDataSource.scala x: 4 commits (90d) y: 437 lines of code python/pyspark/errors/exceptions/connect.py x: 8 commits (90d) y: 332 lines of code core/src/main/scala/org/apache/spark/executor/Executor.scala x: 4 commits (90d) y: 952 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala x: 3 commits (90d) y: 426 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala x: 4 commits (90d) y: 133 lines of code common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 4 commits (90d) y: 861 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 2 commits (90d) y: 701 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala x: 2 commits (90d) y: 801 lines of code python/pyspark/sql/dataframe.py x: 5 commits (90d) y: 851 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 7 commits (90d) y: 783 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 5 commits (90d) y: 3781 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 5 commits (90d) y: 743 lines of code core/src/main/scala/org/apache/spark/util/Utils.scala x: 4 commits (90d) y: 2154 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala x: 1 commits (90d) y: 462 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala x: 1 commits (90d) y: 704 lines of code dev/merge_spark_pr.py x: 2 commits (90d) y: 512 lines of code sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 1 commits 
(90d) y: 1454 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala x: 2 commits (90d) y: 374 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 1 commits (90d) y: 675 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 3 commits (90d) y: 1905 lines of code python/pyspark/sql/connect/session.py x: 5 commits (90d) y: 838 lines of code python/pyspark/ml/tuning.py x: 5 commits (90d) y: 1133 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 1 commits (90d) y: 1210 lines of code python/pyspark/ml/clustering.py x: 3 commits (90d) y: 1000 lines of code mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala x: 1 commits (90d) y: 578 lines of code mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 1 commits (90d) y: 603 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 1 commits (90d) y: 1490 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala x: 2 commits (90d) y: 549 lines of code sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 1 commits (90d) y: 778 lines of code sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 4 commits (90d) y: 1882 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 2 commits (90d) y: 1102 lines of code sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala x: 5 commits (90d) y: 544 lines of code python/pyspark/sql/connect/udf.py x: 4 commits (90d) y: 226 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 1 commits (90d) y: 1277 lines of code python/pyspark/pandas/base.py x: 2 commits (90d) y: 601 lines of 
code core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 1 commits (90d) y: 1419 lines of code sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala x: 5 commits (90d) y: 707 lines of code python/pyspark/sql/connect/dataframe.py x: 2 commits (90d) y: 1945 lines of code python/pyspark/sql/connect/proto/relations_pb2.pyi x: 1 commits (90d) y: 3616 lines of code sql/connect/common/src/main/protobuf/spark/connect/relations.proto x: 1 commits (90d) y: 984 lines of code core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala x: 1 commits (90d) y: 883 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala x: 2 commits (90d) y: 1126 lines of code python/pyspark/sql/session.py x: 5 commits (90d) y: 921 lines of code python/pyspark/sql/types.py x: 1 commits (90d) y: 1984 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala x: 2 commits (90d) y: 2968 lines of code core/src/main/scala/org/apache/spark/deploy/master/Master.scala x: 1 commits (90d) y: 1095 lines of code core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala x: 1 commits (90d) y: 1222 lines of code
lines of code (y axis, top tick 5815.0)
min: 2.0 | average: 350.85 | 25th percentile: 74.75 | median: 186.5 | 75th percentile: 408.25 | max: 5815.0
commits (90d) (x axis, ticks 0 to 47.0)
min: 1.0 | average: 2.22 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 47.0
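
The summary figures above (min, average, quartiles, max) can be recomputed from the plotted points. The sketch below is a minimal illustration, not the report's own implementation: `parse_points` and `summarize` are hypothetical helper names, the record pattern is inferred from the point listing in this report, and the "inclusive" (linear interpolation) quantile method is an assumption, since the report does not say how its percentiles are calculated.

```python
import re
import statistics

# One scatter-plot record as listed in this report:
# "<path> x: <n> commits (90d) y: <m> lines of code"
RECORD = re.compile(r"(\S+) x: (\d+) commits \(90d\) y: (\d+) lines of code")


def parse_points(text):
    """Return (path, commits, lines_of_code) tuples found in text."""
    # Collapse whitespace first, because records can wrap across lines.
    flat = " ".join(text.split())
    return [(path, int(commits), int(loc))
            for path, commits, loc in RECORD.findall(flat)]


def summarize(values):
    """Min, average, quartiles and max of a list of numbers."""
    # Assumption: quartiles use linear interpolation ("inclusive" method).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "average": statistics.mean(values),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": max(values),
    }


# Three records copied from the listing; the last one wraps across a line break.
sample = (
    "project/plugins.sbt x: 4 commits (90d) y: 14 lines of code "
    "mllib/src/main/scala/org/apache/spark/ml/Model.scala "
    "x: 2 commits (90d) y: 13 lines of code "
    "mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala "
    "x: 6 commits (90d) y: 1062 \nlines of code"
)

points = parse_points(sample)
loc_summary = summarize([loc for _, _, loc in points])
commit_summary = summarize([commits for _, commits, _ in points])
print(loc_summary)
print(commit_summary)
```

Run over the full point listing instead of the three-record sample, the same two calls would produce one summary per axis. The results match the report's figures only if the report uses the same quantile method.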

File Size vs. Contributors (90 days): 798 points

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala x: 1 contributors (90d) y: 162 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala x: 5 contributors (90d) y: 261 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveIdentifierClause.scala x: 3 contributors (90d) y: 100 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala x: 2 contributors (90d) y: 641 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala x: 3 contributors (90d) y: 827 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala x: 2 contributors (90d) y: 424 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala x: 1 contributors (90d) y: 847 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatternBits.scala x: 1 contributors (90d) y: 38 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala x: 3 contributors (90d) y: 150 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala x: 2 contributors (90d) y: 135 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala x: 3 contributors (90d) y: 209 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala x: 2 contributors (90d) y: 270 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala x: 3 contributors (90d) y: 312 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala x: 3 contributors (90d) y: 929 lines of code mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala x: 3 contributors (90d) y: 255 lines of code 
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala x: 2 contributors (90d) y: 346 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala x: 2 contributors (90d) y: 201 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala x: 3 contributors (90d) y: 455 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala x: 3 contributors (90d) y: 518 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala x: 2 contributors (90d) y: 237 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala x: 3 contributors (90d) y: 967 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala x: 3 contributors (90d) y: 594 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala x: 2 contributors (90d) y: 219 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/impl/GradientBoostedTrees.scala x: 1 contributors (90d) y: 351 lines of code mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala x: 2 contributors (90d) y: 331 lines of code mllib/src/main/scala/org/apache/spark/ml/util/HasTrainingSummary.scala x: 1 contributors (90d) y: 21 lines of code python/pyspark/testing/connectutils.py x: 5 contributors (90d) y: 177 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala x: 4 contributors (90d) y: 325 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLCache.scala x: 3 contributors (90d) y: 187 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLException.scala x: 2 contributors (90d) y: 28 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/LiteralFunctionResolution.scala x: 3 contributors (90d) y: 23 lines of code 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala x: 2 contributors (90d) y: 179 lines of code sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectAnalyzeHandler.scala x: 1 contributors (90d) y: 186 lines of code core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala x: 1 contributors (90d) y: 79 lines of code python/pyspark/errors/exceptions/captured.py x: 2 contributors (90d) y: 284 lines of code python/pyspark/pandas/groupby.py x: 2 contributors (90d) y: 1800 lines of code python/pyspark/pandas/namespace.py x: 2 contributors (90d) y: 1517 lines of code python/pyspark/pandas/series.py x: 1 contributors (90d) y: 2215 lines of code python/pyspark/pandas/utils.py x: 1 contributors (90d) y: 657 lines of code python/pyspark/testing/pandasutils.py x: 1 contributors (90d) y: 486 lines of code python/pyspark/testing/utils.py x: 3 contributors (90d) y: 560 lines of code python/pyspark/sql/pandas/serializers.py x: 5 contributors (90d) y: 884 lines of code python/pyspark/worker.py x: 4 contributors (90d) y: 1728 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala x: 23 contributors (90d) y: 5815 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala x: 2 contributors (90d) y: 97 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala x: 13 contributors (90d) y: 2889 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala x: 2 contributors (90d) y: 617 lines of code connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala x: 1 contributors (90d) y: 105 lines of code connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaConfigUpdater.scala x: 1 contributors (90d) y: 58 lines of code connector/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala x: 1 
contributors (90d) y: 230 lines of code project/SparkBuild.scala x: 5 contributors (90d) y: 1460 lines of code sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala x: 2 contributors (90d) y: 474 lines of code python/pyspark/sql/pandas/types.py x: 2 contributors (90d) y: 920 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ExpressionResolver.scala x: 1 contributors (90d) y: 527 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/NameScope.scala x: 1 contributors (90d) y: 406 lines of code mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala x: 3 contributors (90d) y: 533 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala x: 3 contributors (90d) y: 248 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala x: 3 contributors (90d) y: 122 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala x: 3 contributors (90d) y: 161 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala x: 3 contributors (90d) y: 378 lines of code mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala x: 3 contributors (90d) y: 405 lines of code mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala x: 3 contributors (90d) y: 1062 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala x: 3 contributors (90d) y: 345 lines of code mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala x: 3 contributors (90d) y: 478 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala x: 1 contributors (90d) y: 224 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala x: 7 contributors (90d) y: 920 lines of code python/pyspark/sql/connect/functions/builtin.py x: 4 contributors (90d) y: 2417 lines of code 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala x: 3 contributors (90d) y: 998 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala x: 4 contributors (90d) y: 1148 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ReplaceTableExec.scala x: 2 contributors (90d) y: 89 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala x: 2 contributors (90d) y: 555 lines of code sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala x: 1 contributors (90d) y: 333 lines of code mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala x: 3 contributors (90d) y: 278 lines of code dev/sparktestsupport/modules.py x: 6 contributors (90d) y: 1409 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala x: 1 contributors (90d) y: 441 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala x: 2 contributors (90d) y: 1683 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/UnionLoopExec.scala x: 2 contributors (90d) y: 152 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver/ViewResolver.scala x: 3 contributors (90d) y: 91 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala x: 6 contributors (90d) y: 1503 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala x: 2 contributors (90d) y: 305 lines of code python/pyspark/pandas/accessors.py x: 1 contributors (90d) y: 434 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala x: 1 contributors (90d) y: 1547 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala x: 2 contributors (90d) y: 399 lines of code 
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala x: 9 contributors (90d) y: 3468 lines of code sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala x: 5 contributors (90d) y: 1498 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming/TransformWithStateInPySparkStateServer.scala x: 2 contributors (90d) y: 754 lines of code common/utils/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala x: 1 contributors (90d) y: 272 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala x: 4 contributors (90d) y: 733 lines of code python/pyspark/ml/classification.py x: 4 contributors (90d) y: 2173 lines of code python/pyspark/ml/feature.py x: 2 contributors (90d) y: 3621 lines of code python/pyspark/ml/regression.py x: 2 contributors (90d) y: 1554 lines of code python/pyspark/ml/util.py x: 2 contributors (90d) y: 714 lines of code python/pyspark/sql/connect/client/core.py x: 7 contributors (90d) y: 1449 lines of code python/pyspark/sql/connect/group.py x: 3 contributors (90d) y: 488 lines of code python/pyspark/sql/connect/plan.py x: 3 contributors (90d) y: 2133 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala x: 4 contributors (90d) y: 693 lines of code project/plugins.sbt x: 2 contributors (90d) y: 14 lines of code common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala x: 1 contributors (90d) y: 123 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala x: 5 contributors (90d) y: 2609 lines of code python/pyspark/util.py x: 4 contributors (90d) y: 439 lines of code sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala x: 1 contributors (90d) y: 369 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala x: 7 contributors (90d) y: 1425 lines of code 
core/src/main/scala/org/apache/spark/internal/config/package.scala x: 4 contributors (90d) y: 2456 lines of code core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala x: 3 contributors (90d) y: 886 lines of code python/pyspark/sql/streaming/proto/StateMessage_pb2.pyi x: 1 contributors (90d) y: 1116 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala x: 6 contributors (90d) y: 458 lines of code python/pyspark/sql/functions/__init__.py x: 1 contributors (90d) y: 466 lines of code sql/core/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala x: 1 contributors (90d) y: 314 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala x: 1 contributors (90d) y: 553 lines of code sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 x: 8 contributors (90d) y: 2122 lines of code sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkParserUtils.scala x: 1 contributors (90d) y: 144 lines of code sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala x: 2 contributors (90d) y: 681 lines of code sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala x: 12 contributors (90d) y: 4715 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala x: 4 contributors (90d) y: 1040 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/command/CreateSQLFunctionCommand.scala x: 1 contributors (90d) y: 279 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala x: 2 contributors (90d) y: 991 lines of code core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala x: 1 contributors (90d) y: 687 lines of code sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala x: 4 contributors (90d) y: 467 lines of code 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala x: 1 contributors (90d) y: 793 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala x: 1 contributors (90d) y: 1059 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala x: 1 contributors (90d) y: 873 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala x: 3 contributors (90d) y: 3066 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala x: 4 contributors (90d) y: 432 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala x: 3 contributors (90d) y: 806 lines of code
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableInfo.java x: 2 contributors (90d) y: 58 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala x: 6 contributors (90d) y: 961 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala x: 2 contributors (90d) y: 440 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala x: 1 contributors (90d) y: 516 lines of code
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala x: 2 contributors (90d) y: 818 lines of code
python/pyspark/sql/connect/expressions.py x: 1 contributors (90d) y: 1039 lines of code
python/pyspark/sql/connect/proto/expressions_pb2.pyi x: 1 contributors (90d) y: 1764 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala x: 2 contributors (90d) y: 1057 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/Dataset.scala x: 1 contributors (90d) y: 1025 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala x: 5 contributors (90d) y: 598 lines of code
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala x: 1 contributors (90d) y: 575 lines of code
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java x: 3 contributors (90d) y: 424 lines of code
core/src/main/scala/org/apache/spark/executor/Executor.scala x: 4 contributors (90d) y: 952 lines of code
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala x: 4 contributors (90d) y: 861 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala x: 2 contributors (90d) y: 701 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala x: 5 contributors (90d) y: 783 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala x: 5 contributors (90d) y: 3781 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowSerializer.scala x: 2 contributors (90d) y: 499 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala x: 3 contributors (90d) y: 743 lines of code
core/src/main/scala/org/apache/spark/util/Utils.scala x: 3 contributors (90d) y: 2154 lines of code
dev/merge_spark_pr.py x: 2 contributors (90d) y: 512 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala x: 2 contributors (90d) y: 592 lines of code
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java x: 1 contributors (90d) y: 1454 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala x: 1 contributors (90d) y: 675 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala x: 4 contributors (90d) y: 202 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala x: 2 contributors (90d) y: 1905 lines of code
python/pyspark/ml/tuning.py x: 4 contributors (90d) y: 1133 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala x: 1 contributors (90d) y: 1210 lines of code
python/pyspark/ml/clustering.py x: 1 contributors (90d) y: 1000 lines of code
mllib/src/main/scala/org/apache/spark/ml/param/params.scala x: 1 contributors (90d) y: 603 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala x: 1 contributors (90d) y: 1490 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala x: 2 contributors (90d) y: 549 lines of code
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java x: 1 contributors (90d) y: 778 lines of code
sql/api/src/main/scala/org/apache/spark/sql/functions.scala x: 4 contributors (90d) y: 1882 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala x: 2 contributors (90d) y: 1102 lines of code
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala x: 5 contributors (90d) y: 544 lines of code
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala x: 1 contributors (90d) y: 1277 lines of code
core/src/main/scala/org/apache/spark/util/JsonProtocol.scala x: 1 contributors (90d) y: 1419 lines of code
python/pyspark/sql/connect/dataframe.py x: 2 contributors (90d) y: 1945 lines of code
sql/core/src/main/scala/org/apache/spark/sql/classic/SparkSession.scala x: 3 contributors (90d) y: 665 lines of code
python/pyspark/sql/connect/proto/relations_pb2.pyi x: 1 contributors (90d) y: 3616 lines of code
sql/connect/common/src/main/protobuf/spark/connect/relations.proto x: 1 contributors (90d) y: 984 lines of code
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala x: 2 contributors (90d) y: 1126 lines of code
python/pyspark/sql/types.py x: 1 contributors (90d) y: 1984 lines of code
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala x: 2 contributors (90d) y: 2968 lines of code
core/src/main/scala/org/apache/spark/deploy/master/Master.scala x: 1 contributors (90d) y: 1095 lines of code
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala x: 1 contributors (90d) y: 1222 lines of code
lines of code
min: 2.0 | average: 350.85 | 25th percentile: 74.75 | median: 186.5 | 75th percentile: 408.25 | max: 5815.0
contributors (90d)
min: 1.0 | average: 1.64 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 23.0
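
The summary rows above can be recomputed from the per-file data points. The following is a minimal sketch, not the report's own implementation: the names `POINT`, `parse_points`, and `summarize` are hypothetical, and the use of linearly interpolated quartiles is an assumption (suggested by fractional values such as 74.75, but not confirmed by the report). The three sample points are copied from the list above.

```python
import re
import statistics

# One data point looks like:
#   "<path> x: <n> contributors (90d) y: <m> lines of code"
# Whitespace is matched loosely so wrapped or run-together entries still parse.
POINT = re.compile(
    r"(\S+)\s+x:\s+(\d+)\s+contributors\s+\(90d\)\s+y:\s+(\d+)\s+lines\s+of\s+code"
)

def parse_points(text):
    """Return a list of (path, contributors, lines_of_code) tuples found in text."""
    return [(path, int(c), int(loc)) for path, c, loc in POINT.findall(text)]

def summarize(values):
    """Return min, average, quartiles, and max for a list of numbers.

    Quartiles use linear interpolation between data points (an assumption
    about how the report computes its percentiles).
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "average": round(statistics.mean(values), 2),
        "p25": q1,
        "median": median,
        "p75": q3,
        "max": max(values),
    }

sample = (
    "dev/merge_spark_pr.py x: 2 contributors (90d) y: 512 lines of code "
    "python/pyspark/ml/tuning.py x: 4 contributors (90d) y: 1133 lines of code "
    "core/src/main/scala/org/apache/spark/util/Utils.scala "
    "x: 3 contributors (90d) y: 2154 lines of code"
)

points = parse_points(sample)
loc_summary = summarize([loc for _, _, loc in points])
contributor_summary = summarize([c for _, c, _ in points])
```

Run over the full set of data points rather than this three-item sample, `summarize` should produce rows comparable to the ones above, provided the interpolation assumption holds.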