apache / spark
File Age & Freshness

File age is the number of days since a file's first commit; file freshness is the number of days since its latest commit. This report shows the distribution of both measurements.
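The two measurements can be sketched in a few lines of Python. This is a minimal illustration, not the report generator's actual code: the function names are hypothetical, the bucket boundaries are taken from the legends in this report, the analysis date of 2025-05-07 is an assumption (it is the latest "last modified" date in the listings), and treating a file changed on the analysis day as belonging to the 1-30 bucket is also an assumption.

```python
from datetime import date

# Finite bucket boundaries, taken from the legends in this report:
# (label, upper bound in days, inclusive). Anything above 365 is "366+".
FINITE_BUCKETS = [
    ("1-30", 30),
    ("31-90", 90),
    ("91-180", 180),
    ("181-365", 365),
]


def bucket_for(days: int) -> str:
    """Return the report bucket label for a number of days."""
    # Assumption: 0 days (changed on the analysis day) counts as "1-30".
    days = max(days, 1)
    for label, upper in FINITE_BUCKETS:
        if days <= upper:
            return label
    return "366+"


def age_and_freshness(first_commit: date, latest_commit: date,
                      analysis_date: date) -> tuple[str, str]:
    """Return (age bucket, freshness bucket) for one file."""
    age_days = (analysis_date - first_commit).days
    freshness_days = (analysis_date - latest_commit).days
    return bucket_for(age_days), bucket_for(freshness_days)


# Assumed analysis date (hypothetical; see the note above).
ANALYSIS_DATE = date(2025, 5, 7)

# Aggregator.scala: created 2013-08-14, last modified 2016-01-27.
print(age_and_freshness(date(2013, 8, 14), date(2016, 1, 27), ANALYSIS_DATE))

# ApplyDefaultCollationToStringType.scala: created 2025-04-22,
# last modified 2025-05-07.
print(age_and_freshness(date(2025, 4, 22), date(2025, 5, 7), ANALYSIS_DATE))
```

An old file that is still actively edited lands in the 366+ age bucket but in the 1-30 freshness bucket, which is why the two distributions below differ so much.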

Summary
File Change History Overall
File Age Distribution Overall
Days since first update
  • There are 4,066 files with a total of 637,106 lines of code.
    • 3,530 files that are 366+ days old (566,501 lines of code)
    • 249 files that are 181-365 days old (36,712 lines of code)
    • 201 files that are 91-180 days old (25,400 lines of code)
    • 49 files that are 31-90 days old (4,782 lines of code)
    • 37 files that are 1-30 days old (3,711 lines of code)
366+ | 181-365 | 91-180 | 31-90 | 1-30 (days)
88% | 5% | 3% | <1% | <1%
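The percentages in the bar appear to be shares of lines of code rather than shares of files, and they appear to be truncated rather than rounded. That reading is inferred from the numbers in the list above, not stated by the report. The following sketch recomputes both shares from the listed counts so they can be compared; the helper name `shares` is hypothetical.

```python
# (bucket label, file count, lines of code), copied from the list above.
AGE_BUCKETS = [
    ("366+", 3530, 566501),
    ("181-365", 249, 36712),
    ("91-180", 201, 25400),
    ("31-90", 49, 4782),
    ("1-30", 37, 3711),
]


def shares(buckets, column):
    """Return {label: percentage of the column total} for each bucket."""
    total = sum(row[column] for row in buckets)
    return {row[0]: 100.0 * row[column] / total for row in buckets}


file_share = shares(AGE_BUCKETS, 1)  # share of the 4,066 files
loc_share = shares(AGE_BUCKETS, 2)   # share of the 637,106 lines of code

for label, _, _ in AGE_BUCKETS:
    print(f"{label:>8}: {loc_share[label]:5.1f}% of LOC, "
          f"{file_share[label]:5.1f}% of files")
```

For the 366+ bucket this gives roughly 88.9% of the lines of code but 86.8% of the files, which matches the reported 88% only under the lines-of-code reading.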

File Freshness Distribution Overall
Days since last update
  • There are 4,066 files with a total of 637,106 lines of code.
    • 2,058 files have been last changed 366+ days ago (143,630 lines of code)
    • 704 files have been last changed 181-365 days ago (107,088 lines of code)
    • 524 files have been last changed 91-180 days ago (108,528 lines of code)
    • 349 files have been last changed 31-90 days ago (99,780 lines of code)
    • 431 files have been last changed 1-30 days ago (178,080 lines of code)
366+ | 181-365 | 91-180 | 31-90 | 1-30 (days)
22% | 16% | 17% | 15% | 27%

File Change History per File Extension
scala, q, py, java, txt, json, sql, md, xml, r, yaml, rst, sh, js, properties, proto, css, pyi, cmd, html, ipynb, xsd, orc, gitignore, avsc, rb, bat, cfg, g4, ini, sbt, in, thrift, c, toml, rmd, ps1, gitattributes, bash
File Age Distribution per Extension
Days since first update
Extension | 366+ | 181-365 | 91-180 | 31-90 | 1-30
scala | 87% | 5% | 5% | <1% | <1%
py | 95% | 2% | <1% | 1% | 0%
java | 93% | 4% | <1% | 1% | 0%
pyi | 88% | 7% | 3% | 0% | 0%
js | 97% | 0% | <1% | 1% | 0%
g4 | 100% | 0% | 0% | 0% | 0%
css | 100% | 0% | 0% | 0% | 0%
proto | 16% | 77% | 3% | 2% | 0%
html | 100% | 0% | 0% | 0% | 0%
xml | 96% | 0% | 0% | 3% | 0%
bash | 100% | 0% | 0% | 0% | 0%
ps1 | 100% | 0% | 0% | 0% | 0%
toml | 100% | 0% | 0% | 0% | 0%
in | 100% | 0% | 0% | 0% | 0%
c | 100% | 0% | 0% | 0% | 0%
sbt | 100% | 0% | 0% | 0% | 0%
cfg | 66% | 0% | 0% | 33% | 0%
yaml | 15% | 65% | 19% | 0% | 0%
File Freshness Distribution per Extension
Days since last update
Extension | 366+ | 181-365 | 91-180 | 31-90 | 1-30
scala | 22% | 15% | 17% | 15% | 28%
java | 40% | 30% | 13% | 13% | 2%
py | 10% | 17% | 15% | 15% | 41%
js | 63% | 26% | 0% | 10% | 0%
pyi | 7% | 0% | 43% | 24% | 24%
proto | 16% | 5% | 36% | 25% | 16%
html | 100% | 0% | 0% | 0% | 0%
css | 50% | 0% | 0% | 49% | 0%
bash | 100% | 0% | 0% | 0% | 0%
ps1 | 100% | 0% | 0% | 0% | 0%
in | 100% | 0% | 0% | 0% | 0%
c | 100% | 0% | 0% | 0% | 0%
cfg | 66% | 0% | 0% | 33% | 0%
yaml | 15% | 65% | 19% | 0% | 0%
toml | 0% | 0% | 100% | 0% | 0%
xml | 0% | 0% | 0% | 100% | 0%
g4 | 0% | 0% | 0% | 0% | 100%
sbt | 0% | 0% | 0% | 0% | 100%
File Change History per Logical Decomposition
primary
primary (file age distribution)
Days since first update
Component | 366+ | 181-365 | 91-180 | 31-90 | 1-30
sql | 80% | 9% | 7% | 1% | 1%
python | 94% | 3% | 1% | <1% | 0%
core | 99% | <1% | <1% | <1% | 0%
mllib | 99% | <1% | <1% | 0% | 0%
common | 90% | 6% | 3% | <1% | 0%
resource-managers | 96% | 3% | 0% | 0% | 0%
streaming | 100% | 0% | 0% | 0% | 0%
connector | 96% | 0% | 3% | 0% | 0%
dev | 96% | 1% | 2% | 0% | 0%
graphx | 100% | 0% | 0% | 0% | 0%
launcher | 100% | 0% | 0% | 0% | 0%
mllib-local | 100% | 0% | 0% | 0% | 0%
project | 100% | 0% | 0% | 0% | 0%
licenses-binary | 100% | 0% | 0% | 0% | 0%
ROOT | 100% | 0% | 0% | 0% | 0%
hadoop-cloud | 100% | 0% | 0% | 0% | 0%
repl | 100% | 0% | 0% | 0% | 0%
build | 100% | 0% | 0% | 0% | 0%
tools | 100% | 0% | 0% | 0% | 0%
R | 100% | 0% | 0% | 0% | 0%
connect-examples | 0% | 0% | 0% | 100% | 0%
primary (file freshness distribution)
Days since last update
Component | 366+ | 181-365 | 91-180 | 31-90 | 1-30
sql | 17% | 10% | 19% | 19% | 33%
core | 35% | 27% | 13% | 13% | 9%
mllib | 33% | 17% | 4% | 9% | 34%
python | 9% | 15% | 19% | 16% | 39%
common | 32% | 36% | 18% | 7% | 6%
streaming | 55% | 31% | 12% | 0% | 0%
resource-managers | 36% | 46% | 14% | 0% | 1%
connector | 39% | 21% | 27% | 4% | 6%
graphx | 98% | 0% | 1% | 0% | 0%
dev | 31% | 7% | 2% | 18% | 40%
launcher | 38% | 12% | 0% | 31% | 16%
mllib-local | 35% | 0% | 37% | 26% | 0%
licenses-binary | 100% | 0% | 0% | 0% | 0%
build | 100% | 0% | 0% | 0% | 0%
hadoop-cloud | 69% | 30% | 0% | 0% | 0%
tools | 100% | 0% | 0% | 0% | 0%
project | 3% | 0% | 0% | 11% | 84%
R | 100% | 0% | 0% | 0% | 0%
repl | 8% | 50% | 41% | 0% | 0%
ROOT | 0% | 0% | 0% | 100% | 0%
connect-examples | 0% | 0% | 0% | 100% | 0%
Oldest Files (Top 50)
File (name, location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
1460 12 2011-07-15 2025-05-06 818 222 ismael@juma.me.uk wenchen@databricks.com
plugins.sbt
in project
14 - 2011-09-26 2025-04-27 154 66 ismael@juma.me.uk yangjie01@baidu.com
worker.py
in python/pyspark
1728 32 2013-01-01 2025-05-06 208 77 joshrosen@eecs.berkeley.edu gurwls223@apache.org
serializers.py
in python/pyspark
373 68 2013-01-01 2024-04-04 139 58 joshrosen@eecs.berkeley.edu gurwls223@apache.org
java_gateway.py
in python/pyspark
105 2 2013-01-01 2025-01-24 88 52 joshrosen@eecs.berkeley.edu herman@databricks.com
shell.py
in python/pyspark
88 - 2013-01-01 2025-02-17 94 59 joshrosen@eecs.berkeley.edu ueshin@databricks.com
__init__.py
in python/pyspark
72 2 2013-01-01 2024-04-24 82 54 joshrosen@eecs.berkeley.edu gurwls223@apache.org
join.py
in python/pyspark
66 6 2013-01-01 2023-10-10 25 19 joshrosen@eecs.berkeley.edu gurwls223@apache.org
accumulators.py
in python/pyspark
173 20 2013-01-20 2025-04-15 68 41 joshrosen@eecs.berkeley.edu gurwls223@apache.org
daemon.py
in python/pyspark
171 3 2013-05-06 2025-04-15 44 28 jey@cs.berkeley.edu gurwls223@apache.org
DAGScheduler.scala
in core/src/main/scala/org/apache/spark/scheduler
2113 101 2013-05-12 2025-01-07 354 155 kayo@yahoo-inc.com m.zhang@databricks.com
SparkContext.scala
in core/src/main/scala/org/apache/spark
1923 124 2013-05-12 2025-02-24 534 237 kayo@yahoo-inc.com sririshindra@gmail.com
BlockManager.scala
in core/src/main/scala/org/apache/spark/storage
1544 83 2013-05-12 2024-11-15 283 130 kayo@yahoo-inc.com xumovens@gmail.com
Executor.scala
in core/src/main/scala/org/apache/spark/executor
952 28 2013-05-12 2025-04-13 278 138 kayo@yahoo-inc.com roreeves@linkedin.com
TaskEndReason.scala
in core/src/main/scala/org/apache/spark
138 8 2013-05-12 2022-10-05 63 38 kayo@yahoo-inc.com yangjie01@baidu.com
TaskResult.scala
in core/src/main/scala/org/apache/spark/scheduler
70 4 2013-05-12 2022-11-16 45 34 kayo@yahoo-inc.com ziqi.liu@databricks.com
CoGroupedRDD.scala
in core/src/main/scala/org/apache/spark/rdd
122 6 2013-08-14 2022-06-14 69 40 matei@eecs.berkeley.edu yangjie01@baidu.com
Aggregator.scala
in core/src/main/scala/org/apache/spark
32 3 2013-08-14 2016-01-27 43 24 matei@eecs.berkeley.edu andrew@databricks.com
StagePage.scala
in core/src/main/scala/org/apache/spark/ui/jobs
471 10 2013-08-15 2024-02-23 180 88 pwendell@gmail.com hiufkwok@gmail.com
DiskStore.scala
in core/src/main/scala/org/apache/spark/storage
269 18 2013-08-15 2024-09-04 76 51 pwendell@gmail.com sychen@ctrip.com
TaskMetrics.scala
in core/src/main/scala/org/apache/spark/executor
207 13 2013-08-15 2024-08-21 90 46 pwendell@gmail.com ziqi.liu@databricks.com
ShuffleMapTask.scala
in core/src/main/scala/org/apache/spark/scheduler
62 2 2013-08-15 2023-07-31 89 53 pwendell@gmail.com wenchen@databricks.com
statcounter.py
in python/pyspark
97 15 2013-08-20 2021-12-16 20 17 schumach@icsi.berkeley.edu mszymkiewicz@gmail.com
rddsampler.py
in python/pyspark
71 11 2013-08-23 2022-01-10 20 16 schumach@icsi.berkeley.edu gurwls223@apache.org
Utils.scala
in core/src/main/scala/org/apache/spark/util
2154 163 2013-09-01 2025-04-04 456 205 matei@eecs.berkeley.edu vinod.kc.in@gmail.com
MapOutputTracker.scala
in core/src/main/scala/org/apache/spark
1131 76 2013-09-01 2024-05-24 146 85 matei@eecs.berkeley.edu yi.wu@databricks.com
Master.scala
in core/src/main/scala/org/apache/spark/deploy/master
1095 40 2013-09-01 2025-02-12 213 98 matei@eecs.berkeley.edu dongjoon@apache.org
RDD.scala
in core/src/main/scala/org/apache/spark/rdd
1080 101 2013-09-01 2024-12-04 250 137 matei@eecs.berkeley.edu 1754789345@qq.com
Worker.scala
in core/src/main/scala/org/apache/spark/deploy/worker
801 30 2013-09-01 2024-08-21 178 90 matei@eecs.berkeley.edu dongjoon@apache.org
PythonRDD.scala
in core/src/main/scala/org/apache/spark/api/python
687 51 2013-09-01 2025-04-22 173 78 matei@eecs.berkeley.edu gurwls223@apache.org
UIUtils.scala
in core/src/main/scala/org/apache/spark/ui
589 27 2013-09-01 2025-03-19 118 78 matei@eecs.berkeley.edu yao@apache.org
KryoSerializer.scala
in core/src/main/scala/org/apache/spark/serializer
575 30 2013-09-01 2025-04-15 126 78 matei@eecs.berkeley.edu yao@apache.org
PairRDDFunctions.scala
in core/src/main/scala/org/apache/spark/rdd
541 67 2013-09-01 2024-05-04 149 79 matei@eecs.berkeley.edu daniel.tenedorio@databricks...
StreamingContext.scala
in streaming/src/main/scala/org/apache/spark/streaming
469 39 2013-09-01 2024-06-19 135 70 matei@eecs.berkeley.edu amanda.liu@databricks.com
JettyUtils.scala
in core/src/main/scala/org/apache/spark/ui
464 20 2013-09-01 2024-08-15 83 55 matei@eecs.berkeley.edu dhyun@apple.com
JavaPairRDD.scala
in core/src/main/scala/org/apache/spark/api/java
421 47 2013-09-01 2023-12-03 88 51 matei@eecs.berkeley.edu yangjie01@baidu.com
SparkEnv.scala
in core/src/main/scala/org/apache/spark
414 15 2013-09-01 2025-02-07 186 96 matei@eecs.berkeley.edu ueshin@databricks.com
webui.css
in core/src/main/resources/org/apache/spark/ui/static
403 - 2013-09-01 2025-03-19 56 39 matei@eecs.berkeley.edu yao@apache.org
PythonWorkerFactory.scala
in core/src/main/scala/org/apache/spark/api/python
399 16 2013-09-01 2025-04-15 64 45 matei@eecs.berkeley.edu gurwls223@apache.org
MLUtils.scala
in mllib/src/main/scala/org/apache/spark/mllib/util
397 17 2013-09-01 2025-03-20 63 45 matei@eecs.berkeley.edu ruifengz@apache.org
JavaPairDStream.scala
in streaming/src/main/scala/org/apache/spark/streaming/api/java
380 48 2013-09-01 2023-09-27 67 37 matei@eecs.berkeley.edu fanjiaeminem@qq.com
HadoopRDD.scala
in core/src/main/scala/org/apache/spark/rdd
363 14 2013-09-01 2024-12-11 125 75 matei@eecs.berkeley.edu chengpan@apache.org
sorttable.js
in core/src/main/resources/org/apache/spark/ui/static
352 23 2013-09-01 2020-11-03 12 13 matei@eecs.berkeley.edu echohlne@gmail.com
SparkHadoopUtil.scala
in core/src/main/scala/org/apache/spark/deploy
328 36 2013-09-01 2024-05-25 129 68 matei@eecs.berkeley.edu gengliang@apache.org
KMeans.scala
in mllib/src/main/scala/org/apache/spark/mllib/clustering
326 18 2013-09-01 2024-04-26 83 47 matei@eecs.berkeley.edu panbingkun@baidu.com
SparkListener.scala
in core/src/main/scala/org/apache/spark/scheduler
313 37 2013-09-01 2024-09-10 100 58 matei@eecs.berkeley.edu herman@databricks.com
NewHadoopRDD.scala
in core/src/main/scala/org/apache/spark/rdd
298 7 2013-09-01 2025-01-16 94 63 matei@eecs.berkeley.edu yi.wu@databricks.com
Checkpoint.scala
in streaming/src/main/scala/org/apache/spark/streaming
298 13 2013-09-01 2024-06-19 101 66 matei@eecs.berkeley.edu amanda.liu@databricks.com
StageTable.scala
in core/src/main/scala/org/apache/spark/ui/jobs
294 7 2013-09-01 2024-02-23 106 63 matei@eecs.berkeley.edu hiufkwok@gmail.com
JavaRDDLike.scala
in core/src/main/scala/org/apache/spark/api/java
286 46 2013-09-01 2023-11-29 97 54 matei@eecs.berkeley.edu yangjie01@baidu.com
Files Not Recently Changed (Top 50)
File (name, location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
ApproximateEvaluator.scala
in core/src/main/scala/org/apache/spark/partial
5 - 2013-09-01 2013-09-26 5 6 matei@eecs.berkeley.edu rxin@apache.org
IntParam.scala
in core/src/main/scala/org/apache/spark/util
10 1 2013-09-01 2013-09-26 5 6 matei@eecs.berkeley.edu rxin@apache.org
MemoryParam.scala
in core/src/main/scala/org/apache/spark/util
10 1 2013-09-01 2013-09-26 5 6 matei@eecs.berkeley.edu rxin@apache.org
BlockException.scala
in core/src/main/scala/org/apache/spark/storage
3 - 2013-09-01 2013-12-12 12 12 matei@eecs.berkeley.edu pwendell@gmail.com
package.scala
in core/src/main/scala/org/apache/spark/api/java
3 - 2014-01-14 2014-01-17 3 3 rxin@apache.org rizlar@gmail.com
package.scala
in graphx/src/main/scala/org/apache/spark/graphx/impl
5 - 2014-01-10 2014-01-17 4 6 ankurdave@gmail.com rizlar@gmail.com
package.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning
2 - 2014-03-21 2014-03-26 2 2 michael@databricks.com lian.cs.zju@gmail.com
package.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules
2 - 2014-03-21 2014-03-26 2 2 michael@databricks.com lian.cs.zju@gmail.com
Source.scala
in core/src/main/scala/org/apache/spark/metrics/source
6 - 2013-09-01 2014-04-09 6 7 matei@eecs.berkeley.edu pwendell@gmail.com
SchedulingMode.scala
in core/src/main/scala/org/apache/spark/scheduler
5 - 2013-09-25 2014-04-16 18 13 kayousterhout@gmail.com sandeep@techaddict.me
JavaReceiverInputDStream.scala
in streaming/src/main/scala/org/apache/spark/streaming/api/java
13 1 2014-04-22 2014-04-24 2 2 tathagata.das1565@gmail.com sowen@cloudera.com
JavaInputDStream.scala
in streaming/src/main/scala/org/apache/spark/streaming/api/java
13 1 2014-04-22 2014-04-24 2 2 tathagata.das1565@gmail.com sowen@cloudera.com
JavaPairReceiverInputDStream.scala
in streaming/src/main/scala/org/apache/spark/streaming/api/java
14 1 2014-04-22 2014-04-24 2 2 tathagata.das1565@gmail.com sowen@cloudera.com
package-info.java
in graphx/src/main/scala/org/apache/spark/graphx/lib
1 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in streaming/src/main/scala/org/apache/spark/streaming/dstream
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in streaming/src/main/scala/org/apache/spark/streaming/api/java
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/executor
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/util
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/util/random
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/serializer
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/io
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/metrics/source
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in graphx/src/main/scala/org/apache/spark/graphx/lib
2 - 2014-01-11 2014-05-15 2 2 ankurdave@gmail.com prashant.s@imaginea.com
package.scala
in graphx/src/main/scala/org/apache/spark/graphx/util
2 - 2014-05-15 2014-05-15 1 1 prashant.s@imaginea.com prashant.s@imaginea.com
package.scala
in core/src/main/scala/org/apache/spark/broadcast
3 - 2014-01-14 2014-05-15 4 4 rxin@apache.org prashant.s@imaginea.com
PythonPartitioner.scala
in core/src/main/scala/org/apache/spark/api/python
20 2 2013-09-01 2014-06-08 16 13 matei@eecs.berkeley.edu zsxwing@gmail.com
CollectionsUtils.scala
in core/src/main/scala/org/apache/spark/util
26 1 2014-07-20 2014-07-20 1 1 rxin@apache.org rxin@apache.org
SizeTracker.scala
in core/src/main/scala/org/apache/spark/util/collection
45 4 2014-07-27 2014-07-27 1 1 andrewor14@gmail.com andrewor14@gmail.com
Command.scala
in core/src/main/scala/org/apache/spark/deploy
10 - 2013-09-01 2014-07-30 7 8 matei@eecs.berkeley.edu andrewor14@gmail.com
TaskLocality.scala
in core/src/main/scala/org/apache/spark/scheduler
10 1 2013-09-25 2014-08-06 22 15 kayousterhout@gmail.com zhunansjtu@gmail.com
package-info.java
in core/src/main/scala/org/apache/spark/serializer
1 - 2014-05-15 2014-08-16 2 2 prashant.s@imaginea.com rxin@apache.org
JavaFutureAction.java
in core/src/main/java/org/apache/spark/api/java
6 - 2014-10-20 2014-10-20 1 1 joshrosen@apache.org joshrosen@apache.org
Sorter.scala
in core/src/main/scala/org/apache/spark/util/collection
9 1 2014-10-28 2014-10-28 1 1 meng@databricks.com meng@databricks.com
BlockNotFoundException.scala
in core/src/main/scala/org/apache/spark/storage
2 - 2014-08-15 2014-10-29 3 1 rxin@apache.org rxin@apache.org
ShuffleReader.scala
in core/src/main/scala/org/apache/spark/shuffle
4 - 2014-06-12 2014-10-30 2 2 matei@databricks.com kayousterhout@gmail.com
package.scala
in sql/core/src/main/scala/org/apache/spark/sql/sources
2 - 2014-11-02 2014-11-02 1 1 michael@databricks.com michael@databricks.com
SparkJobInfo.java
in core/src/main/java/org/apache/spark
7 - 2014-10-25 2014-12-10 2 2 joshrosen@databricks.com sandy@cloudera.com
SparkStageInfo.java
in core/src/main/java/org/apache/spark
12 - 2014-10-25 2014-12-10 3 3 joshrosen@databricks.com sandy@cloudera.com
package.scala
in streaming/src/main/scala/org/apache/spark/streaming
3 - 2014-01-14 2014-12-26 3 4 pwendell@gmail.com zsxwing@gmail.com
ContextWaiter.scala
in streaming/src/main/scala/org/apache/spark/streaming
46 3 2014-01-12 2014-12-30 7 6 tathagata.das1565@gmail.com zsxwing@gmail.com
package-info.java
in sql/hive/src/main/scala/org/apache/spark/sql/hive
1 - 2014-05-15 2015-01-29 2 2 prashant.s@imaginea.com rxin@databricks.com
SizeTrackingVector.scala
in core/src/main/scala/org/apache/spark/util/collection
15 2 2014-07-27 2015-03-03 2 2 andrewor14@gmail.com cloud0fan@outlook.com
PrimitiveVector.scala
in core/src/main/scala/org/apache/spark/util/collection
51 5 2013-11-02 2015-03-03 20 15 aaron@databricks.com cloud0fan@outlook.com
SparkSubmitArgumentsParser.scala
in core/src/main/scala/org/apache/spark/launcher
2 - 2015-03-11 2015-03-11 1 1 vanzin@cloudera.com vanzin@cloudera.com
DriverState.scala
in core/src/main/scala/org/apache/spark/deploy/master
5 - 2013-12-22 2015-03-17 7 7 pwendell@gmail.com zhunansjtu@gmail.com
WorkerState.scala
in core/src/main/scala/org/apache/spark/deploy/master
5 - 2013-09-01 2015-03-17 22 19 matei@eecs.berkeley.edu zhunansjtu@gmail.com
RecoveryState.scala
in core/src/main/scala/org/apache/spark/deploy/master
5 - 2013-10-05 2015-03-17 16 16 aaron@databricks.com zhunansjtu@gmail.com
SubmitRestProtocolException.scala
in core/src/main/scala/org/apache/spark/deploy/rest
7 - 2015-02-06 2015-03-17 2 2 andrew@databricks.com zhunansjtu@gmail.com
ApplicationSource.scala
in core/src/main/scala/org/apache/spark/deploy/master
17 - 2013-09-01 2015-03-17 16 12 matei@eecs.berkeley.edu zhunansjtu@gmail.com
WorkerSource.scala
in core/src/main/scala/org/apache/spark/deploy/worker
22 - 2013-09-01 2015-03-17 16 12 matei@eecs.berkeley.edu zhunansjtu@gmail.com
Most Recently Created Files (Top 50)
File (name, location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
LazyTry.scala
in core/src/main/scala/org/apache/spark/util
9 1
AsyncStreamingQueryCheckpointMetadata.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/streaming
24 - 2025-04-30 2025-04-30 1 1 ruowang.zhang+data@databric... ruowang.zhang+data@databric...
StreamingQueryCheckpointMetadata.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/streaming
21 - 2025-04-30 2025-04-30 1 1 ruowang.zhang+data@databric... ruowang.zhang+data@databric...
CollationRulesRunner.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
10 1 2025-04-30 2025-04-30 1 1 marko.ilic@databricks.com marko.ilic@databricks.com
AnalysisAwareExpression.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
4 - 2025-04-30 2025-04-30 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
TypeCoercionValidation.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
95 7 2025-04-23 2025-04-25 2 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ApplyDefaultCollationToStringType.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
135 10 2025-04-22 2025-05-07 3 2 marko.ilic@databricks.com wenchen@databricks.com
TransformWithStateInPySparkStateServer.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
754 18 2025-04-18 2025-04-28 4 2 kabhwan.opensource@gmail.com bo.gao@databricks.com
TransformWithStateInPySparkExec.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
403 13 2025-04-18 2025-04-22 2 2 kabhwan.opensource@gmail.com zycm03@gmail.com
TransformWithStateInPySparkPythonRunner.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
291 11 2025-04-18 2025-04-18 1 1 kabhwan.opensource@gmail.com kabhwan.opensource@gmail.com
TransformWithStateInPySparkDeserializer.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/python/streaming
48 2 2025-04-18 2025-04-18 1 1 kabhwan.opensource@gmail.com kabhwan.opensource@gmail.com
BridgedRelationId.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
6 - 2025-04-16 2025-04-16 1 1 mihailo.milosevic@databrick... mihailo.milosevic@databrick...
DescribeProcedureCommand.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/command
54 6 2025-04-14 2025-04-14 1 1 szehon.apache@gmail.com szehon.apache@gmail.com
NormalizeableRelation.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
5 - 2025-04-14 2025-04-14 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ResolveTableConstraints.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util
31 2 2025-04-10 2025-04-15 2 1 gengliang@apache.org gengliang@apache.org
SetOperationLikeResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
283 16 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
SortResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
211 7 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
AggregateResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
192 7 2025-04-08 2025-05-05 2 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
JoinResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
180 7 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
constraints.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
174 8 2025-04-08 2025-05-06 4 2 gengliang@apache.org gengliang@apache.org
SubqueryExpressionResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
104 7 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
LimitLikeExpressionValidator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
101 3 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
IdentifierAndCteSubstituor.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
100 5 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
PruneMetadataColumns.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
77 4 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
FilterResolver.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
62 4 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ResolvesNameByHiddenOutput.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
48 3 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
UnsupportedExpressionInOperatorValidation.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
43 2 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
AutoGeneratedAliasProvider.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
40 3 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ResolverMetricTracker.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
38 2 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
SubqueryScope.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
37 4 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
SemanticComparator.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
31 2 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
PlanRewriter.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
28 1 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ExpressionTreeTraversal.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
28 1 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
PullOutNondeterministicExpressionInExpressionTree.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
20 1 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
TryExtractOrdinal.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
11 1 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ResolvedAggregateExpressions.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
10 - 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
ResolvedSubqueryExpressionPlan.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
7 - 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
UnresolvedCteRelationRef.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/resolver
5 - 2025-04-08 2025-04-08 1 1 vladimir.golubev@databricks... vladimir.golubev@databricks...
DefaultValue.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog
50 8 2025-04-07 2025-04-07 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
objects.py
in python/pyspark/testing
57 19 2025-04-03 2025-04-03 1 1 ueshin@databricks.com ueshin@databricks.com
BaseConstraint.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
94 15 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
ForeignKey.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
92 7 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
Check.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
75 10 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
PrimaryKey.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
57 5 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
Unique.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
56 6 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
Constraint.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints
31 - 2025-04-02 2025-04-02 1 1 aokolnychyi@apache.org aokolnychyi@apache.org
ShowNamespacesCommand.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/command
49 2 2025-03-28 2025-03-28 1 1 szehon.apache@gmail.com szehon.apache@gmail.com
tblib.py
in python/pyspark/errors/exceptions
200 12 2025-03-27 2025-03-27 1 1 wenghy02@gmail.com wenghy02@gmail.com
PushProjectionThroughLimitAndOffset.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer
18 1 2025-03-27 2025-03-27 1 1 pavle.martinovic@databricks... pavle.martinovic@databricks...
TableInfo.java
in sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog
58 8 2025-03-26 2025-04-18 3 2 anoop@apache.org anoop@apache.org
Most Recently Changed Files (Top 50)
File (name, location) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
LazyTry.scala
in core/src/main/scala/org/apache/spark/util
9 1
GeneralizedLinearRegression.scala
in mllib/src/main/scala/org/apache/spark/ml/regression
967 52 2016-03-01 2025-05-07 80 31 ybliang8@gmail.com weichen.xu@databricks.com
LogisticRegression.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
929 32 2014-11-12 2025-05-07 170 52 meng@databricks.com weichen.xu@databricks.com
TreeNode.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees
847 56 2014-03-21 2025-05-07 145 74 michael@databricks.com buyingyi@gmail.com
Expression.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions
827 48 2014-03-21 2025-05-07 187 76 michael@databricks.com buyingyi@gmail.com
RandomForest.scala
in mllib/src/main/scala/org/apache/spark/ml/tree/impl
827 17 2015-07-17 2025-05-07 55 29 joseph@databricks.com weichen.xu@databricks.com
unresolved.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
641 40 2014-03-21 2025-05-07 148 76 michael@databricks.com buyingyi@gmail.com
LinearRegression.scala
in mllib/src/main/scala/org/apache/spark/ml/regression
594 16 2015-02-06 2025-05-07 136 51 joseph@databricks.com weichen.xu@databricks.com
KMeans.scala
in mllib/src/main/scala/org/apache/spark/ml/clustering
518 19 2015-07-18 2025-05-07 70 36 yuu.ishikawa@gmail.com weichen.xu@databricks.com
GaussianMixture.scala
in mllib/src/main/scala/org/apache/spark/ml/clustering
455 17 2016-04-06 2025-05-07 67 26 ruifengz@foxmail.com weichen.xu@databricks.com
QueryPlan.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans
424 32 2014-03-21 2025-05-07 132 62 michael@databricks.com buyingyi@gmail.com
GradientBoostedTrees.scala
in mllib/src/main/scala/org/apache/spark/ml/tree/impl
351 10 2016-03-15 2025-05-07 30 16 seth.hendrickson16@gmail.com weichen.xu@databricks.com
RandomForestClassifier.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
346 14 2015-04-25 2025-05-07 64 30 joseph@databricks.com weichen.xu@databricks.com
treeModels.scala
in mllib/src/main/scala/org/apache/spark/ml/tree
331 20 2015-04-17 2025-05-07 37 16 joseph@databricks.com weichen.xu@databricks.com
Connect.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config
325 - 2024-08-02 2025-05-07 9 8 gurwls223@apache.org weichen.xu@databricks.com
MLHandler.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml
323 7 2025-01-14 2025-05-07 16 3 wbo4958@gmail.com weichen.xu@databricks.com
LinearSVC.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
312 13 2017-01-23 2025-05-07 54 18 yuhao.yang@intel.com weichen.xu@databricks.com
captured.py
in python/pyspark/errors/exceptions
284 31 2023-02-08 2025-05-07 22 6 ueshin@databricks.com gurwls223@apache.org
GBTClassifier.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
270 16 2015-04-25 2025-05-07 71 31 joseph@databricks.com weichen.xu@databricks.com
CTESubstitution.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
261 8 2019-07-12 2025-05-07 39 27 peter.toth@gmail.com buyingyi@gmail.com
MultilayerPerceptronClassifier.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
255 12 2015-07-31 2025-05-07 54 25 nashb@yandex.ru weichen.xu@databricks.com
GBTRegressor.scala
in mllib/src/main/scala/org/apache/spark/ml/regression
237 13 2015-04-25 2025-05-07 67 27 joseph@databricks.com weichen.xu@databricks.com
FMClassifier.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
225 10 2019-12-23 2025-05-07 24 8 zhanjf@mob.com weichen.xu@databricks.com
RandomForestRegressor.scala
in mllib/src/main/scala/org/apache/spark/ml/regression
219 10 2015-04-25 2025-05-07 58 29 joseph@databricks.com weichen.xu@databricks.com
DecisionTreeClassifier.scala
in mllib/src/main/scala/org/apache/spark/ml/classification
209 13 2015-04-17 2025-05-07 67 26 joseph@databricks.com weichen.xu@databricks.com
DecisionTreeRegressor.scala
in mllib/src/main/scala/org/apache/spark/ml/regression
209 12 2015-04-17 2025-05-07 65 28 joseph@databricks.com weichen.xu@databricks.com
BisectingKMeans.scala
in mllib/src/main/scala/org/apache/spark/ml/clustering
201 9 2016-01-20 2025-05-07 48 21 yuu.ishikawa@gmail.com weichen.xu@databricks.com
MLCache.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml
187 9 2025-01-14 2025-05-07 8 4 wbo4958@gmail.com weichen.xu@databricks.com
SparkConnectAnalyzeHandler.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service
186 2 2024-08-02 2025-05-07 4 4 gurwls223@apache.org peter.pashkin@databricks.com
package.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util
179 16 2014-03-21 2025-05-07 52 32 michael@databricks.com mihailo.aleksic@databricks.com
connectutils.py
in python/pyspark/testing
177 26 2022-10-06 2025-05-07 49 14 gurwls223@apache.org weichen.xu@databricks.com
BitSet.scala
in core/src/main/scala/org/apache/spark/util/collection
162 16 2013-11-03 2025-05-07 33 25 rxin@apache.org buyingyi@gmail.com
TreePatterns.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees
150 - 2021-04-12 2025-05-07 69 36 yingyi.bu@databricks.com buyingyi@gmail.com
ApplyDefaultCollationToStringType.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
135 10 2025-04-22 2025-05-07 3 2 marko.ilic@databricks.com wenchen@databricks.com
ResolveIdentifierClause.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
100 5 2023-06-06 2025-05-07 11 8 wenchen@databricks.com buyingyi@gmail.com
UninterruptibleThread.scala
in core/src/main/scala/org/apache/spark/util
79 7 2016-03-28 2025-05-07 6 3 shixiong@databricks.com vrozov@amazon.com
TreePatternBits.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees
38 4 2021-04-12 2025-05-07 3 3 yingyi.bu@databricks.com buyingyi@gmail.com
MLException.scala
in sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml
28 - 2025-01-14 2025-05-07 4 3 wbo4958@gmail.com weichen.xu@databricks.com
LiteralFunctionResolution.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
23 1 2025-02-12 2025-05-07 3 3 mihailo.timotic@databricks.com mihailo.aleksic@databricks.com
HasTrainingSummary.scala
in mllib/src/main/scala/org/apache/spark/ml/util
21 1 2018-12-17 2025-05-07 4 3 yuhao.yang@intel.com weichen.xu@databricks.com
SQLConf.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/internal
5815 42 2017-03-14 2025-05-06 729 225 rxin@databricks.com gurwls223@apache.org
Analyzer.scala
in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
2889 137 2014-03-21 2025-05-06 742 196 michael@databricks.com mihailo.aleksic@databricks.com
series.py
in python/pyspark/pandas
2215 143 2021-04-06 2025-05-06 118 19 haejoon.lee@databricks.com ueshin@databricks.com
groupby.py
in python/pyspark/pandas
1800 83 2021-04-06 2025-05-06 97 20 haejoon.lee@databricks.com ueshin@databricks.com
worker.py
in python/pyspark
1728 32 2013-01-01 2025-05-06 208 77 joshrosen@eecs.berkeley.edu gurwls223@apache.org
namespace.py
in python/pyspark/pandas
1517 45 2021-04-06 2025-05-06 82 21 haejoon.lee@databricks.com ueshin@databricks.com
1460 12 2011-07-15 2025-05-06 818 222 ismael@juma.me.uk wenchen@databricks.com
serializers.py
in python/pyspark/sql/pandas
884 46 2020-01-09 2025-05-06 52 20 gurwls223@apache.org gurwls223@apache.org
utils.py
in python/pyspark/pandas
657 31 2021-04-06 2025-05-06 49 10 haejoon.lee@databricks.com ueshin@databricks.com
DataSource.scala
in sql/core/src/main/scala/org/apache/spark/sql/execution/datasources
617 17 2016-03-08 2025-05-06 172 69 michael@databricks.com vrozov@amazon.com