apache / datafu
File Change Frequency

File change frequency (churn) shows how often files are updated. An update is counted as one day on which a file had at least one commit.
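As a minimal sketch of the "days with at least one commit" measure described above: the input shape (pairs of file path and commit date) and the function name are assumptions made for illustration, not the report tool's actual data model.

```python
from collections import defaultdict
from datetime import date


def change_days_per_file(commits):
    """Count, per file, the distinct days on which it had at least one commit.

    `commits` is an iterable of (path, commit_date) pairs. This input shape is
    hypothetical; the report does not document its internal format.
    """
    days = defaultdict(set)
    for path, commit_date in commits:
        days[path].add(commit_date)
    return {path: len(dates) for path, dates in days.items()}


# Hypothetical history: two commits to a.py on the same day count once.
history = [
    ("a.py", date(2024, 1, 16)),
    ("a.py", date(2024, 1, 16)),
    ("a.py", date(2025, 2, 24)),
    ("b.py", date(2019, 7, 17)),
]
churn = change_days_per_file(history)
```

Counting days rather than commits means a burst of small commits on one day weighs the same as a single large commit.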

Overview
File Change Frequency Overall
  • There are 225 files with 17,158 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 0 files changed 51-100 times (0 lines of code)
    • 0 files changed 21-50 times (0 lines of code)
    • 14 files changed 6-20 times (1,256 lines of code)
    • 211 files changed 1-5 times (15,902 lines of code)
101+: 0% | 51-100: 0% | 21-50: 0% | 6-20: 7% | 1-5: 92%
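The percentages above match each bucket's share of the 17,158 lines of code, rounded down (1,256 / 17,158 is about 7.3%, and 15,902 / 17,158 is about 92.7%). The sketch below reproduces them under that assumption. The bucket boundaries come from the legend, while the function names and the truncation rule are illustrative guesses rather than the tool's documented behaviour.

```python
# Bucket boundaries taken from the legend: (low, high, label); high=None means open-ended.
BUCKETS = [
    (101, None, "101+"),
    (51, 100, "51-100"),
    (21, 50, "21-50"),
    (6, 20, "6-20"),
    (1, 5, "1-5"),
]


def bucket_for(changes):
    """Return the legend label for a file with the given number of change days."""
    for low, high, label in BUCKETS:
        if changes >= low and (high is None or changes <= high):
            return label
    raise ValueError(f"change count must be at least 1, got {changes}")


def loc_share(loc_by_bucket):
    """Return each bucket's share of total lines of code as a whole percent.

    Shares are truncated (floor division), which is an assumption chosen
    because it reproduces the figures shown in the report.
    """
    total = sum(loc_by_bucket.values())
    return {label: loc * 100 // total for label, loc in loc_by_bucket.items()}


# Lines of code per bucket, as listed in the overview.
overall = {"101+": 0, "51-100": 0, "21-50": 0, "6-20": 1256, "1-5": 15902}
shares = loc_share(overall)
```

Because each share is truncated independently, the displayed values can sum to 99% rather than 100%.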

Contributors Count Frequency Overall
  • There are 225 files with 17,158 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 0 files changed by 11-25 contributors (0 lines of code)
    • 3 files changed by 6-10 contributors (419 lines of code)
    • 159 files changed by 2-5 contributors (13,415 lines of code)
    • 63 files changed by 1 contributor (3,324 lines of code)
26+: 0% | 11-25: 0% | 6-10: 2% | 2-5: 78% | 1: 19%

File Change Frequency per File Extension
java, erb, markdown, gradle, scala, pig, md, py, gitignore, css, js, rb, sh, groovy, yaml, svg, properties, xsl, builder, html, less, txt, rdf
File Change Frequency per Extension
Distribution by the number of recorded file updates:
extension | 101+ | 51-100 | 21-50 | 6-20 | 1-5
scala | 0% | 0% | 0% | 65% | 34%
java | 0% | 0% | 0% | 2% | 97%
erb | 0% | 0% | 0% | 47% | 52%
py | 0% | 0% | 0% | 57% | 42%
rb | 0% | 0% | 0% | 37% | 63%
css | 0% | 0% | 0% | 0% | 100%
pig | 0% | 0% | 0% | 0% | 100%
groovy | 0% | 0% | 0% | 0% | 100%
xsl | 0% | 0% | 0% | 0% | 100%
less | 0% | 0% | 0% | 0% | 100%
rdf | 0% | 0% | 0% | 0% | 100%
builder | 0% | 0% | 0% | 0% | 100%
html | 0% | 0% | 0% | 0% | 100%
js | 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
Distribution by the number of recorded file updates:
component | 101+ | 51-100 | 21-50 | 6-20 | 1-5
datafu-spark | 0% | 0% | 0% | 65% | 34%
datafu-pig | 0% | 0% | 0% | 3% | 96%
site | 0% | 0% | 0% | 23% | 76%
build-plugin | 0% | 0% | 0% | 27% | 72%
datafu-hourglass | 0% | 0% | 0% | 0% | 100%
buildSrc | 0% | 0% | 0% | 0% | 100%
gradle | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
Most Frequently Changed Files (Top 50)

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
_footer.erb
in site/source/layouts
27 - 2014-01-23 2024-01-16 16 6 mhayes@linkedin.com eyal@apache.org
_docs_nav.erb
in site/source/layouts
55 - 2014-01-23 2024-01-16 16 6 mhayes@linkedin.com eyal@apache.org
SparkDFUtils.scala
in datafu-spark/src/main/scala/datafu/spark
337 25 2019-07-17 2025-02-24 15 9 oraviv@paypal.com ahartanu@paypal.com
config.rb
in site
37 3 2014-01-23 2021-11-06 12 5 mhayes@linkedin.com eyal@apache.org
index.markdown.erb
in site/source
49 - 2014-01-23 2024-01-16 11 5 mhayes@linkedin.com eyal@apache.org
df_utils.py
in datafu-spark/src/main/resources/pyspark_utils
62 16 2019-07-17 2025-02-24 8 5 oraviv@paypal.com ahartanu@paypal.com
DataFrameOps.scala
in datafu-spark/src/main/scala/datafu/spark
75 1 2019-07-17 2024-12-09 8 5 oraviv@paypal.com eyal@apache.org
MultilineProcessor.java
in build-plugin/src/main/java/org/adrianwalker/multilinestring
33 2 2014-03-03 2020-02-05 7 3 mhayes@linkedin.com mhayes@apache.org
_header.erb
in site/source/layouts
24 - 2014-01-23 2018-03-17 6 3 mhayes@linkedin.com mhayes@apache.org
ContextualEvalFunc.java
in datafu-pig/src/main/java/datafu/pig/util
44 7 2014-03-03 2017-11-17 6 3 mhayes@linkedin.com flip@infochimps.org
SparkPythonRunner.scala
in datafu-spark/src/main/scala/spark/utils/overwrites
101 4 2019-07-17 2023-10-02 6 4 oraviv@paypal.com eyal@apache.org
BagGroup.java
in datafu-pig/src/main/java/datafu/pig/bags
117 4 2014-03-03 2014-11-24 6 5 mhayes@linkedin.com jghoman@gmail.com
SparkOverwriteUDAFs.scala
in datafu-spark/src/main/scala/spark/utils/overwrites
135 9 2019-07-17 2025-01-23 6 5 oraviv@paypal.com brahamim@paypal.com
AliasableEvalFunc.java
in datafu-pig/src/main/java/datafu/pig/util
160 23 2014-03-03 2016-10-26 6 5 mhayes@linkedin.com eyal@apache.org
sitemap.xml.builder
in site/source
14 - 2014-01-23 2019-06-02 5 4 mhayes@linkedin.com eyal@apache.org
blog.erb
in site/source/layouts
41 - 2014-01-23 2018-07-05 5 3 mhayes@linkedin.com mhayes@apache.org
all.less
in site/source/stylesheets
53 - 2014-01-23 2019-01-29 5 4 mhayes@linkedin.com eyal@apache.org
ScalaPythonBridge.scala
in datafu-spark/src/main/scala/datafu/spark
103 8 2019-07-17 2023-10-02 5 4 oraviv@paypal.com eyal@apache.org
HyperLogLogPlusPlus.java
in datafu-pig/src/main/java/datafu/pig/stats
152 13 2014-03-03 2018-07-09 5 5 mhayes@linkedin.com mhayes@apache.org
PathUtils.java
in datafu-hourglass/src/main/java/datafu/hourglass/fs
192 11 2014-05-18 2015-05-23 5 3 jarcec@apache.org matthew.terence.hayes@gmail...
AbstractJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
257 28 2014-05-18 2016-03-09 5 3 jarcec@apache.org matthew.terence.hayes@gmail...
layout.erb
in site/source/layouts
39 - 2014-01-23 2025-01-24 4 3 mhayes@linkedin.com niall.pemberton@gmail.com
bridge_utils.py
in datafu-spark/src/main/resources/pyspark_utils
41 3 2019-07-17 2025-01-06 4 3 oraviv@paypal.com eyal@apache.org
HasherRand.java
in datafu-pig/src/main/java/datafu/pig/hash
51 5 2017-12-05 2020-03-31 4 2 flip@infochimps.org mhayes@apache.org
DistributedCacheHelper.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
56 2 2014-05-18 2015-05-23 4 3 jarcec@apache.org matthew.terence.hayes@gmail...
SampleByKey.java
in datafu-pig/src/main/java/datafu/pig/sampling
58 5 2014-03-03 2014-11-24 4 4 mhayes@linkedin.com jghoman@gmail.com
AvroMultipleInputsUtil.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
78 3 2014-05-18 2018-01-29 4 3 jarcec@apache.org mhayes@apache.org
SimpleEvalFunc.java
in datafu-pig/src/main/java/datafu/pig/util
101 5 2014-03-03 2017-11-17 4 2 mhayes@linkedin.com flip@infochimps.org
Sampler.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
6 - 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
KeyValueCollector.java
in datafu-hourglass/src/main/java/datafu/hourglass/model
6 - 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
Mapper.java
in datafu-hourglass/src/main/java/datafu/hourglass/model
7 - 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
L2.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/metric
14 3 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
Cosine.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/metric
14 3 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
L1.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/metric
14 3 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
LSH.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
16 1 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
L1LSH.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable
18 3 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
L2LSH.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable
18 3 2014-05-13 2014-11-24 3 4 cestella@gmail.com jghoman@gmail.com
InUDF.java
in datafu-pig/src/main/java/datafu/pig/util
19 1 2014-03-03 2014-11-24 3 3 mhayes@linkedin.com jghoman@gmail.com
LSHFamily.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh
24 2 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
BagLeftOuterJoin.java
in datafu-pig/src/main/java/datafu/pig/bags
25 1 2014-03-03 2014-11-24 3 3 mhayes@linkedin.com jghoman@gmail.com
diff_macros.pig
in datafu-pig/src/main/resources/datafu
34 - 2017-09-11 2019-01-07 3 3 eallweil@paypal.com mhayes@apache.org
35 - 2014-08-04 2023-08-28 3 3 mhayes@linkedin.com eyal@apache.org
count_macros.pig
in datafu-pig/src/main/resources/datafu
38 - 2017-08-03 2019-01-07 3 3 eallweil@paypal.com mhayes@apache.org
sample_by_keys.pig
in datafu-pig/src/main/resources/datafu
38 - 2018-07-09 2019-01-07 3 3 eallweil@paypal.com mhayes@apache.org
AvroDateRangeMetadata.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
40 2 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
AbstractStableDistributionFunction.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable
42 3 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
FileCleaner.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
45 4 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
pig.rb
in site/lib
54 1 2014-01-23 2015-10-16 3 2 mhayes@linkedin.com matthew.terence.hayes@gmail...
LSHCreator.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
61 5 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
PartitionPreservingIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
88 15 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
Files With Most Contributors (Top 50)
Based on the number of unique email addresses found in commits.
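A minimal sketch of counting contributors by unique email address, as described above. The input shape and function name are hypothetical, and lower-casing addresses before comparison is an added assumption, not something the report states.

```python
from collections import defaultdict


def contributors_per_file(commits):
    """Count distinct commit email addresses per file.

    `commits` is an iterable of (path, email) pairs (a hypothetical shape).
    Addresses are stripped and lower-cased before comparison; that
    normalisation is an assumption for illustration.
    """
    emails = defaultdict(set)
    for path, email in commits:
        emails[path].add(email.strip().lower())
    return {path: len(addresses) for path, addresses in emails.items()}


# Hypothetical history: one person committing from two addresses counts twice.
history = [
    ("Example.java", "dev@example.com"),
    ("Example.java", "dev@example.org"),
    ("Example.java", "DEV@example.com"),
    ("Other.java", "other@example.com"),
]
counts = contributors_per_file(history)
```

A measure based on email addresses treats one person who commits from several addresses as several contributors, so the counts in the tables may overstate the number of distinct people.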

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
SparkDFUtils.scala
in datafu-spark/src/main/scala/datafu/spark
337 25 2019-07-17 2025-02-24 15 9 oraviv@paypal.com ahartanu@paypal.com
_footer.erb
in site/source/layouts
27 - 2014-01-23 2024-01-16 16 6 mhayes@linkedin.com eyal@apache.org
_docs_nav.erb
in site/source/layouts
55 - 2014-01-23 2024-01-16 16 6 mhayes@linkedin.com eyal@apache.org
config.rb
in site
37 3 2014-01-23 2021-11-06 12 5 mhayes@linkedin.com eyal@apache.org
index.markdown.erb
in site/source
49 - 2014-01-23 2024-01-16 11 5 mhayes@linkedin.com eyal@apache.org
DataFrameOps.scala
in datafu-spark/src/main/scala/datafu/spark
75 1 2019-07-17 2024-12-09 8 5 oraviv@paypal.com eyal@apache.org
df_utils.py
in datafu-spark/src/main/resources/pyspark_utils
62 16 2019-07-17 2025-02-24 8 5 oraviv@paypal.com ahartanu@paypal.com
AliasableEvalFunc.java
in datafu-pig/src/main/java/datafu/pig/util
160 23 2014-03-03 2016-10-26 6 5 mhayes@linkedin.com eyal@apache.org
BagGroup.java
in datafu-pig/src/main/java/datafu/pig/bags
117 4 2014-03-03 2014-11-24 6 5 mhayes@linkedin.com jghoman@gmail.com
SparkOverwriteUDAFs.scala
in datafu-spark/src/main/scala/spark/utils/overwrites
135 9 2019-07-17 2025-01-23 6 5 oraviv@paypal.com brahamim@paypal.com
HyperLogLogPlusPlus.java
in datafu-pig/src/main/java/datafu/pig/stats
152 13 2014-03-03 2018-07-09 5 5 mhayes@linkedin.com mhayes@apache.org
SparkPythonRunner.scala
in datafu-spark/src/main/scala/spark/utils/overwrites
101 4 2019-07-17 2023-10-02 6 4 oraviv@paypal.com eyal@apache.org
all.less
in site/source/stylesheets
53 - 2014-01-23 2019-01-29 5 4 mhayes@linkedin.com eyal@apache.org
sitemap.xml.builder
in site/source
14 - 2014-01-23 2019-06-02 5 4 mhayes@linkedin.com eyal@apache.org
ScalaPythonBridge.scala
in datafu-spark/src/main/scala/datafu/spark
103 8 2019-07-17 2023-10-02 5 4 oraviv@paypal.com eyal@apache.org
SampleByKey.java
in datafu-pig/src/main/java/datafu/pig/sampling
58 5 2014-03-03 2014-11-24 4 4 mhayes@linkedin.com jghoman@gmail.com
L2LSH.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable
18 3 2014-05-13 2014-11-24 3 4 cestella@gmail.com jghoman@gmail.com
MultilineProcessor.java
in build-plugin/src/main/java/org/adrianwalker/multilinestring
33 2 2014-03-03 2020-02-05 7 3 mhayes@linkedin.com mhayes@apache.org
ContextualEvalFunc.java
in datafu-pig/src/main/java/datafu/pig/util
44 7 2014-03-03 2017-11-17 6 3 mhayes@linkedin.com flip@infochimps.org
_header.erb
in site/source/layouts
24 - 2014-01-23 2018-03-17 6 3 mhayes@linkedin.com mhayes@apache.org
PathUtils.java
in datafu-hourglass/src/main/java/datafu/hourglass/fs
192 11 2014-05-18 2015-05-23 5 3 jarcec@apache.org matthew.terence.hayes@gmail...
AbstractJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
257 28 2014-05-18 2016-03-09 5 3 jarcec@apache.org matthew.terence.hayes@gmail...
blog.erb
in site/source/layouts
41 - 2014-01-23 2018-07-05 5 3 mhayes@linkedin.com mhayes@apache.org
DistributedCacheHelper.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
56 2 2014-05-18 2015-05-23 4 3 jarcec@apache.org matthew.terence.hayes@gmail...
AvroMultipleInputsUtil.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
78 3 2014-05-18 2018-01-29 4 3 jarcec@apache.org mhayes@apache.org
layout.erb
in site/source/layouts
39 - 2014-01-23 2025-01-24 4 3 mhayes@linkedin.com niall.pemberton@gmail.com
bridge_utils.py
in datafu-spark/src/main/resources/pyspark_utils
41 3 2019-07-17 2025-01-06 4 3 oraviv@paypal.com eyal@apache.org
Mapper.java
in datafu-hourglass/src/main/java/datafu/hourglass/model
7 - 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
KeyValueCollector.java
in datafu-hourglass/src/main/java/datafu/hourglass/model
6 - 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitionPreservingExecutionPlanner.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
156 13 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitionCollapsingExecutionPlanner.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
315 21 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
AbstractPartitionCollapsingIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
349 24 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
ExecutionPlanner.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
234 25 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
FileCleaner.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
45 4 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitionPreservingIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
88 15 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
AbstractNonIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
206 11 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitionCollapsingIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
109 19 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
AbstractPartitionPreservingIncrementalJob.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
387 20 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitioningReducer.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
108 8 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
CollapsingCombiner.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
152 9 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
CollapsingReducer.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
222 12 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
CollapsingMapper.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
202 17 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
PartitioningMapper.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
106 11 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
AvroDateRangeMetadata.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
40 2 2014-05-18 2014-11-24 3 3 jarcec@apache.org jghoman@gmail.com
Sessionize.java
in datafu-pig/src/main/java/datafu/pig/sessions
107 5 2014-03-03 2017-06-23 3 3 mhayes@linkedin.com jtolar@yahoo-inc.com
SetDifference.java
in datafu-pig/src/main/java/datafu/pig/sets
140 8 2014-03-03 2014-11-24 3 3 mhayes@linkedin.com jghoman@gmail.com
InUDF.java
in datafu-pig/src/main/java/datafu/pig/util
19 1 2014-03-03 2014-11-24 3 3 mhayes@linkedin.com jghoman@gmail.com
LSHCreator.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
61 5 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
LSH.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
16 1 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
Sampler.java
in datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces
6 - 2014-05-13 2014-11-24 3 3 cestella@gmail.com jghoman@gmail.com
Files With Fewest Contributors (Top 50)
Based on the number of unique email addresses found in commits.

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
bootstrap-theme.css
in site/source/stylesheets
340 - 2014-01-23 2014-01-23 1 1 mhayes@linkedin.com mhayes@linkedin.com
FloatVAR.java
in datafu-pig/src/main/java/datafu/pig/stats
289 15 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
LongVAR.java
in datafu-pig/src/main/java/datafu/pig/stats
288 15 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
IntVAR.java
in datafu-pig/src/main/java/datafu/pig/stats
288 15 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
DoubleVAR.java
in datafu-pig/src/main/java/datafu/pig/stats
288 15 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
rat-output-to-html.xsl
in gradle/resources
153 - 2014-11-23 2014-11-23 1 1 matthew.terence.hayes@gmail... matthew.terence.hayes@gmail...
ChaoShenEntropyEstimator.java
in datafu-pig/src/main/java/datafu/pig/stats/entropy
134 11 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
Aggregators.scala
in datafu-spark/src/main/scala/datafu/spark
129 12 2024-01-03 2024-01-03 1 1 eyal@apache.org eyal@apache.org
TupleDiff.java
in datafu-pig/src/main/java/datafu/pig/util
128 8 2017-09-11 2017-09-11 1 1 eallweil@paypal.com eallweil@paypal.com
WeightedSample.java
in datafu-pig/src/main/java/datafu/pig/sampling
104 5 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
URLInfo.java
in datafu-pig/src/main/java/datafu/pig/urls
102 8 2014-08-10 2014-08-10 1 1 jbanerjee1@gmail.com jbanerjee1@gmail.com
BagSplit.java
in datafu-pig/src/main/java/datafu/pig/bags
94 4 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
MarkovPairs.java
in datafu-pig/src/main/java/datafu/pig/stats
92 5 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
tf_idf.pig
in datafu-pig/src/main/resources/datafu
84 - 2017-08-06 2017-08-06 1 1 russell.jurney@gmail.com russell.jurney@gmail.com
SetIntersect.java
in datafu-pig/src/main/java/datafu/pig/sets
75 4 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
ScoredTuple.java
in datafu-pig/src/main/java/datafu/pig/sampling
71 11 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
ZipBags.java
in datafu-pig/src/main/java/datafu/pig/bags
58 2 2014-09-15 2014-09-15 1 1 ajoseph4@binghamton.edu ajoseph4@binghamton.edu
EmptyBagToNullFields.java
in datafu-pig/src/main/java/datafu/pig/bags
58 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
AvroKeyValueWithMetadataRecordWriter.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
55 4 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
EmptyBagToNull.java
in datafu-pig/src/main/java/datafu/pig/bags
35 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
dedup.pig
in datafu-pig/src/main/resources/datafu
34 - 2018-10-11 2018-10-11 1 1 eyal@apache.org eyal@apache.org
SetUnion.java
in datafu-pig/src/main/java/datafu/pig/sets
33 1 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
SelectStringFieldByName.java
in datafu-pig/src/main/java/datafu/pig/util
32 1 2014-11-03 2014-11-03 1 1 russell.jurney@gmail.com russell.jurney@gmail.com
EntropyEstimator.java
in datafu-pig/src/main/java/datafu/pig/stats/entropy
29 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
ExtractAutojar.groovy
in buildSrc/src/main/groovy/datafu/autojar/task
29 1 2018-07-05 2018-07-05 1 1 mhayes@apache.org mhayes@apache.org
AppendToBag.java
in datafu-pig/src/main/java/datafu/pig/bags
27 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
NullToEmptyBag.java
in datafu-pig/src/main/java/datafu/pig/bags
26 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
Reservoir.java
in datafu-pig/src/main/java/datafu/pig/sampling
25 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
EmpiricalEntropyEstimator.java
in datafu-pig/src/main/java/datafu/pig/stats/entropy
25 3 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
AvroMultipleInputsKeyInputFormat.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
24 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
AvroKeyValueWithMetadataOutputFormat.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
21 1 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
RandomUUID.java
in datafu-pig/src/main/java/datafu/pig/random
20 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
EntropyUtil.java
in datafu-pig/src/main/java/datafu/pig/stats/entropy
20 2 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
CachedFile.java
in datafu-pig/src/main/java/datafu/pig/text/opennlp
18 1 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
GradleAutojarPlugin.groovy
in buildSrc/src/main/groovy/datafu/autojar
14 1 2018-07-05 2018-07-05 1 1 mhayes@apache.org mhayes@apache.org
MaxInputDataExceededException.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
11 2 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
find_dupes.rb
in datafu-hourglass
9 - 2015-05-23 2015-05-27 2 1 matthew.terence.hayes@gmail... matthew.terence.hayes@gmail...
Multiline.java
in build-plugin/src/main/java/org/adrianwalker/multilinestring
9 - 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
IntToBool.java
in datafu-pig/src/main/java/datafu/pig/util
8 1 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
BoolToInt.java
in datafu-pig/src/main/java/datafu/pig/util
8 1 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
ProgressIndicator.java
in datafu-pig/src/main/java/datafu/pig/linkanalysis
5 - 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
Assert.java
in datafu-pig/src/main/java/datafu/pig/util
5 - 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
In.java
in datafu-pig/src/main/java/datafu/pig/util
5 - 2014-03-03 2014-03-03 1 1 mhayes@linkedin.com mhayes@linkedin.com
init_spark_context.py
in datafu-spark/src/main/resources/pyspark_utils
3 - 2019-07-17 2019-07-17 1 1 oraviv@paypal.com oraviv@paypal.com
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/model
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/fs
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/jobs
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/mapreduce
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/schemas
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
package-info.java
in datafu-hourglass/src/main/java/datafu/hourglass/avro
1 - 2014-05-18 2014-05-18 1 1 jarcec@apache.org jarcec@apache.org
Correlations

File Size vs. Number of Changes: 225 points

[Scatter plot: one point per file; x = lines of code, y = # changes (days with at least one commit).]
datafu-pig/src/main/java/datafu/pig/util/DataFuException.java x: 68 lines of code y: 2 # changes datafu-pig/src/main/java/datafu/pig/util/TransposeTupleToBag.java x: 61 lines of code y: 2 # changes datafu-pig/src/main/java/datafu/pig/util/Base64Decode.java x: 14 lines of code y: 2 # changes datafu-pig/src/main/java/datafu/pig/hash/SHA.java x: 28 lines of code y: 2 # changes datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroKeyValueWithMetadataOutputFormat.java x: 21 lines of code y: 1 # changes datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroKeyValueWithMetadataRecordWriter.java x: 55 lines of code y: 1 # changes datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroMultipleInputsKeyInputFormat.java x: 24 lines of code y: 1 # changes datafu-hourglass/src/main/java/datafu/hourglass/jobs/MaxInputDataExceededException.java x: 11 lines of code y: 1 # changes build-plugin/src/main/java/org/adrianwalker/multilinestring/Multiline.java x: 9 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/bags/AppendToBag.java x: 27 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/bags/BagSplit.java x: 94 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNull.java x: 35 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/bags/NullToEmptyBag.java x: 26 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/linkanalysis/ProgressIndicator.java x: 5 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/random/RandomUUID.java x: 20 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/sampling/Reservoir.java x: 25 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/sampling/ScoredTuple.java x: 71 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/sampling/WeightedSample.java x: 104 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/sets/SetIntersect.java x: 75 lines of code y: 1 # changes 
datafu-pig/src/main/java/datafu/pig/sets/SetUnion.java x: 33 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java x: 288 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java x: 289 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/stats/MarkovPairs.java x: 92 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/stats/entropy/ChaoShenEntropyEstimator.java x: 134 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/text/opennlp/CachedFile.java x: 18 lines of code y: 1 # changes datafu-pig/src/main/java/datafu/pig/util/BoolToInt.java x: 8 lines of code y: 1 # changes site/source/stylesheets/bootstrap-theme.css x: 340 lines of code y: 1 # changes
y-axis, # changes: min: 1.0 | average: 2.52 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.0 | max: 16.0
x-axis, lines of code: min: 1.0 | average: 76.26 | 25th percentile: 16.5 | median: 42.0 | 75th percentile: 102.5 | max: 492.0
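
The scatter-plot entries above follow a regular pattern (`<path> x: <n> <x unit> y: <m> <y unit>`), so they can be turned back into records for further analysis. The following is a minimal sketch, assuming that pattern holds for every entry; `parse_points` and `summarize` are hypothetical helper names, not part of the report tooling. Percentiles are left out because the report does not state which interpolation method it uses.

```python
import re
from statistics import mean, median


def parse_points(text, x_unit, y_unit):
    """Return (path, x, y) tuples from a flattened scatter-plot dump.

    Assumes each entry looks like '<path> x: <int> <x_unit> y: <int> <y_unit>'
    and that paths contain no whitespace.
    """
    # Collapse line wraps so an entry split across lines still matches.
    flat = " ".join(text.split())
    pattern = re.compile(
        r"(\S+) x: (\d+) " + re.escape(x_unit) + r" y: (\d+) " + re.escape(y_unit)
    )
    return [(path, int(x), int(y)) for path, x, y in pattern.findall(flat)]


def summarize(values):
    """Compute the simple statistics shown under each axis."""
    return {
        "min": min(values),
        "average": mean(values),
        "median": median(values),
        "max": max(values),
    }


# Two entries copied from the plot data, with a line wrap inside the first one.
sample = """datafu-pig/src/main/java/datafu/pig/bags/BagGroup.java x: 117 lines
of code y: 6 # changes datafu-pig/src/main/java/datafu/pig/bags/BagJoin.java x: 219 lines of code y: 3 # changes"""

points = parse_points(sample, "lines of code", "# changes")
changes_summary = summarize([y for _, _, y in points])
```

The same function handles the other two plots by passing `"# contributors"` as the x unit and either `"# changes"` or `"lines of code"` as the y unit.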

Number of Contributors vs. Number of Changes: 225 points

datafu-spark/src/main/resources/pyspark_utils/df_utils.py x: 5 # contributors y: 8 # changes datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala x: 9 # contributors y: 15 # changes site/source/layouts/layout.erb x: 3 # contributors y: 4 # changes datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala x: 5 # contributors y: 6 # changes site/source/index.markdown.erb x: 5 # contributors y: 11 # changes site/source/layouts/_docs_nav.erb x: 6 # contributors y: 16 # changes datafu-spark/src/main/scala/datafu/spark/Aggregators.scala x: 1 # contributors y: 1 # changes datafu-spark/src/main/scala/datafu/spark/ScalaPythonBridge.scala x: 4 # contributors y: 5 # changes datafu-spark/src/main/scala/spark/utils/overwrites/SparkPythonRunner.scala x: 4 # contributors y: 6 # changes doap_DataFu.rdf x: 3 # contributors y: 3 # changes datafu-spark/src/main/scala/datafu/spark/PythonPathsManager.scala x: 3 # contributors y: 2 # changes site/config.rb x: 5 # contributors y: 12 # changes buildSrc/src/main/groovy/datafu/autojar/task/Autojar.groovy x: 2 # contributors y: 2 # changes datafu-pig/src/main/java/datafu/pig/hash/HasherRand.java x: 2 # contributors y: 4 # changes build-plugin/src/main/java/org/adrianwalker/multilinestring/MultilineProcessor.java x: 3 # contributors y: 7 # changes datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java x: 5 # contributors y: 5 # changes site/source/layouts/blog.erb x: 3 # contributors y: 5 # changes site/source/layouts/_header.erb x: 3 # contributors y: 6 # changes site/lib/pig.rb x: 2 # contributors y: 3 # changes datafu-hourglass/find_dupes.rb x: 1 # contributors y: 2 # changes datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable/L2LSH.java x: 4 # contributors y: 3 # changes datafu-pig/src/main/java/datafu/pig/sampling/SampleByKey.java x: 4 # contributors y: 4 # changes
y-axis, # changes: min: 1.0 | average: 2.52 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.0 | max: 16.0
x-axis, # contributors: min: 1.0 | average: 2.18 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.0 | max: 9.0

Number of Contributors vs. File Size: 225 points

datafu-spark/src/main/resources/pyspark_utils/df_utils.py x: 5 # contributors y: 62 lines of code datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala x: 9 # contributors y: 337 lines of code site/source/layouts/layout.erb x: 3 # contributors y: 39 lines of code datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala x: 5 # contributors y: 135 lines of code datafu-spark/src/main/resources/pyspark_utils/bridge_utils.py x: 3 # contributors y: 41 lines of code datafu-spark/src/main/scala/datafu/spark/DataFrameOps.scala x: 5 # contributors y: 75 lines of code site/source/index.markdown.erb x: 5 # contributors y: 49 lines of code site/source/layouts/_docs_nav.erb x: 6 # contributors y: 55 lines of code site/source/layouts/_footer.erb x: 6 # contributors y: 27 lines of code datafu-spark/src/main/scala/datafu/spark/Aggregators.scala x: 1 # contributors y: 129 lines of code datafu-spark/src/main/scala/datafu/spark/ScalaPythonBridge.scala x: 4 # contributors y: 103 lines of code datafu-spark/src/main/scala/spark/utils/overwrites/SparkPythonRunner.scala x: 4 # contributors y: 101 lines of code doap_DataFu.rdf x: 3 # contributors y: 35 lines of code datafu-spark/src/main/scala/datafu/spark/PythonPathsManager.scala x: 3 # contributors y: 102 lines of code site/config.rb x: 5 # contributors y: 37 lines of code buildSrc/src/main/groovy/datafu/autojar/task/Autojar.groovy x: 2 # contributors y: 125 lines of code datafu-pig/src/main/java/datafu/org/apache/pig/piggybank/evaluation/ExtremalTupleByNthField.java x: 2 # contributors y: 177 lines of code datafu-pig/src/main/java/datafu/pig/bags/CountDistinctUpTo.java x: 2 # contributors y: 156 lines of code datafu-pig/src/main/java/datafu/pig/hash/Hasher.java x: 2 # contributors y: 76 lines of code datafu-pig/src/main/java/datafu/pig/hash/HasherRand.java x: 2 # contributors y: 51 lines of code build-plugin/src/main/java/org/adrianwalker/multilinestring/EcjMultilineProcessor.java x: 2 # contributors y: 41 lines 
of code build-plugin/src/main/java/org/adrianwalker/multilinestring/JavacMultilineProcessor.java x: 2 # contributors y: 39 lines of code build-plugin/src/main/java/org/adrianwalker/multilinestring/MultilineProcessor.java x: 3 # contributors y: 33 lines of code datafu-spark/src/main/resources/pyspark_utils/__init__.py x: 1 # contributors y: 1 lines of code datafu-spark/src/main/resources/pyspark_utils/init_spark_context.py x: 1 # contributors y: 3 lines of code site/source/sitemap.xml.builder x: 4 # contributors y: 14 lines of code site/source/stylesheets/all.less x: 4 # contributors y: 53 lines of code datafu-pig/src/main/resources/datafu/count_macros.pig x: 3 # contributors y: 38 lines of code datafu-pig/src/main/resources/datafu/left_outer_join.pig x: 2 # contributors y: 37 lines of code datafu-pig/src/main/resources/datafu/dedup.pig x: 1 # contributors y: 34 lines of code datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java x: 5 # contributors y: 152 lines of code buildSrc/src/main/groovy/datafu/autojar/GradleAutojarPlugin.groovy x: 1 # contributors y: 14 lines of code buildSrc/src/main/groovy/datafu/autojar/task/ExtractAutojar.groovy x: 1 # contributors y: 29 lines of code site/source/layouts/_header.erb x: 3 # contributors y: 24 lines of code datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroMultipleInputsUtil.java x: 3 # contributors y: 78 lines of code datafu-pig/src/main/java/datafu/pig/util/ContextualEvalFunc.java x: 3 # contributors y: 44 lines of code datafu-pig/src/main/java/datafu/pig/util/SimpleEvalFunc.java x: 2 # contributors y: 101 lines of code datafu-pig/src/main/resources/datafu/tf_idf.pig x: 1 # contributors y: 84 lines of code datafu-pig/src/main/java/datafu/pig/sessions/SessionCount.java x: 2 # contributors y: 48 lines of code datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java x: 3 # contributors y: 107 lines of code datafu-pig/src/main/java/datafu/pig/util/AliasableEvalFunc.java x: 5 # contributors y: 160 lines 
of code datafu-pig/src/main/java/datafu/pig/bags/TupleFromBag.java x: 3 # contributors y: 62 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractJob.java x: 3 # contributors y: 257 lines of code datafu-pig/src/main/java/datafu/pig/bags/FirstTupleFromBag.java x: 2 # contributors y: 50 lines of code datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java x: 2 # contributors y: 55 lines of code datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java x: 2 # contributors y: 106 lines of code datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java x: 2 # contributors y: 74 lines of code site/lib/pig.rb x: 2 # contributors y: 54 lines of code datafu-hourglass/find_dupes.rb x: 1 # contributors y: 9 lines of code datafu-hourglass/src/main/java/datafu/hourglass/fs/PathUtils.java x: 3 # contributors y: 192 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DistributedCacheHelper.java x: 3 # contributors y: 56 lines of code datafu-hourglass/overview.html x: 2 # contributors y: 3 lines of code site/source/blog/index.html.erb x: 2 # contributors y: 36 lines of code site/source/javascripts/all.js x: 2 # contributors y: 1 lines of code site/source/layouts/docs.erb x: 2 # contributors y: 33 lines of code site/source/stylesheets/highlight.css.erb x: 2 # contributors y: 19 lines of code gradle/resources/rat-output-to-html.xsl x: 1 # contributors y: 153 lines of code datafu-pig/src/main/java/datafu/pig/bags/ZipBags.java x: 1 # contributors y: 58 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/p_stable/L2LSH.java x: 4 # contributors y: 18 lines of code datafu-pig/src/main/java/datafu/pig/sampling/SampleByKey.java x: 4 # contributors y: 58 lines of code datafu-pig/src/main/java/datafu/pig/sampling/WeightedReservoirSample.java x: 3 # contributors y: 181 lines of code datafu-pig/src/main/java/datafu/pig/util/InUDF.java x: 3 # contributors y: 19 lines of code 
datafu-hourglass/src/main/java/datafu/hourglass/fs/DateRange.java x: 2 # contributors y: 20 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractNonIncrementalJob.java x: 3 # contributors y: 206 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java x: 3 # contributors y: 349 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java x: 3 # contributors y: 387 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/DateRangeConfigurable.java x: 2 # contributors y: 6 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/ExecutionPlanner.java x: 3 # contributors y: 234 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/FileCleaner.java x: 3 # contributors y: 45 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/IncrementalJob.java x: 2 # contributors y: 86 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingExecutionPlanner.java x: 3 # contributors y: 315 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingIncrementalJob.java x: 3 # contributors y: 109 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingExecutionPlanner.java x: 3 # contributors y: 156 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingIncrementalJob.java x: 3 # contributors y: 88 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/ReduceEstimator.java x: 2 # contributors y: 130 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/TimeBasedJob.java x: 2 # contributors y: 99 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/TimePartitioner.java x: 2 # contributors y: 78 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/AvroKeyValueIdentityMapper.java x: 2 # contributors y: 17 lines of code 
datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingCombiner.java x: 3 # contributors y: 152 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingMapper.java x: 3 # contributors y: 202 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingReducer.java x: 3 # contributors y: 222 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DelegatingCombiner.java x: 2 # contributors y: 32 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/ObjectReducer.java x: 2 # contributors y: 8 lines of code datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/PartitioningMapper.java x: 3 # contributors y: 106 lines of code datafu-hourglass/src/main/java/datafu/hourglass/model/KeyValueCollector.java x: 3 # contributors y: 6 lines of code datafu-hourglass/src/main/java/datafu/hourglass/schemas/PartitionCollapsingSchemas.java x: 2 # contributors y: 160 lines of code datafu-hourglass/src/main/java/datafu/hourglass/schemas/PartitionPreservingSchemas.java x: 2 # contributors y: 114 lines of code datafu-hourglass/src/main/java/datafu/hourglass/schemas/TaskSchemas.java x: 2 # contributors y: 63 lines of code datafu-pig/src/main/java/datafu/pig/bags/BagConcat.java x: 2 # contributors y: 94 lines of code datafu-pig/src/main/java/datafu/pig/bags/BagGroup.java x: 5 # contributors y: 117 lines of code datafu-pig/src/main/java/datafu/pig/bags/BagJoin.java x: 3 # contributors y: 219 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces/LSH.java x: 3 # contributors y: 16 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/interfaces/LSHCreator.java x: 3 # contributors y: 61 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/metric/Cosine.java x: 3 # contributors y: 14 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/metric/MetricUDF.java x: 3 # contributors y: 95 lines of code datafu-pig/src/main/java/datafu/pig/hash/lsh/util/DataTypeUtil.java x: 
3 # contributors y: 126 lines of code datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java x: 2 # contributors y: 230 lines of code datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java x: 3 # contributors y: 319 lines of code datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java x: 2 # contributors y: 143 lines of code datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementVote.java x: 3 # contributors y: 121 lines of code datafu-pig/src/main/java/datafu/pig/sets/SetDifference.java x: 3 # contributors y: 140 lines of code datafu-pig/src/main/java/datafu/pig/util/FieldNotFound.java x: 2 # contributors y: 12 lines of code datafu-pig/src/main/java/datafu/pig/util/SelectStringFieldByName.java x: 1 # contributors y: 32 lines of code datafu-pig/src/main/java/datafu/pig/urls/URLInfo.java x: 1 # contributors y: 102 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/StagedOutputJob.java x: 2 # contributors y: 492 lines of code datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java x: 2 # contributors y: 98 lines of code datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java x: 2 # contributors y: 88 lines of code datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java x: 2 # contributors y: 80 lines of code datafu-pig/src/main/java/datafu/pig/geo/HaversineDistInMiles.java x: 2 # contributors y: 25 lines of code datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java x: 2 # contributors y: 312 lines of code datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRankImpl.java x: 2 # contributors y: 391 lines of code datafu-pig/src/main/java/datafu/pig/stats/Quantile.java x: 2 # contributors y: 103 lines of code datafu-pig/src/main/java/datafu/pig/stats/QuantileUtil.java x: 2 # contributors y: 44 lines of code datafu-pig/src/main/java/datafu/pig/stats/StreamingQuantile.java x: 2 # contributors y: 276 lines of code 
datafu-pig/src/main/java/datafu/pig/stats/VAR.java x: 2 # contributors y: 300 lines of code datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java x: 2 # contributors y: 117 lines of code datafu-pig/src/main/java/datafu/pig/util/AssertUDF.java x: 2 # contributors y: 23 lines of code datafu-pig/src/main/java/datafu/pig/util/Coalesce.java x: 2 # contributors y: 128 lines of code datafu-pig/src/main/java/datafu/pig/util/DataFuException.java x: 2 # contributors y: 68 lines of code datafu-pig/src/main/java/datafu/pig/util/TransposeTupleToBag.java x: 2 # contributors y: 61 lines of code datafu-pig/src/main/java/datafu/pig/util/Base64Decode.java x: 2 # contributors y: 14 lines of code datafu-pig/src/main/java/datafu/pig/hash/SHA.java x: 2 # contributors y: 28 lines of code datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroKeyValueWithMetadataOutputFormat.java x: 1 # contributors y: 21 lines of code datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroKeyValueWithMetadataRecordWriter.java x: 1 # contributors y: 55 lines of code datafu-hourglass/src/main/java/datafu/hourglass/avro/AvroMultipleInputsKeyInputFormat.java x: 1 # contributors y: 24 lines of code datafu-hourglass/src/main/java/datafu/hourglass/jobs/MaxInputDataExceededException.java x: 1 # contributors y: 11 lines of code datafu-pig/src/main/java/datafu/pig/bags/AppendToBag.java x: 1 # contributors y: 27 lines of code datafu-pig/src/main/java/datafu/pig/bags/BagSplit.java x: 1 # contributors y: 94 lines of code datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNull.java x: 1 # contributors y: 35 lines of code datafu-pig/src/main/java/datafu/pig/linkanalysis/ProgressIndicator.java x: 1 # contributors y: 5 lines of code datafu-pig/src/main/java/datafu/pig/sampling/ScoredTuple.java x: 1 # contributors y: 71 lines of code datafu-pig/src/main/java/datafu/pig/sampling/WeightedSample.java x: 1 # contributors y: 104 lines of code datafu-pig/src/main/java/datafu/pig/sets/SetIntersect.java x: 1 # 
contributors y: 75 lines of code datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java x: 1 # contributors y: 288 lines of code datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java x: 1 # contributors y: 289 lines of code datafu-pig/src/main/java/datafu/pig/stats/MarkovPairs.java x: 1 # contributors y: 92 lines of code datafu-pig/src/main/java/datafu/pig/stats/entropy/ChaoShenEntropyEstimator.java x: 1 # contributors y: 134 lines of code datafu-pig/src/main/java/datafu/pig/text/opennlp/CachedFile.java x: 1 # contributors y: 18 lines of code site/source/stylesheets/bootstrap-theme.css x: 1 # contributors y: 340 lines of code
y-axis, lines of code: min: 1.0 | average: 76.26 | 25th percentile: 16.5 | median: 42.0 | 75th percentile: 102.5 | max: 492.0
x-axis, # contributors: min: 1.0 | average: 2.18 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 3.0 | max: 9.0