apache / incubator-stormcrawler
File Change Frequency

File change frequency (churn) shows how file updates are distributed, where one update is a day with at least one commit to the file.
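As a minimal sketch of this definition, the snippet below counts, for each file, the number of distinct days on which at least one commit touched it. It assumes the history has been exported with `git log --name-only --date=short --pretty=format:"@%ad"`; the `@` prefix, the function name `change_days`, and the sample paths are illustrative assumptions, not taken from the report or the tool that generated it.

```python
from collections import defaultdict


def change_days(log_text: str) -> dict[str, int]:
    """Count, per file, the distinct days with at least one commit.

    Expects one "@YYYY-MM-DD" line per commit, followed by the paths
    that commit touched (the shape produced by the git command above).
    """
    days_by_file = defaultdict(set)
    current_day = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between commits
        if line.startswith("@"):
            current_day = line[1:]  # start of a new commit
        elif current_day is not None:
            days_by_file[line].add(current_day)
    return {path: len(days) for path, days in days_by_file.items()}


# Hypothetical sample: two commits on the same day count as one change day.
sample = """@2024-05-21
conf/a.yaml
src/B.java

@2024-05-21
conf/a.yaml

@2024-03-28
conf/a.yaml
"""
counts = change_days(sample)
print(counts)
```

Counting days rather than commits means that a burst of commits on one day registers as a single update.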

Overview
File Change Frequency Overall
  • There are 110 files with 10,808 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 2 files changed 51-100 times (147 lines of code)
    • 0 files changed 21-50 times (0 lines of code)
    • 3 files changed 6-20 times (520 lines of code)
    • 105 files changed 1-5 times (10,141 lines of code)
Changes | 101+ | 51-100 | 21-50 | 6-20 | 1-5
Lines of code | 0% | 1% | 0% | 4% | 93%
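The percentage breakdown above is consistent with each bucket's share of the total lines of code, rounded down, with non-zero shares below one percent shown as `<1%`. The sketch below reproduces it under that assumption; the rounding rule is inferred from the published numbers rather than documented, and the function names and the aggregated stand-in input are hypothetical.

```python
# Bucket boundaries as listed in the report legend (None = open-ended).
BUCKETS = [("101+", 101, None), ("51-100", 51, 100), ("21-50", 21, 50),
           ("6-20", 6, 20), ("1-5", 1, 5)]


def bucket_label(changes: int) -> str:
    """Return the legend label for a file with the given number of changes."""
    for label, low, high in BUCKETS:
        if changes >= low and (high is None or changes <= high):
            return label
    raise ValueError(f"no bucket for {changes} changes")


def share(loc: int, total: int) -> str:
    """Format a share of lines of code, rounding down (assumed rule)."""
    if loc == 0:
        return "0%"
    pct = loc * 100 // total
    return f"{pct}%" if pct >= 1 else "<1%"


def distribution(files) -> str:
    """files: iterable of (number of changes, lines of code) pairs."""
    loc_by_bucket = {label: 0 for label, _, _ in BUCKETS}
    for changes, loc in files:
        loc_by_bucket[bucket_label(changes)] += loc
    total = sum(loc_by_bucket.values())
    return " | ".join(share(loc_by_bucket[label], total)
                      for label, _, _ in BUCKETS)


# Stand-in input: the two most-changed files, then one aggregated
# pseudo-file for each of the remaining non-empty buckets.
files = [(83, 85), (52, 62), (13, 520), (1, 10141)]
print(distribution(files))  # 0% | 1% | 0% | 4% | 93%
```

The same `share` helper applied to the contributor buckets (85, 62, 10,658, and 3 lines out of 10,808) yields `<1%`, `<1%`, `98%`, and `<1%`.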

Contributors Count Frequency Overall
  • There are 110 files with 10,808 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 1 file changed by 11-25 contributors (85 lines of code)
    • 1 file changed by 6-10 contributors (62 lines of code)
    • 107 files changed by 2-5 contributors (10,658 lines of code)
    • 1 file changed by 1 contributor (3 lines of code)
Contributors | 26+ | 11-25 | 6-10 | 2-5 | 1
Lines of code | 0% | <1% | <1% | 98% | <1%

File Change Frequency per File Extension
java, xml, yaml, md, json, txt, html, sh, flux, gitignore, groovy, gitattributes, rss, properties
File Change Frequency per Extension
The number of recorded file updates
Extension | 101+ | 51-100 | 21-50 | 6-20 | 1-5
yaml | 0% | 100% | 0% | 0% | 0%
java | 0% | 0% | 0% | 3% | 96%
flux | 0% | 0% | 0% | 100% | 0%
xml | 0% | 0% | 0% | 92% | 7%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
The number of recorded file updates
Component | 101+ | 51-100 | 21-50 | 6-20 | 1-5
core | 0% | <1% | 0% | 3% | 95%
archetype | 0% | 28% | 0% | 70% | 1%
Most Frequently Changed Files (Top 50)

File (name, then folder) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
crawler-default.yaml
in core/src/main/resources
85 - 2015-10-15 2024-05-21 83 11 julien@digitalpebble.com rzo1@apache.org
crawler-conf.yaml
in archetype/src/main/resources/archetype-resources
62 - 2015-12-11 2024-05-21 52 7 julien@digitalpebble.com rzo1@apache.org
crawler.flux
in archetype/src/main/resources/archetype-resources
115 - 2016-06-16 2024-03-28 13 2 julien@digitalpebble.com julien@digitalpebble.com
archetype-metadata.xml
in archetype/src/main/resources/META-INF/maven
37 - 2015-12-11 2024-11-13 11 3 julien@digitalpebble.com mvolikas@gmail.com
JSoupParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
368 7 2024-03-28 2024-11-22 6 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
DefaultScheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
138 6 2024-03-28 2024-12-02 4 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
FastURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
245 12 2024-03-28 2024-12-02 4 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SiteMapParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
287 7 2024-03-28 2024-09-10 4 4 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
ParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse
13 2 2024-03-28 2024-10-10 3 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
SitemapFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/sitemap
49 1 2024-03-28 2025-02-28 3 3 13417392+rzo1@users.noreply... tallison@apache.org
URLUtil.java
in core/src/main/java/org/apache/stormcrawler/util
100 8 2024-03-28 2024-10-10 3 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
ParseFilters.java
in core/src/main/java/org/apache/stormcrawler/parse
134 10 2024-03-28 2024-09-10 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
CharsetIdentification.java
in core/src/main/java/org/apache/stormcrawler/util
157 8 2024-03-28 2024-11-22 3 2 julien@digitalpebble.com 13417392+rzo1@users.noreply...
FileSpout.java
in core/src/main/java/org/apache/stormcrawler/spout
169 14 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
FeedParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
184 5 2024-03-28 2024-09-10 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
HttpProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/httpclient
283 8 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SimpleFetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
397 7 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
HttpProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/okhttp
475 10 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
FetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
714 23 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
default-regex-normalizers.xml
in archetype/src/main/resources/archetype-resources/src/main/resources
3 - 2015-12-11 2015-12-17 2 1 julien@digitalpebble.com julien@digitalpebble.com
Scheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
23 1 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SingleProxyManager.java
in core/src/main/java/org/apache/stormcrawler/proxy
35 3 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
PerSecondReducer.java
in core/src/main/java/org/apache/stormcrawler/util
35 3 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
MimeTypeNormalization.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
36 1 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
ProtocolResponse.java
in core/src/main/java/org/apache/stormcrawler/protocol
37 3 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
URLBuffer.java
in core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer
42 4 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
BasicURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
58 3 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SeleniumProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/selenium
68 4 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
MetadataTransfer.java
in core/src/main/java/org/apache/stormcrawler/util
84 5 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
CookieConverter.java
in core/src/main/java/org/apache/stormcrawler/util
88 2 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
FileResponse.java
in core/src/main/java/org/apache/stormcrawler/protocol/file
107 5 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
RobotRulesParser.java
in core/src/main/java/org/apache/stormcrawler/protocol
111 6 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
JSoupFilters.java
in core/src/main/java/org/apache/stormcrawler/parse
112 8 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
AdaptiveScheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
124 2 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
InitialisationUtil.java
in core/src/main/java/org/apache/stormcrawler/util
125 10 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
CollectionTagger.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
125 10 2024-03-28 2024-10-10 2 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
MultiProxyManager.java
in core/src/main/java/org/apache/stormcrawler/proxy
137 9 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
TextExtractor.java
in core/src/main/java/org/apache/stormcrawler/parse
155 7 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
AbstractStatusUpdaterBolt.java
in core/src/main/java/org/apache/stormcrawler/persistence
166 5 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
RegexURLNormalizer.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
186 6 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
BasicURLNormalizer.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
297 8 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
EmptyQueueListener.java
in core/src/main/java/org/apache/stormcrawler/persistence
5 - 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ProxyManager.java
in core/src/main/java/org/apache/stormcrawler/proxy
7 - 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
JSoupFilter.java
in core/src/main/java/org/apache/stormcrawler/parse
10 - 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RegexRule.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
11 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
NavigationFilter.java
in core/src/main/java/org/apache/stormcrawler/protocol/selenium
11 - 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
URLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering
13 - 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Status.java
in core/src/main/java/org/apache/stormcrawler/persistence
14 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
JSONResource.java
in core/src/main/java/org/apache/stormcrawler
16 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
AbstractConfigurable.java
in core/src/main/java/org/apache/stormcrawler/util
17 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Files With Most Contributors (Top 50)
Based on the number of unique email addresses found in commits.
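A minimal sketch of this counting rule follows: it collects the set of distinct commit email addresses per file. The input shape, the function name `contributors_per_file`, the `example.org` addresses, and the lower-casing step are assumptions made for illustration; the report does not describe how addresses are normalized.

```python
from collections import defaultdict


def contributors_per_file(commits) -> dict[str, int]:
    """Count unique author email addresses per file.

    commits: iterable of (author_email, [paths touched]) pairs.
    """
    emails_by_file = defaultdict(set)
    for email, paths in commits:
        key = email.strip().lower()  # assumed normalization
        for path in paths:
            emails_by_file[path].add(key)
    return {path: len(emails) for path, emails in emails_by_file.items()}


# Hypothetical history: the same person committing under two addresses
# is counted as two contributors under this rule.
commits = [
    ("alice@example.org", ["conf/a.yaml", "src/B.java"]),
    ("alice@users.example.org", ["conf/a.yaml"]),
    ("bob@example.org", ["conf/a.yaml"]),
    ("Bob@example.org", ["src/B.java"]),
]
contributor_counts = contributors_per_file(commits)
print(contributor_counts)
```

Because the count is keyed on addresses rather than people, a contributor who switches email address can inflate the number for a file.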

File (name, then folder) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
crawler-default.yaml
in core/src/main/resources
85 - 2015-10-15 2024-05-21 83 11 julien@digitalpebble.com rzo1@apache.org
crawler-conf.yaml
in archetype/src/main/resources/archetype-resources
62 - 2015-12-11 2024-05-21 52 7 julien@digitalpebble.com rzo1@apache.org
SiteMapParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
287 7 2024-03-28 2024-09-10 4 4 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
archetype-metadata.xml
in archetype/src/main/resources/META-INF/maven
37 - 2015-12-11 2024-11-13 11 3 julien@digitalpebble.com mvolikas@gmail.com
JSoupParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
368 7 2024-03-28 2024-11-22 6 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
DefaultScheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
138 6 2024-03-28 2024-12-02 4 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
FastURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
245 12 2024-03-28 2024-12-02 4 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
ParseFilters.java
in core/src/main/java/org/apache/stormcrawler/parse
134 10 2024-03-28 2024-09-10 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
ParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse
13 2 2024-03-28 2024-10-10 3 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
URLUtil.java
in core/src/main/java/org/apache/stormcrawler/util
100 8 2024-03-28 2024-10-10 3 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
FeedParserBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
184 5 2024-03-28 2024-09-10 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
HttpProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/okhttp
475 10 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
HttpProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/httpclient
283 8 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
FileSpout.java
in core/src/main/java/org/apache/stormcrawler/spout
169 14 2024-03-28 2024-05-21 3 3 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SitemapFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/sitemap
49 1 2024-03-28 2025-02-28 3 3 13417392+rzo1@users.noreply... tallison@apache.org
CollectionTagger.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
125 10 2024-03-28 2024-10-10 2 3 13417392+rzo1@users.noreply... psxjoy@outlook.com
MimeTypeNormalization.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
36 1 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
JSoupFilters.java
in core/src/main/java/org/apache/stormcrawler/parse
112 8 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
TextExtractor.java
in core/src/main/java/org/apache/stormcrawler/parse
155 7 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
AdaptiveScheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
124 2 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
CookieConverter.java
in core/src/main/java/org/apache/stormcrawler/util
88 2 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
PerSecondReducer.java
in core/src/main/java/org/apache/stormcrawler/util
35 3 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
MultiProxyManager.java
in core/src/main/java/org/apache/stormcrawler/proxy
137 9 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
SingleProxyManager.java
in core/src/main/java/org/apache/stormcrawler/proxy
35 3 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
FileResponse.java
in core/src/main/java/org/apache/stormcrawler/protocol/file
107 5 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
SeleniumProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/selenium
68 4 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
RobotRulesParser.java
in core/src/main/java/org/apache/stormcrawler/protocol
111 6 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
RegexURLNormalizer.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
186 6 2024-03-28 2024-05-03 2 3 13417392+rzo1@users.noreply... tallison314159@gmail.com
crawler.flux
in archetype/src/main/resources/archetype-resources
115 - 2016-06-16 2024-03-28 13 2 julien@digitalpebble.com julien@digitalpebble.com
CharsetIdentification.java
in core/src/main/java/org/apache/stormcrawler/util
157 8 2024-03-28 2024-11-22 3 2 julien@digitalpebble.com 13417392+rzo1@users.noreply...
FetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
714 23 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SimpleFetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
397 7 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
Scheduler.java
in core/src/main/java/org/apache/stormcrawler/persistence
23 1 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
URLBuffer.java
in core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer
42 4 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
AbstractStatusUpdaterBolt.java
in core/src/main/java/org/apache/stormcrawler/persistence
166 5 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
InitialisationUtil.java
in core/src/main/java/org/apache/stormcrawler/util
125 10 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
MetadataTransfer.java
in core/src/main/java/org/apache/stormcrawler/util
84 5 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
ProtocolResponse.java
in core/src/main/java/org/apache/stormcrawler/protocol
37 3 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
BasicURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
58 3 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
BasicURLNormalizer.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
297 8 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
StdOutIndexer.java
in core/src/main/java/org/apache/stormcrawler/indexing
54 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
AbstractIndexerBolt.java
in core/src/main/java/org/apache/stormcrawler/indexing
208 15 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
DummyIndexer.java
in core/src/main/java/org/apache/stormcrawler/indexing
27 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
MD5SignatureParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
57 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
XPathFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
175 7 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
DebugParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
39 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LDJsonParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
87 6 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
DomainParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
36 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LinkParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
76 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
CommaSeparatedToMultivaluedMetadata.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
42 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Files With Fewest Contributors (Top 50)
Based on the number of unique email addresses found in commits.

File (name, then folder) | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
default-regex-normalizers.xml
in archetype/src/main/resources/archetype-resources/src/main/resources
3 - 2015-12-11 2015-12-17 2 1 julien@digitalpebble.com julien@digitalpebble.com
FetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
714 23 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
SimpleFetcherBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
397 7 2024-03-28 2024-11-22 3 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
BasicURLNormalizer.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
297 8 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
AbstractIndexerBolt.java
in core/src/main/java/org/apache/stormcrawler/indexing
208 15 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
DelegatorProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol
192 12 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
XPathFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
175 7 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Metadata.java
in core/src/main/java/org/apache/stormcrawler
172 23 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
AbstractStatusUpdaterBolt.java
in core/src/main/java/org/apache/stormcrawler/persistence
166 5 2024-03-28 2024-11-22 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
AbstractQueryingSpout.java
in core/src/main/java/org/apache/stormcrawler/persistence
166 15 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
CharsetIdentification.java
in core/src/main/java/org/apache/stormcrawler/util
157 8 2024-03-28 2024-11-22 3 2 julien@digitalpebble.com 13417392+rzo1@users.noreply...
HttpRobotRulesParser.java
in core/src/main/java/org/apache/stormcrawler/protocol
151 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
URLFilters.java
in core/src/main/java/org/apache/stormcrawler/filtering
144 7 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ConfUtils.java
in core/src/main/java/org/apache/stormcrawler/util
138 15 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
AbstractHttpProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol
127 9 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
InitialisationUtil.java
in core/src/main/java/org/apache/stormcrawler/util
125 10 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
MemorySpout.java
in core/src/main/java/org/apache/stormcrawler/spout
122 10 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
crawler.flux
in archetype/src/main/resources/archetype-resources
115 - 2016-06-16 2024-03-28 13 2 julien@digitalpebble.com julien@digitalpebble.com
URLPartitionerBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
115 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
SchedulingURLBuffer.java
in core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer
112 6 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RegexURLFilterBase.java
in core/src/main/java/org/apache/stormcrawler/filtering/regex
107 4 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Protocol.java
in core/src/main/java/org/apache/stormcrawler/protocol
105 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
SCProxy.java
in core/src/main/java/org/apache/stormcrawler/proxy
104 15 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RobotsTags.java
in core/src/main/java/org/apache/stormcrawler/util
101 10 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RemoteDriverProtocol.java
in core/src/main/java/org/apache/stormcrawler/protocol/selenium
90 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LDJsonParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
87 6 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
XPathFilter.java
in core/src/main/java/org/apache/stormcrawler/jsoup
87 6 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
MetadataTransfer.java
in core/src/main/java/org/apache/stormcrawler/util
84 5 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
HostURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/host
80 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ParseResult.java
in core/src/main/java/org/apache/stormcrawler/parse
79 13 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LDJsonParseFilter.java
in core/src/main/java/org/apache/stormcrawler/jsoup
79 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
URLFilterBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
78 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LinkParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
76 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
LinkParseFilter.java
in core/src/main/java/org/apache/stormcrawler/jsoup
74 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
PriorityURLBuffer.java
in core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer
73 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
StatusEmitterBolt.java
in core/src/main/java/org/apache/stormcrawler/bolt
73 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
URLPartitioner.java
in core/src/main/java/org/apache/stormcrawler/util
71 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ProtocolFactory.java
in core/src/main/java/org/apache/stormcrawler/protocol
71 5 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
DocumentFragmentBuilder.java
in core/src/main/java/org/apache/stormcrawler/parse
69 6 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ConfigurableTopology.java
in core/src/main/java/org/apache/stormcrawler
66 4 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
AbstractURLBuffer.java
in core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer
64 7 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
NavigationFilters.java
in core/src/main/java/org/apache/stormcrawler/protocol/selenium
64 4 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RobotRules.java
in core/src/main/java/org/apache/stormcrawler/protocol
60 13 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
ConfigurableHelper.java
in core/src/main/java/org/apache/stormcrawler/util
58 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
BasicURLFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/basic
58 3 2024-03-28 2024-05-21 2 2 13417392+rzo1@users.noreply... 13417392+rzo1@users.noreply...
MD5SignatureParseFilter.java
in core/src/main/java/org/apache/stormcrawler/parse/filter
57 2 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
StdOutIndexer.java
in core/src/main/java/org/apache/stormcrawler/indexing
54 3 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
URLStreamGrouping.java
in core/src/main/java/org/apache/stormcrawler/util
54 4 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
RobotsFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/robots
51 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
MetadataFilter.java
in core/src/main/java/org/apache/stormcrawler/filtering/metadata
49 1 2024-03-28 2024-03-28 1 2 13417392+rzo1@users.noreply... julien@digitalpebble.com
Correlations

File Size vs. Number of Changes: 110 points

File | x: lines of code | y: # changes
core/src/main/java/org/apache/stormcrawler/filtering/sitemap/SitemapFilter.java | 49 | 3
core/src/main/java/org/apache/stormcrawler/filtering/regex/FastURLFilter.java | 245 | 4
core/src/main/java/org/apache/stormcrawler/persistence/DefaultScheduler.java | 138 | 4
core/src/main/java/org/apache/stormcrawler/bolt/FetcherBolt.java | 714 | 3
core/src/main/java/org/apache/stormcrawler/bolt/JSoupParserBolt.java | 368 | 6
core/src/main/java/org/apache/stormcrawler/bolt/SimpleFetcherBolt.java | 397 | 3
core/src/main/java/org/apache/stormcrawler/filtering/basic/BasicURLNormalizer.java | 297 | 2
core/src/main/java/org/apache/stormcrawler/persistence/AbstractStatusUpdaterBolt.java | 166 | 2
core/src/main/java/org/apache/stormcrawler/protocol/ProtocolResponse.java | 37 | 2
core/src/main/java/org/apache/stormcrawler/util/CharsetIdentification.java | 157 | 3
archetype/src/main/resources/META-INF/maven/archetype-metadata.xml | 37 | 11
core/src/main/java/org/apache/stormcrawler/parse/filter/CollectionTagger.java | 125 | 2
core/src/main/java/org/apache/stormcrawler/parse/ParseFilter.java | 13 | 3
core/src/main/java/org/apache/stormcrawler/util/URLUtil.java | 100 | 3
core/src/main/java/org/apache/stormcrawler/bolt/FeedParserBolt.java | 184 | 3
core/src/main/java/org/apache/stormcrawler/bolt/SiteMapParserBolt.java | 287 | 4
core/src/main/java/org/apache/stormcrawler/parse/ParseFilters.java | 134 | 3
core/src/main/java/org/apache/stormcrawler/protocol/httpclient/HttpProtocol.java | 283 | 3
core/src/main/java/org/apache/stormcrawler/protocol/okhttp/HttpProtocol.java | 475 | 3
core/src/main/java/org/apache/stormcrawler/spout/FileSpout.java | 169 | 3
core/src/main/java/org/apache/stormcrawler/filtering/basic/BasicURLFilter.java | 58 | 2
core/src/main/java/org/apache/stormcrawler/persistence/Scheduler.java | 23 | 2
core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer/URLBuffer.java | 42 | 2
core/src/main/java/org/apache/stormcrawler/util/MetadataTransfer.java | 84 | 2
core/src/main/resources/crawler-default.yaml | 85 | 83
archetype/src/main/resources/archetype-resources/crawler-conf.yaml | 62 | 52
core/src/main/java/org/apache/stormcrawler/filtering/regex/RegexURLNormalizer.java | 186 | 2
core/src/main/java/org/apache/stormcrawler/parse/JSoupFilters.java | 112 | 2
core/src/main/java/org/apache/stormcrawler/parse/TextExtractor.java | 155 | 2
core/src/main/java/org/apache/stormcrawler/parse/filter/MimeTypeNormalization.java | 36 | 2
core/src/main/java/org/apache/stormcrawler/protocol/file/FileResponse.java | 107 | 2
core/src/main/java/org/apache/stormcrawler/protocol/selenium/SeleniumProtocol.java | 68 | 2
core/src/main/java/org/apache/stormcrawler/proxy/MultiProxyManager.java | 137 | 2
core/src/main/java/org/apache/stormcrawler/util/CookieConverter.java | 88 | 2
core/src/main/java/org/apache/stormcrawler/ConfigurableTopology.java | 66 | 1
core/src/main/java/org/apache/stormcrawler/Constants.java | 19 | 1
core/src/main/java/org/apache/stormcrawler/JSONResource.java | 16 | 1
core/src/main/java/org/apache/stormcrawler/Metadata.java | 172 | 1
core/src/main/java/org/apache/stormcrawler/bolt/StatusEmitterBolt.java | 73 | 1
core/src/main/java/org/apache/stormcrawler/bolt/URLFilterBolt.java | 78 | 1
core/src/main/java/org/apache/stormcrawler/bolt/URLPartitionerBolt.java | 115 | 1
core/src/main/java/org/apache/stormcrawler/filtering/URLFilter.java | 13 | 1
core/src/main/java/org/apache/stormcrawler/filtering/URLFilters.java | 144 | 1
core/src/main/java/org/apache/stormcrawler/filtering/basic/SelfURLFilter.java | 23 | 1
core/src/main/java/org/apache/stormcrawler/filtering/depth/MaxDepthFilter.java | 49 | 1
core/src/main/java/org/apache/stormcrawler/filtering/host/HostURLFilter.java | 80 | 1
core/src/main/java/org/apache/stormcrawler/filtering/regex/RegexRule.java | 11 | 1
core/src/main/java/org/apache/stormcrawler/filtering/regex/RegexURLFilterBase.java | 107 | 1
core/src/main/java/org/apache/stormcrawler/filtering/robots/RobotsFilter.java | 51 | 1
core/src/main/java/org/apache/stormcrawler/indexing/AbstractIndexerBolt.java | 208 | 1
core/src/main/java/org/apache/stormcrawler/indexing/DummyIndexer.java | 27 | 1
core/src/main/java/org/apache/stormcrawler/indexing/StdOutIndexer.java | 54 | 1
core/src/main/java/org/apache/stormcrawler/jsoup/LDJsonParseFilter.java | 79 | 1
core/src/main/java/org/apache/stormcrawler/jsoup/LinkParseFilter.java | 74 | 1
core/src/main/java/org/apache/stormcrawler/jsoup/XPathFilter.java | 87 | 1
core/src/main/java/org/apache/stormcrawler/parse/DocumentFragmentBuilder.java | 69 | 1
core/src/main/java/org/apache/stormcrawler/parse/JSoupFilter.java | 10 | 1
core/src/main/java/org/apache/stormcrawler/parse/Outlink.java | 35 | 1
core/src/main/java/org/apache/stormcrawler/parse/ParseData.java | 45 | 1
core/src/main/java/org/apache/stormcrawler/parse/filter/CommaSeparatedToMultivaluedMetadata.java | 42 | 1
core/src/main/java/org/apache/stormcrawler/parse/filter/DebugParseFilter.java | 39 | 1
core/src/main/java/org/apache/stormcrawler/parse/filter/LinkParseFilter.java | 76 | 1
core/src/main/java/org/apache/stormcrawler/parse/filter/MD5SignatureParseFilter.java | 57 | 1
core/src/main/java/org/apache/stormcrawler/parse/filter/XPathFilter.java | 175 | 1
core/src/main/java/org/apache/stormcrawler/persistence/AbstractQueryingSpout.java | 166 | 1
core/src/main/java/org/apache/stormcrawler/persistence/EmptyQueueListener.java | 5 | 1
core/src/main/java/org/apache/stormcrawler/persistence/MemoryStatusUpdater.java | 17 | 1
core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer/AbstractURLBuffer.java | 64 | 1
core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer/SchedulingURLBuffer.java | 112 | 1
core/src/main/java/org/apache/stormcrawler/persistence/urlbuffer/SimpleURLBuffer.java | 43 | 1
core/src/main/java/org/apache/stormcrawler/protocol/AbstractHttpProtocol.java | 127 | 1
core/src/main/java/org/apache/stormcrawler/protocol/DelegatorProtocol.java | 192 | 1
core/src/main/java/org/apache/stormcrawler/protocol/HttpHeaders.java | 33 | 1
core/src/main/java/org/apache/stormcrawler/protocol/HttpRobotRulesParser.java | 151 | 1
core/src/main/java/org/apache/stormcrawler/protocol/Protocol.java | 105 | 1
core/src/main/java/org/apache/stormcrawler/protocol/ProtocolFactory.java | 71 | 1
core/src/main/java/org/apache/stormcrawler/protocol/RobotRules.java | 60 | 1
core/src/main/java/org/apache/stormcrawler/protocol/file/FileProtocol.java | 32 | 1
core/src/main/java/org/apache/stormcrawler/protocol/selenium/RemoteDriverProtocol.java | 90 | 1
core/src/main/java/org/apache/stormcrawler/proxy/ProxyManager.java | 7 | 1
core/src/main/java/org/apache/stormcrawler/proxy/SCProxy.java | 104 | 1
core/src/main/java/org/apache/stormcrawler/spout/MemorySpout.java | 122 | 1
core/src/main/java/org/apache/stormcrawler/util/CollectionMetric.java | 20 | 1
core/src/main/java/org/apache/stormcrawler/util/ConfUtils.java | 138 | 1
core/src/main/java/org/apache/stormcrawler/util/Configurable.java | 41 | 1
core/src/main/java/org/apache/stormcrawler/util/ConfigurableHelper.java | 58 | 1
core/src/main/java/org/apache/stormcrawler/util/RobotsTags.java | 101 | 1
archetype/src/main/resources/archetype-resources/crawler.flux | 115 | 13
archetype/src/main/resources/archetype-resources/src/main/resources/default-regex-normalizers.xml | 3 | 2
83.0
# changes
  min: 1.0
  average: 2.94
  25th percentile: 1.0
  median: 1.0
  75th percentile: 2.0
  max: 83.0
0 714.0
lines of code
min: 3.0 | average: 98.25 | 25th percentile: 36.0 | median: 72.0 | 75th percentile: 125.0 | max: 714.0

Number of Contributors vs. Number of Changes: 110 points

[Scatter plot: # contributors (x-axis) vs. # changes (y-axis), one point per file. Top points: core/src/main/resources/crawler-default.yaml (11 contributors, 83 changes), archetype/src/main/resources/archetype-resources/crawler-conf.yaml (7 contributors, 52 changes), archetype/src/main/resources/archetype-resources/crawler.flux (2 contributors, 13 changes), archetype/src/main/resources/META-INF/maven/archetype-metadata.xml (3 contributors, 11 changes).]

# changes (y-axis): min: 1.0 | average: 2.94 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 2.0 | max: 83.0
# contributors (x-axis): min: 1.0 | average: 2.36 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 3.0 | max: 11.0

Number of Contributors vs. File Size: 110 points

[Scatter plot: # contributors (x-axis) vs. lines of code (y-axis), one point per file. Notable points: core/src/main/resources/crawler-default.yaml (11 contributors, 85 lines of code), archetype/src/main/resources/archetype-resources/crawler-conf.yaml (7 contributors, 62 lines of code), core/src/main/java/org/apache/stormcrawler/bolt/FetcherBolt.java (2 contributors, 714 lines of code), archetype/src/main/resources/archetype-resources/src/main/resources/default-regex-normalizers.xml (1 contributor, 3 lines of code).]

lines of code (y-axis): min: 3.0 | average: 98.25 | 25th percentile: 36.0 | median: 72.0 | 75th percentile: 125.0 | max: 714.0
# contributors (x-axis): min: 1.0 | average: 2.36 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 3.0 | max: 11.0