apache / nutch
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
Lines per file | Share
1001+ | 6%
501-1000 | 9%
201-500 | 37%
101-200 | 20%
1-100 | 26%
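As a rough illustration of how such a distribution could be computed, the sketch below assigns each file to one of the five size ranges above and reports each range's share. It assumes the shares are weighted by lines of code rather than by file count, which this page does not state. The function names and the sample line counts are hypothetical, not taken from the report.

```python
# Bucket boundaries from the size ranges above: (label, inclusive lower bound),
# ordered from largest to smallest so the first match wins.
BUCKETS = [
    ("1001+", 1001),
    ("501-1000", 501),
    ("201-500", 201),
    ("101-200", 101),
    ("1-100", 1),
]


def bucket_for(line_count):
    """Return the size-range label for a file with the given number of lines."""
    for label, lower in BUCKETS:
        if line_count >= lower:
            return label
    return None  # files with zero lines fall outside every range


def size_distribution(line_counts):
    """Share of total lines of code per range, as rounded percentages.

    Assumption: shares are weighted by lines of code, not by file count.
    """
    totals = {label: 0 for label, _ in BUCKETS}
    for n in line_counts:
        label = bucket_for(n)
        if label is not None:
            totals[label] += n
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0 for label in totals}
    return {label: round(100 * t / grand_total) for label, t in totals.items()}


# Hypothetical line counts, not the real Nutch data.
example = [1200, 600, 400, 300, 150, 50, 50]
print(size_distribution(example))
```

Because each share is rounded independently, the percentages need not add up to exactly 100, which would be consistent with the figures shown above.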


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
xml | 32% | 0% | 10% | 3% | 53%
java | 2% | 10% | 41% | 23% | 22%
html | 0% | 0% | 86% | 0% | 13%
xsd | 0% | 0% | 0% | 81% | 18%
xsl | 0% | 0% | 0% | 0% | 100%
rss | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
conf | 90% | 0% | 0% | 5% | 4%
src | 2% | 9% | 39% | 21% | 27%
ROOT | 0% | 0% | 100% | 0% | 0%
ivy | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | # lines | # units
2563 -
CrawlDbReader.java
in src/java/org/apache/nutch/crawl
1148 34
Generator.java
in src/java/org/apache/nutch/crawl
908 32
FetcherThread.java
in src/java/org/apache/nutch/fetcher
746 14
SegmentReader.java
in src/java/org/apache/nutch/segment
712 18
SegmentMerger.java
in src/java/org/apache/nutch/segment
661 12
CommonCrawlDataDumper.java
in src/java/org/apache/nutch/tools
525 12
WebGraph.java
in src/java/org/apache/nutch/scoring/webgraph
522 15
HttpResponse.java
in src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http
506 14
HttpBase.java
in src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api
505 33
LinkRank.java
in src/java/org/apache/nutch/scoring/webgraph
493 21
Injector.java
in src/java/org/apache/nutch/crawl
472 16
Fetcher.java
in src/java/org/apache/nutch/fetcher
461 14
CrawlDatum.java
in src/java/org/apache/nutch/crawl
456 38
HttpResponse.java
in src/plugin/protocol-htmlunit/src/java/org/apache/nutch/protocol/htmlunit
419 13
WARCExporter.java
in src/java/org/apache/nutch/tools/warc
416 8
SitemapProcessor.java
in src/java/org/apache/nutch/util
411 11
nutch.html
in src/plugin/parse-tika/sample
408 -
PluginRepository.java
in src/java/org/apache/nutch/plugin
404 18
LinkDb.java
in src/java/org/apache/nutch/crawl
398 12
HttpResponse.java
in src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium
397 12
ParseOutputFormat.java
in src/java/org/apache/nutch/parse
378 11
OpenSearch1xIndexWriter.java
in src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x
360 16
IndexingJob.java
in src/java/org/apache/nutch/indexer
358 12
Http.java
in src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient
357 11
HttpResponse.java
in src/plugin/protocol-selenium/src/java/org/apache/nutch/protocol/selenium
354 11
IndexerMapReduce.java
in src/java/org/apache/nutch/indexer
350 7
OkHttp.java
in src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp
340 10
FtpResponse.java
in src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp
329 6
UpdateHostDbReducer.java
in src/java/org/apache/nutch/hostdb
323 5
DeduplicationJob.java
in src/java/org/apache/nutch/crawl
317 11
DOMContentUtils.java
in src/plugin/parse-html/src/java/org/apache/nutch/parse/html
317 14
DOMContentUtils.java
in src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika
316 15
NodeDumper.java
in src/java/org/apache/nutch/scoring/webgraph
314 11
BasicURLNormalizer.java
in src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic
314 13
LinkDumper.java
in src/java/org/apache/nutch/scoring/webgraph
311 21
CrawlDb.java
in src/java/org/apache/nutch/crawl
308 11
CSVIndexWriter.java
in src/plugin/indexer-csv/src/java/org/apache/nutch/indexwriter/csv
295 19
RobotRulesParser.java
in src/java/org/apache/nutch/protocol
294 14
AbstractCommonCrawlFormat.java
in src/java/org/apache/nutch/tools
291 34
HtmlParser.java
in src/plugin/parse-html/src/java/org/apache/nutch/parse/html
291 8
SolrIndexWriter.java
in src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr
285 12
ElasticIndexWriter.java
in src/plugin/indexer-elastic/src/java/org/apache/nutch/indexwriter/elastic
282 13
plugin.xml
in src/plugin/lib-htmlunit
282 -
ParseSegment.java
in src/java/org/apache/nutch/parse
273 10
ArcSegmentCreator.java
in src/java/org/apache/nutch/tools/arc
269 11
269 -
FileDumper.java
in src/java/org/apache/nutch/tools
265 3
FetchItemQueues.java
in src/java/org/apache/nutch/fetcher
264 20
CrawlDbReducer.java
in src/java/org/apache/nutch/crawl
263 4
Files With Most Units (Top 50)
File | # lines | # units
HostDatum.java
in src/java/org/apache/nutch/hostdb
263 44
CrawlDatum.java
in src/java/org/apache/nutch/crawl
456 38
AbstractCommonCrawlFormat.java
in src/java/org/apache/nutch/tools
291 34
CrawlDbReader.java
in src/java/org/apache/nutch/crawl
1148 34
HttpBase.java
in src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api
505 33
Generator.java
in src/java/org/apache/nutch/crawl
908 32
DOMBuilder.java
in src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika
215 30
DOMBuilder.java
in src/plugin/parse-html/src/java/org/apache/nutch/parse/html
197 29
ParseStatus.java
in src/java/org/apache/nutch/parse
197 28
ProtocolStatus.java
in src/java/org/apache/nutch/protocol
223 26
PluginDescriptor.java
in src/java/org/apache/nutch/plugin
172 22
HTMLMetaTags.java
in src/java/org/apache/nutch/parse
105 21
LinkDumper.java
in src/java/org/apache/nutch/scoring/webgraph
311 21
LinkRank.java
in src/java/org/apache/nutch/scoring/webgraph
493 21
URLUtil.java
in src/java/org/apache/nutch/util
232 20
FetchItemQueues.java
in src/java/org/apache/nutch/fetcher
264 20
Content.java
in src/java/org/apache/nutch/protocol
229 20
CommonCrawlConfig.java
in src/java/org/apache/nutch/tools
88 19
CSVIndexWriter.java
in src/plugin/indexer-csv/src/java/org/apache/nutch/indexwriter/csv
295 19
SegmentReader.java
in src/java/org/apache/nutch/segment
712 18
EncodingDetector.java
in src/java/org/apache/nutch/util
237 18
NutchServer.java
in src/java/org/apache/nutch/service
176 18
PluginRepository.java
in src/java/org/apache/nutch/plugin
404 18
ParseData.java
in src/java/org/apache/nutch/parse
159 17
LinkDatum.java
in src/java/org/apache/nutch/scoring/webgraph
81 17
JobInfo.java
in src/java/org/apache/nutch/service/model/response
75 17
Injector.java
in src/java/org/apache/nutch/crawl
472 16
CommandRunner.java
in src/java/org/apache/nutch/util
202 16
Client.java
in src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp
212 16
OpenSearch1xIndexWriter.java
in src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x
360 16
FastURLFilter.java
in src/plugin/urlfilter-fast/src/java/org/apache/nutch/urlfilter/fast
260 16
CommonCrawlFormatWARC.java
in src/java/org/apache/nutch/tools
204 15
DmozParser.java
in src/java/org/apache/nutch/tools
256 15
Metadata.java
in src/java/org/apache/nutch/metadata
167 15
WebGraph.java
in src/java/org/apache/nutch/scoring/webgraph
522 15
IndexWriters.java
in src/java/org/apache/nutch/indexer
240 15
HttpFormAuthConfigurer.java
in src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient
70 15
Subcollection.java
in src/plugin/subcollection/src/java/org/apache/nutch/collection
120 15
DOMContentUtils.java
in src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika
316 15
Outlink.java
in src/java/org/apache/nutch/parse
102 14
FetchNodeDbInfo.java
in src/java/org/apache/nutch/service/model/response
65 14
FetcherThread.java
in src/java/org/apache/nutch/fetcher
746 14
FetchItemQueue.java
in src/java/org/apache/nutch/fetcher
126 14
Fetcher.java
in src/java/org/apache/nutch/fetcher
461 14
RobotRulesParser.java
in src/java/org/apache/nutch/protocol
294 14
HttpResponse.java
in src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http
506 14
DOMContentUtils.java
in src/plugin/parse-html/src/java/org/apache/nutch/parse/html
317 14
Node.java
in src/java/org/apache/nutch/scoring/webgraph
62 13
Inlink.java
in src/java/org/apache/nutch/crawl
93 13
Extension.java
in src/java/org/apache/nutch/plugin
83 13
Files With Long Lines (Top 50)

There are 53 files with lines longer than 120 characters. In total, there are 190 long lines.
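The two totals in that sentence (files containing long lines, and long lines overall) can be reproduced with a simple count. The sketch below is a minimal version that takes the 120-character threshold from the sentence. How the reporting tool treats tabs, trailing whitespace, or line terminators is not stated, so counting raw characters per line is an assumption, and the helper names and sample inputs are hypothetical.

```python
LONG_LINE_THRESHOLD = 120  # characters, as stated in the report


def count_long_lines(text, threshold=LONG_LINE_THRESHOLD):
    """Count lines in `text` that are strictly longer than `threshold` characters."""
    return sum(1 for line in text.splitlines() if len(line) > threshold)


def summarize(files, threshold=LONG_LINE_THRESHOLD):
    """Given {path: file contents}, return (files_with_long_lines, total_long_lines)."""
    per_file = {path: count_long_lines(text, threshold) for path, text in files.items()}
    offenders = {path: n for path, n in per_file.items() if n > 0}
    return len(offenders), sum(offenders.values())


# Hypothetical in-memory files, for illustration only.
sample = {
    "A.java": "short line\n" + "x" * 121 + "\n",
    "B.java": "y" * 120 + "\n",  # exactly 120 characters: not counted as long
    "C.xml": "z" * 200 + "\n" + "w" * 130,
}
print(summarize(sample))
```

To run this over a real checkout, the dictionary could be filled by reading each source file from disk; that part is left out to keep the sketch self-contained.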

File | # lines | # units | # long lines
269 - 56
nutch.html
in src/plugin/parse-tika/sample
408 - 23
2563 - 19
schema.xml
in src/plugin/indexer-solr
230 - 11
AbstractCommonCrawlFormat.java
in src/java/org/apache/nutch/tools
291 34 5
UpdateHostDbReducer.java
in src/java/org/apache/nutch/hostdb
323 5 5
SolrIndexWriter.java
in src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr
285 12 4
FileDumper.java
in src/java/org/apache/nutch/tools
265 3 3
ReadHostDb.java
in src/java/org/apache/nutch/hostdb
217 6 3
Model.java
in src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/cosine
133 3 3
ParseOutputFormat.java
in src/java/org/apache/nutch/parse
378 11 2
CommonCrawlFormatFactory.java
in src/java/org/apache/nutch/tools
31 2 2
SegmentMerger.java
in src/java/org/apache/nutch/segment
661 12 2
Generator.java
in src/java/org/apache/nutch/crawl
908 32 2
CrawlDbReader.java
in src/java/org/apache/nutch/crawl
1148 34 2
Injector.java
in src/java/org/apache/nutch/crawl
472 16 2
FetcherOutputFormat.java
in src/java/org/apache/nutch/fetcher
91 2 2
IndexWriters.java
in src/java/org/apache/nutch/indexer
240 15 2
URLNormalizerChecker.java
in src/java/org/apache/nutch/net
63 3 2
plugin.xml
in src/plugin/indexer-opensearch-1x
58 - 2
anchor.html
in src/plugin/creativecommons/data
9 - 2
JexlIndexingFilter.java
in src/plugin/index-jexl-filter/src/java/org/apache/nutch/indexer/jexl
96 4 2
ReplaceIndexer.java
in src/plugin/index-replace/src/java/org/apache/nutch/indexer/replace
193 5 2
build-plugin.xml
in src/plugin
164 - 2
ivy.xml
in ivy
94 - 2
overview.html
in src/java
9 - 1
OutlinkExtractor.java
in src/java/org/apache/nutch/parse
57 - 1
CommonCrawlFormatJettinson.java
in src/java/org/apache/nutch/tools
86 9 1
CommonCrawlFormatJackson.java
in src/java/org/apache/nutch/tools
68 10 1
CommonCrawlFormatSimple.java
in src/java/org/apache/nutch/tools
136 11 1
WARCExporter.java
in src/java/org/apache/nutch/tools/warc
416 8 1
DmozParser.java
in src/java/org/apache/nutch/tools
256 15 1
CrawlDb.java
in src/java/org/apache/nutch/crawl
308 11 1
DeduplicationJob.java
in src/java/org/apache/nutch/crawl
317 11 1
CrawlDbMerger.java
in src/java/org/apache/nutch/crawl
168 10 1
DumpFileUtil.java
in src/java/org/apache/nutch/util
106 6 1
DbResource.java
in src/java/org/apache/nutch/service/resources
115 6 1
NutchServerPoolExecutor.java
in src/java/org/apache/nutch/service/impl
80 10 1
IndexWriterConfig.java
in src/java/org/apache/nutch/indexer
77 6 1
URLFilterChecker.java
in src/java/org/apache/nutch/net
55 3 1
Foo.java
in src/plugin/protocol-foo/src/java/org/apache/nutch/protocol/foo
98 4 1
Ftp.java
in src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp
177 12 1
rel.html
in src/plugin/creativecommons/data
6 - 1
rdf.html
in src/plugin/creativecommons/data
10 - 1
plugin.xml
in src/plugin/indexer-elastic
62 - 1
BoilerpipeExtractorRepository.java
in src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika
28 1 1
ivy.xml
in src/plugin/parse-tika
21 - 1
RegexURLNormalizer.java
in src/plugin/urlnormalizer-regex/src/java/org/apache/nutch/net/urlnormalizer/regex
245 10 1
HtmlUnitWebDriver.java
in src/plugin/lib-htmlunit/src/java/org/apache/nutch/protocol/htmlunit
137 7 1
HtmlUnitWebWindowListener.java
in src/plugin/lib-htmlunit/src/java/org/apache/nutch/protocol/htmlunit
24 5 1
Correlations

File Size vs. Commits (all time): 682 points
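One way to quantify the relationship this plot explores is a Pearson correlation coefficient between commit counts and file sizes. The sketch below computes it by hand over a small sample of (commits, lines of code) pairs reported for this plot. Eight of 682 points is far too few to characterize the whole repository, so treat the result as an illustration of the calculation only. The function name is hypothetical, and the page does not say which statistic, if any, the report itself uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# (commits all time, lines of code) for a few files from the plot's data.
points = {
    "conf/nutch-default.xml": (473, 2563),
    "ivy/ivy.xml": (231, 94),
    "src/java/org/apache/nutch/fetcher/Fetcher.java": (187, 461),
    "src/java/org/apache/nutch/crawl/Generator.java": (177, 908),
    "src/java/org/apache/nutch/crawl/CrawlDbReader.java": (142, 1148),
    "src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java": (112, 505),
    "src/java/org/apache/nutch/crawl/Injector.java": (95, 472),
    "src/java/org/apache/nutch/segment/SegmentReader.java": (92, 712),
}

commits = [c for c, _ in points.values()]
lines = [n for _, n in points.values()]
print(f"Pearson r over {len(points)} sample points: {pearson(commits, lines):.2f}")
```

A rank-based measure such as Spearman's coefficient may be a better fit for data like this, since a single very large file (here the 2563-line configuration file) can dominate a Pearson estimate.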

Scatter plot (not reproduced here): each point is one file, with x = commits (all time) and y = lines of code.
time) y: 217 lines of code src/plugin/protocol-foo/src/java/org/apache/nutch/protocol/foo/Foo.java x: 1 commits (all time) y: 98 lines of code src/java/org/apache/nutch/crawl/FetchSchedule.java x: 20 commits (all time) y: 20 lines of code src/java/org/apache/nutch/crawl/FetchScheduleFactory.java x: 21 commits (all time) y: 30 lines of code src/java/org/apache/nutch/crawl/SignatureFactory.java x: 35 commits (all time) y: 29 lines of code src/java/org/apache/nutch/fetcher/FetchItem.java x: 12 commits (all time) y: 80 lines of code src/java/org/apache/nutch/fetcher/FetcherThreadEvent.java x: 12 commits (all time) y: 59 lines of code src/java/org/apache/nutch/hostdb/CrawlDatumProcessor.java x: 4 commits (all time) y: 6 lines of code src/java/org/apache/nutch/indexer/IndexWriter.java x: 23 commits (all time) y: 18 lines of code src/java/org/apache/nutch/indexer/IndexerMapReduce.java x: 67 commits (all time) y: 350 lines of code src/java/org/apache/nutch/indexer/IndexingFilter.java x: 28 commits (all time) y: 12 lines of code src/java/org/apache/nutch/indexer/IndexingFilters.java x: 43 commits (all time) y: 25 lines of code src/java/org/apache/nutch/net/URLFilter.java x: 16 commits (all time) y: 7 lines of code src/java/org/apache/nutch/net/URLFilters.java x: 28 commits (all time) y: 22 lines of code src/java/org/apache/nutch/net/URLNormalizers.java x: 29 commits (all time) y: 166 lines of code src/java/org/apache/nutch/net/protocols/HttpDateFormat.java x: 16 commits (all time) y: 48 lines of code src/java/org/apache/nutch/net/protocols/ProtocolLogUtil.java x: 4 commits (all time) y: 45 lines of code src/java/org/apache/nutch/parse/HtmlParseFilter.java x: 22 commits (all time) y: 10 lines of code src/java/org/apache/nutch/parse/HtmlParseFilters.java x: 22 commits (all time) y: 26 lines of code src/java/org/apache/nutch/parse/OutlinkExtractor.java x: 35 commits (all time) y: 57 lines of code src/java/org/apache/nutch/parse/Parse.java x: 15 commits (all time) y: 6 lines of 
code src/java/org/apache/nutch/parse/ParseUtil.java x: 45 commits (all time) y: 117 lines of code src/java/org/apache/nutch/parse/ParserFactory.java x: 48 commits (all time) y: 232 lines of code src/java/org/apache/nutch/plugin/PluginDescriptor.java x: 21 commits (all time) y: 172 lines of code src/java/org/apache/nutch/protocol/Protocol.java x: 34 commits (all time) y: 13 lines of code src/java/org/apache/nutch/scoring/ScoringFilter.java x: 34 commits (all time) y: 37 lines of code src/java/org/apache/nutch/service/impl/JobWorker.java x: 8 commits (all time) y: 77 lines of code src/java/org/apache/nutch/service/resources/ConfigResource.java x: 14 commits (all time) y: 74 lines of code src/java/org/apache/nutch/service/resources/JobResource.java x: 8 commits (all time) y: 51 lines of code src/java/org/apache/nutch/service/resources/ReaderResouce.java x: 6 commits (all time) y: 104 lines of code src/java/org/apache/nutch/service/resources/SeedResource.java x: 29 commits (all time) y: 80 lines of code src/java/org/apache/nutch/tools/CommonCrawlDataDumper.java x: 51 commits (all time) y: 525 lines of code src/java/org/apache/nutch/tools/FileDumper.java x: 52 commits (all time) y: 265 lines of code src/java/org/apache/nutch/tools/WARCUtils.java x: 17 commits (all time) y: 208 lines of code src/java/org/apache/nutch/util/DeflateUtils.java x: 17 commits (all time) y: 80 lines of code src/java/org/apache/nutch/util/DomUtil.java x: 30 commits (all time) y: 79 lines of code src/java/org/apache/nutch/util/GZIPUtils.java x: 19 commits (all time) y: 85 lines of code src/java/org/apache/nutch/util/MimeUtil.java x: 45 commits (all time) y: 148 lines of code src/plugin/creativecommons/src/java/org/creativecommons/nutch/CCIndexingFilter.java x: 34 commits (all time) y: 72 lines of code src/plugin/creativecommons/src/java/org/creativecommons/nutch/CCParseFilter.java x: 32 commits (all time) y: 211 lines of code 
src/plugin/exchange-jexl/src/java/org/apache/nutch/exchange/jexl/JexlExchange.java x: 8 commits (all time) y: 38 lines of code src/plugin/feed/src/java/org/apache/nutch/parse/feed/FeedParser.java x: 22 commits (all time) y: 245 lines of code src/plugin/headings/src/java/org/apache/nutch/parse/headings/HeadingsParseFilter.java x: 28 commits (all time) y: 77 lines of code src/plugin/index-anchor/src/java/org/apache/nutch/indexer/anchor/AnchorIndexingFilter.java x: 21 commits (all time) y: 49 lines of code src/plugin/index-basic/src/java/org/apache/nutch/indexer/basic/BasicIndexingFilter.java x: 49 commits (all time) y: 76 lines of code src/plugin/index-jexl-filter/src/java/org/apache/nutch/indexer/jexl/JexlIndexingFilter.java x: 22 commits (all time) y: 96 lines of code src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java x: 12 commits (all time) y: 98 lines of code src/plugin/index-metadata/src/java/org/apache/nutch/indexer/metadata/MetadataIndexer.java x: 23 commits (all time) y: 86 lines of code src/plugin/index-replace/src/java/org/apache/nutch/indexer/replace/FieldReplacer.java x: 10 commits (all time) y: 92 lines of code src/plugin/index-replace/src/java/org/apache/nutch/indexer/replace/ReplaceIndexer.java x: 16 commits (all time) y: 193 lines of code src/plugin/indexer-dummy/src/java/org/apache/nutch/indexwriter/dummy/DummyIndexWriter.java x: 30 commits (all time) y: 90 lines of code src/plugin/language-identifier/src/java/org/apache/nutch/analysis/lang/HTMLLanguageParser.java x: 22 commits (all time) y: 239 lines of code src/plugin/language-identifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java x: 20 commits (all time) y: 44 lines of code src/plugin/lib-rabbitmq/src/java/org/apache/nutch/rabbitmq/RabbitMQClient.java x: 6 commits (all time) y: 126 lines of code src/plugin/microformats-reltag/src/java/org/apache/nutch/microformats/reltag/RelTagParser.java x: 26 commits (all time) y: 94 lines of code 
src/plugin/mimetype-filter/src/java/org/apache/nutch/indexer/filter/MimeTypeIndexingFilter.java x: 27 commits (all time) y: 191 lines of code src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java x: 28 commits (all time) y: 115 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/DOMBuilder.java x: 17 commits (all time) y: 197 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HTMLMetaProcessor.java x: 28 commits (all time) y: 144 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java x: 69 commits (all time) y: 291 lines of code src/plugin/parse-js/src/java/org/apache/nutch/parse/js/JSParseFilter.java x: 38 commits (all time) y: 172 lines of code src/plugin/parse-metatags/src/java/org/apache/nutch/parse/metatags/MetaTagsParser.java x: 13 commits (all time) y: 76 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/HTMLMetaProcessor.java x: 28 commits (all time) y: 174 lines of code src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipParser.java x: 27 commits (all time) y: 106 lines of code src/plugin/protocol-file/src/java/org/apache/nutch/protocol/file/File.java x: 39 commits (all time) y: 135 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/Ftp.java x: 51 commits (all time) y: 177 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpResponse.java x: 40 commits (all time) y: 329 lines of code src/plugin/protocol-htmlunit/src/java/org/apache/nutch/protocol/htmlunit/Http.java x: 12 commits (all time) y: 33 lines of code src/plugin/protocol-htmlunit/src/java/org/apache/nutch/protocol/htmlunit/HttpResponse.java x: 26 commits (all time) y: 419 lines of code src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/Http.java x: 37 commits (all time) y: 33 lines of code src/plugin/scoring-opic/src/java/org/apache/nutch/scoring/opic/OPICScoringFilter.java x: 46 commits (all time) y: 126 
lines of code src/plugin/subcollection/src/java/org/apache/nutch/collection/Subcollection.java x: 16 commits (all time) y: 120 lines of code src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection/SubcollectionIndexingFilter.java x: 34 commits (all time) y: 65 lines of code src/plugin/urlfilter-prefix/src/java/org/apache/nutch/urlfilter/prefix/PrefixURLFilter.java x: 35 commits (all time) y: 114 lines of code src/plugin/urlfilter-validator/src/java/org/apache/nutch/urlfilter/validator/UrlValidator.java x: 13 commits (all time) y: 189 lines of code src/plugin/urlmeta/src/java/org/apache/nutch/indexer/urlmeta/URLMetaIndexingFilter.java x: 19 commits (all time) y: 39 lines of code src/plugin/urlnormalizer-ajax/src/java/org/apache/nutch/net/urlnormalizer/ajax/AjaxURLNormalizer.java x: 22 commits (all time) y: 119 lines of code src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic/BasicURLNormalizer.java x: 53 commits (all time) y: 314 lines of code src/java/org/apache/nutch/crawl/CrawlDbFilter.java x: 40 commits (all time) y: 78 lines of code src/java/org/apache/nutch/crawl/LinkDbFilter.java x: 25 commits (all time) y: 90 lines of code src/java/org/apache/nutch/exchange/Exchanges.java x: 8 commits (all time) y: 108 lines of code src/java/org/apache/nutch/metadata/Nutch.java x: 38 commits (all time) y: 43 lines of code src/java/org/apache/nutch/plugin/Pluggable.java x: 14 commits (all time) y: 3 lines of code src/java/org/apache/nutch/tools/ShowProperties.java x: 3 commits (all time) y: 44 lines of code src/plugin/index-geoip/src/java/org/apache/nutch/indexer/geoip/package-info.java x: 11 commits (all time) y: 1 lines of code src/java/org/apache/nutch/crawl/DefaultFetchSchedule.java x: 17 commits (all time) y: 20 lines of code src/java/org/apache/nutch/indexer/IndexingException.java x: 5 commits (all time) y: 16 lines of code src/java/org/apache/nutch/indexer/NutchField.java x: 21 commits (all time) y: 98 lines of code 
src/java/org/apache/nutch/metadata/CreativeCommons.java x: 10 commits (all time) y: 6 lines of code src/java/org/apache/nutch/net/URLExemptionFilter.java x: 12 commits (all time) y: 7 lines of code src/java/org/apache/nutch/parse/ParsePluginsReader.java x: 36 commits (all time) y: 162 lines of code src/java/org/apache/nutch/parse/Parser.java x: 19 commits (all time) y: 8 lines of code src/java/org/apache/nutch/publisher/NutchPublishers.java x: 6 commits (all time) y: 56 lines of code src/java/org/apache/nutch/service/NutchServer.java x: 22 commits (all time) y: 176 lines of code src/java/org/apache/nutch/service/impl/JobFactory.java x: 9 commits (all time) y: 49 lines of code src/java/org/apache/nutch/service/impl/SequenceReader.java x: 11 commits (all time) y: 128 lines of code src/java/org/apache/nutch/service/model/request/SeedUrl.java x: 16 commits (all time) y: 57 lines of code src/java/org/apache/nutch/service/model/response/FetchNodeDbInfo.java x: 11 commits (all time) y: 65 lines of code src/java/org/apache/nutch/service/resources/ServicesResource.java x: 5 commits (all time) y: 53 lines of code src/java/org/apache/nutch/tools/CommonCrawlConfig.java x: 4 commits (all time) y: 88 lines of code src/java/org/apache/nutch/tools/CommonCrawlFormatJettinson.java x: 10 commits (all time) y: 86 lines of code src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/BlockedException.java x: 8 commits (all time) y: 6 lines of code src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpFormAuthConfigurer.java x: 5 commits (all time) y: 70 lines of code src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java x: 18 commits (all time) y: 6 lines of code src/plugin/publish-rabbitmq/src/java/org/apache/nutch/publisher/rabbitmq/RabbitMQPublisherImpl.java x: 15 commits (all time) y: 75 lines of code 
src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/cosine/DocVector.java x: 3 commits (all time) y: 32 lines of code src/plugin/indexer-cloudsearch/src/java/org/apache/nutch/indexwriter/cloudsearch/CloudSearchConstants.java x: 9 commits (all time) y: 7 lines of code src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitDocument.java x: 10 commits (all time) y: 42 lines of code conf/exchanges.xsd x: 4 commits (all time) y: 38 lines of code conf/index-writers.xsd x: 6 commits (all time) y: 163 lines of code src/plugin/lib-rabbitmq/src/java/org/apache/nutch/rabbitmq/RabbitMQOptionParser.java x: 4 commits (all time) y: 55 lines of code src/plugin/publish-rabbitmq/plugin.xml x: 7 commits (all time) y: 22 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/DOMBuilder.java x: 7 commits (all time) y: 215 lines of code eclipse-codeformat.xml x: 3 commits (all time) y: 269 lines of code src/java/org/apache/nutch/service/model/response/NutchServerInfo.java x: 1 commits (all time) y: 34 lines of code src/plugin/parse-ext/plugin.xml x: 7 commits (all time) y: 33 lines of code
lines of code
min: 1.0 | average: 82.72 | 25th percentile: 18.0 | median: 28.0 | 75th percentile: 93.0 | max: 2563.0
commits (all time)
min: 1.0 | average: 17.76 | 25th percentile: 5.0 | median: 10.0 | 75th percentile: 22.0 | max: 473.0

File Size vs. Contributors (all time): 682 points

src/plugin/language-identifier/ivy.xml x: 10 contributors (all time) y: 25 lines of code src/plugin/language-identifier/plugin.xml x: 8 contributors (all time) y: 37 lines of code src/java/org/apache/nutch/hostdb/UpdateHostDb.java x: 10 contributors (all time) y: 201 lines of code src/java/org/apache/nutch/hostdb/UpdateHostDbReducer.java x: 11 contributors (all time) y: 323 lines of code src/java/org/apache/nutch/crawl/Inlink.java x: 10 contributors (all time) y: 93 lines of code src/java/org/apache/nutch/fetcher/FetchItemQueues.java x: 10 contributors (all time) y: 264 lines of code src/java/org/apache/nutch/fetcher/Fetcher.java x: 28 contributors (all time) y: 461 lines of code src/java/org/apache/nutch/fetcher/QueueFeeder.java x: 10 contributors (all time) y: 138 lines of code src/plugin/parsefilter-debug/plugin.xml x: 2 contributors (all time) y: 21 lines of code src/plugin/parsefilter-naivebayes/plugin.xml x: 8 contributors (all time) y: 20 lines of code src/plugin/protocol-htmlunit/plugin.xml x: 3 contributors (all time) y: 29 lines of code src/plugin/protocol-httpclient/plugin.xml x: 5 contributors (all time) y: 29 lines of code src/plugin/protocol-interactiveselenium/plugin.xml x: 4 contributors (all time) y: 29 lines of code src/plugin/indexer-elastic/ivy.xml x: 15 contributors (all time) y: 35 lines of code src/plugin/indexer-elastic/plugin.xml x: 15 contributors (all time) y: 62 lines of code src/java/org/apache/nutch/segment/SegmentReader.java x: 22 contributors (all time) y: 712 lines of code src/java/org/apache/nutch/hostdb/ResolverThread.java x: 8 contributors (all time) y: 88 lines of code src/java/org/apache/nutch/crawl/AdaptiveFetchSchedule.java x: 11 contributors (all time) y: 239 lines of code src/java/org/apache/nutch/crawl/MimeAdaptiveFetchSchedule.java x: 11 contributors (all time) y: 136 lines of code src/java/org/apache/nutch/metadata/SpellCheckedMetadata.java x: 9 contributors (all time) y: 72 lines of code 
src/java/org/apache/nutch/protocol/ProtocolFactory.java x: 17 contributors (all time) y: 155 lines of code src/java/org/apache/nutch/service/impl/ConfManagerImpl.java x: 6 contributors (all time) y: 89 lines of code src/java/org/apache/nutch/service/impl/JobManagerImpl.java x: 3 contributors (all time) y: 67 lines of code src/java/org/apache/nutch/service/impl/NutchServerPoolExecutor.java x: 8 contributors (all time) y: 80 lines of code src/java/org/apache/nutch/tools/AbstractCommonCrawlFormat.java x: 11 contributors (all time) y: 291 lines of code src/java/org/apache/nutch/tools/CommonCrawlFormatWARC.java x: 9 contributors (all time) y: 204 lines of code src/java/org/apache/nutch/tools/warc/WARCExporter.java x: 15 contributors (all time) y: 416 lines of code src/java/org/apache/nutch/util/DumpFileUtil.java x: 5 contributors (all time) y: 106 lines of code src/java/org/apache/nutch/util/JexlUtil.java x: 9 contributors (all time) y: 33 lines of code src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java x: 23 contributors (all time) y: 239 lines of code src/plugin/indexer-cloudsearch/src/java/org/apache/nutch/indexwriter/cloudsearch/CloudSearchIndexWriter.java x: 9 contributors (all time) y: 259 lines of code src/plugin/indexer-elastic/src/java/org/apache/nutch/indexwriter/elastic/ElasticIndexWriter.java x: 19 contributors (all time) y: 282 lines of code src/plugin/indexer-kafka/src/java/org/apache/nutch/indexwriter/kafka/KafkaIndexWriter.java x: 3 contributors (all time) y: 158 lines of code src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpRobotRulesParser.java x: 11 contributors (all time) y: 174 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaParser.java x: 18 contributors (all time) y: 261 lines of code src/plugin/parsefilter-regex/src/java/org/apache/nutch/parsefilter/regex/RegexParseFilter.java x: 9 contributors (all time) y: 142 lines of code 
src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java x: 18 contributors (all time) y: 357 lines of code src/plugin/urlfilter-domain/src/java/org/apache/nutch/urlfilter/domain/DomainURLFilter.java x: 12 contributors (all time) y: 92 lines of code src/plugin/urlfilter-domaindenylist/src/java/org/apache/nutch/urlfilter/domaindenylist/DomainDenylistURLFilter.java x: 3 contributors (all time) y: 92 lines of code src/plugin/urlfilter-fast/src/java/org/apache/nutch/urlfilter/fast/FastURLFilter.java x: 3 contributors (all time) y: 260 lines of code src/plugin/urlnormalizer-host/src/java/org/apache/nutch/net/urlnormalizer/host/HostURLNormalizer.java x: 11 contributors (all time) y: 119 lines of code src/plugin/urlnormalizer-protocol/src/java/org/apache/nutch/net/urlnormalizer/protocol/ProtocolURLNormalizer.java x: 10 contributors (all time) y: 155 lines of code src/plugin/urlnormalizer-querystring/src/java/org/apache/nutch/net/urlnormalizer/querystring/QuerystringURLNormalizer.java x: 9 contributors (all time) y: 49 lines of code src/plugin/urlnormalizer-slash/src/java/org/apache/nutch/net/urlnormalizer/slash/SlashURLNormalizer.java x: 10 contributors (all time) y: 147 lines of code conf/nutch-default.xml x: 48 contributors (all time) y: 2563 lines of code src/java/org/apache/nutch/fetcher/FetchItemQueue.java x: 11 contributors (all time) y: 126 lines of code src/java/org/apache/nutch/crawl/Injector.java x: 26 contributors (all time) y: 472 lines of code src/java/org/apache/nutch/crawl/AbstractFetchSchedule.java x: 15 contributors (all time) y: 94 lines of code src/java/org/apache/nutch/crawl/Generator.java x: 31 contributors (all time) y: 908 lines of code src/java/org/apache/nutch/fetcher/FetcherThread.java x: 21 contributors (all time) y: 746 lines of code src/java/org/apache/nutch/service/NutchReader.java x: 6 contributors (all time) y: 17 lines of code src/java/org/apache/nutch/service/impl/LinkReader.java x: 8 contributors (all time) 
y: 130 lines of code src/java/org/apache/nutch/service/impl/NodeReader.java x: 9 contributors (all time) y: 135 lines of code src/java/org/apache/nutch/util/EncodingDetector.java x: 13 contributors (all time) y: 237 lines of code src/plugin/index-arbitrary/src/java/org/apache/nutch/indexer/arbitrary/ArbitraryIndexingFilter.java x: 3 contributors (all time) y: 139 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/BoilerpipeExtractorRepository.java x: 8 contributors (all time) y: 28 lines of code src/plugin/parsefilter-debug/src/java/org/apache/nutch/parsefilter/debug/DebugParseFilter.java x: 2 contributors (all time) y: 37 lines of code src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpFormAuthentication.java x: 6 contributors (all time) y: 196 lines of code src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/Http.java x: 6 contributors (all time) y: 33 lines of code src/plugin/scoring-link/src/java/org/apache/nutch/scoring/link/LinkAnalysisScoringFilter.java x: 9 contributors (all time) y: 52 lines of code src/plugin/scoring-metadata/src/java/org/apache/nutch/scoring/metadata/MetadataScoringFilter.java x: 2 contributors (all time) y: 72 lines of code src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java x: 14 contributors (all time) y: 157 lines of code src/java/org/apache/nutch/fetcher/FetchNode.java x: 7 contributors (all time) y: 51 lines of code src/java/org/apache/nutch/util/DomainStatistics.java x: 1 contributors (all time) y: 191 lines of code src/java/org/apache/nutch/util/URLUtil.java x: 11 contributors (all time) y: 232 lines of code src/plugin/lib-regex-filter/src/java/org/apache/nutch/urlfilter/api/RegexURLFilterBase.java x: 13 contributors (all time) y: 155 lines of code src/plugin/urlfilter-automaton/src/java/org/apache/nutch/urlfilter/automaton/AutomatonURLFilter.java x: 11 contributors (all time) y: 68 lines of code 
src/plugin/urlfilter-regex/src/java/org/apache/nutch/urlfilter/regex/RegexURLFilter.java x: 12 contributors (all time) y: 68 lines of code src/java/org/apache/nutch/net/protocols/Response.java x: 9 contributors (all time) y: 27 lines of code src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java x: 24 contributors (all time) y: 505 lines of code src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/OkHttp.java x: 8 contributors (all time) y: 340 lines of code src/java/org/apache/nutch/indexer/IndexingJob.java x: 20 contributors (all time) y: 358 lines of code src/plugin/index-arbitrary/ivy.xml x: 1 contributors (all time) y: 18 lines of code src/plugin/index-arbitrary/plugin.xml x: 1 contributors (all time) y: 21 lines of code src/plugin/index-arbitrary/src/java/org/apache/nutch/indexer/arbitrary/package-info.java x: 1 contributors (all time) y: 1 lines of code src/plugin/lib-htmlunit/src/java/org/apache/nutch/protocol/htmlunit/HtmlUnitWebDriver.java x: 6 contributors (all time) y: 137 lines of code src/plugin/lib-selenium/plugin.xml x: 7 contributors (all time) y: 138 lines of code src/plugin/lib-selenium/src/java/org/apache/nutch/protocol/selenium/HttpWebClient.java x: 13 contributors (all time) y: 216 lines of code src/plugin/protocol-selenium/src/java/org/apache/nutch/protocol/selenium/Http.java x: 6 contributors (all time) y: 28 lines of code src/plugin/indexer-kafka/plugin.xml x: 5 contributors (all time) y: 75 lines of code ivy/ivy.xml x: 28 contributors (all time) y: 94 lines of code ivy/ivysettings.xml x: 19 contributors (all time) y: 63 lines of code src/plugin/build-plugin.xml x: 13 contributors (all time) y: 164 lines of code src/plugin/exchange-jexl/ivy.xml x: 5 contributors (all time) y: 20 lines of code src/plugin/index-jexl-filter/ivy.xml x: 9 contributors (all time) y: 20 lines of code src/plugin/index-static/ivy.xml x: 7 contributors (all time) y: 20 lines of code src/plugin/indexer-dummy/ivy.xml x: 4 
contributors (all time) y: 20 lines of code src/plugin/indexer-opensearch-1x/ivy.xml x: 3 contributors (all time) y: 41 lines of code src/plugin/indexer-solr/ivy.xml x: 11 contributors (all time) y: 36 lines of code src/plugin/lib-nekohtml/ivy.xml x: 7 contributors (all time) y: 21 lines of code src/plugin/parse-tika/ivy.xml x: 17 contributors (all time) y: 21 lines of code src/plugin/protocol-foo/ivy.xml x: 2 contributors (all time) y: 20 lines of code ivy/dependency-check-ant/dependency-check-suppressions.xml x: 4 contributors (all time) y: 3 lines of code src/java/org/apache/nutch/crawl/CrawlDatum.java x: 23 contributors (all time) y: 456 lines of code src/java/org/apache/nutch/crawl/CrawlDb.java x: 21 contributors (all time) y: 308 lines of code src/java/org/apache/nutch/crawl/CrawlDbMerger.java x: 21 contributors (all time) y: 168 lines of code src/java/org/apache/nutch/crawl/CrawlDbReader.java x: 27 contributors (all time) y: 1148 lines of code src/java/org/apache/nutch/crawl/CrawlDbReducer.java x: 19 contributors (all time) y: 263 lines of code src/java/org/apache/nutch/crawl/DeduplicationJob.java x: 14 contributors (all time) y: 317 lines of code src/java/org/apache/nutch/crawl/Inlinks.java x: 13 contributors (all time) y: 83 lines of code src/java/org/apache/nutch/crawl/LinkDb.java x: 20 contributors (all time) y: 398 lines of code src/java/org/apache/nutch/crawl/LinkDbMerger.java x: 17 contributors (all time) y: 147 lines of code src/java/org/apache/nutch/crawl/LinkDbReader.java x: 20 contributors (all time) y: 208 lines of code src/java/org/apache/nutch/crawl/MD5Signature.java x: 10 contributors (all time) y: 13 lines of code src/java/org/apache/nutch/crawl/URLPartitioner.java x: 14 contributors (all time) y: 72 lines of code src/java/org/apache/nutch/fetcher/FetcherOutputFormat.java x: 16 contributors (all time) y: 91 lines of code src/java/org/apache/nutch/hostdb/ReadHostDb.java x: 14 contributors (all time) y: 217 lines of code 
src/java/org/apache/nutch/indexer/CleaningJob.java x: 15 contributors (all time) y: 145 lines of code src/java/org/apache/nutch/indexer/IndexWriters.java x: 12 contributors (all time) y: 240 lines of code src/java/org/apache/nutch/indexer/IndexerOutputFormat.java x: 13 contributors (all time) y: 44 lines of code src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java x: 16 contributors (all time) y: 241 lines of code src/java/org/apache/nutch/metadata/CaseInsensitiveMetadata.java x: 2 contributors (all time) y: 7 lines of code src/java/org/apache/nutch/metadata/MetaWrapper.java x: 8 contributors (all time) y: 52 lines of code src/java/org/apache/nutch/metadata/Metadata.java x: 14 contributors (all time) y: 167 lines of code src/java/org/apache/nutch/net/URLFilterChecker.java x: 21 contributors (all time) y: 55 lines of code src/java/org/apache/nutch/net/URLNormalizerChecker.java x: 13 contributors (all time) y: 63 lines of code src/java/org/apache/nutch/parse/HTMLMetaTags.java x: 11 contributors (all time) y: 105 lines of code src/java/org/apache/nutch/parse/ParseData.java x: 15 contributors (all time) y: 159 lines of code src/java/org/apache/nutch/parse/ParseImpl.java x: 11 contributors (all time) y: 57 lines of code src/java/org/apache/nutch/parse/ParseOutputFormat.java x: 23 contributors (all time) y: 378 lines of code src/java/org/apache/nutch/parse/ParseResult.java x: 11 contributors (all time) y: 75 lines of code src/java/org/apache/nutch/parse/ParseSegment.java x: 23 contributors (all time) y: 273 lines of code src/java/org/apache/nutch/parse/ParseStatus.java x: 12 contributors (all time) y: 197 lines of code src/java/org/apache/nutch/parse/ParserChecker.java x: 19 contributors (all time) y: 218 lines of code src/java/org/apache/nutch/plugin/Extension.java x: 14 contributors (all time) y: 83 lines of code src/java/org/apache/nutch/plugin/Plugin.java x: 11 contributors (all time) y: 24 lines of code src/java/org/apache/nutch/plugin/PluginRepository.java 
x: 15 contributors (all time) y: 404 lines of code src/java/org/apache/nutch/protocol/Content.java x: 17 contributors (all time) y: 229 lines of code src/java/org/apache/nutch/protocol/ProtocolStatus.java x: 14 contributors (all time) y: 223 lines of code src/java/org/apache/nutch/protocol/RobotRulesParser.java x: 13 contributors (all time) y: 294 lines of code src/java/org/apache/nutch/scoring/webgraph/LinkDumper.java x: 17 contributors (all time) y: 311 lines of code src/java/org/apache/nutch/scoring/webgraph/LinkRank.java x: 17 contributors (all time) y: 493 lines of code src/java/org/apache/nutch/scoring/webgraph/Node.java x: 6 contributors (all time) y: 62 lines of code src/java/org/apache/nutch/scoring/webgraph/NodeDumper.java x: 16 contributors (all time) y: 314 lines of code src/java/org/apache/nutch/scoring/webgraph/ScoreUpdater.java x: 15 contributors (all time) y: 179 lines of code src/java/org/apache/nutch/scoring/webgraph/WebGraph.java x: 17 contributors (all time) y: 522 lines of code src/java/org/apache/nutch/segment/SegmentMerger.java x: 21 contributors (all time) y: 661 lines of code src/java/org/apache/nutch/segment/SegmentPart.java x: 10 contributors (all time) y: 49 lines of code src/java/org/apache/nutch/service/impl/SeedManagerImpl.java x: 4 contributors (all time) y: 36 lines of code src/java/org/apache/nutch/tools/CommonCrawlFormatSimple.java x: 8 contributors (all time) y: 136 lines of code src/java/org/apache/nutch/tools/DmozParser.java x: 16 contributors (all time) y: 256 lines of code src/java/org/apache/nutch/tools/FreeGenerator.java x: 16 contributors (all time) y: 195 lines of code src/java/org/apache/nutch/tools/ResolveUrls.java x: 10 contributors (all time) y: 121 lines of code src/java/org/apache/nutch/tools/arc/ArcRecordReader.java x: 11 contributors (all time) y: 159 lines of code src/java/org/apache/nutch/tools/arc/ArcSegmentCreator.java x: 14 contributors (all time) y: 269 lines of code 
src/java/org/apache/nutch/util/AbstractChecker.java x: 12 contributors (all time) y: 156 lines of code src/java/org/apache/nutch/util/CommandRunner.java x: 12 contributors (all time) y: 202 lines of code src/java/org/apache/nutch/util/CrawlCompletionStats.java x: 16 contributors (all time) y: 204 lines of code src/java/org/apache/nutch/util/NutchJob.java x: 17 contributors (all time) y: 48 lines of code src/java/org/apache/nutch/util/PrefixStringMatcher.java x: 9 contributors (all time) y: 85 lines of code src/java/org/apache/nutch/util/ProtocolStatusStatistics.java x: 16 contributors (all time) y: 122 lines of code src/java/org/apache/nutch/util/SitemapProcessor.java x: 13 contributors (all time) y: 411 lines of code src/java/org/apache/nutch/util/SuffixStringMatcher.java x: 9 contributors (all time) y: 65 lines of code src/java/overview.html x: 6 contributors (all time) y: 9 lines of code src/plugin/creativecommons/conf/nutch-site.xml x: 5 contributors (all time) y: 34 lines of code src/plugin/creativecommons/data/anchor.html x: 3 contributors (all time) y: 9 lines of code src/plugin/index-replace/sample/testIndexReplace.html x: 3 contributors (all time) y: 12 lines of code src/plugin/indexer-csv/src/java/org/apache/nutch/indexwriter/csv/CSVIndexWriter.java x: 7 contributors (all time) y: 295 lines of code src/plugin/indexer-opensearch-1x/plugin.xml x: 2 contributors (all time) y: 58 lines of code src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java x: 2 contributors (all time) y: 360 lines of code src/plugin/indexer-solr/plugin.xml x: 12 contributors (all time) y: 48 lines of code src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java x: 13 contributors (all time) y: 285 lines of code src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrUtils.java x: 12 contributors (all time) y: 81 lines of code src/plugin/lib-htmlunit/plugin.xml x: 7 contributors (all 
time) y: 282 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/DOMContentUtils.java x: 20 contributors (all time) y: 317 lines of code src/plugin/parse-tika/sample/nutch.html x: 4 contributors (all time) y: 408 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/DOMContentUtils.java x: 17 contributors (all time) y: 316 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/Client.java x: 11 contributors (all time) y: 212 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/PrintCommandListener.java x: 8 contributors (all time) y: 43 lines of code src/plugin/protocol-htmlunit/src/java/org/apache/nutch/protocol/htmlunit/DummyX509TrustManager.java x: 5 contributors (all time) y: 42 lines of code src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java x: 21 contributors (all time) y: 506 lines of code src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/DummyX509TrustManager.java x: 11 contributors (all time) y: 42 lines of code src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/HttpResponse.java x: 13 contributors (all time) y: 397 lines of code src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultClickAllAjaxLinksHandler.java x: 12 contributors (all time) y: 55 lines of code src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/OkHttpResponse.java x: 7 contributors (all time) y: 185 lines of code src/plugin/protocol-selenium/src/java/org/apache/nutch/protocol/selenium/HttpResponse.java x: 11 contributors (all time) y: 354 lines of code src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/util/LuceneAnalyzerUtil.java x: 10 contributors (all time) y: 54 lines of code src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/util/LuceneTokenizer.java x: 12 contributors (all 
time) y: 105 lines of code src/plugin/urlfilter-suffix/src/java/org/apache/nutch/urlfilter/suffix/SuffixURLFilter.java x: 16 contributors (all time) y: 186 lines of code src/plugin/urlnormalizer-regex/src/java/org/apache/nutch/net/urlnormalizer/regex/RegexURLNormalizer.java x: 9 contributors (all time) y: 245 lines of code src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/CIDR.java x: 1 contributors (all time) y: 47 lines of code src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/IPFilterRules.java x: 1 contributors (all time) y: 75 lines of code src/plugin/indexer-solr/schema.xml x: 3 contributors (all time) y: 230 lines of code src/plugin/index-geoip/src/java/org/apache/nutch/indexer/geoip/GeoIPDocumentCreator.java x: 6 contributors (all time) y: 169 lines of code src/plugin/index-geoip/src/java/org/apache/nutch/indexer/geoip/GeoIPIndexingFilter.java x: 7 contributors (all time) y: 110 lines of code src/java/org/apache/nutch/hostdb/HostDatum.java x: 7 contributors (all time) y: 263 lines of code src/java/org/apache/nutch/plugin/ExtensionPoint.java x: 12 contributors (all time) y: 38 lines of code src/java/org/apache/nutch/plugin/PluginManifestParser.java x: 14 contributors (all time) y: 205 lines of code src/java/org/apache/nutch/indexer/IndexWriterParams.java x: 6 contributors (all time) y: 44 lines of code src/java/org/apache/nutch/util/NutchTool.java x: 10 contributors (all time) y: 70 lines of code src/plugin/indexer-rabbit/src/java/org/apache/nutch/indexwriter/rabbit/RabbitIndexWriter.java x: 7 contributors (all time) y: 217 lines of code src/plugin/protocol-foo/src/java/org/apache/nutch/protocol/foo/Foo.java x: 1 contributors (all time) y: 98 lines of code src/java/org/apache/nutch/crawl/SignatureFactory.java x: 16 contributors (all time) y: 29 lines of code src/java/org/apache/nutch/fetcher/FetchItem.java x: 7 contributors (all time) y: 80 lines of code src/java/org/apache/nutch/indexer/IndexerMapReduce.java x: 18 
contributors (all time) y: 350 lines of code src/java/org/apache/nutch/indexer/IndexingFilter.java x: 12 contributors (all time) y: 12 lines of code src/java/org/apache/nutch/net/URLFilter.java x: 9 contributors (all time) y: 7 lines of code src/java/org/apache/nutch/net/URLFilters.java x: 15 contributors (all time) y: 22 lines of code src/java/org/apache/nutch/net/URLNormalizers.java x: 15 contributors (all time) y: 166 lines of code src/java/org/apache/nutch/parse/HtmlParseFilter.java x: 12 contributors (all time) y: 10 lines of code src/java/org/apache/nutch/parse/HtmlParseFilters.java x: 12 contributors (all time) y: 26 lines of code src/java/org/apache/nutch/parse/OutlinkExtractor.java x: 14 contributors (all time) y: 57 lines of code src/java/org/apache/nutch/parse/ParseUtil.java x: 14 contributors (all time) y: 117 lines of code src/java/org/apache/nutch/parse/ParserFactory.java x: 16 contributors (all time) y: 232 lines of code src/java/org/apache/nutch/plugin/PluginDescriptor.java x: 13 contributors (all time) y: 172 lines of code src/java/org/apache/nutch/protocol/Protocol.java x: 14 contributors (all time) y: 13 lines of code src/java/org/apache/nutch/scoring/AbstractScoringFilter.java x: 6 contributors (all time) y: 60 lines of code src/java/org/apache/nutch/scoring/ScoringFilter.java x: 14 contributors (all time) y: 37 lines of code src/java/org/apache/nutch/scoring/webgraph/NodeReader.java x: 13 contributors (all time) y: 75 lines of code src/java/org/apache/nutch/service/model/response/JobInfo.java x: 4 contributors (all time) y: 75 lines of code src/java/org/apache/nutch/service/resources/ConfigResource.java x: 6 contributors (all time) y: 74 lines of code src/java/org/apache/nutch/service/resources/JobResource.java x: 5 contributors (all time) y: 51 lines of code src/java/org/apache/nutch/service/resources/ReaderResouce.java x: 4 contributors (all time) y: 104 lines of code src/java/org/apache/nutch/tools/CommonCrawlDataDumper.java x: 15 
contributors (all time) y: 525 lines of code src/java/org/apache/nutch/tools/FileDumper.java x: 15 contributors (all time) y: 265 lines of code src/java/org/apache/nutch/tools/WARCUtils.java x: 8 contributors (all time) y: 208 lines of code src/java/org/apache/nutch/util/GZIPUtils.java x: 10 contributors (all time) y: 85 lines of code src/java/org/apache/nutch/util/MimeUtil.java x: 16 contributors (all time) y: 148 lines of code src/java/org/apache/nutch/util/NodeWalker.java x: 10 contributors (all time) y: 40 lines of code src/java/org/apache/nutch/util/StringUtil.java x: 9 contributors (all time) y: 104 lines of code src/plugin/feed/src/java/org/apache/nutch/parse/feed/FeedParser.java x: 10 contributors (all time) y: 245 lines of code src/plugin/headings/src/java/org/apache/nutch/parse/headings/HeadingsParseFilter.java x: 12 contributors (all time) y: 77 lines of code src/plugin/index-basic/src/java/org/apache/nutch/indexer/basic/BasicIndexingFilter.java x: 16 contributors (all time) y: 76 lines of code src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java x: 6 contributors (all time) y: 98 lines of code src/plugin/index-replace/src/java/org/apache/nutch/indexer/replace/FieldReplacer.java x: 5 contributors (all time) y: 92 lines of code src/plugin/index-replace/src/java/org/apache/nutch/indexer/replace/ReplaceIndexer.java x: 8 contributors (all time) y: 193 lines of code src/plugin/language-identifier/src/java/org/apache/nutch/analysis/lang/HTMLLanguageParser.java x: 8 contributors (all time) y: 239 lines of code src/plugin/lib-rabbitmq/src/java/org/apache/nutch/rabbitmq/RabbitMQClient.java x: 4 contributors (all time) y: 126 lines of code src/plugin/mimetype-filter/src/java/org/apache/nutch/indexer/filter/MimeTypeIndexingFilter.java x: 12 contributors (all time) y: 191 lines of code src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java x: 13 contributors (all time) y: 115 lines of code 
src/plugin/parse-html/src/java/org/apache/nutch/parse/html/DOMBuilder.java x: 11 contributors (all time) y: 197 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HTMLMetaProcessor.java x: 11 contributors (all time) y: 144 lines of code src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java x: 20 contributors (all time) y: 291 lines of code src/plugin/parse-js/src/java/org/apache/nutch/parse/js/JSParseFilter.java x: 14 contributors (all time) y: 172 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/HTMLMetaProcessor.java x: 9 contributors (all time) y: 174 lines of code src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Classify.java x: 5 contributors (all time) y: 69 lines of code src/plugin/protocol-file/src/java/org/apache/nutch/protocol/file/File.java x: 13 contributors (all time) y: 135 lines of code src/plugin/protocol-file/src/java/org/apache/nutch/protocol/file/FileResponse.java x: 16 contributors (all time) y: 160 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/Ftp.java x: 19 contributors (all time) y: 177 lines of code src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpResponse.java x: 19 contributors (all time) y: 329 lines of code src/plugin/protocol-htmlunit/src/java/org/apache/nutch/protocol/htmlunit/HttpResponse.java x: 10 contributors (all time) y: 419 lines of code src/plugin/scoring-opic/src/java/org/apache/nutch/scoring/opic/OPICScoringFilter.java x: 15 contributors (all time) y: 126 lines of code src/plugin/subcollection/src/java/org/apache/nutch/collection/Subcollection.java x: 8 contributors (all time) y: 120 lines of code src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection/SubcollectionIndexingFilter.java x: 14 contributors (all time) y: 65 lines of code src/plugin/urlfilter-validator/src/java/org/apache/nutch/urlfilter/validator/UrlValidator.java x: 6 contributors (all time) y: 189 
lines of code src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic/package-info.java x: 5 contributors (all time) y: 1 lines of code conf/configuration.xsl x: 4 contributors (all time) y: 43 lines of code src/java/org/apache/nutch/exchange/Exchanges.java x: 3 contributors (all time) y: 108 lines of code src/java/org/apache/nutch/plugin/Pluggable.java x: 8 contributors (all time) y: 3 lines of code src/java/org/apache/nutch/tools/ShowProperties.java x: 2 contributors (all time) y: 44 lines of code src/plugin/index-geoip/src/java/org/apache/nutch/indexer/geoip/package-info.java x: 7 contributors (all time) y: 1 lines of code src/java/org/apache/nutch/indexer/NutchField.java x: 11 contributors (all time) y: 98 lines of code src/java/org/apache/nutch/metadata/HttpHeaders.java x: 11 contributors (all time) y: 18 lines of code src/java/org/apache/nutch/parse/Parser.java x: 11 contributors (all time) y: 8 lines of code src/java/org/apache/nutch/protocol/ProtocolOutput.java x: 7 contributors (all time) y: 36 lines of code src/java/org/apache/nutch/publisher/NutchPublishers.java x: 3 contributors (all time) y: 56 lines of code src/java/org/apache/nutch/service/impl/SequenceReader.java x: 6 contributors (all time) y: 128 lines of code src/java/org/apache/nutch/service/model/request/SeedList.java x: 4 contributors (all time) y: 67 lines of code src/java/org/apache/nutch/service/resources/DbResource.java x: 6 contributors (all time) y: 115 lines of code src/java/org/apache/nutch/tools/CommonCrawlConfig.java x: 3 contributors (all time) y: 88 lines of code src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Train.java x: 2 contributors (all time) y: 88 lines of code conf/index-writers.xsd x: 3 contributors (all time) y: 163 lines of code src/java/org/apache/nutch/indexer/IndexWriterConfig.java x: 3 contributors (all time) y: 77 lines of code src/plugin/lib-rabbitmq/src/java/org/apache/nutch/rabbitmq/RabbitMQMessage.java 
x: 3 contributors (all time) y: 32 lines of code src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/DOMBuilder.java x: 5 contributors (all time) y: 215 lines of code eclipse-codeformat.xml x: 2 contributors (all time) y: 269 lines of code src/java/org/apache/nutch/service/model/response/NutchServerInfo.java x: 1 contributors (all time) y: 34 lines of code
lines of code:
min: 1.0 | average: 82.72 | 25th percentile: 18.0 | median: 28.0 | 75th percentile: 93.0 | max: 2563.0
contributors (all time):
min: 1.0 | average: 7.34 | 25th percentile: 3.0 | median: 6.0 | 75th percentile: 10.0 | max: 48.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 2 points

src/plugin/language-identifier/ivy.xml x: 2 commits (90d) y: 25 lines of code
src/plugin/language-identifier/plugin.xml x: 2 commits (90d) y: 37 lines of code
lines of code:
min: 25.0 | average: 31.0 | 25th percentile: 25.0 | median: 31.0 | 75th percentile: 37.0 | max: 37.0
commits (90d):
min: 2.0 | average: 2.0 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0

File Size vs. Contributors (90 days): 2 points

src/plugin/language-identifier/ivy.xml x: 2 contributors (90d) y: 25 lines of code
src/plugin/language-identifier/plugin.xml x: 2 contributors (90d) y: 37 lines of code
lines of code:
min: 25.0 | average: 31.0 | 25th percentile: 25.0 | median: 31.0 | 75th percentile: 37.0 | max: 37.0
contributors (90d):
min: 2.0 | average: 2.0 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0