elastic / crawler
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
File size (lines) | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Share | 0% | 13% | 25% | 27% | 33%
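
As a rough illustration of how shares like these can be derived, the sketch below sorts files into the five size buckets used in this report and computes each bucket's share. It assumes the shares are weighted by lines of code rather than by file count, which the report does not state explicitly. That reading seems consistent with the lib row further down, where the only lib file above 500 lines (563 lines) accounts for 10%. The names `BUCKETS`, `bucket_for`, and `size_distribution` are hypothetical.

```python
# Bucket boundaries copied from the report's legend, largest first.
# A high bound of None means the bucket is open-ended.
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_for(line_count):
    """Return the legend label for a file with the given number of lines."""
    for label, low, high in BUCKETS:
        if line_count >= low and (high is None or line_count <= high):
            return label
    raise ValueError(f"no bucket for line count {line_count}")


def size_distribution(line_counts):
    """Return each bucket's share as a whole percentage.

    Assumption: shares are weighted by lines of code, so one long file
    counts for more than one short file. Empty files are skipped.
    """
    totals = {label: 0 for label, _, _ in BUCKETS}
    for count in line_counts:
        if count < 1:
            continue
        totals[bucket_for(count)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0 for label in totals}
    return {label: round(100 * lines / grand_total) for label, lines in totals.items()}
```

For example, `size_distribution([563, 341, 96])` places 56% of the lines in the 501-1000 bucket. Because each share is rounded independently, the five percentages need not add up to exactly 100.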


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
rb | 0% | 13% | 25% | 27% | 33%
yaml | 0% | 0% | 0% | 0% | 100%
xml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
spec | 0% | 15% | 26% | 32% | 25%
lib | 0% | 10% | 24% | 20% | 44%
script | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Folder | # lines | # units
coordinator_spec.rb | spec/lib/crawler | 708 | 8
elasticsearch_spec.rb | spec/lib/crawler/output_sink | 594 | -
coordinator.rb | lib/crawler | 563 | 33
client_spec.rb | spec/lib/es | 480 | 1
http_executor_spec.rb | spec/lib/crawler | 366 | -
html_spec.rb | spec/lib/crawler/data/crawl_result | 365 | -
config.rb | lib/crawler/api | 341 | 23
http_client_spec.rb | spec/lib/crawler | 318 | 7
http_executor.rb | lib/crawler | 286 | 17
event_generator.rb | lib/crawler | 279 | 31
http_client.rb | lib/crawler | 276 | 22
elasticsearch.rb | lib/crawler/output_sink | 222 | 22
document_mapper_spec.rb | spec/lib/crawler | 219 | -
config_spec.rb | spec/lib/crawler/api | 218 | 1
transformer_spec.rb | spec/lib/crawler/content_engine | 213 | 1
client.rb | lib/es | 198 | 15
helpers_spec.rb | spec/lib/crawler/cli | 196 | -
sitemap_spec.rb | spec/lib/crawler/data/crawl_result | 194 | -
extractor_spec.rb | spec/lib/crawler/content_engine | 188 | -
crawl.rb | lib/crawler/api | 185 | 17
crawl_spec.rb | spec/lib/crawler/api | 183 | -
base_spec.rb | spec/lib/crawler/rule_engine | 182 | -
robots_txt_spec.rb | spec/integration | 169 | -
url_request_check_spec.rb | spec/lib/crawler/url_validator | 165 | -
rule_spec.rb | spec/lib/crawler/data/extraction | 162 | -
faux_crawl.rb | spec/support/faux | 153 | 14
html.rb | lib/crawler/data/crawl_result | 153 | 22
event_generator_spec.rb | spec/lib/crawler | 150 | -
exceptions.rb | lib/crawler/http_utils | 150 | 28
url_content_check_spec.rb | spec/lib/crawler/url_validator | 148 | -
response.rb | lib/crawler/http_utils | 136 | 21
robots_txt_parser_spec.rb | spec/lib/crawler | 133 | -
url_validator_spec.rb | spec/lib/crawler | 132 | 1
bulk_queue_spec.rb | spec/lib/es | 124 | -
url_validator.rb | lib/crawler | 124 | 15
url_request_check_concern.rb | lib/crawler/url_validator | 122 | 3
link_spec.rb | spec/lib/crawler/data | 119 | -
bad_ssl_spec.rb | spec/lib/crawler/http_utils | 109 | 1
crawl_results.rb | spec/factories | 104 | -
robots_txt_check_spec.rb | spec/lib/crawler/url_validator | 104 | -
config.rb | lib/crawler/http_utils | 104 | 20
rule.rb | lib/crawler/data/extraction | 98 | 9
rule_engine_outcome.rb | lib/crawler/data | 90 | 19
document_mapper.rb | lib/crawler | 88 | 11
tcp_check_spec.rb | spec/lib/crawler/url_validator | 86 | -
stats_spec.rb | spec/lib/crawler | 85 | -
third_party.rb | script/licenses/lib | 79 | 1
headers_spec.rb | spec/integration | 75 | 3
url_check_spec.rb | spec/lib/crawler/url_validator | 74 | -
73 2
Files With Most Units (Top 50)
File | Folder | # lines | # units
coordinator.rb | lib/crawler | 563 | 33
event_generator.rb | lib/crawler | 279 | 31
exceptions.rb | lib/crawler/http_utils | 150 | 28
config.rb | lib/crawler/api | 341 | 23
http_client.rb | lib/crawler | 276 | 22
html.rb | lib/crawler/data/crawl_result | 153 | 22
elasticsearch.rb | lib/crawler/output_sink | 222 | 22
response.rb | lib/crawler/http_utils | 136 | 21
config.rb | lib/crawler/http_utils | 104 | 20
rule_engine_outcome.rb | lib/crawler/data | 90 | 19
http_executor.rb | lib/crawler | 286 | 17
crawl.rb | lib/crawler/api | 185 | 17
url_validator.rb | lib/crawler | 124 | 15
client.rb | lib/es | 198 | 15
faux_crawl.rb | spec/support/faux | 153 | 14
base.rb | lib/crawler/data/crawl_result | 69 | 14
base.rb | lib/crawler/data/url_queue | 55 | 13
stats.rb | lib/crawler | 67 | 12
crawl_task.rb | lib/crawler/data | 61 | 12
robots_txt_service.rb | lib/crawler | 72 | 12
logger.rb | lib/crawler/logging | 44 | 11
link.rb | lib/crawler/data | 60 | 11
document_mapper.rb | lib/crawler | 88 | 11
base.rb | lib/crawler/output_sink | 41 | 11
robots_txt_parser.rb | lib/crawler | 54 | 10
url.rb | lib/crawler/data | 46 | 9
rule.rb | lib/crawler/data/extraction | 98 | 9
coordinator_spec.rb | spec/lib/crawler | 708 | 8
bulk_queue.rb | lib/es | 70 | 8
http_client_spec.rb | spec/lib/crawler | 318 | 7
seen_urls.rb | lib/crawler/data | 28 | 7
base.rb | lib/crawler/rule_engine | 68 | 7
filtering_dns_resolver.rb | lib/crawler/http_utils | 51 | 7
domain.rb | lib/crawler/data | 33 | 6
file.rb | lib/crawler/logging/handler | 50 | 5
stdout.rb | lib/crawler/logging/handler | 49 | 5
extractor.rb | lib/crawler/content_engine | 49 | 5
sitemap.rb | lib/crawler/data/crawl_result | 50 | 5
redirect.rb | lib/crawler/data/crawl_result | 37 | 5
ruleset.rb | lib/crawler/data/extraction | 52 | 5
memory_only.rb | lib/crawler/data/url_queue | 52 | 5
helpers.rb | lib/crawler/cli | 43 | 5
results_collection.rb | spec/support/faux | 17 | 4
fixtures.rb | spec/support | 17 | 4
string_colors.rb | script/support | 14 | 4
content_extractable_file.rb | lib/crawler/data/crawl_result | 26 | 4
mock_executor.rb | lib/crawler | 23 | 4
request_timeout_spec.rb | spec/integration/timeouts | 65 | 3
headers_spec.rb | spec/integration | 75 | 3
utils.rb | lib/crawler | 22 | 3
Files With Long Lines (Top 7)

There are 7 files with lines longer than 120 characters. In total, there are 11 long lines.
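
A check like this can be approximated with a few lines of code. The sketch below counts, per file, the lines longer than 120 characters and lists the files that have at least one, most affected first. How the report's tool measures line length (for instance, how it treats tabs or trailing whitespace) is not stated, so counting raw characters per line is an assumption here, and the names `count_long_lines` and `long_line_report` are hypothetical.

```python
# Threshold taken from the report: lines longer than 120 characters.
MAX_LINE_LENGTH = 120


def count_long_lines(text, limit=MAX_LINE_LENGTH):
    """Count lines in text whose length exceeds limit (line endings excluded)."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(files, limit=MAX_LINE_LENGTH):
    """Given a mapping of path -> file contents, return (path, count) pairs
    for files with at least one long line, highest count first."""
    counts = {path: count_long_lines(text, limit) for path, text in files.items()}
    offenders = [(path, count) for path, count in counts.items() if count > 0]
    # Sort by descending count, then by path for a stable order.
    return sorted(offenders, key=lambda item: (-item[1], item[0]))
```

Summing the counts in the result gives the total number of long lines, and the length of the result gives the number of affected files.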

File | Folder | # lines | # units | # long lines
html_spec.rb | spec/lib/crawler/data/crawl_result | 365 | - | 3
base_spec.rb | spec/lib/crawler/rule_engine | 182 | - | 2
coordinator.rb | lib/crawler | 563 | 33 | 2
sitemap_xxe_spec.rb | spec/integration | 69 | 2 | 1
redirect.rb | lib/crawler/data/crawl_result | 37 | 5 | 1
crawl_task.rb | lib/crawler/data | 61 | 12 | 1
url_request_check_concern.rb | lib/crawler/url_validator | 122 | 3 | 1