duplicated block id: 1
size: 40 cleaned lines of code in 2 files:
- lib/crawler/logging/handler/file.rb (23:69)
- lib/crawler/logging/handler/stdout.rb (21:67)
duplicated block id: 2
size: 18 cleaned lines of code in 2 files:
- spec/lib/crawler/cli/crawl_spec.rb (9:30)
- spec/lib/crawler/cli/validate_spec.rb (9:30)
duplicated block id: 3
size: 18 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (75:98)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (207:230)
duplicated block id: 4
size: 17 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (523:539)
- spec/lib/es/client_spec.rb (479:496)
duplicated block id: 5
size: 16 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (250:266)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (355:371)
duplicated block id: 6
size: 15 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (375:389)
- spec/lib/crawler/http_executor_spec.rb (424:438)
duplicated block id: 7
size: 14 cleaned lines of code in 2 files:
- lib/crawler/http_executor.rb (296:309)
- lib/crawler/http_executor.rb (315:328)
duplicated block id: 8
size: 14 cleaned lines of code in 2 files:
- spec/lib/crawler/document_mapper_spec.rb (74:89)
- spec/lib/crawler/document_mapper_spec.rb (197:212)
duplicated block id: 9
size: 13 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (196:208)
- spec/lib/crawler/coordinator_spec.rb (260:272)
duplicated block id: 10
size: 13 cleaned lines of code in 2 files:
- spec/lib/crawler/http_client_spec.rb (341:355)
- spec/lib/crawler/http_client_spec.rb (363:377)
duplicated block id: 11
size: 12 cleaned lines of code in 2 files:
- lib/crawler/http_executor.rb (317:328)
- lib/crawler/http_executor.rb (340:351)
duplicated block id: 12
size: 12 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/html_spec.rb (14:26)
- spec/lib/crawler/data/crawl_result_spec.rb (13:25)
duplicated block id: 13
size: 12 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:22)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:313)
duplicated block id: 14
size: 12 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:289)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:313)
duplicated block id: 15
size: 12 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:22)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:289)
duplicated block id: 16
size: 12 cleaned lines of code in 2 files:
- lib/crawler/http_executor.rb (298:309)
- lib/crawler/http_executor.rb (340:351)
duplicated block id: 17
size: 11 cleaned lines of code in 2 files:
- spec/integration/sitemap_xxe_spec.rb (50:62)
- spec/integration/sitemap_xxe_spec.rb (75:87)
duplicated block id: 18
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (82:94)
- spec/integration/robots_txt_spec.rb (125:137)
duplicated block id: 19
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (82:94)
- spec/integration/robots_txt_spec.rb (164:176)
duplicated block id: 20
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (164:176)
- spec/integration/robots_txt_spec.rb (195:207)
duplicated block id: 21
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (82:94)
- spec/integration/robots_txt_spec.rb (195:207)
duplicated block id: 22
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (125:137)
- spec/integration/robots_txt_spec.rb (164:176)
duplicated block id: 23
size: 11 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (125:137)
- spec/integration/robots_txt_spec.rb (195:207)
duplicated block id: 24
size: 11 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (395:407)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (494:507)
duplicated block id: 25
size: 11 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (87:98)
- spec/factories/crawl_results.rb (101:112)
duplicated block id: 26
size: 10 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (511:520)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (561:570)
duplicated block id: 27
size: 10 cleaned lines of code in 2 files:
- spec/lib/crawler/document_mapper_spec.rb (95:105)
- spec/lib/crawler/document_mapper_spec.rb (218:228)
duplicated block id: 28
size: 10 cleaned lines of code in 2 files:
- spec/lib/crawler/cli/crawl_spec.rb (79:90)
- spec/lib/crawler/cli/schedule_spec.rb (63:74)
duplicated block id: 29
size: 10 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (44:54)
- spec/factories/crawl_results.rb (58:68)
duplicated block id: 30
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (254:262)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:333)
duplicated block id: 31
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/event_generator_spec.rb (97:105)
- spec/lib/crawler/event_generator_spec.rb (114:122)
duplicated block id: 32
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:333)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (359:367)
duplicated block id: 33
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (257:266)
- spec/lib/crawler/coordinator_spec.rb (277:286)
duplicated block id: 34
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/robots_txt_parser_spec.rb (22:32)
- spec/lib/crawler/robots_txt_parser_spec.rb (39:49)
duplicated block id: 35
size: 9 cleaned lines of code in 2 files:
- spec/integration/headers_spec.rb (64:73)
- spec/integration/headers_spec.rb (81:90)
duplicated block id: 36
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:238)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (254:262)
duplicated block id: 37
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:238)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (359:367)
duplicated block id: 38
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (50:61)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (182:193)
duplicated block id: 39
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/http_client_spec.rb (160:168)
- spec/lib/crawler/http_client_spec.rb (199:207)
duplicated block id: 40
size: 9 cleaned lines of code in 2 files:
- lib/crawler/output_sink/elasticsearch.rb (128:136)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (564:572)
duplicated block id: 41
size: 9 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (32:42)
- spec/integration/robots_txt_spec.rb (57:67)
duplicated block id: 42
size: 9 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:238)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:333)
duplicated block id: 43
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (233:241)
- spec/lib/crawler/http_executor_spec.rb (288:296)
duplicated block id: 44
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (18:25)
- spec/lib/crawler/output_sink_spec.rb (37:44)
duplicated block id: 45
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (213:220)
- spec/lib/crawler/coordinator_spec.rb (230:237)
duplicated block id: 46
size: 8 cleaned lines of code in 2 files:
- lib/crawler/output_sink/elasticsearch.rb (105:112)
- lib/crawler/output_sink/elasticsearch.rb (127:134)
duplicated block id: 47
size: 8 cleaned lines of code in 2 files:
- spec/lib/es/client_spec.rb (350:357)
- spec/lib/es/client_spec.rb (375:382)
duplicated block id: 48
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (131:139)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (164:172)
duplicated block id: 49
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (131:139)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (147:155)
duplicated block id: 50
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (260:267)
- spec/lib/crawler/coordinator_spec.rb (386:393)
duplicated block id: 51
size: 8 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (15:22)
- spec/factories/crawl_results.rb (47:54)
duplicated block id: 52
size: 8 cleaned lines of code in 2 files:
- spec/integration/content_extraction_spec.rb (18:27)
- spec/integration/response_content_type_spec.rb (17:26)
duplicated block id: 53
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (254:261)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:309)
duplicated block id: 54
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (213:220)
- spec/lib/crawler/coordinator_spec.rb (280:287)
duplicated block id: 55
size: 8 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (15:22)
- spec/factories/crawl_results.rb (61:68)
duplicated block id: 56
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (196:203)
- spec/lib/crawler/coordinator_spec.rb (386:393)
duplicated block id: 57
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (254:261)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:285)
duplicated block id: 58
size: 8 cleaned lines of code in 2 files:
- spec/integration/sitemap_xxe_spec.rb (40:48)
- spec/integration/sitemap_xxe_spec.rb (65:73)
duplicated block id: 59
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:18)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:332)
duplicated block id: 60
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:237)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:309)
duplicated block id: 61
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (67:74)
- spec/lib/crawler/coordinator_spec.rb (122:129)
duplicated block id: 62
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:285)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (359:366)
duplicated block id: 63
size: 8 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (105:112)
- spec/factories/crawl_results.rb (119:126)
duplicated block id: 64
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:237)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:285)
duplicated block id: 65
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:18)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (230:237)
duplicated block id: 66
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:309)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:332)
duplicated block id: 67
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/url_validator/tcp_check_spec.rb (79:87)
- spec/lib/crawler/url_validator/tcp_check_spec.rb (95:103)
duplicated block id: 68
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (230:237)
- spec/lib/crawler/coordinator_spec.rb (280:287)
duplicated block id: 69
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (278:285)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (325:332)
duplicated block id: 70
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:18)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (359:366)
duplicated block id: 71
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (302:309)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (359:366)
duplicated block id: 72
size: 8 cleaned lines of code in 2 files:
- spec/factories/crawl_results.rb (91:98)
- spec/factories/crawl_results.rb (119:126)
duplicated block id: 73
size: 8 cleaned lines of code in 2 files:
- lib/crawler/output_sink/elasticsearch.rb (106:113)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (514:521)
duplicated block id: 74
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/cli/crawl_spec.rb (51:59)
- spec/lib/crawler/cli/urltest_spec.rb (52:60)
duplicated block id: 75
size: 8 cleaned lines of code in 2 files:
- lib/crawler/http_client.rb (97:105)
- lib/crawler/http_client.rb (134:142)
duplicated block id: 76
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (147:155)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (164:172)
duplicated block id: 77
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/url_validator/tcp_check_spec.rb (63:71)
- spec/lib/crawler/url_validator/tcp_check_spec.rb (95:103)
duplicated block id: 78
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/url_validator/tcp_check_spec.rb (63:71)
- spec/lib/crawler/url_validator/tcp_check_spec.rb (79:87)
duplicated block id: 79
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (11:18)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (254:261)
duplicated block id: 80
size: 8 cleaned lines of code in 2 files:
- spec/lib/crawler/http_utils/bad_ssl_spec.rb (149:159)
- spec/lib/crawler/http_utils/bad_ssl_spec.rb (158:168)
duplicated block id: 81
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (196:202)
- spec/lib/crawler/coordinator_spec.rb (280:286)
duplicated block id: 82
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (128:135)
- spec/lib/crawler/api/crawl_spec.rb (154:161)
duplicated block id: 83
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (662:670)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (692:700)
duplicated block id: 84
size: 7 cleaned lines of code in 2 files:
- spec/lib/es/client_spec.rb (72:78)
- spec/lib/es/client_spec.rb (94:100)
duplicated block id: 85
size: 7 cleaned lines of code in 2 files:
- lib/crawler/output_sink/elasticsearch.rb (128:134)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (514:520)
duplicated block id: 86
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (196:202)
- spec/lib/crawler/coordinator_spec.rb (213:219)
duplicated block id: 87
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (196:202)
- spec/lib/crawler/coordinator_spec.rb (230:236)
duplicated block id: 88
size: 7 cleaned lines of code in 2 files:
- lib/crawler/api/crawl.rb (95:101)
- lib/crawler/api/crawl.rb (128:134)
duplicated block id: 89
size: 7 cleaned lines of code in 2 files:
- spec/integration/headers_spec.rb (49:56)
- spec/integration/headers_spec.rb (83:90)
duplicated block id: 90
size: 7 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (62:68)
- spec/integration/robots_txt_spec.rb (181:187)
duplicated block id: 91
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (106:113)
- spec/lib/crawler/api/crawl_spec.rb (153:160)
duplicated block id: 92
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (108:115)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (240:247)
duplicated block id: 93
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (213:219)
- spec/lib/crawler/coordinator_spec.rb (260:266)
duplicated block id: 94
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (280:286)
- spec/lib/crawler/coordinator_spec.rb (386:392)
duplicated block id: 95
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (213:219)
- spec/lib/crawler/coordinator_spec.rb (386:392)
duplicated block id: 96
size: 7 cleaned lines of code in 2 files:
- spec/integration/headers_spec.rb (49:56)
- spec/integration/headers_spec.rb (66:73)
duplicated block id: 97
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/data/link_spec.rb (61:68)
- spec/lib/crawler/data/link_spec.rb (86:93)
duplicated block id: 98
size: 7 cleaned lines of code in 2 files:
- lib/crawler/data/crawl_result/http_auth_disallowed_error.rb (24:30)
- lib/crawler/data/crawl_result/unsupported_content_type.rb (25:31)
duplicated block id: 99
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (230:236)
- spec/lib/crawler/coordinator_spec.rb (386:392)
duplicated block id: 100
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (20:26)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (307:313)
duplicated block id: 101
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/http_client_spec.rb (360:367)
- spec/lib/crawler/http_client_spec.rb (383:390)
duplicated block id: 102
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (230:236)
- spec/lib/crawler/coordinator_spec.rb (260:266)
duplicated block id: 103
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (48:54)
- spec/lib/crawler/http_utils/bad_ssl_spec.rb (16:22)
duplicated block id: 104
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (131:138)
- spec/lib/crawler/coordinator_spec.rb (151:158)
duplicated block id: 105
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/content_engine/transformer_spec.rb (196:203)
- spec/lib/crawler/content_engine/transformer_spec.rb (213:220)
duplicated block id: 106
size: 7 cleaned lines of code in 2 files:
- lib/crawler/event_generator.rb (247:255)
- lib/crawler/event_generator.rb (264:272)
duplicated block id: 107
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (20:26)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (283:289)
duplicated block id: 108
size: 7 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (20:26)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (16:22)
duplicated block id: 109
size: 7 cleaned lines of code in 2 files:
- lib/crawler/output_sink/elasticsearch.rb (106:112)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (564:570)
duplicated block id: 110
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/transformer.rb (50:55)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (244:249)
duplicated block id: 111
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/transformer.rb (50:55)
- lib/crawler/content_engine/utils.rb (104:109)
duplicated block id: 112
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/url_validator/crawl_rules_check_spec.rb (48:53)
- spec/lib/crawler/url_validator/crawl_rules_check_spec.rb (59:64)
duplicated block id: 113
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (158:164)
- spec/lib/crawler/rule_engine/base_spec.rb (171:177)
duplicated block id: 114
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/cli/schedule_spec.rb (68:74)
- spec/lib/crawler/cli/validate_spec.rb (70:76)
duplicated block id: 115
size: 6 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (37:42)
- spec/integration/robots_txt_spec.rb (181:186)
duplicated block id: 116
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/transformer.rb (50:55)
- spec/lib/crawler/http_client_spec.rb (219:224)
duplicated block id: 117
size: 6 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (63:68)
- spec/integration/robots_txt_spec.rb (149:154)
duplicated block id: 118
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (602:609)
- spec/lib/crawler/coordinator_spec.rb (620:627)
duplicated block id: 119
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/url_validator/url_content_check_spec.rb (151:156)
- spec/lib/crawler/url_validator/url_content_check_spec.rb (168:173)
duplicated block id: 120
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/data/url_queue_spec.rb (10:16)
- spec/lib/crawler/event_generator_spec.rb (10:15)
duplicated block id: 121
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/content_engine/extractor_spec.rb (111:116)
- spec/lib/crawler/content_engine/extractor_spec.rb (192:197)
duplicated block id: 122
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (307:312)
- spec/lib/crawler/output_sink_spec.rb (39:44)
duplicated block id: 123
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (62:67)
- spec/lib/crawler/http_executor_spec.rb (380:385)
duplicated block id: 124
size: 6 cleaned lines of code in 2 files:
- spec/integration/robots_txt_spec.rb (149:154)
- spec/integration/robots_txt_spec.rb (182:187)
duplicated block id: 125
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (158:164)
- spec/lib/crawler/rule_engine/base_spec.rb (184:190)
duplicated block id: 126
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (338:343)
- spec/lib/crawler/http_executor_spec.rb (380:385)
duplicated block id: 127
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (158:164)
- spec/lib/crawler/rule_engine/base_spec.rb (198:204)
duplicated block id: 128
size: 6 cleaned lines of code in 2 files:
- spec/lib/es/client_spec.rb (50:55)
- spec/lib/es/client_spec.rb (106:111)
duplicated block id: 129
size: 6 cleaned lines of code in 2 files:
- lib/crawler/url_validator.rb (18:23)
- lib/crawler/url_validator.rb (32:37)
duplicated block id: 130
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (184:190)
- spec/lib/crawler/rule_engine/base_spec.rb (198:204)
duplicated block id: 131
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (78:83)
- spec/lib/crawler/http_executor_spec.rb (176:181)
duplicated block id: 132
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (49:54)
- spec/lib/crawler/http_utils/config_spec.rb (12:17)
duplicated block id: 133
size: 6 cleaned lines of code in 2 files:
- lib/crawler/http_header_service.rb (22:27)
- lib/crawler/http_header_service.rb (42:47)
duplicated block id: 134
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (62:67)
- spec/lib/crawler/http_executor_spec.rb (338:343)
duplicated block id: 135
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/content_engine/transformer_spec.rb (106:111)
- spec/lib/crawler/content_engine/transformer_spec.rb (126:131)
duplicated block id: 136
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/utils.rb (104:109)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (244:249)
duplicated block id: 137
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/content_engine/extractor_spec.rb (59:64)
- spec/lib/crawler/content_engine/extractor_spec.rb (136:141)
duplicated block id: 138
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/cli/crawl_spec.rb (84:90)
- spec/lib/crawler/cli/validate_spec.rb (70:76)
duplicated block id: 139
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/utils.rb (104:109)
- lib/crawler/data/extraction/rule.rb (122:127)
duplicated block id: 140
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (16:21)
- spec/lib/crawler/output_sink_spec.rb (39:44)
duplicated block id: 141
size: 6 cleaned lines of code in 2 files:
- spec/integration/legacy_sitemaps_spec.rb (28:34)
- spec/integration/legacy_sitemaps_spec.rb (57:63)
duplicated block id: 142
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (283:288)
- spec/lib/crawler/output_sink_spec.rb (39:44)
duplicated block id: 143
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/api/crawl_spec.rb (107:113)
- spec/lib/crawler/api/crawl_spec.rb (128:134)
duplicated block id: 144
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (244:249)
- spec/lib/crawler/http_client_spec.rb (219:224)
duplicated block id: 145
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (75:81)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (160:166)
duplicated block id: 146
size: 6 cleaned lines of code in 2 files:
- lib/crawler/data/extraction/rule.rb (122:127)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (244:249)
duplicated block id: 147
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (631:637)
- spec/lib/crawler/output_sink/elasticsearch_spec.rb (647:653)
duplicated block id: 148
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (171:177)
- spec/lib/crawler/rule_engine/base_spec.rb (198:204)
duplicated block id: 149
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/transformer.rb (50:55)
- lib/crawler/data/extraction/rule.rb (122:127)
duplicated block id: 150
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (338:343)
- spec/lib/crawler/http_executor_spec.rb (429:434)
duplicated block id: 151
size: 6 cleaned lines of code in 2 files:
- lib/crawler/content_engine/utils.rb (104:109)
- spec/lib/crawler/http_client_spec.rb (219:224)
duplicated block id: 152
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (585:592)
- spec/lib/crawler/coordinator_spec.rb (602:609)
duplicated block id: 153
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_executor_spec.rb (62:67)
- spec/lib/crawler/http_executor_spec.rb (429:434)
duplicated block id: 154
size: 6 cleaned lines of code in 2 files:
- lib/crawler/data/extraction/rule.rb (122:127)
- spec/lib/crawler/http_client_spec.rb (219:224)
duplicated block id: 155
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/robots_txt_parser_spec.rb (136:142)
- spec/lib/crawler/robots_txt_parser_spec.rb (153:159)
duplicated block id: 156
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/rule_engine/base_spec.rb (171:177)
- spec/lib/crawler/rule_engine/base_spec.rb (184:190)
duplicated block id: 157
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (160:166)
- spec/lib/crawler/data/crawl_result/sitemap_spec.rb (207:213)
duplicated block id: 158
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/coordinator_spec.rb (585:592)
- spec/lib/crawler/coordinator_spec.rb (620:627)
duplicated block id: 159
size: 6 cleaned lines of code in 2 files:
- spec/lib/crawler/http_utils/bad_ssl_spec.rb (17:22)
- spec/lib/crawler/http_utils/config_spec.rb (12:17)