apache / nutch
Duplication

Places in code with 6 or more lines that are exactly the same.
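
As a rough illustration of the rule above, the following Python sketch finds blocks of at least 6 consecutive identical lines shared between two locations. It is not the report tool's actual algorithm: the function name `find_duplicates`, the verbatim line comparison, and the sample file names are assumptions made for this example, and the real analysis may normalize lines (for example, by trimming whitespace or skipping comments) before comparing them.

```python
from collections import defaultdict
from itertools import combinations

MIN_LINES = 6  # threshold stated in the report


def find_duplicates(files, min_lines=MIN_LINES):
    """Return duplicated blocks as (size, (file_a, start_a), (file_b, start_b)).

    `files` maps a file name to its list of lines. Start lines are 1-based.
    Lines are compared verbatim (an assumption of this sketch), and
    overlapping matches inside a single file are not treated specially.
    """
    # Index every window of `min_lines` consecutive lines by its content.
    windows = defaultdict(list)
    for name, lines in files.items():
        for i in range(len(lines) - min_lines + 1):
            windows[tuple(lines[i:i + min_lines])].append((name, i))

    found = []
    for locations in windows.values():
        for (fa, i), (fb, j) in combinations(locations, 2):
            la, lb = files[fa], files[fb]
            # Skip pairs that only continue a block starting one line earlier,
            # so each maximal block is reported once.
            if i > 0 and j > 0 and la[i - 1] == lb[j - 1]:
                continue
            # Extend the match forward while the lines stay identical.
            size = min_lines
            while (i + size < len(la) and j + size < len(lb)
                   and la[i + size] == lb[j + size]):
                size += 1
            found.append((size, (fa, i + 1), (fb, j + 1)))
    # Longest duplicates first, as in the report's listing.
    return sorted(found, key=lambda d: d[0], reverse=True)


if __name__ == "__main__":
    shared = [f"line {k}" for k in range(8)]
    sample = {
        "A.java": shared + ["only in A"],      # hypothetical file
        "B.java": ["only in B"] + shared,      # hypothetical file
    }
    for size, a, b in find_duplicates(sample):
        print(f"{size} x 2  {a[0]}:{a[1]}  {b[0]}:{b[1]}")
```

Indexing fixed-size windows keeps the search simple: only windows with identical content are ever compared, and each match is then grown to its full length.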

Duplication Overall
system: 15% (6,269 lines)
Duplication per Extension
java: 14% (4,902 lines)
xml: 21% (1,325 lines)
rss: 100% (30 lines)
xsd: 10% (12 lines)
Duplication per Component (primary)
src: 16% (6,257 lines)
conf: <1% (12 lines)
ivy: 0% (0 lines)
ROOT: 0% (0 lines)
Longest Duplicates
The list of 50 longest duplicates.
The analysis found 3,467 duplicates in total.
Size | # | Folders | Lines
140 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 134:417 (59%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 132:415 (59%)
136 | 2 | src/plugin/protocol-inte...col/interactiveselenium | 68:274 (44%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 63:269 (50%)
74 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 350:774 (51%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 359:776 (47%)
73 | 2 | src/plugin/protocol-inte...col/interactiveselenium | 416:530 (23%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 366:480 (26%)
58 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 86:170 (17%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 81:164 (19%)
58 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 86:170 (17%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 76:159 (21%)
36 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 537:595 (10%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 422:480 (13%)
36 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 537:595 (10%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 472:530 (11%)
34 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 478:524 (10%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 366:412 (12%)
34 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 478:524 (10%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 416:462 (11%)
33 | 2 | src/java/org/apache/nutch/crawl | 322:360 (17%)
 | | src/java/org/apache/nutch/crawl | 189:233 (31%)
31 | 2 | src/java/org/apache/nutch/tools/warc | 306:341 (9%)
 | | src/java/org/apache/nutch/tools/warc | 366:401 (9%)
29 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 179:218 (7%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 181:219 (9%)
29 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 179:218 (7%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 176:214 (10%)
28 | 2 | src/plugin/index-geoip/s...che/nutch/indexer/geoip | 84:118 (21%)
 | | src/plugin/index-geoip/s...che/nutch/indexer/geoip | 199:234 (21%)
28 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 450:489 (11%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 417:455 (11%)
28 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 125:266 (19%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 132:271 (17%)
27 | 2 | src/plugin/indexer-elast...tch/indexwriter/elastic | 282:332 (13%)
 | | src/plugin/indexer-opens...ndexwriter/opensearch1x | 347:397 (10%)
25 | 2 | src/plugin/protocol-inte...col/interactiveselenium | 32:95 (100%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 32:95 (100%)
25 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 32:95 (100%)
 | | src/plugin/protocol-http...tch/protocol/httpclient | 32:95 (100%)
25 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 32:95 (100%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 32:95 (100%)
25 | 2 | src/plugin/protocol-http...tch/protocol/httpclient | 32:95 (100%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 32:95 (100%)
25 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 32:95 (100%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 32:95 (100%)
25 | 2 | src/java/org/apache/nutch/crawl | 220:250 (2%)
 | | src/java/org/apache/nutch/crawl | 303:333 (2%)
25 | 2 | src/plugin/protocol-http...tch/protocol/httpclient | 32:95 (100%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 32:95 (100%)
24 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 360:402 (7%)
 | | src/plugin/protocol-http...che/nutch/protocol/http | 431:474 (6%)
23 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 51:102 (85%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 44:95 (92%)
23 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 174:206 (7%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 162:193 (8%)
23 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 44:95 (92%)
 | | src/plugin/protocol-http...che/nutch/protocol/http | 51:102 (85%)
23 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 51:102 (85%)
 | | src/plugin/protocol-http...tch/protocol/httpclient | 44:95 (92%)
23 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 51:102 (85%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 44:95 (92%)
23 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 271:349 (15%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 279:357 (14%)
23 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 174:206 (7%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 167:198 (7%)
21 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 105:131 (5%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 105:131 (6%)
21 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 73:101 (8%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 75:103 (8%)
21 | 2 | src/plugin/indexer-elast...tch/indexwriter/elastic | 251:278 (10%)
 | | src/plugin/indexer-opens...ndexwriter/opensearch1x | 316:343 (7%)
21 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 111:137 (6%)
 | | src/plugin/protocol-http...che/nutch/protocol/http | 105:131 (5%)
21 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 105:131 (5%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 100:126 (7%)
20 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 294:330 (6%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 323:358 (6%)
20 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 183:212 (18%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 164:193 (15%)
19 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 295:330 (5%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 318:352 (7%)
19 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 405:436 (5%)
 | | src/plugin/protocol-http...che/nutch/protocol/http | 477:508 (4%)
19 | 2 | src/plugin/protocol-inte...col/interactiveselenium | 324:358 (6%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 318:352 (7%)
19 | 2 | src/java/org/apache/nutch/service/impl | 52:79 (19%)
 | | src/java/org/apache/nutch/service/impl | 52:80 (18%)
18 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 338:372 (4%)
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 319:352 (6%)
18 | 2 | src/plugin/protocol-http...che/nutch/protocol/http | 338:372 (4%)
 | | src/plugin/protocol-inte...col/interactiveselenium | 325:358 (5%)
18 | 2 | src/plugin/protocol-html...nutch/protocol/htmlunit | 296:330 (5%)
 | | src/plugin/protocol-http...che/nutch/protocol/http | 338:372 (4%)
17 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 33:72 (15%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 35:74 (13%)
17 | 2 | src/plugin/parse-html/sr...apache/nutch/parse/html | 147:177 (15%)
 | | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 216:246 (13%)
17 | 2 | src/plugin/urlfilter-dom.../nutch/urlfilter/domain | 134:158 (28%)
 | | src/plugin/urlfilter-dom...rlfilter/domaindenylist | 135:159 (28%)
Duplicated Units
The list of top 23 duplicated units.
Size | # | Folders | Lines
43 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 152:201
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 154:203
36 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 328:377
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 330:379
24 | 3 | src/plugin/protocol-inte...col/interactiveselenium | 498:525
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 448:475
 | | src/plugin/protocol-html...nutch/protocol/htmlunit | 563:590
24 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 141:174
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 136:169
21 | 3 | src/plugin/protocol-inte...col/interactiveselenium | 439:463
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 389:413
 | | src/plugin/protocol-html...nutch/protocol/htmlunit | 501:525
21 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 284:315
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 286:317
17 | 2 | src/plugin/protocol-inte...col/interactiveselenium | 466:496
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 416:446
17 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 408:429
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 398:419
16 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 254:277
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 256:279
14 | 3 | src/plugin/protocol-inte...col/interactiveselenium | 416:437
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 366:387
 | | src/plugin/protocol-html...nutch/protocol/htmlunit | 478:499
13 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 211:226
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 213:228
10 | 5 | src/plugin/protocol-http...tch/protocol/httpclient | 44:55
 | | src/plugin/protocol-inte...col/interactiveselenium | 44:55
 | | src/plugin/protocol-sele...nutch/protocol/selenium | 44:55
 | | src/plugin/protocol-http...che/nutch/protocol/http | 51:62
 | | src/plugin/protocol-html...nutch/protocol/htmlunit | 44:55
10 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 645:659
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 643:657
8 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 443:454
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 435:446
8 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 98:110
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 98:110
7 | 2 | src/plugin/indexer-opens...ndexwriter/opensearch1x | 376:386
 | | src/plugin/indexer-elast...tch/indexwriter/elastic | 311:321
7 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 235:243
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 237:245
7 | 2 | src/plugin/urlfilter-dom.../nutch/urlfilter/domain | 87:98
 | | src/plugin/urlfilter-dom...rlfilter/domaindenylist | 87:98
6 | 3 | src/java/org/apache/nutch/util | 219:228
 | | src/java/org/apache/nutch/util | 233:242
 | | src/java/org/apache/nutch/util | 154:163
6 | 3 | src/java/org/apache/nutch/util | 234:242
 | | src/java/org/apache/nutch/util | 248:256
 | | src/java/org/apache/nutch/util | 169:177
6 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 317:324
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 319:326
6 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 60:70
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 60:70
6 | 2 | src/plugin/parse-tika/sr...apache/nutch/parse/tika | 79:89
 | | src/plugin/parse-html/sr...apache/nutch/parse/html | 79:89