duplicated block id: 1 size: 105 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (107:260)
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (109:262)

duplicated block id: 2 size: 73 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (107:215)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (124:232)

duplicated block id: 3 size: 73 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (109:217)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (124:232)

duplicated block id: 4 size: 61 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (116:204)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (116:205)

duplicated block id: 5 size: 51 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (295:370)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (296:371)

duplicated block id: 6 size: 51 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (296:371)
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (298:373)

duplicated block id: 7 size: 40 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (316:371)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (345:400)

duplicated block id: 8 size: 40 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (318:373)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (345:400)

duplicated block id: 9 size: 37 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (116:166)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (114:164)

duplicated block id: 10 size: 37 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (114:164)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (114:164)

duplicated block id: 11 size: 37 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (114:164)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (131:181)

duplicated block id: 12 size: 36 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (118:166)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (116:163)

duplicated block id: 13 size: 36 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (116:164)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (116:163)

duplicated block id: 14 size: 36 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (116:163)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (133:181)

duplicated block id: 15 size: 31 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (219:262)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (233:276)

duplicated block id: 16 size: 31 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (217:260)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (233:276)

duplicated block id: 17 size: 29 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (52:98)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (53:99)

duplicated block id: 18 size: 29 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (46:93)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (62:109)

duplicated block id: 19 size: 29 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (46:93)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (62:109)

duplicated block id: 20 size: 29 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (46:93)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (46:93)

duplicated block id: 21 size: 29 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (46:93)
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (46:93)

duplicated block id: 22 size: 27 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (221:259)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (222:260)

duplicated block id: 23 size: 24 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (109:150)
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (97:138)

duplicated block id: 24 size: 22 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingCombiner.java (203:240)
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingReducer.java (278:315)

duplicated block id: 25 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (339:368)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (336:365)

duplicated block id: 26 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (336:365)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (366:395)

duplicated block id: 27 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (337:366)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (337:366)

duplicated block id: 28 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (337:366)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (336:365)

duplicated block id: 29 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (339:368)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (337:366)

duplicated block id: 30 size: 21 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (337:366)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (366:395)

duplicated block id: 31 size: 20 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (262:291)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (279:308)

duplicated block id: 32 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/left_outer_join.pig (1:20)
- datafu-pig/src/main/resources/datafu/sample_by_keys.pig (1:20)

duplicated block id: 33 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/dedup.pig (1:20)
- datafu-pig/src/main/resources/datafu/left_outer_join.pig (1:20)

duplicated block id: 34 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/count_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/diff_macros.pig (1:20)

duplicated block id: 35 size: 19 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingIncrementalJob.java (60:96)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingIncrementalJob.java (57:93)

duplicated block id: 36 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/count_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/left_outer_join.pig (1:20)

duplicated block id: 37 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (93:123)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (68:98)

duplicated block id: 38 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/diff_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/left_outer_join.pig (1:20)

duplicated block id: 39 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/count_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/sample_by_keys.pig (1:20)

duplicated block id: 40 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/dedup.pig (1:20)
- datafu-pig/src/main/resources/datafu/sample_by_keys.pig (1:20)

duplicated block id: 41 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (93:123)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (69:99)

duplicated block id: 42 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/count_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/dedup.pig (1:20)

duplicated block id: 43 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/dedup.pig (1:20)
- datafu-pig/src/main/resources/datafu/diff_macros.pig (1:20)

duplicated block id: 44 size: 19 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/diff_macros.pig (1:20)
- datafu-pig/src/main/resources/datafu/sample_by_keys.pig (1:20)

duplicated block id: 45 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 46 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_header.erb (1:18)
- site/source/layouts/blog.erb (1:18)

duplicated block id: 47 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/_footer.erb (1:18)

duplicated block id: 48 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/blog.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 49 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/layouts/blog.erb (1:18)

duplicated block id: 50 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_footer.erb (1:18)
- site/source/layouts/docs.erb (1:18)

duplicated block id: 51 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/layout.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 52 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/blog.erb (1:18)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 53 size: 18 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingCombiner.java (74:107)
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingReducer.java (90:123)

duplicated block id: 54 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 55 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/layouts/docs.erb (1:18)

duplicated block id: 56 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_footer.erb (1:18)
- site/source/layouts/_header.erb (1:18)

duplicated block id: 57 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/docs.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 58 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_footer.erb (1:18)
- site/source/layouts/blog.erb (1:18)

duplicated block id: 59 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/_docs_nav.erb (1:18)

duplicated block id: 60 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_header.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 61 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/docs.erb (1:18)

duplicated block id: 62 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_header.erb (1:18)
- site/source/layouts/docs.erb (1:18)

duplicated block id: 63 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 64 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/blog.erb (1:18)

duplicated block id: 65 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_header.erb (1:18)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 66 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/blog.erb (1:18)
- site/source/layouts/docs.erb (1:18)

duplicated block id: 67 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_footer.erb (1:18)
- site/source/stylesheets/highlight.css.erb (1:18)

duplicated block id: 68 size: 18 cleaned lines of code in 2 files:
- site/source/blog/index.html.erb (4:21)
- site/source/layouts/_header.erb (1:18)

duplicated block id: 69 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/layouts/_header.erb (1:18)

duplicated block id: 70 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/layouts/_footer.erb (1:18)

duplicated block id: 71 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_docs_nav.erb (1:18)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 72 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/docs.erb (1:18)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 73 size: 18 cleaned lines of code in 2 files:
- site/source/layouts/_footer.erb (1:18)
- site/source/layouts/layout.erb (1:18)

duplicated block id: 74 size: 17 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (264:290)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (265:291)

duplicated block id: 75 size: 17 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (267:293)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (282:308)

duplicated block id: 76 size: 17 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (101:133)
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (106:138)

duplicated block id: 77 size: 17 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (265:291)
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (267:293)

duplicated block id: 78 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (81:109)

duplicated block id: 79 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (65:93)

duplicated block id: 80 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (81:109)

duplicated block id: 81 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (65:93)

duplicated block id: 82 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (65:93)

duplicated block id: 83 size: 16 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (65:93)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (65:93)

duplicated block id: 84 size: 15 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (104:133)
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (97:126)

duplicated block id: 85 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (229:247)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (226:244)

duplicated block id: 86 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (227:245)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (226:244)

duplicated block id: 87 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (227:245)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (243:261)

duplicated block id: 88 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (293:312)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (311:330)

duplicated block id: 89 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (226:244)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (243:261)

duplicated block id: 90 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (227:245)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (227:245)

duplicated block id: 91 size: 14 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (229:247)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (227:245)

duplicated block id: 92 size: 13 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (59:80)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (63:84)

duplicated block id: 93 size: 13 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java (111:133)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (178:200)

duplicated block id: 94 size: 12 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (100:121)
- datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (97:118)

duplicated block id: 95 size: 12 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (112:133)
- datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (97:118)

duplicated block id: 96 size: 12 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (226:247)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (378:399)

duplicated block id: 97 size: 12 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingIncrementalJob.java (116:172)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingIncrementalJob.java (101:157)

duplicated block id: 98 size: 12 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (107:128)
- datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (97:118)

duplicated block id: 99 size: 11 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (298:314)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (314:330)

duplicated block id: 100 size: 10 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (378:394)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (200:216)

duplicated block id: 101 size: 10 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (137:153)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (226:242)

duplicated block id: 102 size: 10 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (137:153)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (378:394)

duplicated block id: 103 size: 10 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (226:242)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (200:216)

duplicated block id: 104 size: 10 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (318:336)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (249:267)

duplicated block id: 105 size: 10 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (137:153)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (200:216)

duplicated block id: 106 size: 10 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingExecutionPlanner.java (138:150)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingExecutionPlanner.java (212:224)

duplicated block id: 107 size: 10 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (348:370)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (224:246)

duplicated block id: 108 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (91:108)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (70:87)

duplicated block id: 109 size: 9 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DelegatingCombiner.java (47:62)
- datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DelegatingReducer.java (47:62)

duplicated block id: 110 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/count_macros.pig (1:9)
- datafu-pig/src/main/resources/datafu/tf_idf.pig (1:9)

duplicated block id: 111 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/dedup.pig (1:9)
- datafu-pig/src/main/resources/datafu/tf_idf.pig (1:9)

duplicated block id: 112 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/WeightedReservoirSample.java (94:111)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (386:403)

duplicated block id: 113 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (180:193)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (178:191)

duplicated block id: 114 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (180:193)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (177:190)

duplicated block id: 115 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/sample_by_keys.pig (1:9)
- datafu-pig/src/main/resources/datafu/tf_idf.pig (1:9)

duplicated block id: 116 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (128:143)
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (91:106)

duplicated block id: 117 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/diff_macros.pig (1:9)
- datafu-pig/src/main/resources/datafu/tf_idf.pig (1:9)

duplicated block id: 118 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (177:190)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (195:208)

duplicated block id: 119 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (178:191)
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (178:191)

duplicated block id: 120 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (91:108)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (71:88)

duplicated block id: 121 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (91:108)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (95:112)

duplicated block id: 122 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/resources/datafu/left_outer_join.pig (1:9)
- datafu-pig/src/main/resources/datafu/tf_idf.pig (1:9)

duplicated block id: 123 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (178:191)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (195:208)

duplicated block id: 124 size: 9 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (178:191)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (177:190)

duplicated block id: 125 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (116:131)
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:402)

duplicated block id: 126 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (97:110)
- datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java (82:95)

duplicated block id: 127 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (100:113)
- datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java (82:95)

duplicated block id: 128 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:402)
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (141:156)

duplicated block id: 129 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (107:120)
- datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java (82:95)

duplicated block id: 130 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (143:150)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (256:263)

duplicated block id: 131 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNull.java (35:44)
- datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNullFields.java (46:55)

duplicated block id: 132 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (273:280)
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (305:312)

duplicated block id: 133 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/VAR.java (160:167)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (256:263)

duplicated block id: 134 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (143:150)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (256:263)

duplicated block id: 135 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (143:150)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (256:263)

duplicated block id: 136 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (258:271)
- datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (304:317)

duplicated block id: 137 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/Quantile.java (191:203)
- datafu-pig/src/main/java/datafu/pig/stats/StreamingQuantile.java (254:266)

duplicated block id: 138 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (109:124)
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:402)

duplicated block id: 139 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (112:125)
- datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java (82:95)

duplicated block id: 140 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (121:136)
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:402)

duplicated block id: 141 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/CountEach.java (152:161)
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (141:150)

duplicated block id: 142 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (145:152)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (256:263)

duplicated block id: 143 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (125:136)
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (153:164)

duplicated block id: 144 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (121:136)
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (141:156)

duplicated block id: 145 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/CountEach.java (152:161)
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (129:138)

duplicated block id: 146 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (109:124)
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (141:156)

duplicated block id: 147 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/CountEach.java (152:161)
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (135:144)

duplicated block id: 148 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (135:144)
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (129:138)

duplicated block id: 149 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (116:131)
- datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (141:156)

duplicated block id: 150 size: 8 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractNonIncrementalJob.java (372:405)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (655:688)

duplicated block id: 151 size: 8 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (135:144)
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (141:150)

duplicated block id: 152 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (121:133)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (382:394)

duplicated block id: 153 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/AppendToBag.java (60:68)
- datafu-pig/src/main/java/datafu/pig/bags/PrependToBag.java (66:74)

duplicated block id: 154 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (70:80)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (55:66)

duplicated block id: 155 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (109:121)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (230:242)

duplicated block id: 156 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (467:475)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (90:98)

duplicated block id: 157 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (256:262)
- datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (287:293)

duplicated block id: 158 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:399)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (382:394)

duplicated block id: 159 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (68:79)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (59:70)

duplicated block id: 160 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:399)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (230:242)

duplicated block id: 161 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/BagJoin.java (148:155)
- datafu-pig/src/main/java/datafu/pig/bags/BagJoin.java (183:190)

duplicated block id: 162 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (467:475)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (91:99)

duplicated block id: 163 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (68:79)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (63:74)

duplicated block id: 164 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (70:80)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (56:67)

duplicated block id: 165 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (467:475)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (111:119)

duplicated block id: 166 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (74:84)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (55:66)

duplicated block id: 167 size: 7 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingIncrementalJob.java (206:217)
- datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingIncrementalJob.java (165:176)

duplicated block id: 168 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (166:174)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (91:99)

duplicated block id: 169 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (84:91)
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (84:91)

duplicated block id: 170 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (116:128)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (382:394)

duplicated block id: 171 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (111:119)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (91:99)

duplicated block id: 172 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (467:475)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (115:123)

duplicated block id: 173 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (116:128)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (204:216)

duplicated block id: 174 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (166:174)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (90:98)

duplicated block id: 175 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (106:118)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (382:394)

duplicated block id: 176 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (111:119)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (115:123)

duplicated block id: 177 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (121:133)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (204:216)

duplicated block id: 178 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (74:84)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeWhitespace.java (56:67)

duplicated block id: 179 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:399)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (204:216)

duplicated block id: 180 size: 7 cleaned lines of code in 2 files:
- datafu-hourglass/src/main/java/datafu/hourglass/schemas/PartitionCollapsingSchemas.java (155:164)
- datafu-hourglass/src/main/java/datafu/hourglass/schemas/PartitionPreservingSchemas.java (103:112)

duplicated block id: 181 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (133:142)
- datafu-pig/src/main/java/datafu/pig/sampling/WeightedReservoirSample.java (134:144)

duplicated block id: 182 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (290:298)
- datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (365:373)

duplicated block id: 183 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (109:121)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (204:216)

duplicated block id: 184 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (166:174)
- datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (111:119)

duplicated block id: 185 size: 7 cleaned lines of code in 2 files:
- datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (116:128)
- datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (230:242)
duplicated block id: 186 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/text/opennlp/SentenceDetect.java (111:119) - datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeSimple.java (90:98) duplicated block id: 187 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (106:118) - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (387:399) duplicated block id: 188 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/WeightedReservoirSample.java (94:107) - datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (234:247) duplicated block id: 189 size: 7 cleaned lines of code in 2 files: - datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (120:126) - datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (534:540) duplicated block id: 190 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (166:174) - datafu-pig/src/main/java/datafu/pig/text/opennlp/TokenizeME.java (115:123) duplicated block id: 191 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (109:121) - datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (382:394) duplicated block id: 192 size: 7 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (387:393) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (325:331) duplicated block id: 193 size: 7 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (489:499) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (414:424) duplicated block id: 194 size: 7 cleaned lines of code in 2 files: - 
datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (257:263) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (288:294) duplicated block id: 195 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (106:118) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (141:153) duplicated block id: 196 size: 7 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/PartitioningCombiner.java (43:52) - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/PartitioningReducer.java (63:72) duplicated block id: 197 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (467:475) - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (166:174) duplicated block id: 198 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (106:118) - datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (230:242) duplicated block id: 199 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (121:133) - datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (230:242) duplicated block id: 200 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/UnorderedPairs.java (106:118) - datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (204:216) duplicated block id: 201 size: 7 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (382:394) - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (124:136) duplicated block id: 202 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (347:352) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (198:203) duplicated block id: 203 size: 6 cleaned lines of code in 2 files: - 
datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (178:184) duplicated block id: 204 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (175:181) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (63:70) duplicated block id: 205 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (193:199) duplicated block id: 206 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (346:351) duplicated block id: 207 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (349:354) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (181:186) duplicated block id: 208 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingCombiner.java (58:65) - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/PartitioningReducer.java (65:72) duplicated block id: 209 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (183:188) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (376:381) duplicated block id: 210 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (347:352) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (198:203) duplicated block id: 211 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementVote.java (271:280) - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (124:133) duplicated block id: 212 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (198:203) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java 
(376:381) duplicated block id: 213 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (347:352) duplicated block id: 214 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (401:409) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (338:346) duplicated block id: 215 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (245:256) - datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (291:302) duplicated block id: 216 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/ReservoirSample.java (124:130) - datafu-pig/src/main/java/datafu/pig/sampling/WeightedReservoirSample.java (88:94) duplicated block id: 217 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractNonIncrementalJob.java (372:394) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (643:665) duplicated block id: 218 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (347:352) duplicated block id: 219 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (349:354) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (198:203) duplicated block id: 220 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (376:381) duplicated block id: 221 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (347:352) - 
datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (181:186) duplicated block id: 222 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (347:352) duplicated block id: 223 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (174:183) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementVote.java (271:280) duplicated block id: 224 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (183:188) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (346:351) duplicated block id: 225 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (146:155) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementVote.java (271:280) duplicated block id: 226 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (174:183) - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (124:133) duplicated block id: 227 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (382:391) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementVote.java (271:280) duplicated block id: 228 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNull.java (48:58) - datafu-pig/src/main/java/datafu/pig/bags/EmptyBagToNullFields.java (61:71) duplicated block id: 229 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (193:199) duplicated block id: 230 size: 6 cleaned lines of code in 2 files: - 
datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (183:188) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (349:354) duplicated block id: 231 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (175:181) duplicated block id: 232 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (349:354) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (180:185) duplicated block id: 233 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/hash/lsh/L1PStableHash.java (178:187) - datafu-pig/src/main/java/datafu/pig/hash/lsh/L2PStableHash.java (179:189) duplicated block id: 234 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (178:184) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (79:86) duplicated block id: 235 size: 6 cleaned lines of code in 2 files: - build-plugin/src/main/java/org/adrianwalker/multilinestring/EcjMultilineProcessor.java (37:44) - build-plugin/src/main/java/org/adrianwalker/multilinestring/JavacMultilineProcessor.java (37:42) duplicated block id: 236 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (382:391) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (174:183) duplicated block id: 237 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (176:182) duplicated block id: 238 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/DistinctBy.java (137:144) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (171:178) duplicated block id: 239 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (176:182) - 
datafu-pig/src/main/java/datafu/pig/stats/VAR.java (79:86) duplicated block id: 240 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (146:155) - datafu-pig/src/main/java/datafu/pig/text/opennlp/POSTag.java (124:133) duplicated block id: 241 size: 6 cleaned lines of code in 2 files: - datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (472:477) - datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (542:547) duplicated block id: 242 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (178:184) duplicated block id: 243 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DelegatingCombiner.java (34:43) - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/DelegatingReducer.java (34:43) duplicated block id: 244 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (180:185) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (346:351) duplicated block id: 245 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (347:352) - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (180:185) duplicated block id: 246 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (347:352) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (183:188) duplicated block id: 247 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (257:262) - datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (219:224) duplicated block id: 248 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionCollapsingIncrementalJob.java (42:47) - 
datafu-hourglass/src/main/java/datafu/hourglass/jobs/PartitionPreservingIncrementalJob.java (41:46) duplicated block id: 249 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (176:182) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (63:70) duplicated block id: 250 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (180:185) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (376:381) duplicated block id: 251 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (346:351) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (181:186) duplicated block id: 252 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/Enumerate.java (143:150) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (171:178) duplicated block id: 253 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sessions/SessionCount.java (71:79) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (82:91) duplicated block id: 254 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (176:182) duplicated block id: 255 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionCollapsingIncrementalJob.java (643:665) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/AbstractPartitionPreservingIncrementalJob.java (655:677) duplicated block id: 256 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/entropy/CondEntropy.java (257:262) - datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (419:424) duplicated block id: 257 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/CollapsingCombiner.java 
(58:65) - datafu-hourglass/src/main/java/datafu/hourglass/mapreduce/PartitioningCombiner.java (45:52) duplicated block id: 258 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (349:354) duplicated block id: 259 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (79:86) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (193:199) duplicated block id: 260 size: 6 cleaned lines of code in 2 files: - datafu-hourglass/src/main/java/datafu/hourglass/jobs/DateRangePlanner.java (122:128) - datafu-hourglass/src/main/java/datafu/hourglass/jobs/DateRangePlanner.java (160:166) duplicated block id: 261 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (146:155) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSampleWithReplacementElect.java (174:183) duplicated block id: 262 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/CountEach.java (154:161) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (171:178) duplicated block id: 263 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (180:185) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (347:352) duplicated block id: 264 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/IntVAR.java (346:351) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (198:203) duplicated block id: 265 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/linkanalysis/PageRank.java (382:391) - datafu-pig/src/main/java/datafu/pig/sampling/SimpleRandomSample.java (146:155) duplicated block id: 266 size: 6 cleaned lines of code in 2 files: - datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (106:111) - 
datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala (466:471) duplicated block id: 267 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/entropy/EmpiricalCountEntropy.java (419:424) - datafu-pig/src/main/java/datafu/pig/stats/entropy/Entropy.java (219:224) duplicated block id: 268 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (63:70) - datafu-pig/src/main/java/datafu/pig/stats/DoubleVAR.java (176:182) duplicated block id: 269 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (181:186) - datafu-pig/src/main/java/datafu/pig/stats/VAR.java (376:381) duplicated block id: 270 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/stats/FloatVAR.java (183:188) - datafu-pig/src/main/java/datafu/pig/stats/LongVAR.java (347:352) duplicated block id: 271 size: 6 cleaned lines of code in 2 files: - datafu-pig/src/main/java/datafu/pig/bags/ReverseEnumerate.java (131:138) - datafu-pig/src/main/java/datafu/pig/sessions/Sessionize.java (171:178)