duplicated block id: 1 size: 256 cleaned lines of code in 2 files:
- src/app/app.py (246:523)
- src/predict_many_samples.py (262:539)

duplicated block id: 2 size: 251 cleaned lines of code in 2 files:
- src/predict_many_samples.py (256:529)
- src/predict_one_sample.py (259:532)

duplicated block id: 3 size: 246 cleaned lines of code in 2 files:
- src/app/app.py (246:513)
- src/predict_one_sample.py (265:532)

duplicated block id: 4 size: 172 cleaned lines of code in 2 files:
- src/app/app.py (42:243)
- src/predict_many_samples.py (52:253)

duplicated block id: 5 size: 136 cleaned lines of code in 2 files:
- src/predict_many_samples.py (95:254)
- src/predict_one_sample.py (97:256)

duplicated block id: 6 size: 135 cleaned lines of code in 2 files:
- src/app/app.py (85:243)
- src/predict_one_sample.py (97:255)

duplicated block id: 7 size: 99 cleaned lines of code in 2 files:
- src/baselines/dnn.py (650:765)
- src/deep_baselines/run.py (608:723)

duplicated block id: 8 size: 88 cleaned lines of code in 2 files:
- src/predict.py (35:156)
- src/predict_many_samples.py (35:155)

duplicated block id: 9 size: 87 cleaned lines of code in 2 files:
- src/baselines/dnn.py (541:647)
- src/deep_baselines/run.py (498:604)

duplicated block id: 10 size: 85 cleaned lines of code in 2 files:
- src/app/app.py (42:145)
- src/predict.py (51:156)

duplicated block id: 11 size: 82 cleaned lines of code in 2 files:
- src/predict_many_samples.py (673:761)
- src/predict_one_sample.py (657:745)

duplicated block id: 12 size: 74 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (266:351)
- src/deep_baselines/cheer.py (409:494)

duplicated block id: 13 size: 70 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (99:179)
- src/deep_baselines/cheer.py (415:494)

duplicated block id: 14 size: 70 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (99:179)
- src/deep_baselines/cheer.py (272:351)

duplicated block id: 15 size: 70 cleaned lines of code in 2 files:
- src/baselines/dnn.py (824:908)
- src/deep_baselines/run.py (783:867)

duplicated block id: 16 size: 69 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_2.py (39:129)
- src/plot/plot_map_pie_fig_aff4_2.py (38:126)

duplicated block id: 17 size: 68 cleaned lines of code in 2 files:
- src/deep_baselines/virseeker.py (107:185)
- src/deep_baselines/virtifier.py (117:195)

duplicated block id: 18 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (272:346)
- src/deep_baselines/virseeker.py (109:183)

duplicated block id: 19 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (99:173)
- src/deep_baselines/virtifier.py (119:193)

duplicated block id: 20 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/virtifier.py (119:193)

duplicated block id: 21 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/cheer.py (99:173)

duplicated block id: 22 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (415:489)
- src/deep_baselines/virhunter.py (93:167)

duplicated block id: 23 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (99:173)
- src/deep_baselines/virseeker.py (109:183)

duplicated block id: 24 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/virhunter.py (93:167)
- src/deep_baselines/virtifier.py (119:193)

duplicated block id: 25 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/cheer.py (272:346)

duplicated block id: 26 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/cheer.py (415:489)

duplicated block id: 27 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (415:489)
- src/deep_baselines/virtifier.py (119:193)

duplicated block id: 28 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (272:346)
- src/deep_baselines/virtifier.py (119:193)

duplicated block id: 29 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/virseeker.py (109:183)

duplicated block id: 30 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (415:489)
- src/deep_baselines/virseeker.py (109:183)

duplicated block id: 31 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/virhunter.py (93:167)
- src/deep_baselines/virseeker.py (109:183)

duplicated block id: 32 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (99:173)
- src/deep_baselines/virhunter.py (93:167)

duplicated block id: 33 size: 66 cleaned lines of code in 2 files:
- src/baselines/dnn.py (336:411)
- src/deep_baselines/virhunter.py (93:167)

duplicated block id: 34 size: 66 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (272:346)
- src/deep_baselines/virhunter.py (93:167)

duplicated block id: 35 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/virseeker.py (112:183)

duplicated block id: 36 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/cheer.py (418:489)

duplicated block id: 37 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/cheer.py (275:346)

duplicated block id: 38 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/virhunter.py (96:167)

duplicated block id: 39 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/virtifier.py (122:193)

duplicated block id: 40 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/deep_baselines/cheer.py (102:173)

duplicated block id: 41 size: 63 cleaned lines of code in 2 files:
- src/SSFN/model.py (249:321)
- src/baselines/dnn.py (339:411)

duplicated block id: 42 size: 61 cleaned lines of code in 2 files:
- src/predict.py (187:257)
- src/predict_one_sample.py (185:255)

duplicated block id: 43 size: 61 cleaned lines of code in 2 files:
- src/predict.py (646:716)
- src/predict_many_samples.py (698:765)

duplicated block id: 44 size: 61 cleaned lines of code in 2 files:
- src/predict.py (187:257)
- src/predict_many_samples.py (183:253)

duplicated block id: 45 size: 61 cleaned lines of code in 2 files:
- src/app/app.py (173:243)
- src/predict.py (187:257)

duplicated block id: 46 size: 59 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (113:174)
- src/baselines/xgb.py (112:172)

duplicated block id: 47 size: 59 cleaned lines of code in 2 files:
- src/baselines/dnn.py (155:213)
- src/baselines/xgb.py (110:168)

duplicated block id: 48 size: 59 cleaned lines of code in 2 files:
- src/predict.py (646:712)
- src/predict_one_sample.py (682:745)

duplicated block id: 49 size: 57 cleaned lines of code in 2 files:
- src/baselines/dnn.py (157:213)
- src/baselines/lgbm.py (113:170)

duplicated block id: 50 size: 51 cleaned lines of code in 2 files:
- src/baselines/dnn.py (911:970)
- src/deep_baselines/run.py (871:930)

duplicated block id: 51 size: 48 cleaned lines of code in 2 files:
- src/predict.py (94:156)
- src/predict_one_sample.py (97:157)

duplicated block id: 52 size: 47 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (143:192)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (119:167)

duplicated block id: 53 size: 47 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v1.py (117:164)
- src/data_preprocess/subword.py (65:113)

duplicated block id: 54 size: 46 cleaned lines of code in 2 files:
- src/predict_many_samples.py (572:619)
- src/predict_one_sample.py (570:617)

duplicated block id: 55 size: 45 cleaned lines of code in 2 files:
- src/predict.py (542:586)
- src/predict_one_sample.py (590:634)

duplicated block id: 56 size: 43 cleaned lines of code in 2 files:
- src/predict_many_samples.py (24:91)
- src/predict_one_sample.py (24:90)

duplicated block id: 57 size: 43 cleaned lines of code in 2 files:
- src/baselines/dnn.py (767:821)
- src/deep_baselines/run.py (725:779)

duplicated block id: 58 size: 41 cleaned lines of code in 2 files:
- src/evaluater.py (118:158)
- src/predictor.py (124:164)

duplicated block id: 59 size: 39 cleaned lines of code in 2 files:
- src/predict.py (35:90)
- src/predict_one_sample.py (34:90)

duplicated block id: 60 size: 38 cleaned lines of code in 2 files:
- src/baselines/dnn.py (70:123)
- src/deep_baselines/run.py (77:130)

duplicated block id: 61 size: 37 cleaned lines of code in 2 files:
- src/app/app.py (347:384)
- src/predict.py (348:384)

duplicated block id: 62 size: 37 cleaned lines of code in 2 files:
- src/predict.py (348:384)
- src/predict_many_samples.py (363:400)

duplicated block id: 63 size: 37 cleaned lines of code in 2 files:
- src/evaluater.py (121:157)
- src/trainer.py (206:242)

duplicated block id: 64 size: 37 cleaned lines of code in 2 files:
- src/predictor.py (127:163)
- src/trainer.py (206:242)

duplicated block id: 65 size: 37 cleaned lines of code in 2 files:
- src/predict.py (348:384)
- src/predict_one_sample.py (366:403)

duplicated block id: 66 size: 36 cleaned lines of code in 2 files:
- src/app/app.py (42:81)
- src/predict_one_sample.py (51:90)

duplicated block id: 67 size: 36 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (394:434)
- src/predict.py (745:785)

duplicated block id: 68 size: 32 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (548:583)
- src/predict.py (912:948)

duplicated block id: 69 size: 32 cleaned lines of code in 2 files:
- src/deep_baselines/virseeker.py (199:234)
- src/deep_baselines/virtifier.py (221:256)

duplicated block id: 70 size: 31 cleaned lines of code in 2 files:
- src/deep_baselines/virhunter.py (189:223)
- src/deep_baselines/virtifier.py (222:256)

duplicated block id: 71 size: 31 cleaned lines of code in 2 files:
- src/deep_baselines/virhunter.py (189:223)
- src/deep_baselines/virseeker.py (200:234)

duplicated block id: 72 size: 30 cleaned lines of code in 2 files:
- src/baselines/dnn.py (482:519)
- src/deep_baselines/run.py (439:476)

duplicated block id: 73 size: 29 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (586:620)
- src/predict.py (955:989)

duplicated block id: 74 size: 29 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (376:409)
- src/baselines/xgb.py (342:374)

duplicated block id: 75 size: 28 cleaned lines of code in 2 files:
- src/data_loader.py (557:592)
- src/data_loader.py (1000:1035)

duplicated block id: 76 size: 28 cleaned lines of code in 2 files:
- src/predict.py (542:569)
- src/predict_many_samples.py (590:619)

duplicated block id: 77 size: 27 cleaned lines of code in 2 files:
- src/baselines/dnn.py (652:686)
- src/trainer.py (249:283)

duplicated block id: 78 size: 27 cleaned lines of code in 2 files:
- src/deep_baselines/virseeker.py (3:46)
- src/deep_baselines/virtifier.py (3:46)

duplicated block id: 79 size: 27 cleaned lines of code in 2 files:
- src/deep_baselines/run.py (610:644)
- src/trainer.py (249:283)

duplicated block id: 80 size: 26 cleaned lines of code in 2 files:
- src/predict.py (265:293)
- src/predict_one_sample.py (274:302)

duplicated block id: 81 size: 26 cleaned lines of code in 2 files:
- src/data_loader.py (323:348)
- src/deep_baselines/run.py (244:269)

duplicated block id: 82 size: 26 cleaned lines of code in 2 files:
- src/predict.py (265:293)
- src/predict_many_samples.py (271:299)

duplicated block id: 83 size: 26 cleaned lines of code in 2 files:
- src/app/app.py (255:283)
- src/predict.py (265:293)

duplicated block id: 84 size: 24 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_manual.py (30:60)
- src/geo_map/get_biosample_from_update.py (31:61)

duplicated block id: 85 size: 23 cleaned lines of code in 2 files:
- src/baselines/dnn.py (707:733)
- src/trainer.py (325:350)

duplicated block id: 86 size: 23 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (411:438)
- src/baselines/xgb.py (376:403)

duplicated block id: 87 size: 23 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (456:484)
- src/predict.py (807:835)

duplicated block id: 88 size: 23 cleaned lines of code in 2 files:
- src/deep_baselines/run.py (665:691)
- src/trainer.py (325:350)

duplicated block id: 89 size: 23 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (665:692)
- src/baselines/xgb.py (627:654)

duplicated block id: 90 size: 22 cleaned lines of code in 2 files:
- src/baselines/dnn.py (125:150)
- src/deep_baselines/run.py (160:185)

duplicated block id: 91 size: 22 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (173:197)
- src/data_preprocess/tf_records_generator.py (101:126)

duplicated block id: 92 size: 22 cleaned lines of code in 2 files:
- src/common/metrics.py (426:450)
- src/utils.py (271:295)

duplicated block id: 93 size: 22 cleaned lines of code in 2 files:
- src/predict_many_samples.py (621:642)
- src/predict_one_sample.py (619:640)

duplicated block id: 94 size: 21 cleaned lines of code in 2 files:
- src/deep_baselines/run.py (840:866)
- src/predictor.py (100:126)

duplicated block id: 95 size: 21 cleaned lines of code in 2 files:
- src/deep_baselines/run.py (750:778)
- src/evaluater.py (92:120)

duplicated block id: 96 size: 21 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (354:378)
- src/deep_baselines/cheer.py (496:520)

duplicated block id: 97 size: 21 cleaned lines of code in 2 files:
- src/baselines/dnn.py (881:907)
- src/predictor.py (100:126)

duplicated block id: 98 size: 21 cleaned lines of code in 2 files:
- src/baselines/dnn.py (792:820)
- src/evaluater.py (92:120)

duplicated block id: 99 size: 20 cleaned lines of code in 2 files:
- src/predict_one_sample.py (477:496)
- src/predict_one_sample.py (509:528)

duplicated block id: 100 size: 20 cleaned lines of code in 2 files:
- src/app/app.py (490:509)
- src/predict_many_samples.py (474:493)

duplicated block id: 101 size: 20 cleaned lines of code in 2 files:
- src/predict_many_samples.py (506:525)
- src/predict_one_sample.py (477:496)

duplicated block id: 102 size: 20 cleaned lines of code in 2 files:
- src/predict_many_samples.py (474:493)
- src/predict_one_sample.py (509:528)

duplicated block id: 103 size: 20 cleaned lines of code in 2 files:
- src/app/app.py (490:509)
- src/predict_one_sample.py (477:496)

duplicated block id: 104 size: 20 cleaned lines of code in 2 files:
- src/predict_many_samples.py (474:493)
- src/predict_many_samples.py (506:525)

duplicated block id: 105 size: 20 cleaned lines of code in 2 files:
- src/app/app.py (458:477)
- src/app/app.py (490:509)

duplicated block id: 106 size: 20 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (238:258)
- src/deep_baselines/cheer.py (384:404)

duplicated block id: 107 size: 20 cleaned lines of code in 2 files:
- src/app/app.py (458:477)
- src/predict_many_samples.py (506:525)

duplicated block id: 108 size: 20 cleaned lines of code in 2 files:
- src/app/app.py (458:477)
- src/predict_one_sample.py (509:528)

duplicated block id: 109 size: 20 cleaned lines of code in 2 files:
- src/data_loader.py (1404:1423)
- src/data_loader.py (1429:1448)

duplicated block id: 110 size: 19 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (486:506)
- src/predict.py (837:857)

duplicated block id: 111 size: 19 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_manual.py (3:24)
- src/geo_map/get_biosample_from_update.py (3:24)

duplicated block id: 112 size: 19 cleaned lines of code in 2 files:
- src/geo_map/extract_attr_from_biosample_page.py (3:24)
- src/geo_map/parse_loc_lat_lon_from_attr.py (3:24)

duplicated block id: 113 size: 18 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (636:655)
- src/baselines/xgb.py (597:616)

duplicated block id: 114 size: 18 cleaned lines of code in 2 files:
- src/data_loader.py (370:388)
- src/data_loader.py (442:460)

duplicated block id: 115 size: 18 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (514:531)
- src/predict.py (867:884)

duplicated block id: 116 size: 18 cleaned lines of code in 2 files:
- src/app/app.py (311:328)
- src/predict.py (308:325)

duplicated block id: 117 size: 18 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (149:167)
- src/data_preprocess/tf_records_generator.py (101:120)

duplicated block id: 118 size: 18 cleaned lines of code in 2 files:
- src/predict.py (308:325)
- src/predict_many_samples.py (327:344)

duplicated block id: 119 size: 18 cleaned lines of code in 2 files:
- src/predict.py (308:325)
- src/predict_one_sample.py (330:347)

duplicated block id: 120 size: 17 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (213:232)
- src/deep_baselines/virhunter.py (188:206)

duplicated block id: 121 size: 17 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (359:378)
- src/deep_baselines/virhunter.py (188:206)

duplicated block id: 122 size: 17 cleaned lines of code in 2 files:
- src/baselines/dnn.py (68:89)
- src/baselines/lgbm.py (61:82)

duplicated block id: 123 size: 17 cleaned lines of code in 2 files:
- src/baselines/xgb.py (144:160)
- src/deep_baselines/run.py (320:336)

duplicated block id: 124 size: 17 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (213:232)
- src/deep_baselines/cheer.py (359:378)

duplicated block id: 125 size: 17 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (213:232)
- src/deep_baselines/cheer.py (501:520)

duplicated block id: 126 size: 17 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (501:520)
- src/deep_baselines/virhunter.py (188:206)

duplicated block id: 127 size: 17 cleaned lines of code in 2 files:
- src/baselines/dnn.py (189:205)
- src/deep_baselines/run.py (320:336)

duplicated block id: 128 size: 17 cleaned lines of code in 2 files:
- src/baselines/dnn.py (24:66)
- src/deep_baselines/run.py (24:71)

duplicated block id: 129 size: 17 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (146:162)
- src/deep_baselines/run.py (320:336)

duplicated block id: 130 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/cheer.py (361:378)

duplicated block id: 131 size: 16 cleaned lines of code in 2 files:
- src/data_loader.py (661:677)
- src/data_loader.py (1098:1113)

duplicated block id: 132 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (361:378)
- src/deep_baselines/virseeker.py (200:217)

duplicated block id: 133 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/cheer.py (215:232)

duplicated block id: 134 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (497:517)
- src/run.py (774:792)

duplicated block id: 135 size: 16 cleaned lines of code in 2 files:
- src/common/metrics.py (3:21)
- src/common/multi_label_metrics.py (3:21)

duplicated block id: 136 size: 16 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_1.py (3:21)
- src/plot/plot_map_pie_fig4_2.py (3:21)

duplicated block id: 137 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (215:232)
- src/deep_baselines/virtifier.py (222:239)

duplicated block id: 138 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/cheer.py (503:520)

duplicated block id: 139 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/virseeker.py (200:217)

duplicated block id: 140 size: 16 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:21)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:21)

duplicated block id: 141 size: 16 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_1.py (3:21)
- src/plot/plot_map_pie_fig_aff4_2.py (3:21)

duplicated block id: 142 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (503:520)
- src/deep_baselines/virtifier.py (222:239)

duplicated block id: 143 size: 16 cleaned lines of code in 2 files:
- src/common/loss.py (3:21)
- src/common/metrics.py (3:21)

duplicated block id: 144 size: 16 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (539:555)
- src/baselines/xgb.py (503:519)

duplicated block id: 145 size: 16 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_2.py (3:21)
- src/plot/plot_map_pie_fig_aff4_2.py (3:21)

duplicated block id: 146 size: 16 cleaned lines of code in 2 files:
- src/protein_structure/embedding_from_esmfold.py (3:21)
- src/protein_structure/structure_from_esm_v1.py (3:21)

duplicated block id: 147 size: 16 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (3:21)
- src/baselines/xgb.py (3:21)

duplicated block id: 148 size: 16 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (481:496)
- src/baselines/xgb.py (441:456)

duplicated block id: 149 size: 16 cleaned lines of code in 2 files:
- src/data_loader.py (679:694)
- src/data_loader.py (1115:1130)

duplicated block id: 150 size: 16 cleaned lines of code in 2 files:
- src/predict_many_samples.py (647:664)
- src/predict_one_sample.py (639:656)

duplicated block id: 151 size: 16 cleaned lines of code in 2 files:
- src/common/metrics.py (177:192)
- src/common/metrics.py (241:256)

duplicated block id: 152 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (521:539)
- src/deep_baselines/run.py (478:496)

duplicated block id: 153 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/virtifier.py (222:239)

duplicated block id: 154 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (361:378)
- src/deep_baselines/virtifier.py (222:239)

duplicated block id: 155 size: 16 cleaned lines of code in 2 files:
- src/baselines/dnn.py (418:435)
- src/deep_baselines/virhunter.py (189:206)

duplicated block id: 156 size: 16 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:21)
- src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:21)

duplicated block id: 157 size: 16 cleaned lines of code in 2 files:
- src/predict.py (571:586)
- src/predict_many_samples.py (621:636)

duplicated block id: 158 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/run.py (454:474)
- src/run.py (774:792)

duplicated block id: 159 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (215:232)
- src/deep_baselines/virseeker.py (200:217)

duplicated block id: 160 size: 16 cleaned lines of code in 2 files:
- src/deep_baselines/cheer.py (503:520)
- src/deep_baselines/virseeker.py (200:217)

duplicated block id: 161 size: 16 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (62:82)
- src/deep_baselines/run.py (77:96)

duplicated block id: 162 size: 16 cleaned lines of code in 2 files:
- src/common/loss.py (3:21)
- src/common/multi_label_metrics.py (3:21)

duplicated block id: 163 size: 15 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_1.py (74:88)
- src/plot/plot_map_pie_fig4_2.py (104:118)

duplicated block id: 164 size: 15 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_2.py (137:152)
- src/plot/plot_map_pie_fig_aff4_2.py (134:149)

duplicated block id: 165 size: 15 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_2.py (211:231)
- src/plot/plot_map_pie_fig_aff4_2.py (210:229)

duplicated block id: 166 size: 15 cleaned lines of code in 2 files:
- src/data_loader.py (614:628)
- src/data_loader.py (1052:1066)

duplicated block id: 167 size: 15 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (115:135)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (93:113)

duplicated block id: 168 size: 15 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig4_1.py (74:88)
- src/plot/plot_map_pie_fig_aff4_2.py (102:116)

duplicated block id: 169 size: 14 cleaned lines of code in 2 files:
- src/common/multi_label_metrics.py (3:19)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19)

duplicated block id: 170 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19)
- src/geo_map/get_biosample_from_update.py (3:19)

duplicated block id: 171 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/geo_map/self_testing_nearest_station_antarctica.py (3:19)

duplicated block id: 172 size: 14 cleaned lines of code in 2 files:
- src/SSFN/layers.py (3:19)
- src/SSFN/modeling_bert.py (3:19)

duplicated block id: 173 size: 14 cleaned lines of code in 2 files:
- src/common/loss.py (3:19)
- src/predict_many_samples.py (3:19)

duplicated block id: 174 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_manual.py (3:19)
- src/protein_structure/structure_from_esm_v1.py (3:19)

duplicated block id: 175 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/verify_train_dataset.py (3:19)
- src/geo_map/get_biosample_from_update.py (3:19)

duplicated block id: 176 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/protein_structure/sampling_verify_embedding_is_ok.py (3:19)

duplicated block id: 177 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/data_loader.py (3:19)

duplicated block id: 178 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/predict.py (3:19)

duplicated block id: 179 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/evaluater.py (3:19)

duplicated block id: 180 size: 14 cleaned lines of code in 2 files:
- src/common/multi_label_metrics.py (3:19)
- src/geo_map/standardization_lat_lon_info.py (3:19)

duplicated block id: 181 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19)
- src/geo_map/get_biosample_from_manual.py (3:19)

duplicated block id: 182 size: 14 cleaned lines of code in 2 files:
- src/evaluater.py (3:19)
- src/run.py (3:19)

duplicated block id: 183 size: 14 cleaned lines of code in 2 files:
- src/common/loss.py (3:19)
- src/geo_map/get_biosample_from_manual.py (3:19)

duplicated block id: 184 size: 14 cleaned lines of code in 2 files:
- src/common/multi_label_metrics.py (3:19)
- src/predict_one_sample.py (3:19)

duplicated block id: 185 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_update.py (3:19)
- src/plot/plot_map_pie_fig_aff4_1.py (3:19)

duplicated block id: 186 size: 14 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (3:19)
- src/predict_one_sample.py (3:19)

duplicated block id: 187 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_update.py (3:19)
- src/protein_structure/embedding_from_esmfold.py (3:19)

duplicated block id: 188 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/geo_map/get_biosample_from_manual.py (3:19)

duplicated block id: 189 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_manual.py (3:19)
- src/predictor.py (3:19)

duplicated block id: 190 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/data_preprocess/ncbi_id_2_uniprot.py (3:19)

duplicated block id: 191 size: 14 cleaned lines of code in 2 files:
- src/deep_baselines/predict_deep_baselines.py (32:65)
- src/predict_many_samples.py (35:64)

duplicated block id: 192 size: 14 cleaned lines of code in 2 files:
- src/SSFN/modeling_bert.py (3:19)
- src/run.py (3:19)

duplicated block id: 193 size: 14 cleaned lines of code in 2 files:
- src/baselines/dnn.py (3:19)
- src/common/loss.py (3:19)

duplicated block id: 194 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19)

duplicated block id: 195 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/trainer.py (3:19)

duplicated block id: 196 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_update.py (3:19)
- src/run.py (3:19)

duplicated block id: 197 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/trainer.py (3:19)

duplicated block id: 198 size: 14 cleaned lines of code in 2 files:
- src/common/metrics.py (3:19)
- src/protein_structure/merge_embedding_pdb_result.py (3:19)

duplicated block id: 199 size: 14 cleaned lines of code in 2 files:
- src/SSFN/model.py (3:19)
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19)

duplicated block id: 200 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/predict.py (3:19)

duplicated block id: 201 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19)
- src/geo_map/get_biosample_from_update.py (3:19)

duplicated block id: 202 size: 14 cleaned lines of code in 2 files:
- src/protein_structure/merge_embedding_pdb_result.py (3:19)
- src/trainer.py (3:19)

duplicated block id: 203 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19)
- src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19)

duplicated block id: 204 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_manual.py (3:19)
- src/protein_structure/sampling_verify_embedding_is_ok.py (3:19)

duplicated block id: 205 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19)

duplicated block id: 206 size: 14 cleaned lines of code in 2 files:
- src/baselines/predict.py (3:19)
- src/biotoolbox/structure_file_reader.py (3:19)

duplicated block id: 207 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_generator.py (3:19)
- src/biotoolbox/structure_file_reader.py (3:19)

duplicated block id: 208 size: 14 cleaned lines of code in 2 files:
- src/SSFN/model.py (3:19)
- src/utils.py (3:19)

duplicated block id: 209 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19)
- src/protein_structure/structure_from_esm_v1.py (3:19)

duplicated block id: 210 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/protein_structure/structure_from_esm_v1.py (3:19)

duplicated block id: 211 size: 14 cleaned lines of code in 2 files:
- src/geo_map/standardization_lat_lon_info.py (3:19)
- src/utils.py (3:19)

duplicated block id: 212 size: 14 cleaned lines of code in 2 files:
- src/common/loss.py (3:19)
- src/evaluater.py (3:19)

duplicated block id: 213 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_entrez.py (3:19)
- src/predictor.py (3:19)

duplicated block id: 214 size: 14 cleaned lines of code in 2 files:
- src/common/metrics.py (3:19)
- src/data_preprocess/tf_records_generator.py (3:19)

duplicated block id: 215 size: 14 cleaned lines of code in 2 files:
- src/plot/plot_map_pie_fig_aff4_2.py (3:19)
- src/predictor.py (3:19)

duplicated block id: 216 size: 14 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (3:19)
- src/plot/plot_map_pie_fig4_1.py (3:19)

duplicated block id: 217 size: 14 cleaned lines of code in 2 files:
- src/SSFN/gcn.py (3:19)
- src/data_preprocess/ncbi_id_2_uniprot.py (3:19)

duplicated block id: 218 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/geo_map/self_testing_nearest_station_antarctica.py (3:19)

duplicated block id: 219 size: 14 cleaned lines of code in 2 files:
- src/predict_one_sample.py (3:19)
- src/utils.py (3:19)

duplicated block id: 220 size: 14 cleaned lines of code in 2 files:
- src/predict_one_sample.py (3:19)
- src/protein_structure/predict_structure.py (3:19)

duplicated block id: 221 size: 14 cleaned lines of code in 2 files:
- src/SSFN/pooling.py (3:19)
- src/plot/plot_map_pie_fig_aff4_1.py (3:19)

duplicated block id: 222 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19)

duplicated block id: 223 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_update.py (3:19)
- src/predict.py (3:19)

duplicated block id: 224 size: 14 cleaned lines of code in 2 files:
- src/predict_many_samples.py (3:19)
- src/predict_one_sample.py (3:19)

duplicated block id: 225 size: 14 cleaned lines of code in 2 files:
- src/common/multi_label_metrics.py (3:19)
- src/data_preprocess/verify_train_dataset.py (3:19)

duplicated block id: 226 size: 14 cleaned lines of code in 2 files:
- src/SSFN/modeling_bert.py (3:19)
- src/geo_map/standardization_lat_lon_info.py (3:19)

duplicated block id: 227 size: 14 cleaned lines of code in 2 files:
- src/baselines/dnn.py (3:19)
- src/plot/plot_map_pie_fig4_1.py (3:19)

duplicated block id: 228 size: 14 cleaned lines of code in 2 files:
- src/SSFN/layers.py (3:19)
- src/data_preprocess/subword.py (3:19)

duplicated block id: 229 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/subword.py (3:19)
- src/protein_structure/merge_embedding_pdb_result.py (3:19)

duplicated block id: 230 size: 14 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (3:19)
- src/plot/plot_map_pie_fig_aff4_1.py (3:19)

duplicated block id: 231 size: 14 cleaned lines of code in 2 files:
- src/predict_one_sample.py (3:19)
- src/run.py (3:19)

duplicated block id: 232 size: 14 cleaned lines of code in 2 files:
- src/data_loader.py (3:19)
- src/geo_map/get_biosample_from_entrez.py (3:19)

duplicated block id: 233 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19)
- src/predictor.py (3:19)

duplicated block id: 234 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/verify_train_dataset.py (3:19)
- src/protein_structure/structure_from_esm_v1.py (3:19)

duplicated block id: 235 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/contact_map_builder.py (3:19)
- src/data_loader.py (3:19)

duplicated block id: 236 size: 14 cleaned lines of code in 2 files:
- src/data_preprocess/tf_records_generator.py (3:19)
- src/deep_baselines/predict_deep_baselines.py (1:17)

duplicated block id: 237 size: 14 cleaned lines of code in 2 files:
- src/SSFN/modeling_bert.py (3:19)
- src/evaluater.py (3:19)

duplicated block id: 238 size: 14 cleaned lines of code in 2 files:
- src/data_loader.py (3:19)
- src/predict_one_sample.py (3:19)

duplicated block id: 239 size: 14 cleaned lines of code in 2 files:
- src/SSFN/model.py (3:19)
- src/common/metrics.py (3:19)

duplicated block id: 240 size: 14 cleaned lines of code in 2 files:
- src/baselines/lgbm.py (3:19)
- src/biotoolbox/contact_map_builder.py (3:19)

duplicated block id: 241 size: 14 cleaned lines of code in 2 files:
- src/geo_map/get_biosample_from_update.py (3:19)
- src/protein_structure/merge_embedding_pdb_result.py (3:19)

duplicated block id: 242 size: 14 cleaned lines of code in 2 files:
- src/common/multi_label_metrics.py (3:19)
- src/plot/plot_map_pie_fig4_1.py (3:19)

duplicated block id: 243 size: 14 cleaned lines of code in 2 files:
- src/baselines/predict.py (3:19)
- src/plot/plot_map_pie_fig_aff4_2.py (3:19)

duplicated block id: 244 size: 14 cleaned lines of code in 2 files:
- src/SSFN/gcn.py (3:19)
- src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19)

duplicated block id: 245 size: 14 cleaned lines of code in 2 files:
- src/biotoolbox/structure_file_reader.py (3:19)
- src/protein_structure/predict_structure.py (3:19)

duplicated 
block id: 246 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 247 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 248 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 249 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 250 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 251 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 252 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 253 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 254 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 255 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 256 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 257 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 258 
size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/predict.py (3:19) duplicated block id: 259 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 260 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 261 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/utils.py (3:19) duplicated block id: 262 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 263 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 264 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 265 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 266 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 267 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 268 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 269 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 270 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/run.py (3:19) duplicated block id: 271 size: 14 cleaned 
lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 272 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 273 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 274 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 275 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 276 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 277 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_loader.py (3:19) duplicated block id: 278 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/trainer.py (3:19) duplicated block id: 279 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/utils.py (3:19) duplicated block id: 280 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 281 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 282 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 283 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/baselines/dnn.py (3:19) duplicated block id: 284 size: 14 cleaned lines of code in 
2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 285 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 286 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:19) - src/deep_baselines/virseeker.py (3:19) duplicated block id: 287 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 288 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 289 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 290 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 291 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/trainer.py (3:19) duplicated block id: 292 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:462) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:507) duplicated block id: 293 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 294 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 295 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/trainer.py (3:19) duplicated block id: 296 size: 14 cleaned lines of code in 2 files: - 
src/geo_map/standardization_lat_lon_info.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 297 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/predict.py (3:19) duplicated block id: 298 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 299 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 300 size: 14 cleaned lines of code in 2 files: - src/result_process/process_predict_result.py (3:19) - src/run.py (3:19) duplicated block id: 301 size: 14 cleaned lines of code in 2 files: - src/result_process/process_predict_result.py (49:64) - src/utils.py (190:205) duplicated block id: 302 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/predictor.py (3:19) duplicated block id: 303 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_loader.py (3:19) duplicated block id: 304 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/run.py (3:19) duplicated block id: 305 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 306 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 307 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 308 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 309 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - 
src/SSFN/layers.py (3:19) duplicated block id: 310 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/utils.py (3:19) duplicated block id: 311 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/trainer.py (3:19) duplicated block id: 312 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 313 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 314 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 315 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 316 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/predict.py (3:19) duplicated block id: 317 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 318 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/predictor.py (3:19) duplicated block id: 319 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 320 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 321 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/evaluater.py (3:19) duplicated block id: 322 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/geo_map/standardization_lat_lon_info.py 
(3:19) duplicated block id: 323 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 324 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 325 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 326 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 327 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 328 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 329 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 330 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/run.py (3:19) duplicated block id: 331 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 332 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 333 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/evaluater.py (3:19) duplicated block id: 334 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 335 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/common/metrics.py (3:19) duplicated 
block id: 336 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 337 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:19) - src/deep_baselines/virtifier.py (3:19) duplicated block id: 338 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 339 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/run.py (3:19) duplicated block id: 340 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 341 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 342 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 343 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 344 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 345 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 346 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 347 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:19) - src/deep_baselines/virtifier.py (3:19) duplicated block id: 348 size: 14 cleaned lines of code in 2 files: - 
src/baselines/predict.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 349 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 350 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 351 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 352 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 353 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 354 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 355 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/predictor.py (3:19) duplicated block id: 356 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 357 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 358 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 359 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:19) - src/deep_baselines/virseeker.py (3:19) duplicated block id: 360 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 
361 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/trainer.py (3:19) duplicated block id: 362 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 363 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 364 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 365 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 366 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/run.py (3:19) duplicated block id: 367 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 368 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 369 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 370 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 371 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 372 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 373 size: 14 cleaned lines of code in 2 
files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 374 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/utils.py (3:19) duplicated block id: 375 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/utils.py (3:19) duplicated block id: 376 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 377 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 378 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 379 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 380 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 381 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 382 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 383 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 384 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/run.py (3:19) duplicated block id: 385 size: 14 cleaned lines of code in 2 files: - src/run.py (3:19) - src/utils.py (3:19) duplicated block id: 386 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - 
src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 387 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_loader.py (3:19) duplicated block id: 388 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/predictor.py (3:19) duplicated block id: 389 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 390 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:19) - src/deep_baselines/virseeker.py (3:19) duplicated block id: 391 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 392 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 393 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 394 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 395 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 396 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 397 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 398 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 399 size: 14 
cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 400 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/common/loss.py (3:19) duplicated block id: 401 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 402 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 403 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 404 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 405 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 406 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/trainer.py (3:19) duplicated block id: 407 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 408 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 409 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/utils.py (3:19) duplicated block id: 410 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/evaluater.py (3:19) duplicated block id: 411 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/common/loss.py (3:19) duplicated block id: 412 size: 14 cleaned lines of code in 2 files: - 
src/geo_map/get_biosample_from_update.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 413 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 414 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 415 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 416 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 417 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/trainer.py (3:19) duplicated block id: 418 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 419 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 420 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 421 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 422 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 423 size: 14 cleaned lines of code in 2 files: - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 424 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - 
src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 425 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 426 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 427 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/predictor.py (3:19) duplicated block id: 428 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 429 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 430 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 431 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/predict.py (3:19) duplicated block id: 432 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 433 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 434 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 435 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 436 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 437 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - 
src/predict_one_sample.py (3:19) duplicated block id: 438 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 439 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/trainer.py (3:19) duplicated block id: 440 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 441 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 442 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 443 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 444 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 445 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 446 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/utils.py (3:19) duplicated block id: 447 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/evaluater.py (3:19) duplicated block id: 448 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 449 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/evaluater.py (3:19) duplicated block id: 450 size: 14 cleaned lines of code in 2 files: - 
src/common/loss.py (3:19) - src/utils.py (3:19) duplicated block id: 451 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 452 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/predict.py (3:19) duplicated block id: 453 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 454 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 455 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/utils.py (3:19) duplicated block id: 456 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 457 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 458 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 459 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 460 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:19) - src/deep_baselines/virtifier.py (3:19) duplicated block id: 461 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 462 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 463 
size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 464 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 465 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 466 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 467 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/utils.py (3:19) duplicated block id: 468 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 469 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 470 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 471 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 472 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 473 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 474 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/baselines/dnn.py (3:19) duplicated block id: 475 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - 
src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 476 size: 14 cleaned lines of code in 2 files: - src/run.py (3:19) - src/trainer.py (3:19) duplicated block id: 477 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 478 size: 14 cleaned lines of code in 2 files: - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) - src/utils.py (3:19) duplicated block id: 479 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 480 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 481 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 482 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 483 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 484 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/run.py (3:19) duplicated block id: 485 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 486 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 487 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 488 size: 14 
cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_loader.py (3:19) duplicated block id: 489 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 490 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 491 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 492 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 493 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 494 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 495 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 496 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 497 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 498 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 499 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 500 size: 14 cleaned lines of code in 2 files: - 
src/plot/plot_map_pie_fig4_2.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 501 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 502 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/trainer.py (3:19) duplicated block id: 503 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 504 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 505 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 506 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 507 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 508 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 509 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 510 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 511 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 512 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated 
block id: 513 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 514 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/utils.py (3:19) duplicated block id: 515 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 516 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 517 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 518 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 519 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 520 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 521 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 522 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_loader.py (3:19) duplicated block id: 523 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 524 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 525 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) 
duplicated block id: 526 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 527 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 528 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 529 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 530 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 531 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/predict.py (3:19) duplicated block id: 532 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 533 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 534 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 535 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 536 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 537 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 538 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - 
src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 539 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 540 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 541 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/evaluater.py (3:19) duplicated block id: 542 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 543 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 544 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 545 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 546 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 547 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 548 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 549 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 550 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 551 size: 14 
cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 552 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (53:66) - src/protein_structure/structure_from_esm_v1.py (352:365) duplicated block id: 553 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 554 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 555 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 556 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 557 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 558 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 559 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 560 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/predictor.py (3:19) duplicated block id: 561 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 562 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 563 size: 14 
cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 564 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/common/loss.py (3:19) duplicated block id: 565 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 566 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 567 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/SSFN/model.py (3:19) duplicated block id: 568 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 569 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 570 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/run.py (3:19) duplicated block id: 571 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 572 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/utils.py (3:19) duplicated block id: 573 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 574 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 575 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 576 size: 14 cleaned lines of code 
in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 577 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/predictor.py (3:19) duplicated block id: 578 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/trainer.py (3:19) duplicated block id: 579 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 580 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 581 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 582 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 583 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 584 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/predict.py (3:19) duplicated block id: 585 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 586 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 587 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 588 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 589 size: 14 cleaned lines of code in 2 files: - 
src/predict_one_sample.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 590 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 591 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 592 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (488:502) - src/deep_baselines/predict_deep_baselines.py (516:529) duplicated block id: 593 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 594 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 595 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 596 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 597 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 598 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 599 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/trainer.py (3:19) duplicated block id: 600 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/predictor.py (3:19) duplicated block id: 601 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/trainer.py (3:19) duplicated block id: 602 size: 14 cleaned 
lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 603 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 604 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 605 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 606 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 607 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/utils.py (3:19) duplicated block id: 608 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 609 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 610 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 611 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 612 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 613 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/utils.py (3:19) duplicated block id: 614 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 615 size: 
14 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:19) - src/deep_baselines/virseeker.py (3:19) duplicated block id: 616 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 617 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 618 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 619 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 620 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/utils.py (3:19) duplicated block id: 621 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 622 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 623 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/trainer.py (3:19) duplicated block id: 624 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1771:1789) - src/SSFN/modeling_bert.py (1857:1881) duplicated block id: 625 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 626 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 627 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 628 size: 14 cleaned lines of code in 2 
files: - src/baselines/dnn.py (3:19) - src/predictor.py (3:19) duplicated block id: 629 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 630 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/predict.py (3:19) duplicated block id: 631 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 632 size: 14 cleaned lines of code in 2 files: - src/predict.py (839:853) - src/predict.py (869:882) duplicated block id: 633 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/evaluater.py (3:19) duplicated block id: 634 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 635 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (32:65) - src/predict_one_sample.py (34:63) duplicated block id: 636 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 637 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 638 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/baselines/dnn.py (3:19) duplicated block id: 639 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 640 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/utils.py (3:19) duplicated block id: 641 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 642 size: 14 cleaned lines 
of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 643 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 644 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 645 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/predict.py (3:19) duplicated block id: 646 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (125:138) - src/predict_one_sample.py (136:150) duplicated block id: 647 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 648 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 649 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 650 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 651 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 652 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 653 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 654 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 655 size: 14 cleaned lines of 
code in 2 files: - src/baselines/lgbm.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 656 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 657 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 658 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 659 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 660 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/evaluater.py (3:19) duplicated block id: 661 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 662 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 663 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 664 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 665 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/evaluater.py (3:19) duplicated block id: 666 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 667 size: 14 cleaned lines of code in 2 files: - src/app/app.py (124:138) - src/deep_baselines/predict_deep_baselines.py (125:138) 
duplicated block id: 668 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 669 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 670 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 671 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/trainer.py (3:19) duplicated block id: 672 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 673 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 674 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 675 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 676 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/trainer.py (3:19) duplicated block id: 677 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 678 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 679 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 680 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/trainer.py (3:19) duplicated block id: 681 size: 14 cleaned lines of code in 2 files: - 
src/deep_baselines/statistics.py (3:19) - src/deep_baselines/virhunter.py (3:19) duplicated block id: 682 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 683 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:19) - src/deep_baselines/virhunter.py (3:19) duplicated block id: 684 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 685 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 686 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 687 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 688 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 689 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 690 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 691 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 692 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 693 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/get_biosample_from_entrez.py 
(3:19) duplicated block id: 694 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 695 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 696 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 697 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 698 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 699 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 700 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 701 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 702 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 703 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (488:502) - src/predict.py (869:882) duplicated block id: 704 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 705 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 706 size: 14 
cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 707 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 708 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 709 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 710 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/predictor.py (3:19) duplicated block id: 711 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 712 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 713 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/run.py (3:19) duplicated block id: 714 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 715 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 716 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 717 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 718 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/predictor.py (3:19) duplicated block id: 719 size: 14 cleaned lines of code in 2 files: - 
src/SSFN/pooling.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 720 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/SSFN/modeling_bert.py (3:19) duplicated block id: 721 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 722 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 723 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 724 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/predict.py (3:19) duplicated block id: 725 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (98:111) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (61:74) duplicated block id: 726 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:19) - src/deep_baselines/statistics.py (3:19) duplicated block id: 727 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/trainer.py (3:19) duplicated block id: 728 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 729 size: 14 cleaned lines of code in 2 files: - src/trainer.py (3:19) - src/utils.py (3:19) duplicated block id: 730 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 731 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/trainer.py (3:19) duplicated block id: 732 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 733 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 734 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 735 size: 14 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 736 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 737 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 738 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 739 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 740 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 741 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 742 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 743 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 744 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - 
src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 745 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (37:52) - src/plot/plot_map_pie_fig_aff4_2.py (210:225) duplicated block id: 746 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 747 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 748 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/trainer.py (3:19) duplicated block id: 749 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 750 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/predictor.py (3:19) duplicated block id: 751 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 752 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 753 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_loader.py (3:19) duplicated block id: 754 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/predict.py (3:19) duplicated block id: 755 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 756 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/predictor.py (3:19) duplicated block id: 757 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:19) - src/deep_baselines/virhunter.py (3:19) duplicated block id: 758 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 759 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 760 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 761 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/common/loss.py (3:19) duplicated block id: 762 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 763 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 764 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 765 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 766 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 767 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/evaluater.py (3:19) duplicated block id: 768 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 769 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 770 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/predict.py (3:19) duplicated block id: 771 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 772 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/predict.py (3:19) duplicated block id: 773 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 774 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 775 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 776 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 777 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 778 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 779 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/evaluater.py (3:19) duplicated block id: 780 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 781 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 782 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/SSFN/pooling.py (3:19) duplicated block id: 783 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/utils.py (3:19) duplicated block id: 784 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - 
src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 785 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 786 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 787 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/predictor.py (3:19) duplicated block id: 788 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 789 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 790 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 791 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/SSFN/pooling.py (3:19) duplicated block id: 792 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 793 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 794 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 795 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/trainer.py (3:19) duplicated block id: 796 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 797 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py 
(3:19) - src/predict_many_samples.py (3:19) duplicated block id: 798 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/evaluater.py (3:19) duplicated block id: 799 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 800 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 801 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 802 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/run.py (3:19) duplicated block id: 803 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 804 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 805 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/run.py (3:19) duplicated block id: 806 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_loader.py (3:19) duplicated block id: 807 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 808 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 809 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 810 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 811 size: 14 cleaned lines of code in 
2 files: - src/baselines/dnn.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 812 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 813 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 814 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 815 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 816 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 817 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 818 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 819 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 820 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 821 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 822 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/predict.py (3:19) duplicated block id: 823 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - 
src/protein_structure/predict_structure.py (3:19) duplicated block id: 824 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 825 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 826 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 827 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 828 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 829 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 830 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/common/loss.py (3:19) duplicated block id: 831 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 832 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 833 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/predict.py (3:19) duplicated block id: 834 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 835 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/predict.py (3:19) duplicated block id: 836 size: 14 
cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 837 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 838 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/predictor.py (3:19) duplicated block id: 839 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 840 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 841 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 842 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 843 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/run.py (3:19) duplicated block id: 844 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 845 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 846 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/predict_one_sample.py (3:19) duplicated block id: 847 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/trainer.py (3:19) duplicated block id: 848 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 849 size: 14 cleaned lines 
of code in 2 files: - src/predict_many_samples.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 850 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/predict.py (3:19) duplicated block id: 851 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 852 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 853 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 854 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 855 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 856 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/utils.py (3:19) duplicated block id: 857 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 858 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 859 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 860 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 861 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/run.py (3:19) duplicated block id: 
862 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 863 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 864 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 865 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 866 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 867 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/run.py (3:19) duplicated block id: 868 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/run.py (3:19) duplicated block id: 869 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 870 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/common/loss.py (3:19) duplicated block id: 871 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/utils.py (3:19) duplicated block id: 872 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 873 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/baselines/dnn.py (3:19) duplicated block id: 874 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 875 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - 
src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 876 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 877 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 878 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 879 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 880 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 881 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/utils.py (3:19) duplicated block id: 882 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 883 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 884 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/predictor.py (3:19) duplicated block id: 885 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 886 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 887 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/predictor.py (3:19) duplicated block id: 888 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 889 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 890 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 891 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/trainer.py (3:19) duplicated block id: 892 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/utils.py (3:19) duplicated block id: 893 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 894 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 895 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 896 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 897 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 898 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 899 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 900 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (634:648) - src/data_loader.py (1073:1086) duplicated block id: 901 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py 
(3:19) - src/predict.py (3:19) duplicated block id: 902 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 903 size: 14 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (3:19) - src/utils.py (3:19) duplicated block id: 904 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/trainer.py (3:19) duplicated block id: 905 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 906 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 907 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/trainer.py (3:19) duplicated block id: 908 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 909 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 910 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 911 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 912 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 913 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 914 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/evaluater.py (3:19) duplicated 
block id: 915 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 916 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 917 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/predictor.py (3:19) duplicated block id: 918 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 919 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/utils.py (3:19) duplicated block id: 920 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 921 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (117:135) - src/data_preprocess/tf_records_generator.py (60:78) duplicated block id: 922 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 923 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 924 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 925 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 926 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 927 size: 14 cleaned lines of code in 2 files: - 
src/common/multi_label_metrics.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 928 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 929 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/utils.py (3:19) duplicated block id: 930 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/evaluater.py (3:19) duplicated block id: 931 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 932 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/run.py (3:19) duplicated block id: 933 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 934 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 935 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/utils.py (3:19) duplicated block id: 936 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 937 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 938 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 939 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 940 size: 14 cleaned 
lines of code in 2 files: - src/result_process/process_predict_result.py (3:19) - src/utils.py (3:19) duplicated block id: 941 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 942 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 943 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 944 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1578:1593) - src/SSFN/modeling_bert.py (1689:1704) duplicated block id: 945 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 946 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 947 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 948 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 949 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_loader.py (3:19) duplicated block id: 950 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 951 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 952 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/run.py (3:19) duplicated block id: 953 size: 14 cleaned 
lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 954 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 955 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 956 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_loader.py (3:19) duplicated block id: 957 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 958 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/utils.py (3:19) duplicated block id: 959 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 960 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/run.py (3:19) duplicated block id: 961 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 962 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 963 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 964 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 965 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - 
src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 966 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 967 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 968 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 969 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/utils.py (3:19) duplicated block id: 970 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 971 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 972 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 973 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 974 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 975 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 976 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/trainer.py (3:19) duplicated block id: 977 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 978 size: 14 cleaned lines of code 
in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/predict.py (3:19) duplicated block id: 979 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 980 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/utils.py (3:19) duplicated block id: 981 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/predict.py (3:19) duplicated block id: 982 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 983 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 984 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 985 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 986 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 987 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/run.py (3:19) duplicated block id: 988 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 989 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 990 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 
991 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 992 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 993 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/predictor.py (3:19) duplicated block id: 994 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 995 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/utils.py (3:19) duplicated block id: 996 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 997 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 998 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 999 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1000 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1001 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1002 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1003 size: 14 cleaned 
lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1004 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1005 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1006 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/run.py (3:19) duplicated block id: 1007 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1008 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1009 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 1010 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1011 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1012 size: 14 cleaned lines of code in 2 files: - src/result_process/process_predict_result.py (3:19) - src/trainer.py (3:19) duplicated block id: 1013 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1014 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1015 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/utils.py (3:19) duplicated block id: 1016 size: 14 
cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1017 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/trainer.py (3:19) duplicated block id: 1018 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1019 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 1020 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1021 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/utils.py (3:19) duplicated block id: 1022 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1023 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1024 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/predictor.py (3:19) duplicated block id: 1025 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/utils.py (3:19) duplicated block id: 1026 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/evaluater.py (3:19) duplicated block id: 1027 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1028 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/SSFN/model.py (3:19) duplicated block id: 1029 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - 
src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1030 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1031 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1032 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/trainer.py (3:19) duplicated block id: 1033 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1034 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1035 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/predict.py (3:19) duplicated block id: 1036 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 1037 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1038 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1039 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 1040 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1041 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1042 size: 14 
cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1043 size: 14 cleaned lines of code in 2 files: - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1044 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1045 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1046 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1047 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/predict.py (3:19) duplicated block id: 1048 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 1049 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1050 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 1051 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (574:587) - src/baselines/xgb.py (542:555) duplicated block id: 1052 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1053 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/predictor.py (3:19) duplicated block id: 1054 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/predict.py (3:19) duplicated block id: 1055 size: 14 cleaned 
lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1056 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1057 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1058 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 1059 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/run.py (3:19) duplicated block id: 1060 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/baselines/predict.py (3:19) duplicated block id: 1061 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1062 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1063 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1064 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/common/loss.py (3:19) duplicated block id: 1065 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 1066 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1067 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated 
block id: 1068 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 1069 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/run.py (3:19) duplicated block id: 1070 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1071 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1072 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1073 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1074 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1075 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1076 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1077 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1078 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1079 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated 
block id: 1080 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1081 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_loader.py (3:19) duplicated block id: 1082 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/predictor.py (3:19) duplicated block id: 1083 size: 14 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1084 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1085 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/predict.py (3:19) duplicated block id: 1086 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/run.py (3:19) duplicated block id: 1087 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1088 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1089 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1090 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1091 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1092 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/SSFN/pooling.py (3:19) duplicated block id: 1093 size: 14 cleaned lines of code in 2 files: - 
src/biotoolbox/contact_map_builder.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1094 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1095 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/predict.py (3:19) duplicated block id: 1096 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1097 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1098 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (334:353) - src/baselines/xgb.py (311:330) duplicated block id: 1099 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1100 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1101 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1102 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1103 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1104 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1105 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py 
(3:19) duplicated block id: 1106 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (177:192) - src/plot/plot_map_pie_fig4_2.py (211:226) duplicated block id: 1107 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1108 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1109 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1110 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1111 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1112 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1113 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1114 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1115 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/predictor.py (3:19) duplicated block id: 1116 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/run.py (3:19) duplicated block id: 1117 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/utils.py (3:19) duplicated block id: 1118 size: 14 cleaned lines of code in 2 files: - src/data_loader.py 
(3:19) - src/predict.py (3:19) duplicated block id: 1119 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1120 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1121 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/utils.py (3:19) duplicated block id: 1122 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1123 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1124 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1125 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1126 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/baselines/xgb.py (3:19) duplicated block id: 1127 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1128 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1129 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/trainer.py (3:19) duplicated block id: 1130 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/predict_many_samples.py (3:19) duplicated block id: 1131 size: 14 cleaned lines of code in 2 files: - 
src/evaluater.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1132 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 1133 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1134 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:19) - src/deep_baselines/virtifier.py (3:19) duplicated block id: 1135 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/run.py (3:19) duplicated block id: 1136 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1137 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1138 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (516:529) - src/predict.py (839:853) duplicated block id: 1139 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1140 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1141 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1142 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/predictor.py (3:19) duplicated block id: 1143 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1144 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1145 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/trainer.py (3:19) duplicated block id: 1146 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 1147 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/common/loss.py (3:19) duplicated block id: 1148 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 1149 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1150 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1151 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 1152 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1153 size: 14 cleaned lines of code in 2 files: - src/predictor.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1154 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1155 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1156 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1157 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/verify_train_dataset.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1158 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/predictor.py (3:19) duplicated block id: 1159 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1160 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1161 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1162 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1163 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1164 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1165 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1166 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1167 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1168 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (125:138) - src/predict_many_samples.py (134:148) duplicated block id: 1169 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py 
(1:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1170 size: 14 cleaned lines of code in 2 files: - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) - src/trainer.py (3:19) duplicated block id: 1171 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/predictor.py (3:19) duplicated block id: 1172 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1173 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1174 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/common/loss.py (3:19) duplicated block id: 1175 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1176 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1177 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1178 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/predict.py (3:19) duplicated block id: 1179 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1180 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:19) - src/deep_baselines/statistics.py (3:19) duplicated block id: 1181 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1182 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1183 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/predictor.py (3:19) duplicated block id: 1184 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1185 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1186 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 1187 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1188 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_loader.py (3:19) duplicated block id: 1189 size: 14 cleaned lines of code in 2 files: - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) - src/run.py (3:19) duplicated block id: 1190 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/run.py (3:19) duplicated block id: 1191 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1192 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1193 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/trainer.py (3:19) duplicated block id: 1194 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - 
src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1195 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1196 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/run.py (3:19) duplicated block id: 1197 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1198 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1199 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1200 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1201 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_loader.py (3:19) duplicated block id: 1202 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1203 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1204 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1205 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/trainer.py (3:19) duplicated block id: 1206 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1207 size: 14 cleaned lines of code in 2 files: - 
src/baselines/dnn.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1208 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1209 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/predictor.py (3:19) duplicated block id: 1210 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1211 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1212 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1213 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1214 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1215 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1216 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1217 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 1218 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1219 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1220 size: 
14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1221 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1222 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1223 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1224 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1225 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/predictor.py (3:19) duplicated block id: 1226 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1227 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/utils.py (3:19) duplicated block id: 1228 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1229 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1230 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1231 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1232 size: 14 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (3:19) - src/trainer.py (3:19) duplicated block id: 1233 size: 14 cleaned lines of 
code in 2 files: - src/evaluater.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1234 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/trainer.py (3:19) duplicated block id: 1235 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1236 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1237 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1238 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1239 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1240 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1241 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/predict.py (3:19) duplicated block id: 1242 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1243 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1244 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/predict.py (3:19) duplicated block id: 1245 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1246 
size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1247 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1248 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1249 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1250 size: 14 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1251 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:19) - src/deep_baselines/run.py (3:19) duplicated block id: 1252 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1253 size: 14 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (3:19) - src/run.py (3:19) duplicated block id: 1254 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1255 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1256 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1257 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1258 size: 14 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1259 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1260 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (37:52) - src/plot/plot_map_pie_fig4_2.py (211:226) duplicated block id: 1261 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/utils.py (3:19) duplicated block id: 1262 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1263 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1264 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/utils.py (3:19) duplicated block id: 1265 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1266 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1267 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1268 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/run.py (3:19) duplicated block id: 1269 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1270 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py 
(3:19) duplicated block id: 1271 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1272 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (125:138) - src/predict.py (135:149) duplicated block id: 1273 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1274 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1275 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1276 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/run.py (3:19) duplicated block id: 1277 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1278 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/trainer.py (3:19) duplicated block id: 1279 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/run.py (3:19) duplicated block id: 1280 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1281 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/predict.py (3:19) duplicated block id: 1282 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/trainer.py (3:19) duplicated block id: 1283 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/SSFN/pooling.py (3:19) duplicated block id: 1284 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - 
src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1285 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1286 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1287 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/SSFN/modeling_bert.py (3:19) duplicated block id: 1288 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1289 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1290 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1291 size: 14 cleaned lines of code in 2 files: - src/predict_many_samples.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1292 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/predictor.py (3:19) duplicated block id: 1293 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1294 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/trainer.py (3:19) duplicated block id: 1295 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/common/loss.py (3:19) duplicated block id: 1296 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/predict.py (3:19) duplicated block id: 1297 size: 14 cleaned lines of code in 2 files: - 
src/baselines/lgbm.py (3:19) - src/predictor.py (3:19) duplicated block id: 1298 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1299 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1300 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/run.py (3:19) duplicated block id: 1301 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1302 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1303 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1304 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1305 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1306 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1307 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (3:19) - src/predict.py (3:19) duplicated block id: 1308 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1309 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (32:65) - src/predict.py (35:63) duplicated block id: 1310 size: 14 cleaned 
lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1311 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1312 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/run.py (3:19) duplicated block id: 1313 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1314 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) duplicated block id: 1315 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (177:192) - src/plot/plot_map_pie_fig_aff4_2.py (210:225) duplicated block id: 1316 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/utils.py (3:19) duplicated block id: 1317 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/run.py (3:19) duplicated block id: 1318 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1319 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1320 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1321 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1322 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) 
duplicated block id: 1323 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1324 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1325 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1326 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1327 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1328 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1329 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1330 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1331 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1332 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1333 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 1334 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1335 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/predict.py (3:19) 
duplicated block id: 1336 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1337 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/geo_map/extract_attr_from_biosample_page.py (3:19) duplicated block id: 1338 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1339 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/trainer.py (3:19) duplicated block id: 1340 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (332:345) - src/protein_structure/embedding_from_esmfold.py (351:364) duplicated block id: 1341 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/run.py (3:19) duplicated block id: 1342 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 1343 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1344 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1345 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1346 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1347 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1348 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/download_biosample_page.py (3:19) 
duplicated block id: 1349 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1350 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1351 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1352 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1353 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/common/loss.py (3:19) duplicated block id: 1354 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1355 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1356 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 1357 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_loader.py (3:19) duplicated block id: 1358 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (95:113) - src/data_preprocess/tf_records_generator.py (60:78) duplicated block id: 1359 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/predictor.py (3:19) duplicated block id: 1360 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/run.py (3:19) duplicated block id: 1361 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - 
src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1362 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1363 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/biotoolbox/structure_file_reader.py (3:19) duplicated block id: 1364 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1365 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1366 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1367 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/predictor.py (3:19) duplicated block id: 1368 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1369 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1370 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1371 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1372 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/run.py (3:19) duplicated block id: 1373 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1374 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - 
src/data_preprocess/tf_records_generator.py (3:19) duplicated block id: 1375 size: 14 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:19) - src/run.py (3:19) duplicated block id: 1376 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1377 size: 14 cleaned lines of code in 2 files: - src/predict.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1378 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1379 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1380 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/deep_baselines/predict_deep_baselines.py (1:17) duplicated block id: 1381 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) - src/utils.py (3:19) duplicated block id: 1382 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1383 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1384 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1385 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/baselines/dnn.py (3:19) duplicated block id: 1386 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated 
block id: 1387 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/predictor.py (3:19) duplicated block id: 1388 size: 14 cleaned lines of code in 2 files: - src/baselines/predict.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1389 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1390 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1391 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1392 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1393 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/trainer.py (3:19) duplicated block id: 1394 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1395 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/utils.py (3:19) duplicated block id: 1396 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/utils.py (3:19) duplicated block id: 1397 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 1398 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/protein_structure/embedding_from_esmfold.py (3:19) duplicated block id: 1399 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1400 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - 
src/predict_one_sample.py (3:19) duplicated block id: 1401 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1402 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) duplicated block id: 1403 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1404 size: 14 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1405 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1406 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/geo_map/get_biosample_from_update.py (3:19) duplicated block id: 1407 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1408 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1409 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1410 size: 14 cleaned lines of code in 2 files: - src/geo_map/extract_attr_from_biosample_page.py (3:19) - src/geo_map/standardization_lat_lon_info.py (3:19) duplicated block id: 1411 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1412 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/run.py (3:19) duplicated 
block id: 1413 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) - src/trainer.py (3:19) duplicated block id: 1414 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1415 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1416 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (3:19) - src/predict.py (3:19) duplicated block id: 1417 size: 14 cleaned lines of code in 2 files: - src/predict_one_sample.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1418 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/data_preprocess/verify_train_dataset.py (3:19) duplicated block id: 1419 size: 14 cleaned lines of code in 2 files: - src/SSFN/model.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1420 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/trainer.py (3:19) duplicated block id: 1421 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/common/metrics.py (3:19) duplicated block id: 1422 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/run.py (3:19) duplicated block id: 1423 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/trainer.py (3:19) duplicated block id: 1424 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1425 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) 
duplicated block id: 1426 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1427 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:19) duplicated block id: 1428 size: 14 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:19) - src/utils.py (3:19) duplicated block id: 1429 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:19) - src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1430 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/biotoolbox/contact_map_generator.py (3:19) duplicated block id: 1431 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 1432 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) duplicated block id: 1433 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/predictor.py (3:19) duplicated block id: 1434 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1435 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:19) - src/data_preprocess/subword.py (3:19) duplicated block id: 1436 size: 14 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1437 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1438 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py 
(3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1439 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (37:52) - src/plot/plot_map_pie_fig4_1.py (177:192) duplicated block id: 1440 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/baselines/lgbm.py (3:19) duplicated block id: 1441 size: 14 cleaned lines of code in 2 files: - src/data_loader.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1442 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1443 size: 14 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:19) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) duplicated block id: 1444 size: 14 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) - src/protein_structure/structure_from_esm_v1.py (3:19) duplicated block id: 1445 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/predict.py (3:19) duplicated block id: 1446 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (3:19) - src/run.py (3:19) duplicated block id: 1447 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1448 size: 14 cleaned lines of code in 2 files: - src/evaluater.py (3:19) - src/result_process/process_predict_result.py (3:19) duplicated block id: 1449 size: 14 cleaned lines of code in 2 files: - src/common/metrics.py (3:19) - src/plot/plot_map_pie_fig_aff4_2.py (3:19) duplicated block id: 1450 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1451 size: 14 cleaned lines of code in 2 files: 
- src/geo_map/get_biosample_from_entrez.py (3:19) - src/plot/plot_map_pie_fig_aff4_1.py (3:19) duplicated block id: 1452 size: 14 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (3:19) - src/predictor.py (3:19) duplicated block id: 1453 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/protein_structure/predict_structure.py (3:19) duplicated block id: 1454 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/trainer.py (3:19) duplicated block id: 1455 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/run.py (3:19) duplicated block id: 1456 size: 14 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:17) - src/geo_map/get_biosample_from_entrez.py (3:19) duplicated block id: 1457 size: 14 cleaned lines of code in 2 files: - src/common/loss.py (3:19) - src/plot/plot_map_pie_fig4_1.py (3:19) duplicated block id: 1458 size: 14 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:19) - src/biotoolbox/contact_map_builder.py (3:19) duplicated block id: 1459 size: 14 cleaned lines of code in 2 files: - src/geo_map/self_testing_nearest_station_antarctica.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1460 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:19) - src/geo_map/get_biosample_from_manual.py (3:19) duplicated block id: 1461 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:19) duplicated block id: 1462 size: 14 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (3:19) - src/utils.py (3:19) duplicated block id: 1463 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (3:19) - src/trainer.py (3:19) duplicated block id: 1464 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - 
src/plot/plot_map_pie_fig4_2.py (3:19) duplicated block id: 1465 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:19) - src/evaluater.py (3:19) duplicated block id: 1466 size: 14 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (3:19) - src/protein_structure/merge_embedding_pdb_result.py (3:19) duplicated block id: 1467 size: 14 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:19) - src/predict_one_sample.py (3:19) duplicated block id: 1468 size: 14 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:19) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:19) duplicated block id: 1469 size: 14 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_update.py (3:19) - src/predict_many_samples.py (3:19) duplicated block id: 1470 size: 14 cleaned lines of code in 2 files: - src/geo_map/download_biosample_page.py (3:19) - src/predictor.py (3:19) duplicated block id: 1471 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:19) duplicated block id: 1472 size: 14 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:19) - src/geo_map/download_biosample_page.py (3:19) duplicated block id: 1473 size: 14 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:19) - src/common/multi_label_metrics.py (3:19) duplicated block id: 1474 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/cheer.py (361:374) duplicated block id: 1475 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/cheer.py (503:516) duplicated block id: 1476 size: 13 cleaned lines of code in 2 files: - src/common/metrics.py (363:375) - src/common/metrics.py (411:423) duplicated block id: 1477 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (196:208) - 
src/baselines/xgb.py (190:202) duplicated block id: 1478 size: 13 cleaned lines of code in 2 files: - src/baselines/dnn.py (525:539) - src/baselines/xgb.py (406:420) duplicated block id: 1479 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (157:169) - src/data_preprocess/tf_records_generator.py (101:113) duplicated block id: 1480 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1107:1151) - src/SSFN/modeling_bert.py (1857:1879) duplicated block id: 1481 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1107:1151) - src/SSFN/modeling_bert.py (1771:1787) duplicated block id: 1482 size: 13 cleaned lines of code in 2 files: - src/baselines/dnn.py (743:755) - src/trainer.py (366:378) duplicated block id: 1483 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1107:1151) - src/SSFN/modeling_bert.py (1570:1588) duplicated block id: 1484 size: 13 cleaned lines of code in 2 files: - src/baselines/xgb.py (126:138) - src/deep_baselines/run.py (305:317) duplicated block id: 1485 size: 13 cleaned lines of code in 2 files: - src/deep_baselines/run.py (647:659) - src/trainer.py (296:308) duplicated block id: 1486 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/cheer.py (215:228) duplicated block id: 1487 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (157:169) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (173:185) duplicated block id: 1488 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (173:185) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (133:145) duplicated block id: 1489 size: 13 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (311:325) - src/plot/plot_map_pie_fig_aff4_2.py (327:341) duplicated block id: 1490 size: 13 cleaned lines of code in 2 files: - 
src/SSFN/modeling_bert.py (1656:1668) - src/SSFN/modeling_bert.py (1757:1769) duplicated block id: 1491 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (253:265) - src/baselines/xgb.py (241:253) duplicated block id: 1492 size: 13 cleaned lines of code in 2 files: - src/predict.py (327:339) - src/predict_one_sample.py (348:360) duplicated block id: 1493 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (128:140) - src/deep_baselines/run.py (305:317) duplicated block id: 1494 size: 13 cleaned lines of code in 2 files: - src/predict_many_samples.py (536:552) - src/predict_one_sample.py (534:550) duplicated block id: 1495 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (441:455) - src/baselines/xgb.py (406:420) duplicated block id: 1496 size: 13 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (337:350) - src/plot/plot_map_pie_fig_aff4_2.py (354:368) duplicated block id: 1497 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (133:145) - src/data_preprocess/tf_records_generator.py (101:113) duplicated block id: 1498 size: 13 cleaned lines of code in 2 files: - src/run.py (788:801) - src/run.py (811:823) duplicated block id: 1499 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/virhunter.py (189:202) duplicated block id: 1500 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (177:189) - src/baselines/xgb.py (173:185) duplicated block id: 1501 size: 13 cleaned lines of code in 2 files: - src/predict.py (327:339) - src/predict_many_samples.py (345:357) duplicated block id: 1502 size: 13 cleaned lines of code in 2 files: - src/baselines/dnn.py (171:183) - src/deep_baselines/run.py (305:317) duplicated block id: 1503 size: 13 cleaned lines of code in 2 files: - src/deep_baselines/run.py (701:713) - src/trainer.py (366:378) duplicated block id: 1504 size: 13 cleaned lines of code in 2 
files: - src/SSFN/modeling_bert.py (1556:1568) - src/SSFN/modeling_bert.py (1656:1668) duplicated block id: 1505 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (272:284) - src/baselines/xgb.py (258:270) duplicated block id: 1506 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1556:1568) - src/SSFN/modeling_bert.py (1757:1769) duplicated block id: 1507 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/baselines/dnn.py (418:431) duplicated block id: 1508 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/virseeker.py (200:213) duplicated block id: 1509 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (234:246) - src/baselines/xgb.py (224:236) duplicated block id: 1510 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (215:227) - src/baselines/xgb.py (207:219) duplicated block id: 1511 size: 13 cleaned lines of code in 2 files: - src/baselines/dnn.py (689:701) - src/trainer.py (296:308) duplicated block id: 1512 size: 13 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (213:228) - src/plot/plot_map_pie_fig4_2.py (247:262) duplicated block id: 1513 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (133:145) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (149:161) duplicated block id: 1514 size: 13 cleaned lines of code in 2 files: - src/baselines/xgb.py (406:420) - src/deep_baselines/run.py (482:496) duplicated block id: 1515 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1570:1588) - src/SSFN/modeling_bert.py (1857:1879) duplicated block id: 1516 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1570:1588) - src/SSFN/modeling_bert.py (1771:1787) duplicated block id: 1517 size: 13 cleaned lines of code in 2 files: - src/geo_map/parse_loc_lat_lon_from_attr.py (40:52) - 
src/geo_map/standardization_lat_lon_info.py (36:48) duplicated block id: 1518 size: 13 cleaned lines of code in 2 files: - src/baselines/lgbm.py (441:455) - src/deep_baselines/run.py (482:496) duplicated block id: 1519 size: 13 cleaned lines of code in 2 files: - src/app/app.py (329:341) - src/predict.py (327:339) duplicated block id: 1520 size: 13 cleaned lines of code in 2 files: - src/baselines/dnn.py (525:539) - src/baselines/lgbm.py (441:455) duplicated block id: 1521 size: 13 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1490:1504) - src/SSFN/modeling_bert.py (1576:1590) duplicated block id: 1522 size: 13 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (157:169) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (149:161) duplicated block id: 1523 size: 13 cleaned lines of code in 2 files: - src/SSFN/model.py (423:436) - src/deep_baselines/virtifier.py (222:235) duplicated block id: 1524 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) duplicated block id: 1525 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1526 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1527 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1528 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1529 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1530 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:17) - src/deep_baselines/virhunter.py 
(3:17) duplicated block id: 1531 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1532 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1533 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/predict.py (3:17) duplicated block id: 1534 size: 12 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1535 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1536 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1537 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/predictor.py (3:17) duplicated block id: 1538 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/utils.py (3:17) duplicated block id: 1539 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1540 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1541 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1542 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:15) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1543 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1544 size: 12 
cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1545 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1546 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (173:184) - src/baselines/xgb.py (190:201) duplicated block id: 1547 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1548 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1439:1450) - src/SSFN/modeling_bert.py (1758:1769) duplicated block id: 1549 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (177:188) - src/baselines/lgbm.py (196:207) duplicated block id: 1550 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1551 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/trainer.py (3:17) duplicated block id: 1552 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1553 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (196:207) - src/baselines/xgb.py (173:184) duplicated block id: 1554 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1555 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1556 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1557 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - 
src/deep_baselines/virhunter.py (3:17) duplicated block id: 1558 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1559 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1560 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1561 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1562 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1563 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1564 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1565 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/predictor.py (3:17) duplicated block id: 1566 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1567 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1568 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1569 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1570 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - 
src/deep_baselines/predict_deep_baselines.py (1:15) duplicated block id: 1571 size: 12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1572 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/predictor.py (3:17) duplicated block id: 1573 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1574 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1575 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1576 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1577 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1578 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1579 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/predictor.py (3:17) duplicated block id: 1580 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1581 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/utils.py (3:17) duplicated block id: 1582 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1583 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated 
block id: 1584 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:15) - src/deep_baselines/run.py (3:17) duplicated block id: 1585 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1586 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1587 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) duplicated block id: 1588 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/predict.py (3:17) duplicated block id: 1589 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1590 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (368:380) - src/predict.py (725:737) duplicated block id: 1591 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1592 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1593 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1594 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1595 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1596 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - 
src/deep_baselines/statistics.py (3:17) duplicated block id: 1597 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/trainer.py (3:17) duplicated block id: 1598 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/run.py (3:17) duplicated block id: 1599 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1600 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1601 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1602 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1603 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated block id: 1604 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1605 size: 12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1606 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1607 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1608 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1609 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1610 size: 12 cleaned 
lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1611 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1612 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1613 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/predict.py (3:17) duplicated block id: 1614 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (438:452) - src/baselines/lgbm.py (337:351) duplicated block id: 1615 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1616 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1617 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1618 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1619 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1620 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1621 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1622 size: 12 cleaned lines of code in 2 files: - src/predictor.py (208:219) - src/predictor.py (224:235) duplicated block id: 1623 size: 12 cleaned lines 
of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1624 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1625 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1626 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1627 size: 12 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1628 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1629 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1630 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1631 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1632 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/run.py (3:17) duplicated block id: 1633 size: 12 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1634 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1635 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1636 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py 
(3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1637 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1638 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1639 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1640 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1641 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1642 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1643 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1644 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1645 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1439:1450) - src/SSFN/modeling_bert.py (1557:1568) duplicated block id: 1646 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated block id: 1647 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1648 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1649 size: 12 cleaned lines of 
code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1650 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1651 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1652 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/predict.py (3:17) duplicated block id: 1653 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1654 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1655 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1439:1450) - src/SSFN/modeling_bert.py (1657:1668) duplicated block id: 1656 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1657 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1658 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) duplicated block id: 1659 size: 12 cleaned lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1660 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1661 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1492:1504) - src/SSFN/modeling_bert.py (1689:1701) duplicated block id: 1662 size: 12 cleaned lines of 
code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1663 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1664 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1665 size: 12 cleaned lines of code in 2 files: - src/predict.py (417:430) - src/predict_many_samples.py (434:447) duplicated block id: 1666 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1667 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1668 size: 12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1669 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1670 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1671 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1672 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1673 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1674 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1675 size: 
12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/utils.py (3:17) duplicated block id: 1676 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1677 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1678 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1679 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1680 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1681 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (177:188) - src/baselines/xgb.py (190:201) duplicated block id: 1682 size: 12 cleaned lines of code in 2 files: - src/predict.py (417:430) - src/predict_one_sample.py (437:450) duplicated block id: 1683 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1684 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1685 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1686 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1687 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1688 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - 
src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1689 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1690 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1691 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1692 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1693 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1694 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1695 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1139:1151) - src/SSFN/modeling_bert.py (1490:1502) duplicated block id: 1696 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1697 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/extract_attr_from_biosample_page.py (3:17) duplicated block id: 1698 size: 12 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1699 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/trainer.py (3:17) duplicated block id: 1700 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1701 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 
1702 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/get_biosample_from_manual.py (3:17) duplicated block id: 1703 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1704 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1208:1219) - src/SSFN/modeling_bert.py (1348:1359) duplicated block id: 1705 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1706 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (468:479) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (513:524) duplicated block id: 1707 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/predictor.py (3:17) duplicated block id: 1708 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/run.py (3:17) duplicated block id: 1709 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1710 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1711 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/utils.py (3:17) duplicated block id: 1712 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1713 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1714 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/predict.py (3:17) duplicated block id: 1715 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py 
(3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1716 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1717 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1718 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1719 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1720 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1721 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) duplicated block id: 1722 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1723 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/predict.py (3:17) duplicated block id: 1724 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/trainer.py (3:17) duplicated block id: 1725 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1726 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1727 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1728 size: 12 cleaned lines of code in 2 files: - 
src/baselines/xgb.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1729 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1730 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1731 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1732 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1733 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1734 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1735 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1736 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1737 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1738 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1739 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1740 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:15) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1741 size: 12 cleaned lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block 
id: 1742 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1743 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1744 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1745 size: 12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1746 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1747 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1748 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1749 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1750 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated block id: 1751 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated block id: 1752 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1753 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/plot/plot_map_pie_fig4_1.py (3:17) duplicated block id: 1754 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1755 size: 12 cleaned lines of code in 2 files: 
- src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1756 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1757 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1758 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/run.py (3:17) duplicated block id: 1759 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1760 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1761 size: 12 cleaned lines of code in 2 files: - src/baselines/predict.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1762 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/utils.py (3:17) duplicated block id: 1763 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1764 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated block id: 1765 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/trainer.py (3:17) duplicated block id: 1766 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/get_biosample_from_entrez.py (3:17) duplicated block id: 1767 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/run.py (3:17) duplicated block id: 1768 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) 
duplicated block id: 1769 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/standardization_lat_lon_info.py (3:17) duplicated block id: 1770 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/predictor.py (3:17) duplicated block id: 1771 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1772 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/tf_records_generator.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1773 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1774 size: 12 cleaned lines of code in 2 files: - src/baselines/xgb.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1775 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1776 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1777 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1778 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/virhunter.py (3:17) duplicated block id: 1779 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1780 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1781 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (353:369) - src/data_loader.py (423:439) duplicated block 
id: 1782 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1783 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1784 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1785 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1786 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1787 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1788 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/plot/plot_map_pie_fig_aff4_1.py (3:17) duplicated block id: 1789 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1790 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1791 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1792 size: 12 cleaned lines of code in 2 files: - src/app/app.py (418:431) - src/predict.py (417:430) duplicated block id: 1793 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1794 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated 
block id: 1795 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/predict_many_samples.py (3:17) duplicated block id: 1796 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1797 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/protein_structure/structure_from_esm_v1.py (3:17) duplicated block id: 1798 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated block id: 1799 size: 12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1800 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1801 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated block id: 1802 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/geo_map/self_testing_nearest_station_antarctica.py (3:17) duplicated block id: 1803 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1804 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1805 size: 12 cleaned lines of code in 2 files: - src/SSFN/pooling.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1806 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1807 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - 
src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1808 size: 12 cleaned lines of code in 2 files: - src/common/metrics.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1809 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/structure_file_reader.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1810 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/protein_structure/embedding_from_esmfold.py (3:17) duplicated block id: 1811 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:15) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1812 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/trainer.py (3:17) duplicated block id: 1813 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/result_process/process_predict_result.py (3:17) duplicated block id: 1814 size: 12 cleaned lines of code in 2 files: - src/baselines/dnn.py (438:452) - src/baselines/xgb.py (314:328) duplicated block id: 1815 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/protein_structure/merge_embedding_pdb_result.py (3:17) duplicated block id: 1816 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_builder.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1817 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1818 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1819 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/geo_map/download_biosample_page.py (3:17) duplicated block id: 1820 size: 12 cleaned lines of code in 2 
files: - src/deep_baselines/run.py (3:17) - src/predict_one_sample.py (3:17) duplicated block id: 1821 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virtifier.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1822 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/plot/plot_map_pie_fig_aff4_2.py (3:17) duplicated block id: 1823 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/run.py (3:17) - src/plot/plot_map_pie_fig4_2.py (3:17) duplicated block id: 1824 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated block id: 1825 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/run.py (3:17) duplicated block id: 1826 size: 12 cleaned lines of code in 2 files: - src/SSFN/layers.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1827 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (1:15) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1828 size: 12 cleaned lines of code in 2 files: - src/data_loader.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1829 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/evaluater.py (3:17) duplicated block id: 1830 size: 12 cleaned lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/run.py (3:17) duplicated block id: 1831 size: 12 cleaned lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1832 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (3:17) - src/protein_structure/predict_structure.py (3:17) duplicated block id: 1833 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/verify_train_dataset.py (3:17) - src/deep_baselines/virtifier.py (3:17) 
duplicated block id: 1834 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1835 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (936:947) - src/SSFN/modeling_bert.py (1207:1218) duplicated block id: 1836 size: 12 cleaned lines of code in 2 files: - src/common/loss.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1837 size: 12 cleaned lines of code in 2 files: - src/baselines/lgbm.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1838 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/ncbi_id_2_uniprot.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1839 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/statistics.py (3:17) - src/utils.py (3:17) duplicated block id: 1840 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/geo_map/get_biosample_from_update.py (3:17) duplicated block id: 1841 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1490:1502) - src/SSFN/modeling_bert.py (1867:1879) duplicated block id: 1842 size: 12 cleaned lines of code in 2 files: - src/biotoolbox/contact_map_generator.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1843 size: 12 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (3:17) - src/deep_baselines/statistics.py (3:17) duplicated block id: 1844 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (3:17) - src/geo_map/parse_loc_lat_lon_from_attr.py (3:17) duplicated block id: 1845 size: 12 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1490:1502) - src/SSFN/modeling_bert.py (1775:1787) duplicated block id: 1846 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1847 size: 
12 cleaned lines of code in 2 files: - src/SSFN/gcn.py (3:17) - src/deep_baselines/virtifier.py (3:17) duplicated block id: 1848 size: 12 cleaned lines of code in 2 files: - src/SSFN/model.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1849 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (3:17) - src/deep_baselines/virseeker.py (3:17) duplicated block id: 1850 size: 12 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (3:17) - src/protein_structure/sampling_verify_embedding_is_ok.py (3:17) duplicated block id: 1851 size: 12 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (3:17) - src/deep_baselines/cheer.py (3:17) duplicated block id: 1852 size: 11 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1141:1151) - src/SSFN/modeling_bert.py (1689:1699) duplicated block id: 1853 size: 11 cleaned lines of code in 2 files: - src/baselines/dnn.py (572:585) - src/trainer.py (108:121) duplicated block id: 1854 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (416:426) - src/predict.py (871:881) duplicated block id: 1855 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (416:426) - src/predict.py (841:851) duplicated block id: 1856 size: 11 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (123:133) - src/protein_structure/embedding_from_esmfold.py (136:146) duplicated block id: 1857 size: 11 cleaned lines of code in 2 files: - src/baselines/lgbm.py (441:453) - src/run.py (803:815) duplicated block id: 1858 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (173:183) - src/deep_baselines/run.py (413:423) duplicated block id: 1859 size: 11 cleaned lines of code in 2 files: - src/app/app.py (147:157) - src/predict.py (158:168) duplicated block id: 1860 size: 11 cleaned lines of code in 2 
files: - src/baselines/dnn.py (627:639) - src/trainer.py (177:189) duplicated block id: 1861 size: 11 cleaned lines of code in 2 files: - src/predict.py (767:777) - src/predict.py (871:881) duplicated block id: 1862 size: 11 cleaned lines of code in 2 files: - src/predict.py (767:777) - src/predict.py (841:851) duplicated block id: 1863 size: 11 cleaned lines of code in 2 files: - src/app/app.py (42:54) - src/deep_baselines/predict_deep_baselines.py (53:65) duplicated block id: 1864 size: 11 cleaned lines of code in 2 files: - src/predict.py (158:168) - src/predict_many_samples.py (157:167) duplicated block id: 1865 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/run.py (584:596) - src/trainer.py (177:189) duplicated block id: 1866 size: 11 cleaned lines of code in 2 files: - src/predict.py (158:168) - src/predict_one_sample.py (159:169) duplicated block id: 1867 size: 11 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (937:947) - src/SSFN/modeling_bert.py (1348:1358) duplicated block id: 1868 size: 11 cleaned lines of code in 2 files: - src/baselines/dnn.py (525:537) - src/run.py (803:815) duplicated block id: 1869 size: 11 cleaned lines of code in 2 files: - src/app/app.py (159:169) - src/predict.py (170:180) duplicated block id: 1870 size: 11 cleaned lines of code in 2 files: - src/data_loader.py (765:775) - src/predict.py (354:364) duplicated block id: 1871 size: 11 cleaned lines of code in 2 files: - src/data_loader.py (765:775) - src/predict_one_sample.py (372:382) duplicated block id: 1872 size: 11 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (144:154) - src/protein_structure/structure_from_esm_v1.py (227:237) duplicated block id: 1873 size: 11 cleaned lines of code in 2 files: - src/baselines/lgbm.py (288:304) - src/baselines/xgb.py (274:290) duplicated block id: 1874 size: 11 cleaned lines of code in 2 files: - src/data_loader.py (765:775) - src/predict_many_samples.py (369:379) 
duplicated block id: 1875 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/run.py (482:494) - src/run.py (803:815) duplicated block id: 1876 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (490:500) - src/predict.py (767:777) duplicated block id: 1877 size: 11 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1190:1205) - src/SSFN/modeling_bert.py (1328:1343) duplicated block id: 1878 size: 11 cleaned lines of code in 2 files: - src/predict.py (170:180) - src/predict_one_sample.py (171:181) duplicated block id: 1879 size: 11 cleaned lines of code in 2 files: - src/app/app.py (353:363) - src/data_loader.py (765:775) duplicated block id: 1880 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (518:528) - src/predict.py (767:777) duplicated block id: 1881 size: 11 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1689:1699) - src/SSFN/modeling_bert.py (1869:1879) duplicated block id: 1882 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (416:426) - src/deep_baselines/predict_deep_baselines.py (518:528) duplicated block id: 1883 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (416:426) - src/deep_baselines/predict_deep_baselines.py (490:500) duplicated block id: 1884 size: 11 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1689:1699) - src/SSFN/modeling_bert.py (1777:1787) duplicated block id: 1885 size: 11 cleaned lines of code in 2 files: - src/predict.py (170:180) - src/predict_many_samples.py (169:179) duplicated block id: 1886 size: 11 cleaned lines of code in 2 files: - src/evaluater.py (48:58) - src/predictor.py (49:59) duplicated block id: 1887 size: 11 cleaned lines of code in 2 files: - src/baselines/xgb.py (406:418) - src/run.py (803:815) duplicated block id: 1888 size: 11 cleaned lines of code in 2 files: - src/deep_baselines/run.py 
(529:542) - src/trainer.py (108:121) duplicated block id: 1889 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1778:1787) duplicated block id: 1890 size: 10 cleaned lines of code in 2 files: - src/common/metrics.py (110:119) - src/common/metrics.py (257:266) duplicated block id: 1891 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1870:1879) duplicated block id: 1892 size: 10 cleaned lines of code in 2 files: - src/common/metrics.py (158:167) - src/common/metrics.py (305:314) duplicated block id: 1893 size: 10 cleaned lines of code in 2 files: - src/baselines/xgb.py (394:403) - src/deep_baselines/run.py (467:476) duplicated block id: 1894 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1579:1588) duplicated block id: 1895 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1493:1502) duplicated block id: 1896 size: 10 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (140:149) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (155:164) duplicated block id: 1897 size: 10 cleaned lines of code in 2 files: - src/baselines/dnn.py (513:522) - src/baselines/dnn.py (533:542) duplicated block id: 1898 size: 10 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1207:1216) - src/SSFN/modeling_bert.py (1655:1664) duplicated block id: 1899 size: 10 cleaned lines of code in 2 files: - src/deep_baselines/run.py (783:793) - src/evaluater.py (157:167) duplicated block id: 1900 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1690:1699) duplicated block id: 1901 size: 10 cleaned lines of code in 2 files: - src/SSFN/model.py (347:356) - src/SSFN/modeling_bert.py (1142:1151) duplicated block id: 1902 size: 10 cleaned lines of code in 2 files: - 
src/data_preprocess/data_preprocess_for_rdrp_v2.py (232:241) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (185:194) duplicated block id: 1903 size: 10 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (243:257) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (217:231) duplicated block id: 1904 size: 10 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (253:262) - src/plot/plot_map_pie_fig_aff4_2.py (253:262) duplicated block id: 1905 size: 10 cleaned lines of code in 2 files: - src/predict.py (464:473) - src/predict.py (493:502) duplicated block id: 1906 size: 10 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (195:204) - src/plot/plot_map_pie_fig_aff4_2.py (194:203) duplicated block id: 1907 size: 10 cleaned lines of code in 2 files: - src/data_preprocess/subword.py (88:97) - src/data_preprocess/subword.py (104:113) duplicated block id: 1908 size: 10 cleaned lines of code in 2 files: - src/baselines/dnn.py (510:519) - src/baselines/xgb.py (394:403) duplicated block id: 1909 size: 10 cleaned lines of code in 2 files: - src/baselines/lgbm.py (429:438) - src/deep_baselines/run.py (467:476) duplicated block id: 1910 size: 10 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (301:310) - src/SSFN/modeling_bert.py (435:444) duplicated block id: 1911 size: 10 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (301:310) - src/SSFN/modeling_bert.py (503:512) duplicated block id: 1912 size: 10 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (140:149) - src/data_preprocess/subword.py (104:113) duplicated block id: 1913 size: 10 cleaned lines of code in 2 files: - src/deep_baselines/run.py (470:479) - src/deep_baselines/run.py (490:499) duplicated block id: 1914 size: 10 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (219:228) - src/plot/plot_map_pie_fig_aff4_2.py (253:262) duplicated 
block id: 1915 size: 10 cleaned lines of code in 2 files: - src/baselines/lgbm.py (24:51) - src/baselines/xgb.py (24:50) duplicated block id: 1916 size: 10 cleaned lines of code in 2 files: - src/baselines/dnn.py (824:834) - src/evaluater.py (157:167) duplicated block id: 1917 size: 10 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (155:164) - src/data_preprocess/subword.py (88:97) duplicated block id: 1918 size: 10 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (64:75) - src/deep_baselines/cheer.py (249:260) duplicated block id: 1919 size: 10 cleaned lines of code in 2 files: - src/SSFN/pooling.py (121:131) - src/SSFN/pooling.py (165:175) duplicated block id: 1920 size: 10 cleaned lines of code in 2 files: - src/predict.py (435:444) - src/predict.py (493:502) duplicated block id: 1921 size: 10 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (435:444) - src/SSFN/modeling_bert.py (503:512) duplicated block id: 1922 size: 10 cleaned lines of code in 2 files: - src/predict.py (435:444) - src/predict.py (464:473) duplicated block id: 1923 size: 10 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (936:945) - src/SSFN/modeling_bert.py (1655:1664) duplicated block id: 1924 size: 10 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (51:62) - src/protein_structure/structure_from_esm_v1.py (51:63) duplicated block id: 1925 size: 10 cleaned lines of code in 2 files: - src/predict_many_samples.py (765:774) - src/predict_one_sample.py (749:758) duplicated block id: 1926 size: 10 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (52:61) - src/deep_baselines/cheer.py (384:393) duplicated block id: 1927 size: 10 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (52:61) - src/deep_baselines/cheer.py (238:247) duplicated block id: 1928 size: 10 cleaned lines of code in 2 files: - src/baselines/dnn.py (510:519) - src/baselines/lgbm.py (429:438) 
duplicated block id: 1929 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (442:450) - src/predict_one_sample.py (509:517) duplicated block id: 1930 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (442:450) - src/predict_one_sample.py (477:485) duplicated block id: 1931 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/app/app.py (458:466) duplicated block id: 1932 size: 9 cleaned lines of code in 2 files: - src/app/app.py (490:498) - src/predict_many_samples.py (442:450) duplicated block id: 1933 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1107:1147) - src/SSFN/modeling_bert.py (1364:1379) duplicated block id: 1934 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/app/app.py (490:498) duplicated block id: 1935 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (556:564) - src/deep_baselines/virseeker.py (221:229) duplicated block id: 1936 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (506:514) - src/predict_one_sample.py (445:453) duplicated block id: 1937 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (776:784) - src/deep_baselines/run.py (864:872) duplicated block id: 1938 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (474:482) - src/predict_one_sample.py (445:453) duplicated block id: 1939 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1364:1379) - src/SSFN/modeling_bert.py (1570:1584) duplicated block id: 1940 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1348:1356) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1941 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (807:817) - src/evaluater.py (196:206) duplicated block id: 1942 size: 9 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (48:56) - src/plot/plot_map_pie_fig_aff4_1.py (39:47) 
duplicated block id: 1943 size: 9 cleaned lines of code in 2 files: - src/predict.py (425:433) - src/predict.py (483:491) duplicated block id: 1944 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1364:1379) - src/SSFN/modeling_bert.py (1771:1783) duplicated block id: 1945 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1364:1379) - src/SSFN/modeling_bert.py (1857:1875) duplicated block id: 1946 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1348:1356) - src/SSFN/modeling_bert.py (1556:1564) duplicated block id: 1947 size: 9 cleaned lines of code in 2 files: - src/predict.py (454:462) - src/predict.py (483:491) duplicated block id: 1948 size: 9 cleaned lines of code in 2 files: - src/predict.py (425:433) - src/predict.py (454:462) duplicated block id: 1949 size: 9 cleaned lines of code in 2 files: - src/data_loader.py (524:532) - src/data_loader.py (976:984) duplicated block id: 1950 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1348:1356) - src/SSFN/modeling_bert.py (1757:1765) duplicated block id: 1951 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1348:1356) - src/SSFN/modeling_bert.py (1656:1664) duplicated block id: 1952 size: 9 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (301:310) - src/plot/plot_map_pie_fig4_2.py (316:324) duplicated block id: 1953 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (867:876) - src/predictor.py (55:64) duplicated block id: 1954 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (70:78) - src/deep_baselines/virtifier.py (74:82) duplicated block id: 1955 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (937:945) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1956 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1656:1664) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1957 size: 9 
cleaned lines of code in 2 files: - src/predict_one_sample.py (445:453) - src/predict_one_sample.py (509:517) duplicated block id: 1958 size: 9 cleaned lines of code in 2 files: - src/predict_one_sample.py (445:453) - src/predict_one_sample.py (477:485) duplicated block id: 1959 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (937:945) - src/SSFN/modeling_bert.py (1556:1564) duplicated block id: 1960 size: 9 cleaned lines of code in 2 files: - src/app/app.py (458:466) - src/predict_many_samples.py (442:450) duplicated block id: 1961 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (937:945) - src/SSFN/modeling_bert.py (1757:1765) duplicated block id: 1962 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1262) - src/SSFN/modeling_bert.py (1373:1381) duplicated block id: 1963 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (358:368) - src/run.py (593:603) duplicated block id: 1964 size: 9 cleaned lines of code in 2 files: - src/app/app.py (490:498) - src/predict_one_sample.py (445:453) duplicated block id: 1965 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (849:861) - src/baselines/lgbm.py (647:658) duplicated block id: 1966 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1101) - src/SSFN/modeling_bert.py (1758:1766) duplicated block id: 1967 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (871:880) - src/predictor.py (163:172) duplicated block id: 1968 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (24:46) - src/deep_baselines/virseeker.py (24:46) duplicated block id: 1969 size: 9 cleaned lines of code in 2 files: - src/baselines/lgbm.py (647:658) - src/deep_baselines/run.py (808:820) duplicated block id: 1970 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1622:1635) - src/SSFN/modeling_bert.py (1804:1817) duplicated block id: 1971 size: 9 cleaned lines of 
code in 2 files: - src/SSFN/modeling_bert.py (1093:1101) - src/SSFN/modeling_bert.py (1557:1565) duplicated block id: 1972 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (556:564) - src/deep_baselines/virtifier.py (243:251) duplicated block id: 1973 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1101) - src/SSFN/modeling_bert.py (1657:1665) duplicated block id: 1974 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/predict_one_sample.py (509:517) duplicated block id: 1975 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/predict_one_sample.py (477:485) duplicated block id: 1976 size: 9 cleaned lines of code in 2 files: - src/utils.py (139:147) - src/utils.py (161:169) duplicated block id: 1977 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (848:858) - src/evaluater.py (196:206) duplicated block id: 1978 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1757:1765) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1979 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (64:73) - src/deep_baselines/cheer.py (395:404) duplicated block id: 1980 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (24:48) - src/deep_baselines/virtifier.py (24:46) duplicated block id: 1981 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (556:564) - src/deep_baselines/virhunter.py (210:218) duplicated block id: 1982 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1208:1216) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1983 size: 9 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:411) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:457) duplicated block id: 1984 size: 9 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (301:310) - 
src/plot/plot_map_pie_fig_aff4_2.py (332:340) duplicated block id: 1985 size: 9 cleaned lines of code in 2 files: - src/predictor.py (67:75) - src/predictor.py (83:91) duplicated block id: 1986 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1208:1216) - src/SSFN/modeling_bert.py (1757:1765) duplicated block id: 1987 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1556:1564) - src/SSFN/modeling_bert.py (1842:1850) duplicated block id: 1988 size: 9 cleaned lines of code in 2 files: - src/evaluater.py (66:74) - src/evaluater.py (79:87) duplicated block id: 1989 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1208:1216) - src/SSFN/modeling_bert.py (1556:1564) duplicated block id: 1990 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (24:46) - src/deep_baselines/virtifier.py (24:46) duplicated block id: 1991 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (911:920) - src/predictor.py (163:172) duplicated block id: 1992 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1539:1550) - src/SSFN/modeling_bert.py (1740:1751) duplicated block id: 1993 size: 9 cleaned lines of code in 2 files: - src/data_loader.py (507:515) - src/data_loader.py (957:965) duplicated block id: 1994 size: 9 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:411) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:502) duplicated block id: 1995 size: 9 cleaned lines of code in 2 files: - src/trainer.py (70:78) - src/trainer.py (86:94) duplicated block id: 1996 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (826:835) - src/predictor.py (55:64) duplicated block id: 1997 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/predict_many_samples.py (506:514) duplicated block id: 1998 size: 9 cleaned lines of code in 2 files: - src/app/app.py (458:466) - 
src/predict_one_sample.py (445:453) duplicated block id: 1999 size: 9 cleaned lines of code in 2 files: - src/app/app.py (426:434) - src/predict_many_samples.py (474:482) duplicated block id: 2000 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/cheer.py (24:46) - src/deep_baselines/virhunter.py (24:48) duplicated block id: 2001 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (956:966) - src/predictor.py (264:274) duplicated block id: 2002 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/run.py (916:926) - src/predictor.py (264:274) duplicated block id: 2003 size: 9 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1101) - src/SSFN/modeling_bert.py (1439:1447) duplicated block id: 2004 size: 9 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (24:41) - src/data_preprocess/tf_records_generator.py (24:40) duplicated block id: 2005 size: 9 cleaned lines of code in 2 files: - src/predictor.py (186:194) - src/predictor.py (221:229) duplicated block id: 2006 size: 9 cleaned lines of code in 2 files: - src/data_loader.py (389:397) - src/data_loader.py (462:470) duplicated block id: 2007 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (442:450) - src/predict_many_samples.py (506:514) duplicated block id: 2008 size: 9 cleaned lines of code in 2 files: - src/deep_baselines/virhunter.py (24:48) - src/deep_baselines/virseeker.py (24:46) duplicated block id: 2009 size: 9 cleaned lines of code in 2 files: - src/predict_many_samples.py (442:450) - src/predict_many_samples.py (474:482) duplicated block id: 2010 size: 9 cleaned lines of code in 2 files: - src/baselines/dnn.py (459:467) - src/deep_baselines/run.py (373:381) duplicated block id: 2011 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (818:825) - src/baselines/dnn.py (905:912) duplicated block id: 2012 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/run.py (544:556) - 
src/trainer.py (127:139) duplicated block id: 2013 size: 8 cleaned lines of code in 2 files: - src/predict.py (450:459) - src/predict_one_sample.py (473:482) duplicated block id: 2014 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (488:495) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (504:511) duplicated block id: 2015 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (235:244) - src/predict.py (446:455) duplicated block id: 2016 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1439:1446) - src/SSFN/modeling_bert.py (1843:1850) duplicated block id: 2017 size: 8 cleaned lines of code in 2 files: - src/app/app.py (418:427) - src/deep_baselines/predict_deep_baselines.py (211:220) duplicated block id: 2018 size: 8 cleaned lines of code in 2 files: - src/baselines/xgb.py (608:616) - src/evaluater.py (197:206) duplicated block id: 2019 size: 8 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (53:60) - src/common/multi_label_metrics.py (116:123) duplicated block id: 2020 size: 8 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (68:75) - src/plot/plot_map_pie_fig_aff4_2.py (136:143) duplicated block id: 2021 size: 8 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (69:76) - src/plot/plot_map_pie_fig4_2.py (139:146) duplicated block id: 2022 size: 8 cleaned lines of code in 2 files: - src/app/app.py (454:463) - src/predict.py (450:459) duplicated block id: 2023 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (849:858) - src/baselines/xgb.py (608:616) duplicated block id: 2024 size: 8 cleaned lines of code in 2 files: - src/SSFN/pooling.py (231:238) - src/SSFN/pooling.py (316:323) duplicated block id: 2025 size: 8 cleaned lines of code in 2 files: - src/predict.py (479:488) - src/predict_one_sample.py (505:514) duplicated block id: 2026 size: 8 cleaned lines of code in 2 files: - 
src/predict.py (450:459) - src/predict_many_samples.py (470:479) duplicated block id: 2027 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (23:40) - src/data_preprocess/tf_records_generator.py (23:38) duplicated block id: 2028 size: 8 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (122:129) - src/protein_structure/structure_from_esm_v1.py (218:225) duplicated block id: 2029 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/run.py (732:739) - src/deep_baselines/run.py (823:830) duplicated block id: 2030 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (395:402) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (443:450) duplicated block id: 2031 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1100) - src/SSFN/modeling_bert.py (1843:1850) duplicated block id: 2032 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1349:1356) - src/SSFN/modeling_bert.py (1439:1446) duplicated block id: 2033 size: 8 cleaned lines of code in 2 files: - src/app/app.py (266:274) - src/data_loader.py (642:650) duplicated block id: 2034 size: 8 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (69:76) - src/plot/plot_map_pie_fig_aff4_2.py (136:143) duplicated block id: 2035 size: 8 cleaned lines of code in 2 files: - src/predict.py (479:488) - src/predict_many_samples.py (502:511) duplicated block id: 2036 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (864:871) - src/deep_baselines/run.py (732:739) duplicated block id: 2037 size: 8 cleaned lines of code in 2 files: - src/baselines/lgbm.py (432:439) - src/baselines/lgbm.py (449:456) duplicated block id: 2038 size: 8 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (139:146) - src/plot/plot_map_pie_fig_aff4_2.py (68:75) duplicated block id: 2039 size: 8 cleaned lines of code in 2 
files: - src/baselines/xgb.py (608:616) - src/deep_baselines/run.py (808:817) duplicated block id: 2040 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1371:1379) - src/SSFN/modeling_bert.py (1490:1498) duplicated block id: 2041 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (211:220) - src/predict_one_sample.py (437:446) duplicated block id: 2042 size: 8 cleaned lines of code in 2 files: - src/SSFN/pooling.py (141:150) - src/SSFN/pooling.py (185:195) duplicated block id: 2043 size: 8 cleaned lines of code in 2 files: - src/data_loader.py (642:650) - src/predict_one_sample.py (285:293) duplicated block id: 2044 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (587:599) - src/trainer.py (127:139) duplicated block id: 2045 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (444:451) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (488:495) duplicated block id: 2046 size: 8 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (116:123) - src/common/multi_label_metrics.py (199:206) duplicated block id: 2047 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (444:451) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (504:511) duplicated block id: 2048 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (211:220) - src/predict_many_samples.py (434:443) duplicated block id: 2049 size: 8 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (53:60) - src/common/multi_label_metrics.py (199:206) duplicated block id: 2050 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1100) - src/SSFN/modeling_bert.py (1209:1216) duplicated block id: 2051 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (92:100) - src/predict_many_samples.py (94:102) duplicated block id: 2052 size: 
8 cleaned lines of code in 2 files: - src/baselines/lgbm.py (429:436) - src/run.py (785:792) duplicated block id: 2053 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (211:220) - src/predict.py (417:426) duplicated block id: 2054 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (425:432) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (516:523) duplicated block id: 2055 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (297:304) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (342:349) duplicated block id: 2056 size: 8 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (425:432) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (471:478) duplicated block id: 2057 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (774:781) - src/baselines/dnn.py (864:871) duplicated block id: 2058 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (444:453) - src/predict.py (797:806) duplicated block id: 2059 size: 8 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (115:122) - src/common/multi_label_metrics.py (333:341) duplicated block id: 2060 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (445:455) - src/deep_baselines/run.py (361:371) duplicated block id: 2061 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1093:1100) - src/SSFN/modeling_bert.py (1349:1356) duplicated block id: 2062 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (938:945) - src/SSFN/modeling_bert.py (1093:1100) duplicated block id: 2063 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (259:268) - src/predict.py (475:484) duplicated block id: 2064 size: 8 cleaned lines of code in 2 files: - 
src/plot/plot_map_pie_fig4_2.py (179:188) - src/plot/plot_map_pie_fig_aff4_2.py (179:187) duplicated block id: 2065 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (774:781) - src/deep_baselines/run.py (823:830) duplicated block id: 2066 size: 8 cleaned lines of code in 2 files: - src/app/app.py (84:92) - src/deep_baselines/predict_deep_baselines.py (92:100) duplicated block id: 2067 size: 8 cleaned lines of code in 2 files: - src/baselines/lgbm.py (647:655) - src/evaluater.py (197:206) duplicated block id: 2068 size: 8 cleaned lines of code in 2 files: - src/baselines/xgb.py (397:404) - src/baselines/xgb.py (414:421) duplicated block id: 2069 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (938:945) - src/SSFN/modeling_bert.py (1439:1446) duplicated block id: 2070 size: 8 cleaned lines of code in 2 files: - src/common/metrics.py (386:393) - src/common/multi_label_metrics.py (458:465) duplicated block id: 2071 size: 8 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1209:1216) - src/SSFN/modeling_bert.py (1439:1446) duplicated block id: 2072 size: 8 cleaned lines of code in 2 files: - src/app/app.py (486:495) - src/predict.py (479:488) duplicated block id: 2073 size: 8 cleaned lines of code in 2 files: - src/data_loader.py (885:892) - src/data_loader.py (1433:1440) duplicated block id: 2074 size: 8 cleaned lines of code in 2 files: - src/data_loader.py (885:892) - src/data_loader.py (1408:1415) duplicated block id: 2075 size: 8 cleaned lines of code in 2 files: - src/baselines/dnn.py (66:76) - src/baselines/xgb.py (58:69) duplicated block id: 2076 size: 8 cleaned lines of code in 2 files: - src/data_loader.py (642:650) - src/predict.py (276:284) duplicated block id: 2077 size: 8 cleaned lines of code in 2 files: - src/baselines/xgb.py (394:401) - src/run.py (785:792) duplicated block id: 2078 size: 8 cleaned lines of code in 2 files: - src/data_loader.py (642:650) - src/predict_many_samples.py (282:290) 
duplicated block id: 2079 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (92:100) - src/predict.py (93:101) duplicated block id: 2080 size: 8 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (143:150) - src/deep_baselines/run.py (272:279) duplicated block id: 2081 size: 7 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (329:335) - src/plot/plot_map_pie_fig_aff4_2.py (346:352) duplicated block id: 2082 size: 7 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (99:105) - src/result_process/process_predict_result.py (105:111) duplicated block id: 2083 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (222:228) - src/predict.py (430:436) duplicated block id: 2084 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1719:1730) - src/SSFN/modeling_bert.py (1806:1817) duplicated block id: 2085 size: 7 cleaned lines of code in 2 files: - src/common/metrics.py (363:369) - src/common/multi_label_metrics.py (482:488) duplicated block id: 2086 size: 7 cleaned lines of code in 2 files: - src/baselines/predict.py (24:38) - src/deep_baselines/run.py (24:50) duplicated block id: 2087 size: 7 cleaned lines of code in 2 files: - src/predict.py (518:524) - src/predict_many_samples.py (545:552) duplicated block id: 2088 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (410:416) - src/predict_one_sample.py (447:453) duplicated block id: 2089 size: 7 cleaned lines of code in 2 files: - src/app/app.py (492:498) - src/predict_many_samples.py (410:416) duplicated block id: 2090 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (533:539) - src/baselines/xgb.py (397:403) duplicated block id: 2091 size: 7 cleaned lines of code in 2 files: - src/predict.py (518:524) - src/predict_one_sample.py (543:550) duplicated block id: 2092 size: 7 cleaned lines of code in 2 files: - 
src/predict_many_samples.py (410:416) - src/predict_one_sample.py (479:485) duplicated block id: 2093 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (219:225) - src/deep_baselines/predict_deep_baselines.py (267:273) duplicated block id: 2094 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (410:416) - src/predict_one_sample.py (511:517) duplicated block id: 2095 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (513:519) - src/baselines/xgb.py (414:420) duplicated block id: 2096 size: 7 cleaned lines of code in 2 files: - src/utils.py (110:116) - src/utils.py (153:159) duplicated block id: 2097 size: 7 cleaned lines of code in 2 files: - src/utils.py (110:116) - src/utils.py (131:137) duplicated block id: 2098 size: 7 cleaned lines of code in 2 files: - src/SSFN/model.py (24:35) - src/deep_baselines/run.py (24:50) duplicated block id: 2099 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (122:130) - src/data_loader.py (172:180) duplicated block id: 2100 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (24:49) - src/protein_structure/embedding_from_esmfold.py (24:38) duplicated block id: 2101 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/run.py (24:50) - src/protein_structure/merge_embedding_pdb_result.py (24:32) duplicated block id: 2102 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1106:1112) - src/predict.py (318:324) duplicated block id: 2103 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (449:455) - src/deep_baselines/run.py (470:476) duplicated block id: 2104 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1123:1129) - src/predict_one_sample.py (358:364) duplicated block id: 2105 size: 7 cleaned lines of code in 2 files: - src/utils.py (131:137) - src/utils.py (153:159) duplicated block id: 2106 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (476:482) - 
src/predict_one_sample.py (413:419) duplicated block id: 2107 size: 7 cleaned lines of code in 2 files: - src/evaluater.py (24:42) - src/predictor.py (24:43) duplicated block id: 2108 size: 7 cleaned lines of code in 2 files: - src/SSFN/model.py (24:35) - src/baselines/predict.py (24:38) duplicated block id: 2109 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (432:438) - src/deep_baselines/run.py (490:496) duplicated block id: 2110 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (513:519) - src/deep_baselines/run.py (490:496) duplicated block id: 2111 size: 7 cleaned lines of code in 2 files: - src/baselines/xgb.py (414:420) - src/deep_baselines/run.py (470:476) duplicated block id: 2112 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:66) - src/protein_structure/structure_from_esm_v1.py (246:252) duplicated block id: 2113 size: 7 cleaned lines of code in 2 files: - src/app/app.py (339:345) - src/data_loader.py (1123:1129) duplicated block id: 2114 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (24:49) - src/trainer.py (24:50) duplicated block id: 2115 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (390:396) - src/data_loader.py (473:479) duplicated block id: 2116 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (301:307) - src/SSFN/modeling_bert.py (581:587) duplicated block id: 2117 size: 7 cleaned lines of code in 2 files: - src/app/app.py (492:498) - src/predict_one_sample.py (413:419) duplicated block id: 2118 size: 7 cleaned lines of code in 2 files: - src/predict_one_sample.py (413:419) - src/predict_one_sample.py (479:485) duplicated block id: 2119 size: 7 cleaned lines of code in 2 files: - src/predict_one_sample.py (413:419) - src/predict_one_sample.py (511:517) duplicated block id: 2120 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/run.py (222:228) - src/deep_baselines/run.py (234:240) 
duplicated block id: 2121 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (463:469) - src/data_loader.py (473:479) duplicated block id: 2122 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (93:100) - src/predict_one_sample.py (97:104) duplicated block id: 2123 size: 7 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (155:162) - src/plot/plot_map_pie_fig_aff4_2.py (154:161) duplicated block id: 2124 size: 7 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (278:284) - src/result_process/process_predict_result.py (105:111) duplicated block id: 2125 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (561:567) - src/plot/plot_map_pie_fig4_1.py (99:105) duplicated block id: 2126 size: 7 cleaned lines of code in 2 files: - src/predict_one_sample.py (413:419) - src/predict_one_sample.py (447:453) duplicated block id: 2127 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1260) - src/SSFN/modeling_bert.py (1492:1498) duplicated block id: 2128 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1624:1635) - src/SSFN/modeling_bert.py (1719:1730) duplicated block id: 2129 size: 7 cleaned lines of code in 2 files: - src/app/app.py (389:397) - src/predict.py (395:403) duplicated block id: 2130 size: 7 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (76:82) - src/geo_map/get_biosample_from_manual.py (51:57) duplicated block id: 2131 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1520:1531) - src/SSFN/modeling_bert.py (1624:1635) duplicated block id: 2132 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (219:225) - src/deep_baselines/predict_deep_baselines.py (243:249) duplicated block id: 2133 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1520:1531) - 
src/SSFN/modeling_bert.py (1719:1730) duplicated block id: 2134 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/run.py (24:50) - src/protein_structure/embedding_from_esmfold.py (24:38) duplicated block id: 2135 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1141:1147) - src/SSFN/modeling_bert.py (1254:1260) duplicated block id: 2136 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (508:514) - src/predict_one_sample.py (413:419) duplicated block id: 2137 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (270:276) - src/predict.py (488:494) duplicated block id: 2138 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (243:249) - src/deep_baselines/predict_deep_baselines.py (267:273) duplicated block id: 2139 size: 7 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (24:38) - src/protein_structure/merge_embedding_pdb_result.py (24:32) duplicated block id: 2140 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (444:450) - src/predict_one_sample.py (413:419) duplicated block id: 2141 size: 7 cleaned lines of code in 2 files: - src/common/metrics.py (411:417) - src/common/multi_label_metrics.py (482:488) duplicated block id: 2142 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (687:693) - src/predict_many_samples.py (355:361) duplicated block id: 2143 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (687:693) - src/predict_one_sample.py (358:364) duplicated block id: 2144 size: 7 cleaned lines of code in 2 files: - src/predict_one_sample.py (24:48) - src/trainer.py (24:50) duplicated block id: 2145 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1520:1531) - src/SSFN/modeling_bert.py (1806:1817) duplicated block id: 2146 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1123:1129) - src/predict_many_samples.py 
(355:361) duplicated block id: 2147 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1260) - src/SSFN/modeling_bert.py (1689:1695) duplicated block id: 2148 size: 7 cleaned lines of code in 2 files: - src/app/app.py (428:434) - src/predict_one_sample.py (413:419) duplicated block id: 2149 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (24:39) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (24:40) duplicated block id: 2150 size: 7 cleaned lines of code in 2 files: - src/baselines/xgb.py (642:649) - src/predictor.py (265:272) duplicated block id: 2151 size: 7 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (76:82) - src/geo_map/get_biosample_from_update.py (52:58) duplicated block id: 2152 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (24:49) - src/protein_structure/merge_embedding_pdb_result.py (24:32) duplicated block id: 2153 size: 7 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (90:96) - src/geo_map/get_biosample_from_update.py (92:98) duplicated block id: 2154 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1260) - src/SSFN/modeling_bert.py (1869:1875) duplicated block id: 2155 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (533:539) - src/baselines/lgbm.py (432:438) duplicated block id: 2156 size: 7 cleaned lines of code in 2 files: - src/common/metrics.py (334:340) - src/common/multi_label_metrics.py (453:459) duplicated block id: 2157 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1260) - src/SSFN/modeling_bert.py (1777:1783) duplicated block id: 2158 size: 7 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (199:205) - src/common/multi_label_metrics.py (334:341) duplicated block id: 2159 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (680:687) - src/predictor.py (265:272) 
duplicated block id: 2160 size: 7 cleaned lines of code in 2 files: - src/app/app.py (460:466) - src/predict_many_samples.py (410:416) duplicated block id: 2161 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (251:257) - src/deep_baselines/predict_deep_baselines.py (275:281) duplicated block id: 2162 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (669:675) - src/predict_many_samples.py (337:343) duplicated block id: 2163 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (957:964) - src/baselines/lgbm.py (680:687) duplicated block id: 2164 size: 7 cleaned lines of code in 2 files: - src/app/app.py (460:466) - src/predict_one_sample.py (413:419) duplicated block id: 2165 size: 7 cleaned lines of code in 2 files: - src/predict.py (395:403) - src/predict_one_sample.py (408:416) duplicated block id: 2166 size: 7 cleaned lines of code in 2 files: - src/SSFN/model.py (24:35) - src/protein_structure/merge_embedding_pdb_result.py (24:32) duplicated block id: 2167 size: 7 cleaned lines of code in 2 files: - src/baselines/xgb.py (24:42) - src/geo_map/self_testing_nearest_station_antarctica.py (24:33) duplicated block id: 2168 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (561:567) - src/result_process/process_predict_result.py (105:111) duplicated block id: 2169 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_one_sample.py (511:517) duplicated block id: 2170 size: 7 cleaned lines of code in 2 files: - src/app/app.py (321:327) - src/data_loader.py (1106:1112) duplicated block id: 2171 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_one_sample.py (479:485) duplicated block id: 2172 size: 7 cleaned lines of code in 2 files: - src/common/metrics.py (241:247) - src/common/metrics.py (288:294) duplicated block id: 2173 size: 7 cleaned lines of code in 2 files: - 
src/deep_baselines/predict_deep_baselines.py (227:233) - src/deep_baselines/predict_deep_baselines.py (251:257) duplicated block id: 2174 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (227:233) - src/deep_baselines/predict_deep_baselines.py (275:281) duplicated block id: 2175 size: 7 cleaned lines of code in 2 files: - src/common/metrics.py (177:183) - src/common/metrics.py (288:294) duplicated block id: 2176 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/app/app.py (492:498) duplicated block id: 2177 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1198:1204) - src/data_loader.py (1300:1306) duplicated block id: 2178 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/app/app.py (428:434) duplicated block id: 2179 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (785:791) - src/data_loader.py (1323:1329) duplicated block id: 2180 size: 7 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (246:252) - src/predict.py (459:465) duplicated block id: 2181 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/app/app.py (460:466) duplicated block id: 2182 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_one_sample.py (447:453) duplicated block id: 2183 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (400:407) - src/SSFN/modeling_bert.py (478:485) duplicated block id: 2184 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (680:687) - src/deep_baselines/run.py (917:924) duplicated block id: 2185 size: 7 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (53:59) - src/common/multi_label_metrics.py (334:341) duplicated block id: 2186 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (103:111) - src/baselines/xgb.py (102:110) duplicated block id: 2187 size: 7 cleaned lines of code in 2 files: - 
src/data_loader.py (783:789) - src/trainer.py (70:76) duplicated block id: 2188 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (783:789) - src/trainer.py (86:92) duplicated block id: 2189 size: 7 cleaned lines of code in 2 files: - src/predict.py (400:406) - src/predict.py (456:462) duplicated block id: 2190 size: 7 cleaned lines of code in 2 files: - src/predict.py (400:406) - src/predict.py (427:433) duplicated block id: 2191 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (396:402) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (504:510) duplicated block id: 2192 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1254:1260) - src/SSFN/modeling_bert.py (1578:1584) duplicated block id: 2193 size: 7 cleaned lines of code in 2 files: - src/app/app.py (428:434) - src/predict_many_samples.py (410:416) duplicated block id: 2194 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (396:402) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (488:494) duplicated block id: 2195 size: 7 cleaned lines of code in 2 files: - src/SSFN/model.py (24:35) - src/baselines/dnn.py (24:49) duplicated block id: 2196 size: 7 cleaned lines of code in 2 files: - src/predict.py (400:406) - src/predict.py (485:491) duplicated block id: 2197 size: 7 cleaned lines of code in 2 files: - src/app/app.py (339:345) - src/data_loader.py (687:693) duplicated block id: 2198 size: 7 cleaned lines of code in 2 files: - src/predict.py (614:620) - src/predict_many_samples.py (647:653) duplicated block id: 2199 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (669:675) - src/predict_one_sample.py (340:346) duplicated block id: 2200 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (435:441) - src/SSFN/modeling_bert.py (581:587) duplicated block id: 2201 size: 7 cleaned lines of code in 2 files: - src/baselines/predict.py (24:38) - 
src/protein_structure/embedding_from_esmfold.py (24:38) duplicated block id: 2202 size: 7 cleaned lines of code in 2 files: - src/app/app.py (321:327) - src/data_loader.py (669:675) duplicated block id: 2203 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (503:509) - src/SSFN/modeling_bert.py (581:587) duplicated block id: 2204 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (669:675) - src/predict.py (318:324) duplicated block id: 2205 size: 7 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (136:142) - src/protein_structure/structure_from_esm_v1.py (219:225) duplicated block id: 2206 size: 7 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (99:105) - src/protein_structure/structure_from_esm_v1.py (278:284) duplicated block id: 2207 size: 7 cleaned lines of code in 2 files: - src/predict.py (614:620) - src/predict_one_sample.py (639:645) duplicated block id: 2208 size: 7 cleaned lines of code in 2 files: - src/baselines/predict.py (24:38) - src/protein_structure/merge_embedding_pdb_result.py (24:32) duplicated block id: 2209 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_many_samples.py (476:482) duplicated block id: 2210 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_many_samples.py (508:514) duplicated block id: 2211 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (513:519) - src/baselines/lgbm.py (449:455) duplicated block id: 2212 size: 7 cleaned lines of code in 2 files: - src/app/app.py (394:400) - src/predict_many_samples.py (444:450) duplicated block id: 2213 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (410:416) - src/predict_many_samples.py (444:450) duplicated block id: 2214 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (957:964) - src/baselines/xgb.py (642:649) duplicated block id: 2215 size: 7 cleaned lines of code in 2 files: - 
src/baselines/dnn.py (533:539) - src/deep_baselines/run.py (470:476) duplicated block id: 2216 size: 7 cleaned lines of code in 2 files: - src/SSFN/model.py (24:35) - src/protein_structure/embedding_from_esmfold.py (24:38) duplicated block id: 2217 size: 7 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1373:1379) - src/SSFN/modeling_bert.py (1689:1695) duplicated block id: 2218 size: 7 cleaned lines of code in 2 files: - src/baselines/xgb.py (642:649) - src/deep_baselines/run.py (917:924) duplicated block id: 2219 size: 7 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (561:567) - src/protein_structure/structure_from_esm_v1.py (278:284) duplicated block id: 2220 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (432:438) - src/baselines/xgb.py (414:420) duplicated block id: 2221 size: 7 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (157:163) - src/protein_structure/structure_from_esm_v1.py (232:238) duplicated block id: 2222 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (24:43) - src/geo_map/self_testing_nearest_station_antarctica.py (24:33) duplicated block id: 2223 size: 7 cleaned lines of code in 2 files: - src/baselines/lgbm.py (449:455) - src/baselines/xgb.py (397:403) duplicated block id: 2224 size: 7 cleaned lines of code in 2 files: - src/predict.py (395:403) - src/predict_many_samples.py (405:413) duplicated block id: 2225 size: 7 cleaned lines of code in 2 files: - src/baselines/dnn.py (24:49) - src/baselines/predict.py (24:38) duplicated block id: 2226 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1106:1112) - src/predict_one_sample.py (340:346) duplicated block id: 2227 size: 7 cleaned lines of code in 2 files: - src/baselines/xgb.py (397:403) - src/deep_baselines/run.py (490:496) duplicated block id: 2228 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (410:416) - 
src/predict_many_samples.py (508:514) duplicated block id: 2229 size: 7 cleaned lines of code in 2 files: - src/predict_many_samples.py (410:416) - src/predict_many_samples.py (476:482) duplicated block id: 2230 size: 7 cleaned lines of code in 2 files: - src/data_loader.py (1106:1112) - src/predict_many_samples.py (337:343) duplicated block id: 2231 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2232 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2233 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2234 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2235 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2236 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2237 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2238 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2239 size: 6 cleaned lines 
of code in 2 files: - src/predict.py (549:554) - src/run.py (103:108) duplicated block id: 2240 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2241 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2242 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (130:135) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2243 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (49:54) - src/plot/plot_map_pie_fig_aff4_1.py (40:45) duplicated block id: 2244 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (383:388) - src/run.py (742:748) duplicated block id: 2245 size: 6 cleaned lines of code in 2 files: - src/common/metrics.py (102:107) - src/common/metrics.py (151:156) duplicated block id: 2246 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2247 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2248 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2249 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2250 size: 6 cleaned lines of code in 2 files: - src/predict.py (334:339) - src/predict_many_samples.py (334:339) 
duplicated block id: 2251 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2252 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2253 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (260:267) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (244:250) duplicated block id: 2254 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (130:135) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2255 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (591:596) - src/baselines/xgb.py (621:626) duplicated block id: 2256 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (558:565) - src/deep_baselines/run.py (552:559) duplicated block id: 2257 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2258 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2259 size: 6 cleaned lines of code in 2 files: - src/app/app.py (336:341) - src/predict_many_samples.py (334:339) duplicated block id: 2260 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (204:209) - src/predict_many_samples.py (424:429) duplicated block id: 2261 size: 6 cleaned lines of code in 2 files: - src/app/app.py (404:409) - src/app/app.py (439:444) duplicated block id: 2262 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py 
(493:498) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2263 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2264 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2265 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2266 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (337:343) - src/deep_baselines/run.py (352:358) duplicated block id: 2267 size: 6 cleaned lines of code in 2 files: - src/predictor.py (83:88) - src/trainer.py (86:91) duplicated block id: 2268 size: 6 cleaned lines of code in 2 files: - src/predictor.py (83:88) - src/trainer.py (70:75) duplicated block id: 2269 size: 6 cleaned lines of code in 2 files: - src/SSFN/model.py (347:352) - src/SSFN/modeling_bert.py (1374:1379) duplicated block id: 2270 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2271 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2272 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2273 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2274 
size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2275 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (214:219) - src/data_loader.py (291:296) duplicated block id: 2276 size: 6 cleaned lines of code in 2 files: - src/predict.py (315:320) - src/predict.py (334:339) duplicated block id: 2277 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2278 size: 6 cleaned lines of code in 2 files: - src/app/app.py (404:409) - src/predict_many_samples.py (455:460) duplicated block id: 2279 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2280 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2281 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (214:219) - src/data_loader.py (259:264) duplicated block id: 2282 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2283 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2284 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2285 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - 
src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2286 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2287 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2288 size: 6 cleaned lines of code in 2 files: - src/predict.py (564:569) - src/run.py (119:124) duplicated block id: 2289 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2290 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2291 size: 6 cleaned lines of code in 2 files: - src/app/app.py (404:409) - src/predict_one_sample.py (458:463) duplicated block id: 2292 size: 6 cleaned lines of code in 2 files: - src/protein_structure/predict_structure.py (97:102) - src/protein_structure/predict_structure.py (176:181) duplicated block id: 2293 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2294 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (106:111) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2295 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2296 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - 
src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2297 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (558:565) - src/baselines/dnn.py (595:602) duplicated block id: 2298 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2299 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (106:111) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2300 size: 6 cleaned lines of code in 2 files: - src/app/app.py (318:323) - src/predict_one_sample.py (355:360) duplicated block id: 2301 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2302 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2303 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (148:153) - src/plot/plot_map_pie_fig4_2.py (198:203) duplicated block id: 2304 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2305 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (168:173) - src/plot/plot_map_pie_fig_aff4_2.py (197:202) duplicated block id: 2306 size: 6 cleaned lines of code in 2 files: - src/predict.py (425:430) - src/predict_one_sample.py (477:482) duplicated block id: 2307 size: 6 cleaned lines of code in 2 files: - src/predict.py (425:430) - src/predict_one_sample.py (509:514) duplicated block id: 2308 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (334:339) - src/predict_many_samples.py (352:357) duplicated block id: 2309 size: 6 cleaned 
lines of code in 2 files: - src/predictor.py (189:194) - src/predictor.py (208:213) duplicated block id: 2310 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2311 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2312 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2313 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2314 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (24:36) - src/data_preprocess/tf_records_generator.py (24:34) duplicated block id: 2315 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2316 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2317 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2318 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2319 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - 
src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2320 size: 6 cleaned lines of code in 2 files: - src/common/multi_label_metrics.py (34:39) - src/common/multi_label_metrics.py (98:103) duplicated block id: 2321 size: 6 cleaned lines of code in 2 files: - src/SSFN/model.py (347:352) - src/SSFN/modeling_bert.py (1255:1260) duplicated block id: 2322 size: 6 cleaned lines of code in 2 files: - src/predict.py (315:320) - src/predict_one_sample.py (355:360) duplicated block id: 2323 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (38:51) - src/deep_baselines/run.py (40:64) duplicated block id: 2324 size: 6 cleaned lines of code in 2 files: - src/app/app.py (426:431) - src/predict.py (454:459) duplicated block id: 2325 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (61:69) - src/baselines/xgb.py (61:69) duplicated block id: 2326 size: 6 cleaned lines of code in 2 files: - src/predict.py (454:459) - src/predict_one_sample.py (509:514) duplicated block id: 2327 size: 6 cleaned lines of code in 2 files: - src/predict.py (483:488) - src/predict_many_samples.py (442:447) duplicated block id: 2328 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2329 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2330 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2331 size: 6 cleaned lines of code in 2 files: - src/predict.py (483:488) - src/predict_many_samples.py (474:479) duplicated block id: 2332 size: 6 cleaned lines of code in 2 files: - src/predict.py (464:469) - src/predict_many_samples.py (484:489) duplicated block id: 2333 
size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2334 size: 6 cleaned lines of code in 2 files: - src/predict.py (464:469) - src/predict_many_samples.py (516:521) duplicated block id: 2335 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2336 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (830:835) - src/deep_baselines/run.py (877:882) duplicated block id: 2337 size: 6 cleaned lines of code in 2 files: - src/app/app.py (468:473) - src/predict.py (435:440) duplicated block id: 2338 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2339 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2340 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2341 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_1.py (40:45) - src/plot/plot_map_pie_fig_aff4_2.py (48:53) duplicated block id: 2342 size: 6 cleaned lines of code in 2 files: - src/predict.py (454:459) - src/predict_one_sample.py (445:450) duplicated block id: 2343 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2344 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (735:741) - src/trainer.py (352:358) duplicated block id: 2345 size: 6 cleaned lines of code in 2 files: - src/app/app.py (468:473) - src/predict.py (464:469) duplicated block id: 2346 size: 6 
cleaned lines of code in 2 files: - src/app/app.py (426:431) - src/predict.py (483:488) duplicated block id: 2347 size: 6 cleaned lines of code in 2 files: - src/app/app.py (468:473) - src/predict.py (493:498) duplicated block id: 2348 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2349 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2350 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) duplicated block id: 2351 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (86:91) - src/geo_map/get_biosample_from_update.py (42:47) duplicated block id: 2352 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) duplicated block id: 2353 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2354 size: 6 cleaned lines of code in 2 files: - src/app/app.py (296:301) - src/predict_many_samples.py (301:306) duplicated block id: 2355 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2356 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2357 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py 
(383:388) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2358 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2359 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2360 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (24:41) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (24:36) duplicated block id: 2361 size: 6 cleaned lines of code in 2 files: - src/common/metrics.py (261:266) - src/common/metrics.py (309:314) duplicated block id: 2362 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2363 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2364 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/run.py (604:609) - src/deep_baselines/run.py (867:872) duplicated block id: 2365 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2366 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2367 size: 6 cleaned lines of code in 2 files: - src/predict.py (483:488) - src/predict_one_sample.py (477:482) duplicated block id: 2368 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - 
src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2369 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/run.py (604:609) - src/deep_baselines/run.py (779:784) duplicated block id: 2370 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2371 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2372 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2373 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2374 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2375 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (445:452) - src/run.py (596:603) duplicated block id: 2376 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2377 size: 6 cleaned lines of code in 2 files: - src/predictor.py (67:72) - src/trainer.py (70:75) duplicated block id: 2378 size: 6 cleaned lines of code in 2 files: - src/predictor.py (67:72) - src/trainer.py (86:91) duplicated block id: 2379 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2380 size: 6 cleaned lines of 
code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2381 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2382 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (321:328) - src/deep_baselines/run.py (361:368) duplicated block id: 2383 size: 6 cleaned lines of code in 2 files: - src/predict_one_sample.py (590:595) - src/run.py (96:101) duplicated block id: 2384 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2385 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (106:111) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2386 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2387 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2388 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2389 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2390 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2391 size: 6 cleaned lines of code in 2 files: - 
src/predict_one_sample.py (304:309) - src/predict_one_sample.py (315:320) duplicated block id: 2392 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2393 size: 6 cleaned lines of code in 2 files: - src/app/app.py (266:272) - src/data_loader.py (1081:1086) duplicated block id: 2394 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2395 size: 6 cleaned lines of code in 2 files: - src/trainer.py (286:291) - src/trainer.py (310:315) duplicated block id: 2396 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2397 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2398 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (590:597) - src/run.py (96:101) duplicated block id: 2399 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2400 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2401 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2402 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - 
src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) duplicated block id: 2403 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2404 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (666:671) - src/data_loader.py (684:689) duplicated block id: 2405 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2406 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2407 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2408 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2409 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2410 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2411 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (301:306) - src/predict_many_samples.py (312:317) duplicated block id: 2412 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (1081:1086) - src/predict_many_samples.py (282:288) duplicated block id: 2413 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/geo_map/standardization_lat_lon_info.py (183:188) 
duplicated block id: 2414 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (141:146) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2415 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2416 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2417 size: 6 cleaned lines of code in 2 files: - src/app/app.py (296:301) - src/predict_one_sample.py (304:309) duplicated block id: 2418 size: 6 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1265:1271) - src/SSFN/modeling_bert.py (1382:1388) duplicated block id: 2419 size: 6 cleaned lines of code in 2 files: - src/app/app.py (336:341) - src/predict_one_sample.py (337:342) duplicated block id: 2420 size: 6 cleaned lines of code in 2 files: - src/app/app.py (336:341) - src/predict.py (315:320) duplicated block id: 2421 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2422 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2423 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2424 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2425 size: 6 cleaned lines of code in 2 files: - src/app/app.py (318:323) - src/predict.py (334:339) duplicated block id: 2426 size: 6 cleaned lines of 
code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2427 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2428 size: 6 cleaned lines of code in 2 files: - src/predict.py (483:488) - src/predict_one_sample.py (445:450) duplicated block id: 2429 size: 6 cleaned lines of code in 2 files: - src/app/app.py (318:323) - src/predict_many_samples.py (352:357) duplicated block id: 2430 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2431 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2432 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2433 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) duplicated block id: 2434 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (49:54) - src/plot/plot_map_pie_fig4_2.py (49:54) duplicated block id: 2435 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) duplicated block id: 2436 size: 6 cleaned lines of code in 2 files: - src/predict_one_sample.py (612:617) - src/run.py (119:124) duplicated block id: 2437 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - 
src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2438 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (917:922) - src/deep_baselines/run.py (789:794) duplicated block id: 2439 size: 6 cleaned lines of code in 2 files: - src/predict.py (454:459) - src/predict_many_samples.py (442:447) duplicated block id: 2440 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2441 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2442 size: 6 cleaned lines of code in 2 files: - src/predict.py (32:48) - src/predictor.py (31:43) duplicated block id: 2443 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (301:306) - src/predict_one_sample.py (315:320) duplicated block id: 2444 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2445 size: 6 cleaned lines of code in 2 files: - src/app/app.py (500:505) - src/predict.py (435:440) duplicated block id: 2446 size: 6 cleaned lines of code in 2 files: - src/app/app.py (500:505) - src/predict.py (464:469) duplicated block id: 2447 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2448 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2449 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2450 size: 6 cleaned lines of code in 2 files: - 
src/app/app.py (285:290) - src/app/app.py (296:301) duplicated block id: 2451 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2452 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2453 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2454 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2455 size: 6 cleaned lines of code in 2 files: - src/predict_one_sample.py (597:602) - src/run.py (103:108) duplicated block id: 2456 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2457 size: 6 cleaned lines of code in 2 files: - src/app/app.py (285:290) - src/predict_many_samples.py (312:317) duplicated block id: 2458 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2459 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2460 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2461 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (830:835) - src/baselines/dnn.py (917:922) 
duplicated block id: 2462 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2463 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2464 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2465 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2466 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2467 size: 6 cleaned lines of code in 2 files: - src/app/app.py (500:505) - src/predict.py (493:498) duplicated block id: 2468 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2469 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) duplicated block id: 2470 size: 6 cleaned lines of code in 2 files: - src/app/app.py (439:444) - src/predict_one_sample.py (423:428) duplicated block id: 2471 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2472 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2473 
size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2474 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2475 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2476 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2477 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) duplicated block id: 2478 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2479 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2480 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2481 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (684:689) - src/data_loader.py (1103:1108) duplicated block id: 2482 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2483 size: 6 cleaned lines of code in 2 files: - src/app/app.py (285:290) - src/predict_one_sample.py (315:320) 
duplicated block id: 2484 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2485 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (312:317) - src/predict_one_sample.py (304:309) duplicated block id: 2486 size: 6 cleaned lines of code in 2 files: - src/predict.py (454:459) - src/predict_many_samples.py (506:511) duplicated block id: 2487 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2488 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2489 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2490 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2491 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (455:460) - src/predict_one_sample.py (423:428) duplicated block id: 2492 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2493 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2494 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (599:604) - src/run.py (103:108) duplicated block id: 2495 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - 
src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2496 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2497 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/run.py (693:699) - src/trainer.py (352:358) duplicated block id: 2498 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2499 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2500 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2501 size: 6 cleaned lines of code in 2 files: - src/app/app.py (490:495) - src/predict.py (425:430) duplicated block id: 2502 size: 6 cleaned lines of code in 2 files: - src/app/app.py (490:495) - src/predict.py (454:459) duplicated block id: 2503 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2504 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2505 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2506 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) duplicated block id: 2507 size: 6 cleaned lines of code in 2 
files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) duplicated block id: 2508 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (1103:1108) - src/data_loader.py (1120:1125) duplicated block id: 2509 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) duplicated block id: 2510 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2511 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (314:320) - src/deep_baselines/run.py (352:358) duplicated block id: 2512 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/run.py (515:522) - src/deep_baselines/run.py (552:559) duplicated block id: 2513 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2514 size: 6 cleaned lines of code in 2 files: - src/app/app.py (458:463) - src/predict.py (425:430) duplicated block id: 2515 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2516 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2517 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2518 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) 
duplicated block id: 2519 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2520 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2521 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (168:173) - src/plot/plot_map_pie_fig4_2.py (198:203) duplicated block id: 2522 size: 6 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (246:251) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2523 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2524 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2525 size: 6 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1046:1051) - src/SSFN/modeling_bert.py (1263:1268) duplicated block id: 2526 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2527 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (24:41) - src/data_preprocess/tf_records_generator.py (24:34) duplicated block id: 2528 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2529 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - 
src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2530 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2531 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2532 size: 6 cleaned lines of code in 2 files: - src/protein_structure/structure_from_esm_v1.py (246:251) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2533 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2534 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2535 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2536 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2537 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (287:298) - src/plot/plot_map_pie_fig_aff4_2.py (321:329) duplicated block id: 2538 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2539 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2540 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - 
src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2541 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (614:619) - src/run.py (119:124) duplicated block id: 2542 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2543 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2544 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2545 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2546 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2547 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2548 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2549 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2550 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (438:444) - src/deep_baselines/run.py (352:358) duplicated block id: 2551 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py 
(307:312) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2552 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2553 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2554 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (287:292) - src/plot/plot_map_pie_fig_aff4_2.py (298:303) duplicated block id: 2555 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (106:111) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2556 size: 6 cleaned lines of code in 2 files: - src/predict.py (425:430) - src/predict_many_samples.py (506:511) duplicated block id: 2557 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (106:111) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2558 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2559 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2560 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (269:275) - src/plot/plot_map_pie_fig_aff4_2.py (279:285) duplicated block id: 2561 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2562 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/merge_embedding_pdb_result.py 
(141:146) duplicated block id: 2563 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (49:54) - src/plot/plot_map_pie_fig_aff4_2.py (48:53) duplicated block id: 2564 size: 6 cleaned lines of code in 2 files: - src/predict.py (425:430) - src/predict_many_samples.py (474:479) duplicated block id: 2565 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (37:50) - src/deep_baselines/run.py (40:64) duplicated block id: 2566 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2567 size: 6 cleaned lines of code in 2 files: - src/predict.py (435:440) - src/predict_one_sample.py (519:524) duplicated block id: 2568 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2569 size: 6 cleaned lines of code in 2 files: - src/predict.py (435:440) - src/predict_one_sample.py (487:492) duplicated block id: 2570 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2571 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (321:326) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2572 size: 6 cleaned lines of code in 2 files: - src/predict.py (334:339) - src/predict_one_sample.py (337:342) duplicated block id: 2573 size: 6 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1083:1091) - src/SSFN/modeling_bert.py (1194:1202) duplicated block id: 2574 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2575 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) 
- src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2576 size: 6 cleaned lines of code in 2 files: - src/common/metrics.py (162:167) - src/common/metrics.py (261:266) duplicated block id: 2577 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2578 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (23:42) - src/plot/plot_map_pie_fig4_2.py (23:41) duplicated block id: 2579 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2580 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2581 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2582 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (141:146) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2583 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2584 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2585 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2586 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (141:146) - 
src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2587 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (1081:1086) - src/predict_one_sample.py (285:291) duplicated block id: 2588 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2589 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2590 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2591 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (595:602) - src/deep_baselines/run.py (515:522) duplicated block id: 2592 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2593 size: 6 cleaned lines of code in 2 files: - src/common/metrics.py (114:119) - src/common/metrics.py (309:314) duplicated block id: 2594 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) duplicated block id: 2595 size: 6 cleaned lines of code in 2 files: - src/predict.py (315:320) - src/predict_many_samples.py (352:357) duplicated block id: 2596 size: 6 cleaned lines of code in 2 files: - src/app/app.py (458:463) - src/predict.py (483:488) duplicated block id: 2597 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) duplicated block id: 2598 size: 6 cleaned lines of code in 2 files: - 
src/predict_many_samples.py (420:425) - src/predict_many_samples.py (455:460) duplicated block id: 2599 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2600 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (344:351) - src/deep_baselines/run.py (361:368) duplicated block id: 2601 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2602 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2603 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (352:357) - src/predict_one_sample.py (337:342) duplicated block id: 2604 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2605 size: 6 cleaned lines of code in 2 files: - src/app/app.py (439:444) - src/predict_many_samples.py (420:425) duplicated block id: 2606 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2607 size: 6 cleaned lines of code in 2 files: - src/app/app.py (567:572) - src/predict_one_sample.py (675:680) duplicated block id: 2608 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2609 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 
2610 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2611 size: 6 cleaned lines of code in 2 files: - src/common/metrics.py (114:119) - src/common/metrics.py (162:167) duplicated block id: 2612 size: 6 cleaned lines of code in 2 files: - src/predict.py (464:469) - src/predict_one_sample.py (519:524) duplicated block id: 2613 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2614 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2615 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2616 size: 6 cleaned lines of code in 2 files: - src/protein_structure/embedding_from_esmfold.py (149:154) - src/protein_structure/embedding_from_esmfold.py (157:162) duplicated block id: 2617 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (666:671) - src/data_loader.py (1120:1125) duplicated block id: 2618 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (88:93) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (69:74) duplicated block id: 2619 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2620 size: 6 cleaned lines of code in 2 files: - src/predict.py (464:469) - src/predict_one_sample.py (487:492) duplicated block id: 2621 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - 
src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2622 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2623 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2624 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2625 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2626 size: 6 cleaned lines of code in 2 files: - src/predict.py (493:498) - src/predict_one_sample.py (487:492) duplicated block id: 2627 size: 6 cleaned lines of code in 2 files: - src/predict.py (493:498) - src/predict_one_sample.py (519:524) duplicated block id: 2628 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2629 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2630 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (24:41) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (24:35) duplicated block id: 2631 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2632 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/structure_from_esm_v1.py (246:251) 
duplicated block id: 2633 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2634 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2635 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2636 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/run.py (789:794) - src/deep_baselines/run.py (877:882) duplicated block id: 2637 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2638 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (1081:1086) - src/predict.py (276:282) duplicated block id: 2639 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2640 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2641 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2642 size: 6 cleaned lines of code in 2 files: - src/predict.py (435:440) - src/predict_many_samples.py (484:489) duplicated block id: 2643 size: 6 cleaned lines of code in 2 files: - src/predict.py (435:440) - src/predict_many_samples.py (516:521) duplicated block id: 2644 size: 6 cleaned lines of code in 2 files: - src/predict_one_sample.py (337:342) - src/predict_one_sample.py (355:360) duplicated block id: 2645 size: 
6 cleaned lines of code in 2 files: - src/predict_one_sample.py (423:428) - src/predict_one_sample.py (458:463) duplicated block id: 2646 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2647 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2648 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2649 size: 6 cleaned lines of code in 2 files: - src/evaluater.py (32:42) - src/predict.py (32:48) duplicated block id: 2650 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (321:328) - src/run.py (596:603) duplicated block id: 2651 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) duplicated block id: 2652 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2653 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) duplicated block id: 2654 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2655 size: 6 cleaned lines of code in 2 files: - src/trainer.py (54:59) - src/trainer.py (286:291) duplicated block id: 2656 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - 
src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2657 size: 6 cleaned lines of code in 2 files: - src/trainer.py (54:59) - src/trainer.py (310:315) duplicated block id: 2658 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2659 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2660 size: 6 cleaned lines of code in 2 files: - src/predict.py (542:547) - src/run.py (96:101) duplicated block id: 2661 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/virseeker.py (193:198) - src/deep_baselines/virtifier.py (202:207) duplicated block id: 2662 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2663 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2664 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2665 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2666 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (148:153) - src/plot/plot_map_pie_fig_aff4_2.py (197:202) duplicated block id: 2667 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (207:212) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2668 size: 6 cleaned lines of code 
in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) duplicated block id: 2669 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2670 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) duplicated block id: 2671 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) duplicated block id: 2672 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) duplicated block id: 2673 size: 6 cleaned lines of code in 2 files: - src/baselines/xgb.py (349:354) - src/run.py (742:748) duplicated block id: 2674 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2675 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (283:288) - src/data_loader.py (315:320) duplicated block id: 2676 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (100:105) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2677 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) duplicated block id: 2678 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) duplicated block id: 2679 size: 6 cleaned lines of code in 
2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (307:312) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2680 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2681 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2682 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2683 size: 6 cleaned lines of code in 2 files: - src/SSFN/modeling_bert.py (1083:1091) - src/SSFN/modeling_bert.py (1332:1340) duplicated block id: 2684 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (198:203) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2685 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (130:135) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2686 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (130:135) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2687 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2688 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_2.py (233:238) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2689 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2690 size: 6 cleaned lines of code 
in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (24:36) - src/data_preprocess/data_preprocess_into_tfrecords_for_rdrp.py (24:36) duplicated block id: 2691 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/merge_embedding_pdb_result.py (74:79) duplicated block id: 2692 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (24:41) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (24:36) duplicated block id: 2693 size: 6 cleaned lines of code in 2 files: - src/app/app.py (567:572) - src/predict_many_samples.py (691:696) duplicated block id: 2694 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (245:250) - src/data_loader.py (283:288) duplicated block id: 2695 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/merge_embedding_pdb_result.py (106:111) duplicated block id: 2696 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (783:788) - src/predictor.py (67:72) duplicated block id: 2697 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2698 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (245:250) - src/data_loader.py (315:320) duplicated block id: 2699 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2700 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2701 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated 
block id: 2702 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2703 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (24:35) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (24:36) duplicated block id: 2704 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (783:788) - src/predictor.py (83:88) duplicated block id: 2705 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2706 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/structure_from_esm_v1.py (246:251) duplicated block id: 2707 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2708 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (562:567) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2709 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/merge_embedding_pdb_result.py (130:135) duplicated block id: 2710 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v2.py (88:93) - src/data_preprocess/data_preprocess_for_rdrp_v2.py (106:111) duplicated block id: 2711 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2712 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig_aff4_2.py (231:236) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2713 size: 
6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/protein_structure/structure_from_esm_v1.py (279:284) duplicated block id: 2714 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2715 size: 6 cleaned lines of code in 2 files: - src/data_loader.py (259:264) - src/data_loader.py (291:296) duplicated block id: 2716 size: 6 cleaned lines of code in 2 files: - src/baselines/lgbm.py (344:351) - src/run.py (596:603) duplicated block id: 2717 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/protein_structure/merge_embedding_pdb_result.py (141:146) duplicated block id: 2718 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/plot/plot_map_pie_fig4_1.py (198:203) duplicated block id: 2719 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) - src/protein_structure/embedding_from_esmfold.py (321:326) duplicated block id: 2720 size: 6 cleaned lines of code in 2 files: - src/deep_baselines/predict_deep_baselines.py (204:209) - src/predict_one_sample.py (427:432) duplicated block id: 2721 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (43:59) - src/baselines/lgbm.py (38:51) duplicated block id: 2722 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/plot/plot_map_pie_fig4_2.py (233:238) duplicated block id: 2723 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2724 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - 
src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2725 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (60:65) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2726 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (383:388) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2727 size: 6 cleaned lines of code in 2 files: - src/predict.py (493:498) - src/predict_many_samples.py (516:521) duplicated block id: 2728 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (334:339) - src/predict_one_sample.py (355:360) duplicated block id: 2729 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (316:321) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2730 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (448:453) - src/geo_map/standardization_lat_lon_info.py (207:212) duplicated block id: 2731 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2732 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/geo_map/standardization_lat_lon_info.py (183:188) duplicated block id: 2733 size: 6 cleaned lines of code in 2 files: - src/app/app.py (408:413) - src/deep_baselines/predict_deep_baselines.py (204:209) duplicated block id: 2734 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_entrez.py (86:91) - src/geo_map/get_biosample_from_manual.py (41:46) duplicated block id: 2735 size: 6 cleaned lines of code in 2 files: - src/plot/plot_map_pie_fig4_1.py (148:153) - src/plot/plot_map_pie_fig4_1.py (168:173) duplicated block id: 2736 size: 6 
cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (73:78) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2737 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (420:425) - src/predict_one_sample.py (458:463) duplicated block id: 2738 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/geo_map/get_biosample_from_manual.py (66:71) duplicated block id: 2739 size: 6 cleaned lines of code in 2 files: - src/geo_map/get_biosample_from_manual.py (66:71) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2740 size: 6 cleaned lines of code in 2 files: - src/protein_structure/merge_embedding_pdb_result.py (74:79) - src/result_process/process_predict_result.py (106:111) duplicated block id: 2741 size: 6 cleaned lines of code in 2 files: - src/predict_many_samples.py (565:570) - src/predict_one_sample.py (563:568) duplicated block id: 2742 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/plot/plot_map_pie_fig4_1.py (100:105) duplicated block id: 2743 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v1.py (288:293) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2744 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (352:357) - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (493:498) duplicated block id: 2745 size: 6 cleaned lines of code in 2 files: - src/app/app.py (318:323) - src/app/app.py (336:341) duplicated block id: 2746 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (360:365) - src/plot/plot_map_pie_fig_aff4_2.py (231:236) duplicated block id: 2747 size: 6 cleaned lines of code in 2 files: - src/data_preprocess/data_preprocess_for_rdrp_v3_extend.py (402:407) - 
src/result_process/process_predict_result.py (106:111) duplicated block id: 2748 size: 6 cleaned lines of code in 2 files: - src/predict.py (493:498) - src/predict_many_samples.py (484:489) duplicated block id: 2749 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (43:59) - src/baselines/xgb.py (37:50) duplicated block id: 2750 size: 6 cleaned lines of code in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188) - src/geo_map/standardization_lat_lon_info.py (207:212)
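The entries above are easier to act on once aggregated, since the same short spans (for example the 6-line blocks in `src/geo_map/standardization_lat_lon_info.py`) recur across many block ids. The following is a minimal sketch, assuming every entry follows the `duplicated block id: N size: S cleaned lines of code in K files: - path (start:end) ...` layout seen in this report. The names `DuplicateBlock`, `parse_report`, and `count_by_file` are hypothetical helpers introduced for illustration and are not part of the tool that produced the report.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

# One report entry, e.g.
# "duplicated block id: 2750 size: 6 cleaned lines of code in 2 files: - a.py (1:6) - b.py (3:8)"
ENTRY_RE = re.compile(
    r"duplicated block id:\s*(\d+)\s+size:\s*(\d+)\s+cleaned lines of code in\s+"
    r"(\d+)\s+files:\s*((?:-\s+\S+\s+\(\d+:\d+\)\s*)+)"
)
# One location inside an entry: "- path (start:end)"
LOCATION_RE = re.compile(r"-\s+(\S+)\s+\((\d+):(\d+)\)")


@dataclass
class DuplicateBlock:  # hypothetical container, not part of the reporting tool
    block_id: int
    size: int
    locations: List[Tuple[str, int, int]]  # (path, start_line, end_line)


def parse_report(text: str) -> List[DuplicateBlock]:
    """Parse report text into DuplicateBlock records (assumed entry layout)."""
    # Entries may be wrapped across physical lines, so collapse whitespace first.
    flat = " ".join(text.split())
    blocks = []
    for match in ENTRY_RE.finditer(flat):
        locations = [
            (path, int(start), int(end))
            for path, start, end in LOCATION_RE.findall(match.group(4))
        ]
        blocks.append(DuplicateBlock(int(match.group(1)), int(match.group(2)), locations))
    return blocks


def count_by_file(blocks: List[DuplicateBlock]) -> Counter:
    """Count how many duplicated spans each file contributes."""
    return Counter(path for block in blocks for path, _, _ in block.locations)


# Three entries copied from the report above, wrapped to show line breaks are tolerated.
SAMPLE = """duplicated block id: 2748 size: 6 cleaned lines of code in 2 files:
- src/predict.py (493:498) - src/predict_many_samples.py (484:489) duplicated block id:
2749 size: 6 cleaned lines of code in 2 files: - src/baselines/dnn.py (43:59)
- src/baselines/xgb.py (37:50) duplicated block id: 2750 size: 6 cleaned lines of code
in 2 files: - src/geo_map/standardization_lat_lon_info.py (183:188)
- src/geo_map/standardization_lat_lon_info.py (207:212)"""

if __name__ == "__main__":
    parsed = parse_report(SAMPLE)
    for path, count in count_by_file(parsed).most_common():
        print(f"{count:3d}  {path}")
```

Sorting the resulting counts by file is one way to decide which spans to extract into a shared helper first. The sketch does not attempt to merge overlapping line ranges.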