duplicated block id: 1 size: 137 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1774:1920)
- src/datasets/dataset_dict.py (2590:2736)

duplicated block id: 2 size: 56 cleaned lines of code in 2 files:
- src/datasets/builder.py (1440:1506)
- src/datasets/builder.py (1695:1761)

duplicated block id: 3 size: 42 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1619:1756)
- src/datasets/dataset_dict.py (2445:2572)

duplicated block id: 4 size: 38 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (240:290)
- src/datasets/features/pdf.py (207:257)

duplicated block id: 5 size: 38 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1882:1920)
- src/datasets/iterable_dataset.py (4225:4263)

duplicated block id: 6 size: 38 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2698:2736)
- src/datasets/iterable_dataset.py (4225:4263)

duplicated block id: 7 size: 36 cleaned lines of code in 2 files:
- src/datasets/features/image.py (196:253)
- src/datasets/features/video.py (224:281)

duplicated block id: 8 size: 28 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (250:290)
- src/datasets/features/image.py (253:293)

duplicated block id: 9 size: 28 cleaned lines of code in 2 files:
- src/datasets/features/image.py (253:293)
- src/datasets/features/pdf.py (217:257)

duplicated block id: 10 size: 25 cleaned lines of code in 2 files:
- src/datasets/builder.py (1409:1437)
- src/datasets/builder.py (1666:1694)

duplicated block id: 11 size: 25 cleaned lines of code in 2 files:
- src/datasets/features/pdf.py (173:216)
- src/datasets/features/video.py (228:271)

duplicated block id: 12 size: 25 cleaned lines of code in 2 files:
- src/datasets/download/streaming_download_manager.py (15:44)
- src/datasets/streaming.py (7:35)

duplicated block id: 13 size: 25 cleaned lines of code in 2 files:
- src/datasets/features/image.py (200:243)
- src/datasets/features/pdf.py (173:216)

duplicated block id: 14 size: 24 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (39:66)
- src/datasets/io/text.py (33:60)

duplicated block id: 15 size: 23 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/text/text.py (31:62)
- src/datasets/packaged_modules/xml/xml.py (27:58)

duplicated block id: 16 size: 22 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (16:37)
- src/datasets/io/text.py (10:31)

duplicated block id: 17 size: 22 cleaned lines of code in 2 files:
- src/datasets/io/parquet.py (18:39)
- src/datasets/io/text.py (10:31)

duplicated block id: 18 size: 22 cleaned lines of code in 2 files:
- src/datasets/combine.py (124:145)
- src/datasets/combine.py (190:211)

duplicated block id: 19 size: 22 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (16:37)
- src/datasets/io/parquet.py (18:39)

duplicated block id: 20 size: 21 cleaned lines of code in 2 files:
- src/datasets/io/json.py (45:69)
- src/datasets/io/text.py (36:60)

duplicated block id: 21 size: 21 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (42:66)
- src/datasets/io/json.py (45:69)

duplicated block id: 22 size: 21 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (42:66)
- src/datasets/io/parquet.py (46:70)

duplicated block id: 23 size: 21 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (273:293)
- src/datasets/utils/_dill.py (442:462)

duplicated block id: 24 size: 21 cleaned lines of code in 2 files:
- src/datasets/io/json.py (45:69)
- src/datasets/io/parquet.py (46:70)

duplicated block id: 25 size: 21 cleaned lines of code in 2 files:
- src/datasets/io/parquet.py (46:70)
- src/datasets/io/text.py (36:60)

duplicated block id: 26 size: 20 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1589:1612)
- src/datasets/iterable_dataset.py (1713:1736)

duplicated block id: 27 size: 18 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (314:333)
- src/datasets/packaged_modules/json/json.py (135:154)

duplicated block id: 28 size: 18 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (323:341)
- src/datasets/iterable_dataset.py (395:413)

duplicated block id: 29 size: 18 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (148:168)
- src/datasets/packaged_modules/text/text.py (31:55)

duplicated block id: 30 size: 18 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (279:296)
- src/datasets/utils/_dill.py (430:447)

duplicated block id: 31 size: 18 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (148:168)
- src/datasets/packaged_modules/xml/xml.py (27:51)

duplicated block id: 32 size: 17 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (361:377)
- src/datasets/utils/_dill.py (385:401)

duplicated block id: 33 size: 15 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2659:2674)
- src/datasets/iterable_dataset.py (4192:4207)

duplicated block id: 34 size: 15 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/json/json.py (71:88)
- src/datasets/packaged_modules/text/text.py (32:53)

duplicated block id: 35 size: 15 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/json/json.py (71:88)
- src/datasets/packaged_modules/xml/xml.py (28:49)

duplicated block id: 36 size: 15 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1843:1858)
- src/datasets/iterable_dataset.py (4192:4207)

duplicated block id: 37 size: 15 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (430:444)
- src/datasets/utils/_dill.py (448:462)

duplicated block id: 38 size: 15 cleaned lines of code in 2 files:
- src/datasets/builder.py (1588:1603)
- src/datasets/builder.py (1840:1855)

duplicated block id: 39 size: 15 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (149:166)
- src/datasets/packaged_modules/json/json.py (71:88)

duplicated block id: 40 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (361:374)
- src/datasets/utils/_dill.py (407:420)

duplicated block id: 41 size: 14 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2912:2925)
- src/datasets/iterable_dataset.py (2992:3005)

duplicated block id: 42 size: 14 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/arrow/arrow.py (28:45)
- src/datasets/packaged_modules/parquet/parquet.py (42:59)

duplicated block id: 43 size: 14 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2679:2693)
- src/datasets/iterable_dataset.py (4209:4223)

duplicated block id: 44 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (297:310)
- src/datasets/utils/_dill.py (431:444)

duplicated block id: 45 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (297:310)
- src/datasets/utils/_dill.py (449:462)

duplicated block id: 46 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (258:271)
- src/datasets/utils/_dill.py (427:440)

duplicated block id: 47 size: 14 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1863:1877)
- src/datasets/iterable_dataset.py (4209:4223)

duplicated block id: 48 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (385:398)
- src/datasets/utils/_dill.py (407:420)

duplicated block id: 49 size: 14 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (280:293)
- src/datasets/utils/_dill.py (297:310)

duplicated block id: 50 size: 13 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (259:271)
- src/datasets/utils/_dill.py (409:421)

duplicated block id: 51 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (50:63)
- src/datasets/io/generator.py (43:56)

duplicated block id: 52 size: 13 cleaned lines of code in 2 files:
- src/datasets/formatting/tf_formatter.py (96:111)
- src/datasets/formatting/torch_formatter.py (97:112)

duplicated block id: 53 size: 13 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (141:156)
- src/datasets/formatting/tf_formatter.py (96:111)

duplicated block id: 54 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (43:56)
- src/datasets/io/json.py (53:66)

duplicated block id: 55 size: 13 cleaned lines of code in 2 files:
- src/datasets/data_files.py (580:592)
- src/datasets/data_files.py (739:751)

duplicated block id: 56 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (43:56)
- src/datasets/io/parquet.py (54:67)

duplicated block id: 57 size: 13 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (141:156)
- src/datasets/formatting/torch_formatter.py (97:112)

duplicated block id: 58 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (24:36)
- src/datasets/io/json.py (25:37)

duplicated block id: 59 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/json.py (25:37)
- src/datasets/io/text.py (18:30)

duplicated block id: 60 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (43:56)
- src/datasets/io/text.py (44:57)

duplicated block id: 61 size: 13 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (409:421)
- src/datasets/utils/_dill.py (428:440)

duplicated block id: 62 size: 13 cleaned lines of code in 2 files:
- src/datasets/io/json.py (25:37)
- src/datasets/io/parquet.py (26:38)

duplicated block id: 63 size: 12 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (344:355)
- src/datasets/iterable_dataset.py (420:431)

duplicated block id: 64 size: 12 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (259:270)
- src/datasets/utils/_dill.py (363:374)

duplicated block id: 65 size: 12 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (259:270)
- src/datasets/utils/_dill.py (387:398)

duplicated block id: 66 size: 12 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1727:1741)
- src/datasets/iterable_dataset.py (4039:4053)

duplicated block id: 67 size: 12 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/arrow/arrow.py (27:41)
- src/datasets/packaged_modules/text/text.py (31:48)

duplicated block id: 68 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/np_formatter.py (105:117)
- src/datasets/formatting/tf_formatter.py (114:126)

duplicated block id: 69 size: 12 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/arrow/arrow.py (27:41)
- src/datasets/packaged_modules/csv/csv.py (148:161)

duplicated block id: 70 size: 12 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2543:2557)
- src/datasets/iterable_dataset.py (4039:4053)

duplicated block id: 71 size: 12 cleaned lines of code in 2 files:
- src/datasets/builder.py (1524:1538)
- src/datasets/builder.py (1779:1793)

duplicated block id: 72 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/tf_formatter.py (114:126)
- src/datasets/formatting/torch_formatter.py (115:127)

duplicated block id: 73 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (159:171)
- src/datasets/formatting/tf_formatter.py (114:126)

duplicated block id: 74 size: 12 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1606:1617)
- src/datasets/dataset_dict.py (2432:2443)

duplicated block id: 75 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (159:171)
- src/datasets/formatting/np_formatter.py (105:117)

duplicated block id: 76 size: 12 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (387:398)
- src/datasets/utils/_dill.py (428:439)

duplicated block id: 77 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (159:171)
- src/datasets/formatting/torch_formatter.py (115:127)

duplicated block id: 78 size: 12 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/arrow/arrow.py (27:41)
- src/datasets/packaged_modules/xml/xml.py (27:44)

duplicated block id: 79 size: 12 cleaned lines of code in 2 files:
- src/datasets/formatting/np_formatter.py (105:117)
- src/datasets/formatting/torch_formatter.py (115:127)

duplicated block id: 80 size: 12 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1380:1391)
- src/datasets/iterable_dataset.py (1397:1408)

duplicated block id: 81 size: 12 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (363:374)
- src/datasets/utils/_dill.py (428:439)

duplicated block id: 82 size: 11 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/parquet/parquet.py (42:55)
- src/datasets/packaged_modules/text/text.py (32:48)

duplicated block id: 83 size: 11 cleaned lines of code in 2 files:
- src/datasets/load.py (472:483)
- src/datasets/load.py (680:691)

duplicated block id: 84 size: 11 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/json/json.py (71:83)
- src/datasets/packaged_modules/parquet/parquet.py (42:55)

duplicated block id: 85 size: 11 cleaned lines of code in 2 files:
- src/datasets/load.py (697:707)
- src/datasets/load.py (771:781)

duplicated block id: 86 size: 11 cleaned lines of code in 2 files:
- src/datasets/builder.py (1549:1560)
- src/datasets/builder.py (1798:1809)

duplicated block id: 87 size: 11 cleaned lines of code in 2 files:
- src/datasets/builder.py (1569:1579)
- src/datasets/builder.py (1816:1826)

duplicated block id: 88 size: 11 cleaned lines of code in 2 files:
- src/datasets/load.py (968:978)
- src/datasets/load.py (1005:1015)

duplicated block id: 89 size: 11 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (149:161)
- src/datasets/packaged_modules/parquet/parquet.py (42:55)

duplicated block id: 90 size: 11 cleaned lines of code in 2 files:
- src/datasets/data_files.py (632:642)
- src/datasets/data_files.py (678:688)

duplicated block id: 91 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (261:271)
- src/datasets/utils/_dill.py (448:458)

duplicated block id: 92 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (244:254)
- src/datasets/utils/_dill.py (343:353)

duplicated block id: 93 size: 11 cleaned lines of code in 2 files:
- src/datasets/load.py (443:453)
- src/datasets/load.py (642:652)

duplicated block id: 94 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (261:271)
- src/datasets/utils/_dill.py (279:289)

duplicated block id: 95 size: 11 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/parquet/parquet.py (42:55)
- src/datasets/packaged_modules/xml/xml.py (28:44)

duplicated block id: 96 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/tf_utils.py (266:276)
- src/datasets/utils/tf_utils.py (504:514)

duplicated block id: 97 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (279:289)
- src/datasets/utils/_dill.py (411:421)

duplicated block id: 98 size: 11 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/arrow/arrow.py (28:41)
- src/datasets/packaged_modules/json/json.py (71:83)

duplicated block id: 99 size: 11 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (411:421)
- src/datasets/utils/_dill.py (448:458)

duplicated block id: 100 size: 10 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (589:601)
- src/datasets/iterable_dataset.py (631:643)

duplicated block id: 101 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (365:374)
- src/datasets/utils/_dill.py (448:457)

duplicated block id: 102 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (262:271)
- src/datasets/utils/_dill.py (297:306)

duplicated block id: 103 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (389:398)
- src/datasets/utils/_dill.py (448:457)

duplicated block id: 104 size: 10 cleaned lines of code in 2 files:
- src/datasets/load.py (454:463)
- src/datasets/load.py (654:663)

duplicated block id: 105 size: 10 cleaned lines of code in 2 files:
- src/datasets/formatting/formatting.py (241:250)
- src/datasets/formatting/polars_formatter.py (73:82)

duplicated block id: 106 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (297:306)
- src/datasets/utils/_dill.py (412:421)

duplicated block id: 107 size: 10 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (240:249)
- src/datasets/features/video.py (262:271)

duplicated block id: 108 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (279:288)
- src/datasets/utils/_dill.py (389:398)

duplicated block id: 109 size: 10 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (279:288)
- src/datasets/utils/_dill.py (365:374)

duplicated block id: 110 size: 10 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3300:3309)
- src/datasets/iterable_dataset.py (3424:3433)

duplicated block id: 111 size: 10 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (240:249)
- src/datasets/features/image.py (234:243)

duplicated block id: 112 size: 9 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/pandas/pandas.py (45:55)
- src/datasets/packaged_modules/text/text.py (44:53)

duplicated block id: 113 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2997:3005)
- src/datasets/iterable_dataset.py (3406:3414)

duplicated block id: 114 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (785:795)
- src/datasets/iterable_dataset.py (873:884)

duplicated block id: 115 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2252:2260)
- src/datasets/iterable_dataset.py (2351:2359)

duplicated block id: 116 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/parquet.py (55:64)
- src/datasets/io/sql.py (37:46)

duplicated block id: 117 size: 9 cleaned lines of code in 2 files:
- src/datasets/inspect.py (161:169)
- src/datasets/inspect.py (220:228)

duplicated block id: 118 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2917:2925)
- src/datasets/iterable_dataset.py (3054:3062)

duplicated block id: 119 size: 9 cleaned lines of code in 2 files:
- src/datasets/utils/tf_utils.py (119:127)
- src/datasets/utils/tf_utils.py (266:274)

duplicated block id: 120 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2997:3005)
- src/datasets/iterable_dataset.py (3054:3062)

duplicated block id: 121 size: 9 cleaned lines of code in 2 files:
- src/datasets/features/features.py (2102:2110)
- src/datasets/features/features.py (2128:2136)

duplicated block id: 122 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3254:3262)
- src/datasets/iterable_dataset.py (3301:3309)

duplicated block id: 123 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2917:2925)
- src/datasets/iterable_dataset.py (3406:3414)

duplicated block id: 124 size: 9 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/json/json.py (79:88)
- src/datasets/packaged_modules/pandas/pandas.py (45:55)

duplicated block id: 125 size: 9 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (157:166)
- src/datasets/packaged_modules/pandas/pandas.py (45:55)

duplicated block id: 126 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3254:3262)
- src/datasets/iterable_dataset.py (3425:3433)

duplicated block id: 127 size: 9 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (297:305)
- src/datasets/utils/_dill.py (390:398)

duplicated block id: 128 size: 9 cleaned lines of code in 2 files:
- src/datasets/utils/_dill.py (297:305)
- src/datasets/utils/_dill.py (366:374)

duplicated block id: 129 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (44:53)
- src/datasets/io/sql.py (37:46)

duplicated block id: 130 size: 9 cleaned lines of code in 2 files:
- src/datasets/data_files.py (634:642)
- src/datasets/data_files.py (657:665)

duplicated block id: 131 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/sql.py (37:46)
- src/datasets/io/text.py (45:54)

duplicated block id: 132 size: 9 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3054:3062)
- src/datasets/iterable_dataset.py (3406:3414)

duplicated block id: 133 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/json.py (54:63)
- src/datasets/io/sql.py (37:46)

duplicated block id: 134 size: 9 cleaned lines of code in 2 files:
- src/datasets/data_files.py (657:665)
- src/datasets/data_files.py (680:688)

duplicated block id: 135 size: 9 cleaned lines of code in 2 files:
- src/datasets/utils/tf_utils.py (119:127)
- src/datasets/utils/tf_utils.py (504:512)

duplicated block id: 136 size: 9 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (104:118)
- src/datasets/formatting/tf_formatter.py (68:82)

duplicated block id: 137 size: 9 cleaned lines of code in 2 files:
- src/datasets/builder.py (447:455)
- src/datasets/builder.py (494:502)

duplicated block id: 138 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (78:87)
- src/datasets/io/json.py (81:90)

duplicated block id: 139 size: 9 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/pandas/pandas.py (45:55)
- src/datasets/packaged_modules/xml/xml.py (40:49)

duplicated block id: 140 size: 9 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (177:186)
- src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (282:291)

duplicated block id: 141 size: 9 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (51:60)
- src/datasets/io/sql.py (37:46)

duplicated block id: 142 size: 8 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (81:94)
- src/datasets/formatting/torch_formatter.py (47:59)

duplicated block id: 143 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1625:1632)
- src/datasets/iterable_dataset.py (1750:1757)

duplicated block id: 144 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1490:1497)
- src/datasets/iterable_dataset.py (1503:1510)

duplicated block id: 145 size: 8 cleaned lines of code in 2 files:
- src/datasets/search.py (458:465)
- src/datasets/search.py (499:506)

duplicated block id: 146 size: 8 cleaned lines of code in 2 files:
- src/datasets/formatting/jax_formatter.py (147:156)
- src/datasets/formatting/np_formatter.py (93:102)

duplicated block id: 147 size: 8 cleaned lines of code in 2 files:
- src/datasets/inspect.py (238:245)
- src/datasets/inspect.py (299:306)

duplicated block id: 148 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (12:19)
- src/datasets/io/parquet.py (21:28)

duplicated block id: 149 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/json.py (16:23)
- src/datasets/io/parquet.py (18:25)

duplicated block id: 150 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (451:461)
- src/datasets/iterable_dataset.py (587:597)

duplicated block id: 151 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1791:1798)
- src/datasets/iterable_dataset.py (4082:4089)

duplicated block id: 152 size: 8 cleaned lines of code in 2 files:
- src/datasets/inspect.py (97:104)
- src/datasets/inspect.py (345:352)

duplicated block id: 153 size: 8 cleaned lines of code in 2 files:
- src/datasets/utils/track.py (28:36)
- src/datasets/utils/track.py (50:58)

duplicated block id: 154 size: 8 cleaned lines of code in 2 files:
- src/datasets/load.py (491:498)
- src/datasets/load.py (774:781)

duplicated block id: 155 size: 8 cleaned lines of code in 2 files:
- src/datasets/load.py (491:498)
- src/datasets/load.py (700:707)

duplicated block id: 156 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (631:640)
- src/datasets/iterable_dataset.py (1525:1534)

duplicated block id: 157 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (416:423)
- src/datasets/dataset_dict.py (2275:2282)

duplicated block id: 158 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (12:19)
- src/datasets/io/csv.py (19:26)

duplicated block id: 159 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (453:462)
- src/datasets/iterable_dataset.py (1603:1612)

duplicated block id: 160 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1192:1199)
- src/datasets/dataset_dict.py (1275:1282)

duplicated block id: 161 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2124:2131)
- src/datasets/iterable_dataset.py (2727:2734)

duplicated block id: 162 size: 8 cleaned lines of code in 2 files:
- src/datasets/formatting/np_formatter.py (93:102)
- src/datasets/formatting/tf_formatter.py (102:111)

duplicated block id: 163 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (589:598)
- src/datasets/iterable_dataset.py (1525:1534)

duplicated block id: 164 size: 8 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/csv/csv.py (165:174)
- src/datasets/packaged_modules/sql/sql.py (100:109)

duplicated block id: 165 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2607:2614)
- src/datasets/iterable_dataset.py (4082:4089)

duplicated block id: 166 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (16:23)
- src/datasets/io/json.py (16:23)

duplicated block id: 167 size: 8 cleaned lines of code in 2 files:
- src/datasets/features/image.py (111:120)
- src/datasets/features/video.py (123:132)

duplicated block id: 168 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (244:251)
- src/datasets/iterable_dataset.py (287:294)

duplicated block id: 169 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (453:462)
- src/datasets/iterable_dataset.py (1727:1736)

duplicated block id: 170 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/json.py (16:23)
- src/datasets/io/text.py (10:17)

duplicated block id: 171 size: 8 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2109:2116)
- src/datasets/iterable_dataset.py (2704:2711)

duplicated block id: 172 size: 8 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1032:1039)
- src/datasets/iterable_dataset.py (1443:1450)

duplicated block id: 173 size: 8 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (12:19)
- src/datasets/io/text.py (13:20)

duplicated block id: 174 size: 8 cleaned lines of code in 2 files:
- src/datasets/formatting/np_formatter.py (93:102)
- src/datasets/formatting/torch_formatter.py (103:112)

duplicated block id: 175 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1806:1812)
- src/datasets/iterable_dataset.py (4096:4102)

duplicated block id: 176 size: 7 cleaned lines of code in 2 files:
- src/datasets/data_files.py (644:650)
- src/datasets/data_files.py (690:696)

duplicated block id: 177 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (631:639)
- src/datasets/iterable_dataset.py (1603:1611)

duplicated block id: 178 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1525:1533)
- src/datasets/iterable_dataset.py (1727:1735)

duplicated block id: 179 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (37:43)
- src/datasets/io/text.py (14:20)

duplicated block id: 180 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2244:2250)
- src/datasets/iterable_dataset.py (2330:2337)

duplicated block id: 181 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (631:639)
- src/datasets/iterable_dataset.py (1727:1735)

duplicated block id: 182 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (186:193)
- src/datasets/features/pdf.py (235:241)

duplicated block id: 183 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (453:461)
- src/datasets/iterable_dataset.py (1525:1533)

duplicated block id: 184 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (453:461)
- src/datasets/iterable_dataset.py (631:639)

duplicated block id: 185 size: 7 cleaned lines of code in 2 files:
- benchmarks/benchmark_getitem_100B.py (68:77)
- benchmarks/benchmark_indices_mapping.py (50:59)

duplicated block id: 186 size: 7 cleaned lines of code in 2 files:
- src/datasets/data_files.py (558:564)
- src/datasets/data_files.py (571:577)

duplicated block id: 187 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1611:1617)
- src/datasets/iterable_dataset.py (3915:3921)

duplicated block id: 188 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (186:193)
- src/datasets/features/image.py (271:277)

duplicated block id: 189 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (941:948)
- src/datasets/dataset_dict.py (2103:2110)

duplicated block id: 190 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/features.py (1676:1682)
- src/datasets/features/features.py (1694:1700)

duplicated block id: 191 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (281:287)
- src/datasets/iterable_dataset.py (414:420)

duplicated block id: 192 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (37:43)
- src/datasets/io/csv.py (20:26)

duplicated block id: 193 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2719:2725)
- src/datasets/iterable_dataset.py (3427:3433)

duplicated block id: 194 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (391:397)
- src/datasets/iterable_dataset.py (416:422)

duplicated block id: 195 size: 7 cleaned lines of code in 2 files:
- src/datasets/inspect.py (98:104)
- src/datasets/inspect.py (271:277)

duplicated block id: 196 size: 7 cleaned lines of code in 2 files:
- src/datasets/packaged_modules/videofolder/videofolder.py (27:33)
- src/datasets/packaged_modules/webdataset/webdataset.py (272:278)

duplicated block id: 197 size: 7 cleaned lines of code in 2 files:
- src/datasets/data_files.py (645:652)
- src/datasets/data_files.py (669:676)

duplicated block id: 198 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/features.py (1379:1385)
- src/datasets/features/features.py (1391:1397)

duplicated block id: 199 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (30:36)
- src/datasets/io/generator.py (22:28)

duplicated block id: 200 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (22:29)
- src/datasets/io/abc.py (44:51)

duplicated block id: 201 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (273:282)
- src/datasets/iterable_dataset.py (380:389)

duplicated block id: 202 size: 7 cleaned lines of code in 2 files:
- src/datasets/inspect.py (271:277)
- src/datasets/inspect.py (346:352)

duplicated block id: 203 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2719:2725)
- src/datasets/iterable_dataset.py (3303:3309)

duplicated block id: 204 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2805:2811)
- src/datasets/iterable_dataset.py (2962:2968)

duplicated block id: 205 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2719:2725)
- src/datasets/iterable_dataset.py (3256:3262)

duplicated block id: 206 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (533:554)
- src/datasets/dataset_dict.py (572:613)

duplicated block id: 207 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (589:597)
- src/datasets/iterable_dataset.py (1603:1611)

duplicated block id: 208 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (684:692)
- src/datasets/iterable_dataset.py (948:956)

duplicated block id: 209 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (22:28)
- src/datasets/io/text.py (24:30)

duplicated block id: 210 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/image.py (172:178)
- src/datasets/features/pdf.py (148:154)

duplicated block id: 211 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/csv.py (70:76)
- src/datasets/io/json.py (73:79)

duplicated block id: 212 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (22:28)
- src/datasets/io/parquet.py (32:38)

duplicated block id: 213 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (13:19)
- src/datasets/io/abc.py (37:43)

duplicated block id: 214 size: 7 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (186:193)
- src/datasets/features/audio.py (268:274)

duplicated block id: 215 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1454:1461)
- src/datasets/dataset_dict.py (1497:1504)

duplicated block id: 216 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (1525:1533)
- src/datasets/iterable_dataset.py (1603:1611)

duplicated block id: 217 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1766:1772)
- src/datasets/dataset_dict.py (2582:2588)

duplicated block id: 218 size: 7 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (589:597)
- src/datasets/iterable_dataset.py (1727:1735)

duplicated block id: 219 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/abc.py (37:43)
- src/datasets/io/parquet.py (22:28)

duplicated block id: 220 size: 7 cleaned lines of code in 2 files:
- src/datasets/builder.py (1559:1565)
- src/datasets/builder.py (1578:1584)

duplicated block id: 221 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2437:2443)
- src/datasets/iterable_dataset.py (3915:3921)

duplicated block id: 222 size: 7 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (2622:2628)
- src/datasets/iterable_dataset.py (4096:4102)

duplicated block id: 223 size: 7 cleaned lines of code in 2 files:
- src/datasets/io/generator.py (22:28)
- src/datasets/io/json.py (31:37)

duplicated block id: 224 size: 6 cleaned lines of code in 2 files:
- src/datasets/utils/py_utils.py (409:414)
- src/datasets/utils/py_utils.py (546:551)

duplicated block id: 225 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2963:2968)
- src/datasets/iterable_dataset.py (3257:3262)

duplicated block id: 226 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2963:2968)
- src/datasets/iterable_dataset.py (3304:3309)

duplicated block id: 227 size: 6 cleaned lines of code in 2 files:
- benchmarks/benchmark_iterating.py (45:50)
- benchmarks/benchmark_iterating.py (59:64)

duplicated block id: 228 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2720:2725)
- src/datasets/iterable_dataset.py (3409:3414)

duplicated block id: 229 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3057:3062)
- src/datasets/iterable_dataset.py (3304:3309)

duplicated block id: 230 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3057:3062)
- src/datasets/iterable_dataset.py (3257:3262)

duplicated block id: 231 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3257:3262)
- src/datasets/iterable_dataset.py (3409:3414)

duplicated block id: 232 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (3057:3062)
- src/datasets/iterable_dataset.py (3428:3433)

duplicated block id: 233 size: 6 cleaned lines of code in 2 files:
- src/datasets/builder.py (1808:1813)
- src/datasets/builder.py (1825:1830)

duplicated block id: 234 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2806:2811)
- src/datasets/iterable_dataset.py (2920:2925)

duplicated block id: 235 size: 6 cleaned lines of code in 2 files:
- src/datasets/features/audio.py (232:237)
- src/datasets/features/video.py (256:261)

duplicated block id: 236 size: 6 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (1497:1502)
- src/datasets/dataset_dict.py (1589:1594)

duplicated block id: 237 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2806:2811)
- src/datasets/iterable_dataset.py (3000:3005)

duplicated block id: 238 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2806:2811)
- src/datasets/iterable_dataset.py (3057:3062)

duplicated block id: 239 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2920:2925)
- src/datasets/iterable_dataset.py (3304:3309)

duplicated block id: 240 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2920:2925)
- src/datasets/iterable_dataset.py (3257:3262)

duplicated block id: 241 size: 6 cleaned lines of code in 2 files:
- src/datasets/features/features.py (313:318)
- src/datasets/features/features.py (364:369)

duplicated block id: 242 size: 6 cleaned lines of code in 2 files:
- src/datasets/features/features.py (313:318)
- src/datasets/features/features.py (349:354)

duplicated block id: 243 size: 6 cleaned lines of code in 2 files:
- src/datasets/inspect.py (110:115)
- src/datasets/inspect.py (176:181)

duplicated block id: 244 size: 6 cleaned lines of code in 2 files:
- src/datasets/dataset_dict.py (17:22)
- src/datasets/hub.py (5:10)

duplicated block id: 245 size: 6 cleaned lines of code in 2 files:
- src/datasets/formatting/np_formatter.py (66:75)
- src/datasets/formatting/torch_formatter.py (79:88)

duplicated block id: 246 size: 6 cleaned lines of code in 2 files:
- src/datasets/iterable_dataset.py (2920:2925)
- src/datasets/iterable_dataset.py (3428:3433)

duplicated block id: 247 size: 6
src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (2920:2925) duplicated block id: 235 size: 6 cleaned lines of code in 2 files: - src/datasets/features/audio.py (232:237) - src/datasets/features/video.py (256:261) duplicated block id: 236 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1497:1502) - src/datasets/dataset_dict.py (1589:1594) duplicated block id: 237 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3000:3005) duplicated block id: 238 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3057:3062) duplicated block id: 239 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2920:2925) - src/datasets/iterable_dataset.py (3304:3309) duplicated block id: 240 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2920:2925) - src/datasets/iterable_dataset.py (3257:3262) duplicated block id: 241 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (313:318) - src/datasets/features/features.py (364:369) duplicated block id: 242 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (313:318) - src/datasets/features/features.py (349:354) duplicated block id: 243 size: 6 cleaned lines of code in 2 files: - src/datasets/inspect.py (110:115) - src/datasets/inspect.py (176:181) duplicated block id: 244 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (17:22) - src/datasets/hub.py (5:10) duplicated block id: 245 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/np_formatter.py (66:75) - src/datasets/formatting/torch_formatter.py (79:88) duplicated block id: 246 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2920:2925) - src/datasets/iterable_dataset.py (3428:3433) duplicated block id: 247 size: 6 
cleaned lines of code in 2 files: - src/datasets/inspect.py (45:50) - src/datasets/inspect.py (302:307) duplicated block id: 248 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (967:972) - src/datasets/iterable_dataset.py (978:983) duplicated block id: 249 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (704:709) - src/datasets/iterable_dataset.py (989:994) duplicated block id: 250 size: 6 cleaned lines of code in 2 files: - src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (272:277) - src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (390:395) duplicated block id: 251 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/jax_formatter.py (109:118) - src/datasets/formatting/torch_formatter.py (79:88) duplicated block id: 252 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (313:318) - src/datasets/features/features.py (334:339) duplicated block id: 253 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2963:2968) - src/datasets/iterable_dataset.py (3057:3062) duplicated block id: 254 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1890:1895) - src/datasets/dataset_dict.py (2725:2730) duplicated block id: 255 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2963:2968) - src/datasets/iterable_dataset.py (3000:3005) duplicated block id: 256 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/jax_formatter.py (109:118) - src/datasets/formatting/np_formatter.py (66:75) duplicated block id: 257 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (2725:2730) - src/datasets/iterable_dataset.py (4233:4238) duplicated block id: 258 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1890:1895) - src/datasets/dataset_dict.py (1909:1914) duplicated block id: 259 size: 6 cleaned lines of 
code in 2 files: - src/datasets/iterable_dataset.py (3409:3414) - src/datasets/iterable_dataset.py (3428:3433) duplicated block id: 260 size: 6 cleaned lines of code in 2 files: - src/datasets/utils/_dill.py (273:278) - src/datasets/utils/_dill.py (291:296) duplicated block id: 261 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (4233:4238) - src/datasets/iterable_dataset.py (4252:4257) duplicated block id: 262 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2679:2684) - src/datasets/iterable_dataset.py (2695:2700) duplicated block id: 263 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3409:3414) duplicated block id: 264 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3428:3433) duplicated block id: 265 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (3756:3761) - src/datasets/iterable_dataset.py (3825:3830) duplicated block id: 266 size: 6 cleaned lines of code in 2 files: - src/datasets/data_files.py (669:674) - src/datasets/data_files.py (691:696) duplicated block id: 267 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (3304:3309) - src/datasets/iterable_dataset.py (3409:3414) duplicated block id: 268 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1334:1339) - src/datasets/dataset_dict.py (1711:1716) duplicated block id: 269 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (2706:2711) - src/datasets/dataset_dict.py (2725:2730) duplicated block id: 270 size: 6 cleaned lines of code in 2 files: - src/datasets/packaged_modules/spark/spark.py (239:244) - src/datasets/packaged_modules/spark/spark.py (257:262) duplicated block id: 271 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (1650:1655) - 
src/datasets/iterable_dataset.py (1775:1780) duplicated block id: 272 size: 6 cleaned lines of code in 2 files: - src/datasets/features/video.py (197:202) - src/datasets/features/video.py (206:211) duplicated block id: 273 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (349:354) - src/datasets/features/features.py (364:369) duplicated block id: 274 size: 6 cleaned lines of code in 2 files: - src/datasets/packaged_modules/arrow/arrow.py (53:62) - src/datasets/packaged_modules/parquet/parquet.py (67:76) duplicated block id: 275 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1463:1468) - src/datasets/dataset_dict.py (1555:1560) duplicated block id: 276 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (3000:3005) - src/datasets/iterable_dataset.py (3428:3433) duplicated block id: 277 size: 6 cleaned lines of code in 2 files: - src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (307:313) - src/datasets/packaged_modules/json/json.py (124:130) duplicated block id: 278 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (334:339) - src/datasets/features/features.py (349:354) duplicated block id: 279 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (878:884) - src/datasets/iterable_dataset.py (948:954) duplicated block id: 280 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/np_formatter.py (66:75) - src/datasets/formatting/tf_formatter.py (73:82) duplicated block id: 281 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (947:952) - src/datasets/dataset_dict.py (1064:1069) duplicated block id: 282 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1454:1459) - src/datasets/dataset_dict.py (1589:1594) duplicated block id: 283 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (3000:3005) - src/datasets/iterable_dataset.py 
(3257:3262) duplicated block id: 284 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (3000:3005) - src/datasets/iterable_dataset.py (3304:3309) duplicated block id: 285 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/tf_formatter.py (73:82) - src/datasets/formatting/torch_formatter.py (79:88) duplicated block id: 286 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1890:1895) - src/datasets/iterable_dataset.py (4252:4257) duplicated block id: 287 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1759:1764) - src/datasets/dataset_dict.py (2575:2580) duplicated block id: 288 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2720:2725) - src/datasets/iterable_dataset.py (2806:2811) duplicated block id: 289 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1909:1914) - src/datasets/iterable_dataset.py (4233:4238) duplicated block id: 290 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1420:1425) - src/datasets/dataset_dict.py (1555:1560) duplicated block id: 291 size: 6 cleaned lines of code in 2 files: - src/datasets/features/audio.py (232:237) - src/datasets/features/pdf.py (201:206) duplicated block id: 292 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2703:2708) - src/datasets/iterable_dataset.py (2794:2799) duplicated block id: 293 size: 6 cleaned lines of code in 2 files: - src/datasets/load.py (570:575) - src/datasets/load.py (741:746) duplicated block id: 294 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2720:2725) - src/datasets/iterable_dataset.py (2920:2925) duplicated block id: 295 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2720:2725) - src/datasets/iterable_dataset.py (2963:2968) duplicated block id: 296 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py 
(789:795) - src/datasets/iterable_dataset.py (948:954) duplicated block id: 297 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1420:1425) - src/datasets/dataset_dict.py (1463:1468) duplicated block id: 298 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2720:2725) - src/datasets/iterable_dataset.py (3000:3005) duplicated block id: 299 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2720:2725) - src/datasets/iterable_dataset.py (3057:3062) duplicated block id: 300 size: 6 cleaned lines of code in 2 files: - src/datasets/features/video.py (196:201) - src/datasets/features/video.py (214:219) duplicated block id: 301 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (987:1062) - src/datasets/dataset_dict.py (1091:1120) duplicated block id: 302 size: 6 cleaned lines of code in 2 files: - src/datasets/search.py (110:115) - src/datasets/search.py (589:594) duplicated block id: 303 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (615:620) - src/datasets/dataset_dict.py (752:757) duplicated block id: 304 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (2176:2181) - src/datasets/iterable_dataset.py (2795:2800) duplicated block id: 305 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1334:1339) - src/datasets/dataset_dict.py (2527:2532) duplicated block id: 306 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (684:690) - src/datasets/iterable_dataset.py (878:884) duplicated block id: 307 size: 6 cleaned lines of code in 2 files: - src/datasets/formatting/formatting.py (255:260) - src/datasets/formatting/polars_formatter.py (87:92) duplicated block id: 308 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2963:2968) - src/datasets/iterable_dataset.py (3428:3433) duplicated block id: 309 size: 6 cleaned lines of code in 2 files: - 
src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3257:3262) duplicated block id: 310 size: 6 cleaned lines of code in 2 files: - src/datasets/features/image.py (124:131) - src/datasets/features/pdf.py (100:107) duplicated block id: 311 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2806:2811) - src/datasets/iterable_dataset.py (3304:3309) duplicated block id: 312 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2963:2968) - src/datasets/iterable_dataset.py (3409:3414) duplicated block id: 313 size: 6 cleaned lines of code in 2 files: - src/datasets/features/features.py (334:339) - src/datasets/features/features.py (364:369) duplicated block id: 314 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (2706:2711) - src/datasets/iterable_dataset.py (4252:4257) duplicated block id: 315 size: 6 cleaned lines of code in 2 files: - src/datasets/features/audio.py (232:237) - src/datasets/features/image.py (228:233) duplicated block id: 316 size: 6 cleaned lines of code in 2 files: - src/datasets/download/download_manager.py (74:79) - src/datasets/download/streaming_download_manager.py (57:62) duplicated block id: 317 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (2920:2925) - src/datasets/iterable_dataset.py (2963:2968) duplicated block id: 318 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1909:1914) - src/datasets/dataset_dict.py (2706:2711) duplicated block id: 319 size: 6 cleaned lines of code in 2 files: - src/datasets/load.py (418:423) - src/datasets/load.py (595:600) duplicated block id: 320 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1196:1202) - src/datasets/dataset_dict.py (2182:2188) duplicated block id: 321 size: 6 cleaned lines of code in 2 files: - src/datasets/fingerprint.py (238:243) - src/datasets/fingerprint.py (260:265) duplicated block id: 322 size: 6 
cleaned lines of code in 2 files: - src/datasets/packaged_modules/folder_based_builder/folder_based_builder.py (65:72) - src/datasets/packaged_modules/parquet/parquet.py (41:48) duplicated block id: 323 size: 6 cleaned lines of code in 2 files: - src/datasets/iterable_dataset.py (684:690) - src/datasets/iterable_dataset.py (789:795) duplicated block id: 324 size: 6 cleaned lines of code in 2 files: - src/datasets/dataset_dict.py (1139:1183) - src/datasets/dataset_dict.py (1206:1258)