duplicated block id: 1
size: 21 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_doc_read.py (39:62)
- src/chug/task_pipeline/pipeline_doc_vqa.py (99:122)

duplicated block id: 2
size: 18 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_gtparse.py (72:93)
- src/chug/task_pipeline/pipeline_image_text.py (70:90)

duplicated block id: 3
size: 17 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_gtparse.py (47:65)
- src/chug/task_pipeline/pipeline_image_text.py (45:63)

duplicated block id: 4
size: 14 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_doc.py (15:33)
- src/chug/image/build_transforms_doc.py (53:70)

duplicated block id: 5
size: 12 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (76:87)
- src/chug/wds/filters.py (112:123)

duplicated block id: 6
size: 12 cleaned lines of code
in 2 files:
- src/chug/doc/__init__.py (2:13)
- src/chug/task_pipeline/pipeline_doc_vqa.py (11:22)

duplicated block id: 7
size: 12 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_doc.py (35:49)
- src/chug/image/build_transforms_doc.py (160:174)

duplicated block id: 8
size: 11 cleaned lines of code
in 2 files:
- src/chug/image/transforms_alb.py (37:49)
- src/chug/image/transforms_alb.py (57:69)

duplicated block id: 9
size: 10 cleaned lines of code
in 2 files:
- src/chug/image/transforms_torch.py (50:60)
- src/chug/image/transforms_torch.py (72:82)

duplicated block id: 10
size: 10 cleaned lines of code
in 2 files:
- src/chug/doc/doc_read_processor.py (21:30)
- src/chug/doc/doc_vqa_processor.py (25:34)

duplicated block id: 11
size: 9 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (56:64)
- src/chug/wds/filters.py (112:120)

duplicated block id: 12
size: 9 cleaned lines of code
in 2 files:
- src/chug/hfds/loader.py (39:47)
- src/chug/wds/loader.py (22:30)

duplicated block id: 13
size: 9 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (56:64)
- src/chug/wds/filters.py (76:84)

duplicated block id: 14
size: 8 cleaned lines of code
in 2 files:
- src/chug/loader.py (38:45)
- src/chug/loader.py (49:56)

duplicated block id: 15
size: 7 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_image.py (17:23)
- src/chug/image/build_transforms_image.py (87:93)

duplicated block id: 16
size: 7 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_doc_vqa.py (116:122)
- src/chug/task_pipeline/pipeline_gtparse.py (86:93)

duplicated block id: 17
size: 7 cleaned lines of code
in 2 files:
- src/chug/doc/doc_processor.py (124:130)
- src/chug/doc/doc_processor.py (147:153)

duplicated block id: 18
size: 7 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_doc_read.py (56:62)
- src/chug/task_pipeline/pipeline_gtparse.py (86:93)

duplicated block id: 19
size: 7 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_gtparse.py (26:32)
- src/chug/task_pipeline/pipeline_image_text.py (23:29)

duplicated block id: 20
size: 7 cleaned lines of code
in 2 files:
- src/chug/doc/doc_processor.py (70:76)
- src/chug/doc/doc_read_processor.py (20:26)

duplicated block id: 21
size: 7 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (58:64)
- src/chug/wds/filters.py (94:100)

duplicated block id: 22
size: 7 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (94:100)
- src/chug/wds/filters.py (114:120)

duplicated block id: 23
size: 7 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_doc_read.py (56:62)
- src/chug/task_pipeline/pipeline_image_text.py (83:90)

duplicated block id: 24
size: 7 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_image.py (38:46)
- src/chug/image/build_transforms_image.py (109:117)

duplicated block id: 25
size: 7 cleaned lines of code
in 2 files:
- src/chug/doc/doc_processor.py (63:69)
- src/chug/doc/doc_read_processor.py (12:18)

duplicated block id: 26
size: 7 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_doc.py (15:22)
- src/chug/image/build_transforms_doc.py (178:188)

duplicated block id: 27
size: 7 cleaned lines of code
in 2 files:
- src/chug/image/build_transforms_doc.py (53:60)
- src/chug/image/build_transforms_doc.py (178:188)

duplicated block id: 28
size: 7 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_doc_vqa.py (116:122)
- src/chug/task_pipeline/pipeline_image_text.py (83:90)

duplicated block id: 29
size: 7 cleaned lines of code
in 2 files:
- src/chug/loader.py (10:16)
- src/chug/loader.py (134:140)

duplicated block id: 30
size: 7 cleaned lines of code
in 2 files:
- src/chug/wds/filters.py (78:84)
- src/chug/wds/filters.py (94:100)

duplicated block id: 31
size: 6 cleaned lines of code
in 2 files:
- src/chug/doc/doc_processor.py (71:76)
- src/chug/doc/doc_vqa_processor.py (25:30)

duplicated block id: 32
size: 6 cleaned lines of code
in 2 files:
- src/chug/text/tokenization.py (68:73)
- src/chug/text/tokenization.py (104:109)

duplicated block id: 33
size: 6 cleaned lines of code
in 2 files:
- src/chug/common/config.py (68:73)
- src/chug/common/config.py (78:83)

duplicated block id: 34
size: 6 cleaned lines of code
in 2 files:
- src/chug/task_pipeline/pipeline_gtparse.py (36:45)
- src/chug/task_pipeline/pipeline_image_text.py (33:43)

duplicated block id: 35
size: 6 cleaned lines of code
in 2 files:
- src/chug/hfds/loader.py (99:105)
- src/chug/hfds/loader.py (152:158)

duplicated block id: 36
size: 6 cleaned lines of code
in 2 files:
- src/chug/wds/decode.py (69:74)
- src/chug/wds/decode.py (146:151)