huggingface / chug
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 0% | 23% | 28% | 47%
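The overall breakdown can be reproduced from the per-file line counts listed under Longest Files. The sketch below assumes each percentage is the share of all lines of code that sits in files of that size range, rounded down; that reading is an assumption, not something this page states, although it is consistent with the figures shown here. The names `size_distribution`, `BUCKET_LABELS`, `UPPER_BOUNDS`, and `LINE_COUNTS` are hypothetical.

```python
from bisect import bisect_left

# Size ranges used on this page; any file above the last bound falls in "1001+".
BUCKET_LABELS = ["1-100", "101-200", "201-500", "501-1000", "1001+"]
UPPER_BOUNDS = [100, 200, 500, 1000]

# Line counts copied from the "Longest Files (Top 45)" list on this page.
LINE_COUNTS = [
    283, 232, 230, 159, 152, 132, 124, 124, 109, 107, 100, 99, 90, 88, 84,
    82, 81, 73, 71, 63, 56, 55, 52, 51, 50, 50, 47, 45, 42, 40, 36, 26, 20,
    16, 14, 12, 12, 7, 7, 6, 3, 2, 1, 1, 1,
]


def size_distribution(line_counts):
    """Return the percentage of total lines held by files in each size range.

    Assumption: percentages are weighted by lines of code (not by file count)
    and rounded down to whole numbers.
    """
    totals = [0] * len(BUCKET_LABELS)
    for n in line_counts:
        # bisect_left maps 100 to "1-100" and 101 to "101-200", and so on.
        totals[bisect_left(UPPER_BOUNDS, n)] += n
    grand_total = sum(totals)
    return {
        label: (100 * total // grand_total if grand_total else 0)
        for label, total in zip(BUCKET_LABELS, totals)
    }


if __name__ == "__main__":
    for label, pct in size_distribution(LINE_COUNTS).items():
        print(f"{label}: {pct}%")
```

Because each share is rounded down, the five percentages need not add up to exactly 100.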


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 0% | 24% | 29% | 46%
toml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
src | 0% | 0% | 24% | 29% | 46%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 45)
File | # lines | # units
build_transforms_doc.py in src/chug/image | 283 | 3
decode.py in src/chug/wds | 232 | 5
config.py in src/chug/common | 230 | 14
doc_processor.py in src/chug/doc | 159 | 10
build_transforms_image.py in src/chug/image | 152 | 2
loader.py in src/chug/hfds | 132 | 2
doc_read_processor.py in src/chug/doc | 124 | 4
loader.py in src/chug | 124 | 4
transforms_torch.py in src/chug/image | 109 | 16
filters.py in src/chug/wds | 107 | 6
tokenization.py in src/chug/text | 100 | 5
loader.py in src/chug/wds | 99 | 1
doc_vqa_processor.py in src/chug/doc | 90 | 3
pipeline.py in src/chug/wds | 88 | 1
transforms_alb.py in src/chug/image | 84 | 14
pipeline_doc_vqa.py in src/chug/task_pipeline | 82 | 2
shardlists.py in src/chug/wds | 81 | 5
pipeline_gtparse.py in src/chug/task_pipeline | 73 | 2
pipeline_image_text.py in src/chug/task_pipeline | 71 | 2
types.py in src/chug/common | 63 | 6
collate.py in src/chug/hfds | 56 | 4
urls.py in src/chug/common | 55 | 4
dataset_info.py in src/chug/wds | 52 | 2
helpers.py in src/chug/wds | 51 | 5
pipeline_doc_read.py in src/chug/task_pipeline | 50 | 1
| 50 | -
transforms_factory.py in src/chug/image | 47 | 2
__init__.py in src/chug | 45 | -
collate.py in src/chug/common | 42 | 1
tariterators.py in src/chug/wds | 40 | 3
wrappers.py in src/chug/hfds | 36 | 6
random.py in src/chug/common | 26 | 2
pipeline_manual.py in src/chug/task_pipeline | 20 | 1
__init__.py in src/chug/doc | 16 | -
pipeline_factory.py in src/chug/task_pipeline | 14 | 1
task_config.py in src/chug/common | 12 | -
constants.py in src/chug/doc | 12 | -
__init__.py in src/chug/wds | 7 | -
__init__.py in src/chug/common | 7 | -
__init__.py in src/chug/task_pipeline | 6 | -
__init__.py in src/chug/image | 3 | -
__init__.py in src/chug/hfds | 2 | -
| 1 | -
__init__.py in src/chug/text | 1 | -
version.py in src/chug | 1 | -
Files With Most Units (Top 32)
File | # lines | # units
transforms_torch.py in src/chug/image | 109 | 16
config.py in src/chug/common | 230 | 14
transforms_alb.py in src/chug/image | 84 | 14
doc_processor.py in src/chug/doc | 159 | 10
wrappers.py in src/chug/hfds | 36 | 6
filters.py in src/chug/wds | 107 | 6
types.py in src/chug/common | 63 | 6
helpers.py in src/chug/wds | 51 | 5
decode.py in src/chug/wds | 232 | 5
shardlists.py in src/chug/wds | 81 | 5
tokenization.py in src/chug/text | 100 | 5
collate.py in src/chug/hfds | 56 | 4
urls.py in src/chug/common | 55 | 4
doc_read_processor.py in src/chug/doc | 124 | 4
loader.py in src/chug | 124 | 4
tariterators.py in src/chug/wds | 40 | 3
doc_vqa_processor.py in src/chug/doc | 90 | 3
build_transforms_doc.py in src/chug/image | 283 | 3
loader.py in src/chug/hfds | 132 | 2
dataset_info.py in src/chug/wds | 52 | 2
random.py in src/chug/common | 26 | 2
transforms_factory.py in src/chug/image | 47 | 2
build_transforms_image.py in src/chug/image | 152 | 2
pipeline_gtparse.py in src/chug/task_pipeline | 73 | 2
pipeline_doc_vqa.py in src/chug/task_pipeline | 82 | 2
pipeline_image_text.py in src/chug/task_pipeline | 71 | 2
pipeline.py in src/chug/wds | 88 | 1
loader.py in src/chug/wds | 99 | 1
collate.py in src/chug/common | 42 | 1
pipeline_manual.py in src/chug/task_pipeline | 20 | 1
pipeline_doc_read.py in src/chug/task_pipeline | 50 | 1
pipeline_factory.py in src/chug/task_pipeline | 14 | 1
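This page does not define what a unit is. As a rough sketch, assuming a unit corresponds to a function or method definition (an assumption, not the report's stated rule), the unit count of a Python file could be approximated with the standard `ast` module. The helper name `count_units` is hypothetical.

```python
import ast


def count_units(source: str) -> int:
    """Count function and method definitions in Python source text.

    Assumption: one unit equals one `def` or `async def`, including
    methods and nested functions. The report's own rule may differ.
    """
    tree = ast.parse(source)
    return sum(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        for node in ast.walk(tree)
    )


if __name__ == "__main__":
    sample = (
        "def load(path):\n"
        "    return path\n"
        "\n"
        "class Reader:\n"
        "    def read(self):\n"
        "        return None\n"
    )
    print(count_units(sample))
```

Under this reading, a file that contains only imports or constants has no units, which would fit the `-` entries shown for the `__init__.py` files.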
Files With Long Lines (Top 2)

There are 2 files with lines longer than 120 characters. In total, there are 2 long lines.

File | # lines | # units | # long lines
loader.py in src/chug/wds | 99 | 1 | 1
doc_vqa_processor.py in src/chug/doc | 90 | 3 | 1
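A long-line count like the one above can be approximated by comparing each line's length against the 120-character threshold. This is a minimal sketch; the function name `count_long_lines` is hypothetical, and how the report treats tabs or trailing whitespace is not stated here, so results may differ slightly.

```python
def count_long_lines(text: str, limit: int = 120) -> int:
    """Return how many lines in `text` are longer than `limit` characters.

    Line terminators are not counted toward the length.
    """
    return sum(1 for line in text.splitlines() if len(line) > limit)


if __name__ == "__main__":
    sample = "short line\n" + "x" * 121 + "\n" + "y" * 120 + "\n"
    # Only the 121-character line exceeds the limit.
    print(count_long_lines(sample))
```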