huggingface / xet-core
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
19% | 17% | 36% | 12% | 14%
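As a rough sketch of how a distribution like the one above could be computed: the five bucket boundaries come from the report, but the weighting is an assumption. The report does not say whether each percentage is a share of lines of code or a share of file count. The code below assumes a share of lines of code, and the names `bucket_for` and `size_distribution` are hypothetical.

```python
# Bucket boundaries as listed in the report (lines of code per file).
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_for(loc):
    """Return the bucket label for a file with `loc` lines, or None if empty."""
    for label, low, high in BUCKETS:
        if loc >= low and (high is None or loc <= high):
            return label
    return None


def size_distribution(file_locs):
    """Share (in percent) of total lines of code falling into each bucket.

    Assumption: buckets are weighted by lines of code, not by file count.
    """
    totals = {label: 0 for label, _, _ in BUCKETS}
    for loc in file_locs:
        label = bucket_for(loc)
        if label is not None:
            totals[label] += loc
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: 100.0 * t / grand_total for label, t in totals.items()}
```

Calling `size_distribution` with one line count per file returns a mapping from bucket label to percentage, which can then be printed in the same order as the report.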


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
rs | 19% | 17% | 37% | 12% | 12%
toml | 0% | 0% | 0% | 14% | 85%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
cas_object | 50% | 0% | 30% | 12% | 6%
mdb_shard | 25% | 28% | 33% | 9% | 3%
cas_client | 32% | 33% | 11% | 16% | 5%
chunk_cache | 74% | 0% | 0% | 11% | 13%
deduplication | 0% | 56% | 20% | 0% | 23%
progress_tracking | 0% | 44% | 41% | 0% | 13%
utils | 0% | 41% | 20% | 17% | 21%
merkledb | 0% | 0% | 84% | 3% | 12%
data | 0% | 0% | 55% | 7% | 36%
hf_xet | 0% | 0% | 64% | 10% | 24%
file_utils | 0% | 0% | 74% | 22% | 3%
merklehash | 0% | 0% | 94% | 0% | 5%
parutils | 0% | 0% | 92% | 0% | 7%
hf_xet_wasm | 0% | 0% | 0% | 34% | 65%
cas_types | 0% | 0% | 0% | 68% | 31%
chunk_cache_bench | 0% | 0% | 0% | 64% | 35%
xet_threadpool | 0% | 0% | 0% | 79% | 20%
error_printer | 0% | 0% | 0% | 92% | 7%
ROOT | 0% | 0% | 0% | 71% | 28%
Longest Files (Top 50)
File | Folder | # lines | # units
cas_object_format.rs | cas_object/src | 1617 | 32
shard_format.rs | mdb_shard/src | 1275 | 10
remote_client.rs | cas_client/src | 1095 | 7
disk.rs | chunk_cache/src | 1079 | 20
shard_file_manager.rs | mdb_shard/src | 894 | 2
chunking.rs | deduplication/src | 764 | 21
upload_tracking.rs | progress_tracking/src | 612 | 8
download_utils.rs | cas_client/src | 604 | 2
singleflight.rs | utils/src | 592 | 11
file_structs.rs | mdb_shard/src | 524 | 2
local_client.rs | cas_client/src | 520 | 5
set_operations.rs | mdb_shard/src | 497 | 13
[missing] | [missing] | 437 | 2
internal_methods.rs | merkledb/src | 417 | 1
http_client.rs | cas_client/src | 399 | 9
shard_file_handle.rs | mdb_shard/src | 397 | 4
[missing] | [missing] | 394 | -
streaming_shard.rs | mdb_shard/src | 372 | 1
data_hash.rs | merklehash/src | 369 | 33
tests.rs | merkledb/src | 363 | 10
aggregator.rs | progress_tracking/src | 360 | 3
merklememdb.rs | merkledb/src | 357 | 13
data_client.rs | data/src | 342 | 4
merklenode.rs | merkledb/src | 314 | 8
rw_task_lock.rs | utils/src | 291 | 1
merkledb_debug.rs | merkledb/src | 280 | 11
file_deduplication.rs | deduplication/src | 270 | 4
log_buffer.rs | hf_xet/src | 266 | 7
privilege_context.rs | file_utils/src | 262 | 6
cas_chunk_format.rs | cas_object/src | 261 | 8
bg4.rs | cas_object/src/byte_grouping | 254 | 2
bg4_prediction.rs | cas_object/src/byte_grouping | 251 | 11
rolling_hash_benchmark.rs | merkledb/benches | 245 | 22
safe_file_creator.rs | file_utils/src | 239 | 12
parallel_utils.rs | parutils/src | 224 | -
verification_wrapper.rs | progress_tracking/src | 218 | -
chunk_iterator.rs | merkledb/src | 215 | 3
[missing] | [missing] | 210 | -
lib.rs | hf_xet/src | 210 | 14
collect_compression_stats.rs | cas_object/src/byte_grouping/compression_stats | 209 | 7
cas_structs.rs | mdb_shard/src | 207 | -
progress_update.rs | hf_xet/src | 204 | 1
shard_in_memory.rs | mdb_shard/src | 201 | -
lib.rs | cas_types/src | 198 | 9
interpolation_search.rs | mdb_shard/src | 186 | 8
retry_utils.rs | cas_client/src | 181 | 2
cache_bench.rs | chunk_cache_bench/benches | 169 | 6
cache_item.rs | chunk_cache/src/disk | 162 | 12
upload_progress_stream.rs | cas_client/src | 158 | 5
shard_benchmark.rs | mdb_shard/src | 157 | 2
Files With Most Units (Top 50)
File | Folder | # lines | # units
data_hash.rs | merklehash/src | 369 | 33
cas_object_format.rs | cas_object/src | 1617 | 32
rolling_hash_benchmark.rs | merkledb/benches | 245 | 22
chunking.rs | deduplication/src | 764 | 21
disk.rs | chunk_cache/src | 1079 | 20
lib.rs | hf_xet/src | 210 | 14
merklememdb.rs | merkledb/src | 357 | 13
set_operations.rs | mdb_shard/src | 497 | 13
cache_item.rs | chunk_cache/src/disk | 162 | 12
safe_file_creator.rs | file_utils/src | 239 | 12
merkledb_debug.rs | merkledb/src | 280 | 11
singleflight.rs | utils/src | 592 | 11
lib.rs | error_printer/src | 111 | 11
bg4_prediction.rs | cas_object/src/byte_grouping | 251 | 11
tests.rs | merkledb/src | 363 | 10
shard_format.rs | mdb_shard/src | 1275 | 10
key.rs | cas_types/src | 73 | 9
lib.rs | cas_types/src | 198 | 9
http_client.rs | cas_client/src | 399 | 9
bg_split_regroup_bench.rs | cas_object/benches | 117 | 9
merklenode.rs | merkledb/src | 314 | 8
upload_tracking.rs | progress_tracking/src | 612 | 8
error.rs | cas_client/src | 106 | 8
cas_chunk_format.rs | cas_object/src | 261 | 8
interpolation_search.rs | mdb_shard/src | 186 | 8
remote_client.rs | cas_client/src | 1095 | 7
log_buffer.rs | hf_xet/src | 266 | 7
collect_compression_stats.rs | cas_object/src/byte_grouping/compression_stats | 209 | 7
cache_bench.rs | chunk_cache_bench/benches | 169 | 6
privilege_context.rs | file_utils/src | 262 | 6
upload_progress_stream.rs | cas_client/src | 158 | 5
local_client.rs | cas_client/src | 520 | 5
runtime.rs | hf_xet/src | 113 | 5
[missing] | [missing] | 18 | 4
[missing] | [missing] | 107 | 4
data_client.rs | data/src | 342 | 4
file_metadata.rs | file_utils/src | 151 | 4
file_deduplication.rs | deduplication/src | 270 | 4
compression_scheme.rs | cas_object/src | 61 | 4
shard_file_handle.rs | mdb_shard/src | 397 | 4
chunk_iterator.rs | merkledb/src | 215 | 3
lib.rs | chunk_cache_bench/src | 21 | 3
[missing] | [missing] | 124 | 3
aggregator.rs | progress_tracking/src | 360 | 3
output_provider.rs | cas_client/src | 70 | 3
deserialize_async.rs | cas_object/src/cas_chunk_format | 149 | 3
error.rs | cas_object/src | 46 | 3
[missing] | [missing] | 93 | 2
[missing] | [missing] | 437 | 2
solid_cache.rs | chunk_cache_bench/src | 26 | 2
Files With Long Lines (Top 13)

There are 13 files with lines longer than 120 characters. In total, there are 27 long lines.
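
A minimal sketch of how such a count could be reproduced, assuming that a "long line" is any line with more than 120 characters, excluding the line terminator. The report does not state how tabs or multi-byte characters are measured, so plain character length is used here. The function names are hypothetical.

```python
def count_long_lines(text, limit=120):
    """Count lines in `text` that are longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(files, limit=120):
    """Summarize long lines across files.

    `files` maps a path to that file's text. Returns a pair:
    a list of (path, count) for files with at least one long line,
    sorted by count descending then path, and the total long-line count.
    """
    counts = {path: count_long_lines(text, limit) for path, text in files.items()}
    offenders = sorted(
        ((path, n) for path, n in counts.items() if n > 0),
        key=lambda item: (-item[1], item[0]),
    )
    total = sum(n for _, n in offenders)
    return offenders, total
```

The length of the returned list corresponds to the number of files with long lines, and the second value to the total number of long lines.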

File | Folder | # lines | # units | # long lines
remote_client.rs | cas_client/src | 1095 | 7 | 5
privilege_context.rs | file_utils/src | 262 | 6 | 4
data_client.rs | data/src | 342 | 4 | 3
lib.rs | hf_xet/src | 210 | 14 | 3
file_cleaner.rs | data/src | 129 | - | 2
[missing] | [missing] | 124 | 3 | 2
Cargo.toml | cas_client | 64 | - | 2
[missing] | [missing] | 437 | 2 | 1
migrate.rs | data/src/migration_tool | 75 | - | 1
singleflight.rs | utils/src | 592 | 11 | 1
compression_bench.rs | cas_object/benches | 121 | 1 | 1
shard_file_handle.rs | mdb_shard/src | 397 | 4 | 1
shard_file_manager.rs | mdb_shard/src | 894 | 2 | 1