huggingface / data-measurements-tool
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 0% | 59% | 25% | 14%
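The report does not say how these percentages are derived. One plausible reading, which matches the figures above, is that each value is the share of all lines of code sitting in files of that size range, rounded down. The sketch below is a minimal illustration under that assumption. The names `BUCKETS`, `bucket_shares`, and `LINE_COUNTS` are hypothetical, and the line counts are copied from the "Longest Files" table in this report.

```python
# Size buckets as (label, lower bound, upper bound), largest first,
# in the same order as the legend of this report.
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_shares(line_counts):
    """Return, per bucket, the percentage of all lines in that bucket (rounded down)."""
    total = sum(line_counts)
    shares = {}
    for label, low, high in BUCKETS:
        in_bucket = sum(
            n for n in line_counts
            if n >= low and (high is None or n <= high)
        )
        # Assumption: the report truncates rather than rounds to nearest.
        shares[label] = in_bucket * 100 // total if total else 0
    return shares


# Line counts of the 34 files listed in the "Longest Files" table.
LINE_COUNTS = [
    410, 380, 363, 311, 275, 269,
    167, 165, 156, 127, 125, 119,
    82, 74, 57, 44, 39, 39, 38, 36, 28, 24, 17, 8, 4,
    1, 1, 1, 1, 1, 1, 1, 1, 1,
]

print(bucket_shares(LINE_COUNTS))
# {'1001+': 0, '501-1000': 0, '201-500': 59, '101-200': 25, '1-100': 14}
```

Because each share is rounded down, the five values add up to 98% rather than 100%.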


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 0% | 59% | 25% | 14%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
data_measurements | 0% | 0% | 65% | 26% | 8%
utils | 0% | 0% | 96% | 0% | 3%
ROOT | 0% | 0% | 63% | 36% | 0%
widgets | 0% | 0% | 0% | 27% | 72%
npmi | 0% | 0% | 0% | 96% | 3%
lengths | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 34)
File | Location | # lines | # units
embeddings.py | data_measurements/embeddings | 410 | 13
dataset_statistics.py | data_measurements | 380 | 23
(not shown) | (not shown) | 363 | 15
npmi.py | data_measurements/npmi | 311 | 23
(not shown) | (not shown) | 275 | 5
(not shown) | (not shown) | 269 | 21
labels.py | data_measurements/labels | 167 | 12
zipf.py | data_measurements/zipf | 165 | 12
app.py | root | 156 | 9
npmi.py | widgets | 127 | 8
npmi.py | npmi | 125 | 7
lengths.py | data_measurements/lengths | 119 | 10
zipf.py | widgets | 82 | 5
text_lengths.py | widgets | 74 | 6
text_duplicates.py | data_measurements/text_duplicates | 57 | 6
(not shown) | (not shown) | 44 | 5
duplicates.py | widgets | 39 | 5
(not shown) | (not shown) | 39 | 5
tokenize.py | data_measurements | 38 | 4
perplexity.py | data_measurements/perplexity | 36 | 4
(not shown) | (not shown) | 28 | 5
__init__.py | utils | 24 | 1
widget_base.py | widgets | 17 | 4
__init__.py | widgets | 8 | -
app.py | npmi | 4 | -
__init__.py | data_measurements/embeddings | 1 | -
__init__.py | data_measurements/text_duplicates | 1 | -
__init__.py | data_measurements/labels | 1 | -
__init__.py | data_measurements/perplexity | 1 | -
__init__.py | data_measurements/zipf | 1 | -
__init__.py | data_measurements | 1 | -
__init__.py | data_measurements/lengths | 1 | -
__init__.py | data_measurements/npmi | 1 | -
__init__.py | lengths | 1 | -
Files With Most Units (Top 23)
File | Location | # lines | # units
dataset_statistics.py | data_measurements | 380 | 23
npmi.py | data_measurements/npmi | 311 | 23
(not shown) | (not shown) | 269 | 21
(not shown) | (not shown) | 363 | 15
embeddings.py | data_measurements/embeddings | 410 | 13
labels.py | data_measurements/labels | 167 | 12
zipf.py | data_measurements/zipf | 165 | 12
lengths.py | data_measurements/lengths | 119 | 10
app.py | root | 156 | 9
npmi.py | widgets | 127 | 8
npmi.py | npmi | 125 | 7
text_lengths.py | widgets | 74 | 6
text_duplicates.py | data_measurements/text_duplicates | 57 | 6
duplicates.py | widgets | 39 | 5
(not shown) | (not shown) | 39 | 5
zipf.py | widgets | 82 | 5
(not shown) | (not shown) | 28 | 5
(not shown) | (not shown) | 44 | 5
(not shown) | (not shown) | 275 | 5
widget_base.py | widgets | 17 | 4
perplexity.py | data_measurements/perplexity | 36 | 4
tokenize.py | data_measurements | 38 | 4
__init__.py | utils | 24 | 1
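The report does not define what a "unit" is. Assuming a unit corresponds to a function or method definition (an assumption, not something stated here), a count of this kind could be approximated for a Python file with the standard `ast` module. The helper name `count_units` is hypothetical, and the result may not match the report's own counting rules.

```python
import ast


def count_units(source):
    """Count function and method definitions in Python source code.

    Assumption: one "unit" equals one def or async def, at any nesting level.
    """
    tree = ast.parse(source)
    return sum(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        for node in ast.walk(tree)
    )


sample = '''
def load(path):
    return path

class Widget:
    def render(self):
        return None

    async def refresh(self):
        return None
'''

print(count_units(sample))  # 3
```

Under this reading, a file containing only imports or constants would have no units, which would be consistent with the `-` entries shown for several `__init__.py` files.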
Files With Long Lines (Top 7)

There are 7 files with lines longer than 120 characters. In total, there are 21 long lines.
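As a rough illustration of the check described above, the sketch below counts the lines in a piece of text that exceed 120 characters. The function name `count_long_lines` is hypothetical, and details such as how tabs or trailing whitespace are measured are assumptions. The report does not say how it handles them.

```python
def count_long_lines(text, limit=120):
    """Count the lines in text that are longer than limit characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


# One short line, one line of 121 characters, one line of exactly 120.
sample = "short line\n" + "x" * 121 + "\n" + "y" * 120 + "\n"

print(count_long_lines(sample))  # 1
```

Only lines strictly longer than the limit are counted, so a line of exactly 120 characters does not qualify.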

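File | Location | # lines | # units | # long lines
(not shown) | (not shown) | 363 | 15 | 6
(not shown) | (not shown) | 275 | 5 | 5
app.py | root | 156 | 9 | 4
zipf.py | widgets | 82 | 5 | 3
(not shown) | (not shown) | 44 | 5 | 1
__init__.py | utils | 24 | 1 | 1
npmi.py | npmi | 125 | 7 | 1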