facebookresearch / DPR
File Size

The distribution of size of files (measured in lines of code).

Intro
  • File size measurements show the distribution of size of files.
  • Files are classified in five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
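The size categories above can be sketched as a small lookup. This is a minimal illustration that assumes the bucket boundaries are inclusive exactly as listed; the function name `classify_file_size` and the label strings are hypothetical and not part of the report tool.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets, as listed in the report.
# Anything above the last bound falls into the "very long" bucket.
_UPPER_BOUNDS = [100, 200, 500, 1000]
_LABELS = ["very small", "small", "medium", "long", "very long"]


def classify_file_size(lines_of_code: int) -> str:
    """Return the size category for a file with the given lines of code."""
    if lines_of_code < 1:
        raise ValueError("lines_of_code must be at least 1")
    # bisect_left returns the index of the first bound >= lines_of_code,
    # so a file with exactly 100 lines stays in the "very small" bucket.
    return _LABELS[bisect_left(_UPPER_BOUNDS, lines_of_code)]


if __name__ == "__main__":
    for loc in (1, 100, 101, 477, 605, 1001):
        print(loc, classify_file_size(loc))
```

For example, train_dense_encoder.py (605 lines) falls in the "long" bucket and tables.py (477 lines) in the "medium" bucket.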
File Size Overall
  • There are 39 files with 5,728 lines of code.
    • 0 very long files (0 lines of code)
    • 1 long file (605 lines of code)
    • 10 medium size files (3,671 lines of code)
    • 6 small files (839 lines of code)
    • 22 very small files (613 lines of code)
Category: 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Share of lines of code: 0% | 10% | 64% | 14% | 10%
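The percentages follow from the line counts per category. The sketch below recomputes them from the totals listed in the overall summary. It assumes the shares are truncated to whole percents rather than rounded, which is an inference from the numbers (they sum to 98%) and not something the report states. The name `category_shares` is hypothetical.

```python
# Lines of code per size category, taken from the overall summary.
LINES_PER_CATEGORY = {
    "1001+": 0,
    "501-1000": 605,
    "201-500": 3671,
    "101-200": 839,
    "1-100": 613,
}


def category_shares(lines_per_category: dict) -> dict:
    """Return each category's share of total lines as a whole percent.

    Assumption: shares are truncated (floor), not rounded.
    """
    total = sum(lines_per_category.values())
    if total == 0:
        raise ValueError("no lines of code to distribute")
    return {name: 100 * count // total
            for name, count in lines_per_category.items()}


if __name__ == "__main__":
    print(category_shares(LINES_PER_CATEGORY))
```

Truncation explains why the two 10% entries correspond to different line counts (605 and 613).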


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 11% | 67% | 15% | 6%
yaml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
ROOT | 0% | 39% | 50% | 7% | 2%
dpr/data | 0% | 0% | 94% | 5% | <1%
dpr/models | 0% | 0% | 60% | 19% | 20%
dpr/utils | 0% | 0% | 38% | 47% | 13%
dpr/indexer | 0% | 0% | 0% | 100% | 0%
conf | 0% | 0% | 0% | 0% | 100%
conf/train | 0% | 0% | 0% | 0% | 100%
dpr | 0% | 0% | 0% | 0% | 100%
conf/datasets | 0% | 0% | 0% | 0% | 100%
conf/encoder | 0% | 0% | 0% | 0% | 100%
conf/ctx_sources | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 39)
File | Location | # lines | # units
train_dense_encoder.py | root | 605 | 12
tables.py | dpr/data | 477 | 40
download_data.py | dpr/data | 469 | 5
biencoder_data.py | dpr/data | 461 | 50
train_extractive_reader.py | root | 458 | 12
reader_data.py | dpr/data | 451 | 25
dense_retriever.py | root | 322 | 11
biencoder.py | dpr/models | 303 | 13
retriever_data.py | dpr/data | 287 | 21
hf_models.py | dpr/models | 225 | 22
data_utils.py | dpr/utils | 218 | 27
reader.py | dpr/models | 169 | 10
faiss_indexers.py | dpr/indexer | 161 | 26
tokenizers.py | dpr/utils | 148 | 18
qa_validation.py | dpr/data | 130 | 8
model_utils.py | dpr/utils | 120 | 9
generate_dense_embeddings.py | root | 111 | 2
pytext_models.py | dpr/models | 91 | 7
options.py | dpr | 65 | 5
dist_utils.py | dpr/utils | 56 | 5
__init__.py | dpr/models | 56 | 10
dense_retriever.yaml | conf | 36 | -
fairseq_models.py | dpr/models | 34 | 6
setup.py | root | 34 | -
encoder_train_default.yaml | conf/datasets | 33 | -
extractive_reader_train_cfg.yaml | conf | 30 | -
biencoder_train_cfg.yaml | conf | 28 | -
retriever_default.yaml | conf/datasets | 24 | -
gen_embs.yaml | conf | 23 | -
biencoder_local.yaml | conf/train | 19 | -
biencoder_nq.yaml | conf/train | 19 | -
biencoder_default.yaml | conf/train | 19 | -
conf_utils.py | dpr/utils | 18 | 1
extractive_reader_default.yaml | conf/train | 13 | -
hf_bert.yaml | conf/encoder | 8 | -
default_sources.yaml | conf/ctx_sources | 4 | -
__init__.py | dpr | 1 | -
__init__.py | dpr/utils | 1 | -
__init__.py | dpr/data | 1 | -
Files With Most Units (Top 20)
File | Location | # lines | # units
biencoder_data.py | dpr/data | 461 | 50
tables.py | dpr/data | 477 | 40
data_utils.py | dpr/utils | 218 | 27
faiss_indexers.py | dpr/indexer | 161 | 26
reader_data.py | dpr/data | 451 | 25
hf_models.py | dpr/models | 225 | 22
retriever_data.py | dpr/data | 287 | 21
tokenizers.py | dpr/utils | 148 | 18
biencoder.py | dpr/models | 303 | 13
train_dense_encoder.py | root | 605 | 12
train_extractive_reader.py | root | 458 | 12
dense_retriever.py | root | 322 | 11
__init__.py | dpr/models | 56 | 10
reader.py | dpr/models | 169 | 10
model_utils.py | dpr/utils | 120 | 9
qa_validation.py | dpr/data | 130 | 8
pytext_models.py | dpr/models | 91 | 7
fairseq_models.py | dpr/models | 34 | 6
options.py | dpr | 65 | 5
dist_utils.py | dpr/utils | 56 | 5
Files With Long Lines (Top 1)

There is only one file with lines longer than 120 characters. In total, there are 5 long lines.

File | Location | # lines | # units | # long lines
download_data.py | dpr/data | 469 | 5 | 5
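
The long-line count can be approximated with a few lines of Python. This is a rough sketch, not the report tool's implementation: it assumes a line counts as "long" when it has more than 120 characters excluding the line terminator, and the name `count_long_lines` is hypothetical. How the tool treats trailing whitespace or tabs is not stated in the report.

```python
def count_long_lines(text: str, limit: int = 120) -> int:
    """Count lines in `text` that are longer than `limit` characters.

    Line terminators are not counted toward the length (an assumption).
    """
    return sum(1 for line in text.splitlines() if len(line) > limit)


if __name__ == "__main__":
    sample = "x" * 120 + "\n" + "y" * 121 + "\nshort line\n"
    # Only the 121-character line exceeds the limit.
    print(count_long_lines(sample))
```

Applied to the contents of a source file read with `open(path, encoding="utf-8").read()`, this gives a per-file figure that can be compared against the table above.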