facebookresearch / multihop_dense_retrieval
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show how file sizes are distributed across the codebase.
  • Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
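The category boundaries listed above can be expressed as a small lookup. The following is a minimal sketch, not the report tool's own code; the function name `classify_file_size` is hypothetical and only the thresholds come from this report.

```python
def classify_file_size(line_count: int) -> str:
    """Return the size category for a file with the given lines of code.

    Thresholds follow the five categories described in this report.
    The function itself is an illustrative assumption.
    """
    if line_count < 1:
        raise ValueError("line_count must be at least 1")
    if line_count <= 100:
        return "very small"
    if line_count <= 200:
        return "small"
    if line_count <= 500:
        return "medium"
    if line_count <= 1000:
        return "long"
    return "very long"


# Line counts taken from the "Longest Files" table in this report.
for name, lines in [("train_qa.py", 404), ("train_momentum.py", 198), ("mhop_dataset.py", 91)]:
    print(name, classify_file_size(lines))
```

With the line counts from this report, `train_qa.py` (404 lines) falls in the medium category, `train_momentum.py` (198 lines) in small, and `mhop_dataset.py` (91 lines) in very small.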
File Size Overall
  • There are 49 files with 6,563 lines of code.
    • 0 very long files (0 lines of code)
    • 0 long files (0 lines of code)
    • 11 medium size files (3,006 lines of code)
    • 17 small files (2,503 lines of code)
    • 21 very small files (1,054 lines of code)
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 0% | 45% | 38% | 16%
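The percentages above are shares of total lines of code, not of file counts. The sketch below recomputes them from the per-category line totals reported in this section. It assumes the shares are truncated to whole percentages, which matches the published figures (3,006 of 6,563 lines is about 45.8%, shown as 45%); the function name `size_distribution` is hypothetical.

```python
def size_distribution(lines_per_category: dict) -> dict:
    """Return each category's share of total lines as a truncated whole percentage."""
    total = sum(lines_per_category.values())
    if total == 0:
        return {category: 0 for category in lines_per_category}
    # Integer division truncates, which is an assumption about how the report rounds.
    return {category: 100 * lines // total for category, lines in lines_per_category.items()}


# Per-category line totals from the "File Size Overall" list in this report.
overall = {
    "1001+": 0,
    "501-1000": 0,
    "201-500": 3006,
    "101-200": 2503,
    "1-100": 1054,
}
distribution = size_distribution(overall)
print(" | ".join(f"{share}%" for share in distribution.values()))
```

Because each share is truncated, the displayed percentages can sum to less than 100, as they do here (99).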
File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 0% | 45% | 38% | 16%
File Size per Logical Decomposition
Component (primary) | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
mdr/qa | 0% | 0% | 59% | 29% | 10%
mdr/retrieval | 0% | 0% | 64% | 21% | 13%
mdr/retrieval/data | 0% | 0% | 55% | 18% | 25%
scripts | 0% | 0% | 35% | 57% | 6%
scripts/eval | 0% | 0% | 35% | 43% | 20%
mdr/retrieval/utils | 0% | 0% | 32% | 66% | 1%
mdr/retrieval/models | 0% | 0% | 0% | 71% | 28%
submitit | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
mdr | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 49)
File | Location | # lines | # units
train_qa.py | scripts | 404 | 4
qa_dataset.py | mdr/qa | 356 | 12
qa_trainer.py | mdr/qa | 338 | 10
unified_dataset.py | mdr/retrieval/data | 287 | 20
utils.py | mdr/qa | 280 | 23
sp_datasets.py | mdr/retrieval/data | 248 | 18
single_trainer.py | mdr/retrieval | 234 | 10
mhop_trainer.py | mdr/retrieval | 222 | 10
eval_mhop_retrieval.py | scripts/eval | 220 | 1
basic_tokenizer.py | mdr/retrieval/utils | 214 | 21
train_single.py | mdr/retrieval | 203 | 2
train_momentum.py | scripts | 198 | 2
train_ranker.py | mdr/qa | 196 | 3
mhop_utils.py | mdr/retrieval/utils | 191 | 10
train_mhop.py | scripts | 190 | 2
data_utils.py | mdr/retrieval/data | 181 | 8
basic_tokenizer.py | mdr/qa | 175 | 18
eval_retrieval.py | scripts/eval | 142 | 3
demo.py | scripts | 138 | 2
unified_retriever.py | mdr/retrieval/models | 134 | 18
retriever.py | mdr/retrieval/models | 131 | 16
end2end.py | scripts | 131 | 1
eval_mhop_fever.py | scripts/eval | 125 | -
utils.py | mdr/retrieval/utils | 124 | 15
decomposed_analysis.py | mdr/retrieval | 119 | 4
tokenizer.py | mdr/retrieval/utils | 118 | 12
hotpot_evaluate_v1.py | mdr/qa | 107 | 6
criterions.py | mdr/retrieval | 103 | 4
mhop_dataset.py | mdr/retrieval/data | 91 | 5
eval_single_fever.py | scripts/eval | 90 | -
submitit_train_qa.py | submitit | 88 | 4
config.py | mdr/retrieval | 88 | 3
encode_datasets.py | mdr/retrieval/data | 85 | 6
mhop_retriever.py | mdr/retrieval/models | 82 | 9
qa_model.py | mdr/qa | 81 | 4
submitit_train.py | submitit | 79 | 4
encode_corpus.py | scripts | 75 | 2
config.py | mdr/qa | 73 | 2
fever_dataset.py | mdr/retrieval/data | 64 | 6
interactive_retrieval.py | mdr/retrieval | 50 | -
eval_reranked.py | scripts/eval | 36 | -
hop1_retriever.py | mdr/retrieval/models | 24 | 2
data_utils.py | mdr/qa | 17 | 2
setup.py | root | 16 | -
gen_index_id_map.py | mdr/retrieval/utils | 8 | -
__init__.py | mdr/retrieval | 3 | -
__init__.py | mdr | 2 | -
__init__.py | mdr/qa | 1 | -
__init__.py | mdr/retrieval/data | 1 | -
Files With Most Units (Top 20)
File | Location | # lines | # units
utils.py | mdr/qa | 280 | 23
basic_tokenizer.py | mdr/retrieval/utils | 214 | 21
unified_dataset.py | mdr/retrieval/data | 287 | 20
basic_tokenizer.py | mdr/qa | 175 | 18
unified_retriever.py | mdr/retrieval/models | 134 | 18
sp_datasets.py | mdr/retrieval/data | 248 | 18
retriever.py | mdr/retrieval/models | 131 | 16
utils.py | mdr/retrieval/utils | 124 | 15
qa_dataset.py | mdr/qa | 356 | 12
tokenizer.py | mdr/retrieval/utils | 118 | 12
qa_trainer.py | mdr/qa | 338 | 10
mhop_utils.py | mdr/retrieval/utils | 191 | 10
single_trainer.py | mdr/retrieval | 234 | 10
mhop_trainer.py | mdr/retrieval | 222 | 10
mhop_retriever.py | mdr/retrieval/models | 82 | 9
data_utils.py | mdr/retrieval/data | 181 | 8
hotpot_evaluate_v1.py | mdr/qa | 107 | 6
encode_datasets.py | mdr/retrieval/data | 85 | 6
fever_dataset.py | mdr/retrieval/data | 64 | 6
mhop_dataset.py | mdr/retrieval/data | 91 | 5
Files With Long Lines (Top 20)

There are 32 files with lines longer than 120 characters. In total, there are 135 long lines.
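A long-line count like the one above can be approximated with a few lines of Python. This is a minimal sketch under stated assumptions: the 120-character threshold comes from this report, but the helper name `count_long_lines` is hypothetical, and measuring length in characters without the trailing newline is an assumption about how the report tool counts.

```python
def count_long_lines(source: str, max_length: int = 120) -> int:
    """Count lines in `source` that are longer than `max_length` characters."""
    # splitlines() drops the line terminators, so only visible content is measured.
    return sum(1 for line in source.splitlines() if len(line) > max_length)


# A tiny synthetic example: one line of 121 characters, one of exactly 120, one of 200.
sample = "\n".join(["a" * 121, "b" * 120, "c" * 200])
long_line_count = count_long_lines(sample)
print(long_line_count)
```

A line of exactly 120 characters is not counted, because the report describes lines longer than 120 characters.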

File | Location | # lines | # units | # long lines
sp_datasets.py | mdr/retrieval/data | 248 | 18 | 11
qa_trainer.py | mdr/qa | 338 | 10 | 10
train_qa.py | scripts | 404 | 4 | 10
end2end.py | scripts | 131 | 1 | 8
qa_dataset.py | mdr/qa | 356 | 12 | 7
unified_retriever.py | mdr/retrieval/models | 134 | 18 | 6
data_utils.py | mdr/retrieval/data | 181 | 8 | 6
unified_dataset.py | mdr/retrieval/data | 287 | 20 | 6
train_momentum.py | scripts | 198 | 2 | 6
retriever.py | mdr/retrieval/models | 131 | 16 | 5
mhop_trainer.py | mdr/retrieval | 222 | 10 | 5
train_mhop.py | scripts | 190 | 2 | 5
eval_mhop_retrieval.py | scripts/eval | 220 | 1 | 5
eval_mhop_fever.py | scripts/eval | 125 | - | 5
train_ranker.py | mdr/qa | 196 | 3 | 4
train_single.py | mdr/retrieval | 203 | 2 | 4
mhop_utils.py | mdr/retrieval/utils | 191 | 10 | 4
single_trainer.py | mdr/retrieval | 234 | 10 | 4
demo.py | scripts | 138 | 2 | 4
interactive_retrieval.py | mdr/retrieval | 50 | - | 3