facebookresearch / BLINK
File Size

The distribution of size of files (measured in lines of code).

Intro
  • File size measurements show how file sizes are distributed.
  • Files are classified in five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown so large that it is hard to work with.
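The size categories above can be expressed as a small lookup. The following is a minimal sketch in Python, not the report tool's own implementation; the function name `size_category` is hypothetical, and it assumes the line count of the file has already been measured.

```python
def size_category(line_count: int) -> str:
    """Map a file's line count to one of the report's five size categories."""
    if line_count < 1:
        raise ValueError("line_count must be at least 1")
    # Upper bounds are inclusive, matching the ranges 1-100, 101-200, 201-500, 501-1000.
    for upper_bound, label in (
        (100, "very small"),
        (200, "small"),
        (500, "medium size"),
        (1000, "long"),
    ):
        if line_count <= upper_bound:
            return label
    # Anything above 1000 lines falls in the 1001+ category.
    return "very long"
```

Under this sketch, a 764-line file is classified as "long" and a 101-line file as "small".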
File Size Overall
  • There are 60 files with 9,530 lines of code.
    • 0 very long files (0 lines of code)
    • 3 long files (2,051 lines of code)
    • 12 medium size files (3,770 lines of code)
    • 15 small files (2,121 lines of code)
    • 30 very small files (1,588 lines of code)
Share of lines of code per size category:
1001+: 0% | 501-1000: 21% | 201-500: 39% | 101-200: 22% | 1-100: 16%
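The percentages above are consistent with each category's share of the total lines of code (not of the file count), rounded down to a whole percent. The sketch below reproduces them from the line counts listed in this section; the function name `loc_shares` is hypothetical, and the rounding-down rule is an assumption inferred from the numbers, not something the report states.

```python
def loc_shares(loc_per_category: dict) -> dict:
    """Return each category's share of total lines of code as a whole percent."""
    total = sum(loc_per_category.values())
    # Integer division rounds down (assumed rounding rule, inferred from the figures).
    return {name: loc * 100 // total for name, loc in loc_per_category.items()}


# Lines of code per category, taken from the "File Size Overall" list.
blink_loc = {
    "1001+": 0,
    "501-1000": 2051,
    "201-500": 3770,
    "101-200": 2121,
    "1-100": 1588,
}
shares = loc_shares(blink_loc)
```

The four non-empty categories add up to the reported 9,530 lines of code, so the shares are taken over the whole codebase.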


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 0% | 21% | 39% | 22% | 16%
File Size per Logical Decomposition
Logical decomposition: primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
elq | 0% | 92% | 0% | 0% | 7%
elq/biencoder | 0% | 42% | 53% | 0% | 4%
blink | 0% | 48% | 0% | 34% | 16%
blink/candidate_ranking | 0% | 0% | 90% | 9% | 0%
blink/candidate_retrieval | 0% | 0% | 29% | 34% | 36%
blink/biencoder | 0% | 0% | 48% | 33% | 17%
elq/common | 0% | 0% | 88% | 0% | 11%
blink/crossencoder | 0% | 0% | 56% | 43% | 0%
blink/common | 0% | 0% | 70% | 0% | 29%
scripts | 0% | 0% | 0% | 95% | 4%
blink/indexer | 0% | 0% | 0% | 0% | 100%
elq/index | 0% | 0% | 0% | 0% | 100%
elq/candidate_ranking | 0% | 0% | 0% | 0% | 100%
elq/vcg_utils | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Location | # lines | # units
main_dense.py | elq | 764 | 12
biencoder.py | elq/biencoder | 734 | 28
main_dense.py | blink | 553 | 13
train_biencoder.py | elq/biencoder | 465 | 4
data_process.py | elq/biencoder | 460 | 9
bert_reranking.py | blink/candidate_ranking | 393 | 12
train.py | blink/candidate_ranking | 359 | 1
params.py | elq/common | 333 | 5
train_cross.py | blink/crossencoder | 306 | 5
dataset.py | blink/candidate_retrieval | 269 | 10
eval_biencoder.py | blink/biencoder | 249 | 7
evaluate.py | blink/candidate_ranking | 248 | 2
train_biencoder.py | blink/biencoder | 232 | 4
params.py | blink/common | 229 | 5
utils.py | blink/candidate_retrieval | 227 | 8
tune_hyperparams_new.py | scripts | 199 | 5
main_solr.py | blink | 189 | 1
create_BLINK_benchmark_data.py | scripts | 182 | 5
perform_and_evaluate_candidate_retrieval_multithreaded.py | blink/candidate_retrieval | 166 | 4
data_process.py | blink/biencoder | 164 | 4
biencoder.py | blink/biencoder | 162 | 13
data_ingestion.py | blink/candidate_retrieval | 137 | 3
candidate_generators.py | blink/candidate_retrieval | 133 | 8
enrich_data.py | blink/candidate_retrieval | 129 | -
crossencoder.py | blink/crossencoder | 122 | 12
generate_candidates.py | scripts | 120 | 2
data_process.py | blink/crossencoder | 109 | 4
utils.py | blink/candidate_ranking | 106 | 11
utils.py | blink | 102 | 11
candidate_generation.py | blink | 101 | 8
nn_prediction.py | blink/biencoder | 98 | 1
faiss_indexer.py | blink/indexer | 94 | 12
evaluator.py | blink/candidate_retrieval | 92 | 3
process_wiki_extractor_output_links.py | blink/candidate_retrieval | 87 | -
faiss_indexer.py | elq/index | 83 | 15
run_benchmark.py | blink | 78 | -
link_wikipedia_and_wikidata.py | blink/candidate_retrieval | 78 | -
zeshel_utils.py | blink/biencoder | 77 | 5
process_wikidata.py | blink/candidate_retrieval | 73 | -
json_data_generation.py | blink/candidate_retrieval | 70 | -
utils.py | elq/candidate_ranking | 68 | 7
optimizer.py | blink/common | 67 | 2
build_faiss_index.py | blink | 59 | 1
build_faiss_index.py | elq | 58 | 1
process_wiki_extractor_output.py | blink/candidate_retrieval | 57 | -
process_wiki_extractor_output_full.py | blink/candidate_retrieval | 56 | -
allennlp_span_utils.py | elq/biencoder | 53 | 5
measures.py | elq/vcg_utils | 48 | 1
process_intro_sents.py | blink/candidate_retrieval | 47 | -
ranker_base.py | elq/common | 45 | 3
Files With Most Units (Top 20)
File | Location | # lines | # units
biencoder.py | elq/biencoder | 734 | 28
faiss_indexer.py | elq/index | 83 | 15
biencoder.py | blink/biencoder | 162 | 13
main_dense.py | blink | 553 | 13
crossencoder.py | blink/crossencoder | 122 | 12
bert_reranking.py | blink/candidate_ranking | 393 | 12
faiss_indexer.py | blink/indexer | 94 | 12
main_dense.py | elq | 764 | 12
utils.py | blink/candidate_ranking | 106 | 11
utils.py | blink | 102 | 11
dataset.py | blink/candidate_retrieval | 269 | 10
data_process.py | elq/biencoder | 460 | 9
candidate_generation.py | blink | 101 | 8
utils.py | blink/candidate_retrieval | 227 | 8
candidate_generators.py | blink/candidate_retrieval | 133 | 8
eval_biencoder.py | blink/biencoder | 249 | 7
utils.py | elq/candidate_ranking | 68 | 7
train_cross.py | blink/crossencoder | 306 | 5
zeshel_utils.py | blink/biencoder | 77 | 5
params.py | blink/common | 229 | 5
Files With Long Lines (Top 10)

There are 10 files with lines longer than 120 characters. In total, there are 58 long lines.
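A long-line count of this kind can be approximated with a few lines of Python. This is a minimal sketch only: the function name `count_long_lines` is hypothetical, and it assumes length is measured in characters with the line terminator excluded, which may differ from how the report tool counts.

```python
def count_long_lines(text: str, max_length: int = 120) -> int:
    """Count the lines in `text` that are longer than `max_length` characters."""
    # splitlines() drops line terminators, so they do not count toward the length.
    return sum(1 for line in text.splitlines() if len(line) > max_length)


# A 121-character line is counted; a line of exactly 120 characters is not.
sample = "a" * 121 + "\n" + "b" * 120 + "\n"
long_line_count = count_long_lines(sample)
```

Applying such a function to each source file and keeping the files with a non-zero result would give a per-file listing like the table in this section.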

File | Location | # lines | # units | # long lines
main_dense.py | elq | 764 | 12 | 19
biencoder.py | elq/biencoder | 734 | 28 | 13
tune_hyperparams_new.py | scripts | 199 | 5 | 11
train_biencoder.py | elq/biencoder | 465 | 4 | 5
generate_candidates.py | scripts | 120 | 2 | 5
perform_and_evaluate_candidate_retrieval_multithreaded.py | blink/candidate_retrieval | 166 | 4 | 1
candidate_generators.py | blink/candidate_retrieval | 133 | 8 | 1
main_solr.py | blink | 189 | 1 | 1
utils.py | elq/biencoder | 18 | 1 | 1
merge_candidates.py | scripts | 22 | - | 1