microsoft / nlp-recipes
File Size

The distribution of file sizes (measured in lines of code).

Intro
  • File size measurements show how the code is distributed across files of different sizes.
  • Files are classified in five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
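The size categories above can be sketched as a small lookup. This is a minimal illustration, not the analysis tool's own code: the names `CATEGORIES` and `classify` are hypothetical, and how the tool counts a "line of code" (for example, whether blank lines or comments are excluded) is not stated in this report.

```python
from collections import Counter

# Inclusive upper bounds for each size category, as listed in the report.
# The names here are illustrative and not part of the analysis tool.
CATEGORIES = [
    (100, "very small"),
    (200, "small"),
    (500, "medium"),
    (1000, "long"),
]


def classify(lines_of_code: int) -> str:
    """Return the size category for a file with the given line count."""
    if lines_of_code < 1:
        raise ValueError("expected at least one line of code")
    for upper_bound, label in CATEGORIES:
        if lines_of_code <= upper_bound:
            return label
    return "very long"  # 1001+


# Line counts taken from the "Longest Files" table in this report.
sample = {
    "question_answering.py": 1067,
    "abstractive_summarization_seq2seq.py": 707,
    "utils.py": 455,
    "data_loader.py": 196,
    "pytorch_utils.py": 96,
}
histogram = Counter(classify(n) for n in sample.values())
print(histogram)
```

Keeping the bounds in an ordered list means a threshold can be changed in one place without touching the classification logic.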
File Size Overall
  • There are 108 files with 13,939 lines of code.
    • 1 very long file (1,067 lines of code)
    • 3 long files (1,745 lines of code)
    • 17 medium size files (4,836 lines of code)
    • 22 small files (3,363 lines of code)
    • 65 very small files (2,928 lines of code)
Share of lines of code per size category:
1001+: 7% | 501-1000: 12% | 201-500: 34% | 101-200: 24% | 1-100: 21%
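The percentages are each category's share of the 13,939 total lines of code. The sketch below recomputes them from the counts in the overall list. Truncating to whole percents is an assumption about how the report rounds, chosen because it reproduces the printed figures.

```python
# Lines of code per category, copied from the "File Size Overall" list.
lines_per_category = {
    "1001+": 1067,
    "501-1000": 1745,
    "201-500": 4836,
    "101-200": 3363,
    "1-100": 2928,
}

total = sum(lines_per_category.values())  # 13,939

# Integer division truncates the share to a whole percent. This rounding
# rule is an assumption; the report does not say how it rounds.
shares = {
    label: 100 * count // total
    for label, count in lines_per_category.items()
}

print(" | ".join(f"{share}%" for share in shares.values()))
# prints: 7% | 12% | 34% | 24% | 21%
```

Because each share is truncated, the printed percentages add up to 98% rather than 100%.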


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 8% | 13% | 31% | 23% | 22%
c | 0% | 0% | 68% | 31% | 0%
bash | 0% | 0% | 0% | 0% | 100%
sed | 0% | 0% | 0% | 0% | 100%
in | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
yml | 0% | 0% | 0% | 0% | 100%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
utils_nlp/models | 11% | 18% | 43% | 21% | 4%
utils_nlp/eval | 0% | 0% | 22% | 35% | 42%
utils_nlp/dataset | 0% | 0% | 17% | 18% | 64%
tools | 0% | 0% | 0% | 65% | 34%
utils_nlp/azureml | 0% | 0% | 0% | 55% | 44%
utils_nlp/common | 0% | 0% | 0% | 0% | 100%
utils_nlp/interpreter | 0% | 0% | 0% | 0% | 100%
utils_nlp/language_utils | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
utils_nlp | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 50)
File | Location | # lines | # units
question_answering.py | utils_nlp/models/transformers | 1067 | 23
abstractive_summarization_seq2seq.py | utils_nlp/models/transformers | 707 | 22
extractive_summarization.py | utils_nlp/models/transformers | 528 | 27
abstractive_summarization_bertsum.py | utils_nlp/models/transformers | 510 | 21
utils.py | utils_nlp/models/gensen | 455 | 13
cooccur.c | utils_nlp/models/glove/src | 411 | 15
glove.c | utils_nlp/models/glove/src | 374 | 7
datasets.py | utils_nlp/models/transformers | 349 | 26
model_builder.py | utils_nlp/models/transformers/bertsum | 313 | 15
rouge_ext.py | utils_nlp/eval/rouge | 312 | 11
gensen.py | utils_nlp/models/gensen | 292 | 12
named_entity_recognition.py | utils_nlp/models/transformers | 269 | 11
common.py | utils_nlp/models/transformers | 262 | 11
ranking.py | utils_nlp/eval/SentEval/senteval/tools | 259 | 12
neural.py | utils_nlp/models/transformers/bertsum | 232 | 15
cnndm.py | utils_nlp/dataset | 229 | 8
common.py | utils_nlp/models/bert | 226 | 13
sequence_classification.py | utils_nlp/models/xlnet | 217 | 3
predictor.py | utils_nlp/models/transformers/bertsum | 215 | 5
multi_task_model.py | utils_nlp/models/gensen | 214 | 6
sequence_classification_distributed.py | utils_nlp/models/bert | 207 | 6
data_loader.py | utils_nlp/models/transformers/bertsum | 196 | 18
sequence_classification.py | utils_nlp/models/transformers | 187 | 9
validation.py | utils_nlp/eval/SentEval/senteval/tools | 186 | 7
vocab_count.c | utils_nlp/models/glove/src | 185 | 9
optimizers.py | utils_nlp/models/transformers/bertsum | 185 | 16
decoder.py | utils_nlp/models/transformers/bertsum | 183 | 13
sequence_classification.py | utils_nlp/models/bert | 179 | 4
token_classification.py | utils_nlp/models/bert | 178 | 7
shuffle.c | utils_nlp/models/glove/src | 170 | 8
sick.py | utils_nlp/eval/SentEval/senteval | 167 | 8
sequence_encoding.py | utils_nlp/models/bert | 165 | 9
question_answering.py | utils_nlp/eval | 158 | 7
classifier.py | utils_nlp/eval/SentEval/senteval/tools | 145 | 8
generate_conda_file.py | tools | 144 | -
sts.py | utils_nlp/eval/SentEval/senteval | 129 | 10
multinli.py | utils_nlp/dataset | 127 | 6
beam.py | utils_nlp/models/transformers/bertsum | 120 | 9
probing.py | utils_nlp/eval/SentEval/senteval | 115 | 14
loss.py | utils_nlp/models/transformers/bertsum | 115 | 16
wikigold.py | utils_nlp/dataset | 113 | 3
azureml_bert_util.py | utils_nlp/azureml | 110 | 8
encoder.py | utils_nlp/models/transformers/bertsum | 106 | 11
pytorch_utils.py | utils_nlp/common | 96 | 6
relatedness.py | utils_nlp/eval/SentEval/senteval/tools | 95 | 5
engine.py | utils_nlp/eval/SentEval/senteval | 94 | 2
sentence_selection.py | utils_nlp/dataset | 92 | 5
azureml_utils.py | utils_nlp/azureml | 87 | 4
snli.py | utils_nlp/eval/SentEval/senteval | 86 | 4
preprocess_utils.py | utils_nlp/models/gensen | 83 | 3
Files With Most Units (Top 20)
File | Location | # lines | # units
extractive_summarization.py | utils_nlp/models/transformers | 528 | 27
datasets.py | utils_nlp/models/transformers | 349 | 26
question_answering.py | utils_nlp/models/transformers | 1067 | 23
abstractive_summarization_seq2seq.py | utils_nlp/models/transformers | 707 | 22
abstractive_summarization_bertsum.py | utils_nlp/models/transformers | 510 | 21
data_loader.py | utils_nlp/models/transformers/bertsum | 196 | 18
loss.py | utils_nlp/models/transformers/bertsum | 115 | 16
optimizers.py | utils_nlp/models/transformers/bertsum | 185 | 16
cooccur.c | utils_nlp/models/glove/src | 411 | 15
model_builder.py | utils_nlp/models/transformers/bertsum | 313 | 15
neural.py | utils_nlp/models/transformers/bertsum | 232 | 15
probing.py | utils_nlp/eval/SentEval/senteval | 115 | 14
common.py | utils_nlp/models/bert | 226 | 13
utils.py | utils_nlp/models/gensen | 455 | 13
decoder.py | utils_nlp/models/transformers/bertsum | 183 | 13
ranking.py | utils_nlp/eval/SentEval/senteval/tools | 259 | 12
gensen.py | utils_nlp/models/gensen | 292 | 12
rouge_ext.py | utils_nlp/eval/rouge | 312 | 11
encoder.py | utils_nlp/models/transformers/bertsum | 106 | 11
common.py | utils_nlp/models/transformers | 262 | 11
Files With Long Lines (Top 9)

There are 9 files with lines longer than 120 characters. In total, there are 24 long lines.
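As a rough illustration of the long-line measurement, the sketch below counts lines that exceed the 120-character threshold. The helper name `count_long_lines` and the choice to exclude the trailing newline from the length are assumptions; the report does not describe how the tool measures line length.

```python
from typing import Iterable

MAX_LINE_LENGTH = 120  # threshold stated in the report


def count_long_lines(lines: Iterable[str], limit: int = MAX_LINE_LENGTH) -> int:
    """Count lines whose length, excluding the trailing newline, exceeds limit."""
    return sum(1 for line in lines if len(line.rstrip("\r\n")) > limit)


# Hypothetical in-memory file content used for illustration.
example = [
    "short line\n",
    "x" * 120 + "\n",  # exactly 120 characters: not counted
    "y" * 121 + "\n",  # 121 characters: counted
]
print(count_long_lines(example))  # prints 1
```

For a real file, the same function can be applied to an open file object, since iterating over it yields one line at a time.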

File | Location | # lines | # units | # long lines
cooccur.c | utils_nlp/models/glove/src | 411 | 15 | 6
glove.c | utils_nlp/models/glove/src | 374 | 7 | 6
get_transfer_data.bash | utils_nlp/eval/SentEval/data/downstream | 60 | - | 3
vocab_count.c | utils_nlp/models/glove/src | 185 | 9 | 3
model_builder.py | utils_nlp/models/transformers/bertsum | 313 | 15 | 2
setup.py | root | 65 | 1 | 1
multinli.py | utils_nlp/dataset | 127 | 6 | 1
rank.py | utils_nlp/eval/SentEval/senteval | 82 | 4 | 1
shuffle.c | utils_nlp/models/glove/src | 170 | 8 | 1