azure / synthetic-qa-generation
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 44% | 48% | 2% | 4%
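The bucket percentages can be reproduced from the per-file line counts listed later in this report. The following is a minimal sketch, assuming that each percentage is the bucket's share of total lines of code (not of file count) and that values are truncated rather than rounded; the report does not state either point, and the names `BUCKETS` and `bucket_shares` are illustrative only.

```python
# Line counts per file, taken from the data points listed in this report.
LINE_COUNTS = [776, 672, 657, 409, 393, 378, 361, 281, 241,
               238, 114, 68, 39, 34, 28, 25, 16, 10]

# (label, inclusive lower bound, inclusive upper bound); None means unbounded.
BUCKETS = [
    ("1001+", 1001, None),
    ("501-1000", 501, 1000),
    ("201-500", 201, 500),
    ("101-200", 101, 200),
    ("1-100", 1, 100),
]


def bucket_shares(line_counts):
    """Return each bucket's share of total lines of code, as a truncated percent."""
    total = sum(line_counts)
    shares = {}
    for label, low, high in BUCKETS:
        in_bucket = sum(
            n for n in line_counts
            if n >= low and (high is None or n <= high)
        )
        # Integer division truncates; this is an assumption about the report.
        shares[label] = 100 * in_bucket // total
    return shares


shares = bucket_shares(LINE_COUNTS)
for label, percent in shares.items():
    print(f"{label}: {percent}%")
```

Because of truncation, the five shares need not add up to exactly 100%.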


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
ipynb | 0% | 59% | 37% | 3% | 0%
py | 0% | 0% | 81% | 0% | 18%
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
seed | 0% | 50% | 41% | 4% | 4%
glan-instruct | 0% | 60% | 34% | 0% | 4%
auto-evolve-instruct | 0% | 0% | 100% | 0% | 0%
evolve-instruct | 0% | 0% | 87% | 0% | 12%
Longest Files (Top 18)
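File | # lines | # units
make_qa_multimodal_pdf_docai.ipynb (in seed) | 776 | -
make_qa_multimodal_pdf_oss.ipynb (in seed) | 672 | -
glan_tutorial.ipynb (in glan-instruct) | 657 | -
make_qa_only_image_pdf.ipynb (in seed) | 409 | -
trial.ipynb (in auto-evolve-instruct) | 393 | -
glan.py (in glan-instruct) | 378 | 13
evolve.py (in evolve-instruct) | 361 | 14
make_qa_only_image_multiple_pdf.ipynb (in seed) | 281 | -
preprocess.py (in seed/util) | 241 | 21
make_qa_csv.ipynb (in seed) | 238 | -
merge_training_dataset_json.ipynb (in seed) | 114 | -
common_utils.py (in seed/util) | 68 | 4
qa.py (in seed/util) | 39 | 4
convert.py (in evolve-instruct) | 34 | 1
generate_answer_only.py (in glan-instruct) | 28 | -
generate.py (in glan-instruct) | 25 | -
merge_json.py (in evolve-instruct) | 16 | -
qa_pair.py (in seed/util) | 10 | 1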
Files With Most Units (Top 7)
File | # lines | # units
preprocess.py (in seed/util) | 241 | 21
evolve.py (in evolve-instruct) | 361 | 14
glan.py (in glan-instruct) | 378 | 13
common_utils.py (in seed/util) | 68 | 4
qa.py (in seed/util) | 39 | 4
convert.py (in evolve-instruct) | 34 | 1
qa_pair.py (in seed/util) | 10 | 1
Files With Long Lines (Top 12)

There are 12 files with lines longer than 120 characters. In total, there are 136 long lines.

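File | # lines | # units | # long lines
trial.ipynb (in auto-evolve-instruct) | 393 | - | 43
glan_tutorial.ipynb (in glan-instruct) | 657 | - | 27
make_qa_multimodal_pdf_docai.ipynb (in seed) | 776 | - | 19
make_qa_multimodal_pdf_oss.ipynb (in seed) | 672 | - | 15
glan.py (in glan-instruct) | 378 | 13 | 11
evolve.py (in evolve-instruct) | 361 | 14 | 10
make_qa_only_image_pdf.ipynb (in seed) | 409 | - | 3
make_qa_only_image_multiple_pdf.ipynb (in seed) | 281 | - | 3
make_qa_csv.ipynb (in seed) | 238 | - | 2
convert.py (in evolve-instruct) | 34 | 1 | 1
common_utils.py (in seed/util) | 68 | 4 | 1
generate_answer_only.py (in glan-instruct) | 28 | - | 1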
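The long-line count for a single file can be approximated as shown here. This is a minimal sketch, assuming a long line is any line of more than 120 characters, excluding the line terminator. How the report counts lines inside .ipynb files is not stated, so the sketch may not reproduce the notebook figures. The function name `count_long_lines` is hypothetical.

```python
LONG_LINE_THRESHOLD = 120  # characters, per the report's definition


def count_long_lines(text, threshold=LONG_LINE_THRESHOLD):
    """Count lines in `text` strictly longer than `threshold` characters."""
    return sum(1 for line in text.splitlines() if len(line) > threshold)


# Small self-check: only the 121-character line exceeds the threshold.
sample = "\n".join(["x" * 120, "y" * 121, "short"])
long_count = count_long_lines(sample)
print(long_count)
```

To apply it to a file on disk, read the file's text (for example with `pathlib.Path(path).read_text()`) and pass the result to `count_long_lines`.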
Correlations

File Size vs. Commits (all time): 18 points

File | commits (all time) | lines of code
seed/make_qa_multimodal_pdf_docai.ipynb | 7 | 776
seed/make_qa_multimodal_pdf_oss.ipynb | 5 | 672
seed/make_qa_only_image_multiple_pdf.ipynb | 3 | 281
seed/make_qa_only_image_pdf.ipynb | 3 | 409
seed/merge_training_dataset_json.ipynb | 3 | 114
auto-evolve-instruct/trial.ipynb | 2 | 393
evolve-instruct/evolve.py | 3 | 361
glan-instruct/glan.py | 4 | 378
glan-instruct/glan_tutorial.ipynb | 2 | 657
seed/make_qa_csv.ipynb | 3 | 238
seed/util/qa_pair.py | 2 | 10
seed/util/preprocess.py | 3 | 241
evolve-instruct/convert.py | 1 | 34
evolve-instruct/merge_json.py | 1 | 16
glan-instruct/generate.py | 1 | 25
glan-instruct/generate_answer_only.py | 1 | 28
seed/util/common_utils.py | 1 | 68
seed/util/qa.py | 1 | 39

lines of code
min: 10.0 | average: 263.33 | 25th percentile: 32.5 | median: 239.5 | 75th percentile: 397.0 | max: 776.0
commits (all time)
min: 1.0 | average: 2.56 | 25th percentile: 1.0 | median: 2.5 | 75th percentile: 3.0 | max: 7.0
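The summary statistics above can be recomputed from the 18 data points. The report does not say which percentile method it uses; the sketch below assumes the default "exclusive" method of Python's `statistics.quantiles`, which happens to reproduce the listed figures. The helper name `summarize` is illustrative.

```python
import statistics

# Values in the order the report lists its data points.
LINE_COUNTS = [776, 672, 281, 409, 114, 393, 361, 378, 657,
               238, 10, 241, 34, 16, 25, 28, 68, 39]
COMMITS = [7, 5, 3, 3, 3, 2, 3, 4, 2, 3, 2, 3, 1, 1, 1, 1, 1, 1]


def summarize(values):
    """Return min, mean, quartiles, and max for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points;
    # the default method is "exclusive" (an assumption about the report).
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": float(min(values)),
        "average": round(statistics.mean(values), 2),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": float(max(values)),
    }


loc_summary = summarize(LINE_COUNTS)
commit_summary = summarize(COMMITS)
print(loc_summary)
print(commit_summary)
```

Other interpolation schemes give different quartiles for the same data (for instance, the "inclusive" method yields a different 25th percentile for the line counts), so the choice of method matters when comparing against other tools.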

File Size vs. Contributors (all time): 18 points

File | contributors (all time) | lines of code
seed/make_qa_multimodal_pdf_docai.ipynb | 1 | 776
seed/make_qa_multimodal_pdf_oss.ipynb | 1 | 672
seed/make_qa_only_image_multiple_pdf.ipynb | 1 | 281
seed/make_qa_only_image_pdf.ipynb | 1 | 409
seed/merge_training_dataset_json.ipynb | 1 | 114
auto-evolve-instruct/trial.ipynb | 1 | 393
evolve-instruct/evolve.py | 1 | 361
glan-instruct/glan.py | 1 | 378
glan-instruct/glan_tutorial.ipynb | 1 | 657
seed/make_qa_csv.ipynb | 1 | 238
seed/util/qa_pair.py | 1 | 10
seed/util/preprocess.py | 1 | 241
evolve-instruct/convert.py | 1 | 34
evolve-instruct/merge_json.py | 1 | 16
glan-instruct/generate.py | 1 | 25
glan-instruct/generate_answer_only.py | 1 | 28
seed/util/common_utils.py | 1 | 68
seed/util/qa.py | 1 | 39

lines of code
min: 10.0 | average: 263.33 | 25th percentile: 32.5 | median: 239.5 | 75th percentile: 397.0 | max: 776.0
contributors (all time)
min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 0 points

No data for "commits (90d)" vs. "lines of code".

File Size vs. Contributors (90 days): 0 points

No data for "contributors (90d)" vs. "lines of code".