azure / synthetic-qa-generation
File Change Frequency

File change frequency (churn) shows the distribution of file updates (days with at least one commit).
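The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's implementation: the function name `churn_by_file` and the commit log below are hypothetical, and the only rule taken from the report is that several commits on the same day count as one update.

```python
from collections import defaultdict
from datetime import date


def churn_by_file(commits):
    """Count, per file, the number of distinct days with at least one commit."""
    days = defaultdict(set)
    for path, day in commits:
        days[path].add(day)  # a set collapses several commits made on one day
    return {path: len(seen) for path, seen in days.items()}


# Hypothetical commit log as (file path, commit date) pairs.
# The duplicate entry on 2024-11-02 is made up to show the per-day rule.
commits = [
    ("seed/util/qa_pair.py", date(2024, 7, 25)),
    ("seed/util/qa_pair.py", date(2024, 11, 2)),
    ("seed/util/qa_pair.py", date(2024, 11, 2)),
    ("seed/util/qa.py", date(2024, 7, 25)),
]

churn = churn_by_file(commits)
print(churn)
```

Three commits on two distinct days give `qa_pair.py` a change count of 2, which is why the tables label the column "# changes (days)".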

Overview
File Change Frequency Overall
  • There are 18 files with 4,740 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 0 files changed 51-100 times (0 lines of code)
    • 0 files changed 21-50 times (0 lines of code)
    • 0 files changed 6-20 times (0 lines of code)
    • 18 files changed 1-5 times (4,740 lines of code)
Distribution by update frequency (101+ | 51-100 | 21-50 | 6-20 | 1-5): 0% | 0% | 0% | 0% | 100%
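The breakdown above can be reproduced with a short sketch that sorts files into the five update-frequency buckets. The bucket boundaries and the (lines of code, # changes) pairs come from this report; the function names are hypothetical, and treating the percentages as shares of lines of code is an assumption (with every file in one bucket, shares of files would give the same numbers).

```python
# (label, lowest change count, highest change count); None means no upper bound.
BUCKETS = [
    ("101+", 101, None),
    ("51-100", 51, 100),
    ("21-50", 21, 50),
    ("6-20", 6, 20),
    ("1-5", 1, 5),
]


def bucket_for(changes):
    """Return the label of the bucket that a change count falls into."""
    for label, low, high in BUCKETS:
        if changes >= low and (high is None or changes <= high):
            return label
    raise ValueError(f"change count must be at least 1, got {changes}")


def distribution(files):
    """Map each bucket label to (file count, lines of code, percent of all lines)."""
    counts = {label: [0, 0] for label, _, _ in BUCKETS}
    for lines, changes in files:
        entry = counts[bucket_for(changes)]
        entry[0] += 1
        entry[1] += lines
    total = sum(lines for lines, _ in files)
    return {
        label: (n, loc, 100.0 * loc / total if total else 0.0)
        for label, (n, loc) in counts.items()
    }


# (lines of code, # changes) for the 18 files listed in this report.
files = [
    (776, 5), (672, 4), (238, 3), (10, 2), (114, 2), (241, 2),
    (281, 2), (361, 2), (378, 2), (393, 2), (409, 2), (657, 2),
    (16, 1), (25, 1), (28, 1), (34, 1), (39, 1), (68, 1),
]

result = distribution(files)
for label, (n, loc, pct) in result.items():
    print(f"{label}: {n} files, {loc} lines of code, {pct:.0f}%")
```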

Contributors Count Frequency Overall
  • There are 18 files with 4,740 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 0 files changed by 11-25 contributors (0 lines of code)
    • 0 files changed by 6-10 contributors (0 lines of code)
    • 0 files changed by 2-5 contributors (0 lines of code)
    • 18 files changed by 1 contributor (4,740 lines of code)
Distribution by contributors count (26+ | 11-25 | 6-10 | 2-5 | 1): 0% | 0% | 0% | 0% | 100%

File Change Frequency per File Extension
txt, jsonl, py, ipynb, md, sh, json, gitignore, prettierignore
Share per number of recorded file updates (101+ | 51-100 | 21-50 | 6-20 | 1-5):
  • ipynb: 0% | 0% | 0% | 0% | 100%
  • py: 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
Share per number of recorded file updates (101+ | 51-100 | 21-50 | 6-20 | 1-5):
  • seed: 0% | 0% | 0% | 0% | 100%
  • glan-instruct: 0% | 0% | 0% | 0% | 100%
  • evolve-instruct: 0% | 0% | 0% | 0% | 100%
  • auto-evolve-instruct: 0% | 0% | 0% | 0% | 100%
Most Frequently Changed Files (Top 18)


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
make_qa_multimodal_pdf_docai.ipynb | seed | 776 | - | 2024-07-25 | 2024-11-03 | 5 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_multimodal_pdf_oss.ipynb | seed | 672 | - | 2024-07-25 | 2024-11-03 | 4 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_csv.ipynb | seed | 238 | - | 2024-07-25 | 2024-11-02 | 3 | 1 | housekdk@naver.com | housekdk@naver.com
qa_pair.py | seed/util | 10 | 1 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
merge_training_dataset_json.ipynb | seed | 114 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
preprocess.py | seed/util | 241 | 21 | 2024-07-25 | 2024-09-10 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_multiple_pdf.ipynb | seed | 281 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
evolve.py | evolve-instruct | 361 | 14 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
glan.py | glan-instruct | 378 | 13 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
trial.ipynb | auto-evolve-instruct | 393 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_pdf.ipynb | seed | 409 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
glan_tutorial.ipynb | glan-instruct | 657 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
merge_json.py | evolve-instruct | 16 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate.py | glan-instruct | 25 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate_answer_only.py | glan-instruct | 28 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
convert.py | evolve-instruct | 34 | 1 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
qa.py | seed/util | 39 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
common_utils.py | seed/util | 68 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
Files With Most Contributors (Top 18)
Based on the number of unique email addresses found in commits.
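Counting contributors this way amounts to collecting the distinct commit email addresses per file. The sketch below assumes a commit log of (file path, author email) pairs; the function name and the `example.com` addresses are hypothetical and are not taken from the repository.

```python
from collections import defaultdict


def contributors_by_file(commits):
    """Count, per file, the number of unique author email addresses."""
    emails = defaultdict(set)
    for path, email in commits:
        emails[path].add(email)  # repeated commits by one address count once
    return {path: len(seen) for path, seen in emails.items()}


# Hypothetical commit log as (file path, author email) pairs.
commits = [
    ("glan-instruct/glan.py", "dev-a@example.com"),
    ("glan-instruct/glan.py", "dev-a@example.com"),
    ("glan-instruct/glan.py", "dev-b@example.com"),
    ("evolve-instruct/evolve.py", "dev-a@example.com"),
]

contributors = contributors_by_file(commits)
print(contributors)
```

Because the count is keyed on the address rather than the person, one developer committing under two addresses would be counted as two contributors.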


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
make_qa_multimodal_pdf_docai.ipynb | seed | 776 | - | 2024-07-25 | 2024-11-03 | 5 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_multimodal_pdf_oss.ipynb | seed | 672 | - | 2024-07-25 | 2024-11-03 | 4 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_csv.ipynb | seed | 238 | - | 2024-07-25 | 2024-11-02 | 3 | 1 | housekdk@naver.com | housekdk@naver.com
evolve.py | evolve-instruct | 361 | 14 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_pdf.ipynb | seed | 409 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
merge_training_dataset_json.ipynb | seed | 114 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
qa_pair.py | seed/util | 10 | 1 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
preprocess.py | seed/util | 241 | 21 | 2024-07-25 | 2024-09-10 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_multiple_pdf.ipynb | seed | 281 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
trial.ipynb | auto-evolve-instruct | 393 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
glan.py | glan-instruct | 378 | 13 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
glan_tutorial.ipynb | glan-instruct | 657 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
merge_json.py | evolve-instruct | 16 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
convert.py | evolve-instruct | 34 | 1 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
common_utils.py | seed/util | 68 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
qa.py | seed/util | 39 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate_answer_only.py | glan-instruct | 28 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate.py | glan-instruct | 25 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
Files With Fewest Contributors (Top 18)
Based on the number of unique email addresses found in commits.


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
make_qa_multimodal_pdf_docai.ipynb | seed | 776 | - | 2024-07-25 | 2024-11-03 | 5 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_multimodal_pdf_oss.ipynb | seed | 672 | - | 2024-07-25 | 2024-11-03 | 4 | 1 | housekdk@naver.com | housekdk@naver.com
glan_tutorial.ipynb | glan-instruct | 657 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_pdf.ipynb | seed | 409 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
trial.ipynb | auto-evolve-instruct | 393 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
glan.py | glan-instruct | 378 | 13 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
evolve.py | evolve-instruct | 361 | 14 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_only_image_multiple_pdf.ipynb | seed | 281 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
preprocess.py | seed/util | 241 | 21 | 2024-07-25 | 2024-09-10 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
make_qa_csv.ipynb | seed | 238 | - | 2024-07-25 | 2024-11-02 | 3 | 1 | housekdk@naver.com | housekdk@naver.com
merge_training_dataset_json.ipynb | seed | 114 | - | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
common_utils.py | seed/util | 68 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
qa.py | seed/util | 39 | 4 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
convert.py | evolve-instruct | 34 | 1 | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate_answer_only.py | glan-instruct | 28 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
generate.py | glan-instruct | 25 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
merge_json.py | evolve-instruct | 16 | - | 2024-07-25 | 2024-07-25 | 1 | 1 | housekdk@naver.com | housekdk@naver.com
qa_pair.py | seed/util | 10 | 1 | 2024-07-25 | 2024-11-02 | 2 | 1 | housekdk@naver.com | housekdk@naver.com
Correlations

File Size vs. Number of Changes: 18 points

Points (x: lines of code, y: # changes):
  • seed/make_qa_multimodal_pdf_docai.ipynb: 776 lines of code, 5 changes
  • seed/make_qa_multimodal_pdf_oss.ipynb: 672 lines of code, 4 changes
  • seed/make_qa_only_image_multiple_pdf.ipynb: 281 lines of code, 2 changes
  • seed/make_qa_only_image_pdf.ipynb: 409 lines of code, 2 changes
  • seed/merge_training_dataset_json.ipynb: 114 lines of code, 2 changes
  • auto-evolve-instruct/trial.ipynb: 393 lines of code, 2 changes
  • evolve-instruct/evolve.py: 361 lines of code, 2 changes
  • glan-instruct/glan.py: 378 lines of code, 2 changes
  • glan-instruct/glan_tutorial.ipynb: 657 lines of code, 2 changes
  • seed/make_qa_csv.ipynb: 238 lines of code, 3 changes
  • seed/util/qa_pair.py: 10 lines of code, 2 changes
  • seed/util/preprocess.py: 241 lines of code, 2 changes
  • evolve-instruct/convert.py: 34 lines of code, 1 change
  • evolve-instruct/merge_json.py: 16 lines of code, 1 change
  • glan-instruct/generate.py: 25 lines of code, 1 change
  • glan-instruct/generate_answer_only.py: 28 lines of code, 1 change
  • seed/util/common_utils.py: 68 lines of code, 1 change
  • seed/util/qa.py: 39 lines of code, 1 change
# changes (y axis):
  min: 1.0 | average: 2.0 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 2.0 | max: 5.0
lines of code (x axis):
  min: 10.0 | average: 263.33 | 25th percentile: 32.5 | median: 239.5 | 75th percentile: 397.0 | max: 776.0
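The report only plots these points and does not state a correlation coefficient. One conventional way to quantify the relationship is Pearson's r, sketched below on the 18 (lines of code, # changes) pairs from this section. The choice of Pearson's r and the function name `pearson_r` are assumptions made for illustration, not something the report computes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant variable has no spread, so r is undefined.
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# (lines of code, # changes) pairs in the order listed in this section.
points = [
    (776, 5), (672, 4), (281, 2), (409, 2), (114, 2), (393, 2),
    (361, 2), (378, 2), (657, 2), (238, 3), (10, 2), (241, 2),
    (34, 1), (16, 1), (25, 1), (28, 1), (68, 1), (39, 1),
]
lines_of_code = [x for x, _ in points]
changes = [y for _, y in points]

r = pearson_r(lines_of_code, changes)
print(f"Pearson r, file size vs. number of changes: {r:.2f}")
```

The result is positive, meaning larger files in this repository tend to have slightly more recorded changes. The same function raises an error for the contributor plots, because every file has exactly one contributor and a constant variable has no defined correlation.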

Number of Contributors vs. Number of Changes: 18 points

Points (x: # contributors, y: # changes):
  • seed/make_qa_multimodal_pdf_docai.ipynb: 1 contributor, 5 changes
  • seed/make_qa_multimodal_pdf_oss.ipynb: 1 contributor, 4 changes
  • seed/make_qa_only_image_multiple_pdf.ipynb: 1 contributor, 2 changes
  • seed/make_qa_csv.ipynb: 1 contributor, 3 changes
  • evolve-instruct/convert.py: 1 contributor, 1 change
# changes (y axis):
  min: 1.0 | average: 2.0 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 2.0 | max: 5.0
# contributors (x axis):
  min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0

Number of Contributors vs. File Size: 18 points

Points (x: # contributors, y: lines of code):
  • seed/make_qa_multimodal_pdf_docai.ipynb: 1 contributor, 776 lines of code
  • seed/make_qa_multimodal_pdf_oss.ipynb: 1 contributor, 672 lines of code
  • seed/make_qa_only_image_multiple_pdf.ipynb: 1 contributor, 281 lines of code
  • seed/make_qa_only_image_pdf.ipynb: 1 contributor, 409 lines of code
  • seed/merge_training_dataset_json.ipynb: 1 contributor, 114 lines of code
  • auto-evolve-instruct/trial.ipynb: 1 contributor, 393 lines of code
  • evolve-instruct/evolve.py: 1 contributor, 361 lines of code
  • glan-instruct/glan.py: 1 contributor, 378 lines of code
  • glan-instruct/glan_tutorial.ipynb: 1 contributor, 657 lines of code
  • seed/make_qa_csv.ipynb: 1 contributor, 238 lines of code
  • seed/util/qa_pair.py: 1 contributor, 10 lines of code
  • seed/util/preprocess.py: 1 contributor, 241 lines of code
  • evolve-instruct/convert.py: 1 contributor, 34 lines of code
  • evolve-instruct/merge_json.py: 1 contributor, 16 lines of code
  • glan-instruct/generate.py: 1 contributor, 25 lines of code
  • glan-instruct/generate_answer_only.py: 1 contributor, 28 lines of code
  • seed/util/common_utils.py: 1 contributor, 68 lines of code
  • seed/util/qa.py: 1 contributor, 39 lines of code
lines of code (y axis):
  min: 10.0 | average: 263.33 | 25th percentile: 32.5 | median: 239.5 | 75th percentile: 397.0 | max: 776.0
# contributors (x axis):
  min: 1.0 | average: 1.0 | 25th percentile: 1.0 | median: 1.0 | 75th percentile: 1.0 | max: 1.0
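The summary statistics listed under each plot can be recomputed from the per-file values with Python's standard `statistics` module. The report does not say which percentile method it uses; the sketch below assumes the module's default "exclusive" method, which reproduces the quartiles listed for lines of code. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median, quantiles

# Lines of code for the 18 files, in the order listed in this section.
lines_of_code = [
    776, 672, 281, 409, 114, 393, 361, 378, 657,
    238, 10, 241, 34, 16, 25, 28, 68, 39,
]


def summarize(values):
    """Return the six summary figures shown under each plot."""
    # quantiles(n=4) returns the three quartile cut points;
    # the default method is "exclusive".
    q1, _, q3 = quantiles(values, n=4)
    return {
        "min": float(min(values)),
        "average": round(mean(values), 2),
        "25th percentile": q1,
        "median": float(median(values)),
        "75th percentile": q3,
        "max": float(max(values)),
    }


summary = summarize(lines_of_code)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Other interpolation rules, such as `method="inclusive"`, give different quartiles for a sample this small, so the match only suggests which rule the report follows.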