huggingface / cosmopedia
File Change Frequency

File change frequency (churn) shows how file updates are distributed. A file's update count is the number of days on which at least one commit touched it.
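The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the report tool's actual implementation: the function names `change_days` and `bucket` and the sample commit log are hypothetical, and the bucket boundaries are taken from the legend used in this report.

```python
from collections import defaultdict
from datetime import date


def change_days(commits):
    """Map each file to the number of distinct days with at least one commit."""
    days = defaultdict(set)
    for path, day in commits:
        days[path].add(day)  # several commits on one day count once
    return {path: len(seen) for path, seen in days.items()}


def bucket(n_changes):
    """Assign a change count to one of the report's frequency buckets."""
    if n_changes > 100:
        return "101+"
    if n_changes > 50:
        return "51-100"
    if n_changes > 20:
        return "21-50"
    if n_changes > 5:
        return "6-20"
    return "1-5"


# Hypothetical commit log: (file path, commit date) pairs.
sample_commits = [
    ("a.py", date(2024, 1, 1)),
    ("a.py", date(2024, 1, 1)),  # same day as above, not counted again
    ("a.py", date(2024, 1, 2)),
    ("b.py", date(2024, 1, 1)),
]

counts = change_days(sample_commits)
buckets = {path: bucket(n) for path, n in counts.items()}
```

Counting days rather than commits keeps a burst of small commits on one day from inflating a file's churn.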

Overview
File Change Frequency Overall
  • There are 20 files with 2,536 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 0 files changed 51-100 times (0 lines of code)
    • 0 files changed 21-50 times (0 lines of code)
    • 0 files changed 6-20 times (0 lines of code)
    • 20 files changed 1-5 times (2,536 lines of code)
Share of lines of code by number of changes: 101+: 0% | 51-100: 0% | 21-50: 0% | 6-20: 0% | 1-5: 100%

Contributors Count Frequency Overall
  • There are 20 files with 2,536 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 0 files changed by 11-25 contributors (0 lines of code)
    • 0 files changed by 6-10 contributors (0 lines of code)
    • 15 files changed by 2-5 contributors (2,100 lines of code)
    • 5 files changed by 1 contributor (436 lines of code)
Share of lines of code by number of contributors: 26+: 0% | 11-25: 0% | 6-10: 0% | 2-5: 82% | 1: 17%
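The contributor-count percentages match shares of lines of code (2,100 and 436 out of 2,536), not shares of files, and they add up to 99 rather than 100, which suggests the values are truncated rather than rounded. A minimal sketch under that assumption (the function name `share_percent` is hypothetical):

```python
def share_percent(bucket_loc, total_loc):
    """Return each bucket's share of total lines of code as a truncated integer percent."""
    return [100 * loc // total_loc for loc in bucket_loc]


# Lines of code per bucket, in legend order: 26+, 11-25, 6-10, 2-5, 1.
shares = share_percent([0, 0, 0, 2100, 436], 2536)
```

With rounding instead of truncation the last two values would be 83 and 17.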

File Change Frequency per File Extension
py, md, txt, ipynb
File Change Frequency per Extension
The number of recorded file updates, by bucket (101+ | 51-100 | 21-50 | 6-20 | 1-5):
  • py: 0% | 0% | 0% | 0% | 100%
  • ipynb: 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
The number of recorded file updates, by bucket (101+ | 51-100 | 21-50 | 6-20 | 1-5):
  • prompts: 0% | 0% | 0% | 0% | 100%
  • generation: 0% | 0% | 0% | 0% | 100%
  • fulltext_search: 0% | 0% | 0% | 0% | 100%
  • classification: 0% | 0% | 0% | 0% | 100%
  • decontamination: 0% | 0% | 0% | 0% | 100%
  • deduplication: 0% | 0% | 0% | 0% | 100%
Most Frequently Changed Files (Top 20)


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
decontaminate.py | decontamination | 99 | 6 | 2024-02-21 | 2024-07-16 | 3 | 2 | anton@huggingface.co | anton@huggingface.co
train_edu_bert.py | classification | 128 | 2 | 2024-05-31 | 2024-07-16 | 3 | 1 | anton@huggingface.co | anton@huggingface.co
run_edu_bert.py | classification | 64 | 1 | 2024-05-31 | 2024-07-16 | 2 | 1 | anton@huggingface.co | anton@huggingface.co
(not shown) | (not shown) | 25 | 1 | 2024-05-31 | 2024-05-31 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
filter_openhermes.py | prompts/stories | 31 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
generate_textbooks.py | prompts/khanacademy | 33 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_science_prompts.py | prompts/auto_math_text | 38 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 42 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 47 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_web_prompts.py | prompts/web_samples | 48 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
main.py | prompts/khanacademy/khan_dl | 66 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
deduplicate_dataset.py | deduplication | 93 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
search_sharded.py | fulltext_search | 102 | 3 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
index_docs.py | fulltext_search | 117 | 2 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
build_openstax_prompts.py | prompts/openstax | 190 | 4 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
llm_swarm_script.py | generation | 195 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
filter_and_classify_clusters.py | prompts/web_samples | 211 | 3 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 287 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
khan_dl.py | prompts/khanacademy/khan_dl | 321 | 21 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
1_scraper.ipynb | prompts/stanford | 399 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
Files With Most Contributors (Top 20)
Based on the number of unique email addresses found in commits.
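Counting contributors by unique email address can be sketched as follows. This is an illustrative sketch, not the report tool's code: the function name `contributors_per_file` and the sample addresses are hypothetical. Because identity is tied to the address, one person committing under two different addresses would be counted as two contributors.

```python
from collections import defaultdict


def contributors_per_file(commits):
    """Map each file to the number of unique author emails in its commits."""
    emails = defaultdict(set)
    for path, email in commits:
        emails[path].add(email)
    return {path: len(seen) for path, seen in emails.items()}


# Hypothetical commit log: (file path, author email) pairs.
sample_commits = [
    ("a.py", "alice@example.com"),
    ("a.py", "alice@users.noreply.example.com"),  # a second address counts separately
    ("a.py", "alice@example.com"),
    ("b.py", "bob@example.com"),
]

contributor_counts = contributors_per_file(sample_commits)
```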


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
decontaminate.py | decontamination | 99 | 6 | 2024-02-21 | 2024-07-16 | 3 | 2 | anton@huggingface.co | anton@huggingface.co
llm_swarm_script.py | generation | 195 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
khan_dl.py | prompts/khanacademy/khan_dl | 321 | 21 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
main.py | prompts/khanacademy/khan_dl | 66 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
generate_textbooks.py | prompts/khanacademy | 33 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_openstax_prompts.py | prompts/openstax | 190 | 4 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_science_prompts.py | prompts/auto_math_text | 38 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
1_scraper.ipynb | prompts/stanford | 399 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 287 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 42 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
filter_openhermes.py | prompts/stories | 31 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 47 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
filter_and_classify_clusters.py | prompts/web_samples | 211 | 3 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_web_prompts.py | prompts/web_samples | 48 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
deduplicate_dataset.py | deduplication | 93 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
train_edu_bert.py | classification | 128 | 2 | 2024-05-31 | 2024-07-16 | 3 | 1 | anton@huggingface.co | anton@huggingface.co
run_edu_bert.py | classification | 64 | 1 | 2024-05-31 | 2024-07-16 | 2 | 1 | anton@huggingface.co | anton@huggingface.co
(not shown) | (not shown) | 25 | 1 | 2024-05-31 | 2024-05-31 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
search_sharded.py | fulltext_search | 102 | 3 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
index_docs.py | fulltext_search | 117 | 2 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
Files With Fewest Contributors (Top 20)
Based on the number of unique email addresses found in commits.


File | Folder | # lines | # units | Created | Last modified | # changes (days) | # contributors | First contributor | Latest contributor
train_edu_bert.py | classification | 128 | 2 | 2024-05-31 | 2024-07-16 | 3 | 1 | anton@huggingface.co | anton@huggingface.co
index_docs.py | fulltext_search | 117 | 2 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
search_sharded.py | fulltext_search | 102 | 3 | 2024-07-16 | 2024-07-16 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
run_edu_bert.py | classification | 64 | 1 | 2024-05-31 | 2024-07-16 | 2 | 1 | anton@huggingface.co | anton@huggingface.co
(not shown) | (not shown) | 25 | 1 | 2024-05-31 | 2024-05-31 | 1 | 1 | anton@huggingface.co | anton@huggingface.co
1_scraper.ipynb | prompts/stanford | 399 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
khan_dl.py | prompts/khanacademy/khan_dl | 321 | 21 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 287 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
filter_and_classify_clusters.py | prompts/web_samples | 211 | 3 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
llm_swarm_script.py | generation | 195 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_openstax_prompts.py | prompts/openstax | 190 | 4 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
decontaminate.py | decontamination | 99 | 6 | 2024-02-21 | 2024-07-16 | 3 | 2 | anton@huggingface.co | anton@huggingface.co
deduplicate_dataset.py | deduplication | 93 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
main.py | prompts/khanacademy/khan_dl | 66 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_web_prompts.py | prompts/web_samples | 48 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 47 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
(not shown) | (not shown) | 42 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
build_science_prompts.py | prompts/auto_math_text | 38 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
generate_textbooks.py | prompts/khanacademy | 33 | - | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com
filter_openhermes.py | prompts/stories | 31 | 2 | 2024-02-20 | 2024-02-20 | 1 | 2 | 44069155+loubnabnl@users.no... | loubnabenallal1999@gmail.com