huggingface / cosmopedia
File Age & Freshness

File age is the number of days since a file's first commit; file freshness is the number of days since its latest commit. This report shows the distribution of both measurements.
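As a minimal sketch of the two measurements, assuming each file's first and latest commit dates are already known. The names `age_bucket` and `file_age_and_freshness` are illustrative and not part of the report tooling; the bucket boundaries are the ranges used in this report's legends.

```python
from datetime import date

# Day ranges as they appear in this report's legends (inclusive bounds).
BUCKETS = [(1, 30, "1-30"), (31, 90, "31-90"), (91, 180, "91-180"), (181, 365, "181-365")]


def age_bucket(days: int) -> str:
    """Map a day count to one of the report's five ranges."""
    for low, high, label in BUCKETS:
        if low <= days <= high:
            return label
    # Assumption: more than 365 days falls in the open-ended range, and
    # 0 days (changed on the reference date itself) is counted as "1-30".
    return "366+" if days > 365 else "1-30"


def file_age_and_freshness(first_commit: date, latest_commit: date, today: date) -> tuple[int, int]:
    """Return (age, freshness) in days relative to the reference date."""
    age = (today - first_commit).days
    freshness = (today - latest_commit).days
    return age, freshness
```

For example, with an assumed reference date of 2025-06-15 (the report does not state its own date), a file created on 2024-02-21 and last modified on 2024-07-16 has an age of 480 days and a freshness of 334 days, which fall in the 366+ and 181-365 ranges respectively.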

Summary
File Change History Overall
File Age Distribution Overall
Days since first update
  • There are 20 files with 2,536 lines of code.
    • 18 files are 366+ days old (2,317 lines of code)
    • 2 files are 181-365 days old (219 lines of code)
    • 0 files are 91-180 days old (0 lines of code)
    • 0 files are 31-90 days old (0 lines of code)
    • 0 files are 1-30 days old (0 lines of code)
Distribution (366+ | 181-365 | 91-180 | 31-90 | 1-30 days): 91% | 8% | 0% | 0% | 0%
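The percentages appear to be shares of lines of code rather than of file counts, truncated to whole numbers. That reading is an inference from the figures above (2,317 of 2,536 lines is about 91.4%, whereas 18 of 20 files would be 90%), not something the report states. A minimal sketch under that assumption, using the hypothetical helper `loc_shares`:

```python
def loc_shares(loc_per_bucket: dict[str, int]) -> dict[str, int]:
    """Return each bucket's share of total lines of code as a whole percentage."""
    total = sum(loc_per_bucket.values())
    if total == 0:
        return {label: 0 for label in loc_per_bucket}
    # Assumption: shares are truncated, not rounded, which would explain
    # why the summary row adds up to 99% instead of 100%.
    return {label: loc * 100 // total for label, loc in loc_per_bucket.items()}


# Lines of code per age range, taken from the summary list above.
age_loc = {"366+": 2317, "181-365": 219, "91-180": 0, "31-90": 0, "1-30": 0}
print(loc_shares(age_loc))
```

The same calculation applied to the freshness figures (2,026 and 510 lines) yields 79% and 20%.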

File Freshness Distribution Overall
Days since last update
  • There are 20 files with 2,536 lines of code.
    • 15 files were last changed 366+ days ago (2,026 lines of code)
    • 5 files were last changed 181-365 days ago (510 lines of code)
    • 0 files were last changed 91-180 days ago (0 lines of code)
    • 0 files were last changed 31-90 days ago (0 lines of code)
    • 0 files were last changed 1-30 days ago (0 lines of code)
Distribution (366+ | 181-365 | 91-180 | 31-90 | 1-30 days): 79% | 20% | 0% | 0% | 0%

File Change History per File Extension
py, md, txt, ipynb
File Age Distribution per Extension
Days since first update (366+ | 181-365 | 91-180 | 31-90 | 1-30):
py: 88% | 11% | 0% | 0% | 0%
ipynb: 100% | 0% | 0% | 0% | 0%
File Freshness Distribution per Extension
Days since last update (366+ | 181-365 | 91-180 | 31-90 | 1-30):
py: 72% | 27% | 0% | 0% | 0%
ipynb: 100% | 0% | 0% | 0% | 0%
File Change History per Logical Decomposition
primary
primary (file age distribution)
Days since first update (366+ | 181-365 | 91-180 | 31-90 | 1-30):
prompts: 100% | 0% | 0% | 0% | 0%
generation: 100% | 0% | 0% | 0% | 0%
classification: 100% | 0% | 0% | 0% | 0%
decontamination: 100% | 0% | 0% | 0% | 0%
deduplication: 100% | 0% | 0% | 0% | 0%
fulltext_search: 0% | 100% | 0% | 0% | 0%
primary (file freshness distribution)
Days since last update (366+ | 181-365 | 91-180 | 31-90 | 1-30):
prompts: 100% | 0% | 0% | 0% | 0%
generation: 100% | 0% | 0% | 0% | 0%
deduplication: 100% | 0% | 0% | 0% | 0%
fulltext_search: 0% | 100% | 0% | 0% | 0%
classification: 0% | 100% | 0% | 0% | 0%
decontamination: 0% | 100% | 0% | 0% | 0%
Oldest Files (Top 20)
File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
1_scraper.ipynb
in prompts/stanford
399 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
khan_dl.py
in prompts/khanacademy/khan_dl
321 21 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
287 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_and_classify_clusters.py
in prompts/web_samples
211 3 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
llm_swarm_script.py
in generation
195 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_openstax_prompts.py
in prompts/openstax
190 4 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
deduplicate_dataset.py
in deduplication
93 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
main.py
in prompts/khanacademy/khan_dl
66 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_web_prompts.py
in prompts/web_samples
48 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
47 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
42 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_science_prompts.py
in prompts/auto_math_text
38 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
generate_textbooks.py
in prompts/khanacademy
33 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_openhermes.py
in prompts/stories
31 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
decontaminate.py
in decontamination
99 6 2024-02-21 2024-07-16 3 2 anton@huggingface.co anton@huggingface.co
train_edu_bert.py
in classification
128 2 2024-05-31 2024-07-16 3 1 anton@huggingface.co anton@huggingface.co
run_edu_bert.py
in classification
64 1 2024-05-31 2024-07-16 2 1 anton@huggingface.co anton@huggingface.co
25 1 2024-05-31 2024-05-31 1 1 anton@huggingface.co anton@huggingface.co
index_docs.py
in fulltext_search
117 2 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
search_sharded.py
in fulltext_search
102 3 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
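The per-file columns in the table above could be derived from commit history roughly as follows. This is a sketch, not the report's actual implementation: the helper `summarize_history` and its input shape are hypothetical, and it assumes that "# changes (days)" counts the distinct days on which a file was changed.

```python
from collections import defaultdict
from datetime import date


def summarize_history(commits):
    """Summarize per-file history.

    commits: iterable of (commit_date, author, path) tuples in any order.
    """
    by_file = defaultdict(list)
    for commit_date, author, path in commits:
        by_file[path].append((commit_date, author))

    summary = {}
    for path, entries in by_file.items():
        # Stable sort by date: same-day commits keep their input order.
        entries.sort(key=lambda entry: entry[0])
        summary[path] = {
            "created": entries[0][0],
            "last_modified": entries[-1][0],
            # Assumption: changes are counted as distinct days, not commits.
            "change_days": len({day for day, _ in entries}),
            "contributors": len({author for _, author in entries}),
            "first_contributor": entries[0][1],
            "latest_contributor": entries[-1][1],
        }
    return summary


# Made-up commit records for illustration only (not taken from the repository).
example = [
    (date(2024, 1, 10), "alice", "a.py"),
    (date(2024, 1, 10), "bob", "a.py"),
    (date(2024, 3, 5), "bob", "a.py"),
    (date(2024, 2, 1), "alice", "b.py"),
]
history = summarize_history(example)
```

Under this counting, two commits by different authors on the same day give one change day and two contributors, which is the pattern many rows in the table show.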
Files Not Recently Changed (Top 20)
File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
filter_openhermes.py
in prompts/stories
31 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
generate_textbooks.py
in prompts/khanacademy
33 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_science_prompts.py
in prompts/auto_math_text
38 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
42 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
47 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_web_prompts.py
in prompts/web_samples
48 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
main.py
in prompts/khanacademy/khan_dl
66 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
deduplicate_dataset.py
in deduplication
93 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_openstax_prompts.py
in prompts/openstax
190 4 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
llm_swarm_script.py
in generation
195 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_and_classify_clusters.py
in prompts/web_samples
211 3 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
287 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
khan_dl.py
in prompts/khanacademy/khan_dl
321 21 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
1_scraper.ipynb
in prompts/stanford
399 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
25 1 2024-05-31 2024-05-31 1 1 anton@huggingface.co anton@huggingface.co
run_edu_bert.py
in classification
64 1 2024-05-31 2024-07-16 2 1 anton@huggingface.co anton@huggingface.co
decontaminate.py
in decontamination
99 6 2024-02-21 2024-07-16 3 2 anton@huggingface.co anton@huggingface.co
search_sharded.py
in fulltext_search
102 3 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
index_docs.py
in fulltext_search
117 2 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
train_edu_bert.py
in classification
128 2 2024-05-31 2024-07-16 3 1 anton@huggingface.co anton@huggingface.co
Most Recently Created Files (Top 20)
File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
index_docs.py
in fulltext_search
117 2 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
search_sharded.py
in fulltext_search
102 3 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
train_edu_bert.py
in classification
128 2 2024-05-31 2024-07-16 3 1 anton@huggingface.co anton@huggingface.co
run_edu_bert.py
in classification
64 1 2024-05-31 2024-07-16 2 1 anton@huggingface.co anton@huggingface.co
25 1 2024-05-31 2024-05-31 1 1 anton@huggingface.co anton@huggingface.co
decontaminate.py
in decontamination
99 6 2024-02-21 2024-07-16 3 2 anton@huggingface.co anton@huggingface.co
1_scraper.ipynb
in prompts/stanford
399 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
khan_dl.py
in prompts/khanacademy/khan_dl
321 21 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
287 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_and_classify_clusters.py
in prompts/web_samples
211 3 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
llm_swarm_script.py
in generation
195 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_openstax_prompts.py
in prompts/openstax
190 4 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
deduplicate_dataset.py
in deduplication
93 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
main.py
in prompts/khanacademy/khan_dl
66 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_web_prompts.py
in prompts/web_samples
48 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
47 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
42 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_science_prompts.py
in prompts/auto_math_text
38 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
generate_textbooks.py
in prompts/khanacademy
33 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_openhermes.py
in prompts/stories
31 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
Most Recently Changed Files (Top 20)
File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
train_edu_bert.py
in classification
128 2 2024-05-31 2024-07-16 3 1 anton@huggingface.co anton@huggingface.co
index_docs.py
in fulltext_search
117 2 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
search_sharded.py
in fulltext_search
102 3 2024-07-16 2024-07-16 1 1 anton@huggingface.co anton@huggingface.co
decontaminate.py
in decontamination
99 6 2024-02-21 2024-07-16 3 2 anton@huggingface.co anton@huggingface.co
run_edu_bert.py
in classification
64 1 2024-05-31 2024-07-16 2 1 anton@huggingface.co anton@huggingface.co
25 1 2024-05-31 2024-05-31 1 1 anton@huggingface.co anton@huggingface.co
1_scraper.ipynb
in prompts/stanford
399 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
khan_dl.py
in prompts/khanacademy/khan_dl
321 21 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
287 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_and_classify_clusters.py
in prompts/web_samples
211 3 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
llm_swarm_script.py
in generation
195 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_openstax_prompts.py
in prompts/openstax
190 4 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
deduplicate_dataset.py
in deduplication
93 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
main.py
in prompts/khanacademy/khan_dl
66 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_web_prompts.py
in prompts/web_samples
48 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
47 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
42 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
build_science_prompts.py
in prompts/auto_math_text
38 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
generate_textbooks.py
in prompts/khanacademy
33 - 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com
filter_openhermes.py
in prompts/stories
31 2 2024-02-20 2024-02-20 1 2 44069155+loubnabnl@users.no... loubnabenallal1999@gmail.com