huggingface / datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Main code: 20K lines, 127 files
Test code: 23K lines, 76 files
Other code: 8.6K lines, 16 files
Age: 5y (1,904 days)
Main code touched, 1 year: 90% (18K LOC)
New main code, 1 year: 2% (423 LOC)
py: 20K, yaml: 0.1K, toml: 0.02K

Per-year chart (two series; series labels not preserved):
Year:      2025  2024  2023  2022  2021  2020
Series 1:   105   294   413   950  1107  1177
Series 2:    31    60    89   157   228   209

generated by sokrates.dev (configuration) on 2025-06-30