uber / petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.
Main code: 4.6K lines in 56 files
Test code: 5.7K lines in 52 files
Other code: 1.3K lines in 37 files
Age: 8y (2,831 days)
Main code touched, 1 year: 21% (982 LOC)
New main code, 1 year: 0% (0 LOC)
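The percentages and the rounded age above can be cross-checked with a few lines of arithmetic. The sketch below assumes each percentage is the one-year LOC count divided by the main-code total, which the report does not state. It also assumes the rounded "4.6K" is close enough to 4,600 lines for a whole-number percentage. The constant and function names are illustrative only.

```python
# Figures copied from the summary; 4.6K is a rounded value,
# so 4600 is an approximation (assumption).
MAIN_LOC = 4600
TOUCHED_LOC_1Y = 982
NEW_LOC_1Y = 0
AGE_DAYS = 2831


def percent(part: int, whole: int) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


touched_pct = percent(TOUCHED_LOC_1Y, MAIN_LOC)
new_pct = percent(NEW_LOC_1Y, MAIN_LOC)
age_years = round(AGE_DAYS / 365.25)  # average year length, leap years included

print(f"main code touched: {touched_pct}%")
print(f"new main code: {new_pct}%")
print(f"age: {age_years}y")
```

Under these assumptions the computed values agree with the reported 21%, 0%, and 8y.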
Main code by extension: py 4.6K, cfg 0.01K

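Per-year chart values (three series; the source gives no series names):

Year      2026  2025  2023  2022  2021  2020  2019  2018
Series 1     1    26     6    54   102   380   457  1035
Series 2     1     9     3    24    51   164   139   303
Series 3     1     4     3     8    10    19    12    14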

generated by sokrates.dev (configuration) on 2026-04-18