Uber / petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
GitHub Repo
  • Main code: 4.5K lines (57 files)
  • Test code: 5.3K lines (52 files)
  • Other code: 1.3K lines (37 files)
  • Main code touched in the past year: 0% (0 LOC)
  • New main code in the past year: 0% (0 LOC)
  • Recent contributors (past 30 days): 0
  • Age: 6y (2,086 days)
  • Main code by file extension: py 4.5K, yml 9, cfg 2
  • Tooling: GitHub Actions, make, Docker

Main Code: 4,510 LOC (57 files) = PY (99%) + YML (<1%) + CFG (<1%)
Secondary Code: Test: 5,259 LOC (52 files); Generated: 0 LOC (0 files); Build & Deploy: 5 LOC (1 file); Other: 1,260 LOC (36 files)
Duplication: 4%
File Size: 0% long (>1000 LOC), 62% short (<= 200 LOC)
Unit Size: 0% long (>100 LOC), 62% short (<= 10 LOC)
Conditional Complexity: 0% complex (McCabe index > 50), 52% simple (McCabe index <= 5)
Logical Component Decomposition: primary (2 components)
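
To make the Conditional Complexity thresholds above concrete, here is a minimal sketch of a McCabe-style index for a Python snippet: one plus the number of decision points. This is an illustration under stated assumptions, not the calculation the report generator actually performs (the source does not describe it). The names `mccabe_index`, `bucket`, and `SAMPLE` are hypothetical, and the label for the middle range is invented because the report names only the "complex" and "simple" ends.

```python
import ast

# Node types counted as decision points in this simplified sketch.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def mccabe_index(source: str) -> int:
    """Approximate McCabe index: 1 + number of decision points in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' holds three values, i.e. two extra branches.
            complexity += len(node.values) - 1
    return complexity


def bucket(index: int) -> str:
    """Classify with the two thresholds quoted in the report."""
    if index > 50:
        return "complex"
    if index <= 5:
        return "simple"
    return "in between"  # hypothetical label; the report does not name this range


# Hypothetical sample function: one loop, one 'if', one 'and'.
SAMPLE = '''
def pick(rows, limit):
    out = []
    for row in rows:
        if row and len(out) < limit:
            out.append(row)
    return out
'''

print(mccabe_index(SAMPLE), bucket(mccabe_index(SAMPLE)))
```

Under this counting, a unit stays in the "simple" range as long as it contains at most four decision points.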

5 years, 8 months old

  • 100% of code older than 365 days
  • 100% of code not updated in the past 365 days
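
The age figures in this report (2,086 days, shown rounded as 6y, and "5 years, 8 months old") can be checked against each other with a little arithmetic. The sketch below assumes 365-day years and 30-day months for the breakdown, which is an approximation chosen for illustration and not necessarily how the report generator derives its figures; the helper name `approx_years_months` is hypothetical.

```python
from typing import Tuple

AGE_DAYS = 2086  # "2,086 days" as stated in the report


def approx_years_months(days: int) -> Tuple[int, int]:
    """Split a day count into whole years and months (365-day years, 30-day months)."""
    years, rest = divmod(days, 365)
    return years, rest // 30


years, months = approx_years_months(AGE_DAYS)
rounded_years = round(AGE_DAYS / 365.25)  # nearest whole year

print(f"{years} years, {months} months (about {rounded_years}y)")
```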

8% of code updated more than 50 times

Also see temporal dependencies for files frequently changed in the same commits.

Goals: Keep the system simple and easy to change (4)
Features of interest: TODOs (3 files)

generated by sokrates.dev (configuration) on 2024-04-03