Petastorm
File Age

File age measurements show the distribution of file ages (days since the first commit) and the recency of file updates (days since the latest commit).
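As an illustration of the two measures, here is a minimal sketch that derives both numbers from a file's commit dates. The function name, the sample commit history, and the reference date are hypothetical and are not taken from the tool that produced this report.

```python
from datetime import date


def file_age_and_recency(commit_dates, today):
    """Return (days since first commit, days since latest commit)."""
    first, latest = min(commit_dates), max(commit_dates)
    return (today - first).days, (today - latest).days


# Hypothetical commit history of a single file (not real repository data).
history = [date(2018, 7, 24), date(2019, 3, 1), date(2020, 7, 22)]

age, recency = file_age_and_recency(history, today=date(2020, 9, 11))
print(f"age: {age} days, last changed: {recency} days ago")
```

A file with a single commit has identical age and recency, which is why many rows in the file tables show the same value in both "days ago" columns.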

Summary
  • Number of files: 84
  • Daily file updates (only one update per file and date counted): 331
  • First update: 2018-07-18
  • Latest update: 2020-09-10
  • Days between first and latest update: 786 (112 weeks, estimated 560 working days)
  • Active days (at least one file change): 233
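The summary figures can be approximated with a few lines of date arithmetic. This is a sketch under stated assumptions: counting both endpoint days and using five working days per full week are guesses that happen to reproduce the reported 786, 112 and 560; the change log below is hypothetical and only illustrates how one update per file and date is counted.

```python
from datetime import date

first, latest = date(2018, 7, 18), date(2020, 9, 10)

# Assumption: both endpoint days are counted, which reproduces the reported 786.
span_days = (latest - first).days + 1
weeks = span_days // 7          # full weeks only
working_days = weeks * 5        # assumption: five working days per full week

# Hypothetical change log of (file, commit date) pairs.
changes = [
    ("petastorm/reader.py", date(2020, 7, 22)),
    ("petastorm/reader.py", date(2020, 7, 22)),  # same file, same day: counted once
    ("petastorm/codecs.py", date(2020, 7, 22)),
    ("petastorm/reader.py", date(2020, 9, 8)),
]

# One update per (file, date) pair, and one active day per distinct date.
daily_file_updates = len(set(changes))
active_days = len({day for _, day in changes})

print(span_days, weeks, working_days, daily_file_updates, active_days)
```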
File Change History Overall
File Age Distribution Overall
Days since first update
  • There are 84 files with 5,134 lines of code.
    • 73 files are older than 1 year (4,285 lines of code)
    • 5 files are 180 days to 1 year old (480 lines of code)
    • 6 files are 90 to 180 days old (369 lines of code)
    • 0 files are 30 to 90 days old (0 lines of code)
    • 0 files are less than 30 days old (0 lines of code)
Share of lines of code per age bucket:
> 1y | 6-12m | 91-180d | 31-90d | 1-30d
83% | 9% | 7% | 0% | 0%
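The percentages follow from the lines-of-code totals listed per age bucket. The sketch below assigns a file to a bucket and recomputes the shares. The bucket boundaries are assumptions read off the legend labels, and the percentages are truncated rather than rounded because that is what matches the numbers in this report.

```python
def age_bucket(days):
    """Map an age in days to a bucket label (boundaries assumed from the legend)."""
    if days > 365:
        return "> 1y"
    if days > 180:
        return "6-12m"
    if days > 90:
        return "91-180d"
    if days > 30:
        return "31-90d"
    return "1-30d"


# Lines of code per bucket, copied from the list above.
loc_by_bucket = {"> 1y": 4285, "6-12m": 480, "91-180d": 369, "31-90d": 0, "1-30d": 0}
total = sum(loc_by_bucket.values())

# Integer division truncates, e.g. 83.46% becomes 83%.
shares = {bucket: 100 * loc // total for bucket, loc in loc_by_bucket.items()}

print(total)
print(" | ".join(f"{pct}%" for pct in shares.values()))
```

Note that the shares are based on lines of code, not on file counts: 73 of 84 files would be about 87%, while 4,285 of 5,134 lines gives the reported 83%.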
Latest Change Distribution Overall
Days since last update
  • There are 84 files with 5,134 lines of code.
    • 51 files were last changed more than 1 year ago (1,527 lines of code)
    • 5 files were last changed 180 days to 1 year ago (394 lines of code)
    • 14 files were last changed 90 to 180 days ago (1,062 lines of code)
    • 10 files were last changed 30 to 90 days ago (1,707 lines of code)
    • 4 files were last changed less than 30 days ago (444 lines of code)
Share of lines of code per latest-change bucket:
> 1y | 6-12m | 91-180d | 31-90d | 1-30d
29% | 7% | 20% | 33% | 8%
File Change History per File Extension
py, crc, parquet
File Age Distribution per Extension
Days since first update
Extension | > 1y | 6-12m | 91-180d | 31-90d | 1-30d
py | 83% | 9% | 7% | 0% | 0%
Latest Change Distribution per Extension
Days since last update
Extension | > 1y | 6-12m | 91-180d | 31-90d | 1-30d
py | 29% | 7% | 20% | 33% | 8%
File Change History per Logical Decomposition
primary
primary (file age distribution)
Days since first update
Component | > 1y | 6-12m | 91-180d | 31-90d | 1-30d
petastorm | 97% | 2% | 0% | 0% | 0%
petastorm/etl | 100% | 0% | 0% | 0% | 0%
petastorm/workers_pool | 100% | 0% | 0% | 0% | 0%
examples/mnist | 100% | 0% | 0% | 0% | 0%
petastorm/benchmark | 77% | 0% | 22% | 0% | 0%
petastorm/hdfs | 100% | 0% | 0% | 0% | 0%
examples/hello_world | 100% | 0% | 0% | 0% | 0%
petastorm/reader_impl | 47% | 0% | 52% | 0% | 0%
petastorm/tools | 100% | 0% | 0% | 0% | 0%
ROOT | 100% | 0% | 0% | 0% | 0%
examples/imagenet | 100% | 0% | 0% | 0% | 0%
petastorm/pyarrow_helpers | 100% | 0% | 0% | 0% | 0%
examples | 100% | 0% | 0% | 0% | 0%
petastorm/spark | 0% | 100% | 0% | 0% | 0%
petastorm/gcsfs_helpers | 0% | 100% | 0% | 0% | 0%
examples/spark_dataset_converter | 0% | 0% | 100% | 0% | 0%
primary (latest change distribution)
Days since last update
Component | > 1y | 6-12m | 91-180d | 31-90d | 1-30d
petastorm | 13% | 4% | 22% | 51% | 7%
petastorm/workers_pool | 57% | 42% | 0% | 0% | 0%
petastorm/etl | 48% | 13% | 0% | 0% | 38%
examples/mnist | 71% | 0% | 28% | 0% | 0%
petastorm/reader_impl | 47% | 0% | 52% | 0% | 0%
examples/hello_world | 73% | 0% | 26% | 0% | 0%
petastorm/tools | 100% | 0% | 0% | 0% | 0%
examples/imagenet | 100% | 0% | 0% | 0% | 0%
petastorm/benchmark | 27% | 0% | 72% | 0% | 0%
petastorm/pyarrow_helpers | 100% | 0% | 0% | 0% | 0%
petastorm/hdfs | <1% | 0% | 99% | 0% | 0%
examples | 100% | 0% | 0% | 0% | 0%
petastorm/spark | 0% | <1% | 0% | 99% | 0%
petastorm/gcsfs_helpers | 0% | 2% | 0% | 98% | 0%
examples/spark_dataset_converter | 0% | 0% | <1% | 99% | 0%
ROOT | 0% | 0% | 0% | 0% | 100%
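The per-component rows can be reproduced by grouping files by component and summing lines of code per age bucket. The sketch below does this for two components, using the line counts and creation ages that the file tables in this report list for them. It assumes those tables show every file of the two components; the bucket boundaries and the truncation of percentages are also assumptions.

```python
from collections import defaultdict

BUCKETS = ["> 1y", "6-12m", "91-180d", "31-90d", "1-30d"]


def age_bucket(days):
    """Map an age in days to a bucket label (boundaries assumed from the legend)."""
    if days > 365:
        return "> 1y"
    if days > 180:
        return "6-12m"
    if days > 90:
        return "91-180d"
    if days > 30:
        return "31-90d"
    return "1-30d"


# (component, lines of code, days since first commit), as listed in the file tables.
files = [
    ("petastorm/reader_impl", 125, 148),  # pytorch_shuffling_buffer.py
    ("petastorm/reader_impl", 75, 757),   # shuffling_buffer.py
    ("petastorm/reader_impl", 1, 757),    # __init__.py
    ("petastorm/reader_impl", 19, 717),   # pyarrow_serializer.py
    ("petastorm/reader_impl", 6, 711),    # pickle_serializer.py
    ("petastorm/reader_impl", 12, 648),   # arrow_table_serializer.py
    ("petastorm/spark", 380, 206),        # spark_dataset_converter.py
    ("petastorm/spark", 2, 206),          # __init__.py
]

# Sum lines of code per component and age bucket.
loc = defaultdict(lambda: dict.fromkeys(BUCKETS, 0))
for component, lines, age in files:
    loc[component][age_bucket(age)] += lines


def shares(component):
    """Truncated percentage of a component's lines of code in each bucket."""
    total = sum(loc[component].values())
    return [100 * loc[component][bucket] // total for bucket in BUCKETS]


for component in loc:
    print(component, " | ".join(f"{pct}%" for pct in shares(component)))
```

Under these assumptions the result for petastorm/reader_impl is 47% older than one year and 52% in the 91-180d bucket, which agrees with the file age row for that component.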
Oldest Files (Top 50)
File | Directory | # lines | # units | Last modified (days ago) | Created (days ago) | # changes
setup.py | ROOT | 86 | - | 19 | 787 | 37
reader.py | petastorm | 325 | 20 | 50 | 781 | 59
unischema.py | petastorm | 265 | 23 | 47 | 781 | 32
process_pool.py | petastorm/workers_pool | 213 | 14 | 225 | 781 | 19
dataset_metadata.py | petastorm/etl | 200 | 12 | 2 | 781 | 31
tf_utils.py | petastorm | 177 | 17 | 143 | 781 | 16
namenode.py | petastorm/hdfs | 162 | 16 | 103 | 781 | 11
codecs.py | petastorm | 154 | 24 | 2 | 781 | 18
thread_pool.py | petastorm/workers_pool | 126 | 11 | 657 | 781 | 8
fs_utils.py | petastorm | 115 | 9 | 152 | 781 | 14
predicates.py | petastorm | 93 | 21 | 454 | 781 | 5
ventilator.py | petastorm/workers_pool | 81 | 12 | 529 | 781 | 7
rowgroup_indexing.py | petastorm/etl | 73 | 4 | 338 | 781 | 13
rowgroup_indexers.py | petastorm/etl | 73 | 14 | 757 | 781 | 4
utils.py | petastorm | 62 | 3 | 153 | 781 | 12
selectors.py | petastorm | 46 | 11 | 496 | 781 | 3
metadata_util.py | petastorm/etl | 45 | - | 604 | 781 | 5
dummy_pool.py | petastorm/workers_pool | 38 | 7 | 707 | 781 | 5
exec_in_new_process.py | petastorm/workers_pool | 31 | 1 | 596 | 781 | 5
local_disk_cache.py | petastorm | 26 | 3 | 759 | 781 | 3
__init__.py | petastorm/etl | 21 | 6 | 657 | 781 | 3
spark_utils.py | petastorm | 18 | 1 | 604 | 781 | 5
generator.py | petastorm | 15 | 1 | 744 | 781 | 3
worker_base.py | petastorm/workers_pool | 11 | 3 | 781 | 781 | 1
cache.py | petastorm | 9 | 2 | 657 | 781 | 4
__init__.py | petastorm | 4 | - | 17 | 781 | 62
__init__.py | petastorm/workers_pool | 3 | - | 657 | 781 | 3
__init__.py | petastorm/hdfs | 1 | - | 781 | 781 | 1
generate_petastorm_imagenet.py | examples/imagenet | 67 | 3 | 744 | 775 | 5
schema.py | examples/imagenet | 9 | - | 768 | 775 | 2
__init__.py | examples | 1 | - | 775 | 775 | 1
__init__.py | examples/imagenet | 1 | - | 775 | 775 | 1
legacy.py | petastorm/etl | 16 | 1 | 759 | 765 | 2
__init__.py | examples/hello_world | 1 | - | 765 | 765 | 1
generate_petastorm_mnist.py | examples/mnist | 63 | 3 | 449 | 764 | 7
schema.py | examples/mnist | 9 | - | 758 | 764 | 3
__init__.py | examples/mnist | 1 | - | 739 | 764 | 2
petastorm_generate_metadata.py | petastorm/etl | 98 | 3 | 463 | 761 | 13
pytorch.py | petastorm | 171 | 14 | 64 | 758 | 16
ngram.py | petastorm | 105 | 15 | 339 | 757 | 9
shuffling_buffer.py | petastorm/reader_impl | 75 | 20 | 401 | 757 | 6
__init__.py | petastorm/reader_impl | 1 | - | 757 | 757 | 1
weighted_sampling_reader.py | petastorm | 44 | 8 | 123 | 731 | 7
throughput.py | petastorm/benchmark | 126 | 8 | 150 | 723 | 8
cli.py | petastorm/benchmark | 67 | 3 | 467 | 723 | 3
__init__.py | petastorm/benchmark | 1 | - | 723 | 723 | 1
pytorch_example.py | examples/mnist | 115 | 6 | 596 | 721 | 5
copy_dataset.py | petastorm/tools | 77 | 4 | 463 | 718 | 4
spark_session_cli.py | petastorm/tools | 28 | 3 | 718 | 718 | 1
__init__.py | petastorm/tools | 1 | - | 718 | 718 | 1
Files Not Recently Changed (Top 50)
File | Directory | # lines | # units | Last modified (days ago) | Created (days ago) | # changes
__init__.py | petastorm/hdfs | 1 | - | 781 | 781 | 1
worker_base.py | petastorm/workers_pool | 11 | 3 | 781 | 781 | 1
__init__.py | examples/imagenet | 1 | - | 775 | 775 | 1
__init__.py | examples | 1 | - | 775 | 775 | 1
schema.py | examples/imagenet | 9 | - | 768 | 775 | 2
__init__.py | examples/hello_world | 1 | - | 765 | 765 | 1
legacy.py | petastorm/etl | 16 | 1 | 759 | 765 | 2
local_disk_cache.py | petastorm | 26 | 3 | 759 | 781 | 3
schema.py | examples/mnist | 9 | - | 758 | 764 | 3
__init__.py | petastorm/reader_impl | 1 | - | 757 | 757 | 1
rowgroup_indexers.py | petastorm/etl | 73 | 14 | 757 | 781 | 4
generator.py | petastorm | 15 | 1 | 744 | 781 | 3
generate_petastorm_imagenet.py | examples/imagenet | 67 | 3 | 744 | 775 | 5
__init__.py | examples/mnist | 1 | - | 739 | 764 | 2
__init__.py | petastorm/benchmark | 1 | - | 723 | 723 | 1
__init__.py | petastorm/tools | 1 | - | 718 | 718 | 1
spark_session_cli.py | petastorm/tools | 28 | 3 | 718 | 718 | 1
pyarrow_serializer.py | petastorm/reader_impl | 19 | 4 | 717 | 717 | 1
pickle_serializer.py | petastorm/reader_impl | 6 | 2 | 711 | 711 | 1
dummy_pool.py | petastorm/workers_pool | 38 | 7 | 707 | 781 | 5
__init__.py | petastorm/pyarrow_helpers | 1 | - | 672 | 672 | 1
batching_table_queue.py | petastorm/pyarrow_helpers | 30 | 4 | 672 | 672 | 1
__init__.py | petastorm/workers_pool | 3 | - | 657 | 781 | 3
cache.py | petastorm | 9 | 2 | 657 | 781 | 4
__init__.py | petastorm/etl | 21 | 6 | 657 | 781 | 3
thread_pool.py | petastorm/workers_pool | 126 | 11 | 657 | 781 | 8
arrow_table_serializer.py | petastorm/reader_impl | 12 | 2 | 648 | 648 | 1
spark_utils.py | petastorm | 18 | 1 | 604 | 781 | 5
metadata_util.py | petastorm/etl | 45 | - | 604 | 781 | 5
__init__.py | examples/hello_world/external_dataset | 1 | - | 603 | 603 | 1
__init__.py | examples/hello_world/petastorm_dataset | 1 | - | 603 | 603 | 1
python_hello_world.py | examples/hello_world/petastorm_dataset | 8 | 1 | 603 | 603 | 1
pytorch_hello_world.py | examples/hello_world/external_dataset | 9 | 1 | 603 | 603 | 1
python_hello_world.py | examples/hello_world/external_dataset | 9 | 1 | 603 | 603 | 1
pytorch_hello_world.py | examples/hello_world/petastorm_dataset | 9 | 1 | 603 | 603 | 1
pyspark_hello_world.py | examples/hello_world/petastorm_dataset | 21 | 1 | 603 | 603 | 1
generate_petastorm_dataset.py | examples/hello_world/petastorm_dataset | 31 | 2 | 603 | 603 | 1
exec_in_new_process.py | petastorm/workers_pool | 31 | 1 | 596 | 781 | 5
pytorch_example.py | examples/mnist | 115 | 6 | 596 | 721 | 5
namedtuple_gt_255_fields.py | petastorm | 65 | 2 | 535 | 535 | 1
ventilator.py | petastorm/workers_pool | 81 | 12 | 529 | 781 | 7
errors.py | petastorm | 1 | - | 526 | 526 | 1
local_disk_arrow_table_cache.py | petastorm | 18 | 2 | 497 | 648 | 2
selectors.py | petastorm | 46 | 11 | 496 | 781 | 3
cli.py | petastorm/benchmark | 67 | 3 | 467 | 723 | 3
copy_dataset.py | petastorm/tools | 77 | 4 | 463 | 718 | 4
petastorm_generate_metadata.py | petastorm/etl | 98 | 3 | 463 | 761 | 13
predicates.py | petastorm | 93 | 21 | 454 | 781 | 5
generate_petastorm_mnist.py | examples/mnist | 63 | 3 | 449 | 764 | 7
generate_external_dataset.py | examples/hello_world/external_dataset | 18 | 2 | 443 | 603 | 2
Most Recently Created Files (Top 50)
File | Directory | # lines | # units | Last modified (days ago) | Created (days ago) | # changes
pytorch_shuffling_buffer.py | petastorm/reader_impl | 125 | 23 | 148 | 148 | 1
dummy_reader.py | petastorm/benchmark | 56 | 6 | 148 | 148 | 1
pytorch_converter_example.py | examples/spark_dataset_converter | 109 | 6 | 64 | 162 | 2
tensorflow_converter_example.py | examples/spark_dataset_converter | 62 | 4 | 64 | 162 | 3
utils.py | examples/spark_dataset_converter | 16 | 2 | 64 | 162 | 2
__init__.py | examples/spark_dataset_converter | 1 | - | 162 | 162 | 1
spark_dataset_converter.py | petastorm/spark | 380 | 36 | 51 | 206 | 23
__init__.py | petastorm/spark | 2 | - | 186 | 206 | 3
gcsfs_wrapper.py | petastorm/gcsfs_helpers | 49 | 3 | 51 | 234 | 3
__init__.py | petastorm/gcsfs_helpers | 1 | - | 234 | 234 | 1
compat.py | petastorm | 48 | 7 | 153 | 338 | 4
errors.py | petastorm | 1 | - | 526 | 526 | 1
namedtuple_gt_255_fields.py | petastorm | 65 | 2 | 535 | 535 | 1
generate_petastorm_dataset.py | examples/hello_world/petastorm_dataset | 31 | 2 | 603 | 603 | 1
pyspark_hello_world.py | examples/hello_world/petastorm_dataset | 21 | 1 | 603 | 603 | 1
tensorflow_hello_world.py | examples/hello_world/petastorm_dataset | 19 | 1 | 150 | 603 | 2
tensorflow_hello_world.py | examples/hello_world/external_dataset | 19 | 1 | 150 | 603 | 2
generate_external_dataset.py | examples/hello_world/external_dataset | 18 | 2 | 443 | 603 | 2
pytorch_hello_world.py | examples/hello_world/petastorm_dataset | 9 | 1 | 603 | 603 | 1
python_hello_world.py | examples/hello_world/external_dataset | 9 | 1 | 603 | 603 | 1
pytorch_hello_world.py | examples/hello_world/external_dataset | 9 | 1 | 603 | 603 | 1
python_hello_world.py | examples/hello_world/petastorm_dataset | 8 | 1 | 603 | 603 | 1
__init__.py | examples/hello_world/petastorm_dataset | 1 | - | 603 | 603 | 1
__init__.py | examples/hello_world/external_dataset | 1 | - | 603 | 603 | 1
transform.py | petastorm | 32 | 3 | 163 | 624 | 8
arrow_reader_worker.py | petastorm | 183 | 10 | 50 | 648 | 16
py_dict_reader_worker.py | petastorm | 147 | 12 | 50 | 648 | 10
local_disk_arrow_table_cache.py | petastorm | 18 | 2 | 497 | 648 | 2
arrow_table_serializer.py | petastorm/reader_impl | 12 | 2 | 648 | 648 | 1
batching_table_queue.py | petastorm/pyarrow_helpers | 30 | 4 | 672 | 672 | 1
__init__.py | petastorm/pyarrow_helpers | 1 | - | 672 | 672 | 1
tf_example.py | examples/mnist | 76 | 2 | 150 | 705 | 2
pickle_serializer.py | petastorm/reader_impl | 6 | 2 | 711 | 711 | 1
pyarrow_serializer.py | petastorm/reader_impl | 19 | 4 | 717 | 717 | 1
copy_dataset.py | petastorm/tools | 77 | 4 | 463 | 718 | 4
spark_session_cli.py | petastorm/tools | 28 | 3 | 718 | 718 | 1
__init__.py | petastorm/tools | 1 | - | 718 | 718 | 1
pytorch_example.py | examples/mnist | 115 | 6 | 596 | 721 | 5
throughput.py | petastorm/benchmark | 126 | 8 | 150 | 723 | 8
cli.py | petastorm/benchmark | 67 | 3 | 467 | 723 | 3
__init__.py | petastorm/benchmark | 1 | - | 723 | 723 | 1
weighted_sampling_reader.py | petastorm | 44 | 8 | 123 | 731 | 7
ngram.py | petastorm | 105 | 15 | 339 | 757 | 9
shuffling_buffer.py | petastorm/reader_impl | 75 | 20 | 401 | 757 | 6
__init__.py | petastorm/reader_impl | 1 | - | 757 | 757 | 1
pytorch.py | petastorm | 171 | 14 | 64 | 758 | 16
petastorm_generate_metadata.py | petastorm/etl | 98 | 3 | 463 | 761 | 13
generate_petastorm_mnist.py | examples/mnist | 63 | 3 | 449 | 764 | 7
schema.py | examples/mnist | 9 | - | 758 | 764 | 3
__init__.py | examples/mnist | 1 | - | 739 | 764 | 2
Most Recently Changed Files (Top 50)
File | Directory | # lines | # units | Last modified (days ago) | Created (days ago) | # changes
dataset_metadata.py | petastorm/etl | 200 | 12 | 2 | 781 | 31
codecs.py | petastorm | 154 | 24 | 2 | 781 | 18
__init__.py | petastorm | 4 | - | 17 | 781 | 62
setup.py | ROOT | 86 | - | 19 | 787 | 37
unischema.py | petastorm | 265 | 23 | 47 | 781 | 32
reader.py | petastorm | 325 | 20 | 50 | 781 | 59
arrow_reader_worker.py | petastorm | 183 | 10 | 50 | 648 | 16
py_dict_reader_worker.py | petastorm | 147 | 12 | 50 | 648 | 10
spark_dataset_converter.py | petastorm/spark | 380 | 36 | 51 | 206 | 23
gcsfs_wrapper.py | petastorm/gcsfs_helpers | 49 | 3 | 51 | 234 | 3
pytorch.py | petastorm | 171 | 14 | 64 | 758 | 16
pytorch_converter_example.py | examples/spark_dataset_converter | 109 | 6 | 64 | 162 | 2
tensorflow_converter_example.py | examples/spark_dataset_converter | 62 | 4 | 64 | 162 | 3
utils.py | examples/spark_dataset_converter | 16 | 2 | 64 | 162 | 2
namenode.py | petastorm/hdfs | 162 | 16 | 103 | 781 | 11
weighted_sampling_reader.py | petastorm | 44 | 8 | 123 | 731 | 7
tf_utils.py | petastorm | 177 | 17 | 143 | 781 | 16
pytorch_shuffling_buffer.py | petastorm/reader_impl | 125 | 23 | 148 | 148 | 1
dummy_reader.py | petastorm/benchmark | 56 | 6 | 148 | 148 | 1
throughput.py | petastorm/benchmark | 126 | 8 | 150 | 723 | 8
tf_example.py | examples/mnist | 76 | 2 | 150 | 705 | 2
tensorflow_hello_world.py | examples/hello_world/petastorm_dataset | 19 | 1 | 150 | 603 | 2
tensorflow_hello_world.py | examples/hello_world/external_dataset | 19 | 1 | 150 | 603 | 2
fs_utils.py | petastorm | 115 | 9 | 152 | 781 | 14
utils.py | petastorm | 62 | 3 | 153 | 781 | 12
compat.py | petastorm | 48 | 7 | 153 | 338 | 4
__init__.py | examples/spark_dataset_converter | 1 | - | 162 | 162 | 1
transform.py | petastorm | 32 | 3 | 163 | 624 | 8
__init__.py | petastorm/spark | 2 | - | 186 | 206 | 3
process_pool.py | petastorm/workers_pool | 213 | 14 | 225 | 781 | 19
__init__.py | petastorm/gcsfs_helpers | 1 | - | 234 | 234 | 1
rowgroup_indexing.py | petastorm/etl | 73 | 4 | 338 | 781 | 13
ngram.py | petastorm | 105 | 15 | 339 | 757 | 9
shuffling_buffer.py | petastorm/reader_impl | 75 | 20 | 401 | 757 | 6
generate_external_dataset.py | examples/hello_world/external_dataset | 18 | 2 | 443 | 603 | 2
generate_petastorm_mnist.py | examples/mnist | 63 | 3 | 449 | 764 | 7
predicates.py | petastorm | 93 | 21 | 454 | 781 | 5
petastorm_generate_metadata.py | petastorm/etl | 98 | 3 | 463 | 761 | 13
copy_dataset.py | petastorm/tools | 77 | 4 | 463 | 718 | 4
cli.py | petastorm/benchmark | 67 | 3 | 467 | 723 | 3
selectors.py | petastorm | 46 | 11 | 496 | 781 | 3
local_disk_arrow_table_cache.py | petastorm | 18 | 2 | 497 | 648 | 2
errors.py | petastorm | 1 | - | 526 | 526 | 1
ventilator.py | petastorm/workers_pool | 81 | 12 | 529 | 781 | 7
namedtuple_gt_255_fields.py | petastorm | 65 | 2 | 535 | 535 | 1
pytorch_example.py | examples/mnist | 115 | 6 | 596 | 721 | 5
exec_in_new_process.py | petastorm/workers_pool | 31 | 1 | 596 | 781 | 5
generate_petastorm_dataset.py | examples/hello_world/petastorm_dataset | 31 | 2 | 603 | 603 | 1
pyspark_hello_world.py | examples/hello_world/petastorm_dataset | 21 | 1 | 603 | 603 | 1
pytorch_hello_world.py | examples/hello_world/petastorm_dataset | 9 | 1 | 603 | 603 | 1