Petastorm
Temporal Dependencies

A temporal dependency exists between two or more files when developers change them together, i.e. in the same commit.
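
As a minimal sketch of this definition, the snippet below counts how often each pair of files appears in the same commit. The commit list is hypothetical illustration data (not taken from the Petastorm history), and `count_co_changes` is an illustrative name, not a function of any analysis tool.

```python
from collections import Counter
from itertools import combinations


def count_co_changes(commits):
    """Count, for every unordered file pair, the commits that changed both files."""
    pair_counts = Counter()
    for files in commits:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical commits: each inner list holds the files changed by one commit.
example_commits = [
    ["petastorm/reader.py", "petastorm/etl/dataset_metadata.py"],
    ["petastorm/reader.py", "petastorm/etl/dataset_metadata.py", "setup.py"],
    ["setup.py"],
]

co_changes = count_co_changes(example_commits)
```

A commit that touches a single file contributes no pairs, so only commits with two or more files create temporal dependencies.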

File Change History per Logical Decomposition
Logical decomposition: primary (2+ links)

Component 1 | Component 2 | Links
petastorm | petastorm/etl | 18
petastorm | petastorm/hdfs | 6
petastorm | petastorm/benchmark | 7
petastorm | examples/mnist | 8
petastorm | examples/imagenet | 2
petastorm/etl | petastorm/benchmark | 2
petastorm/etl | examples/imagenet | 2
petastorm/workers_pool | petastorm | 17
petastorm/workers_pool | petastorm/etl | 6
petastorm/workers_pool | petastorm/hdfs | 4
petastorm/workers_pool | petastorm/benchmark | 2
petastorm/workers_pool | examples/mnist | 2
petastorm/workers_pool | examples/imagenet | 2
petastorm/workers_pool | petastorm/reader_impl | 4
ROOT | petastorm | 17
ROOT | petastorm/etl | 9
ROOT | petastorm/workers_pool | 6
ROOT | petastorm/hdfs | 5
ROOT | petastorm/benchmark | 5
ROOT | examples/mnist | 5
ROOT | petastorm/spark | 3
ROOT | examples/imagenet | 2
ROOT | petastorm/reader_impl | 2
ROOT | examples/spark_dataset_converter | 2
petastorm/hdfs | petastorm/etl | 6
petastorm/tools | petastorm | 3
petastorm/tools | petastorm/etl | 5
petastorm/benchmark | examples/mnist | 3
petastorm/spark | petastorm | 5
petastorm/spark | examples/spark_dataset_converter | 2
petastorm/reader_impl | petastorm | 5
petastorm/reader_impl | petastorm/etl | 4
petastorm/reader_impl | petastorm/benchmark | 2
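
One way to arrive at component-level counts like these is to map each changed file to a component and count the commits that touch two different components. The sketch below rests on two assumptions that this report does not confirm: a component is simply the file's folder (with `ROOT` for top-level files), and one link corresponds to one shared commit. The commit data and the names `component_of` and `count_component_links` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import posixpath


def component_of(path):
    """Map a file path to its folder; top-level files go to ROOT (assumed rule)."""
    folder = posixpath.dirname(path)
    return folder if folder else "ROOT"


def count_component_links(commits):
    """Count commits that changed files in two different components."""
    links = Counter()
    for files in commits:
        # Each component counts once per commit, however many of its files changed.
        components = sorted({component_of(f) for f in files})
        for pair in combinations(components, 2):
            links[pair] += 1
    return links


# Hypothetical commits, not taken from the Petastorm history.
example_commits = [
    ["setup.py", "petastorm/reader.py"],
    ["petastorm/reader.py", "petastorm/etl/dataset_metadata.py"],
    ["setup.py", "petastorm/unischema.py", "petastorm/utils.py"],
]

component_links = count_component_links(example_commits)
```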

Files Most Frequently Changed Together (Top 20)

File 1 | File 2 | # same commits | # commits of file 1 (% shared) | # commits of file 2 (% shared)
petastorm/reader.py | petastorm/etl/dataset_metadata.py | 12 | 65 (18%) | 33 (36%)
petastorm/etl/rowgroup_indexing.py | petastorm/etl/dataset_metadata.py | 11 | 13 (84%) | 33 (33%)
petastorm/workers_pool/process_pool.py | petastorm/reader.py | 10 | 22 (45%) | 65 (15%)
petastorm/etl/petastorm_generate_metadata.py | petastorm/etl/dataset_metadata.py | 9 | 17 (52%) | 33 (27%)
setup.py | petastorm/etl/dataset_metadata.py | 8 | 50 (16%) | 33 (24%)
setup.py | petastorm/reader.py | 8 | 50 (16%) | 65 (12%)
petastorm/fs_utils.py | petastorm/etl/dataset_metadata.py | 7 | 15 (46%) | 33 (21%)
petastorm/reader.py | petastorm/arrow_reader_worker.py | 7 | 65 (10%) | 21 (33%)
petastorm/reader.py | petastorm/fs_utils.py | 7 | 65 (10%) | 15 (46%)
petastorm/unischema.py | petastorm/reader.py | 7 | 36 (19%) | 65 (10%)
petastorm/workers_pool/thread_pool.py | petastorm/workers_pool/process_pool.py | 7 | 8 (87%) | 22 (31%)
petastorm/reader.py | petastorm/ngram.py | 6 | 65 (9%) | 14 (42%)
petastorm/reader.py | petastorm/__init__.py | 6 | 65 (9%) | 79 (7%)
petastorm/reader.py | petastorm/etl/rowgroup_indexing.py | 6 | 65 (9%) | 13 (46%)
petastorm/tf_utils.py | petastorm/reader.py | 6 | 17 (35%) | 65 (9%)
petastorm/unischema.py | petastorm/etl/dataset_metadata.py | 6 | 36 (16%) | 33 (18%)
petastorm/utils.py | petastorm/unischema.py | 6 | 12 (50%) | 36 (16%)
petastorm/workers_pool/process_pool.py | petastorm/codecs.py | 6 | 22 (27%) | 20 (30%)
petastorm/workers_pool/thread_pool.py | petastorm/reader.py | 6 | 8 (75%) | 65 (9%)
setup.py | petastorm/__init__.py | 6 | 50 (12%) | 79 (7%)
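
The percentage columns can be reproduced from the raw counts. For the first pair, 12 shared commits out of the 65 commits of petastorm/reader.py gives 18%, and 12 out of the 33 commits of petastorm/etl/dataset_metadata.py gives 36%. The figures appear to be truncated rather than rounded (11 of 13 is listed as 84%, not 85%); the sketch below assumes truncation, and `shared_percent` is an illustrative helper name.

```python
def shared_percent(same_commits, file_commits):
    """Share of a file's commits that also changed the other file, as a whole percent.

    Uses integer division, i.e. truncation, which is an assumption inferred
    from the listed figures.
    """
    return (100 * same_commits) // file_commits


reader_share = shared_percent(12, 65)    # petastorm/reader.py
metadata_share = shared_percent(12, 33)  # petastorm/etl/dataset_metadata.py
indexing_share = shared_percent(11, 13)  # petastorm/etl/rowgroup_indexing.py
```

The two percentages of a pair can differ widely: a rarely changed file may almost always change together with a frequently changed one, while the reverse share stays small.
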
Files from Different Folders Most Frequently Changed Together (Top 20)

File 1 | File 2 | # same commits | # commits of file 1 (% shared) | # commits of file 2 (% shared)
petastorm/reader.py | petastorm/etl/dataset_metadata.py | 12 | 65 (18%) | 33 (36%)
petastorm/workers_pool/process_pool.py | petastorm/reader.py | 10 | 22 (45%) | 65 (15%)
petastorm/fs_utils.py | petastorm/etl/dataset_metadata.py | 7 | 15 (46%) | 33 (21%)
petastorm/reader.py | petastorm/etl/rowgroup_indexing.py | 6 | 65 (9%) | 13 (46%)
petastorm/unischema.py | petastorm/etl/dataset_metadata.py | 6 | 36 (16%) | 33 (18%)
petastorm/workers_pool/process_pool.py | petastorm/codecs.py | 6 | 22 (27%) | 20 (30%)
petastorm/workers_pool/thread_pool.py | petastorm/reader.py | 6 | 8 (75%) | 65 (9%)
petastorm/fs_utils.py | petastorm/etl/rowgroup_indexing.py | 5 | 15 (33%) | 13 (38%)
petastorm/hdfs/namenode.py | petastorm/etl/dataset_metadata.py | 5 | 12 (41%) | 33 (15%)
petastorm/hdfs/namenode.py | petastorm/etl/rowgroup_indexing.py | 5 | 12 (41%) | 13 (38%)
petastorm/reader.py | petastorm/etl/metadata_util.py | 5 | 65 (7%) | 7 (71%)
petastorm/tools/copy_dataset.py | petastorm/etl/petastorm_generate_metadata.py | 5 | 6 (83%) | 17 (29%)
petastorm/workers_pool/thread_pool.py | petastorm/etl/dataset_metadata.py | 5 | 8 (62%) | 33 (15%)
petastorm/etl/dataset_metadata.py | petastorm/codecs.py | 4 | 33 (12%) | 20 (20%)
petastorm/fs_utils.py | petastorm/etl/petastorm_generate_metadata.py | 4 | 15 (26%) | 17 (23%)
petastorm/fs_utils.py | petastorm/etl/metadata_util.py | 4 | 15 (26%) | 7 (57%)
petastorm/reader.py | petastorm/etl/petastorm_generate_metadata.py | 4 | 65 (6%) | 17 (23%)
petastorm/spark_utils.py | petastorm/etl/dataset_metadata.py | 4 | 5 (80%) | 33 (12%)
petastorm/spark_utils.py | petastorm/etl/metadata_util.py | 4 | 5 (80%) | 7 (57%)
petastorm/spark_utils.py | petastorm/etl/rowgroup_indexing.py | 4 | 5 (80%) | 13 (30%)
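
The cross-folder view can be approximated as a filter over the same pair counts: keep a pair only when its two files live in different folders. A minimal sketch follows, using three pairs from the listings above as input; `in_different_folders` is an illustrative name. Pairs involving setup.py are in different folders as well but do not appear in the cross-folder listing, so the rule actually used for this report seems to be stricter than this plain folder comparison.

```python
import posixpath


def in_different_folders(file_a, file_b):
    """True when the two repository paths have different parent folders."""
    return posixpath.dirname(file_a) != posixpath.dirname(file_b)


# (file 1, file 2, # same commits), copied from the listings above.
pairs = [
    ("petastorm/reader.py", "petastorm/etl/dataset_metadata.py", 12),
    ("petastorm/etl/rowgroup_indexing.py", "petastorm/etl/dataset_metadata.py", 11),
    ("petastorm/workers_pool/process_pool.py", "petastorm/reader.py", 10),
]

# Drops the second pair, whose files both live in petastorm/etl.
cross_folder_pairs = [p for p in pairs if in_different_folders(p[0], p[1])]
```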