petastorm/spark/spark_dataset_converter.py (5 lines):
- line 195: # TODO: generate a best tuned value for default worker count value
- line 339: # TODO: auto tune best batch size in default case.
- line 646: # TODO: also check file size for other file system.
- line 714: # TODO: Improve default behavior to be automatically choosing the best way.
- line 726: # TODO: improve this by read parquet file metadata to get count

petastorm/unischema.py (1 line):
- line 157: # TODO: Changing fields in this class or the UnischemaField will break reading due to the schema being pickled next to

petastorm/workers_pool/thread_pool.py (1 line):
- line 82: TODO: consider using a standard thread pool