Summary: 8 instances, 8 unique

Text	Count
# TODO: try to reduce this / make it a function of "hash_in_mem" / num_langs	1
# TODO: should we use another format ?	1
# TODO: better default	1
# TODO find a tokenizer for those languages	1
# TODO: try copying models file, try READ or PARALLEL_READ	1
# TODO use classic files directory.	1
# TODO: open the remote file in streaming mode.	1
# TODO: start downloading the next segment in the background	1