Summary: 17 instances, 16 unique

Text	Count
overflow_matrix = torch.div(  # FIXME: div blows up when MAX_WIDTH_BYTES >7	1
# TODO use an independent random generator	1
# TODO add flag for a final batch being incomplete	1
# TODO: for optimizers that don't use momentum or adaptive learning rate,	1
TODO If needed we can also add device here.	1
TODO correct note if above option added.	1
# TODO: better if this assert fires at config creation time	1
# TODO: async_user_selector_type should be directly instantiable from json_config	1
# TODO MM make sure metric reporter is multi-process safe.	1
# TODO: is this needed? Do we ever call this externally?	1
# TODO: (jesikmin) T55869097 Check whether the size of buffer is same as	1
# TODO: enable all_reduce on mixed dtypes with dtype-based bucketing	1
)  # TODO do not call distributed utils here, this is upstream responsibility	1
TODO: add adaptive learning rate based on staleness of gradient	1
# TODO these are specific to mean reducer [this implementation]	1
# TODO num_samples is used as the default weight, this needs revisit	2