Summary: 17 instances, 16 unique

| Text | Count |
| --- | --- |
| overflow_matrix = torch.div( # FIXME: div blows up when MAX_WIDTH_BYTES >7 | 1 |
| # TODO use an independent random generator | 1 |
| # TODO add flag for a final batch being incomplete | 1 |
| # TODO: for optimizers that don't use momentum or adaptive learning rate, | 1 |
| TODO If needed we can also add device here. | 1 |
| TODO correct note if above option added. | 1 |
| # TODO: better if this assert fires at config creation time | 1 |
| # TODO: async_user_selector_type should be directly instantiable from json_config | 1 |
| # TODO MM make sure metric reporter is multi-process safe. | 1 |
| # TODO: is this needed? Do we ever call this externally? | 1 |
| # TODO: (jesikmin) T55869097 Check whether the size of buffer is same as | 1 |
| # TODO: enable all_reduce on mixed dtypes with dtype-based bucketing | 1 |
| ) # TODO do not call distributed utils here, this is upstream responsibility | 1 |
| TODO: add adaptive learning rate based on staleness of gradient | 1 |
| # TODO these are specific to mean reducer [this implementation] | 1 |
| # TODO num_samples is used as the default weight, this needs revisit | 2 |