muse/modeling_transformer.py (4 lines):
  - line 888: # TODO: should norm be applied to encoder_hidden_states as well?
  - line 1208: # TODO: make this configurable
  - line 1290: can_remask_prev_masked=False, # TODO: implement this
  - line 1360: scores = rearrange(scores, "... 1 -> ...") # TODO: use torch

muse/pipeline_muse.py (4 lines):
  - line 120: ).input_ids # TODO: remove hardcode
  - line 299: # TODO: Add config for pipeline to specify text encoder
  - line 311: # TODO: make this more robust
  - line 436: ).input_ids # TODO: remove hardcode

training/train_muse.py (3 lines):
  - line 179: # TODO - would be nice to vectorize
  - line 526: # TODO: make this configurable
  - line 911: # TODO: Add generation

scripts/gen_sdxl_synthetic_dataset.py (1 line):
  - line 97: # TODO - can we avoid syncing images to cpu

muse/modeling_paella_vq.py (1 line):
  - line 11: # TODO: This model only supports inference, not training. Make it trainable.

scripts/pre_encode.py (1 line):
  - line 92: TODO - probably would be better to wait until the thread pool is full and then

muse/modeling_utils.py (1 line):
  - line 831: # TODO: remove this when we remove the deprecation warning, and the `kwargs` argument,

training/data.py (1 line):
  - line 75: # FIXME webdataset version throws if suffix in current_sample, but we have a potential for

training/train_maskgit_imagenet.py (1 line):
  - line 499: # TODO: Add generation

muse/modeling_transformer_v2.py (1 line):
  - line 160: # TODO: Allow enabling fused norm using a function (like we do for xformers attention)
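
For muse/modeling_transformer.py line 1360, the einops pattern "... 1 -> ..." only drops a trailing size-1 axis, so a minimal pure-torch sketch (assuming `scores` really ends in a singleton dimension, as the pattern requires) is:

    import torch

    scores = torch.randn(4, 16, 1)  # stand-in tensor with a trailing size-1 axis
    # Equivalent of rearrange(scores, "... 1 -> ...") without einops:
    scores = scores.squeeze(-1)     # shape becomes (4, 16)

Note that squeeze(-1) silently no-ops when the last axis is not size 1, whereas the einops call would raise; an explicit shape assert can preserve that check if it matters.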
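
For the two ).input_ids items in muse/pipeline_muse.py, the hardcode is likely the sequence length passed to the tokenizer alongside the prompt; if so, one hedged sketch is the following, where the checkpoint name and prompt are purely illustrative, not the pipeline's actual values:

    from transformers import AutoTokenizer

    # Illustrative tokenizer; the pipeline's own text encoder/tokenizer would be used instead.
    tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    inputs = tokenizer(
        ["a photo of a cat"],
        padding="max_length",
        max_length=tokenizer.model_max_length,  # read from the tokenizer instead of a hardcoded literal
        truncation=True,
        return_tensors="pt",
    )
    input_ids = inputs.input_ids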
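
For muse/modeling_transformer_v2.py line 160, a sketch of a per-module toggle in the spirit of the existing xformers-attention enabler; the function and attribute names here are assumptions, not the repository's API:

    import torch.nn as nn

    def enable_fused_norm(model: nn.Module, enabled: bool = True) -> None:
        # Hypothetical helper: walk the module tree and flip a flag on any norm
        # layer that exposes one, mirroring how xformers attention is switched on.
        for module in model.modules():
            if hasattr(module, "use_fused_norm"):
                module.use_fused_norm = enabled

Exposing the switch as a function keeps backend-specific flags out of the model config and lets callers opt in after the model is constructed.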