evaluation/parser.py (3 lines):
    - line 354: # TODO: SFT models
    - line 502: # TODO check multiple choice
    - line 575: # TODO check multiple choice
evaluation/python_executor.py (1 line):
    - line 37: # TODO: use: https://github.com/shroominic/codebox-api