# lmms_eval/tasks/gqa/gqa.yaml

dataset_path: lmms-lab/GQA
dataset_name: testdev_balanced_instructions
dataset_kwargs:
  token: True  # authenticate with the Hugging Face Hub when downloading the dataset
task: "gqa"
test_split: testdev
output_type: generate_until
doc_to_visual: !function utils.gqa_doc_to_visual
doc_to_text: !function utils.gqa_doc_to_text
doc_to_target: "answer"
generation_kwargs:
  # greedy decoding with a short budget, since GQA expects single-word/phrase answers
  max_new_tokens: 16
  temperature: 0
  top_p: 0
  num_beams: 1
  do_sample: false
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
metadata:
  - version: 0.0
model_specific_prompt_kwargs:
  default:
    pre_prompt: ""
    post_prompt: "\nAnswer the question using a single word or phrase."
  qwen_vl:
    pre_prompt: ""
    post_prompt: " Answer:"