lm_eval/tasks/pubmedqa/pubmedqa.yaml (16 lines of code) (raw):

task: pubmedqa
dataset_path: bigbio/pubmed_qa
dataset_name: pubmed_qa_labeled_fold0_source
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
doc_to_text: !function preprocess_pubmedqa.doc_to_text
doc_to_target: final_decision
doc_to_choice: ["yes", "no", "maybe"]
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0