lm_eval/tasks/pubmedqa/pubmedqa.yaml (16 lines of code) (raw):
task: pubmedqa
dataset_path: bigbio/pubmed_qa
dataset_name: pubmed_qa_labeled_fold0_source
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
doc_to_text: !function preprocess_pubmedqa.doc_to_text
doc_to_target: final_decision
doc_to_choice: ["yes", "no", "maybe"]
metric_list:
- metric: acc
aggregation: mean
higher_is_better: true
metadata:
version: 1.0