ultravox/inference/oaieval_registry/solvers/vllm_server_config.yaml (9 lines of code) (raw):
generation/direct/vllm:
  class: evals.solvers.providers.fixie.fixie_solver:FixieSolver
  args:
    api_base: http://localhost:8000/v1
    completion_fn_options:
      model: fixie-ai/ultravox
      extra_options:
        temperature: 0
        # empirically, setting frequency_penalty to 1.0 seems to get us closer to the HF results
        # but it's not clear why this is necessary
        # TODO: investigate and try to match HF/VLLM results without this
        # UPDATE: freq penalty 1.0 hurts performance on big-bench-audio eval for ultravox8b. Switching back to 0.0.
        frequency_penalty: 0.0
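
# Sketch only (an assumption, not part of the original registry): to follow up on the
# TODO above, a second entry along these lines might let both penalty settings be
# evaluated side by side against the same server. The entry name
# "generation/direct/vllm-freq-penalty" is hypothetical, and the nesting shown is
# assumed from the usual oaieval solver layout. Left commented out so it does not
# change the registry; uncomment to try it.
#
# generation/direct/vllm-freq-penalty:
#   class: evals.solvers.providers.fixie.fixie_solver:FixieSolver
#   args:
#     api_base: http://localhost:8000/v1
#     completion_fn_options:
#       model: fixie-ai/ultravox
#       extra_options:
#         temperature: 0
#         frequency_penalty: 1.0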