prompts/stanford/2_generate_course_outlines.ipynb (287 lines of code) (raw):

{ "cells": [ { "cell_type": "code", "execution_count": null, "id": "initial_id", "metadata": { "collapsed": true }, "outputs": [], "source": [ "\n", "import pandas as pd\n", "\n", "df = pd.read_csv(\"stanford_courses_cleaned_non_generic.csv\", dtype=str)" ] }, { "cell_type": "markdown", "source": [ "1-shot generation of course outlines from the title and description" ], "metadata": { "collapsed": false }, "id": "255265fa0caac0da" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "from string import Template\n", "\n", "OUTLINE_TEMPLATE = Template(\"\"\"Write a course outline for a textbook on \\\"The Global Positioning System: Where on Earth are We, and What Time is It?\\\" covering the following topics: \\\"Why people want to know where they are: answers include cross-Pacific trips of Polynesians, missile guidance, and distraught callers. How people determine where they are: navigation technology from dead-reckoning, sextants, and satellite navigation (GPS). Hands-on experience. How GPS works; when it does not work; possibilities for improving performance.\\\".\n", "Model: 1. Introduction\n", "- What is the Global Positioning System?\n", "- Importance of GPS\n", "- Overview of the course\n", "\n", "2. Navigation technology\n", "- Dead-reckoning\n", "- Sextants\n", "- Satellite navigation\n", "- Comparison of technologies\n", "- Hands-on experience with navigation technology\n", "\n", "3. GPS technology\n", "- How GPS works\n", " - Satellites\n", " - Ground receivers\n", " - Triangulation\n", "- When GPS does not work\n", " - Blockage\n", " - Multipath\n", "- Possibilities for improving performance\n", "\n", "4. Applications of GPS\n", "- Cross-Pacific trips of Polynesians\n", "- Missile guidance\n", "- Distraught callers\n", "- Other applications of GPS\n", "\n", "User: Write a course outline for a textbook on \\\"${COURSE_TITLE}\\\" covering the following topics: \\\"${COURSE_DESCRIPTION}\\\". 
Do not include assignments, exams or prerequisites.\n", "Model: \"\"\")" ], "metadata": { "collapsed": false }, "id": "52c53c272dffa9d2" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "# Build one prompt per course from its title and description.\n", "courses_to_generate = []\n", "for _, row in df.iterrows():\n", "    prompt = OUTLINE_TEMPLATE.substitute({\"COURSE_TITLE\": row[\"title\"], \"COURSE_DESCRIPTION\": row[\"description\"]})\n", "    courses_to_generate.append({\n", "        \"course_title\": row[\"title\"],\n", "        \"course_description\": row[\"description\"],\n", "        \"prompt\": prompt,\n", "    })" ], "metadata": { "collapsed": false }, "id": "de59641276702b38" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "generations = [...]  # placeholder: code to generate one outline per prompt goes here" ], "metadata": { "collapsed": false }, "id": "29965888236f2bdd" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "for course, generation in zip(courses_to_generate, generations):\n", "    course[\"outline\"] = generation\n", "\n", "# index=False keeps a spurious \"Unnamed: 0\" column out of the file when it is read back later.\n", "pd.DataFrame(courses_to_generate).to_csv(\"outlines_full.csv\", index=False)" ], "metadata": { "collapsed": false }, "id": "55c5350ab00b5d54" }, { "cell_type": "markdown", "source": [ "(very large) 2-shot prompt to have the model correct and clean up the generated outlines" ], "metadata": { "collapsed": false }, "id": "c0d4dd91d694584f" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "\n", "OUTLINE_FILTER_TEMPLATE = Template(\"\"\"The following is a course outline for a course on \\\"Anesthesia Operating Room Clerkship\\\". 
This outline needs to be anonymized and adapted to an online audience:\n", "1.1 Introduction: Overview of the Anesthesia Operating Room Clerkship\n", "1.2 Introduction: Objectives of the clerkship\n", "1.3 Introduction: Prerequisites for the clerkship\n", "2.1 Clinical settings: Sequoia Hospital in Redwood City\n", "2.2 Clinical settings: Outpatient surgery centers throughout the community\n", "2.3 Clinical settings: Exposure to general and regional anesthetic techniques\n", "2.4 Clinical settings: Adult and pediatric patients\n", "3.1 Personalized discussion: Applied physiology\n", "3.2 Personalized discussion: Pharmacology\n", "3.3 Personalized discussion: Pathophysiology of the surgical patient\n", "3.4 Personalized discussion: Daily basis\n", "3.5 Personalized discussion: Final paper to be submitted by the students\n", "4.1 Transportation: Students need to arrange transportation to the various workplaces\n", "5.1 Prerequisites: A major clerkship in medicine or surgery is strongly recommended\n", "6.1 Periods available: 1-12, full-time for 2 weeks\n", "6.2 Periods available: 1 student per period\n", "7.1 Clerkship director and coordinator: Kurt Fink, M.D.\n", "7.2 Clerkship director and coordinator: Yun Tao, 650-724-1706, yuntao@stanford.edu, Stanford Hospital\n", "8.1 Reporting instructions: Contact Dr. 
Kurt Fink one week prior\n", "8.2 Reporting instructions: Time: TBA\n", "8.3 Reporting instructions: Call code: 0\n", "9.1 Other faculty: Palo Alto Medical Clinic Anesthesiologist\n", "10.1 Location: Palo Alto Medical Foundation.\n", "\n", "Which of the sections of the outline contain: \n", "- private faculty members information (names or contact information)\n", "- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content\n", "- assignments, final papers, exams, presentations or other student evaluation information\n", "Falcon:\n", "- private faculty members information (names or contact information): 7.1, 7.2, 8.1, 9.1\n", "- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content: 1.3, 4.1, 5.1, 6.1, 6.2, 8.1, 8.2, 8.3, 10.1\n", "- assignments, final papers, exams, presentations or other student evaluation information: 3.5\n", "User: The following is a course outline for a course on \\\"Numerical Methods for Compressible Flows\\\". 
This outline needs to be anonymized and adapted to an online audience:\n", "1.1 Introduction: Overview of the course\n", "1.2 Introduction: Importance of numerical methods for compressible flows\n", "1.3 Introduction: Prerequisites for the course\n", "2.1 Mathematical models for compressible flows: Hierarchy of mathematical models\n", "2.2 Mathematical models for compressible flows: Ideal potential flow\n", "2.3 Mathematical models for compressible flows: Transonic potential flow\n", "3.1 Numerical methods for compressible flows: Finite difference methods\n", "3.2 Numerical methods for compressible flows: Finite volume methods\n", "3.3 Numerical methods for compressible flows: Finite element methods\n", "4.1 Representative model problems: Shocks\n", "4.2 Representative model problems: Expansions\n", "5.1 Treatment of boundary conditions: Dirichlet boundary conditions\n", "5.2 Treatment of boundary conditions: Neumann boundary conditions\n", "6.1 Applications of numerical methods for compressible flows: Aerospace engineering\n", "6.3 Applications of numerical methods for compressible flows: Other applications of numerical methods for compressible flows\n", "\n", "Which of the sections of the outline contain: \n", "- private faculty members information (names or contact information)\n", "- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content\n", "- assignments, final papers, exams, presentations or other student evaluation information\n", "Falcon: \n", "- private faculty members information (names or contact information): None\n", "- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content: 1.3\n", "- assignments, final papers, exams, presentations or other student evaluation information: None\n", "User: The following is a course outline for a course on \\\"${COURSE_TITLE}\\\". 
This outline needs to be anonymized and adapted to an online audience:\n", "${SECTIONS_LIST}\n", "\n", "Which of the sections of the outline contain: \n", "- private faculty members information (names or contact information)\n", "- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content\n", "- assignments, final papers, exams, presentations or other student evaluation information\n", "Falcon: \"\"\")" ], "metadata": { "collapsed": false }, "id": "53702cd87a677315" }, { "cell_type": "markdown", "source": [ "Reformat the outlines into the numbered section format used by the filter prompt" ], "metadata": { "collapsed": false }, "id": "ca753f9d85a95139" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "import re\n", "\n", "FIND_SECTIONS_REGEX = re.compile(r\"\\d\\. .*(?:\\n\\s*- .*)+\")\n", "FIND_TITLES_REGEX = re.compile(r\"\\d\\. (.*)\")\n", "FIND_UNIT_TITLES_REGEX = re.compile(r\"\\n\\s*- (.*)\")\n", "\n", "def extract_sections(outline):\n", "    \"\"\"Split an outline into numbered sections, each with its bulleted unit titles.\"\"\"\n", "    sections = FIND_SECTIONS_REGEX.findall(outline)\n", "    return [\n", "        {\n", "            \"section_nr\": si + 1,\n", "            \"title\": FIND_TITLES_REGEX.search(section).group(1),\n", "            \"unit_titles\": FIND_UNIT_TITLES_REGEX.findall(section),\n", "        } for si, section in enumerate(sections)\n", "    ]\n", "\n", "\n", "df = pd.read_csv(\"outlines_full.csv\", dtype=str)\n", "for idx, row in df.iterrows():\n", "    sections = extract_sections(row[\"outline\"])\n", "    sections_list = '\\n'.join(\n", "        [f\"{si + 1}.{ui + 1} {section['title']}: {unit_title}\" for si, section in enumerate(sections) for\n", "         ui, unit_title in enumerate(section[\"unit_titles\"])])\n", "    prompt = OUTLINE_FILTER_TEMPLATE.substitute({\"COURSE_TITLE\": row[\"course_title\"], \"SECTIONS_LIST\": sections_list})\n", "    df.loc[idx, 'filter_outline_prompt'] = prompt\n", "    # Placeholder: `generate` is not defined in this notebook; replace it with your own model call.\n", "    df.loc[idx, 'filter_outline_result'] = generate(prompt)" ], "metadata": { "collapsed": false }, "id": "1cfa8766032e5ee2" }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "df.to_csv(\"outlines_full_filtered.csv\", index=False)" ], "metadata": { "collapsed": false }, "id": "ea6a4080d3619d40" } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 5 }