prompts/stanford/2_generate_course_outlines.ipynb (287 lines of code) (raw):
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "initial_id",
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"\n",
"import pandas as pd\n",
"\n",
"df = pd.read_csv(\"stanford_courses_cleaned_non_generic.csv\", dtype=str)"
]
},
{
"cell_type": "markdown",
"source": [
"1-shot generation of course outlines from the title and description"
],
"metadata": {
"collapsed": false
},
"id": "255265fa0caac0da"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from string import Template\n",
"\n",
"OUTLINE_TEMPLATE = Template(\"\"\"Write a course outline for a textbook on \\\"The Global Positioning System: Where on Earth are We, and What Time is It?\\\" covering the following topics: \\\"Why people want to know where they are: answers include cross-Pacific trips of Polynesians, missile guidance, and distraught callers. How people determine where they are: navigation technology from dead-reckoning, sextants, and satellite navigation (GPS). Hands-on experience. How GPS works; when it does not work; possibilities for improving performance.\\\".\n",
"Model: 1. Introduction\n",
"- What is the Global Positioning System?\n",
"- Importance of GPS\n",
"- Overview of the course\n",
"\n",
"2. Navigation technology\n",
"- Dead-reckoning\n",
"- Sextants\n",
"- Satellite navigation\n",
"- Comparison of technologies\n",
"- Hands-on experience with navigation technology\n",
"\n",
"3. GPS technology\n",
"- How GPS works\n",
" - Satellites\n",
" - Ground receivers\n",
" - Triangulation\n",
"- When GPS does not work\n",
" - Blockage\n",
" - Multipath\n",
"- Possibilities for improving performance\n",
"\n",
"4. Applications of GPS\n",
"- Cross-Pacific trips of Polynesians\n",
"- Missile guidance\n",
"- Distraught callers\n",
"- Other applications of GPS\n",
"\n",
"User: Write a course outline for a textbook on \\\"${COURSE_TITLE}\\\" covering the following topics: \\\"${COURSE_DESCRIPTION}\\\". Do not include assignments, exams or prerequisites.\n",
"Model: \"\"\")"
],
"metadata": {
"collapsed": false
},
"id": "52c53c272dffa9d2"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"courses_to_generate = []\n",
"for a, b in df.iterrows():\n",
" prompt = OUTLINE_TEMPLATE.substitute({\"COURSE_TITLE\": b[\"title\"], \"COURSE_DESCRIPTION\": b[\"description\"]})\n",
" courses_to_generate.append({\n",
" \"course_title\": b[\"title\"],\n",
" \"course_description\": b[\"description\"],\n",
" \"prompt\": prompt,\n",
" })"
],
"metadata": {
"collapsed": false
},
"id": "de59641276702b38"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"generations = [...] # code to generate using the prompts here"
],
"metadata": {
"collapsed": false
},
"id": "29965888236f2bdd"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"for course, generation in zip(courses_to_generate, generations):\n",
" course[\"outline\"] = generation\n",
"\n",
"pd.DataFrame(courses_to_generate).to_csv(\"outlines_full.csv\")"
],
"metadata": {
"collapsed": false
},
"id": "55c5350ab00b5d54"
},
{
"cell_type": "markdown",
"source": [
"(very large) 2-shot prompt to have the model correct and clean up the generated outlines"
],
"metadata": {
"collapsed": false
},
"id": "c0d4dd91d694584f"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"\n",
"OUTLINE_FILTER_TEMPLATE = Template(\"\"\"The following is a course outline for a course on \\\"Anesthesia Operating Room Clerkship\\\". This outline needs to be anonymized and adapted to an online audience:\n",
"1.1 Introduction: Overview of the Anesthesia Operating Room Clerkship\n",
"1.2 Introduction: Objectives of the clerkship\n",
"1.3 Introduction: Prerequisites for the clerkship\n",
"2.1 Clinical settings: Sequoia Hospital in Redwood City\n",
"2.2 Clinical settings: Outpatient surgery centers throughout the community\n",
"2.3 Clinical settings: Exposure to general and regional anesthetic techniques\n",
"2.4 Clinical settings: Adult and pediatric patients\n",
"3.1 Personalized discussion: Applied physiology\n",
"3.2 Personalized discussion: Pharmacology\n",
"3.3 Personalized discussion: Pathophysiology of the surgical patient\n",
"3.4 Personalized discussion: Daily basis\n",
"3.5 Personalized discussion: Final paper to be submitted by the students\n",
"4.1 Transportation: Students need to arrange transportation to the various workplaces\n",
"5.1 Prerequisites: A major clerkship in medicine or surgery is strongly recommended\n",
"6.1 Periods available: 1-12, full-time for 2 weeks\n",
"6.2 Periods available: 1 student per period\n",
"7.1 Clerkship director and coordinator: Kurt Fink, M.D.\n",
"7.2 Clerkship director and coordinator: Yun Tao, 650-724-1706, yuntao@stanford.edu, Stanford Hospital\n",
"8.1 Reporting instructions: Contact Dr. Kurt Fink one week prior\n",
"8.2 Reporting instructions: Time: TBA\n",
"8.3 Reporting instructions: Call code: 0\n",
"9.1 Other faculty: Palo Alto Medical Clinic Anesthesiologist\n",
"10.1 Location: Palo Alto Medical Foundation.\n",
"\n",
"Which of the sections of the outline contain: \n",
"- private faculty members information (names or contact information)\n",
"- prerequisites, requirements, application processes or other practical course information not related to the course content\n",
"- assignments, final papers, exams, presentations or other student evaluation information\n",
"Falcon:\n",
"- private faculty members information (names or contact information): 7.1, 7.2., 8.1, 9.1\n",
"- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content: 1.3, 4.1, 5.1, 6.1, 6.2, 8.1, 8.2, 8.3, 10.1\n",
"- assignments, final papers, exams, presentations or other student evaluation information: 3.5\n",
"User: The following is a course outline for a course on \"Numerical Methods for Compressible Flows\". This outline needs to be anonymized and adapted to an online audience:\n",
"1.1 Introduction: Overview of the course\n",
"1.2 Introduction: Importance of numerical methods for compressible flows\n",
"1.3 Introduction: Prerequisites for the course\n",
"2.1 Mathematical models for compressible flows: Hierarchy of mathematical models\n",
"2.2 Mathematical models for compressible flows: Ideal potential flow\n",
"2.3 Mathematical models for compressible flows: Transonic potential flow\n",
"3.1 Numerical methods for compressible flows: Finite difference methods\n",
"3.2 Numerical methods for compressible flows: Finite volume methods\n",
"3.3 Numerical methods for compressible flows: Finite element methods\n",
"4.1 Representative model problems: Shocks\n",
"4.2 Representative model problems: Expansions\n",
"5.1 Treatment of boundary conditions: Dirichlet boundary conditions\n",
"5.2 Treatment of boundary conditions: Neumann boundary conditions\n",
"6.1 Applications of numerical methods for compressible flows: Aerospace engineering\n",
"6.3 Applications of numerical methods for compressible flows: Other applications of numerical methods for compressible flows\n",
"\n",
"Which of the sections of the outline contain: \n",
"- private faculty members information (names or contact information)\n",
"- prerequisites, requirements, application processes or other practical course information not related to the course content\n",
"- assignments, final papers, exams, presentations or other student evaluation information\n",
"Falcon: \n",
"- private faculty members information (names or contact information): None\n",
"- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content: 1.3\n",
"- assignments, final papers, exams, presentations or other student evaluation information: None\n",
"User: The following is a course outline for a course on \\\"${COURSE_TITLE}\\\". This outline needs to be anonymized and adapted to an online audience:\n",
"${SECTIONS_LIST}\n",
"\n",
"Which of the sections of the outline contain: \n",
"- private faculty members information (names or contact information)\n",
"- prerequisites, requirements, application processes, schedules or other practical course information not related to the course content\n",
"- assignments, final papers, exams, presentations or other student evaluation information\n",
"Falcon: \"\"\")"
],
"metadata": {
"collapsed": false
},
"id": "53702cd87a677315"
},
{
"cell_type": "markdown",
"source": [
"Reformat cells into numbered format"
],
"metadata": {
"collapsed": false
},
"id": "ca753f9d85a95139"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import re\n",
"\n",
"FIND_SECTIONS_REGEX = re.compile(r\"\\d\\. .*(?:\\n\\s*- .*)+\")\n",
"FIND_TITLES_REGEX = re.compile(r\"\\d\\. (.*)\")\n",
"FIND_UNIT_TITLES_REGEX = re.compile(r\"\\n\\s*- (.*)\")\n",
"\n",
"def extract_sections(outline):\n",
" sections = FIND_SECTIONS_REGEX.findall(outline)\n",
" return [\n",
" {\n",
" \"section_nr\": si + 1,\n",
" \"title\": FIND_TITLES_REGEX.search(section).group(1),\n",
" \"unit_titles\": FIND_UNIT_TITLES_REGEX.findall(section),\n",
" } for si, section in enumerate(sections)\n",
" ]\n",
"\n",
"\n",
"df = pd.read_csv(\"outlines_full.csv\", dtype=str)\n",
"for a, b in df.iterrows():\n",
" sections = extract_sections(b[\"outline\"])\n",
" sections_list = '\\n'.join(\n",
" [f\"{si + 1}.{ui + 1} {section['title']}: {unit_title}\" for si, section in enumerate(sections) for\n",
" ui, unit_title in enumerate(section[\"unit_titles\"])])\n",
" prompt = OUTLINE_FILTER_TEMPLATE.substitute({\"COURSE_TITLE\": b[\"course_title\"], \"SECTIONS_LIST\": sections_list})\n",
" df.loc[a, 'filter_outline_prompt'] = prompt\n",
" df.loc[a, 'filter_outline_result'] = generate... # actually generate the filter results"
],
"metadata": {
"collapsed": false
},
"id": "1cfa8766032e5ee2"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"df.to_csv(\"outlines_full_filtered.csv\", index=False)"
],
"metadata": {
"collapsed": false
},
"id": "ea6a4080d3619d40"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}