<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title> Spark Release 0.9.2 | Apache Spark </title> <link href="/css/bootstrap.min.css" rel="stylesheet"> <link rel="preconnect" href="https://fonts.googleapis.com"> <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> <link href="https://fonts.googleapis.com/css2?family=DM+Sans:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&Courier+Prime:wght@400;700&display=swap" rel="stylesheet"> <link href="/css/custom.css" rel="stylesheet"> <!-- Code highlighter CSS --> <link href="/css/pygments-default.css" rel="stylesheet"> <link rel="icon" href="/favicon.ico" type="image/x-icon"> <!-- Matomo --> <script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(["disableCookies"]); _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="https://analytics.apache.org/"; _paq.push(['setTrackerUrl', u+'matomo.php']); _paq.push(['setSiteId', '40']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); })(); </script> <!-- End Matomo Code --> </head> <body class="global"> <nav class="navbar navbar-expand-lg navbar-dark p-0 px-4" style="background: #1D6890;"> <a class="navbar-brand" href="/"> <img src="/images/spark-logo-rev.svg" alt="" width="141" height="72"> </a> <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarContent" aria-controls="navbarContent" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="collapse navbar-collapse col-md-12 col-lg-auto pt-4" id="navbarContent"> <ul class="navbar-nav me-auto"> <li class="nav-item"> <a class="nav-link active" 
aria-current="page" href="/downloads.html">Download</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="libraries" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Libraries </a> <ul class="dropdown-menu" aria-labelledby="libraries"> <li><a class="dropdown-item" href="/sql/">SQL and DataFrames</a></li> <li><a class="dropdown-item" href="/spark-connect/">Spark Connect</a></li> <li><a class="dropdown-item" href="/streaming/">Spark Streaming</a></li> <li><a class="dropdown-item" href="/pandas-on-spark/">pandas on Spark</a></li> <li><a class="dropdown-item" href="/mllib/">MLlib (machine learning)</a></li> <li><a class="dropdown-item" href="/graphx/">GraphX (graph)</a></li> <li> <hr class="dropdown-divider"> </li> <li><a class="dropdown-item" href="/third-party-projects.html">Third-Party Projects</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="documentation" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Documentation </a> <ul class="dropdown-menu" aria-labelledby="documentation"> <li><a class="dropdown-item" href="/docs/latest/">Latest Release</a></li> <li><a class="dropdown-item" href="/documentation.html">Older Versions and Other Resources</a></li> <li><a class="dropdown-item" href="/faq.html">Frequently Asked Questions</a></li> </ul> </li> <li class="nav-item"> <a class="nav-link active" aria-current="page" href="/examples.html">Examples</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="community" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Community </a> <ul class="dropdown-menu" aria-labelledby="community"> <li><a class="dropdown-item" href="/community.html">Mailing Lists &amp; Resources</a></li> <li><a class="dropdown-item" href="/contributing.html">Contributing to Spark</a></li> <li><a class="dropdown-item" href="/improvement-proposals.html">Improvement Proposals (SPIP)</a> </li> 
<li><a class="dropdown-item" href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a> </li> <li><a class="dropdown-item" href="/powered-by.html">Powered By</a></li> <li><a class="dropdown-item" href="/committers.html">Project Committers</a></li> <li><a class="dropdown-item" href="/history.html">Project History</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="developers" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Developers </a> <ul class="dropdown-menu" aria-labelledby="developers"> <li><a class="dropdown-item" href="/developer-tools.html">Useful Developer Tools</a></li> <li><a class="dropdown-item" href="/versioning-policy.html">Versioning Policy</a></li> <li><a class="dropdown-item" href="/release-process.html">Release Process</a></li> <li><a class="dropdown-item" href="/security.html">Security</a></li> </ul> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="github" role="button" data-bs-toggle="dropdown" aria-expanded="false"> GitHub </a> <ul class="dropdown-menu" aria-labelledby="github"> <li><a class="dropdown-item" href="https://github.com/apache/spark">spark</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-connect-go">spark-connect-go</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-connect-swift">spark-connect-swift</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-docker">spark-docker</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-kubernetes-operator">spark-kubernetes-operator</a></li> <li><a class="dropdown-item" href="https://github.com/apache/spark-website">spark-website</a></li> </ul> </li> </ul> <ul class="navbar-nav ml-auto"> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="apacheFoundation" role="button" data-bs-toggle="dropdown" aria-expanded="false"> Apache Software Foundation </a> <ul 
class="dropdown-menu" aria-labelledby="apacheFoundation"> <li><a class="dropdown-item" href="https://www.apache.org/">Apache Homepage</a></li> <li><a class="dropdown-item" href="https://www.apache.org/licenses/">License</a></li> <li><a class="dropdown-item" href="https://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> <li><a class="dropdown-item" href="https://www.apache.org/foundation/thanks.html">Thanks</a></li> <li><a class="dropdown-item" href="https://www.apache.org/events/current-event">Event</a></li> </ul> </li> </ul> </div> </nav> <div class="container"> <div class="row mt-4"> <div class="col-12 col-md-9"> <h2>Spark Release 0.9.2</h2> <p>Spark 0.9.2 is a maintenance release with bug fixes. This release is based on the <a href="https://github.com/apache/spark/tree/branch-0.9">branch-0.9</a> maintenance branch of Spark. We recommend that all 0.9.x users upgrade to this stable release. Contributions to this release came from 28 developers.</p> <p>You can download Spark 0.9.2 as either a <a href="http://d3kbcqa49mib13.cloudfront.net/spark-0.9.2.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-0.9.2.tgz'); return false;">source package</a> (6 MB tgz) or a prebuilt package for <a href="http://d3kbcqa49mib13.cloudfront.net/spark-0.9.2-bin-hadoop1.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-0.9.2-bin-hadoop1.tgz'); return false;">Hadoop 1 / CDH3</a> (156 MB tgz), <a href="http://d3kbcqa49mib13.cloudfront.net/spark-0.9.2-bin-cdh4.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-0.9.2-bin-cdh4.tgz'); return false;">CDH4</a> (161 MB tgz), or <a href="http://d3kbcqa49mib13.cloudfront.net/spark-0.9.2-bin-hadoop2.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-0.9.2-bin-hadoop2.tgz'); return false;">Hadoop 2 / CDH5 / HDP2</a> (168 MB tgz).
Release signatures and checksums are available at the official <a href="http://www.apache.org/dist/spark/spark-0.9.2/">Apache download site</a>.</p> <h3 id="fixes">Fixes</h3> <p>Spark 0.9.2 contains bug fixes in several components. Some of the more important fixes are highlighted below. You can visit the <a href="http://s.apache.org/d0t">Spark issue tracker</a> for the full list of fixes.</p> <h4 id="spark-core">Spark Core</h4> <ul> <li>ExternalAppendOnlyMap doesn&#8217;t always find matching keys. (<a href="https://issues.apache.org/jira/browse/SPARK-2043">SPARK-2043</a>)</li> <li>Jobs hang due to akka frame size settings. (<a href="https://issues.apache.org/jira/browse/SPARK-1112">SPARK-1112</a>, <a href="https://issues.apache.org/jira/browse/SPARK-2156">SPARK-2156</a>)</li> <li>HDFS FileSystems continually pile up in the FS cache. (<a href="https://issues.apache.org/jira/browse/SPARK-1676">SPARK-1676</a>)</li> <li>Unneeded lock in ShuffleMapTask.deserializeInfo. (<a href="https://issues.apache.org/jira/browse/SPARK-1775">SPARK-1775</a>)</li> <li>Secondary jars are not added to executor classpath for YARN. (<a href="https://issues.apache.org/jira/browse/SPARK-1870">SPARK-1870</a>)</li> </ul> <h4 id="pyspark">PySpark</h4> <ul> <li>IPython won&#8217;t run standalone Python script. (<a href="https://issues.apache.org/jira/browse/SPARK-1134">SPARK-1134</a>)</li> <li>The hash method used by partitionBy doesn&#8217;t deal with None correctly. (<a href="https://issues.apache.org/jira/browse/SPARK-1468">SPARK-1468</a>)</li> <li>PySpark crashes if too many tasks complete quickly. (<a href="https://issues.apache.org/jira/browse/SPARK-2282">SPARK-2282</a>)</li> </ul> <h4 id="mllib">MLlib</h4> <ul> <li>Make MLlib work on Python 2.6. (<a href="https://issues.apache.org/jira/browse/SPARK-1421">SPARK-1421</a>)</li> <li>Fix PySpark&#8217;s Naive Bayes implementation. 
(<a href="https://issues.apache.org/jira/browse/SPARK-2433">SPARK-2433</a>)</li> </ul> <h4 id="streaming">Streaming</h4> <ul> <li>SparkFlumeEvents with a body larger than 1020 bytes are not read properly. (<a href="https://issues.apache.org/jira/browse/SPARK-1916">SPARK-1916</a>)</li> </ul> <h4 id="graphx">GraphX</h4> <ul> <li>GraphX triplets not working properly. (<a href="https://issues.apache.org/jira/browse/SPARK-1188">SPARK-1188</a>)</li> </ul> <h3 id="contributors">Contributors</h3> <p>The following developers contributed to this release:</p> <ul> <li>Aaron Davidson - bug fix and optimization</li> <li>Anant Daksh Asthana - improvement</li> <li>Daniel Darabos - bug fix</li> <li>David Lemieux - bug fix</li> <li>Davis Shepherd - bug fix</li> <li>DB Tsai - bug fix</li> <li>Diana Carroll - bug fix</li> <li>Erik Selin - bug fix</li> <li>Gabriele Nizzoli - bug fix</li> <li>Guoqiang Li - bug fix</li> <li>John Zhao - improvement</li> <li>Mark Hamstra - bug fix</li> <li>Matei Zaharia - bug fix and improvement</li> <li>Nan Zhu - bug fix</li> <li>Nick Lanham - bug fix</li> <li>Ori Kremer - bug fix</li> <li>Patrick Wendell - bug fixes</li> <li>Prashant Sharma - new feature</li> <li>Sam Sun - bug fix</li> <li>Sandeep Singh - bug fix</li> <li>Shuo Bai - improvement</li> <li>Sujeet Varakhedi - improvement</li> <li>Tathagata Das - bug fixes and documentation fix</li> <li>Thomas Graves - bug fixes</li> <li>Uri Laserson - bug fix</li> <li>Wenchen Fan - bug fix</li> <li>Xiangrui Meng - bug fixes and release manager</li> <li>Yin Huai - bug fix</li> </ul> <p><em>Thanks to everyone who contributed!</em></p> <p> <br/> <a href="/news/">Spark News Archive</a> </p> </div> <div class="col-12 col-md-3"> <div class="news" style="margin-bottom: 20px;"> <h5>Latest News</h5> <ul class="list-unstyled"> <li><a href="/news/spark-3-5-5-released.html">Spark 3.5.5 released</a> <span class="small">(Feb 27, 2025)</span></li> <li><a href="/news/spark-3-5-4-released.html">Spark 3.5.4 released</a> <span
class="small">(Dec 20, 2024)</span></li> <li><a href="/news/spark-3-4-4-released.html">Spark 3.4.4 released</a> <span class="small">(Oct 27, 2024)</span></li> <li><a href="/news/spark-4.0.0-preview2.html">Preview release of Spark 4.0</a> <span class="small">(Sep 26, 2024)</span></li> </ul> <p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p> </div> <div style="text-align:center; margin-bottom: 20px;"> <a href="https://www.apache.org/events/current-event.html"> <img src="https://www.apache.org/events/current-event-234x60.png" style="max-width: 100%;"/> </a> </div> <div class="hidden-xs hidden-sm"> <a href="/downloads.html" class="btn btn-cta btn-lg d-grid" style="margin-bottom: 30px;"> Download Spark </a> <p style="font-size: 16px; font-weight: 500; color: #555;"> Built-in Libraries: </p> <ul class="list-none"> <li><a href="/sql/">SQL and DataFrames</a></li> <li><a href="/streaming/">Spark Streaming</a></li> <li><a href="/mllib/">MLlib (machine learning)</a></li> <li><a href="/graphx/">GraphX (graph)</a></li> </ul> <a href="/third-party-projects.html">Third-Party Projects</a> </div> </div> </div> <footer class="small"> <hr> Apache Spark, Spark, Apache, the Apache feather logo, and the Apache Spark project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries. See guidance on use of Apache Spark <a href="/trademarks.html">trademarks</a>. All other marks mentioned may be trademarks or registered trademarks of their respective owners. Copyright &copy; 2018 The Apache Software Foundation, Licensed under the <a href="https://www.apache.org/licenses/">Apache License, Version 2.0</a>. </footer> </div> <script src="/js/jquery.js"></script> <script src="/js/bootstrap.bundle.min.js"></script> <script src="/js/lang-tabs.js"></script> <script src="/js/downloads.js"></script> </body> </html>