--- layout: post status: PUBLISHED published: true title: Hadoop Community Meetup @ Beijing, Aug 2019 id: 5b18f0f0-6985-41b9-8357-cff113cb763d date: '2019-08-22 09:50:18 -0400' categories: hadoop tags: - meetup - hadoop permalink: hadoop/entry/hadoop-community-meetup-beijing-aug --- <p><html></p> <p><head><br /> <meta http-equiv=Content-Type content="text/html; charset=utf-8"><br /> <meta name=Generator content="Microsoft Word 15 (filtered)"></p> <style> <!--<br /> /* Font Definitions */<br /> @font-face<br /> {font-family:"Cambria Math";<br /> panose-1:2 4 5 3 5 4 6 3 2 4;}<br /> @font-face<br /> {font-family:Calibri;<br /> panose-1:2 15 5 2 2 2 4 3 2 4;}<br /> @font-face<br /> {font-family:"Calibri Light";<br /> panose-1:2 15 3 2 2 2 4 3 2 4;}<br /> @font-face<br /> {font-family:"DengXian Light";<br /> panose-1:2 1 6 0 3 1 1 1 1 1;}<br /> @font-face<br /> {font-family:Georgia;<br /> panose-1:2 4 5 2 5 4 5 2 3 3;}<br /> @font-face<br /> {font-family:"Noto Sans Symbols";<br /> panose-1:2 11 6 4 2 2 2 2 2 4;}<br /> @font-face<br /> {font-family:-webkit-standard;<br /> panose-1:2 11 6 4 2 2 2 2 2 4;}<br /> @font-face<br /> {font-family:"\@DengXian Light";}<br /> /* Style Definitions */<br /> p.MsoNormal, li.MsoNormal, div.MsoNormal<br /> {margin:0in;<br /> margin-bottom:.0001pt;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> center {<br /> text-align: center;<br /> border: 3px solid blue;<br /> }<br /> h1<br /> {mso-style-link:"Heading 1 Char";<br /> margin-top:12.0pt;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:0in;<br /> margin-bottom:.0001pt;<br /> page-break-after:avoid;<br /> font-size:16.0pt;<br /> font-family:"Calibri Light",sans-serif;<br /> color:#2F5496;<br /> font-weight:normal;}<br /> h2<br /> {mso-style-link:"Heading 2 Char";<br /> margin-top:2.0pt;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:0in;<br /> margin-bottom:.0001pt;<br /> page-break-after:avoid;<br /> font-size:13.0pt;<br /> 
font-family:"Calibri Light",sans-serif;<br /> color:#2F5496;<br /> font-weight:normal;}<br /> h3<br /> {margin-top:14.0pt;<br /> margin-right:0in;<br /> margin-bottom:4.0pt;<br /> margin-left:0in;<br /> page-break-after:avoid;<br /> font-size:14.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> h4<br /> {margin-top:12.0pt;<br /> margin-right:0in;<br /> margin-bottom:2.0pt;<br /> margin-left:0in;<br /> page-break-after:avoid;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> h5<br /> {margin-top:11.0pt;<br /> margin-right:0in;<br /> margin-bottom:2.0pt;<br /> margin-left:0in;<br /> page-break-after:avoid;<br /> font-size:11.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> h6<br /> {margin-top:10.0pt;<br /> margin-right:0in;<br /> margin-bottom:2.0pt;<br /> margin-left:0in;<br /> page-break-after:avoid;<br /> font-size:10.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> p.MsoTitle, li.MsoTitle, div.MsoTitle<br /> {mso-style-link:"Title Char";<br /> margin:0in;<br /> margin-bottom:.0001pt;<br /> font-size:28.0pt;<br /> font-family:"Calibri Light",sans-serif;<br /> letter-spacing:-.5pt;}<br /> p.MsoTitleCxSpFirst, li.MsoTitleCxSpFirst, div.MsoTitleCxSpFirst<br /> {mso-style-link:"Title Char";<br /> margin:0in;<br /> margin-bottom:.0001pt;<br /> font-size:28.0pt;<br /> font-family:"Calibri Light",sans-serif;<br /> letter-spacing:-.5pt;}<br /> p.MsoTitleCxSpMiddle, li.MsoTitleCxSpMiddle, div.MsoTitleCxSpMiddle<br /> {mso-style-link:"Title Char";<br /> margin:0in;<br /> margin-bottom:.0001pt;<br /> font-size:28.0pt;<br /> font-family:"Calibri Light",sans-serif;<br /> letter-spacing:-.5pt;}<br /> p.MsoTitleCxSpLast, li.MsoTitleCxSpLast, div.MsoTitleCxSpLast<br /> {mso-style-link:"Title Char";<br /> margin:0in;<br /> margin-bottom:.0001pt;<br /> font-size:28.0pt;<br /> font-family:"Calibri Light",sans-serif;<br /> letter-spacing:-.5pt;}<br /> p.MsoSubtitle, li.MsoSubtitle, div.MsoSubtitle<br /> {margin-top:.25in;<br /> margin-right:0in;<br /> 
margin-bottom:4.0pt;<br /> margin-left:0in;<br /> page-break-after:avoid;<br /> font-size:24.0pt;<br /> font-family:"Georgia",serif;<br /> color:#666666;<br /> font-style:italic;}<br /> p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph<br /> {margin-top:0in;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:.5in;<br /> margin-bottom:.0001pt;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst<br /> {margin-top:0in;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:.5in;<br /> margin-bottom:.0001pt;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle<br /> {margin-top:0in;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:.5in;<br /> margin-bottom:.0001pt;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast<br /> {margin-top:0in;<br /> margin-right:0in;<br /> margin-bottom:0in;<br /> margin-left:.5in;<br /> margin-bottom:.0001pt;<br /> font-size:12.0pt;<br /> font-family:"Calibri",sans-serif;}<br /> span.TitleChar<br /> {mso-style-name:"Title Char";<br /> mso-style-link:Title;<br /> font-family:"Calibri Light",sans-serif;<br /> letter-spacing:-.5pt;}<br /> span.Heading1Char<br /> {mso-style-name:"Heading 1 Char";<br /> mso-style-link:"Heading 1";<br /> font-family:"Calibri Light",sans-serif;<br /> color:#2F5496;}<br /> span.Heading2Char<br /> {mso-style-name:"Heading 2 Char";<br /> mso-style-link:"Heading 2";<br /> font-family:"Calibri Light",sans-serif;<br /> color:#2F5496;}<br /> .MsoChpDefault<br /> {font-family:"Calibri",sans-serif;}<br /> @page WordSection1<br /> {size:8.5in 11.0in;<br /> margin:1.0in 1.0in 1.0in 1.0in;}<br /> div.WordSection1<br /> {page:WordSection1;}<br /> /* 
List Definitions */<br /> ol<br /> {margin-bottom:0in;}<br /> ul<br /> {margin-bottom:0in;}<br /> --><br /> </style> <p></head></p> <p><body lang=EN-US></p> <div class=WordSection1> <p class=MsoTitle align=center style='text-align:center'>Hadoop Community Meetup @ Beijing</p> <h4> <p>Author: Junping Du (Tencent) & Wangda Tan (Cloudera)</p> </h4> <p><a href="https://blogs.apache.org/hadoop/mediaresource/fdaef52f-146c-46c3-b780-b501a7302d94"><img src="https://blogs.apache.org/hadoop/mediaresource/fdaef52f-146c-46c3-b780-b501a7302d94" alt="Picture01.jpg"></img></a></p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/581538b6-3a43-49c0-b092-eaede2739c03"><img src="https://blogs.apache.org/hadoop/mediaresource/581538b6-3a43-49c0-b092-eaede2739c03" alt="Picture02.jpg"></img></a></p> <h1>Overview</h1> <p class=MsoNormal>On Aug 11<sup>th</sup>, 2019, Hadoop developers and users gathered at Tencent&rsquo;s Sigma Center office in Beijing to share their latest work, with 12 presentations by engineers from Tencent, Cloudera, Alibaba, Didi, Xiaomi, Meituan, ByteDance (parent company of TikTok, Toutiao, etc.), JD.com, and Huawei. This was also the first Hadoop community meetup hosted by Apache Hadoop PMC members. </p> <h2>Attendees</h2> <p class=MsoNormal>Interest in the meetup was tremendous. The 200 spots available for in-person registration were fully booked within <b>10 minutes</b>. <b>150+</b> people attended in person, and <b>3000+</b> joined the live online sessions. </p> <p class=MsoNormal>In-person participants came from dozens of different companies and universities, and many of them flew in from Shanghai, Hangzhou, Shenzhen and even the San Francisco Bay Area! </p> <h1>Sessions</h1> <h2>1. 
Hadoop Community Update And Roadmaps</h2> <p class=MsoNormal>Junping Du @ Tencent and Wangda Tan @ Cloudera talked about Hadoop community updates and roadmaps.</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/f7f923fb-95a2-4da9-ab67-0647152a1532"><img src="https://blogs.apache.org/hadoop/mediaresource/f7f923fb-95a2-4da9-ab67-0647152a1532" alt="Picture03.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:-webkit-standard;color:black'>Junping Du @ Tencent</span></p> <p class=MsoNormal><span style='font-family:-webkit-standard;color:black'>&nbsp;</span></p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/46a5a5e5-285b-487e-b7d1-4dbc8f055d7c"><img src="https://blogs.apache.org/hadoop/mediaresource/46a5a5e5-285b-487e-b7d1-4dbc8f055d7c" alt="Picture04.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:"Times New Roman",serif'>Wangda Tan @ Cloudera</span></p> <p class=MsoNormal>Junping introduced recent trends in the storage field, such as better scalability and the move to the cloud. He covered features like RBF (Router-Based Federation), NameNode scalability improvements, cloud connector improvements, and Ozone. </p> <p class=MsoNormal>Wangda talked about recent trends in the compute field, such as better scalability, the move to cloud-native environments, containerization work, and support for machine-learning use cases. He covered the global scheduling framework for better scheduling throughput and placement quality; recent containerization work in YARN, such as runc and the interactive Docker shell; and YARN-on-cloud initiatives from the community, such as autoscaling and graceful decommissioning. Wangda also talked about Submarine and its release plans. 
</p> <p class=MsoNormal>Finally, Wangda looked back at the releases in 2018/2019 and shared the tentative Hadoop release plan for 2019, including 3.1.3, 3.2.1, and what&rsquo;s coming in 3.3.0. </p> <h2>2. Ozone: Hadoop native object store </h2> <p class=MsoNormal>Sammi (Yi) Chen @ Tencent talked about Ozone, the native object store project from the Hadoop community. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/7b5c8dc2-d493-4883-ac0a-b2330e3a4314"><img src="https://blogs.apache.org/hadoop/mediaresource/7b5c8dc2-d493-4883-ac0a-b2330e3a4314" alt="Picture05.jpg"></img></a></p> <p class=MsoNormal>Ozone is a strongly consistent distributed object store service that offers the same level of reliability, consistency, and usability as HDFS. Because it supports the S3 interface, it is useful not only for on-prem big-data workloads but also as a good option for moving big data to the cloud. </p> <p class=MsoNormal>Sammi talked about the architecture of Ozone and what&rsquo;s new in the Ozone 0.5 release. </p> <h2>3. YARN 3.x in Alibaba </h2> <p class=MsoNormal>Tao Yang from Alibaba talked about Hadoop use cases at Alibaba and how new features in YARN 3.x are being used to address them, including preemption, scheduling, resource over-commitment, scheduling diagnostics, and mixed deployment of online/offline workloads. Tao also talked about how new features in YARN help run Apache Flink on YARN better. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/c4b7a61e-4b32-40cd-b4a0-4e7b3a6693b6"><img src="https://blogs.apache.org/hadoop/mediaresource/c4b7a61e-4b32-40cd-b4a0-4e7b3a6693b6" alt="Picture06.jpg"></img></a></p> <p class=MsoNormal>Tao talked about many interesting features such as MultiNodeLookupPolicy, which lets the scheduler place jobs using a pluggable node sorter.</p> <h2>4. HDFS Best Practices learned from Didi&rsquo;s production environment. 
</h2> <p class=MsoNormal>Hui Fei from Didi talked about HDFS best practices learned from Didi&rsquo;s large-scale (hundreds of PBs) production environment.</p> <p class=MsoNormal>Hui first talked about storage use cases and scale in Didi&rsquo;s environment. Hui then covered the functionality and improvements that Didi&rsquo;s Hadoop team built on top of HDFS 2.7.2, such as security, NameNode federation, and the Balancer.</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/57b9a051-9e6b-4616-b1ca-dcdab17cd42f"><img src="https://blogs.apache.org/hadoop/mediaresource/57b9a051-9e6b-4616-b1ca-dcdab17cd42f" alt="Picture07.jpg"></img></a></p> <p class=MsoNormal>Hui also talked about the status of upgrading production clusters from Hadoop 2.7.2 to Hadoop 3.2.0. The primary driver of the upgrade is saving storage space: Didi wants to use features like Erasure Coding in Hadoop 3.x. </p> <p class=MsoNormal>Didi has upgraded a test cluster (100+ nodes) from 2.7.2 to 3.2.0, and has a backup cluster with 2k+ nodes running Hadoop 3.1.1 that will be rolling-upgraded to 3.2.0. The primary cluster, with 10K+ nodes (and 5 namespaces), will start upgrading to 3.2.0 in October.</p> <h2>5. 
Submarine: A one-stop, cross-platform machine learning platform</h2> <p class=MsoNormal>Xun Liu @ NetEase and Zhankun Tang @ Cloudera talked about the background, current status, and future of the Submarine project.</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/06051afb-2491-45d3-bade-4c3a8fdbcc4e"><img src="https://blogs.apache.org/hadoop/mediaresource/06051afb-2491-45d3-bade-4c3a8fdbcc4e" alt="Picture08.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:-webkit-standard;color:black'>Zhankun Tang @ Cloudera</span></p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:"Times New Roman",serif'>&nbsp;</span></p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/a0087efe-b086-4b75-bb36-03518dcaeb3d"><img src="https://blogs.apache.org/hadoop/mediaresource/a0087efe-b086-4b75-bb36-03518dcaeb3d" alt="Picture09.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:"Times New Roman",serif'>Xun Liu @ NetEase</span></p> <p class=MsoNormal>Machine learning involves many components, such as data preprocessing, feature extraction, model training/serving/management, and distributed workload management. The Submarine project, started by the Hadoop community, aims to cover these with a focus on the notebook experience. With Submarine, data scientists and machine learning engineers don&rsquo;t need to understand lower-level platforms such as YARN, K8s, or Docker containers. </p> <p class=MsoNormal>Zhankun showed a new feature called mini-submarine, which allows developers to try Submarine locally without installing a YARN cluster. 
</p> <p class=MsoNormal>Xun gave demos of:</p> <p class=MsoNormal style='margin-left:.5in;text-indent:-.25in;border:none'><span style='font-family:"Noto Sans Symbols"'>●<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style='color:black'>Integration of Submarine with the Zeppelin notebook. </span></p> <p class=MsoNormal style='margin-left:.5in;text-indent:-.25in;border:none'><span style='font-family:"Noto Sans Symbols"'>●<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span><span style='color:black'>A new Submarine web UI that allows data </span>scientists<span style='color:black'> to run jobs, manage models, etc. in a unified user experience. </span></p> <p class=MsoNormal>Xun also talked about companies reported to be using Submarine in production, such as NetEase, LinkedIn, Dahua, Ke.com, and JD.com. </p> <h2>6. Hadoop Improvements in Xiaomi</h2> <p class=MsoNormal>Chen Zhang and Kang Zhou from Xiaomi talked about how Hadoop is used at Xiaomi. They covered improvements to HDFS performance and scalability, as well as the problems they met and the solutions they found when trying to turn YARN into a platform. </p> <p class=MsoNormal>On the HDFS side, Chen talked about their improvements to HDFS federation, such as lowering the business impact of moving from a single NameNode to federated NameNodes. They have also improved NameNode performance, which now allows a single NameNode to support 600 million objects (files + blocks).</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/8e597b99-fb73-4e6a-9f1d-4211e1e58794"><img src="https://blogs.apache.org/hadoop/mediaresource/8e597b99-fb73-4e6a-9f1d-4211e1e58794" alt="Picture10.jpg"></img></a></p> <p class=MsoNormal>On the YARN side, Kang talked about usability improvements, such as to the RMStateStore and the History Server. 
Also, he talked about multi-cluster management tools, such as a unified client and RM UI for multiple clusters. Kang also talked about the scheduling optimizations they have made, such as caching resource usage and improving utilization and preemption. </p> <h2>7. Key Customizations of YARN @ ByteDance </h2> <p class=MsoNormal>Yakun Li from ByteDance talked about the customizations they made to their YARN clusters to handle an extra-large-scale, multi-cluster environment, including utilization improvements, stabilization, optimizations for streaming/model-training environments, and multi-datacenter issues. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/f217e756-cb1c-4e0c-955b-434e0b4d5dae"><img src="https://blogs.apache.org/hadoop/mediaresource/f217e756-cb1c-4e0c-955b-434e0b4d5dae" alt="Picture11.jpg"></img></a> </p> <p class=MsoNormal>For scheduling, Yakun talked about how they implemented Gang Scheduling in YARN, which schedules per application instead of per node and can achieve low latency and support hard/soft constraints. He also talked about their multi-threaded version of FairScheduler, which can push the number of container allocations per second up to 3k. </p> <p class=MsoNormal>For mixed-workload (batch, streaming, ML) deployment, Yakun said they have adopted Docker on YARN to isolate dependencies, added CPUSET/NUMA support, and temporarily skip nodes whose physical utilization is too high. All these efforts help mixed workloads run well in the same cluster.</p> <h2>8. 
YuniKorn: A New Unified Scheduler for Both YARN and K8s </h2> <p class=MsoNormal>Weiwei Yang and Wangda Tan from Cloudera talked about their work on a new scheduler named YuniKorn (<a href="https://github.com/cloudera/yunikorn-core"><span style='color:#1155CC'>https://github.com/cloudera/yunikorn-core</span></a>) and how it can benefit both the YARN and K8s communities. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/22b6c49d-62eb-4361-88fc-383f4c0dd430"><img src="https://blogs.apache.org/hadoop/mediaresource/22b6c49d-62eb-4361-88fc-383f4c0dd430" alt="Picture12.jpg"></img></a> </p> <p class=MsoNormal align=center style='text-align:center'><span style='font-family:"Times New Roman",serif'>Weiwei Yang (Right) and Wangda Tan (Left) from Cloudera</span></p> <p class=MsoNormal>The scheduler of a container orchestration system, such as YARN or Kubernetes, is a critical component that users rely on to plan resources and manage applications. The two systems&rsquo; schedulers have different characteristics to support different workloads.</p> <p class=MsoNormal>YARN schedulers are optimized for high-throughput, multi-tenant batch workloads. They can scale up to 50k nodes per cluster and schedule 20k containers per second. Kubernetes schedulers, on the other hand, are optimized for long-running services, but many features, such as hierarchical queues, fair resource sharing, and preemption, are either missing or not mature enough at this point in time.</p> <p class=MsoNormal>Underneath, however, they are responsible for the same job: deciding resource allocations. The speakers mentioned the need to run services on YARN as well as jobs on Kubernetes. 
This motivated them to create a universal scheduler that works for both YARN and Kubernetes and is configured in the same way.</p> <p class=MsoNormal>In this talk, Weiwei and Wangda talked about their efforts to design and implement the universal scheduler. They have already integrated it with Kubernetes, and YARN integration is a work in progress. This scheduler brings long-wanted features such as hierarchical queues, fairness between users/jobs/queues, and preemption to Kubernetes, and it brings service scheduling enhancements to YARN. Most importantly, it gives YARN and Kubernetes the opportunity to share the same user experience for scheduling big data workloads, and any improvement to this universal scheduler can benefit both the Kubernetes and YARN communities.</p> <h2>9. HDFS cluster improvements and optimization practices in Meituan Dianping </h2> <p class=MsoNormal>Xiaoqiao He from Meituan Dianping talked about the current scale of their Hadoop clusters, which have kept growing since 2015. So far, there are more than 30k nodes in the Hadoop clusters. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/ef6801e9-77ae-4bd1-b32c-ee889a16a7ca"><img src="https://blogs.apache.org/hadoop/mediaresource/ef6801e9-77ae-4bd1-b32c-ee889a16a7ca" alt="Picture13.jpg"></img></a></p> <p class=MsoNormal>Xiaoqiao shared many details and practices regarding the infrastructure of their physical deployments, especially the solution for clusters spanning multiple regions. In the last part, Xiaoqiao showed some practices for optimizing HDFS clusters, such as improving the NameNode restart process and rebalancing NameNode workload.</p> <h2><a name="_heading=h.aldjt4xtlgc"></a>10. Evolution of YARN in JD.com</h2> <p class=MsoNormal>Wanqiang Ji from JD.com talked about how YARN has evolved to support JD.com&rsquo;s business needs. 
</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/5513cff6-0674-4215-8a2f-12ade3d9b04d"><img src="https://blogs.apache.org/hadoop/mediaresource/5513cff6-0674-4215-8a2f-12ade3d9b04d" alt="Picture14.jpg"></img></a></p> <p class=MsoNormal>Over the last 3 years, the maximum number of nodes in a single YARN cluster has scaled from 3k to 5k, 10k, and then 16k. Internally, there is work to balance resources between YARN and K8s clusters. There are also improvements to container eviction policies to make sure nodes won&rsquo;t crash or restart when a machine&rsquo;s physical utilization grows above a certain level. </p> <h2><a name="_heading=h.o4jjvh6qs4pp"></a>11. Lessons learned from large scale YARN cluster operation @ Tencent</h2> <p class=MsoNormal>Jun Gong and Dongdong Chen from Tencent talked about their work supporting large-scale YARN clusters inside Tencent. </p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/32c57607-a28c-41de-96a5-f0371a5a6748"><img src="https://blogs.apache.org/hadoop/mediaresource/32c57607-a28c-41de-96a5-f0371a5a6748" alt="Picture15.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'>Jun Gong @ Tencent</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/d65796d7-4740-41c5-9e36-ade2a9d5573b"><img src="https://blogs.apache.org/hadoop/mediaresource/d65796d7-4740-41c5-9e36-ade2a9d5573b" alt="Picture16.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'>Dongdong Chen @ Tencent</p> <p class=MsoNormal>Jun and Dongdong shared that, inside Tencent, they made wide use of SLS (the YARN Scheduler Load Simulator) to find scheduler bottlenecks, and that many of the scheduler improvements have been contributed back to the community. After optimization, their production cluster has 2k+ queues, 8K+ nodes, and 5k+ concurrent jobs. 
They can achieve 3k+ container allocations per second and more than 100 million container allocations per day.</p> <p class=MsoNormal>Jun and Dongdong also shared how they use YARN CGroups parameters to fine-tune CPU/memory/network shares for launched YARN containers in a multi-tenant cluster.</p> <h2><a name="_heading=h.sxlj482o0isz"></a>12. Run Spark and Hadoop on ARM</h2> <p class=MsoNormal>Rui Chen and Sheng Liu from Huawei shared their work on running Spark and Hadoop on ARM.</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/b6a72eef-5158-4abd-bb83-33eb867f33f3"><img src="https://blogs.apache.org/hadoop/mediaresource/b6a72eef-5158-4abd-bb83-33eb867f33f3" alt="Picture17.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'>Rui Chen</p> <p class=MsoNormal align=center style='text-align:center'>&nbsp;</p> <p><a href="https://blogs.apache.org/hadoop/mediaresource/88c34ec0-d4fd-47c8-8ed9-47724d6c715d"><img src="https://blogs.apache.org/hadoop/mediaresource/88c34ec0-d4fd-47c8-8ed9-47724d6c715d" alt="Picture18.jpg"></img></a></p> <p class=MsoNormal align=center style='text-align:center'>Sheng Liu</p> <p class=MsoNormal align=center style='text-align:center'>&nbsp;</p> <p class=MsoNormal>Rui and Sheng shared their motivation for running Hadoop and Spark on the ARM platform: high performance and power efficiency. They then shared the status of ARM support in Hadoop and Spark, along with details of building a release tarball on the ARM platform, including parameters and issues. 
In the last part, they introduced how the Hadoop/Spark release work can ensure proper testing on the ARM platform, and said they were building a community called OpenLab to make the process smoother.</p> <p class=MsoNormal align=center style='text-align:center'>&nbsp;</p> <h1>Acknowledgments</h1> <p class=MsoNormal>Thanks to everyone who contributed to this successful event in one way or another, including the following speakers:</p> <p class=MsoNormal>Sammi Chen, Jun Gong and Dongdong Chen from Tencent, </p> <p class=MsoNormal>Weiwei Yang, Zhankun Tang from Cloudera, </p> <p class=MsoNormal>Wanqiang Ji from JD.com, </p> <p class=MsoNormal>Tao Yang from Alibaba, </p> <p class=MsoNormal>Chen Zhang and Kang Zhou from Xiaomi, </p> <p class=MsoNormal>Hui Fei from Didi, </p> <p class=MsoNormal>Rui Chen and Sheng Liu from Huawei, </p> <p class=MsoNormal>Xiaoqiao He from Meituan Dianping,</p> <p class=MsoNormal>Yakun Li from ByteDance,</p> <p class=MsoNormal>and Xun Liu from NetEase.</p> <p class=MsoNormal>And special thanks to Chunyu Wang, Summer Xia, and Katty Ma for organizing the meetup!</p> </div> <p></body></p> <p></html></p>