
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="description" content="Apache Druid"> <meta name="keywords" content="druid,kafka,database,analytics,streaming,real-time,real time,apache,open source"> <meta name="author" content="Apache Software Foundation"> <title>Druid | 15 Minutes to Live Druid</title> <link rel="alternate" type="application/atom+xml" href="/feed"> <link rel="shortcut icon" href="/img/favicon.png"> <link rel="stylesheet" href="/assets/css/font-awesome-5.css"> <link href='//fonts.googleapis.com/css?family=Open+Sans+Condensed:300,700,300italic|Open+Sans:300italic,400italic,600italic,400,300,600,700' rel='stylesheet' type='text/css'> <link rel="stylesheet" href="/css/bootstrap-pure.css?v=1.1"> <link rel="stylesheet" href="/css/base.css?v=1.1"> <link rel="stylesheet" href="/css/header.css?v=1.1"> <link rel="stylesheet" href="/css/footer.css?v=1.1"> <link rel="stylesheet" href="/css/syntax.css?v=1.1"> <link rel="stylesheet" href="/css/docs.css?v=1.1"> <script> (function() { var cx = '000162378814775985090:molvbm0vggm'; var gcse = document.createElement('script'); gcse.type = 'text/javascript'; gcse.async = true; gcse.src = (document.location.protocol == 'https:' ? 
'https:' : 'http:') + '//cse.google.com/cse.js?cx=' + cx; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(gcse, s); })(); </script> </head> <body> <!-- Start page_header include --> <script src="//ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script> <div class="top-navigator"> <div class="container"> <div class="left-cont"> <a class="logo" href="/"><span class="druid-logo"></span></a> </div> <div class="right-cont"> <ul class="links"> <li class=""><a href="/technology">Technology</a></li> <li class=""><a href="/use-cases">Use Cases</a></li> <li class=""><a href="/druid-powered">Powered By</a></li> <li class=""><a href="/docs/latest/design/">Docs</a></li> <li class=""><a href="/community/">Community</a></li> <li class="header-dropdown"> <a>Apache</a> <div class="header-dropdown-menu"> <a href="https://www.apache.org/" target="_blank">Foundation</a> <a href="https://www.apache.org/events/current-event" target="_blank">Events</a> <a href="https://www.apache.org/licenses/" target="_blank">License</a> <a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> <a href="https://www.apache.org/security/" target="_blank">Security</a> <a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a> </div> </li> <li class=" button-link"><a href="/downloads.html">Download</a></li> </ul> </div> </div> <div class="action-button menu-icon"> <span class="fa fa-bars"></span> MENU </div> <div class="action-button menu-icon-close"> <span class="fa fa-times"></span> MENU </div> </div> <script type="text/javascript"> var $menu = $('.right-cont'); var $menuIcon = $('.menu-icon'); var $menuIconClose = $('.menu-icon-close'); function showMenu() { $menu.fadeIn(100); $menuIcon.fadeOut(100); $menuIconClose.fadeIn(100); } $menuIcon.click(showMenu); function hideMenu() { $menu.fadeOut(100); $menuIconClose.fadeOut(100); $menuIcon.fadeIn(100); } $menuIconClose.click(hideMenu); 
$(window).resize(function() { if ($(window).width() >= 840) { $menu.fadeIn(100); $menuIcon.fadeOut(100); $menuIconClose.fadeOut(100); } else { $menu.fadeOut(100); $menuIcon.fadeIn(100); $menuIconClose.fadeOut(100); } }); </script> <!-- Stop page_header include --> <link rel="stylesheet" href="/css/blogs.css"> <div class="blog druid-header"> <div class="row"> <div class="col-md-8 col-md-offset-2"> <div class="title-image-wrap"> <div class="title-spacer"></div> <img class="title-image" src="http://metamarkets.com/wp-content/uploads/2013/04/Druid-Cluster1.jpg" alt="15 Minutes to Live Druid"/> </div> </div> </div> </div> <div class="container blog"> <div class="row"> <div class="col-md-8 col-md-offset-2"> <div class="blog-entry"> <h1>15 Minutes to Live Druid</h1> <p class="text-muted">by <span class="author text-uppercase">Jaypal Sethi</span> · April 3, 2013</p> <p>Big Data reflects today’s world, where data-generating events are measured in the billions and business decisions based on insight derived from this data are measured in seconds. There are few tools that provide deep insight into both live and stationary data as business events are occurring; Druid was designed specifically to serve this purpose.</p> <p>If you’re not familiar with Druid, it’s a powerful, open source, real-time analytics database designed to allow queries on large quantities of streaming data – that means querying data as it’s being ingested into the system (see our previous <a href="http://metamarkets.com/2012/metamarkets-open-sources-druid/">blog post</a>). Many databases claim they are real-time because they are “real fast”; this usually works for smaller workloads or for customers with infinite IT budgets. 
For companies like Netflix, whose engineers use Druid to comb through <a href="http://www.slideshare.net/g9yuayon/netflix-druidstrata2013">70 billion log events per day, ingesting over 2 TB per hour at peak times</a> (more on this in a later blog post), real-time means they have to query data as it’s being ingested into the system.</p> <p>Going a step further, Druid provides benefits for both real-time and non-real-time uses by allowing arbitrary drill-downs and n-dimensional filtering without any impact on performance. Beyond being a key feature used by <a href="http://www.metamarkets.com/">Metamarkets</a> (average query times of less than 500 milliseconds), it’s also a valuable capability for Netflix, and a key use case for the R community.</p> <p>Outside of features and functionality, the value of so many successful open source projects can be attributed to their user community. As a sponsor of this project, we at Metamarkets count supporting our growing Druid Community among our core goals. In fact, this blog post is a good example of responding to community feedback to make Druid immediately accessible to users who want to explore and become familiar with the database.</p> <p>Today, we’re excited to announce a ready-to-run Druid Personal Demo Cluster with a pre-loaded test workload: the Wikipedia edit stream. The DPDC (Druid Personal Demo Cluster) is available via AWS as a StackTemplate and is free to use and run; all that’s required is your own AWS account and 15 minutes.</p> <p>The DPDC is designed to provide a small but realistic, fully functional Druid environment, allowing users to become familiar with a working example of a Druid system, write queries, and understand how to manage the environment. The DPDC is also extensible; once users are familiar with Druid, we encourage them to load their own data and to continue learning. 
While the DPDC is far from an actual deployment, it’s designed to be an educational tool and an on-ramp towards your own deployment.</p> <p>The AWS (Amazon Web Services) <a href="http://aws.amazon.com/cloudformation/">CloudFormation</a> Template pulls together two Druid AMIs and creates a pre-configured Druid Cluster preloaded with the Wikipedia edit stream, along with a basic query interface to help you become familiar with Druid capabilities like drill-downs on arbitrary dimensions and filters.</p> <p>What’s in this Druid Demo Cluster?</p> <ol> <li><p>A single Master node, based on a preconfigured AWS AMI (Amazon Machine Image), that also contains the Zookeeper broker, the Indexer, and a MySQL instance which keeps track of system metadata. You can read more about Druid architecture <a href="https://github.com/metamx/druid/wiki/Design">here</a>.</p></li> <li><p>Three compute nodes based on another AWS AMI; these compute nodes have been pre-configured to work with the Master node and already contain the Wikipedia edit stream data (no specific setup is required).</p></li> </ol> <p>How to Get Started:</p> <p>Our quick start guide is located on the Druid GitHub wiki: <a href="https://github.com/metamx/druid/wiki/Druid-Personal-Demo-Cluster">https://github.com/metamx/druid/wiki/Druid-Personal-Demo-Cluster</a></p> <p>For support, please join our mailing list (Google Groups): <a href="https://groups.google.com/d/forum/druid-development">https://groups.google.com/d/forum/druid-development</a>. 
We welcome your feedback and contributions as we consider adding more content for the DPDC.</p> <p>Need more?</p> <p>Try out our connectors – we recently open-sourced our RDruid connector and will be holding a Druid Meetup where we’ll conduct a hands-on mini-lab to get attendees working with Druid.</p> <p>The community also contributed a Ruby client (<a href="https://github.com/madvertise/ruby-druid">https://github.com/madvertise/ruby-druid</a>) and is rumored to be working on Python and SQL clients. And a massive thanks to the team at <a href="http://skilledanalysts.com/">SkilledAnalysts</a> for their contributions to the DPDC and their continued involvement in the Druid community.</p> <p>Finally, if you’re looking for more information on Druid, you can find it on our <a href="http://metamarkets.com/product/technology/">technology page</a>.</p> <p>IMAGE: <a href="http://www.shutterstock.com/gallery-86570p1.html">PEDRO MIGUEL SOUSA</a> / <a href="http://www.shutterstock.com/">SHUTTERSTOCK</a></p> </div> </div> </div> </div> <!-- Start page_footer include --> <footer class="druid-footer"> <div class="container"> <div class="text-center"> <p> <a href="/technology">Technology</a>&ensp;·&ensp; <a href="/use-cases">Use Cases</a>&ensp;·&ensp; <a href="/druid-powered">Powered by Druid</a>&ensp;·&ensp; <a href="/docs/latest/">Docs</a>&ensp;·&ensp; <a href="/community/">Community</a>&ensp;·&ensp; <a href="/downloads.html">Download</a>&ensp;·&ensp; <a href="/faq">FAQ</a> </p> </div> <div class="text-center"> <a title="Join the user group" href="https://groups.google.com/forum/#!forum/druid-user" target="_blank"><span class="fa fa-comments"></span></a>&ensp;·&ensp; <a title="Follow Druid" href="https://twitter.com/druidio" target="_blank"><span class="fab fa-twitter"></span></a>&ensp;·&ensp; <a title="GitHub" href="https://github.com/apache/druid" target="_blank"><span class="fab fa-github"></span></a> </div> <div class="text-center license"> Copyright © 2020 <a 
href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br> Except where otherwise noted, licensed under <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>.<br> Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries. </div> </div> </footer> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-131010415-1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'UA-131010415-1'); </script> <script> function trackDownload(type, url) { ga('send', 'event', 'download', type, url); } </script> <script src="//code.jquery.com/jquery.min.js"></script> <script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script> <script src="/assets/js/druid.js"></script> <!-- stop page_footer include --> </body> </html>