In a class by itself, only Apache HAWQ (incubating) combines exceptional MPP-based analytics performance, robust ANSI SQL compliance, Hadoop ecosystem integration and manageability, and flexible data-store format support. All natively in Hadoop. No connectors required.
Built from a decade’s worth of massively parallel processing (MPP) expertise developed through the creation of the Pivotal Greenplum® enterprise database and open source PostgreSQL, HAWQ enables you to swiftly and interactively query Hadoop data, natively via HDFS.
HAWQ is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
Exceptional MPP-based analytics performance: HAWQ’s parallel processing architecture delivers high throughput and low-latency, potentially near-real-time, query responses that can scale to petabyte-sized datasets. Operate natively in Hadoop.
Robust ANSI SQL compliance: Leverage familiar skills. Achieve higher levels of compatibility for SQL-based applications and BI/data visualization tools. Execute complex queries and joins, including roll-ups and nested queries.
Hadoop ecosystem manageability and integration: Integrate and manage with YARN. Provision with Ambari. Interface with HCatalog. HAWQ supports Parquet, Avro, HBase, and others. Easily scale nodes up or down to meet performance or capacity requirements.
Plus, HAWQ works with Apache MADlib (incubating) machine learning libraries to execute advanced analytics for data-driven digital transformation, modern application development, data science purposes, and more.
HAWQ is breaking new ground for advanced analytics and machine learning in Hadoop. All contributors welcome! Get involved with the next wave in Hadoop analytic database technology. HAWQ is fully open source with Apache. Everything, from this website to the code itself, has been developed by a community of people who want to support and propel HAWQ technology.
We especially welcome additions and corrections to the documentation, wiki, and website to improve user experiences. Bug reports, fixes, and additions to the HAWQ code are also welcome. Helping users learn best practices also earns good karma in our community.
Got an idea for a feature or fix for HAWQ? We welcome contributors. Please discuss it on the dev mailing list, post an issue on Jira, or make a pull request on GitHub.
Apache MADlib (incubating) is a SQL-based advanced analytics and machine learning library that works with Apache HAWQ.
Are you a HAWQ expert? Want to share your knowledge with others? We are a collaborative community that shares best practices.
Even this website is a work of our community. Got a suggestion, fix, or even a redesign idea? Please contribute.
Documentation and how-to tips can always be made clearer, better organized, more complete, and translated to more languages.
Looking for an answer to a specific question? Are you an expert who likes to answer questions? Engage here: