Apache HAWQ

Apache Hadoop Native SQL.
Advanced Analytics MPP Database for Enterprises.

In a class by itself, only Apache HAWQ (incubating) combines exceptional MPP-based analytics performance, robust ANSI SQL compliance, Hadoop ecosystem integration and manageability, and flexible data-store format support. All natively in Hadoop. No connectors required.

Built from a decade’s worth of massively parallel processing (MPP) expertise developed through the creation of the Pivotal Greenplum® enterprise database and open source PostgreSQL, HAWQ enables you to swiftly and interactively query Hadoop data, natively via HDFS.

HAWQ is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Exceptional performance

HAWQ’s parallel processing architecture delivers high throughput and low-latency, potentially near-real-time, query responses that can scale to petabyte-sized datasets. Operate natively in Hadoop.

Robust ANSI SQL compliance

Leverage familiar skills. Achieve higher levels of compatibility for SQL-based applications and BI/data visualization tools. Execute complex queries and joins, including roll-ups and nested queries.

Hadoop ecosystem manageability and integration

Integrate and manage with YARN. Provision with Ambari. Interface with HCatalog. HAWQ supports Parquet, Avro, HBase, and others. Easily scale nodes up or down to meet performance or capacity requirements.

Plus, HAWQ works with Apache MADlib (incubating) machine learning libraries to execute advanced analytics for data-driven digital transformation, modern application development, data science purposes, and more.

Contribute to Advanced Enterprise Technology!

HAWQ is breaking new ground for advanced analytics and machine learning in Hadoop. All contributors welcome! Get involved with the next wave in Hadoop analytic database technology. HAWQ is fully open source with Apache. Everything, from this website to the code itself, has been developed by a community of people who want to support and propel HAWQ technology.

We especially welcome additions and corrections to the documentation, wiki, and website to improve user experiences. Bug reports, fixes, and additions to the HAWQ code are also welcome. Helping users learn best practices also earns good karma in our community.

Apache HAWQ (incubating) Code

Got an idea for a feature or fix for HAWQ? We welcome contributors. Please discuss it on the dev mailing list, post an issue on Jira, or make a pull request on GitHub.

Apache MADlib (incubating)

Apache MADlib (incubating) is a SQL-based advanced analytics and machine learning library that works with Apache HAWQ.

— Visit Apache MADlib community

Evangelism

Are you a HAWQ expert? Want to share your knowledge with others? We are a collaborative community that shares best practices.

Website

Even this website is a work of our community. Got a suggestion, fix, or even a redesign idea? Please contribute.

Documentation

Documentation and how-to tips can always be made clearer, better organized, more complete, and translated to more languages.

— HAWQ Wiki
— HAWQ Docs
— HAWQ Extension Framework API (Java Doc)

User Questions

Looking for an answer to a specific question? Are you an expert who likes to answer questions? Engage here:

Mailing Lists
