---
title: Using PXF with Unmanaged Data
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
The HAWQ Extension Framework (PXF) is an extensible framework that allows HAWQ to query data in external systems.
PXF includes built-in connectors for accessing data inside HDFS files, Hive tables, and HBase tables. PXF also integrates with HCatalog to query Hive tables directly.
PXF also lets you create custom connectors to access other parallel data stores or processing engines. To create these connectors using Java plug-ins, see [PXF External Tables and API](PXFExternalTableandAPIReference.html).
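
As a rough illustration of the two access paths mentioned above (a built-in connector and the HCatalog integration), the following sketch shows what each might look like in SQL. The host name `namenode`, the port `51200`, the HDFS path, the table names, and the column definitions are all assumptions made for this example; adjust them to your own cluster and data. See the topics below for the authoritative syntax and options.

```sql
-- Sketch only: host, port, path, and columns are hypothetical placeholders.
-- Define a readable external table over a comma-delimited text file in HDFS,
-- using the built-in HdfsTextSimple profile.
CREATE EXTERNAL TABLE pxf_sales_example (
    location      text,
    month         text,
    num_orders    int,
    total_sales   float8
)
LOCATION ('pxf://namenode:51200/data/pxf_examples/sales.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');

-- Query the external table like any other HAWQ table.
SELECT location, total_sales FROM pxf_sales_example;

-- Alternatively, query a Hive table directly through the HCatalog integration,
-- assuming a Hive table named sales_info exists in the default database.
SELECT * FROM hcatalog.default.sales_info;
```

The external table approach requires an explicit column definition, while the HCatalog form relies on the metadata that Hive already stores, which is why no table definition appears in the last statement.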
- **[Installing PXF Plug-ins](../pxf/InstallPXFPlugins.html)**
This topic describes how to install the built-in PXF service plug-ins that are required to connect PXF to HDFS, Hive, and HBase. You should install the appropriate RPMs on each node in your cluster.
- **[Configuring PXF](../pxf/ConfigurePXF.html)**
This topic describes how to configure the PXF service.
- **[Accessing HDFS File Data](../pxf/HDFSFileDataPXF.html)**
This topic describes how to access HDFS file data using PXF.
- **[Accessing Hive Data](../pxf/HivePXF.html)**
This topic describes how to access Hive data using PXF. You can query data stored in Hive in two ways: create PXF external tables and query those tables, or query Hive tables directly through HAWQ and PXF's integration with HCatalog. In the second case, HAWQ accesses the Hive table metadata stored in HCatalog.
- **[Accessing HBase Data](../pxf/HBasePXF.html)**
This topic describes how to access HBase data using PXF.
- **[Accessing JSON Data](../pxf/JsonPXF.html)**
This topic describes how to access JSON data using PXF.
- **[Accessing External SQL Databases](../pxf/JdbcPXF.html)**
This topic describes how to access data in external SQL databases using PXF.
- **[Writing Data to HDFS](../pxf/HDFSWritablePXF.html)**
This topic describes how to write to HDFS using PXF.
- **[Using Profiles to Read and Write Data](../pxf/ReadWritePXF.html)**
PXF profiles are collections of common metadata attributes that simplify reading and writing data. You can use any of the built-in profiles that come with PXF, or you can create your own.
- **[PXF External Tables and API](../pxf/PXFExternalTableandAPIReference.html)**
You can use the PXF API to create your own connectors to access any other type of parallel data store or processing engine.
- **[Troubleshooting PXF](../pxf/TroubleshootingPXF.html)**