parquet-column/src/main/java/org/apache/parquet/filter2/predicate/FilterApi.java (5 lines):
- line 62: // TODO: Support repeated columns (https://issues.apache.org/jira/browse/PARQUET-34)
- line 64: // TODO: Support filtering on groups (eg, filter where this group is / isn't null)
- line 65: // TODO: (https://issues.apache.org/jira/browse/PARQUET-43)
- line 67: // TODO: Consider adding support for more column types that aren't coupled with parquet types, eg Column
- line 68: // TODO: (https://issues.apache.org/jira/browse/PARQUET-35)

parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/FilteringPrimitiveConverter.java (4 lines):
- line 42: // TODO: this works, but
- line 43: // TODO: essentially turns off the benefits of dictionary support
- line 44: // TODO: even if the underlying delegate supports it.
- line 45: // TODO: we should support it here. (https://issues.apache.org/jira/browse/PARQUET-36)

parquet-common/src/main/java/org/apache/parquet/bytes/BytesUtils.java (3 lines):
- line 86: // TODO: this is duplicated code in LittleEndianDataInputStream
- line 160: // TODO: this is duplicated code in LittleEndianDataOutputStream
- line 248: * TODO: the implementation is compatible with readZigZagVarInt. Is there a need for different functions? (see the sketch after this list)

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java (3 lines):
- line 304: * TODO: support rewrite by record to break the original row groups into reasonable ones.
- line 331: * TODO: support rewrite by record to break the original row groups into reasonable ones.
- line 356: * TODO: support rewrite by record to break the original row groups into reasonable ones.

parquet-column/src/main/java/org/apache/parquet/io/RecordReaderImplementation.java (3 lines):
- line 261: .getRootConverter(); // TODO: validator(wrap(recordMaterializer), validating, root.getType());
- line 293: // TODO: when we use nextColumnIdxForRepLevel, should we provide current rep level or the rep level for
- line 393: // TODO: have those wrappers for a converter

parquet-column/src/main/java/org/apache/parquet/filter2/predicate/SchemaCompatibilityValidator.java (3 lines):
- line 55: * TODO: detect if a column is optional or required and validate that eq(null)
- line 56: * TODO: is not called on required fields (is that too strict?)
- line 57: * TODO: (https://issues.apache.org/jira/browse/PARQUET-44)

parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java (3 lines):
- line 138: // TODO: This file has become too long!
- line 139: // TODO: Lets split it up: https://issues.apache.org/jira/browse/PARQUET-310
- line 1076: case BYTE_ARRAY: // TODO: rename BINARY and remove this switch

parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/IncrementallyUpdatedFilterPredicateEvaluator.java (2 lines):
- line 32: * TODO: We could also build an evaluator that detects if enough values are known to determine the outcome
- line 33: * TODO: of the predicate and quit the record assembly early. (https://issues.apache.org/jira/browse/PARQUET-37)

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java (2 lines):
- line 1510: // TODO: this should use getDictionaryPageOffset() but it isn't reliable.
- line 1547: return null; // TODO: should this complain?
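The BytesUtils item at line 248 asks whether readZigZagVarInt needs a separate counterpart. For context, here is a minimal sketch of the zig-zag-plus-varint scheme that question refers to; the method names are illustrative, not the actual BytesUtils API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ZigZagSketch {
  // Zig-zag maps signed ints onto unsigned ones so small magnitudes encode
  // in few varint bytes: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
  static int zigZagEncode(int v) {
    return (v << 1) ^ (v >> 31);
  }

  static int zigZagDecode(int v) {
    return (v >>> 1) ^ -(v & 1);
  }

  // Unsigned LEB128 varint: 7 payload bits per byte, high bit means "more bytes follow".
  static void writeUnsignedVarInt(int v, OutputStream out) throws IOException {
    while ((v & 0xFFFFFF80) != 0) {
      out.write((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    out.write(v & 0x7F);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeUnsignedVarInt(zigZagEncode(-1), out); // -1 encodes as the single byte 0x01
    System.out.println(zigZagDecode(1));        // prints -1
  }
}
```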
parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java (2 lines):
- line 65: // TODO: this belongs in the parquet-column project, but some of the classes here need to be moved too
- line 66: // TODO: (https://issues.apache.org/jira/browse/PARQUET-38)

parquet-column/src/main/java/org/apache/parquet/filter2/predicate/ValidTypeMap.java (2 lines):
- line 36: * TODO: this has some overlap with {@link PrimitiveTypeName#javaType}
- line 37: * TODO: (https://issues.apache.org/jira/browse/PARQUET-30)

parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java (2 lines):
- line 692: /// FIXME: it is unfortunate that we don't know the max repetition level here.
- line 698: /// FIXME: it is unfortunate that we don't know the max definition level here.

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputCommitter.java (2 lines):
- line 51: // TODO: This method should propagate errors, and we should clean up
- line 52: // TODO: all the catching of Exceptions below -- see PARQUET-383

parquet-variant/src/main/java/org/apache/parquet/variant/VariantBuilder.java (2 lines):
- line 87: // TODO: Support sorted dictionary keys.
- line 100: // TODO: Reduce the copying, and look into builder reuse.

parquet-thrift/src/main/java/org/apache/parquet/thrift/ThriftRecordConverter.java (2 lines):
- line 332: // TODO: check thrift has no float
- line 343: // TODO: make subclass per type

parquet-thrift/src/main/java/org/apache/parquet/thrift/ThriftSchemaConvertVisitor.java (2 lines):
- line 153: // TODO: This is a bug! this should be REQUIRED but changing this will
- line 292: // TODO: in the future, we should just filter these records out instead

parquet-column/src/main/java/org/apache/parquet/filter2/predicate/UserDefinedPredicate.java (2 lines):
- line 29: // TODO: consider avoiding autoboxing and adding the specialized methods for each type
- line 30: // TODO: downside is that's fairly unwieldy for users (see the sketch after this list)

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java (2 lines):
- line 588: // out.write(MAGIC); // TODO: add a magic delimiter
- line 1685: // TODO: column/offset indexes are not copied

parquet-common/src/main/java/org/apache/parquet/bytes/LittleEndianDataInputStream.java (2 lines):
- line 335: // TODO: has this been benchmarked against two alternate implementations?
- line 365: // TODO: see perf question above in readInt

parquet-column/src/main/java/org/apache/parquet/column/values/delta/DeltaBinaryPackingValuesReader.java (2 lines):
- line 105: // TODO: probably implement it separately
- line 158: // TODO: update the packer to consume from an InputStream

parquet-cli/src/main/java/org/apache/parquet/cli/commands/ShowPagesCommand.java (2 lines):
- line 137: // TODO: Show total column size and overall size per value in the column summary line
- line 188: // TODO: the compressed size of a dictionary page is lost in Parquet

parquet-column/src/main/java/org/apache/parquet/column/page/PageReadStore.java (1 line):
- line 28: * TODO: rename to RowGroup?
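The UserDefinedPredicate items (lines 29-30) weigh autoboxing against API bloat. A hedged sketch of the two shapes being compared, using hypothetical class names rather than the real class hierarchy:

```java
// Roughly today's shape (simplified): one generic method, so primitive
// column values box on every keep() call, e.g. int -> Integer.
abstract class BoxedPredicate<T extends Comparable<T>> {
  abstract boolean keep(T value);
}

// The alternative the TODO considers: per-type overloads avoid boxing but
// multiply the API surface, the "fairly unwieldy for users" downside.
abstract class SpecializedPredicate {
  abstract boolean keep(int value);
  abstract boolean keep(long value);
  abstract boolean keep(float value);
  abstract boolean keep(double value);
  // ... one overload per supported physical type
}
```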
parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnReaderBase.java (1 line):
- line 160: // TODO: rework that

parquet-thrift/src/main/java/org/apache/parquet/hadoop/thrift/AbstractThriftWriteSupport.java (1 line):
- line 103: // TODO: make this work for non-tbase types

parquet-column/src/main/java/org/apache/parquet/column/values/ValuesWriter.java (1 line):
- line 38: // TODO: maybe consolidate into a getPage

parquet-common/src/main/java/org/apache/parquet/bytes/MultiBufferInputStream.java (1 line):
- line 145: // TODO: use an allocator

parquet-column/src/main/java/org/apache/parquet/column/values/delta/DeltaBinaryPackingValuesWriterForLong.java (1 line):
- line 120: // TODO: should this cache the packer?

parquet-column/src/main/java/org/apache/parquet/column/page/DictionaryPage.java (1 line):
- line 43: this(bytes, (int) bytes.size(), dictionarySize, encoding); // TODO: fix sizes long or int

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/ParquetRewriter.java (1 line):
- line 214: // TODO: Should we mark it as deprecated to encourage the main constructor usage? it is also used only from

parquet-column/src/main/java/org/apache/parquet/column/values/delta/DeltaBinaryPackingValuesWriter.java (1 line):
- line 79: // TODO: remove this.

parquet-cli/src/main/java/org/apache/parquet/cli/csv/RecordBuilder.java (1 line):
- line 160: // TODO: translate to enum class

parquet-avro/src/main/java/org/apache/parquet/avro/AvroWriteSupport.java (1 line):
- line 257: // TODO: what if the value is null?

parquet-column/src/main/java/org/apache/parquet/io/api/RecordConsumer.java (1 line):
- line 136: // TODO: make this abstract in 2.0

parquet-column/src/main/java/org/apache/parquet/schema/PrimitiveType.java (1 line):
- line 686: // TODO: should we print decimal metadata too?

parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridEncoder.java (1 line):
- line 56: * Only supports positive values (including 0) // TODO: is that ok? Should we make a signed version?

parquet-common/src/main/java/org/apache/parquet/bytes/BytesInput.java (1 line):
- line 370: // TODO: more efficient

parquet-common/src/main/java/org/apache/parquet/bytes/LittleEndianDataOutputStream.java (1 line):
- line 147: // TODO: see note in LittleEndianDataInputStream: maybe faster (see the sketch after this list)

parquet-encoding/src/main/java/org/apache/parquet/column/values/bitpacking/BitPacking.java (1 line):
- line 28: // TODO: rework the whole thing. It does not need to use streams at all

parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnWriterV2.java (1 line):
- line 95: // TODO: rework this API. The bytes shall be retrieved before the encoding (encoding might be different

parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/ReadBenchmarks.java (1 line):
- line 105: // TODO how to handle lzo jar?

parquet-common/src/main/java/org/apache/parquet/glob/GlobParser.java (1 line):
- line 77: // TODO: maybe turn this check off?
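The benchmarking question on LittleEndianDataInputStream line 335 (in the previous list) and the "maybe faster" note on LittleEndianDataOutputStream line 147 both concern how little-endian integers get assembled. The two obvious candidate implementations look roughly like this; the code is illustrative, not the classes' actual internals:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianReadSketch {
  // Variant 1: explicit shift-and-or over the four bytes.
  static int readIntShift(byte[] buf, int off) {
    return (buf[off] & 0xFF)
        | (buf[off + 1] & 0xFF) << 8
        | (buf[off + 2] & 0xFF) << 16
        | (buf[off + 3] & 0xFF) << 24;
  }

  // Variant 2: delegate byte-order handling to ByteBuffer.
  static int readIntBuffer(byte[] buf, int off) {
    return ByteBuffer.wrap(buf, off, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }

  public static void main(String[] args) {
    byte[] buf = {0x78, 0x56, 0x34, 0x12};
    System.out.println(Integer.toHexString(readIntShift(buf, 0)));  // 12345678
    System.out.println(Integer.toHexString(readIntBuffer(buf, 0))); // 12345678
  }
}
```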
parquet-column/src/main/java/org/apache/parquet/schema/MessageType.java (1 line):
- line 108: // TODO: optimize this

parquet-cli/src/main/java/org/apache/parquet/cli/commands/ShowDictionaryCommand.java (1 line):
- line 40: // TODO: show dictionary size in values and in bytes

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java (1 line):
- line 301: // TODO: decide if we compress

parquet-thrift/src/main/java/org/apache/parquet/hadoop/thrift/ThriftBytesWriteSupport.java (1 line):
- line 85: @SuppressWarnings("rawtypes") // TODO: fix type

parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridDecoder.java (1 line):
- line 94: currentBuffer = new int[currentCount]; // TODO: reuse a buffer (see the sketch after this list)

parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java (1 line):
- line 691: // TODO: figure out a way to use MessageOrBuilder

parquet-column/src/main/java/org/apache/parquet/column/Encoding.java (1 line):
- line 114: * TODO: Should we rename this to be more clear?

parquet-column/src/main/java/org/apache/parquet/io/EmptyRecordReader.java (1 line):
- line 38: .getRootConverter(); // TODO: validator(wrap(recordMaterializer), validating, root.getType());

parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java (1 line):
- line 81: // TODO: use proto custom options to override per field.

parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/WriteBenchmarks.java (1 line):
- line 131: // TODO how to handle lzo jar?

parquet-cli/src/main/java/org/apache/parquet/cli/BaseCommand.java (1 line):
- line 324: // TODO: add these to the reader builder

parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/BinaryTruncator.java (1 line):
- line 54: // TODO this is currently used for UTF-8 only, so validity check could be done without copying.

parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/IncrementallyUpdatedFilterPredicateBuilderBase.java (1 line):
- line 55: * TODO: UserDefinedPredicates still autobox however

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetReader.java (1 line):
- line 52: * TODO: too many constructors (https://issues.apache.org/jira/browse/PARQUET-39)

parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/FilteringGroupConverter.java (1 line):
- line 68: // TODO: making the assumption that getConverter(i) is only called once, is that valid?

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/SerializationUtil.java (1 line):
- line 38: * TODO: Refactor elephant-bird so that we can depend on utils like this without extra baggage.

parquet-column/src/main/java/org/apache/parquet/io/api/Binary.java (1 line):
- line 584: // TODO: should not have to materialize those bytes

parquet-cli/src/main/java/org/apache/parquet/cli/util/Expressions.java (1 line):
- line 296: // TODO: this should only return something if the type can match rather than explicitly

parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java (1 line):
- line 53: * TODO: should this actually be called RowGroupImpl or something?

parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultV2ValuesWriterFactory.java (1 line):
- line 88: // TODO:
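The RunLengthBitPackingHybridDecoder item at line 94 allocates a fresh int[] for every packed run; the "reuse a buffer" fix it suggests is typically a grow-only scratch array along these lines (hypothetical names, not the decoder's actual fields):

```java
public class ScratchBufferSketch {
  private int[] scratch = new int[0];

  // Returns a buffer with capacity for at least `count` ints, reallocating
  // only when the current scratch array is too small.
  int[] buffer(int count) {
    if (scratch.length < count) {
      // Grow geometrically so a series of slightly larger requests doesn't
      // trigger a reallocation every time.
      scratch = new int[Math.max(count, scratch.length * 2)];
    }
    return scratch;
  }
}
```

The caveat with this pattern is that the returned array may be longer than the run it holds, so callers must track the valid element count separately instead of relying on the array length.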