sdks/go/pkg/beam/runners/prism/internal/engine/elementmanager.go (16 lines): - line 99: // TODO: Handle custom variants with built in "known" coders, and length prefixed ones as separate cases. - line 385: // Update if the processing time has advanced while we waited, and add refreshes here. (TODO waking on real time here for prod mode) - line 391: // TODO: Migrate these to the per-stage mechanism for consistency with triggers. - line 815: // TODO: Optimize unnecessary copies. This is doubleteeing. - line 820: keyBytes = info.KeyDec(kbuf) // TODO: Optimize unnecessary copies. This is tripleteeing? - line 856: // TODO sort out pending element watermark holds for process continuation residuals. - line 861: // TODO actually reschedule based on the residuals delay... - line 956: // TODO: Call in a for:range loop when Beam's minimum Go version hits 1.23.0 - line 1258: // TODO - handle for side inputs too. - line 1557: // TODO: Move to better place for configuration - line 1596: // TODO: Allow configurable limit of keys per bundle, and elements per key to improve parallelism. - line 1597: // TODO: when we do, we need to ensure that the stage remains schedualable for bundle execution, for remaining pending elements and keys. - line 1678: // TODO: Allow configurable limit of keys per bundle, and elements per key to improve parallelism. - line 1679: // TODO: when we do, we need to ensure that the stage remains schedualable for bundle execution, for remaining pending elements and keys. - line 1788: // TODO: Determine if it's possible and a good idea to treat all EventTime processing as a MinTime - line 2152: // TODO toggle between testmode and production mode. sdks/typescript/src/apache_beam/transforms/pardo.ts (14 lines): - line 63: // TODO: (API) Re-consider this API. - line 73: // TODO: (API) Do we need an AsyncDoFn (and async[Flat]Map) to be able to call - line 81: // TODO: (Typescript) Can the context arg be optional iff ContextT is undefined? - line 127: // TODO: (Cleanup) The viewFn is stored in the side input object. - line 130: // TODO: (Extension) Possibly place this in the accessor. - line 167: // TODO: (Types) Should there be a way to specify, or better yet infer, the coder to use? - line 176: // TODO: (Cleanup) use runnerApi.StandardPTransformClasss_Primitives.PAR_DO.urn. - line 194: // TODO: (API) Consider as top-level method. - line 195: // TODO: Naming. - line 308: // TODO: Nameing "get" seems to be special. - line 364: // TODO: (Extension) Support side inputs that are composites of multiple more - line 392: // TODO: (Cleanup) Rename to tag for consistency? - line 438: // TODO: (Extension) Map side inputs. - line 470: // TODO: (Extension) Add providers for state, timers, buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy (11 lines): - line 654: // TODO: remove this and download the jar normally when the catalog gets - line 1063: // TODO: Figure out whether this should be a test scope dependency - line 1248: // TODO: Figure out whether we should force all dependency conflict resolution - line 1586: // TODO: Enforce all relocations are always performed to: - line 2014: // TODO: Should we use the runtime scope instead of the compile scope - line 2042: // TODO: Load this from file? - line 2223: // Minor TODO: Figure out if we can pull out the GOCMD env variable after goPrepare script - line 2463: // TODO: Decide whether this should be inlined into the one project that relies on it - line 2980: // TODO: https://github.com/apache/beam/issues/29022 - line 3157: // TODO: Figure out GCS credentials and use real GCS input and output. - line 3194: // TODO: Check that the output file is generated and runs. runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java (9 lines): - line 63: *
TODO: Add a variant which works on {@link ByteString}s to remove encoding/decoding overhead. - line 94: *
TODO: Add support for side input chunking and caching if a {@link Reiterable} is returned. - line 113: *
TODO: Add support for side input chunking and caching if a {@link Reiterable} is returned. - line 121: *
TODO: Add support for side input chunking and caching if a {@link Reiterable} is returned. - line 197: *
TODO: Add support for bag user state chunking and caching if a {@link Reiterable} is - line 379: // TODO: Add support for continuation tokens when handling state if the handler - line 411: // TODO: Add support for continuation tokens when handling state if the handler - line 444: // TODO: Add support for continuation tokens when handling state if the handler - line 606: // TODO: Add support for continuation tokens when handling state if the handler sdks/typescript/src/apache_beam/transforms/group_and_combine.ts (9 lines): - line 35: // TODO: (API) Consider groupBy as a top-level method on PCollections. - line 60: // TODO: (Typescript) When typing this as ((a: I, b: I) => I), types are not inferred well. - line 203: // TODO: (Naming) Name this combine? - line 207: resultName: string, // TODO: (Unique names) Optionally derive from expr and combineFn? - line 309: // TODO: (Typescript) Is there a way to indicate type parameters match the above? - line 318: // TODO: (Cleanup) Does javascript have a clean zip? - line 345: // TODO: (Cleanup) Does javascript have a clean zip? - line 355: // TODO: Consider adding valueFn(s) rather than using the full value. - line 394: // TODO: (Typescript) Can I type T as "something that has this key" and/or, sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/ExpressionConverter.java (9 lines): - line 347: // TODO: is there a better way to shared code for different cases of - line 383: // TODO: is there any other illegal case? - line 398: // TODO: Have extra verification here to make sure window start/end functions have the same - line 405: // TODO: in Calcite implementation, session window's start is equal to end. Need to fix it - line 413: // TODO: check window_end 's duration is the same as it's aggregate window. - line 591: // TODO: id - 1 might be only correct if the columns read from TableScan. - line 594: // TODO: can join key be NULL? - line 628: // TODO: check size and type of window function argument list. - line 895: // TODO: Structure CAST_OP so that we don't have to repeat the supported types sdks/python/apache_beam/dataframe/frames.py (8 lines): - line 223: # TODO: This could be parallelized by putting index values in a - line 822: # TODO: Documentation about DeferredScalar - line 1169: # TODO: assigning the index is generally order-sensitive, but we could - line 1910: # TODO: assigning the index is generally order-sensitive, but we could - line 2516: # TODO: Replicate pd.DataFrame.__getitem__ logic - line 2692: # TODO: assigning the index is generally order-sensitive, but we could - line 3685: # TODO: We could do this in parallel by creating a ConstantExpression - line 5491: # TODO: non-trivial level? sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java (7 lines): - line 179: // TODO: implement logical type date and timestamp - line 261: // TODO: implement logical type date and timestamp - line 334: // TODO: implement logical date and datetime - line 366: // TODO: implement logical datetime - line 369: // TODO: implement logical date - line 410: // TODO: implement logical datetime - line 415: // TODO: implement logical date sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/AggregateScanConverter.java (7 lines): - line 89: // TODO: add support for indicator - line 147: // TODO: remove duplicate columns in projects. - line 150: // TODO: handle aggregate function with more than one argument and handle OVER - line 151: // TODO: is there is general way for column reference tracking and deduplication for - line 160: // TODO: assume aggregate function's input is either a ColumnRef or a cast(ColumnRef). - line 161: // TODO: user might use multiple CAST so we need to handle this rare case. - line 253: // TODO: is there a general way to handle aggregation calls conversion? sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java (6 lines): - line 606: // TODO: Stop using ProtoCoder and instead parse the payload directly. - line 664: // TODO: Stop using AvroCoder and instead parse the payload directly. - line 776: // TODO: Like in readProtos(), stop using ProtoCoder and instead format the payload directly. - line 789: // TODO: Like in readProtos(), stop using ProtoCoder and instead format the payload directly. - line 800: // TODO: Like in readAvros(), stop using AvroCoder and instead format the payload directly. - line 813: // TODO: Like in readAvros(), stop using AvroCoder and instead format the payload directly. runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java (6 lines): - line 244: // TODO Legacy transforms which need to be removed - line 736: // TODO: Fail on splittable DoFns. - line 737: // TODO: Special-case single outputs to avoid multiplexing PCollections. - line 760: // TODO: does it matter which output we designate as "main" - line 984: // TODO: local name is unique as long as only one transform with side input can be within a - line 1017: // TODO: support custom mapping fn runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkItemStatusClient.java (6 lines): - line 126: // TODO: Provide more structure representation of error, e.g., the serialized exception object. - line 127: // TODO: Look into moving the stack trace thinning into the client. - line 131: error.setCode(2); // Code.UNKNOWN. TODO: Replace with a generated definition. - line 132: // TODO: Attach the stack trace as exception details, not to the message. - line 166: // TODO: Find out a generic way for the DataflowWorkExecutor to report work-specific results - line 357: // TODO: Implement exactly-once delivery and use deltas, sdks/go/pkg/beam/runners/prism/internal/stage.go (5 lines): - line 51: // TODO: Consider ignoring environment boundaries and making fusion - line 236: // TODO sort out rescheduling primary Roots on bundle failure. - line 245: // TODO what happens to output watermarks on splits? - line 571: // TODO: replace si.Global with newGlobal? - line 626: // TODO: filter PCollections, filter windowing strategies by Pcollections instead. runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java (5 lines): - line 81: // TODO: Remove once Distributions has shipped. - line 160: // TODO: Remove once Distributions has shipped. - line 165: // TODO: Remove log statement when functionality is enabled by default. - line 271: // TODO: plumb through the operationName, so that we can - line 274: // TODO: plumb through the counter prefix, so we can runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerCommand.java (5 lines): - line 52: // TODO: Should we require 64-character container ids? Docker technically allows abbreviated ids, - line 103: // TODO: Validate args? - line 123: // TODO: Validate args? - line 208: // TODO: Consider supplying executor service here. - line 236: // TODO: Retry on interrupt? sdks/go/pkg/beam/runners/prism/internal/handlerunner.go (5 lines): - line 119: // TODO: do the following injection conditionally. - line 162: // TODO: Implement the windowing strategy the "backup" transforms used for Reshuffle. - line 263: // TODO assert this is a KV. It's probably fine, but we should fail anyway. - line 362: // TODO need to correct session logic if output time is different. - line 457: // TODO: May need to adjust the ordering here. sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/nfa/NFA.java (5 lines): - line 38: // TODO: add support for more quantifiers: `?`, `{19, }` ... for now, support `+` and singletons - line 39: // TODO: sort conditions based on "the last identifier" during compilation - line 40: // TODO: add optimization for the NFA - line 99: // TODO: add support for after match strategy - line 600: // TODO: add implementation. for now, return the current row sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java (5 lines): - line 267: // TODO: BigQuery code should not be relying on Calcite metadata fields. If so, this belongs - line 351: // TODO Add metadata for custom sql types ? - line 367: case "RANGE": // TODO add support for range type - line 686: // TODO deprecate toBeamRow(Schema, TableSchema, TableRow) function in favour of this function. - line 807: // TODO: BigQuery shouldn't know about SQL internal logical types. sdks/go/pkg/beam/runners/prism/internal/worker/worker.go (5 lines): - line 125: // TODO set logging level. - line 134: // TODO: Include runner capabilities with the per job configuration. - line 176: // TODO base this on a per pipeline logging setting. - line 186: slog.String("transformID", l.GetTransformId()), // TODO: pull the unique name from the pipeline graph. - line 437: // TODO: move data handling to be pcollection based. sdks/java/extensions/sql/expansion-service/src/main/java/org/apache/beam/sdk/extensions/sql/expansion/SqlTransformSchemaTransformProvider.java (5 lines): - line 102: "ddl", Schema.FieldType.STRING), // TODO: Underlying builder seems more capable? - line 212: if (p != null) { // TODO: We ignore tableproviders that don't exist, we could change - line 221: // TODO: Process query parameters. This is not necessary for Syndeo GA but would be - line 224: // TODO: See about reimplementing a correct version of SqlTransform - line 228: // TODO: One possibility for capturing the required tables would be to inject a runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java (5 lines): - line 132: *
TODO Integrate support for progress updates and metrics - line 251: // TODO: Wire this into the distributed cache and make it pluggable. - line 252: // TODO: Do we really want this layer of indirection when accessing the stage bundle factory? - line 1033: // TODO: Support propagating the PaneInfo through. - line 1047: // TODO: it would be nice to emit results as they arrive, can thread wait non-blocking? sdks/python/apache_beam/transforms/core.py (4 lines): - line 955: # TODO: Consider requiring an inheritance relationship rather than - line 1690: # TODO: Test this code (in batch_dofn_test) - line 1718: # TODO: Mention process method in this error - line 2670: # TODO: What about callable classes? sdks/typescript/src/apache_beam/io/kafka.ts (4 lines): - line 49: consumerConfig: { [key: string]: string }, // TODO: Or a map? - line 63: consumerConfig: { [key: string]: string }, // TODO: Or a map? - line 79: consumerConfig: { [key: string]: string }, // TODO: Or a map? - line 108: producerConfig: { [key: string]: string }, // TODO: Or a map? runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowElementExecutionTracker.java (4 lines): - line 76: // TODO: Remove once feature has launched. - line 81: // TODO: Remove log statement when functionality is enabled by default. - line 281: // TODO: If possible, this should report tentative counter values before they are - line 293: // TODO: This algorithm is used to compute "per-element-processing-time" counter sdks/go/pkg/beam/runners/prism/internal/preprocess.go (4 lines): - line 82: // TODO move this out of this part of the pre-processor? - line 290: // TODO validate that fused stages have the same environment. - line 413: // 4. Check that all transforms are in the same environment or are environment agnostic. (TODO for xlang) - line 549: // TODO: Sink/Unzip Flattens so they vanish from the graph. sdks/typescript/src/apache_beam/worker/state.ts (4 lines): - line 24: // TODO: (Extension) Lazy iteration via continuation tokens. - line 48: // TODO: (Advanced) Cross-bundle caching. - line 58: // TODO: (Perf) Consider caching on something ligher-weight than the full - line 78: // TODO: (Perf) Cache eviction. runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/batch/ParDoTranslatorBatch.java (4 lines): - line 67: *
TODO:
- line 84: // TODO: add support of Splittable DoFn
- line 90: // TODO: add support of states and timers
- line 134: // FIXME What's the strategy to unpersist Datasets / RDDs?
sdks/python/apache_beam/runners/dataflow/internal/apiclient.py (4 lines):
- line 145: # TODO: Use enumerated type instead of strings for job types.
- line 152: # TODO: Cleanup use_multiple_sdk_containers once we deprecate Python SDK
- line 1141: # TODO: Used in legacy batch worker. Move under MetricUpdateTranslators
- line 1163: # TODO: Used in legacy batch worker. Delete after Runner V2 transition.
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java (4 lines):
- line 719: // TODO: This should be done via a Structs accessor.
- line 726: // TODO: This should be done via a Structs accessor.
- line 917: // TODO: Add support for combiner lifting once the need arises.
- line 973: // TODO: Allow combiner lifting on the non-default trigger, as appropriate.
sdks/typescript/src/apache_beam/worker/operators.ts (4 lines):
- line 487: // TODO: Tune this, or better use LRU or ARC for this cache.
- line 702: // TODO: (Perf) We could inspect the context more deeply and allow some
- line 893: // TODO: Verify it falls in window and doesn't cause late data.
- line 909: // TODO: (Cleanup) Ideally we could branch on the urn itself, but some runners have a closed set of known URNs.
sdks/python/apache_beam/coders/coder_impl.py (4 lines):
- line 816: # TODO: Fn Harness only supports millis. Is this important enough to fix?
- line 861: TODO: SDK agnostic encoding
- line 1296: # TODO: (https://github.com/apache/beam/issues/18169) Update to use an
- line 1302: # TODO: More efficient size estimation in the case of state-backed
sdks/go/pkg/beam/model/fnexecution_v1/beam_fn_api.pb.go (3 lines):
- line 21: // TODO: Usage of plural names in lists looks awkward in Java
- line 24: // TODO: gRPC / proto field names conflict with generated code
- line 33: // TODO: Consider consolidating common components in another package
sdks/go/pkg/beam/io/databaseio/database.go (3 lines):
- line 72: //TODO move DB Open and Close to Setup and Teardown methods or StartBundle and FinishBundle
- line 157: //TODO move DB Open and Close to Setup and Teardown methods or StartBundle and FinishBundle
- line 181: //TODO move to Setup methods
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java (3 lines):
- line 2923: // TODO (Version 3.0): Create the only one generic {@code Write TODO: Move the bounded executor out to an accessible place such as on PipelineOptions.
- line 636: throw new UnsupportedOperationException("TODO: Support enumerating the keys.");
- line 760: // TODO: Support greater than Integer.MAX_VALUE values for iteration/lookup and size.
sdks/typescript/src/apache_beam/worker/worker.ts (3 lines):
- line 115: // TODO: Await closing of control log.
- line 306: // TODO: (Typescript) Defining something like Python's defaultdict could
- line 358: // TODO: (Perf) Consider defering this possibly expensive deserialization lazily to the worker thread.
sdks/typescript/src/apache_beam/transforms/external.ts (3 lines):
- line 69: // TODO: (API) (Types) This class expects PCollections to already have the
- line 343: // TODO: We could still patch things together if we don't understand the coders,
- line 352: // TODO: (Typescipt) Can I get the concrete OutputT at runtime?
sdks/go/pkg/beam/runners/prism/internal/jobservices/management.go (3 lines):
- line 59: // TODO migrate to errors.Join once Beam requires go1.20+
- line 101: Logger: s.logger, // TODO substitute with a configured logger.
- line 466: // TODO report missed messages for this stream.
runners/samza/src/main/java/org/apache/beam/runners/samza/translation/ConfigBuilder.java (3 lines):
- line 170: // TODO: add check to all non-empty files once we don't need to
- line 259: // TODO: remove after SAMZA-1531 is resolved
- line 319: // TODO: remove after we sort out Samza task wrapper
sdks/typescript/src/apache_beam/transforms/internal.ts (3 lines):
- line 67: // TODO: (API) Should we offer a method on PCollection to do this?
- line 159: // TODO: (Cleanup) warn about BsonObjectCoder and (non)deterministic key ordering?
- line 170: // TODO: (Cleanup) runnerApi.StandardPTransformClasss_Primitives.GROUP_BY_KEY.urn.
sdks/python/apache_beam/utils/subprocess_server.py (3 lines):
- line 391: # TODO: Attempt to use nightly snapshots?
- line 405: # TODO: Verify checksum?
- line 418: # TODO: Clean up this cache according to some policy.
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/parser/FieldAccessDescriptorParser.java (3 lines):
- line 86: // TODO: We should support expanding out x.*.y expressions.
- line 136: // TODO: Change once we support slices and selectors.
- line 144: // TODO: Change once we support slices and selectors.
sdks/typescript/src/apache_beam/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.grpc-client.ts (3 lines):
- line 26: // TODO: Usage of plural names in lists looks awkward in Java
- line 29: // TODO: gRPC / proto field names conflict with generated code
- line 33: // TODO: Consider consolidating common components in another package
sdks/typescript/src/apache_beam/runners/direct_runner.ts (3 lines):
- line 170: timestamp: Long.fromValue("-9223372036854775"), // TODO: (Cleanup) Pull constant out of proto, or at least as a constant elsewhere.
- line 180: // TODO: (Extension) This could be used as a base for the PGBKOperation operator,
- line 329: // TODO: (Typescript) Is there a clean way to do a set difference?
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java (2 lines):
- line 128: // TODO: What should the initial set of URNs be?
- line 851: // TODO: Remove source as a root and have it be triggered by the Runner.
sdks/python/apache_beam/runners/pipeline_context.py (2 lines):
- line 115: # TODO: this method may not be safe for arbitrary protos due to
- line 302: # (TODO https://github.com/apache/beam/issues/25615)
it/splunk/src/main/java/org/apache/beam/it/splunk/SplunkResourceManager.java (2 lines):
- line 115: // TODO - add support for https scheme
- line 155: // TODO - add support for ssl
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BeamSqlUnparseContext.java (2 lines):
- line 70: // TODO: Move away from deprecated classes.
- line 71: // TODO: Escaping single quotes, SqlCharStringLiteral (produced by SqlLiteral.createCharString)
sdks/python/apache_beam/typehints/schemas.py (2 lines):
- line 426: # TODO: Allow other value types
- line 437: # TODO: Allow other value types
sdks/python/apache_beam/runners/runner.py (2 lines):
- line 175: # TODO: https://github.com/apache/beam/issues/19168
- line 233: # FIXME: replace with PipelineState(str, enum.Enum)
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java (2 lines):
- line 347: // TODO: Should we return an unmodifiable list?
- line 398: // TODO: Should we return an unmodifiable list?
runners/direct-java/src/main/java/org/apache/beam/runners/direct/TransformExecutorServices.java (2 lines):
- line 73: // TODO: [https://github.com/apache/beam/issues/18968] Pass Future back to consumer to check for
- line 158: // TODO: [https://github.com/apache/beam/issues/18968] Pass Future back to consumer to check for
sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/BeamZetaSqlCatalog.java (2 lines):
- line 81: // TODO: support optional function argument (for window_offset).
- line 279: // annotation upstream. TODO Unsuppress when this is fixed in ZetaSQL.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/MapTaskExecutor.java (2 lines):
- line 114: // TODO: support for success / failure ports?
- line 158: // TODO: Also interrupt the execution thread.
sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java (2 lines):
- line 1531: // TODO: https://github.com/apache/beam/issues/18569 - fix this so that we aren't relying on
- line 1616: // TODO: https://github.com/apache/beam/issues/18569 - fix this so that we aren't relying on
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkBatchPortablePipelineTranslator.java (2 lines):
- line 305: // TODO: Fail on splittable DoFns.
- line 306: // TODO: Special-case single outputs to avoid multiplexing PCollections.
runners/java-job-service/src/main/java/org/apache/beam/runners/jobsubmission/InMemoryJobService.java (2 lines):
- line 76: * TODO: replace in-memory job management state with persistent solution.
- line 528: // TODO: throw error if jobs are running
sdks/python/apache_beam/io/aws/s3io.py (2 lines):
- line 547: # TODO: Throw value error if path has directory
- line 646: # TODO: Byte strings might not be the most performant way to handle this
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/GreedyStageFuser.java (2 lines):
- line 50: // TODO: Provide a way to merge in a compatible subgraph (e.g. one where all of the siblings
- line 181: // TODO: Potentially, some of the consumers can be fused back into this stage later
sdks/python/apache_beam/ml/anomaly/specifiable.py (2 lines):
- line 113: # TODO: support spec treatment for more types
- line 139: # TODO: support spec treatment for more types
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/GreedyPCollectionFusers.java (2 lines):
- line 87: // TODO: Migrate
- line 357: // TODO: There is performance to be gained if the output of a flatten is fused into a stage
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGroupByKey.java (2 lines):
- line 91: // TODO: Look at better integration of union types with the
- line 103: // TODO: Use the schema to order the indices rather than depending
sdks/java/extensions/avro/src/main/java/org/apache/beam/sdk/extensions/avro/schemas/utils/AvroUtils.java (2 lines):
- line 1019: // TODO: There is a desire to move Beam schema DATETIME to a micros representation. When
- line 1133: // TODO: There is a desire to move Beam schema DATETIME to a micros representation. When
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/gcsfs/GcsPath.java (2 lines):
- line 374: // TODO: support "." and ".." path components?
- line 581: // TODO: Consider using resource names for all GCS paths used by the SDK.
sdks/python/apache_beam/pipeline.py (2 lines):
- line 557: TODO: Update this to also work for transform overrides where input and
- line 1569: TODO: Update this to support cases where input and/our output types are
sdks/python/apache_beam/internal/dill_pickler.py (2 lines):
- line 192: # TODO: Remove this once Beam depends on dill >= 0.2.8
- line 197: # TODO: Remove once Dataflow has containers with a preinstalled dill >= 0.2.8
sdks/typescript/src/apache_beam/worker/data.ts (2 lines):
- line 115: // TODO: (Perf) Buffer and consilidate send requests?
- line 158: // TODO: (Naming) onData?
sdks/python/apache_beam/runners/direct/sdf_direct_runner.py (2 lines):
- line 308: # TODO: handle key collisions here.
- line 516: # TODO: support continuing after the specified amount of delay.
sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSqlCalciteTranslationUtils.java (2 lines):
- line 162: // TODO: Should element type has the same nullability as the array type?
- line 166: // TODO: Should field type has the same nullability as the struct type?
sdks/java/extensions/ordered/src/main/java/org/apache/beam/sdk/extensions/ordered/ProcessorDoFn.java (2 lines):
- line 282: // TODO: test that on draining the pipeline all the results are still produced correctly.
- line 358: // TODO: When there is a large number of duplicates this can cause a situation where
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DataStoreV1SchemaIOProvider.java (2 lines):
- line 70: // TODO: allow users to specify a name of the field to store a key value via TableProperties.
- line 170: // TODO: allow users to specify a namespace in a location string.
sdks/java/extensions/arrow/src/main/java/org/apache/beam/sdk/extensions/arrow/ArrowConversion.java (2 lines):
- line 340: // TODO: Consider using ByteBuddyUtils.TypeConversion for this
- line 350: // TODO: code to create a row.
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryAvroUtils.java (2 lines):
- line 125: // TODO for load/storage use
- line 154: case "RANGE": // TODO add support for range type
sdks/java/core/src/main/java/org/apache/beam/sdk/fn/data/BeamFnDataGrpcMultiplexer.java (2 lines):
- line 51: * TODO: Add support for multiplexing over multiple outbound observers by stickying the output
- line 287: * TODO: On failure we should fail any bundles that were impacted eagerly
sdks/typescript/src/apache_beam/runners/artifacts.ts (2 lines):
- line 75: // TODO: (Perf) Hardlink if same filesystem and correct hash?
- line 108: // TODO: (Typescript) Yield from asycn?
sdks/python/apache_beam/runners/direct/direct_runner.py (2 lines):
- line 351: # TODO: Move imports to top. Pipeline <-> Runner dependency cause problems
- line 556: # TODO: Move imports to top. Pipeline <-> Runner dependency cause problems
sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java (2 lines):
- line 257: // TODO: track which outputs need to be exported to parent.
- line 469: // producers map will not be appropriately updated. TODO: investigate alternatives
sdks/typescript/src/apache_beam/io/avroio.ts (2 lines):
- line 27: // TODO: Allow schema to be inferred.
- line 41: // TODO: Allow schema to be inferred.
model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto (2 lines):
- line 645: // TODO: full audit of fields required by runners as opposed to SDK harness
- line 1111: // TODO: consider inlining field on PCollection
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryFilter.java (2 lines):
- line 52: // TODO: Check what other functions are supported and add support for them (ex: trim).
- line 111: * TODO: Check if comparison between two columns is supported. Also over a boolean field.
sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic/SyntheticBoundedSource.java (2 lines):
- line 83: // TODO: test cases where the source size could not be estimated (i.e., return 0).
- line 84: // TODO: test cases where the key size and value size might differ from record to record.
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java (2 lines):
- line 358: // TODO remove this configuration, we can figure out the best number
- line 433: // TODO in case of this replica goes inactive while the pipeline runs.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/GreedyPipelineFuser.java (2 lines):
- line 166: // TODO: Figure out where to store this.
- line 169: // TODO: Stages can be fused with each other, if doing so does not introduce duplicate paths
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java (2 lines):
- line 472: // TODO: Consider having BundleProcessor#newBundle take in an OutputReceiverFactory rather
- line 652: // TODO: Wait for executor shutdown?
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowExecutionContext.java (2 lines):
- line 148: // TODO: We should have state bytes also to contribute to this hint, otherwise,
- line 175: // TODO: Move StepContext creation to the OperationContext.
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java (2 lines):
- line 850: // TODO: swap to use a variable length long coder which has values which compare
- line 1127: // TODO: swap to use a variable length long coder which has values which compare
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupAlsoByWindowParDoFnFactory.java (2 lines):
- line 154: // TODO: do not do this with mess of "if"
- line 187: // TODO: or anyhow related to it, do not do this with mess of "if"
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/graph/LengthPrefixUnknownCoders.java (2 lines):
- line 61: // TODO: Remove TimerOrElementCoder as it is not truly a well known type.
- line 274: // TODO: Handle other types of ParDos.
runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraph.java (2 lines):
- line 86: // TODO: This must only be called on primitive transforms; composites should return empty
- line 93: // TODO: Make this actually track this type of edge, because this isn't quite the right
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldTypeDescriptors.java (2 lines):
- line 61: // TODO: shouldn't we handle this differently?
- line 80: // TODO: Convert for registered logical types.
sdks/python/apache_beam/io/fileio.py (2 lines):
- line 168: # TODO: Should we batch the lookups?
- line 254: # TODO: Mime type? Other arguments? Maybe arguments passed in to transform?
runners/samza/src/main/java/org/apache/beam/runners/samza/translation/ParDoBoundMultiTranslator.java (2 lines):
- line 311: // TODO: support schema and side inputs for portable runner
- line 473: // TODO: support custom mapping fn
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/PipelineTranslation.java (2 lines):
- line 80: // TODO: Include DisplayData in the proto
- line 93: // TODO: Include DisplayData in the proto
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnOp.java (2 lines):
- line 110: // TODO: eagerly initialize the hold in init
- line 116: // TODO: add this to checkpointable state
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/udf/BuiltinTrigonometricFunctions.java (2 lines):
- line 35: // TODO: handle overflow
- line 52: // TODO: handle overflow
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTableHelpers.java (2 lines):
- line 58: * TODO: We should put a bound on memory usage of this. Use guava cache instead.
- line 79: // TODO: Once BigQuery reliably returns a consistent error on table not found, we should
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/ProcessBundleDescriptors.java (2 lines):
- line 68: // TODO: Rename to ExecutableStages?
- line 114: // TODO: Remove the unreachable subcomponents if the size of the descriptor matters.
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/GrpcStateService.java (2 lines):
- line 106: // TODO: Abort in-flight state requests. Flag this processBundleInstructionId as a fail.
- line 115: * TODO: Handle when the client indicates completion or an error on the inbound stream and
sdks/typescript/src/apache_beam/coders/row_coder.ts (2 lines):
- line 172: // TODO: Support float type.
- line 184: // TODO: Infer element type in a better way
sdks/python/apache_beam/transforms/external.py (2 lines):
- line 794: # TODO: Possibly loosen this.
- line 863: # TODO: update this to support secure non-local channels.
sdks/go/pkg/beam/core/graph/edge.go (2 lines):
- line 54: Map InputKind = "Map" // TODO: allow?
- line 55: MultiMap InputKind = "MultiMap" // TODO: allow?
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Watch.java (2 lines):
- line 989: // TODO (https://github.com/apache/beam/issues/18459):
- line 1073: // in case of pipeline update. TODO: do this.
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationUtils.java (2 lines):
- line 149: // TODO: Implement logical date and datetime
- line 227: // TODO: Implement logical date and datetime
build.gradle.kts (2 lines):
- line 518: // TODO (https://github.com/apache/beam/issues/23966)
- line 527: // TODO: https://github.com/apache/beam/issues/22651
sdks/python/apache_beam/dataframe/convert.py (2 lines):
- line 37: # TODO: Or should this be called as_dataframe?
- line 164: # TODO: Or should this be called from_dataframe?
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/ReadFromKafkaDoFn.java (2 lines):
- line 568: // TODO: Metrics should be reported per split instead of partition, add bootstrap server hash?
- line 601: // TODO: Remove this timer and use the existing fetch-latency-avg metric.
sdks/python/apache_beam/utils/transform_service_launcher.py (2 lines):
- line 65: 'docker-compose', '-p', project_name, '-f', 'TODO path'
- line 144: # TODO: update this to support secure non-local channels.
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/PortableDoFnOp.java (2 lines):
- line 106: // TODO: eagerly initialize the hold in init
- line 112: // TODO: add this to checkpointable state
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/counters/CounterFactory.java (2 lines):
- line 228: // TODO: Replace sum-of-squares with statistics for a better stddev algorithm.
- line 857: // TODO: Using CounterDistribution internally is likely very expensive as each
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiWritesShardedRecords.java (2 lines):
- line 605: // TODO: Push messageFromTableRow up to top level. That we we cans skip TableRow entirely if
- line 795: // TODO: Only do this on explicit NOT_FOUND errors once BigQuery reliably produces
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/counters/CounterName.java (2 lines):
- line 243: * It is null before {@link #getFlatName()} is called. TODO: this can be replaced
- line 276: * It is null before {@link #getPrettyName()} is called. TODO: this can be replaced
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/OrderedListUserState.java (2 lines):
- line 70: * TODO: Move to an async persist model where persistence is signalled based upon cache memory
- line 215: // TODO: consider use cache here
sdks/go/pkg/beam/runners/prism/internal/execute.go (2 lines):
- line 150: // TODO move this loop and code into the preprocessor instead.
- line 287: // TODO: Determine the SDK common formalism for setting processing time to infinity.
sdks/java/transform-service/src/main/java/org/apache/beam/sdk/transformservice/ArtifactService.java (2 lines):
- line 55: // TODO: when all services fail, return an aggregated error with errors from all services.
- line 89: // TODO: when all services fail, return an aggregated error with errors from all services.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/PipelineValidator.java (2 lines):
- line 249: // TODO: Validate state_specs and timer_specs
- line 324: // TODO: Also validate that side inputs of all transforms within components.getTransforms()
runners/twister2/src/main/java/org/apache/beam/runners/twister2/Twister2Runner.java (2 lines):
- line 185: // TODO: Need to fix the check for "RUNNING" once fix for this is done on Twister2 end.
- line 282: // TODO figure out if we can remove all the dependencies that come with
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableRowToStorageApiProto.java (2 lines):
- line 856: // TODO: Allow expanding this!
- line 1111: // TODO: Would be more correct to generate TableRows using setF.
sdks/python/apache_beam/yaml/yaml_transform.py (2 lines):
- line 505: # TODO: Move validation to construction?
- line 512: # TODO: Handle (or at least reject) nested case.
sdks/python/apache_beam/typehints/trivial_inference.py (2 lines):
- line 648: # TODO: see if we need to implement cells like this
- line 651: # TODO: see what this behavior is supposed to be beyond
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingHandler.java (2 lines):
- line 215: // TODO: Write the stage execution information by translating the currently execution
- line 224: // TODO: Figure out a way to get exceptions transported across Beam Fn Logging API
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java (2 lines):
- line 239: // TODO: Executes in parallel, address https://issues.apache.org/jira/browse/BEAM-1503.
- line 321: // TODO: Address https://issues.apache.org/jira/browse/BEAM-1494
sdks/go/pkg/beam/core/runtime/exec/plan.go (2 lines):
- line 44: // TODO: there can be more than 1 DataSource in a bundle.
- line 305: // TODO: When bundles with multiple sources, are supported, perform splits
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/PipelineOptionsTranslation.java (2 lines):
- line 39: * TODO: Make this the default to/from translation for PipelineOptions.
- line 54: // TODO: Officially define URNs for options and their scheme.
sdks/python/apache_beam/runners/dataflow/dataflow_runner.py (2 lines):
- line 89: # TODO: Remove the apache_beam.pipeline dependency in CreatePTransformOverride
- line 803: # TODO: Merge the termination code in poll_for_job_completion and
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java (2 lines):
- line 136: // TODO: Implement progress reporting.
- line 144: // TODO: Implement dynamic work rebalancing.
sdks/typescript/src/apache_beam/runners/runner.ts (2 lines):
- line 36: // TODO: Support filtering, slicing.
- line 109: // TODO: Grab the last/most severe error message?
sdks/go/pkg/beam/runners/prism/internal/engine/strategy.go (2 lines):
- line 102: // TODO handle https://github.com/apache/beam/issues/31438 merging triggers and state for merging windows (sessions, but also custom merging windows)
- line 576: // TODO https://github.com/apache/beam/issues/31438 Handle TriggerAfterProcessingTime
sdks/python/apache_beam/transforms/display.py (2 lines):
- line 356: # TODO: Python Class types should not be special-cased once
- line 444: #TODO: Fix Args: documentation once the Python classes handling has changed
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbTable.java (2 lines):
- line 329: // TODO: add support for complex fields (May require modifying how Calcite parses nested
- line 457: // TODO: Can be supported via Filters#where.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/SerializableUtils.java (2 lines):
- line 156: // TODO: Put in better element printing:
- line 164: // TODO: Put in better encoded byte array printing:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java (2 lines):
- line 177: // TODO: Would prefer to use MinLongFn but it is a BinaryCombineFn TODO: We should put a bound on memory usage of this. Use guava cache instead.
sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java (1 line):
- line 223: // TODO: use metadata.toString() only without a trim() once Apache Tika 1.17 gets released
sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java (1 line):
- line 1376: // TODO: Split the filtering from the rewindowing, and apply filtering before the Gather
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java (1 line):
- line 937: // TODO: validate query?
sdks/python/apache_beam/runners/worker/bundle_processor.py (1 line):
- line 1077: # TODO: Consider warning on mismatches in versions of installed packages.
sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/ParseResult.java (1 line):
- line 129: // TODO: Remove this function and use metadata.hashCode() once Apache Tika 1.17 gets released.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/Networks.java (1 line):
- line 201: // TODO: (github/guava/2641) Upgrade Guava and remove this method if topological sorting becomes
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTable.java (1 line):
- line 217: // TODO: BigQuerySqlDialectWithTypeTranslation can be replaced with BigQuerySqlDialect after
website/www/site/assets/scss/_hero.scss (1 line):
- line 27: // TODO - revert margin-top to -30px after summit is over when removing banner
runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkStreamingPortablePipelineTranslator.java (1 line):
- line 235: // TODO (https://github.com/apache/beam/issues/20395): handle side inputs.
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java (1 line):
- line 834: // TODO transform keys, in this case, we need to do lookup, so it should be both ways:
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/OutputObjectAndByteCounter.java (1 line):
- line 176: // TODO: use original name from the NameContext
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/batch/PipelineTranslatorBatch.java (1 line):
- line 55: // TODO the ability to have more than one TransformTranslator per URN
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/BatchModeExecutionContext.java (1 line):
- line 311: // TODO: Expose a keyed sub-cache which allows one to store all cached values in their
playground/terraform/infrastructure/artifact_registry/main.tf (1 line):
- line 21: // TODO: remove when generally available
sdks/python/apache_beam/ml/transforms/tft.py (1 line):
- line 80: # TODO: https://github.com/apache/beam/pull/29016
sdks/python/apache_beam/testing/benchmarks/inference/pytorch_image_classification_benchmarks.py (1 line):
- line 31: # TODO (https://github.com/apache/beam/issues/23008)
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/BeamFnStateGrpcClientCache.java (1 line):
- line 39: * TODO: Add the ability to close which cancels any pending and stops any future requests.
sdks/python/apache_beam/typehints/typehints.py (1 line):
- line 1614: # TODO: Possibly handle other valid types.
sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/SqlTransformRunner.java (1 line):
- line 71: * TODO: Add tests.
sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/TableScanConverter.java (1 line):
- line 58: // TODO: reject incorrect top-level schema
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/JavaFieldSchema.java (1 line):
- line 51: * TODO: Validate equals() method is provided, and if not generate a "slow" equals method based
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkTransformOverrides.java (1 line):
- line 37: // TODO: [https://github.com/apache/beam/issues/19107] Support @RequiresStableInput on Spark
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/MapControlClientPool.java (1 line):
- line 68: // TODO: Wire in health checking of clients so requests don't hang.
runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/BeamMetricSet.java (1 line):
- line 33: // TODO: turn into MetricRegistry https://github.com/apache/beam/issues/22384
website/www/site/layouts/partials/head_homepage.html (1 line):
- line 41:
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/FlinkReduceFunction.java (1 line):
- line 50: // TODO: Remove side input functionality since liftable Combines no longer have side inputs.
sdks/python/apache_beam/ml/transforms/utils.py (1 line):
- line 63: # TODO: https://github.com/apache/beam/issues/29356
sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java (1 line):
- line 367: // TODO: Add support for off heap ByteBuffers in case the underlying FSDataInputStream
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiWriteUnshardedRecords.java (1 line):
- line 853: // TODO: Only do this on explicit NOT_FOUND errors once BigQuery reliably produces
sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java (1 line):
- line 157: // TODO: fix runners that mutate PipelineOptions in this method, then remove this line
playground/frontend/playground_components/lib/src/widgets/drag_handle.dart (1 line):
- line 35: // TODO: Use a single file and just rotate it if needed.
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamAggregationRule.java (1 line):
- line 205: // @TODO: this can be simplified after CALCITE-2837
sdks/java/io/sparkreceiver/3/src/main/java/org/apache/beam/sdk/io/sparkreceiver/SparkReceiverIO.java (1 line):
- line 197: // TODO: Split data from SparkReceiver into multiple workers
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Sets.java (1 line):
- line 651: // TODO: lift combiners through the CoGBK.
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/KeyedPushedBackElementsHandler.java (1 line):
- line 83: // TODO we have to collect all keys because otherwise we get ConcurrentModificationExceptions
sdks/go/pkg/beam/runners/prism/internal/engine/data.go (1 line):
- line 98: // TODO: Custom Window handling.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/BatchingShuffleEntryReader.java (1 line):
- line 88: // TODO: Report API errors to the caller using checked
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/FnApiStateAccessor.java (1 line):
- line 687: // TODO: Support squashing accumulators depending on whether we know of all
sdks/python/apache_beam/io/avroio.py (1 line):
- line 321: TODO: remove ``_AvroSource`` in favor of using ``_FastAvroSource``
runners/twister2/src/main/java/org/apache/beam/runners/twister2/translators/batch/ParDoMultiOutputTranslatorBatch.java (1 line):
- line 84: // TODO : note change from List to map in sideinputs
it/splunk/src/main/java/org/apache/beam/it/splunk/SplunkContainer.java (1 line):
- line 116: // TODO - Future config environment variables that may be useful to add
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/WindowingStrategyTranslation.java (1 line):
- line 206: // TODO: standardize such things
sdks/java/core/src/main/java/org/apache/beam/sdk/fn/windowing/EncodedBoundedWindow.java (1 line):
- line 49: "TODO: Add support for reading the timestamp from " + "the encoded window.");
sdks/typescript/src/apache_beam/transforms/flatten.ts (1 line):
- line 41: // TODO: UnionCoder if they're not the same?
sdks/python/apache_beam/io/filebasedsink.py (1 line):
- line 422: # TODO: Clean up workitem_test which uses this.
runners/samza/src/main/java/org/apache/beam/runners/samza/adapter/UnboundedSourceSystem.java (1 line):
- line 325: // TODO: make poll interval configurable
sdks/typescript/src/apache_beam/testing/multi_pipeline_runner.ts (1 line):
- line 115: // TODO: Grab the last/most severe error message?
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/CoderTranslation.java (1 line):
- line 60: // TODO: standardize such things
sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java (1 line):
- line 1084: // TODO: Support "global" backlog reporting
sdks/python/apache_beam/internal/cloudpickle/cloudpickle.py (1 line):
- line 1336: # TODO: decorrelate reducer_override (which is tied to CPython's
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/AssignWindowsRunner.java (1 line):
- line 92: // TODO: https://github.com/apache/beam/issues/18870 consider allocating only once and updating
sdks/python/apache_beam/internal/util.py (1 line):
- line 153: # TODO: Remove this once above issue in 'apitools' is fixed.
runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaRunnerOverrideConfigs.java (1 line):
- line 22: // TODO: can we get rid of this class? Right now the SamzaPipelineOptionsValidator would force
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsCreateOptions.java (1 line):
- line 36: // TODO: Add other GCS options when needed.
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/resourcehints/ResourceHints.java (1 line):
- line 55: // TODO: reference this from a common location in all packages that use this.
playground/terraform/infrastructure/memorystore/main.tf (1 line):
- line 24: // TODO: remove when replica_count, etc is generally available
sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java (1 line):
- line 833: // TODO: Windows OS cannot resolves and matches '*' in the path,
runners/core-java/src/main/java/org/apache/beam/runners/core/construction/SerializablePipelineOptions.java (1 line):
- line 62: // TODO https://github.com/apache/beam/issues/18430: remove this call.
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java (1 line):
- line 1337: // TODO: add an explicit `pipeline` parameter to the submission instead of pipeline options
sdks/go/pkg/beam/core/runtime/metricsx/urns.go (1 line):
- line 30: // TODO: Pull these from the protos.
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/BeamFnDataGrpcClient.java (1 line):
- line 41: * TODO: Handle closing clients that are currently not a consumer nor are being consumed.
playground/frontend/playground_components/lib/src/controllers/code_runner.dart (1 line):
- line 268: // TODO: Listen to this object outside of widgets,
sdks/java/container/boot.go (1 line):
- line 229: // TODO: verify if it's intentional or not.
runners/samza/src/main/java/org/apache/beam/runners/samza/util/DoFnUtils.java (1 line):
- line 69: // TODO: format name when there are multiple input/output PTransform(s) in the ExecutableStage
sdks/python/apache_beam/ml/transforms/handlers.py (1 line):
- line 439: # TODO: Remove the 'id' column from the transformed
sdks/java/extensions/ordered/src/main/java/org/apache/beam/sdk/extensions/ordered/SequencePerKeyProcessorDoFn.java (1 line):
- line 272: // TODO: validate that this is correct.
playground/backend/internal/utils/cache_utils.go (1 line):
- line 34: // TODO send email to fix error with writing to cache
sdks/python/apache_beam/runners/render.py (1 line):
- line 441: # TODO: If this gets more complex, we could consider taking on a
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageSourceBase.java (1 line):
- line 181: // TODO: this is inconsistent with method above, where it can be null
sdks/python/apache_beam/ml/transforms/embeddings/__init__.py (1 line):
- line 16: # TODO: Add dead letter queue for RunInference transforms.
sdks/python/apache_beam/dataframe/frame_base.py (1 line):
- line 254: # TODO: fix for delegation?
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/PrimitiveParDoSingleFactory.java (1 line):
- line 192: // TODO: Is there a better way to do this?
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/KeyedPCollectionTuple.java (1 line):
- line 157: // TODO: This should already have run coder inference for output, but may not have been consumed
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/WithKeys.java (1 line):
- line 132: // TODO: Remove when we can set the coder inference context.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/ShuffleSink.java (1 line):
- line 149: // TODO: Decide the representation of sort-keyed values.
playground/backend/cmd/server/wrapper.go (1 line):
- line 27: // TODO address what is the minimum necessary
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java (1 line):
- line 135: // TODO: make env logic private to main() so it is never done outside of initializing the process
sdks/java/io/cdap/src/main/java/org/apache/beam/sdk/io/cdap/CdapIO.java (1 line):
- line 478: // TODO: implement SparkReceiverIO.<~>write()
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingModeExecutionContext.java (1 line):
- line 535: // TODO: Take in the requesting step name and side input index for streaming.
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSink.java (1 line):
- line 329: // TODO: Do we really need to recreate the client on every bundle?
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceImpl.java (1 line):
- line 537: // TODO: Bigtable client already tracks BatchingExceptions, use BatchingExceptions
sdks/java/core/src/main/java/org/apache/beam/sdk/util/ApiSurface.java (1 line):
- line 324: // TODO: Ignore any NoClassDefFoundError errors as a workaround.
sdks/go/pkg/beam/core/runtime/graphx/schema/logicaltypes.go (1 line):
- line 101: // TODO add duplication checks.
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java (1 line):
- line 926: // TODO: Do we need to do this when OnWindowExpiration is set, since in that case we have a
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java (1 line):
- line 152: // TODO: Support boxing in Convert (e.g. Long -> Row with Schema { Long }).
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/SamzaDoFnRunners.java (1 line):
- line 444: // TODO: Support propagating the PaneInfo through.
sdks/go/pkg/beam/runners/prism/internal/worker/bundle.go (1 line):
- line 150: // TODO: make batching decisions on the maxium to send per elements block, to reduce processing time overhead.
sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic/delay/ReaderDelay.java (1 line):
- line 35: // TODO: add a separate distribution for the sleep time of reading the first record
sdks/python/apache_beam/ml/inference/vertex_ai_inference.py (1 line):
- line 118: # TODO: support the full list of options for aiplatform.init()
sdks/java/io/clickhouse/src/main/java/org/apache/beam/sdk/io/clickhouse/ClickHouseIO.java (1 line):
- line 357: // TODO: This should be the same as resolved so that Beam knows which fields
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/logging/Slf4jLogWriter.java (1 line):
- line 44: // TODO: Provide a useful default
sdks/python/apache_beam/runners/worker/worker_id_interceptor.py (1 line):
- line 37: # TODO: (BEAM-3904) Removed defaulting to UUID when worker_id is not present
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/NonKeyedPushedBackElementsHandler.java (1 line):
- line 57: // TODO: use addAll() once Flink has addAll(Iterable TODO: Modify the Beam DistributionCell to support extracting the delta.
sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkLauncher.java (1 line):
- line 374: // TODO: assign one generator per core rather than one per worker.
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java (1 line):
- line 71: // TODO: Expose the fileSystem related options.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowOperationContext.java (1 line):
- line 127: // TODO: It could be useful to capture enough context to report per-timer
sdks/python/apache_beam/coders/coders.py (1 line):
- line 247: # TODO: After https://github.com/apache/beam/issues/18490 we should be
sdks/python/apache_beam/typehints/testing/strategies.py (1 line):
- line 88: # TODO: Currently this will only draw from the primitive types that can be
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/GrpcDataService.java (1 line):
- line 68: * TODO: (BEAM-3811) Replace with some cancellable collection, to ensure that new clients of a
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/source/unbounded/FlinkUnboundedSourceReader.java (1 line):
- line 322: // TODO: Replace with SourceReaderContest.metricGroup().setPendingBytesGauge() after Flink 1.14
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/DicomIO.java (1 line):
- line 177: // TODO [https://github.com/apache/beam/issues/20582] Change to non-blocking async calls
sdks/java/io/contextualtextio/src/main/java/org/apache/beam/sdk/io/contextualtextio/ContextualTextIO.java (1 line):
- line 461: // TODO: We don't need to attach the filename during sorting since we process all
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/Structs.java (1 line):
- line 62: // TODO: Need to agree on a format for encoding bytes in
website/www/site/assets/scss/_navbar-desktop.scss (1 line):
- line 36: // TODO - revert margin-bottom to 30px after summit is over when removing banner
sdks/python/expansion-service-container/boot.go (1 line):
- line 107: // TODO update
runners/portability/java/src/main/java/org/apache/beam/runners/portability/JobServicePipelineResult.java (1 line):
- line 199: // TODO: Determine the correct mappings for the states below.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/processing/StreamingCommitFinalizer.java (1 line):
- line 69: // TODO: It is also possible for an earlier finalized id to be lost.
website/www/site/assets/scss/bootstrap/_dropdowns.scss (1 line):
- line 181: // TODO: abstract this so that the navbar fixed styles are not placed here?
sdks/typescript/src/apache_beam/coders/required_coders.ts (1 line):
- line 319: // TODO: these actually go up to int64
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java (1 line):
- line 216: // TODO: Stop passing windowed value coders within PCollections.
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/MultimapUserState.java (1 line):
- line 56: * TODO: Move to an async persist model where persistence is signalled based upon cache memory
sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/sqs/SqsUnboundedReader.java (1 line):
- line 387: // TODO: Estimate a timestamp lag.
sdks/typescript/src/apache_beam/internal/environments.ts (1 line):
- line 24: // TODO: Cleanup. Actually populate.
sdks/go/pkg/beam/runners/prism/internal/jobservices/server.go (1 line):
- line 78: logger: slog.Default(), // TODO substitute with a configured logger.
it/jdbc/src/main/java/org/apache/beam/it/jdbc/OracleResourceManager.java (1 line):
- line 42: // TODO - oracle-xe seems to require these credentials to spin up the container.
sdks/java/core/src/main/java/org/apache/beam/sdk/PipelineResult.java (1 line):
- line 66: // TODO: method to retrieve error messages.
sdks/python/apache_beam/io/gcp/gcsio_retry.py (1 line):
- line 62: # TODO: revisit the logic here when gcs client library supports error
runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaJobServerDriver.java (1 line):
- line 39: // TODO: Expose the fileSystem related options.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/CombinePhase.java (1 line):
- line 25: // TODO: These strings are part of the service definition, and
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/UnionCoder.java (1 line):
- line 36: // TODO: Think about how to integrate this with a schema object (i.e.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/ProtoOverrides.java (1 line):
- line 79: // TODO: remove PCollections not produced by 'pt' here.
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FileReporter.java (1 line):
- line 76: // TODO Remove once FLINK-22646 is fixed on upstream Flink.
sdks/typescript/src/apache_beam/coders/coders.ts (1 line):
- line 61: // TODO: Figure out how to branch on constructors (called with new) and
sdks/python/apache_beam/ml/anomaly/detectors/offline.py (1 line):
- line 105: # TODO: validate the model handler type
runners/core-java/src/main/java/org/apache/beam/runners/core/triggers/TriggerStateMachineRunner.java (1 line):
- line 215: // TODO: If we know that no trigger in the tree will ever finish, we don't need to do the
sdks/python/apache_beam/utils/multi_process_shared.py (1 line):
- line 298: # TODO: Allow passing/parameterizing the callable here, in case they are
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/testing/FakeDatasetService.java (1 line):
- line 382: // TODO: Only allow "legal" schema changes.
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java (1 line):
- line 177: // TODO: https://github.com/apache/beam/issues/30301
playground/api/v1/api.proto (1 line):
- line 184: // TODO mark reserved after #24402 update sequence is done
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/appliance/JniWindmillApplianceServer.java (1 line):
- line 41: // TODO: Remove the use of JNI here
sdks/go/pkg/beam/core/graph/fn.go (1 line):
- line 179: // TODO: ViewFn, etc.
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java (1 line):
- line 748: // TODO eliminate reflection once Beam drops Java 8 support and upgrades to gcsio 3.x
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/FnApiControlClientPoolService.java (1 line):
- line 96: // TODO: https://github.com/apache/beam/issues/18790: Prevent stale client references
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java (1 line):
- line 946: // TODO: add support for HL7v2 import.
sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/TVFScanConverter.java (1 line):
- line 87: // TODO: migrate to public Java API to retrieve FunctionSignature.
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/batch/DoFnPartitionIteratorFactory.java (1 line):
- line 153: // FIXME Add support for TimerInternals.TimerData
sdks/java/managed/src/main/java/org/apache/beam/sdk/managed/Managed.java (1 line):
- line 93: // TODO: Dynamically generate a list of supported transforms
sdks/java/extensions/ordered/src/main/java/org/apache/beam/sdk/extensions/ordered/GlobalSequencesProcessorDoFn.java (1 line):
- line 283: // TODO: validate that this is correct.
runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java (1 line):
- line 68: // TODO: (https://github.com/apache/beam/issues/18079) Create a shared ExecutorService for
sdks/go/pkg/beam/runners/prism/internal/handlecombine.go (1 line):
- line 34: // TODO figure out the factory we'd like.
runners/twister2/src/main/java/org/apache/beam/runners/twister2/translators/functions/Twister2SinkFunction.java (1 line):
- line 36: // TODO need to complete functionality if needed
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ConvertHelpers.java (1 line):
- line 122: // TODO: Properly handle nullable.
runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java (1 line):
- line 153: // TODO: [https://github.com/apache/beam/issues/18968] Pass Future back to consumer to check for
sdks/java/io/pulsar/src/main/java/org/apache/beam/sdk/io/pulsar/ReadFromPulsarDoFn.java (1 line):
- line 130: // TODO improve getsize estiamate, check pulsar stats to improve get size estimate
runners/samza/src/main/java/org/apache/beam/runners/samza/util/PipelineJsonRenderer.java (1 line):
- line 134: // TODO: Refactor PipelineJsonRenderer to use SamzaPipelineVisitor instead of PipelineVisitor to
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/RetryHttpRequestInitializer.java (1 line):
- line 254: // TODO: Do this exclusively for work requests.
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java (1 line):
- line 188: // TODO: should this be AbstractCollection?
sdks/go/pkg/beam/core/graph/window/trigger/trigger.go (1 line):
- line 175: // TODO: Change to call UnixMilli() once we move to only supporting a go version > 1.17.
playground/frontend/lib/modules/messages/parsers/messages_parser.dart (1 line):
- line 47: // TODO: Log
it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/LoadTestBase.java (1 line):
- line 268: // TODO: determine elapsed time more accurately if Direct runner supports do so.
sdks/python/apache_beam/runners/common.py (1 line):
- line 242: # TODO: Should we also store batching parameters here? (time/size preferences)
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobInvoker.java (1 line):
- line 63: // TODO: How to make Java/Python agree on names of keys and their values?
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/FlinkPartialReduceFunction.java (1 line):
- line 52: // TODO: Remove side input functionality since liftable Combines no longer have side inputs.
playground/frontend/playground_components/lib/src/constants/sizes.dart (1 line):
- line 55: static const double infinite = 1000; // TODO: Use StadiumBorder
sdks/go/pkg/beam/runners/prism/internal/web/web.go (1 line):
- line 242: // TODO: Figure out where uniquename or id is being used in prism. It should be all global transform ID to faciliate lookups.
runners/spark/src/main/java/org/apache/beam/runners/spark/SparkTransformOverrides.java (1 line):
- line 40: // TODO: [https://github.com/apache/beam/issues/19107] Support @RequiresStableInput on Spark
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Cast.java (1 line):
- line 296: // TODO: This should be the same as resolved so that Beam knows which fields
model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/endpoints.proto (1 line):
- line 54: // TODO: Add authentication specifications as needed.
runners/direct-java/src/main/java/org/apache/beam/runners/direct/WatermarkManager.java (1 line):
- line 572: // TODO: Track elements in the bundle by the processing time they were output instead of
sdks/python/apache_beam/yaml/yaml_testing.py (1 line):
- line 427: # TODO: Optionally take this as a parameter.
sdks/python/apache_beam/runners/direct/transform_evaluator.py (1 line):
- line 870: # TODO Add paneinfo to timer_firing in DirectRunner
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java (1 line):
- line 192: // TODO: write records that can't be deserialized to a "dead-letter" additional output.
sdks/go/pkg/beam/core/runtime/harness/monitoring.go (1 line):
- line 38: // TODO: 2020/03/26 - measure mutex overhead vs sync.Map for this case.
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkSessionFactory.java (1 line):
- line 221: // TODO find more efficient ways
playground/backend/internal/validators/scio_validators.go (1 line):
- line 49: //TODO BEAM-13702
sdks/python/apache_beam/typehints/pytorch_type_compatibility.py (1 line):
- line 109: # TODO Check sub against batch type, and element type
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaRegistry.java (1 line):
- line 46: * TODO: Provide support for schemas registered via a ServiceLoader interface. This will allow
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/EntityToRow.java (1 line):
- line 157: // TODO: figure out in what order the elements are in (without relying on Beam schema).
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/InMemoryReader.java (1 line):
- line 120: // TODO: Replace with the real encoding used by the
sdks/java/extensions/avro/src/main/java/org/apache/beam/sdk/extensions/avro/coders/AvroCoder.java (1 line):
- line 664: // TODO: We should be able to support custom schemas on POJO fields, but we shouldn't
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java (1 line):
- line 1836: // TODO: In future, Bigtable service may provide finer grained APIs, e.g., to sample given a
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/CustomSources.java (1 line):
- line 61: int cores = 4; // TODO: decide at runtime?
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/SdkComponents.java (1 line):
- line 344: // TODO Support multiple environments. The environment should be decided by the translation.
sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/graph/QueryablePipeline.java (1 line):
- line 60: // TODO: Is it better to have the signatures here require nodes in almost all contexts, or should
sdks/python/apache_beam/io/aws/clients/s3/fake_client.py (1 line):
- line 89: # TODO: Do we want to mock out a lack of credentials?
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/auth/VendoredRequestMetadataCallbackAdapter.java (1 line):
- line 28: * TODO: Replace this with an auto generated proxy which calls the underlying implementation
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/JavaBeanSchema.java (1 line):
- line 49: * TODO: Validate equals() method is provided, and if not generate a "slow" equals method based
sdks/python/apache_beam/internal/cloudpickle_pickler.py (1 line):
- line 117: # TODO: Add support once https://github.com/cloudpipe/cloudpickle/pull/563
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Select.java (1 line):
- line 131: // TODO: This should be the same as resolved so that Beam knows which fields
sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java (1 line):
- line 146: // TODO: Investigate whether in practice this is useful or, rather, confusing.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/ShuffleEntry.java (1 line):
- line 87: // TODO: Use a more compact and readable representation,
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java (1 line):
- line 116: // TODO: https://github.com/apache/beam/issues/20415 Come up with something to determine
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroRowWriter.java (1 line):
- line 33: "nullness" // calling superclass method in constructor flagged as error; TODO: fix
sdks/java/extensions/sorter/src/main/java/org/apache/beam/sdk/extensions/sorter/Sorter.java (1 line):
- line 29: * TODO: Support custom comparison functions.
sdks/python/apache_beam/runners/interactive/interactive_runner.py (1 line):
- line 189: # TODO: make the StreamingCacheManager and TestStreamServiceController
sdks/python/apache_beam/testing/benchmarks/nexmark/queries/query10.py (1 line):
- line 50: # TODO: [https://github.com/apache/beam/issues/20670] it seems that beam team
sdks/java/extensions/zetasketch/src/main/java/org/apache/beam/sdk/extensions/zetasketch/HyperLogLogPlusPlusCoder.java (1 line):
- line 62: // TODO: check if we can know the sketch size without serializing it
playground/frontend/lib/pages/standalone_playground/screen.dart (1 line):
- line 36: // TODO: calculate sum of widths of all app bar buttons at the first frame.
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowPortabilityPCollectionView.java (1 line):
- line 53: * TODO: Migrate to a runner only specific concept of a side input to be used with {@link
sdks/python/apache_beam/testing/benchmarks/nexmark/queries/winning_bids.py (1 line):
- line 175: #TODO: change this to be calculated by event generation
learning/tour-of-beam/backend/auth.go (1 line):
- line 86: // TODO: implement IDToken caching in tb_user to optimize calls to Firebase API
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/MonitoringUtil.java (1 line):
- line 132: // TODO: Allow filtering messages by importance
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java (1 line):
- line 497: *