gobblin-aws/src/main/java/org/apache/gobblin/aws/CloudInitScriptBuilder.java (12 lines):
  - line 63: * 1. Mount NFS Server (TODO: To be replaced with EFS soon)
  - line 66: * 4. Download Gobblin application jars from S3 (TODO: To be replaced via baked in jars in custom Gobblin AMI)
  - line 69: * 7. TODO: Add cron that watches the {@link GobblinAWSClusterManager} application and restarts it if it dies
  - line 96: // TODO: Replace with EFS (it went into GA on 6/30/2016)
  - line 131: // TODO: Eventually limit only custom user jars to pulled from S3, load rest from AMI
  - line 138: // TODO: Add cron that brings back master if it dies
  - line 171: * 1. Mount NFS volume (TODO: To be replaced with EFS soon)
  - line 174: * 4. Download Gobblin application jars from S3 (TODO: To be replaced via baked in jars in custom Gobblin AMI)
  - line 177: * 7. TODO: Add cron that watches the {@link GobblinAWSTaskRunner} application and restarts it if it dies
  - line 205: // TODO: Replace with EFS (it went into GA on 6/30/2016)
  - line 234: // TODO: Limit only custom user jars to pulled from S3, load rest from AMI
  - line 244: // TODO: Add cron that brings back worker if it dies
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/workflow/impl/ExecuteGobblinWorkflowImpl.java (10 lines):
  - line 115: // TODO: when eliminating the "GenWUs Worker", pause/block until scaling is complete
  - line 117: // TODO: decide whether this should be a hard failure; for now, "gracefully degrade" by continuing processing
  - line 136: // TODO: Cleanup WorkUnit/Taskstate Directory for jobs cancelled mid flight
  - line 166: // TODO: make fully configurable! for now, cap Work Discovery at 45 mins and set aside 10 mins for the `CommitStepWorkflow`
  - line 181: // TODO: make any adjustments - e.g. decide whether to shutdown the (often oversize) `GenerateWorkUnits` worker or alternatively to deduct one to count it
  - line 185: // TODO: be more robust and code more defensively, rather than presuming the impl of `RecommendScalingForWorkUnitsLinearHeuristicImpl`
  - line 192: // TODO: consider whether to allow either a) "pre-defining" a profile w/ set point zero, available for later use OR b) down-scaling to zero to pause worker
  - line 201: // TODO: use our own prop names; don't "borrow" from `ProcessWorkUnitsJobLauncher`
  - line 229: // TODO: Add configuration to support cleaning up historical work dirs from same job name
  - line 232: // TODO: Avoid cleaning up if work is being checkpointed e.g. midway of a commit for EXACTLY_ONCE
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/GenerateWorkUnitsImpl.java (9 lines):
  - line 135: // TODO: decide whether to acquire a job lock (as MR did)!
  - line 136: // TODO: provide for job cancellation (unless handling at the temporal-level of parent workflows)!
  - line 147: // TODO: determine whether these are actually necessary to do (as MR/AbstractJobLauncher did)!
  - line 212: // TODO: decide whether a non-retryable failure is too severe... (some sources may merit retry)
  - line 240: // TODO: report (timer) metrics for workunits preparation
  - line 243: // TODO: gobblinJobMetricsReporter.reportWorkUnitCountMetrics(this.jobContext.getJobState().getPropAsInt(NUM_WORKUNITS), jobState);
  - line 264: // WARNING/TODO: NOT resilient to nested multi-workunits... should it be?
  - line 301: // WARNING/TODO: NOT resilient to nested multi-workunits... should it be?
  - line 316: // TODO - decide whether helpful/necessary to `.compress()`
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/CommitActivityImpl.java (7 lines):
  - line 98: // TODO: Make this configurable
  - line 155: //TODO: IMPROVE GRANULARITY OF RETRIES
  - line 179: //TODO: Make this configurable
  - line 203: // TODO: Rewrite executorUtils to use java util optional
  - line 220: // TODO: propagate cause of failure and determine whether or not this is retryable to throw a non-retryable failure exception
  - line 231: // TODO - decide whether to replace this method by adapting TaskStateCollectorService::collectOutputTaskStates (whence much of this code was drawn)
  - line 244: // TODO - decide whether something akin necessary to streamline cumulative in-memory size of all issues: consumeTaskIssues(taskState);
gobblin-core-base/src/main/java/org/apache/gobblin/writer/AsyncWriterManager.java (5 lines):
  - line 66: * 4. Support a fixed number of retries on failure of individual records (TODO: retry strategies)
  - line 68: * 6. TODO: Support ordered / unordered write semantics
  - line 175: * TODO: Figure out what this means for checkpointing.
  - line 283: * TODO: Add windowed stats to test for x% failures in y time window
  - line 549: // TODO: Make this configurable
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDataset.java (5 lines):
  - line 76: private final boolean shouldTolerateMissingSourceFiles = true; // TODO: make parameterizable, if desired
  - line 116: // TODO: Implement PushDownRequestor and priority based copy entity iteration
  - line 159: // TODO: should be the same FS each time; try creating once, reusing thereafter, to not recreate wastefully
  - line 289: // TODO: investigate whether streaming initialization of `Map` preferable--`getFileStatus()` network calls likely
  - line 383: // TODO: Filter properties specific to iceberg registration and avoid serializing every global property
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/assistance/Help.java (5 lines):
  - line 95: // TODO: determine whether the same could be obtained from `workerConfig` (likely much more efficiently)
  - line 117: // TODO - determine whether this works... unclear whether it led to "FS closed", or that had another cause...
  - line 130: // TODO - more investigation to sort out the true RC... and whether caching definitively is or is not possible for use here!
  - line 190: // TODO: decide whether actually necessary... it was added in a fit of debugging "FS closed" errors
  - line 248: // TODO: log4j2 has better syntax around conditional logging such that the key does not need to be included in the value
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDatasetFinder.java (4 lines):
  - line 101: // TODO: eventually remove support for combo (src+dest) iceberg props, in favor of separate source/dest-scoped props; for now, maintain support
  - line 107: // TODO: eventually remove support for combo (src+dest) iceberg props, in favor of separate source/dest-scoped props; for now, maintain support
  - line 154: // TODO: Rethink strategy to enforce dest iceberg table
  - line 162: // TODO: Filter properties specific to Hadoop
gobblin-aws/src/main/java/org/apache/gobblin/aws/GobblinAWSClusterLauncher.java (4 lines):
  - line 268: // TODO: Add cluster monitoring
  - line 412: // TODO: Make security group restrictive
  - line 472: // TODO: Make size configurable when we have support multi-master
  - line 520: // TODO: Add listener to Helix / Zookeeper for master restart and update master public ip
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/ProcessWorkUnitImpl.java (4 lines):
  - line 141: // TODO: shall we schedule metrics update based on config?
  - line 152: // TODO: if metrics configured, report them now
  - line 174: // TODO - describe WorkUnits other than `CopyableFile`
  - line 224: // TODO - decide whether to support... and if so, employ a useful path; otherwise, just evaluate predicate to always false
gobblin-modules/gobblin-azkaban/src/main/java/org/apache/gobblin/service/modules/orchestration/AzkabanAjaxAPIClient.java (4 lines):
  - line 64: // TODO: Ensure GET call urls do not grow too big
  - line 142: // TODO: Support users (not just groups), and different permission types
  - line 202: // TODO: Support users (not just groups), and different permission types
  - line 284: // Run once OR push down schedule (TODO: Enable when push down is finalized)
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/util/nesting/workflow/AbstractNestingExecWorkflowImpl.java (3 lines):
  - line 104: // TODO: determine whether any benefit to unordered `::get` blocking for any next ready (perhaps no difference...)
  - line 106: // TODO: consider a generalized reduce op for things other than counting!
  - line 131: // TODO: use a configuration value, for simpler adjustment, rather than hard-code
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergPartitionDataset.java (3 lines):
  - line 100: // TODO: Refactor the IcebergDataset::generateCopyEntities to avoid code duplication
  - line 112: // TODO: should be the same FS each time; try creating once, reusing thereafter, to not recreate wastefully
  - line 218: //TODO: Refactor it later using factory or other way to support different types of filter predicate
gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/Utils/IcebergUtils.java (3 lines):
  - line 69: //TODO: Add more information into partition spec e.g. day, year, month, kafka partition ids, offset ranges for better consuming
  - line 141: //TODO parse partitionValue as per partitionSchema
  - line 254: // TODO: If required, handle NaN value count File metric conversion in ORC metrics with iceberg upgrade
gobblin-metastore/src/main/java/org/apache/gobblin/metastore/MysqlStateStore.java (3 lines):
  - line 218: // TODO: consider whether to demote to DEBUG log level
  - line 227: // TODO: revisit following verification of successful connection pool migration:
  - line 235: // TODO: revisit following verification of successful connection pool migration:
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GaaSJobObservabilityEventProducer.java (3 lines):
  - line 241: .setExecutorUrn(null); //TODO: Fill with information from job execution
  - line 248: // TODO: Separate failure cases to SUBMISSION FAILURE and COMPILATION FAILURE, investigate events to populate these fields
  - line 256: // TODO: If cancelled due to start SLA exceeded, consider grouping this as a submission failure?
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/WorkUnitClaimCheck.java (3 lines):
  - line 46: * TODO: if we're to generalize Work Prediction+Prioritization across multiplexed jobs, each having its own separate time budget, every WU claim-check
  - line 66: return new State(); // TODO - figure out how to truly set!
  - line 72: // TODO: decide whether wise to hard-code... (per `MRJobLauncher` conventions, we expect job state file to be sibling of WU dir)
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergTable.java (3 lines):
  - line 142: // TODO: investigate using `.addedFiles()`, `.deletedFiles()` to calc this
  - line 184: // TODO: verify correctness, even when handling 'delete manifests'!
  - line 258: //TODO: Add support for deleteManifests as well later
gobblin-salesforce/src/main/java/org/apache/gobblin/salesforce/SalesforceExtractor.java (3 lines):
  - line 709: * TODO: abstract this function to a common function: arguments need to add connetion, header, output-format
  - line 710: * TODO: make it and its related functions pure function (no side effect). Currently still unnecesarily changing this.bulkJob)
  - line 726: this.bulkJob = createdJob; // other functions need to use it TODO: remove bulkJob from this class
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlBaseSpecStore.java (3 lines):
  - line 238: // TODO: fix to obey the `SpecStore` contract of returning the *updated* `Spec`
  - line 355: // TODO: migrate this class to use common util {@link DBStatementExecutor}
  - line 366: // TODO: revisit use of connection test query following verification of successful connection pool migration:
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/AbstractRecommendScalingForWorkUnitsImpl.java (3 lines):
  - line 44: // TODO: decide whether this name ought to be configurable - or instead a predictable name that callers may expect (and possibly adjust)
  - line 65: // TODO: implement right-sizing!!! (for now just return unchanged)
  - line 70: // TODO: if we ever return > 1 directive, append a monotonically increasing number to avoid collisions
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java (3 lines):
  - line 402: // TODO: Instead of simply call onDeleteSpec, a callback when FlowSpec is deleted from FlowCatalog, should also kill Azkaban Flow from AzkabanSpecProducer.
  - line 673: // TODO: move this out of the try clause after location NPE source
  - line 696: // TODO: add a metric if expected reminder time far exceeds system time
gobblin-core-base/src/main/java/org/apache/gobblin/writer/TrackerBasedWatermarkManager.java (2 lines):
  - line 40: * TODO: Add metrics monitoring
  - line 76: //TODO: Not checking if this watermark has already been committed successfully.
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/DeleteWorkDirsActivityImpl.java (2 lines):
  - line 55: //TODO: Emit timers to measure length of cleanup step
  - line 75: //TODO: Support task level deletes if necessary, currently it is deemed redundant due to collecting temp dirs during generate work unit step
gobblin-aws/src/main/java/org/apache/gobblin/aws/AWSJobConfigurationManager.java (2 lines):
  - line 145: // TODO: Eventually when config store supports job files as well
  - line 158: // TODO: Currently new and updated jobs are handled, we should un-schedule deleted jobs as well
gobblin-metastore/src/main/java/org/apache/gobblin/metastore/JobHistoryDataSourceProvider.java (2 lines):
  - line 53: // TODO: revisit following verification of successful connection pool migration:
  - line 62: // TODO: revisit following verification of successful connection pool migration:
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java (2 lines):
  - line 298: * TODO: Call this from the right place.
  - line 603: * FIXME this method is provided for backwards compatibility in the LocalJobLauncher since it does
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/MysqlMultiActiveLeaseArbiter.java (2 lines):
  - line 294: // TODO: emit metric here to capture this unexpected behavior
  - line 303: // TODO: check whether reminder event before replacing flowExecutionId
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/provider/HiveMetastoreBasedUpdateProvider.java (2 lines):
  - line 38: // TODO if a table/partition is registered by gobblin an update time will be made available in table properties
  - line 45: // TODO if a table/partition is registered by gobblin an update time will be made available in table properties
gobblin-modules/gobblin-kafka-09/src/main/java/org/apache/gobblin/kafka/client/Kafka09ConsumerClient.java (2 lines):
  - line 213: * TODO Add multi topic support
  - line 223: * TODO Add multi topic support
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/workflow/impl/ProcessWorkUnitsWorkflowImpl.java (2 lines):
  - line 137: // TODO: to incorporate multiple different concrete `NestingExecWorkflow` sub-workflows in the same super-workflow... shall we use queues?!?!?
  - line 143: // TODO: verify to instead use: Policy.PARENT_CLOSE_POLICY_TERMINATE)
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagProcessingEngine.java (2 lines):
  - line 73: // TODO Update to fetch list from config once transient exception handling is implemented and retryable exceptions defined
  - line 168: // TODO add the else block for transient exceptions and add conclude task only if retry limit is not breached
gobblin-utility/src/main/java/org/apache/gobblin/util/jdbc/DataSourceProvider.java (2 lines):
  - line 67: // TODO: revisit following verification of successful connection pool migration:
  - line 75: // TODO: revisit following verification of successful connection pool migration:
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/task/DagProcessingEngineMetrics.java (2 lines):
  - line 137: // TODO: implement evaluating max retries later
  - line 174: // TODO: measure processing time
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/Task.java (2 lines):
  - line 394: //TODO: Move these to explicit shutdown phase
  - line 413: * TODO: Remove this method after Java-11 as JDK offers similar built-in solution.
gobblin-modules/gobblin-sql/src/main/java/org/apache/gobblin/source/jdbc/JdbcProvider.java (2 lines):
  - line 59: // TODO make connection Url parsing much more robust -- some connections URLs can have colons and slashes in the
  - line 84: // TODO: revisit following verification of successful connection pool migration:
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/cluster/GobblinTemporalClusterManager.java (2 lines):
  - line 287: * TODO for now the cluster id is hardcoded to 1 both here and in the {@link GobblinTaskRunner}. In the future, the
  - line 291: * TODO: unravel what to make of the comment above. as it is, `GobblinTemporalApplicationMaster#main` is what runs, NOT `GobblinTemporalClusterManager#main`
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_exec/JobLauncherExecutionDriver.java (2 lines):
  - line 293: // TODO Remove next line once the JobLauncher starts sending notifications for success
  - line 595: // FIXME there is a race condition here as the job may complete successfully before we
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/ScheduledJobConfigurationManager.java (2 lines):
  - line 113: * TODO: Change cluster code to handle Spec. Right now all job properties are needed to be in config and template is not honored
  - line 114: * TODO: Materialized JobSpec and make use of ResolvedJobSpec
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java (2 lines):
  - line 140: * TODO: Expand support to Gobblin-as-a-Service in FlowTemplateCatalog.
  - line 1063: // TODO: decide whether should be `.warn`, stay as `.info`, or change back to `.error`
gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/writer/IcebergMetadataWriter.java (2 lines):
  - line 635: //TODO: emit these metrics to Ingraphs, in addition to metrics for publishing new snapshots and other Iceberg metadata operations.
  - line 735: // TODO Find better way to determine a partition value
gobblin-modules/gobblin-kafka-1/src/main/java/org/apache/gobblin/kafka/client/Kafka1ConsumerClient.java (2 lines):
  - line 181: * TODO Add multi topic support
  - line 191: * TODO Add multi topic support
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaSource.java (2 lines):
  - line 210: * TODO: Utilize the minContainer in {@link KafkaTopicGroupingWorkUnitPacker#pack(Map, int)}, as the numContainers variable
  - line 917: // TODO: replace this with TopicNameValidator in the config once TopicValidators is rolled out.
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStoreWithUpdate.java (2 lines):
  - line 66: // TODO: fix to obey the `SpecStore` contract of returning the *updated* `Spec`
  - line 73: // TODO: fix to obey the `SpecStore` contract of returning the *updated* `Spec`
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnAppSecurityManagerWithKeytabs.java (2 lines):
  - line 94: // TODO: If token failed to get renewed in case its expired ( can be detected via the error text ),
  - line 124: //TODO: Any other required tokens can be fetched here based on config or any other detection mechanism
gobblin-metrics-libs/gobblin-metrics/src/main/java/org/apache/gobblin/metrics/ServiceMetricNames.java (2 lines):
  - line 39: // TODO: change these metric variable and names after verifying the refactoring preserves existing functionality
  - line 84: // TODO: implement dropping reminder event after exceed some time
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobLauncher.java (2 lines):
  - line 335: //TODO: having a smarter way to calculate the new work unit size to replace the current static approach to simply double
  - line 545: // TODO: Better error handling. The current impl swallows exceptions for jobs that were started by this method call.
gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/modules/restli/FlowConfigsV2ResourceHandler.java (2 lines):
  - line 212: // TODO: Compilation errors should fall under throwable exceptions as well instead of checking for strings
  - line 264: // TODO: Compilation errors should fall under throwable exceptions as well instead of checking for strings
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/WUProcessingSpec.java (2 lines):
  - line 65: return new State(); // TODO - figure out how to truly set!
  - line 71: // TODO: decide whether wise to hard-code... (per `MRJobLauncher` conventions, we expect job state file to be sibling of WU dir)
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/AbstractEagerFsDirBackedWorkload.java (2 lines):
  - line 93: * TODO: handle case of dir contents growing (e.g. use timestamp to filter out newer paths)... how could we handle the case of shrinking/deletion?
  - line 132: return new State(); // TODO - figure out how to truly set!
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/JobExecutionResult.java (2 lines):
  - line 35: // TODO add TaskExecutionResults
  - line 72: // FIXME we need to capture error(s)
gobblin-utility/src/main/java/org/apache/gobblin/util/HadoopUtils.java (2 lines):
  - line 370: * TODO this method does not handle cleaning up any local files leftover by writing to S3.
  - line 394: * TODO this method does not handle cleaning up any local files leftover by writing to S3.
gobblin-kubernetes/gobblin-service/base-cluster/deployment.yaml (2 lines):
  - line 37: imagePullPolicy: Never # TODO: Remove this once docker images are deployed to Apache DockerHub post-graduation
  - line 70: imagePullPolicy: Never # TODO: Remove this once docker images are deployed to Apache DockerHub post-graduation
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/GenerateWorkUnitsResult.java (1 line):
  - line 41: // TODO: characterize the WUs more thoroughly, by also including destination info, and with more specifics, like src+dest location, I/O config, throttling...
gobblin-modules/gobblin-couchbase/src/main/java/org/apache/gobblin/couchbase/writer/CouchbaseEnvironmentFactory.java (1 line):
  - line 36: * TODO: Determine if we need to use the config to tweak certain parameters
gobblin-modules/gobblin-kafka-09/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaSimpleStreamingSource.java (1 line):
  - line 129: StringUtils.EMPTY); // TODO: fix this to use the new API when KafkaWrapper is fixed
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/kafka/serialize/MD5Digest.java (1 line):
  - line 88: //TODO: Replace this with a version that encodes without needing a copy.
gobblin-api/src/main/java/org/apache/gobblin/configuration/WorkUnitState.java (1 line):
  - line 261: * TODO - Once we are ready to make a backwards incompatible change to the {@link org.apache.gobblin.source.extractor.Extractor}
gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/writer/GobblinMCEWriter.java (1 line):
  - line 538: //TODO: Support multi partition columns
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/InstrumentedSpecStore.java (1 line):
  - line 144: // NOTE: uses same `getTimer` as `getSpec`; TODO: explore separating, since measuring different operation
gobblin-modules/google-ingestion/src/main/java/org/apache/gobblin/ingestion/google/webmaster/ProducerJob.java (1 line):
  - line 70: //TODO: don't need to recreate objects if it's of type SimpleProducerJob
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/worker/WorkFulfillmentWorker.java (1 line):
  - line 45: public static final long DEADLOCK_DETECTION_TIMEOUT_SECONDS = 120; // TODO: make configurable!
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/Orchestrator.java (1 line):
  - line 231: // TODO: Evolve logic to cache and reuse previously compiled JobSpecs
gobblin-modules/gobblin-elasticsearch/src/main/java/org/apache/gobblin/elasticsearch/writer/ElasticsearchRestWriter.java (1 line):
  - line 126: //TODO: Support pass through of configuration (e.g. timeouts etc) of rest client from above
gobblin-admin/src/main/resources/static/js/views/key-value-table-view.js (1 line):
  - line 46: // TODO attach elsewhere?
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagManagementDagActionStoreChangeMonitor.java (1 line):
  - line 77: /* TODO: skip deadline removal for now and let them fire
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/workflows/metrics/EventSubmitterContext.java (1 line):
  - line 104: // TODO: Add temporal specific metadata tags
gobblin-utility/src/main/java/org/apache/gobblin/util/WorkUnitSizeInfo.java (1 line):
  - line 80: // WARNING/TODO: NOT resilient to nested multi-workunits... should it be?
gobblin-compaction/src/main/java/org/apache/gobblin/compaction/mapreduce/orc/GobblinOrcMapreduceRecordWriter.java (1 line):
  - line 51: // TODO: Emit this information as kafka events for ease for populating dashboard.
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/workunit/packer/KafkaTopicGroupingWorkUnitPacker.java (1 line):
  - line 255: * TODO: This method should be moved into {@link KafkaSource}, which requires moving classes such
gobblin-core/src/main/java/org/apache/gobblin/source/extractor/extract/QueryBasedSource.java (1 line):
  - line 360: * TODO: this is currently a simple round-robin packing. More sophisticated bin packing may be necessary
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/MysqlUserQuotaManager.java (1 line):
  - line 331: // TODO: revisit use of connection test query following verification of successful connection pool migration:
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/TopologySpec.java (1 line):
  - line 321: // TODO: Try to init SpecProducer from config if not initialized via builder.
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/JobExecutionDriver.java (1 line):
  - line 47: //TODO add:
gobblin-service/src/main/java/org/apache/gobblin/service/modules/flowgraph/pathfinder/AbstractPathFinder.java (1 line):
  - line 266: // TODO: Choose the min-cost executor for the FlowEdge as opposed to the first one that resolves.
gobblin-metrics-libs/gobblin-metrics/src/main/java/org/apache/gobblin/metrics/OpenTelemetryMetrics.java (1 line):
  - line 64: // TODO: Refactor the method to use a factory pattern for instantiating MetricExporter. Each MetricExporter
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/workflow/impl/GenerateWorkUnitsWorkflowImpl.java (1 line):
  - line 37: public static final Duration startToCloseTimeout = Duration.ofMinutes(90); // TODO: make configurable
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/DatasetCleaner.java (1 line):
  - line 108: // TODO -- Remove the dependency on gobblin-core after new Gobblin Metrics does not depend on gobblin-core.
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/source/LoopingDatasetFinderSource.java (1 line):
  - line 56: * TODO: handle retries
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManagementTaskStreamImpl.java (1 line):
  - line 138: //TODO: add metrics
gobblin-modules/google-ingestion/src/main/java/org/apache/gobblin/ingestion/google/webmaster/GoogleWebmasterExtractor.java (1 line):
  - line 146: * TODO: May need to implement this feature in the future based on use cases.
gobblin-modules/gobblin-sql/src/main/java/org/apache/gobblin/publisher/JdbcPublisher.java (1 line):
  - line 128: * TODO: Research on running this in parallel. While testing publishing it in parallel, it turns out delete all from the table locks the table
gobblin-modules/google-ingestion/src/main/java/org/apache/gobblin/ingestion/google/webmaster/GoogleWebmasterExtractorIterator.java (1 line):
  - line 424: //TODO: 99.99% cases we are good. But what if it happens, what can we do?
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnitComparator.java (1 line):
  - line 151: //FIXME: This is a temp fix for special character in schema string, need to investigate the root
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/task/DagTask.java (1 line):
  - line 72: // TODO: Decide appropriate exception to throw and add to the commit method's signature
gobblin-service/src/main/java/org/apache/gobblin/service/modules/core/GobblinServiceManager.java (1 line):
  - line 416: // TODO: Remove after adding test cases
gobblin-core/src/main/java/org/apache/gobblin/writer/SimpleDataWriterBuilder.java (1 line):
  - line 61: // TODO: refactor this when capability support comes back in
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopySource.java (1 line):
  - line 301: // TODO: we should set job as partial success if there is a mix of allocated requests and rejections
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopyableFile.java (1 line):
  - line 163: // TODO:
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/TaskRunnerSuiteProcessModel.java (1 line):
  - line 62: //TODO: taskFactoryMap.put(GOBBLIN_JOB_FACTORY_NAME, jobFactory);
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/dynamic/FsScalingDirectivesRecipient.java (1 line):
  - line 34: * TODO: per {@link FsScalingDirectiveSource} - directives too long for one filename path component MUST (but currently do NOT!) use the
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnLogSource.java (1 line):
  - line 63: * TODO: This is duplicated to the org.apache.gobblin.yarn.GobblinYarnAppLauncher#buildLogCopier(com.typesafe.config.Config, org.apache.hadoop.fs.Path, org.apache.hadoop.fs.Path)
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_catalog/FlowCatalog.java (1 line):
  - line 249: // TODO: Have kind of metrics keeping track of specs that failed to be deserialized.
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergTableMetadataValidatorUtils.java (1 line):
  - line 57: // TODO: Need to add support for schema evolution
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/JobExecutionState.java (1 line):
  - line 94: // TODO default implementation
gobblin-metrics-libs/gobblin-metrics-base/src/main/java/org/apache/gobblin/metrics/event/JobEvent.java (1 line):
  - line 30: public static final String JOB_STATE = "JobStateEvent"; // TODO: Migrate to JobStateEventBuilder
gobblin-modules/gobblin-compliance/src/main/java/org/apache/gobblin/compliance/purger/HivePurgerSource.java (1 line):
  - line 100: // TODO: Event submitter and metrics will be added later
gobblin-modules/gobblin-kafka-08/src/main/java/org/apache/gobblin/kafka/serialize/LiAvroSerializer.java (1 line):
  - line 27: * TODO: Implement this for IndexedRecord not just GenericRecord
gobblin-salesforce/src/main/java/org/apache/gobblin/salesforce/SalesforceSource.java (1 line):
  - line 307: // TODO: we should consider move this logic into getRefinedHistogram so that we can early terminate the search
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_catalog/FSPathAlterationListenerAdaptor.java (1 line):
  - line 72: // TODO: fix version
gobblin-modules/gobblin-kafka-1/src/main/java/org/apache/gobblin/kafka/serialize/LiAvroSerializer.java (1 line):
  - line 27: * TODO: Implement this for IndexedRecord not just GenericRecord
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/joblauncher/GobblinJobLauncher.java (1 line):
  - line 220: // TODO: Better error handling. The current impl swallows exceptions for jobs that were started by this method call.
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/MysqlJobStatusRetriever.java (1 line):
  - line 92: // TODO: optimize as needed: returned `List` may be large, since encompassing every execution of every flow (in group)!
gobblin-admin/src/main/resources/static/js/views/table-view.js (1 line):
- line 77: // TODO attach elsewhere?
gobblin-data-management/src/main/java/org/apache/gobblin/util/commit/SetPermissionCommitStep.java (1 line):
- line 77: // TODO : we can also set owner and group here.
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java (1 line):
- line 1096: TODO -
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterManager.java (1 line):
- line 497: * TODO for now the cluster id is hardcoded to 1 both here and in the {@link GobblinTaskRunner}. In the future, the
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/kafka/serialize/LiAvroSerializerBase.java (1 line):
- line 37: * TODO: Implement this for IndexedRecord not just GenericRecord
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/writer/HiveMetadataWriter.java (1 line):
- line 351: //TODO: De-register table if table location does not exist (Configurable)
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/converter/DistcpConverter.java (1 line):
- line 61: * TODO: actually use this method and add the extensions.
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/messaging/handler/SplitMessageHandler.java (1 line):
- line 39: //TODO: GOBBLIN-1688 Recompute workunit based on message.
gobblin-config-management/gobblin-config-client/src/main/java/org/apache/gobblin/config/client/package-info.java (1 line):
- line 23: //TODO: Remove once we commit any other classes
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergRegisterStep.java (1 line):
- line 63: // store as string for serializability... TODO: explore whether truly necessary (or we could just as well store as `TableIdentifier`)
gobblin-modules/gobblin-azkaban/src/main/java/org/apache/gobblin/service/modules/orchestration/AzkabanJobHelper.java (1 line):
- line 371: // TODO: compare checksum
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java (1 line):
- line 945: // TODO: Add compatibility check when ORC evolution supports complex types
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/converter/AbstractAvroToOrcConverter.java (1 line):
- line 389: // TODO: Split this method into two (conversion and publish)
gobblin-modules/gobblin-azkaban/src/main/java/org/apache/gobblin/azkaban/EmbeddedGobblinYarnAppLauncher.java (1 line):
- line 45: * TODO: Adding embedded Kafka cluster and set golden datasets for data-validation.
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/dynamic/ScalingDirectiveParser.java (1 line):
- line 197: // TODO: syntax to remove an attr while ALSO "adding" (so not simply setting to the empty string) - [proposal: alt. form for KV_PAIR ::= ( KEY '|=|' )]
gobblin-modules/gobblin-orc/src/main/java/org/apache/gobblin/writer/GenericRecordToOrcValueWriter.java (1 line):
- line 56: * TODO: consider using the record size provided by the extractor instead of the converter as it may be more available and accurate
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/locks/FileBasedJobLockFactory.java (1 line):
- line 58: * TODO add configuration support
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/kafka/HighLevelConsumer.java (1 line):
- line 259: // TODO: we may be committing too early and only want to commit after process messages
gobblin-modules/gobblin-kafka-08/src/main/java/org/apache/gobblin/kafka/tool/SimpleKafkaConsumer.java (1 line):
- line 72: /** TODO: Make Confluent schema registry integration configurable
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegisterStep.java (1 line):
- line 56: // TODO: this is complicated due to preactivities, postactivities, etc. but unnecessary for now because exactly once
gobblin-core/src/main/java/org/apache/gobblin/source/extractor/filebased/FileBasedExtractor.java (1 line):
- line 210: * TODO Add support for different file formats besides text e.g. avro iterator, byte iterator, json iterator.
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/dynamic/WorkforcePlan.java (1 line):
- line 116: // TODO - make idempotent, since any retry attempt following failure between `addProfile` and `reviseStaffing` would thereafter fail with
gobblin-modules/gobblin-orc/src/main/java/org/apache/gobblin/writer/OrcConverterMemoryManager.java (1 line):
- line 142: * TODO: Consider calculating this value on the fly everytime a resize is called
gobblin-modules/gobblin-couchbase/src/main/java/org/apache/gobblin/couchbase/writer/CouchbaseWriterBuilder.java (1 line):
- line 41: //TODO: Read config to decide whether to build a blocking writer or an async writer
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/LocalFsJobStatusRetriever.java (1 line):
- line 131: //TODO: implement this
gobblin-api/src/main/java/org/apache/gobblin/password/PasswordManager.java (1 line):
- line 142: // TODO: determine whether to always throw whenever no encryptors, despite `exception == null`! (for now, at least give notice by logging)
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaStreamingExtractor.java (1 line):
- line 575: // TODO: check if these topic partitions actually were part of the assignment
gobblin-core/src/main/java/org/apache/gobblin/policies/schema/SchemaCompatibilityPolicy.java (1 line):
- line 42: // TODO how do you test for backwards compatibility?
gobblin-modules/gobblin-couchbase/src/main/java/org/apache/gobblin/couchbase/converter/AvroToCouchbaseTupleConverter.java (1 line):
- line 47: //TODO: Use the schema and config to determine which fields to pull out
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/dynamic/FsScalingDirectiveSource.java (1 line):
- line 84: // TODO: add caching by dir modtime to avoid re-listing the same, unchanged contents, while also avoiding repetitive parsing
gobblin-compaction/src/main/java/org/apache/gobblin/compaction/verify/CompactionAuditCountVerifier.java (1 line):
- line 47: * @TODO: 8/31/21 "Use @{@link org.apache.gobblin.completeness.verifier.KafkaAuditCountVerifier}"
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/DeleteWorkDirsActivity.java (1 line):
- line 34: * TODO: Generalize the input to support multiple platforms outside of just HDFS
gobblin-service/src/main/java/org/apache/gobblin/service/modules/db/ServiceDatabaseProviderImpl.java (1 line):
- line 75: // TODO: revisit following verification of successful connection pool migration:
gobblin-modules/gobblin-kafka-09/src/main/java/org/apache/gobblin/kafka/serialize/LiAvroSerializer.java (1 line):
- line 27: * TODO: Implement this for IndexedRecord not just GenericRecord
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/FlowLaunchHandler.java (1 line):
- line 295: // TODO: investigate potentially better way of generating cron expression that does not make it US dependent
gobblin-modules/gobblin-crypto-provider/src/main/java/org/apache/gobblin/crypto/GobblinEncryptionProvider.java (1 line):
- line 118: // TODO this is yet another example of building a broad type (CredentialStore) based on a human-readable name
gobblin-utility/src/main/java/org/apache/gobblin/util/ProxiedFileSystemUtils.java (1 line):
- line 58: * TODO figure out the proper generic type for the {@link Token} objects.
gobblin-core/src/main/java/org/apache/gobblin/writer/objectstore/ObjectStoreWriter.java (1 line):
- line 81: // TODO Will be added when ObjectStorePutOperation is implemented. Currently we only support ObjectStoreDeleteOperation