Metrics for Gobblin ETL

Issac Buenrostro edited this page Aug 21, 2015 · 7 revisions

Gobblin ETL is instrumented with [Gobblin Metrics](Gobblin Metrics) and exposes endpoints for extending that instrumentation.

Operational Metrics

Each construct in a Gobblin ETL run computes metrics about its performance and progress. By default, each metric carries the following tags:

  • jobName: Gobblin generated name for the job.
  • jobId: Gobblin generated id for the job.
  • clusterIdentifier: string identifier of the cluster / host where the job was run. Obtained from the resource manager, the job tracker, or the name of the host.
  • taskId: Gobblin generated id for the task that generated the metric.
  • construct: construct type that generated the metric (e.g. extractor, converter).
  • class: specific class of the construct that generated the metric.
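
As a rough illustration of how these tags can be used, the sketch below models the six default tags as a plain map and selects metrics by tag value. `MetricTagsSketch`, `defaultTags`, and `matches` are hypothetical names made up for this example; they are not part of the Gobblin Metrics API, which has its own tag types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Gobblin: models the default tags as a plain map.
final class MetricTagsSketch {

    /** Builds the six default tags in the order they are listed above. */
    static Map<String, String> defaultTags(String jobName, String jobId, String clusterIdentifier,
                                           String taskId, String construct, String constructClass) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("jobName", jobName);
        tags.put("jobId", jobId);
        tags.put("clusterIdentifier", clusterIdentifier);
        tags.put("taskId", taskId);
        tags.put("construct", construct);
        tags.put("class", constructClass);
        return tags;
    }

    /** Returns true when every filter entry appears in the tags with the same value. */
    static boolean matches(Map<String, String> tags, Map<String, String> filter) {
        return tags.entrySet().containsAll(filter.entrySet());
    }
}
```

With tags in this shape, a consumer of the metrics could, for example, keep only the metrics whose `construct` tag is `extractor` for a given `jobId`.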

This is the list of operational metrics implemented by default, grouped by construct.

Extractor Metrics

  • gobblin.extractor.records.read: meter for records read.
  • gobblin.extractor.records.failed: meter for records that failed to be read.
  • gobblin.extractor.extract.time: timer for reading of records.

Converter Metrics

  • gobblin.converter.records.in: meter for records going into the converter.
  • gobblin.converter.records.out: meter for records emitted by the converter.
  • gobblin.converter.records.failed: meter for records that failed to be converted.
  • gobblin.converter.convert.time: timer for conversion time of each record.

Fork Operator Metrics

  • gobblin.fork.operator.records.in: meter for records going into the fork operator.
  • gobblin.fork.operator.forks.out: meter for records going out of the fork operator (each record is counted once for each fork it is emitted to).
  • gobblin.fork.operator.fork.time: timer for forking of each record.
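
Because a record is counted once per fork it is emitted to, `gobblin.fork.operator.forks.out` can be larger or smaller than `gobblin.fork.operator.records.in`. The sketch below only illustrates that counting rule; `ForkCountSketch` and its representation of a fork decision as a `boolean[]` are assumptions made for the example, not Gobblin's implementation.

```java
import java.util.List;

// Hypothetical illustration of the counting rule, not Gobblin code.
final class ForkCountSketch {

    /** Holds the totals that the two meters would have counted. */
    static final class Counts {
        final long recordsIn;
        final long forksOut;

        Counts(long recordsIn, long forksOut) {
            this.recordsIn = recordsIn;
            this.forksOut = forksOut;
        }
    }

    /**
     * Each boolean[] stands for one record; element i is true when the
     * record is emitted to fork i.
     */
    static Counts count(List<boolean[]> forkDecisions) {
        long in = 0;
        long out = 0;
        for (boolean[] decision : forkDecisions) {
            in++; // one mark on records.in per record
            for (boolean emitted : decision) {
                if (emitted) {
                    out++; // one mark on forks.out per fork that receives the record
                }
            }
        }
        return new Counts(in, out);
    }
}
```

For instance, with two forks, a record sent to both forks adds 1 to the input count and 2 to the output count, while a record sent to neither fork adds 1 to the input count only.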

Row Level Policy Metrics

  • gobblin.qualitychecker.records.in: meter for records going into the row level policy.
  • gobblin.qualitychecker.records.passed: meter for records passing the row level policy check.
  • gobblin.qualitychecker.records.failed: meter for records failing the row level policy check.
  • gobblin.qualitychecker.check.time: timer for row level policy checking of each record.

Data Writer Metrics

  • gobblin.writer.records.in: meter for records requested to be written.
  • gobblin.writer.records.written: meter for records actually written.
  • gobblin.writer.records.failed: meter for records that failed to be written.
  • gobblin.writer.write.time: timer for writing each record.
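
The per-construct metrics above follow a common pattern: mark an input meter, time the operation, then mark either a success or a failure meter. The following is a minimal sketch of that pattern for the data writer metrics, using plain counters as stand-ins for meters and a summed duration as a stand-in for a timer. `WriterMetricsSketch` and the `Consumer` delegate are hypothetical; the real instrumentation is built on Gobblin Metrics, and this sketch is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical wrapper, not Gobblin code: plain counters stand in for meters.
final class WriterMetricsSketch<T> {
    private final Map<String, Long> counts = new LinkedHashMap<>();
    private final Consumer<T> delegate; // the actual write action (assumed)
    private long totalWriteNanos;       // stand-in for gobblin.writer.write.time

    WriterMetricsSketch(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    /** Writes one record, updating the counters around the delegate call. */
    void write(T record) {
        mark("gobblin.writer.records.in");
        long start = System.nanoTime();
        try {
            delegate.accept(record);
            mark("gobblin.writer.records.written");
        } catch (RuntimeException e) {
            mark("gobblin.writer.records.failed");
            throw e; // the failure is counted, then propagated to the caller
        } finally {
            totalWriteNanos += System.nanoTime() - start;
        }
    }

    /** Returns how many times the named counter was marked (0 if never). */
    long count(String name) {
        return counts.getOrDefault(name, 0L);
    }

    long totalWriteNanos() {
        return totalWriteNanos;
    }

    private void mark(String name) {
        counts.merge(name, 1L, Long::sum);
    }
}
```

In this sketch, `records.in` always equals `records.written` plus `records.failed`, which is a useful sanity check when reading the real metrics as well.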

Runtime Events

The Gobblin ETL runtime emits events marking its progress. All events have the following metadata:

  • jobName: Gobblin generated name for the job.
  • jobId: Gobblin generated id for the job.
  • clusterIdentifier: string identifier of the cluster / host where the job was run. Obtained from the resource manager, the job tracker, or the name of the host.
  • taskId: Gobblin generated id for the task that generated the event (if applicable).

This is the list of events that are emitted by the Gobblin runtime:

Job Progression Events

  • LockInUse: emitted if a job fails because it cannot acquire a lock.
  • WorkUnitsMissing: emitted if a job exits because the source failed to get work units.
  • WorkUnitsEmpty: emitted if a job exits because there were no work units to process.
  • TasksSubmitted: emitted when tasks are submitted for execution. Metadata: tasksCount (number of tasks submitted).
  • TaskFailed: emitted when a task fails. Metadata: taskId (id of the failed task).
  • Job_Successful: emitted at the end of a successful job.
  • Job_Failed: emitted at the end of a failed job.

Job Timing Events

These events report how long certain parts of the execution take. Each timing event contains the following metadata:

  • startTime: timestamp when the timed processing started.
  • endTime: timestamp when the timed processing finished.
  • durationMillis: duration in milliseconds of the timed processing.
  • eventType: always "timingEvent" for timing events.
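
To make the relationship between these fields concrete, here is a small sketch that assembles the metadata for one timing event. It assumes the timestamps are milliseconds since the epoch and that metadata values are strings; neither is stated above. `TimingEventSketch` is a hypothetical name, not a Gobblin class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Gobblin code.
final class TimingEventSketch {

    /**
     * Builds the four timing-event metadata entries.
     * Assumes both timestamps are epoch milliseconds.
     */
    static Map<String, String> metadata(long startTimeMillis, long endTimeMillis) {
        if (endTimeMillis < startTimeMillis) {
            throw new IllegalArgumentException("endTime must not precede startTime");
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("startTime", Long.toString(startTimeMillis));
        metadata.put("endTime", Long.toString(endTimeMillis));
        // durationMillis is derived from the two timestamps
        metadata.put("durationMillis", Long.toString(endTimeMillis - startTimeMillis));
        metadata.put("eventType", "timingEvent");
        return metadata;
    }
}
```

A consumer can therefore pick out timing events by checking that `eventType` equals `timingEvent` before reading `durationMillis`.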

The following timing events are emitted:

  • FullJobExecutionTimer: times the entire job execution.
  • WorkUnitsCreationTimer: times the creation of work units.
  • WorkUnitsPreparationTime: times the preparation of work units.
  • JobRunTimer: times the actual running of the job (i.e. processing of all work units).
  • JobCommitTimer: times the committing of work units.
  • JobCleanupTimer: times the job cleanup.
  • JobLocalSetupTimer: times the setup of a local job.
  • JobMrStagingDataCleanTimer: times the deletion of staging directories from previous work units (MR mode).
  • JobMrDistributedCacheSetupTimer: times the setup of the distributed cache (MR mode).
  • JobMrSetupTimer: times the setup of the MR job (MR mode).
  • JobMrRunTimer: times the execution of the MR job (MR mode).