InfluxDBSink: writes Spark metrics and application info to InfluxDB in near real time.
Use this mode to monitor the Spark execution workload.
Use it to feed Grafana dashboards and analytics of job execution.
How to use: attach the InfluxDBSink to a Spark Context using the extra listener infrastructure.
Example:
--conf spark.extraListeners=ch.cern.sparkmeasure.InfluxDBSink
Configuration for InfluxDBSink is handled with Spark conf parameters:
spark.sparkmeasure.influxdbURL, example value: http://mytestInfluxDB:8086
spark.sparkmeasure.influxdbUsername (can be empty)
spark.sparkmeasure.influxdbPassword (can be empty)
spark.sparkmeasure.influxdbName, defaults to "sparkmeasure"
spark.sparkmeasure.influxdbStagemetrics, boolean, default is false
This code depends on "influxdb-java"; you may need to add the dependency:
--packages org.influxdb:influxdb-java:2.14
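The options above can be combined into a single spark-submit invocation. The following is a minimal sketch, not a definitive recipe: the application jar name (my_app.jar) is hypothetical, the URL is the example value from the list above, and influxdbStagemetrics is set to true only to show how a non-default value is passed. The sparkMeasure package itself must also be on the classpath; its coordinates depend on your Spark and Scala versions and are left out here.

```shell
#!/bin/sh
# Example values; replace them with your own InfluxDB endpoint and database name.
INFLUXDB_URL="http://mytestInfluxDB:8086"
INFLUXDB_NAME="sparkmeasure"

# Assemble the dependency, the listener and its configuration into one option string.
SPARK_OPTS="--packages org.influxdb:influxdb-java:2.14"
SPARK_OPTS="$SPARK_OPTS --conf spark.extraListeners=ch.cern.sparkmeasure.InfluxDBSink"
SPARK_OPTS="$SPARK_OPTS --conf spark.sparkmeasure.influxdbURL=$INFLUXDB_URL"
SPARK_OPTS="$SPARK_OPTS --conf spark.sparkmeasure.influxdbName=$INFLUXDB_NAME"
SPARK_OPTS="$SPARK_OPTS --conf spark.sparkmeasure.influxdbStagemetrics=true"

# Print the command instead of running it, so the sketch works without a Spark installation.
# my_app.jar is a hypothetical application jar.
echo "spark-submit $SPARK_OPTS my_app.jar"
```

Username and password are omitted because both may be empty; if your InfluxDB instance requires them, add spark.sparkmeasure.influxdbUsername and spark.sparkmeasure.influxdbPassword in the same way.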
InfluxDBSinkExtended: provides additional, verbose info on Task execution
Use: --conf spark.extraListeners=ch.cern.sparkmeasure.InfluxDBSinkExtended
InfluxDBSink: the amount of data generated is relatively small in most applications, O(number_of_stages)
InfluxDBSinkExtended: can generate a large amount of data, O(number_of_tasks); use with care
Linear Supertypes
SparkListener, SparkListenerInterface, AnyRef, Any