Metrics reference

Reference for the available Prometheus metrics and their common labels.

Prometheus metric types

Each metric that Astra exposes uses one of the following three Prometheus types.

counter

Monotonically increasing value, which can only increase or be reset to zero.

gauge

Single numerical value that can go up or down.

summary

Provides the total count and sum of observations, along with configurable quantiles.

Tip: The most common quantile values include 0.0, 0.5, 0.75, 0.9, 0.95, 0.98, 0.99, 0.999, and 1.0.

Tip: For each summary, up to three series are exposed:

<metric_name>{quantile="<quantile>"}
<metric_name>_sum
<metric_name>_count
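
As an illustration of how the three summary series relate, the mean observed value is the `_sum` series divided by the `_count` series. The sketch below parses a few lines of Prometheus text exposition and computes that mean. The sample values and the helper names `parse_samples` and `summary_mean` are made up for this example and are not part of Astra; the parser is deliberately minimal and ignores lines that carry a trailing timestamp.

```python
import re

# Matches lines such as: name{label="value"} 1.5   or   name 42
# Assumption: no trailing timestamp and no "}" inside label values.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)


def parse_samples(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def summary_mean(samples, metric_name):
    """Return <metric_name>_sum / <metric_name>_count, or None when count is 0."""
    total = sum(v for n, _, v in samples if n == metric_name + "_sum")
    count = sum(v for n, _, v in samples if n == metric_name + "_count")
    return total / count if count else None


# Hypothetical scrape output for one summary (values are invented).
EXAMPLE = """
# TYPE rollover_timer_seconds summary
rollover_timer_seconds{quantile="0.5"} 1.2
rollover_timer_seconds{quantile="0.99"} 4.8
rollover_timer_seconds_sum 30.0
rollover_timer_seconds_count 20
"""

samples = parse_samples(EXAMPLE)
mean_seconds = summary_mean(samples, "rollover_timer_seconds")
print(mean_seconds)
```

The quantile series cannot be averaged across instances in a meaningful way, which is one reason the `_sum` and `_count` series are often the more useful pair for aggregation.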

Common labels

Labels automatically applied to all exported Prometheus metrics.

astra_cluster_name

Cluster name, as defined in clusterConfig.clusterName.

astra_component

Node type, one of the valid node roles:

query, index, cache, manager, recovery, preprocessor

astra_env

Cluster environment, as defined in clusterConfig.env.
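
Because these three labels are attached to every exported metric, they are a natural way to narrow a query to one cluster, role, and environment. The following is a minimal sketch of building a PromQL label selector from them. The function name `common_label_selector`, the cluster name, and the environment value are hypothetical, and pairing the selector with `live_messages_indexed` is only for illustration.

```python
# Node roles listed above for the astra_component label.
VALID_COMPONENTS = {"query", "index", "cache", "manager", "recovery", "preprocessor"}


def common_label_selector(cluster_name, component, env):
    """Build a PromQL label selector from the three common labels."""
    if component not in VALID_COMPONENTS:
        raise ValueError(f"unknown astra_component: {component!r}")
    labels = {
        "astra_cluster_name": cluster_name,
        "astra_component": component,
        "astra_env": env,
    }
    parts = []
    for key, value in labels.items():
        # Escape backslashes and double quotes so values stay valid PromQL strings.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return "{" + ",".join(parts) + "}"


# Hypothetical cluster and environment names.
selector = common_label_selector("example-cluster", "index", "dev")
print("live_messages_indexed" + selector)
```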

Astra metrics

astra_index_commits_seconds | summary

astra_index_commits_seconds_max | gauge

astra_index_final_merges_seconds | summary

astra_index_final_merges_seconds_max | gauge

astra_index_merge_count_total | counter

astra_index_merge_stall_threads | gauge

astra_index_merge_stall_time_ms_total | counter

astra_index_refreshes_seconds | summary

astra_index_refreshes_seconds_max | gauge

astra_preprocessor_bulk_ingest_seconds | summary

astra_preprocessor_bulk_ingest_seconds_max | gauge

astra_preprocessor_incoming_byte_total | counter

astra_preprocessor_incoming_docs_total | counter

bulk_ingest_producer_batch_size | gauge

bulk_ingest_producer_failed_set_response_total | counter

bulk_ingest_producer_kafka_restart_timer_seconds | summary

bulk_ingest_producer_kafka_restart_timer_seconds_max | gauge

bulk_ingest_producer_stall_counter_total | counter

cached_cache_slots_size | gauge

The number of cache slot znodes stored in Zookeeper.

The current state of the cache slot:

FREE, ASSIGNED, LOADING, LIVE, EVICT, EVICTING, UNRECOGNIZED

cached_recovery_nodes_size | gauge

The number of recovery node znodes stored in Zookeeper.

cached_recovery_tasks_size | gauge

The number of recovery task znodes stored in Zookeeper.

cached_replica_nodes_size | gauge

The number of replica znodes stored in Zookeeper.

cached_service_nodes_size | gauge

The number of dataset znodes stored in Zookeeper.

cached_snapshots_size | gauge

The number of snapshot znodes stored in Zookeeper.

chunk_assignment_timer_seconds | summary

chunk_assignment_timer_seconds_max | gauge

chunk_eviction_timer_seconds | summary

chunk_eviction_timer_seconds_max | gauge

convert_and_duplicate_field_total | counter

convert_errors_total | counter

convert_field_value_total | counter

distributed_query_apdex_frustrated_total | counter

distributed_query_apdex_satisfied_total | counter

distributed_query_apdex_tolerating_total | counter
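
This reference does not describe how the three `distributed_query_apdex_*` counters are meant to be combined. Assuming they follow the conventional Apdex definition (an assumption, not something confirmed here), a score can be derived as (satisfied + tolerating / 2) / total. The sketch below applies that formula to counter increases over some window; the function name `apdex_score` and the input numbers are invented for the example.

```python
def apdex_score(satisfied, tolerating, frustrated):
    """Conventional Apdex: (satisfied + tolerating / 2) / total samples.

    Inputs are expected to be counter increases over the same time window,
    not raw counter totals.
    """
    total = satisfied + tolerating + frustrated
    if total == 0:
        return None  # no queries observed in the window
    return (satisfied + tolerating / 2) / total


# Hypothetical increases over one window.
score = apdex_score(satisfied=80, tolerating=10, frustrated=10)
print(score)
```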

distributed_query_snapshots_with_replicas_total | counter

distributed_query_total_snapshots_total | counter

dropped_fields_total | counter

hpa_cache_demand_factor_rep1 | gauge

hpa_cache_demand_factor_rep2 | gauge

index_files_upload_failed_total | counter

index_files_upload_total | counter

live_bytes_dir | gauge

live_bytes_indexed | gauge

live_messages_indexed | gauge

messages_failed_total | counter

messages_received_total | counter

metadata_failed_total | counter

preprocessor_dataset_rate_limit_reload_timer_seconds | summary

preprocessor_dataset_rate_limit_reload_timer_seconds_max | gauge

preprocessor_rate_limit_bytes_dropped_total | counter

service

preprocessor_rate_limit_messages_dropped_total | counter

service

records_failed_total | counter

records_received_total | counter

recovery_task_assignment_timer_seconds | summary

recovery_task_assignment_timer_seconds_max | gauge

recovery_tasks_assigned_total | counter

recovery_tasks_assignment_failures_total | counter

recovery_tasks_created_total | counter

recovery_tasks_insufficient_capacity_total | counter

replica_assign_available_capacity | gauge

replica_assign_failed_total | counter

replica_assign_pending | gauge

replica_assign_succeeded_total | counter

replica_assign_timer_seconds | summary

replica_assign_timer_seconds_max | gauge

replica_assignment_timer_seconds | summary

replica_assignment_timer_seconds_max | gauge

replica_delete_failed_total | counter

replica_delete_success_total | counter

replica_delete_timer_seconds | summary

replica_delete_timer_seconds_max | gauge

replica_mark_evict_failed_total | counter

replica_mark_evict_succeeded_total | counter

replica_mark_evict_timer_seconds | summary

replica_mark_evict_timer_seconds_max | gauge

replicas_created_total | counter

replicas_failed_total | counter

rollover_timer_seconds | summary

rollover_timer_seconds_max | gauge

rollovers_completed_total | counter

rollovers_failed_total | counter

rollovers_initiated_total | counter

search_metadata_total_change_counter_total | counter

snapshot_delete_failed_total | counter

snapshot_delete_success_total | counter

snapshot_delete_timer_seconds | summary

snapshot_delete_timer_seconds_max | gauge

snapshot_timer_seconds | summary

snapshot_timer_seconds_max | gauge

stale_snapshot_delete_failed_total | counter

stale_snapshot_delete_success_total | counter

total_fields_total | counter

Armeria metrics

armeria_server_file_vfs_cache_eviction_weight_total | counter

route

hostname_pattern

armeria_executor_queue_remaining_tasks | gauge

The number of additional elements that this queue can ideally accept without blocking

armeria_executor_pool_size_threads | gauge

The current number of threads in the pool

armeria_build_info | gauge

A metric with a constant ‘1’ value labeled by version and commit hash from which Armeria was built.

repo_status

version

armeria_server_file_vfs_cache_requests_total | counter

vfs

route

hostname_pattern

armeria_netty_common_event_loop_pending_tasks | gauge

armeria_executor_completed_tasks_total | counter

The approximate total number of tasks that have completed execution

armeria_server_router_virtual_host_cache_estimated_size | gauge

armeria_server_pending_responses | gauge

armeria_executor_queued_tasks | gauge

The approximate number of tasks that are queued for execution

armeria_server_router_virtual_host_cache_evictions_total | counter

armeria_server_router_virtual_host_cache_requests_total | counter

hostname_pattern

armeria_server_connections | gauge

armeria_netty_common_event_loop_workers | gauge

armeria_server_router_virtual_host_cache_eviction_weight_total | counter

armeria_executor_pool_max_threads | gauge

The maximum allowed number of threads in the pool

armeria_server_file_vfs_cache_evictions_total | counter

route

hostname_pattern

armeria_server_exceptions_unhandled_total | counter

armeria_server_connections_lifespan_seconds_max | gauge

armeria_server_connections_lifespan_seconds | summary

quantile

armeria_executor_active_threads | gauge

The approximate number of threads that are actively executing tasks

armeria_executor_pool_core_threads | gauge

The core number of threads for the pool

armeria_server_file_vfs_cache_estimated_size | gauge

route

hostname_pattern

Kafka metrics

kafka_app_info_start_time_ms | gauge

Metric indicating start-time-ms

client_id

kafka_consumer_commit_sync_time_ns_total | counter

The total time the consumer has spent in commitSync in nanoseconds

client_id

kafka_consumer_committed_time_ns_total | counter

The total time the consumer has spent in committed in nanoseconds

client_id

kafka_consumer_connection_close_rate | gauge

The number of connections closed per second

client_id

kafka_consumer_connection_close_total | counter

The total number of connections closed

client_id

kafka_consumer_connection_count | gauge

The current number of active connections.

client_id

kafka_consumer_connection_creation_rate | gauge

The number of new connections established per second

client_id

kafka_consumer_connection_creation_total | counter

The total number of new connections established

client_id

kafka_consumer_coordinator_assigned_partitions | gauge

The number of partitions currently assigned to this consumer

client_id

kafka_consumer_coordinator_commit_latency_avg | gauge

The average time taken for a commit request

client_id

kafka_consumer_coordinator_commit_latency_max | gauge

The max time taken for a commit request

client_id

kafka_consumer_coordinator_commit_rate | gauge

The number of commit calls per second

client_id

kafka_consumer_coordinator_commit_total | counter

The total number of commit calls

client_id

kafka_consumer_coordinator_failed_rebalance_rate_per_hour | gauge

The number of failed rebalance events per hour

client_id

kafka_consumer_coordinator_failed_rebalance_total | counter

The total number of failed rebalance events

client_id

kafka_consumer_coordinator_heartbeat_rate | gauge

The number of heartbeats per second

client_id

kafka_consumer_coordinator_heartbeat_response_time_max | gauge

The max time taken to receive a response to a heartbeat request

client_id

kafka_consumer_coordinator_heartbeat_total | counter

The total number of heartbeats

client_id

kafka_consumer_coordinator_join_rate | gauge

The number of group joins per second

client_id

kafka_consumer_coordinator_join_time_avg | gauge

The average time taken for a group rejoin

client_id

kafka_consumer_coordinator_join_time_max | gauge

The max time taken for a group rejoin

client_id

kafka_consumer_coordinator_join_total | counter

The total number of group joins

client_id

kafka_consumer_coordinator_last_heartbeat_seconds_ago | gauge

The number of seconds since the last coordinator heartbeat was sent

client_id

kafka_consumer_coordinator_last_rebalance_seconds_ago | gauge

The number of seconds since the last successful rebalance event

client_id

kafka_consumer_coordinator_partition_assigned_latency_avg | gauge

The average time taken for a partition-assigned rebalance listener callback

client_id

kafka_consumer_coordinator_partition_assigned_latency_max | gauge

The max time taken for a partition-assigned rebalance listener callback

client_id

kafka_consumer_coordinator_partition_lost_latency_avg | gauge

The average time taken for a partition-lost rebalance listener callback

client_id

kafka_consumer_coordinator_partition_lost_latency_max | gauge

The max time taken for a partition-lost rebalance listener callback

client_id

kafka_consumer_coordinator_partition_revoked_latency_avg | gauge

The average time taken for a partition-revoked rebalance listener callback

client_id

kafka_consumer_coordinator_partition_revoked_latency_max | gauge

The max time taken for a partition-revoked rebalance listener callback

client_id

kafka_consumer_coordinator_rebalance_latency_avg | gauge

The average time taken for a group to complete a successful rebalance, which may include several failed retries before it succeeds

client_id

kafka_consumer_coordinator_rebalance_latency_max | gauge

The max time taken for a group to complete a successful rebalance, which may include several failed retries before it succeeds

client_id

kafka_consumer_coordinator_rebalance_latency_total | counter

The total number of milliseconds this consumer has spent in successful rebalances since creation

client_id

kafka_consumer_coordinator_rebalance_rate_per_hour | gauge

The number of successful rebalance events per hour. Each event may include several failed retries before it succeeds

client_id

kafka_consumer_coordinator_rebalance_total | counter

The total number of successful rebalance events. Each event may include several failed retries before it succeeds

client_id

kafka_consumer_coordinator_sync_rate | gauge

The number of group syncs per second

client_id

kafka_consumer_coordinator_sync_time_avg | gauge

The average time taken for a group sync

client_id

kafka_consumer_coordinator_sync_time_max | gauge

The max time taken for a group sync

client_id

kafka_consumer_coordinator_sync_total | counter

The total number of group syncs

client_id

kafka_consumer_failed_authentication_rate | gauge

The number of connections with failed authentication per second

client_id

kafka_consumer_failed_authentication_total | counter

The total number of connections with failed authentication

client_id

kafka_consumer_failed_reauthentication_rate | gauge

The number of failed re-authentication of connections per second

client_id

kafka_consumer_failed_reauthentication_total | counter

The total number of failed re-authentication of connections

client_id

kafka_consumer_fetch_manager_bytes_consumed_rate | gauge

The average number of bytes consumed per second for a topic

topic

client_id

kafka_consumer_fetch_manager_bytes_consumed_total | counter

The total number of bytes consumed for a topic

topic

client_id

kafka_consumer_fetch_manager_fetch_latency_avg | gauge

The average time taken for a fetch request.

client_id

kafka_consumer_fetch_manager_fetch_latency_max | gauge

The max time taken for any fetch request.

client_id

kafka_consumer_fetch_manager_fetch_rate | gauge

The number of fetch requests per second.

client_id

kafka_consumer_fetch_manager_fetch_size_avg | gauge

The average number of bytes fetched per request for a topic

topic

client_id

kafka_consumer_fetch_manager_fetch_size_max | gauge

The maximum number of bytes fetched per request for a topic

topic

client_id

kafka_consumer_fetch_manager_fetch_throttle_time_avg | gauge

The average throttle time in ms

client_id

kafka_consumer_fetch_manager_fetch_throttle_time_max | gauge

The maximum throttle time in ms

client_id

kafka_consumer_fetch_manager_fetch_total | counter

The total number of fetch requests.

client_id

kafka_consumer_fetch_manager_preferred_read_replica | gauge

The current read replica for the partition, or -1 if reading from leader

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_consumed_rate | gauge

The average number of records consumed per second for a topic

topic

client_id

kafka_consumer_fetch_manager_records_consumed_total | counter

The total number of records consumed for a topic

topic

client_id

kafka_consumer_fetch_manager_records_lag | gauge

The latest lag of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_lag_avg | gauge

The average lag of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_lag_max | gauge

The max lag of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_lead | gauge

The latest lead of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_lead_avg | gauge

The average lead of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_lead_min | gauge

The min lead of the partition

kafka_version

topic

client_id

kafka_consumer_fetch_manager_records_per_request_avg | gauge

The average number of records in each request for a topic

topic

client_id

kafka_consumer_incoming_byte_rate | gauge

The number of bytes read off all sockets per second

client_id

kafka_consumer_incoming_byte_total | counter

The total number of bytes read off all sockets

client_id

kafka_consumer_io_ratio | gauge

Deprecated: The fraction of time the I/O thread spent doing I/O

client_id

kafka_consumer_io_time_ns_avg | gauge

The average length of time for I/O per select call in nanoseconds.

client_id

kafka_consumer_io_time_ns_total | counter

The total time the I/O thread spent doing I/O

client_id

kafka_consumer_io_wait_ratio | gauge

Deprecated: The fraction of time the I/O thread spent waiting

client_id

kafka_consumer_io_wait_time_ns_avg | gauge

The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.

client_id

kafka_consumer_io_wait_time_ns_total | counter

The total time the I/O thread spent waiting

client_id

kafka_consumer_io_waittime_total | counter

Deprecated: The total time the I/O thread spent waiting

client_id

kafka_consumer_iotime_total | counter

Deprecated: The total time the I/O thread spent doing I/O

client_id

kafka_consumer_last_poll_seconds_ago | gauge

The number of seconds since the last poll() invocation.

client_id

kafka_consumer_network_io_rate | gauge

The number of network operations (reads or writes) on all connections per second

client_id

kafka_consumer_network_io_total | counter

The total number of network operations (reads or writes) on all connections

client_id

kafka_consumer_node_incoming_byte_rate | gauge

The number of incoming bytes per second

client_id

node_id

kafka_consumer_node_incoming_byte_total | counter

The total number of incoming bytes

client_id

node_id

kafka_consumer_node_outgoing_byte_rate | gauge

The number of outgoing bytes per second

client_id

node_id

kafka_consumer_node_outgoing_byte_total | counter

The total number of outgoing bytes

client_id

node_id

kafka_consumer_node_request_latency_avg | gauge

client_id

node_id

kafka_consumer_node_request_latency_max | gauge

client_id

node_id

kafka_consumer_node_request_rate | gauge

The number of requests sent per second

client_id

node_id

kafka_consumer_node_request_size_avg | gauge

The average size of requests sent.

client_id

node_id

kafka_consumer_node_request_size_max | gauge

The maximum size of any request sent.

client_id

node_id

kafka_consumer_node_request_total | counter

The total number of requests sent

client_id

node_id

kafka_consumer_node_response_rate | gauge

The number of responses received per second

client_id

node_id

kafka_consumer_node_response_total | counter

The total number of responses received

client_id

node_id

kafka_consumer_outgoing_byte_rate | gauge

The number of outgoing bytes sent to all servers per second

client_id

kafka_consumer_outgoing_byte_total | counter

The total number of outgoing bytes sent to all servers

client_id

kafka_consumer_poll_idle_ratio_avg | gauge

The average fraction of time the consumer’s poll() is idle as opposed to waiting for the user code to process records.

client_id

kafka_consumer_reauthentication_latency_avg | gauge

The average latency observed due to re-authentication

client_id

kafka_consumer_reauthentication_latency_max | gauge

The max latency observed due to re-authentication

client_id

kafka_consumer_request_rate | gauge

The number of requests sent per second

client_id

kafka_consumer_request_size_avg | gauge

The average size of requests sent.

client_id

kafka_consumer_request_size_max | gauge

The maximum size of any request sent.

client_id

kafka_consumer_request_total | counter

The total number of requests sent

client_id

kafka_consumer_response_rate | gauge

The number of responses received per second

client_id

kafka_consumer_response_total | counter

The total number of responses received

client_id

kafka_consumer_select_rate | gauge

The number of times the I/O layer checked for new I/O to perform per second

client_id

kafka_consumer_select_total | counter

The total number of times the I/O layer checked for new I/O to perform

client_id

kafka_consumer_successful_authentication_no_reauth_total | counter

The total number of connections with successful authentication where the client does not support re-authentication

client_id

kafka_consumer_successful_authentication_rate | gauge

The number of connections with successful authentication per second

client_id

kafka_consumer_successful_authentication_total | counter

The total number of connections with successful authentication

client_id

kafka_consumer_successful_reauthentication_rate | gauge

The number of successful re-authentication of connections per second

client_id

kafka_consumer_successful_reauthentication_total | counter

The total number of successful re-authentication of connections

client_id

kafka_consumer_time_between_poll_avg | gauge

The average delay between invocations of poll() in milliseconds.

client_id

kafka_consumer_time_between_poll_max | gauge

The max delay between invocations of poll() in milliseconds.

client_id

kafka_producer_batch_size_avg | gauge

The average number of bytes sent per partition per-request.

client_id

kafka_producer_batch_size_max | gauge

The max number of bytes sent per partition per-request.

client_id

kafka_producer_batch_split_rate | gauge

The average number of batch splits per second

client_id

kafka_producer_batch_split_total | counter

The total number of batch splits

client_id

kafka_producer_buffer_available_bytes | gauge

The total amount of buffer memory that is not being used (either unallocated or in the free list).

client_id

kafka_producer_buffer_exhausted_rate | gauge

The average per-second number of record sends that are dropped due to buffer exhaustion

client_id

kafka_producer_buffer_exhausted_total | counter

The total number of record sends that are dropped due to buffer exhaustion

client_id

kafka_producer_buffer_total_bytes | gauge

The maximum amount of buffer memory the client can use (whether or not it is currently used).

client_id

kafka_producer_bufferpool_wait_ratio | gauge

The fraction of time an appender waits for space allocation.

client_id

kafka_producer_bufferpool_wait_time_ns_total | counter

The total time in nanoseconds an appender waits for space allocation.

client_id

kafka_producer_bufferpool_wait_time_total | counter

Deprecated: The total time an appender waits for space allocation.

client_id

kafka_producer_compression_rate_avg | gauge

The average compression rate of record batches, defined as the average ratio of the compressed batch size over the uncompressed size.

client_id

kafka_producer_connection_close_rate | gauge

The number of connections closed per second

client_id

kafka_producer_connection_close_total | counter

The total number of connections closed

client_id

kafka_producer_connection_count | gauge

The current number of active connections.

client_id

kafka_producer_connection_creation_rate | gauge

The number of new connections established per second

client_id

kafka_producer_connection_creation_total | counter

The total number of new connections established

client_id

kafka_producer_failed_authentication_rate | gauge

The number of connections with failed authentication per second

client_id

kafka_producer_failed_authentication_total | counter

The total number of connections with failed authentication

client_id

kafka_producer_failed_reauthentication_rate | gauge

The number of failed re-authentication of connections per second

client_id

kafka_producer_failed_reauthentication_total | counter

The total number of failed re-authentication of connections

client_id

kafka_producer_flush_time_ns_total | counter

Total time producer has spent in flush in nanoseconds.

client_id

kafka_producer_incoming_byte_rate | gauge

The number of bytes read off all sockets per second

client_id

kafka_producer_incoming_byte_total | counter

The total number of bytes read off all sockets

client_id

kafka_producer_io_ratio | gauge

Deprecated: The fraction of time the I/O thread spent doing I/O

client_id

kafka_producer_io_time_ns_avg | gauge

The average length of time for I/O per select call in nanoseconds.

client_id

kafka_producer_io_time_ns_total | counter

The total time the I/O thread spent doing I/O

client_id

kafka_producer_io_wait_ratio | gauge

Deprecated: The fraction of time the I/O thread spent waiting

client_id

kafka_producer_io_wait_time_ns_avg | gauge

The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.

client_id

kafka_producer_io_wait_time_ns_total | counter

The total time the I/O thread spent waiting

client_id

kafka_producer_io_waittime_total | counter

Deprecated: The total time the I/O thread spent waiting

client_id

kafka_producer_iotime_total | counter

Deprecated: The total time the I/O thread spent doing I/O

client_id

kafka_producer_metadata_age | gauge

The age in seconds of the current producer metadata being used.

client_id

kafka_producer_metadata_wait_time_ns_total | counter

Total time producer has spent waiting on topic metadata in nanoseconds.

client_id

kafka_producer_network_io_rate | gauge

The number of network operations (reads or writes) on all connections per second

client_id

kafka_producer_network_io_total | counter

The total number of network operations (reads or writes) on all connections

client_id

kafka_producer_node_incoming_byte_rate | gauge

The number of incoming bytes per second

client_id

node_id

kafka_producer_node_incoming_byte_total | counter

The total number of incoming bytes

client_id

node_id

kafka_producer_node_outgoing_byte_rate | gauge

The number of outgoing bytes per second

client_id

node_id

kafka_producer_node_outgoing_byte_total | counter

The total number of outgoing bytes

client_id

node_id

kafka_producer_node_request_latency_avg | gauge

client_id

node_id

kafka_producer_node_request_latency_max | gauge

client_id

node_id

kafka_producer_node_request_rate | gauge

The number of requests sent per second

client_id

node_id

kafka_producer_node_request_size_avg | gauge

The average size of requests sent.

client_id

node_id

kafka_producer_node_request_size_max | gauge

The maximum size of any request sent.

client_id

node_id

kafka_producer_node_request_total | counter

The total number of requests sent

client_id

node_id

kafka_producer_node_response_rate | gauge

The number of responses received per second

client_id

node_id

kafka_producer_node_response_total | counter

The total number of responses received

client_id

node_id

kafka_producer_outgoing_byte_rate | gauge

The number of outgoing bytes sent to all servers per second

client_id

kafka_producer_outgoing_byte_total | counter

The total number of outgoing bytes sent to all servers

client_id

kafka_producer_produce_throttle_time_avg | gauge

The average time in ms a request was throttled by a broker

client_id

kafka_producer_produce_throttle_time_max | gauge

The maximum time in ms a request was throttled by a broker

client_id

kafka_producer_reauthentication_latency_avg | gauge

The average latency observed due to re-authentication

client_id

kafka_producer_reauthentication_latency_max | gauge

The max latency observed due to re-authentication

client_id

kafka_producer_record_error_rate | gauge

The average per-second number of record sends that resulted in errors

client_id

kafka_producer_record_error_total | counter

The total number of record sends that resulted in errors

client_id

kafka_producer_record_queue_time_avg | gauge

The average time in ms record batches spent in the send buffer.

client_id

kafka_producer_record_queue_time_max | gauge

The maximum time in ms record batches spent in the send buffer.

client_id

kafka_producer_record_retry_rate | gauge

The average per-second number of retried record sends

client_id

kafka_producer_record_retry_total | counter

The total number of retried record sends

client_id

kafka_producer_record_send_rate | gauge

The average number of records sent per second.

client_id

kafka_producer_record_send_total | counter

The total number of records sent.

client_id

kafka_producer_record_size_avg | gauge

The average record size

client_id

kafka_producer_record_size_max | gauge

The maximum record size

client_id

kafka_producer_records_per_request_avg | gauge

The average number of records per request.

client_id

kafka_producer_request_latency_avg | gauge

The average request latency in ms

client_id

kafka_producer_request_latency_max | gauge

The maximum request latency in ms

client_id

kafka_producer_request_rate | gauge

The number of requests sent per second

client_id

kafka_producer_request_size_avg | gauge

The average size of requests sent.

client_id

kafka_producer_request_size_max | gauge

The maximum size of any request sent.

client_id

kafka_producer_request_total | counter

The total number of requests sent

client_id

kafka_producer_requests_in_flight | gauge

The current number of in-flight requests awaiting a response.

client_id

kafka_producer_response_rate | gauge

The number of responses received per second

client_id

kafka_producer_response_total | counter

The total number of responses received

client_id

kafka_producer_select_rate | gauge

The number of times the I/O layer checked for new I/O to perform per second

client_id

kafka_producer_select_total | counter

The total number of times the I/O layer checked for new I/O to perform

client_id

kafka_producer_successful_authentication_no_reauth_total | counter

The total number of connections with successful authentication where the client does not support re-authentication

client_id

kafka_producer_successful_authentication_rate | gauge

The number of connections with successful authentication per second

client_id

kafka_producer_successful_authentication_total | counter

The total number of connections with successful authentication

client_id

kafka_producer_successful_reauthentication_rate | gauge

The number of successful re-authentication of connections per second

client_id

kafka_producer_successful_reauthentication_total | counter

The total number of successful re-authentication of connections

client_id

kafka_producer_topic_byte_rate | gauge

The average number of bytes sent per second for a topic.

topic

client_id

kafka_producer_topic_byte_total | counter

The total number of bytes sent for a topic.

topic

client_id

kafka_producer_topic_compression_rate | gauge

The average compression rate of record batches for a topic, defined as the average ratio of the compressed batch size over the uncompressed size.

topic

client_id

kafka_producer_topic_record_error_rate | gauge

The average per-second number of record sends that resulted in errors for a topic

topic

client_id

kafka_producer_topic_record_error_total | counter

The total number of record sends that resulted in errors for a topic

topic

client_id

kafka_producer_topic_record_retry_rate | gauge

The average per-second number of retried record sends for a topic

topic

client_id

kafka_producer_topic_record_retry_total | counter

The total number of retried record sends for a topic

topic

client_id

kafka_producer_topic_record_send_rate | gauge

The average number of records sent per second for a topic.

topic

client_id

kafka_producer_topic_record_send_total | counter

The total number of records sent for a topic.

topic

client_id

kafka_producer_txn_abort_time_ns_total | counter

Total time producer has spent in abortTransaction in nanoseconds.

client_id

kafka_producer_txn_begin_time_ns_total | counter

Total time producer has spent in beginTransaction in nanoseconds.

client_id

kafka_producer_txn_commit_time_ns_total | counter

Total time producer has spent in commitTransaction in nanoseconds.

client_id

kafka_producer_txn_init_time_ns_total | counter

Total time producer has spent in initTransactions in nanoseconds.

client_id

kafka_producer_txn_send_offsets_time_ns_total | counter

Total time producer has spent in sendOffsetsToTransaction in nanoseconds.

client_id

kafka_producer_waiting_threads | gauge

The number of user threads blocked waiting for buffer memory to enqueue their records

client_id
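
Many of the Kafka client metrics above come in pairs: a `_rate` gauge and a matching `_total` counter. A per-second rate can also be derived from two samples of the `_total` counter. The sketch below shows one way to do that, treating a decrease as a reset to zero, in line with the counter type described at the top of this page. The function name `counter_rate` and the sample values are hypothetical.

```python
def counter_rate(prev_value, prev_ts, curr_value, curr_ts):
    """Per-second rate between two counter samples, treating a drop as a reset."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if curr_value >= prev_value:
        increase = curr_value - prev_value
    else:
        # The counter went backwards: assume it was reset to zero
        # and has counted up to curr_value since then.
        increase = curr_value
    return increase / elapsed


# Hypothetical samples taken 30 seconds apart.
rate = counter_rate(100.0, 0.0, 160.0, 30.0)
rate_after_reset = counter_rate(100.0, 0.0, 15.0, 30.0)
print(rate, rate_after_reset)
```

After a reset the result is only a lower bound, since any increase between the previous sample and the reset is lost.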

gRPC metrics

grpc_service_active_requests | gauge

method

service

grpc_service_request_duration_seconds | summary

method

service

quantile

grpc_status

http_status

grpc_service_request_duration_seconds_max | gauge

method

service

grpc_status

http_status

grpc_service_request_length | summary

method

service

quantile

grpc_status

http_status

grpc_service_request_length_max | gauge

method

service

grpc_status

http_status

grpc_service_requests_total | counter

hostname_pattern

method

service

grpc_status

http_status

grpc_service_response_duration_seconds | summary

method

service

quantile

grpc_status

http_status

grpc_service_response_duration_seconds_max | gauge

method

service

grpc_status

http_status

grpc_service_response_length | summary

method

service

quantile

grpc_status

http_status

grpc_service_response_length_max | gauge

method

service

grpc_status

http_status

grpc_service_timeouts_total | counter

method

service

grpc_status

cause

http_status

grpc_service_total_duration_seconds | summary

method

service

quantile

grpc_status

http_status

grpc_service_total_duration_seconds_max | gauge

method

service

grpc_status

http_status

Processor metrics

process_cpu_usage | gauge

The “recent cpu usage” for the Java Virtual Machine process

system_cpu_count | gauge

The number of processors available to the Java virtual machine

system_cpu_usage | gauge

The “recent cpu usage” of the system the application is running in

system_load_average_1m | gauge

The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
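
A load average is easier to compare across differently sized hosts when it is divided by the processor count. The following sketch combines `system_load_average_1m` with `system_cpu_count` in that way. The function name `load_per_cpu` and the sample values are invented for illustration, and no particular threshold is implied by this reference.

```python
def load_per_cpu(load_average_1m, cpu_count):
    """Normalize the 1-minute load average by the number of available processors."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average_1m / cpu_count


# Hypothetical gauge values: load of 6.0 on an 8-processor host.
ratio = load_per_cpu(6.0, 8)
print(ratio)
```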

JVM metrics

jvm_buffer_count_buffers | gauge

An estimate of the number of buffers in the pool

jvm_buffer_memory_used_bytes | gauge

An estimate of the memory that the Java virtual machine is using for this buffer pool

jvm_buffer_total_capacity_bytes | gauge

An estimate of the total capacity of the buffers in this pool

jvm_classes_loaded_classes | gauge

The number of classes that are currently loaded in the Java virtual machine

jvm_classes_unloaded_classes_total | counter

The total number of classes unloaded since the Java virtual machine has started execution

jvm_gc_concurrent_phase_time_seconds | summary

Time spent in concurrent phase

cause

gc

jvm_gc_concurrent_phase_time_seconds_max | gauge

Time spent in concurrent phase

cause

gc

jvm_gc_live_data_size_bytes | gauge

Size of long-lived heap memory pool after reclamation

jvm_gc_max_data_size_bytes | gauge

Max size of long-lived heap memory pool

jvm_gc_memory_allocated_bytes_total | counter

Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next

jvm_gc_pause_seconds | summary

Time spent in GC pause

cause

gc

jvm_gc_pause_seconds_max | gauge

Time spent in GC pause

cause

gc

jvm_memory_committed_bytes | gauge

The amount of memory in bytes that is committed for the Java virtual machine to use

id

jvm_memory_max_bytes | gauge

The maximum amount of memory in bytes that can be used for memory management

id

jvm_memory_used_bytes | gauge

The amount of used memory

id
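
The `jvm_memory_used_bytes` and `jvm_memory_max_bytes` gauges share the `id` label, so they can be joined per memory pool to get a utilization ratio. The sketch below does this for plain dictionaries keyed by `id`. The function name `memory_utilization`, the pool names, and the values are examples only, and skipping pools with a non-positive maximum rests on the assumption that such a value means the limit is undefined.

```python
def memory_utilization(used_bytes, max_bytes):
    """Return {id: used / max} for pools whose max is defined (greater than zero)."""
    ratios = {}
    for pool_id, used in used_bytes.items():
        limit = max_bytes.get(pool_id, -1)
        if limit <= 0:
            continue  # assumption: a non-positive max means no defined limit
        ratios[pool_id] = used / limit
    return ratios


# Hypothetical gauge values keyed by the id label.
used = {"G1 Old Gen": 512 * 1024**2, "Metaspace": 64 * 1024**2}
maximum = {"G1 Old Gen": 2048 * 1024**2, "Metaspace": -1}
ratios = memory_utilization(used, maximum)
print(ratios)
```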

jvm_threads_daemon_threads | gauge

The current number of live daemon threads

jvm_threads_live_threads | gauge

The current number of live threads including both daemon and non-daemon threads

jvm_threads_peak_threads | gauge

The peak live thread count since the Java virtual machine started or peak was reset

jvm_threads_started_threads_total | counter

The total number of application threads started in the JVM

jvm_threads_states_threads | gauge

The current number of threads