Lesson 117: Spark Streaming performance tuning: keeping the connection between the Spark cluster and Kafka as stable as possible
1 Connection issues between Spark Streaming and Kafka
2 KafkaReceiver
Spark 2.x introduced static type checking, which is not available from Python.
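ZkUtils.scala: zookeeper.session.timeout.ms defaults to 6000 ms (6 s). If the session times out, ZooKeeper reassigns resources and no data can be received. Allowing for GC pauses and similar factors, set zookeeper.session.timeout.ms to 30 s (30000) in production.
ACK confirmation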
ZkUtils
class ZKConfig(props: VerifiableProperties) {
  /** ZK host string */
  val zkConnect = props.getString("zookeeper.connect")

  /** zookeeper session timeout */
  val zkSessionTimeoutMs = props.getInt("zookeeper.session.timeout.ms", 6000)

  /** the max time that the client waits to establish a connection to zookeeper */
  val zkConnectionTimeoutMs = props.getInt("zookeeper.connection.timeout.ms", zkSessionTimeoutMs)

  /** how far a ZK follower can be behind a ZK leader */
  val zkSyncTimeMs = props.getInt("zookeeper.sync.time.ms", 2000)
}
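As a rough sketch of how the 30 s setting might be supplied, the consumer parameters could be built as a plain map. The ZooKeeper hosts, the group id, and the object name below are placeholders chosen for illustration, not values from the lesson. Because zookeeper.connection.timeout.ms falls back to the session timeout in ZKConfig, it is left unset here.

```scala
// Minimal sketch under stated assumptions: hosts and group id are hypothetical.
object ReceiverKafkaParams {
  val kafkaParams: Map[String, String] = Map(
    "zookeeper.connect" -> "zk1:2181,zk2:2181,zk3:2181", // placeholder hosts
    "group.id" -> "spark-streaming-demo",                 // placeholder consumer group
    // Raise the session timeout from the 6000 ms default to 30 s so that
    // a long GC pause is less likely to expire the ZooKeeper session.
    "zookeeper.session.timeout.ms" -> "30000"
  )

  def main(args: Array[String]): Unit =
    // Print the effective settings for a quick visual check.
    kafkaParams.foreach { case (key, value) => println(s"$key=$value") }
}
```

With the receiver-based integration (spark-streaming-kafka 0.8), a map of this shape is what KafkaUtils.createStream takes as its kafkaParams argument.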
ZookeeperConsumerConnector.scala