Big Data Basics: Impala (3) Partial Tuning

Published: 2023-09-14 09:00:07

1) Separate the coordinator and executor roles

By default, each host in the cluster that runs the impalad daemon can act as the coordinator for an Impala query, execute the fragments of the execution plan for the query, or both. During highly concurrent workloads for large-scale queries, especially on large clusters, the dual roles can cause scalability issues:

The extra work required for a host to act as the coordinator could interfere with its capacity to perform other work for the earlier phases of the query. For example, the coordinator can experience significant network and CPU overhead during queries containing a large number of query fragments. Each coordinator caches metadata for all table partitions and data files, which can be substantial and contend with memory needed to process joins, aggregations, and other operations performed by query executors.
Having a large number of hosts act as coordinators can cause unnecessary network overhead, or even timeout errors, as each of those hosts communicates with the statestored daemon for metadata updates.
The “soft limits” imposed by the admission control feature are more likely to be exceeded when there are a large number of heavily loaded hosts acting as coordinators.
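
As a rough illustration of the separation described above, the sketch below prints the impalad role flags each host would be started with under a dedicated-role layout. The host names are hypothetical, and the sketch assumes an Impala release that supports the `--is_coordinator` and `--is_executor` startup flags; how those flags are actually applied (Cloudera Manager, startup scripts, and so on) depends on your deployment.

```python
# Hypothetical host names; replace them with the hosts in your own cluster.
COORDINATORS = ["coord-1.example.com", "coord-2.example.com"]
EXECUTORS = ["exec-1.example.com", "exec-2.example.com", "exec-3.example.com"]


def role_flags(host):
    """Return the impalad role flags for a host in a dedicated-role layout."""
    if host in COORDINATORS:
        # Coordinator only: plans queries and returns results, runs no fragments.
        return ["--is_coordinator=true", "--is_executor=false"]
    if host in EXECUTORS:
        # Executor only: runs query fragments, accepts no client connections.
        return ["--is_coordinator=false", "--is_executor=true"]
    raise ValueError(f"unknown host: {host}")


for host in COORDINATORS + EXECUTORS:
    print(host, " ".join(role_flags(host)))
```

Keeping the coordinator list small is the point of the exercise: fewer hosts cache full table metadata and fewer hosts make admission decisions.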

2) default_pool_max_requests defaults to 200; adjust it according to your cluster's memory size and the amount of memory each query needs.

Maximum number of concurrent outstanding requests allowed to run before incoming requests are queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and llama_site_path are set.
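
One possible way to pick a starting value is a back-of-the-envelope estimate from memory alone. The sketch below is an assumption, not an official formula: it supposes every query runs on every executor, so the per-host memory limit divided by a typical query's per-host memory gives a rough concurrency ceiling. The function name, the headroom factor, and the example numbers are all hypothetical.

```python
import math


def suggest_max_requests(mem_limit_per_host_gb, mem_per_query_per_host_gb, headroom=0.8):
    """Estimate a concurrency ceiling from per-host memory (rough heuristic only).

    mem_limit_per_host_gb: memory available to impalad on each executor.
    mem_per_query_per_host_gb: typical peak memory one query uses on one host.
    headroom: fraction of the memory limit to plan for, leaving slack for spikes.
    """
    if mem_per_query_per_host_gb <= 0:
        raise ValueError("mem_per_query_per_host_gb must be positive")
    usable_gb = mem_limit_per_host_gb * headroom
    # Never suggest fewer than one concurrent query.
    return max(1, math.floor(usable_gb / mem_per_query_per_host_gb))


# Hypothetical cluster: 128 GB per executor, about 4 GB per query per host.
print(suggest_max_requests(128, 4))
```

Because the limit is soft, it seems safer to treat the result as an upper bound to validate under real load rather than a value to set and forget.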

3) After Kerberos is enabled, JDBC access requires client-side load balancing, because the JDBC URL must carry the principal of the specific server being connected to.
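
A minimal sketch of client-side load balancing under these constraints: the client picks a host itself and builds a URL whose principal matches that host. It assumes the Hive JDBC driver's URL form (`jdbc:hive2://host:port/;principal=service/host@REALM`), the default HiveServer2-protocol port 21050, and an `impala` service name; the host list and realm are hypothetical, and other drivers use different URL parameters.

```python
import random

# Hypothetical coordinator hosts and Kerberos realm; replace with your own.
IMPALAD_HOSTS = ["coord-1.example.com", "coord-2.example.com"]
REALM = "EXAMPLE.COM"


def build_jdbc_url(host, port=21050, service="impala", realm=REALM):
    """Build a Hive-driver-style JDBC URL whose principal matches the chosen host."""
    return f"jdbc:hive2://{host}:{port}/;principal={service}/{host}@{realm}"


def pick_jdbc_url(hosts=IMPALAD_HOSTS, rng=random):
    """Pick one host at random so connections spread across the listed hosts."""
    return build_jdbc_url(rng.choice(hosts))


print(pick_jdbc_url())
```

Random choice is the simplest policy; a client could equally rotate through the list or retry with the next host when a connection fails.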
