案例:troubleshooting ORA-04030 in MGA on 19c

本案例来自东区某客户,db版本为19.12,业务应用程序频繁报ORA-04030导致业务中断。

继续分析trace查看报错的heap

报错与MGA有关,MGA是12的新特性,mos文档MGA (Managed Global Area) Reference Note (Doc ID 2638904.1)中有比较详细的描述。本文就不作过多的介绍。

About MGA (Managed Global Area):

The MGA allows a “smaller set” of processes (or even all processes) to share an address space for the duration (typically a query).  The MGA is made up of namespaces which contain segments and heaps with space management either directly or through a heap manager (KGH).

Processes can attach to their namespace for the duration they require.  For e.g., in a parallel query environment, PQs participating with the QC can share the namespace from QC to share the results.  Once done, they will detach from the same and the namespace can be tiered down.

The MGA is allocated dynamically which gives us more flexibility to create, control and share.  The sharing is dynamic in the sense that processes coordinate before sharing.  So in that sense, MGA sits between the SGA and PGA.  In addition, since the MGA is a shared memory area, a latch (latch: MGA) is used to control access and protect it.

Difference between MGA/PGA/SGA:

The MGA is not the SGA or PGA.  The PGA is private and the SGA is completely shared by all processes and is not elastic.  The MGA is elastic (processes can create and drop their MGA segments dynamically), shared between a set of processes or all processes, and counted under the PGA target/limit values (in v$PGASTAT).  The SGA is typically created once while the MGA is created on demand.

There can be multiple MGAs in an instance while we have just one SGA.

 

MGA Clients and Use Cases:

The MGA is a general feature which has multiple consumers and the benefit is based on the consumers.  MGA consumers are:  IPC, MGA hash joinIMCDT and PQ in current releases.  PQ is not a client by default.  When a query uses features like MGA hash join or IMCDT, PQs internally attach to the namespace.  (In other words, MGA hash join and IMCDT are actually executed by PQs with MGA being shared.)

从trace中可以看到总共2个namespace,总共分配了10个mga segments,对于MGA内存可以看到并没有使用大页,仍然使用的4k页面,关于MGA的segment大小与是否使用大页都有参数来控制

可以看到每个mga segment大小都是一致的,都是256M,由参数_ksipc_mga_segment_size控制,参数_use_large_pages_for_mga控制MGA是否使用大页,默认是关闭的。这10个mga segment都能从trace的process map中看到,位于/dev/shm下。

从heapdump上看,基本TOP heap都是MGA。

官方文档中介绍如果启用了MGA,那么PGA_AGGREGATE_LIMIT的大小需要考虑到processes,大概的公式为:

继续分析trace,发现MGA大概分配到PGA_AGGREGATE_LIMIT的20%时,就会报出ORA-04030

发现匹配BUG:32521805 – ORA-4030 IN MGA, MGA 20% OF PGA_AGGREGATE_LIMIT,从bug文档中并没有找到该bug的fixed版本。解决方法有两种都比较消极:

  • 加大PGA_AGGREGATE_LIMIT
  • 禁用MGA(“_ksipc_service_mask”=0)
此条目发表在Oracle, Oracle troubleshooting分类目录,贴了, 标签。将固定链接加入收藏夹。

发表回复

您的电子邮箱地址不会被公开。 必填项已用*标注