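1. The origins and current state of ASM
In 1996, Bill Bridge, an Oracle architect and the creator of ASM, first proposed the idea behind Automatic Storage Management (ASM) and emailed it to the managers and architects of Oracle's Server Technologies division. The project was called Parallel Storage Manager (PSM). Part of that email is excerpted below: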
I propose that Oracle build a portable filesystem to support the RDBMS and other server products. I believe this is important for the very large servers of the future. Currently the RDBMS relies on the OS to provide a filesystem to control placement of data on physical devices and to mirror data for reliability. This causes a number of problems that can be solved with our own filesystem. I have some ideas on how to implement this filesystem, but a more detailed design is required. If this plan were approved, the next step would be for me to create a high level design document of around 100 pages. This would take me a few months. I would also need some time from various other people to discuss the design and what it would have to do to satisfy their needs. I would anticipate the implementation to take over a year and require 10–100 man years of effort to get to an alpha release.
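The email drew two reactions inside Oracle. One camp thought the proposal was excellent despite its high development cost. The other considered it too radical: this kind of functionality is tightly bound to the operating system, and Oracle should not be providing it. Oracle Parallel Server (OPS, the predecessor of RAC) was rarely used at the time, so the proposal was shelved, although it still drove some new features in Oracle 9i, such as Oracle-managed files (OMF). As RAC adoption grew, the ASM project was finally approved internally in 1999, and the project name changed from Parallel Storage Manager (PSM) to Oracle Storage Manager (OSM). Development began in 2000, with Rich Long as development manager of the OSM project and a team of six developers. The hardest challenge was the design and management of the OSM metadata. In the same year, Oracle's marketing department decided to rename Oracle Storage Manager (OSM) to Automatic Storage Management (ASM) to emphasize its automated management.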
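The first version of ASM shipped with Oracle Database 10g Release 1 and was presented at Oracle OpenWorld 2003 as an important component of the 10g grid. It was well received from the start: Oracle RAC no longer depended on third-party storage management software, and the built-in redundancy and striping led to wide adoption. Reportedly, 70% of users running RAC on 10g chose ASM.
Oracle Database 11g was released at OOW 2007. Database 11g Release 1 brought many ASM enhancements, the most important of which was ASM Fast Mirror Resync. For details, see: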
https://docs.oracle.com/cd/E11882_01/server.112/e18951/whatsnew.htm#OSTMG94052
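Today, after more than 20 years of development, ASM is mature and powerful, and it has become essentially the default choice for RAC shared storage. By design, ASM relies on a large amount of metadata to manage and maintain ASM files and to support its many features. This chapter introduces some of the core ASM metadata.
2. Overview of ASM metadata
ASM metadata is officially divided into physically addressed metadata and virtually addressed metadata.
Physically addressed metadata includes: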
- Disk header
- Allocation table
- Free space table
- Partnership and status table
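Physically addressed metadata consists of metadata blocks stored at fixed locations on each disk, and each disk keeps its own independent copy. The Partnership and status table (PST) is the exception: not every disk holds PST blocks, although every disk reserves a fixed location for the PST.
Virtually addressed metadata includes: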
- File directory - ASM file number 1
- Disk directory - ASM file number 2
- Active change directory (ACD) - ASM file number 3
- Continuing operations directory (COD) - ASM file number 4
- Template directory - ASM file number 5
- Alias directory - ASM file number 6
- AVD volume file directory - ASM file number 7
- Disk free space directory - ASM file number 8
- Attributes directory - ASM file number 9
- ASM User directory - ASM file number 10
- ASM user group directory - ASM file number 11
- Staleness directory - ASM file number 12
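Only some of the more important virtually addressed metadata is listed here. As the list shows, virtually addressed metadata is defined in terms of ASM files, and because ASM stripes its files, this metadata is spread across all of the disks.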
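The numbering above can be captured as a small enumeration, which is handy when labeling metadata by ASM file number. This is only an illustrative sketch: the class name, member names, and the `describe_file` helper are hypothetical, and only the numbers and directory names come from the list above.

```python
from enum import IntEnum


class AsmMetadataFile(IntEnum):
    """ASM file numbers of the metadata directories listed above.

    Member names are invented for this sketch; they are not Oracle identifiers.
    """
    FILE_DIRECTORY = 1
    DISK_DIRECTORY = 2
    ACTIVE_CHANGE_DIRECTORY = 3
    CONTINUING_OPERATIONS_DIRECTORY = 4
    TEMPLATE_DIRECTORY = 5
    ALIAS_DIRECTORY = 6
    AVD_VOLUME_FILE_DIRECTORY = 7
    DISK_FREE_SPACE_DIRECTORY = 8
    ATTRIBUTES_DIRECTORY = 9
    ASM_USER_DIRECTORY = 10
    ASM_USER_GROUP_DIRECTORY = 11
    STALENESS_DIRECTORY = 12


def describe_file(number: int) -> str:
    """Return the directory name for an ASM file number, if it is in the list."""
    try:
        return AsmMetadataFile(number).name
    except ValueError:
        # The list above is partial, so unknown numbers are not an error.
        return f"not in the list above ({number})"
```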
ASM metadata is made up of ASM metadata blocks, 4K in size by default. The first 32 bytes of each block are the metadata header, which every ASM metadata type carries. Its structure is defined as:
    kfbh - Kernel Files (ASM) Block Header
    KF3_FIELD(kfbh, ub1, endian_kfbh) /* endianness of writer */
    KF3_FIELD(kfbh, ub1, hard_kfbh) /* H.A.R.D. magic # and block size */
    KF3_FIELD(kfbh, kfbtyp, type_kfbh) /* metadata block type */
    KF3_FIELD(kfbh, ub1, datfmt_kfbh) /* metadata block data format */
    KF3_FIELD(kfbh, kfbl, block_kfbh) /* block location of this block */
    KF3_FIELD(kfbh, ub4, check_kfbh) /* check value to verify consistency */
    KF3_FIELD(kfbh, kfcn, fcn_kfbh) /* change number of last change */
    KF3_FIELD(kfbh, ub4, spare1_kfbh) /* zero pad out to 32 bytes */
    KF3_FIELD(kfbh, ub4, spare2_kfbh) /* zero pad out to 32 bytes */
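To make the 32-byte layout concrete, here is a minimal Python sketch that unpacks such a header. It rests on several assumptions that the definition above does not state outright: the fields are packed in the listed order with no padding, `kfbtyp` is one byte, `kfbl` is two 4-byte values (`blk` then `obj`), and `kfcn` is two 4-byte values (`base` then `wrap`). Under those assumptions the sizes add up to exactly 32 bytes. The byte order is taken from the first byte (1 for little endian, 0 for big endian). The `Kfbh` tuple and `parse_kfbh` are hypothetical names, not an Oracle API.

```python
import struct
from typing import NamedTuple

KFBH_SIZE = 32  # header size stated in the text


class Kfbh(NamedTuple):
    """Decoded block header; field order mirrors the definition above."""
    endian: int
    hard: int
    type: int
    datfmt: int
    block_blk: int   # assumed: first half of kfbl
    block_obj: int   # assumed: second half of kfbl
    check: int
    fcn_base: int    # assumed: first half of kfcn
    fcn_wrap: int    # assumed: second half of kfcn
    spare1: int
    spare2: int


def parse_kfbh(block: bytes) -> Kfbh:
    """Unpack the first 32 bytes of a metadata block into a Kfbh tuple."""
    if len(block) < KFBH_SIZE:
        raise ValueError("a metadata header needs at least 32 bytes")
    # endian_kfbh is a single byte, so it can be read before the order is known.
    order = "<" if block[0] == 1 else ">"
    # 4 one-byte fields followed by 7 four-byte fields = 32 bytes, no padding.
    return Kfbh(*struct.unpack(order + "4B7I", block[:KFBH_SIZE]))


if __name__ == "__main__":
    # Synthetic header, built here purely for demonstration.
    raw = struct.pack("<4B7I", 1, 0x82, 0x01, 1, 0, 0, 0, 7, 0, 0, 0)
    print(parse_kfbh(raw))
```

Reading the byte order from the header itself lets the same function handle blocks written on either kind of platform.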
The fields have the following meanings:
- endian_kfbh: the byte order of the platform that wrote the block
    Little Endian = 1
    Big Endian = 0
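- hard_kfbh: the H.A.R.D. magic number combined with the metadata block size. For example, with the default 4K metadata block, hard_kfbh = KFBH_HARD + KFBH_HARD_4K = 0x82. The ASM metadata block size is controlled by the hidden ASM instance parameter _asm_blksize.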
    KFBH_HARD 0x02 /* expected magic number */
    KFBH_HARD_4K 0x80 /* 4K metadata block size */
    KFBH_HARD_8K 0xa0 /* 8K metadata block size */
    KFBH_HARD_16K 0xc0 /* 16K metadata block size */
    KFBH_HARD_32K 0xe0 /* 32K metadata block size */
- type_kfbh: identifies the type of the metadata block. The possible values are defined as follows:
    #define KFBTYP_INVALID ((kfbtyp)0x00) /* Unused: invalid block type */
    #define KFBTYP_DISKHEAD ((kfbtyp)0x01) /* ASM disk header block */
    #define KFBTYP_FREESPC ((kfbtyp)0x02) /* A block summarizing available AUs */
    #define KFBTYP_ALLOCTBL ((kfbtyp)0x03) /* A block of disk allocation table */
    #define KFBTYP_FILEDIR ((kfbtyp)0x04) /* A block of file directory entries */
    #define KFBTYP_LISTHEAD ((kfbtyp)0x05) /* The block-allocated freelist head */
    #define KFBTYP_DISKDIR ((kfbtyp)0x06) /* A block of disk directory entries */
    #define KFBTYP_ACDC ((kfbtyp)0x07) /* An ACDC block */
    #define KFBTYP_CHNGDIR ((kfbtyp)0x08) /* A block of ACD records */
    #define KFBTYP_COD_BGO ((kfbtyp)0x09) /* COD BackGround Operations block */
    #define KFBTYP_TMPLTDIR ((kfbtyp)0x0A) /* A block of template definitions */
    #define KFBTYP_ALIASDIR ((kfbtyp)0x0B) /* A block of alias definitions */
    #define KFBTYP_INDIRECT ((kfbtyp)0x0C) /* A block of extent pointers */
    #define KFBTYP_PST_NONE ((kfbtyp)0x0D) /* PST AU block with no PST data */
    #define KFBTYP_HASHNODE ((kfbtyp)0x0E) /* Reserved: KFFD hash bucket block */
    #define KFBTYP_COD_RBO ((kfbtyp)0x0F) /* COD RollBack Operations block */
    #define KFBTYP_COD_DATA ((kfbtyp)0x10) /* COD rollback Data block */
    #define KFBTYP_PST_META ((kfbtyp)0x11) /* PST Meta block */
    #define KFBTYP_PST_DTA ((kfbtyp)0x12) /* PST Data block */
    #define KFBTYP_HBEAT ((kfbtyp)0x13) /* Heartbeat block */
    #define KFBTYP_SR ((kfbtyp)0x14) /* Staleness registry block */
    #define KFBTYP_STALEDIR ((kfbtyp)0x15) /* Staleness directory block */
    #define KFBTYP_VOLUMEDIR ((kfbtyp)0x16) /* A block of volume entries */
    #define KFBTYP_ATTRDIR ((kfbtyp)0x17) /* Attributes directory */
    #define KFBTYP_LAST ((kfbtyp)0x18) /* First undefined block type */
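- block_kfbh: consists of two parts, block_kfbh.blk and block_kfbh.obj. For physically addressed metadata, block_kfbh.obj is the disk number and block_kfbh.blk is the block number on that disk, counted in units of the metadata block size. For virtually addressed metadata, block_kfbh.obj is the ASM file number and block_kfbh.blk is the block number within that ASM file, again in units of the metadata block size.
- check_kfbh: a check value computed before the block is written to disk, chosen so that the 32-bit XOR over the entire metadata block equals 0. It plays a role similar to the checksum of a data block in a datafile.
- fcn_kfbh: the change number of the last modification to the metadata block, similar to the SCN of a data block in a datafile. Like the SCN, it is split into a base (low-order) part and a wrap (high-order) part.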
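The hard_kfbh, check_kfbh, and fcn_kfbh descriptions above can be illustrated with a short Python sketch. It decodes hard_kfbh through a lookup of the four documented sums, computes and verifies a check value so that the 32-bit XOR of the whole block is 0, and combines base and wrap into one number. Several things here are assumptions rather than confirmed Oracle behavior: check_kfbh is taken to sit at byte offset 12 (which follows from packing the header fields in their listed order), base and wrap are taken to be 32 bits each, and the XOR routine follows only the one-sentence description above, so Oracle's actual routine may differ in detail. All function names are hypothetical.

```python
import struct

KFBH_HARD = 0x02  # expected magic number, from the table above
_SIZE_FLAGS = {0x80: 4096, 0xA0: 8192, 0xC0: 16384, 0xE0: 32768}
# Valid hard_kfbh values are magic + size flag, e.g. 0x02 + 0x80 = 0x82.
HARD_TO_BLOCK_SIZE = {KFBH_HARD + flag: size for flag, size in _SIZE_FLAGS.items()}
CHECK_OFFSET = 12  # assumed byte offset of check_kfbh within the header


def block_size_from_hard(hard: int) -> int:
    """Map a hard_kfbh value to the metadata block size in bytes."""
    try:
        return HARD_TO_BLOCK_SIZE[hard]
    except KeyError:
        raise ValueError(f"unexpected hard_kfbh value: {hard:#04x}") from None


def xor32(block: bytes, little_endian: bool = True) -> int:
    """XOR together every 32-bit word of the block."""
    if len(block) % 4:
        raise ValueError("block length must be a multiple of 4 bytes")
    order = "<" if little_endian else ">"
    acc = 0
    for (word,) in struct.iter_unpack(order + "I", block):
        acc ^= word
    return acc


def set_check(block: bytearray, little_endian: bool = True) -> None:
    """Store a check value that makes the XOR of the whole block equal 0."""
    order = "<" if little_endian else ">"
    struct.pack_into(order + "I", block, CHECK_OFFSET, 0)
    # With the check field zeroed, the XOR of the rest is the value to store.
    struct.pack_into(order + "I", block, CHECK_OFFSET, xor32(block, little_endian))


def is_consistent(block: bytes, little_endian: bool = True) -> bool:
    """Return True when the 32-bit XOR over the whole block is 0."""
    return xor32(block, little_endian) == 0


def fcn_value(base: int, wrap: int) -> int:
    """Combine the low-order base and high-order wrap (assumed 32 bits each)."""
    return (wrap << 32) | base
```

A reader of a block would typically call `is_consistent` first and interpret the remaining fields only when it returns True.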