For the past few years, I have been working on several Oracle RAC projects. I have come to appreciate the complexity and difficulty of implementing an Oracle RAC cluster, especially in deciding what to use and purchase for it. Here is a list of things that I consider during the design phase of an Oracle RAC project.
-Check with the disk vendor that the number of nodes, OS version, RAC version, CRS version, network fabric, and patches are certified, as some storage/SAN vendors may require special certification for a certain number of nodes.
-Use both external and Oracle-provided redundancy for the OCR and voting disks.
-Eliminate any single points of failure in the architecture. Examples include cluster interconnect redundancy (NIC bonding, etc.), multiple access paths to storage using two or more HBAs or initiators with multipathing software, and disk mirroring/RAID.
-Avoid underscores in host names and domain names, as RFC 952 does not allow them.
-Make sure network interfaces have the same name on all nodes.
-Use Jumbo Frames if supported and possible in the system
-Make sure network interfaces are configured correctly in terms of speed, duplex, etc
-Configure NICs for fault tolerance (bonding/link aggregation).
-Place HBAs and NICs in the same corresponding slot on each server in the cluster.
-To avoid ORA-12545 errors, ensure that client HOSTS files and/or DNS are furnished with both VIP and Public hostnames.
-The CRS_HOME must be at a patch level or version that is greater than or equal to the patch level or version of both the ASM home and the RDBMS home.
-Start from at least version 10.2.0.3 when upgrading from 10.2.0.X to 11.X.
-Set OPTIMIZER_DYNAMIC_SAMPLING = 1, or simply analyze your objects, because 10g dynamic sampling can generate extra CR buffers during the execution of SQL statements.
-Many sites run with too few redo logs or with logs that are too small. With too few redo logs configured, the archiver process(es) may not be able to keep up, which could cause the database to stall. Small redo logs cause frequent log switches, which can put a high load on the buffer cache and I/O system. As a general practice, each thread should have at least three redo log groups with two members in each group.
-Reduce long full table scans in OLTP environments. Full table scans (FTS) are a performance killer for an Oracle RAC database.
-Increasing sequence caches in insert-intensive databases improves instance affinity to index keys that derive their values from sequences. Increase the cache for application sequences and some system sequences for better performance, using a large cache value of perhaps 10,000 or more. Additionally, the NOORDER attribute is the most effective choice, but it does not guarantee that sequence numbers are generated in order of request (NOORDER is actually the default).
-Increase the retention period for AWR data from 7 days to at least one business cycle. Use the awrinfo.sql script to estimate how much information will be stored in the AWR, and size its storage accordingly.
-RAC Assurance Support Team: RAC Starter Kit and Best Practices (Generic)
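A few of the checks above (host names without underscores, identical interface names on every node, and a CRS_HOME version at or above the ASM and RDBMS homes) are easy to script before installation. The following is only a minimal sketch in Python, not an Oracle-supplied tool. The function names and the input formats (plain version strings, a dictionary of node names to interface names) are my own assumptions for illustration, and the host name check is a simplified letters-digits-hyphen test rather than a full RFC validator.

```python
import re

# One DNS label: letters, digits and hyphens, with no leading or trailing hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")


def hostname_is_valid(name):
    """Return True if every dot-separated label uses only letters, digits, hyphens."""
    return all(_LABEL.match(label) for label in name.split("."))


def parse_version(text):
    """Turn '10.2.0.3' into (10, 2, 0, 3) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def homes_are_ordered(crs, asm, rdbms):
    """The CRS home must be at or above both the ASM and the RDBMS home versions."""
    crs_version = parse_version(crs)
    return crs_version >= parse_version(asm) and crs_version >= parse_version(rdbms)


def interfaces_match(nodes):
    """nodes maps a node name to its interface names; every node must have the same set."""
    name_sets = [set(names) for names in nodes.values()]
    return all(names == name_sets[0] for names in name_sets)
```

For example, hostname_is_valid("rac_node1.example.com") returns False because of the underscore, and homes_are_ordered("10.2.0.3", "10.2.0.4", "10.2.0.3") returns False because the ASM home is ahead of the CRS home.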
Server considerations
-Processor architecture (CPU speed, 32-bit vs 64-bit, cache size)
-Memory
-NICs
-Local disks
-HBA cards
Network considerations
-Speed of interconnect networks (usually a Gigabit network)
-Interconnect switch redundancy
-NIC fault tolerance (bonding)
Storage considerations
-Choice of shared storage (SAN, NAS or iSCSI)
-Storage (SAN) switches
-Storage size
-Disk redundancy configuration (RAID 1, 10 or 5)
-Cluster file system (OCFS2 or third-party)
-IO multipathing
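Storage size and the RAID level are tied together, since the redundancy scheme decides how much of the raw capacity is usable. As a rough illustration, the Python sketch below computes usable capacity for the three levels listed above. It assumes equally sized disks, two-way mirrors for RAID 10, and one disk's worth of parity for RAID 5; the function name and the level strings are hypothetical, and real arrays also lose space to hot spares and formatting overhead.

```python
def usable_capacity_gb(level, disks, disk_gb):
    """Estimate usable capacity in GB for a group of equally sized disks."""
    if level == "RAID1":
        # All disks hold the same data, so the group stores one disk's worth.
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb
    if level == "RAID10":
        # Striped two-way mirrors: half of the disks hold copies.
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_gb
    if level == "RAID5":
        # One disk's worth of capacity is used for parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    raise ValueError("unsupported RAID level: " + level)
```

With eight 300 GB disks, this gives 1200 GB for RAID 10 and 2100 GB for RAID 5, which is the usual trade-off between capacity and the write cost of parity.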
Oracle Considerations
-Oracle version (10g or 11g, Enterprise or Standard)
-Patch level
-Choice of Clusterware
-Number of nodes (a 3-node design seems to be a good choice, as it helps prevent a “split-brain” condition)
-Local Oracle Homes vs Shared Oracle Homes
-Location of OCR and voting disks
-Redundancy for the OCR and Voting disks
-Location of Archivelog
-Shared filesystem (ASM, raw devices or cluster filesystem)
-Backup and recovery strategies
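On the redundancy of the voting disks, a little arithmetic helps when choosing how many to configure. To my understanding, Oracle Clusterware requires a node to be able to access more than half of the configured voting disks, which is why an odd number is normally used. The Python sketch below only does that majority calculation; the function names are hypothetical and it does not talk to a cluster.

```python
def tolerated_voting_disk_failures(total_disks):
    """Largest number of voting disks that can be lost while a strict majority remains."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    return (total_disks - 1) // 2


def node_keeps_quorum(accessible_disks, total_disks):
    """A node keeps quorum only if it sees strictly more than half of the voting disks."""
    return 2 * accessible_disks > total_disks
```

Three voting disks tolerate the loss of one, and so do four, so the fourth disk adds cost without adding resilience; five disks tolerate the loss of two.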
