DBPedias

Your Database Knowledge Community

John Piwowar

  1. Build an 11gR2 RAC cluster in VirtualBox in 1 Hour using OVM templates

    [I originally posted this over at the Pythian blog. If you're not following it, you should! Way more content, by far smarter people than lil ol' me.]

    After reviewing my blog post about running EBS OVM templates in VirtualBox, two of my teammates suggested that I work on something with potentially broader appeal. Their basic message was, "This is really cool for us EBS nerds, but what about the Core DBAs?"

    So how does "11gR2 RAC in an hour" sound? :-) In this post, I'll demonstrate how to deploy the pre-built Oracle VM templates to create a two-node 11gR2 RAC cluster in Oracle VirtualBox.

    Why do this?

    There are already several high-quality "How to run RAC on your workstation" HOW-TO's out there, including the well-known RAC Attack (by Pythian's own Jeremy Schneider, and others) and Tim Hall's super-straightforward article on ORACLE-BASE. Does the internet really need another screenshot-heavy blog post about installing Oracle RAC? Maybe not, but I'm doing it anyway, because:

    • The OVM templates come with the software pre-installed/patched, and scripts that configure the networking, Grid Infrastructure, and database for you. Less fiddling around reduces the possibility of error, and you still have a RAC cluster at the end!
    • I claimed in my earlier blog post that it should be possible to convert other OVM templates, so it seemed like a good idea to actually test that claim.
    • I wanted an excuse to play around a bit more with the command-line interface to VirtualBox.

    Some readers might point out that installing and configuring the software is a good way to learn how things work, and that breaking and fixing things along the way helps one learn even more. I actually agree with that sentiment in general, since I'm a "learn by failing doing" kind of guy. On the other hand, Oracle is selling a line of high-end products that are supposed to take all of the hard work out of configuring RAC, so why shouldn't we have a bit of fun?

    Ingredients

    You will need:

    1. RAM. Lots of RAM. The OVM template docs specify 2GB *per RAC NODE*, and that is probably on the small side for any serious work. If you want to do anything else with your workstation while this is running, you will not be able to proceed without at least 6GB of RAM on the host machine. This is less resource-intensive than building your own OVM server, but it is not a lightweight endeavor.
    2. 80-100GB of disk space, depending on how you size your ASM disks
    3. A recent version of VirtualBox. An old one might do, but I didn't test on an old version. :)
    4. DNS service for the SCAN interface. You might be able to get away without it, but I can't guarantee that the Oracle-supplied cluster build scripts will work if you try to fake it. Tim Hall has a great post on a minimal DNS setup for SCAN, or you can use dnsmasq to convert your local hosts file into a DNS service. I opted for dnsmasq; it's pretty cool.
    5. A Linux install ISO image (or physical CD, if you're into that sort of thing). I used Oracle Enterprise Linux 5, Update 6, but any relatively recent OEL or RHEL install image should do the job here.
    6. An understanding of some basic Linux systems administration tasks
    7. Familiarity with configuring storage and network options in Virtualbox

    Important notes and thank-you's

    Nothing you're about to read in this post is supported by anyone. Not me, not Pythian, and certainly not Oracle. If you're thinking about using the techniques described here for any sort of production or QA deployment, please stop and question your sanity. Then call a few colleagues over to your desk and ask them to question your sanity.

    Please be mindful of your licensing and support status before working with these templates. Content from Oracle's Software Delivery Cloud is subject to a far more restrictive licensing than the more-familiar OTN development license. [Thanks to Don Seiler (@dtseiler) for reminding me of this.]

    So far, this is just a proof-of-concept. I haven't done extensive work to validate the RAC cluster I built from these instructions. There may be resource limitations that I have not yet discovered in this system, or more artifacts specific to the Oracle VM template that could be removed. "Do not be too proud of this technological terror you've constructed." :-)

    As always, I'm "standing on the shoulders of giants" to make this post happen. Huge thanks to Tim Hall (aka ORACLE-BASE) for his concise how-to documents that served as a springboard for this project; to the creators of dnsmasq for the easy local DNS option; to the clever folks at Oracle who built the VM cluster deployment script; and to my Pythian teammates and a handful of Twitter followers for encouraging me to blog about this.

    HOWTO: The short version

    The basic steps are as follows, with details in the next section.

    1. Set up your local DNS with IP addresses for both nodes in your future RAC cluster.
    2. Download the "Oracle RAC 11.2.0.1.4 for x86_64 (64 bit) with Oracle Linux 5.5 " OVM templates from the Oracle Software Delivery Cloud, and unzip (and unzip again!) the files
    3. Create a single VM in Virtualbox to be the first node in the RAC cluster
    4. Convert the OVM disk image files to VDI format, and attach them to your VM
    5. Boot the VM in rescue mode from a Linux install ISO, install a non-Xen version of the kernel, and make some config file adjustments
    6. Clone the VM to create the second node of the RAC cluster
    7. Create shared disks and attach to both VMs
    8. Boot both VMs and run the cluster configuration script
    9. Start playing with your new Virtualbox RAC cluster! (or watch your workstation swap itself to death, if you didn't heed my "lots of RAM" warning, above)

    HOWTO: The long version

    1. Set up DNS entries for your RAC cluster

      Complete details on DNS setup are beyond the scope of this post; instead, I've provided external references above to point you in a good direction. Here are the IPs and hostnames that I will be using in my example deployment. I'm using two separate host-only networks (vboxnet0 and vboxnet1) for the public and private interfaces, and the subnets (192.168.56.x and 192.168.57.x) were chosen automatically for me by Virtualbox. I try to keep things simple. :)

      #RAC stuff
      #Pub
      192.168.56.11 thing1.local.org thing1
      192.168.56.12 thing2.local.org thing2
      #Priv
      192.168.57.11 thing1-priv.local.org thing1-priv
      192.168.57.12 thing2-priv.local.org thing2-priv
      #VIP
      192.168.56.21 thing1-vip.local.org thing1-vip
      192.168.56.22 thing2-vip.local.org thing2-vip
      #SCAN
      192.168.56.31 clu-scan.local.org clu-scan
      192.168.56.32 clu-scan.local.org clu-scan
      192.168.56.33 clu-scan.local.org clu-scan
      

      Be sure to test that the new IPs resolve to the expected hostnames on your host machine. In particular, it's a good idea to check whether your SCAN IPs are round-robining:

      zathras:OVMRACTempl jpiwowar$ nslookup clu-scan
      Server:		127.0.0.1
      Address:	127.0.0.1#53
      
      Name:	clu-scan
      Address: 192.168.56.31
      Name:	clu-scan
      Address: 192.168.56.32
      Name:	clu-scan
      Address: 192.168.56.33
      
      zathras:OVMRACTempl jpiwowar$ dig clu-scan
      
      ; <<>> DiG 9.7.6-P1 <<>> clu-scan
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43681
      ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
      
      ;; QUESTION SECTION:
      ;clu-scan.			IN	A
      
      ;; ANSWER SECTION:
      clu-scan.		0	IN	A	192.168.56.32
      clu-scan.		0	IN	A	192.168.56.33
      clu-scan.		0	IN	A	192.168.56.31
      
      ;; Query time: 2 msec
      ;; SERVER: 127.0.0.1#53(127.0.0.1)
      ;; WHEN: Thu Dec 20 21:28:19 2012
      ;; MSG SIZE  rcvd: 74
      
      zathras:OVMRACTempl jpiwowar$ dig clu-scan
      
      ; <<>> DiG 9.7.6-P1 <<>> clu-scan
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15944
      ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
      
      ;; QUESTION SECTION:
      ;clu-scan.			IN	A
      
      ;; ANSWER SECTION:
      clu-scan.		0	IN	A	192.168.56.33
      clu-scan.		0	IN	A	192.168.56.31
      clu-scan.		0	IN	A	192.168.56.32
      
      ;; Query time: 0 msec
      ;; SERVER: 127.0.0.1#53(127.0.0.1)
      ;; WHEN: Thu Dec 20 21:28:27 2012
      ;; MSG SIZE  rcvd: 74
      
    2. Download the 11.2.0.1 11gR2 EL5.5 templates

      Connect to Oracle's Software Delivery Cloud and download the files listed under Oracle VM Templates for Oracle RAC 11gR2 Media Pack for x86_64 (64 bit). You'll need the two files for "Oracle RAC 11.2.0.1.4 for x86_64 (64 bit) with Oracle Linux 5.5" (V25916-01.zip and V25917-01.zip). I also recommend clicking the "View Digest"" button near the top of the download page, and running md5sum on each of the downloaded zip files to make sure the checksums match that list.

    3. Extract the templates
      • Unzip the two files you just downloaded (V25916-01.zip and V25917-01.zip). You'll get two .tgz files, OVM_EL5U5_X86_64_11201RAC_PVM-1of2.tgz and OVM_EL5U5_X86_64_11201RAC_PVM-2of2.tgz
      • Unpack the two zipped tar files (tar zxpf OVM_EL5U5_X86_64_11201RAC_PVM*.tgz). This will create a directory called OVM_EL5U5_X86_64_11201RAC_PVM, and that's where we'll be doing all of our work.
    4. Convert the OVM disk images to VDI format

      Open a command/terminal window and use the VBoxManage utility to convert the raw disk images (.img) in OVM_EL5U5_X86_64_11201RAC_PVM to .vdi files. This utility is installed with VirtualBox; you may need to find it first and add it to your path (location varies by host platform). Timings listed in the examples below are provided to set expectations for how long you'll need to wait for the conversion to complete.

      Note: I'm running VirtualBox on OS X, and the installer dropped VBoxManage into /usr/bin for me, so it's already in my path. Presumably you'll find a similar situation in Linux. If you're on Windows, and haven't customized your install, you should be able to find VBoxManage.exe in Program Files/Oracle/VirtualBox.

      zathras:OVMRACTempl jpiwowar$ mkdir OVM_EL5U5_X86_64_11201RAC_PVM/Thing1
      zathras:OVMRACTempl jpiwowar$ time VBoxManage convertfromraw OVM_EL5U5_X86_64_11201RAC_PVM/System.img OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacRoot.vdi
      Converting from raw image file="OVM_EL5U5_X86_64_11201RAC_PVM/System.img" to file="OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacRoot.vdi"...
      Creating dynamic image with size 13316728320 bytes (12700MB)...
      
      real	5m12.042s
      user	0m6.336s
      sys	0m12.783s
      zathras:OVMRACTempl jpiwowar$ time VBoxManage convertfromraw OVM_EL5U5_X86_64_11201RAC_PVM/Oracle11201RAC_x86_64-xvdb.img OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacORCL.vdi
      
      Converting from raw image file="OVM_EL5U5_X86_64_11201RAC_PVM/Oracle11201RAC_x86_64-xvdb.img" to file="OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacORCL.vdi"...
      Creating dynamic image with size 17179869184 bytes (16384MB)...
      
      real	9m1.932s
      user	0m7.424s
      sys	0m20.825s
    5. Create a VM for the first node of your RAC cluster

      The VM will need to be configured as follows:

      • OS: Oracle Linux (64-bit)
      • Three (3) NICs: The first two attached to separate Host-only networks (vboxnet0 and vboxnet1), and the third configured to use NAT.
      • One (1) CPU
      • 2GB of RAM
      • Device boot order: CD-ROM, then Hard Disk
      • Storage: Attach the two .VDI files to the SATA controller (Root disk first), and attach the Linux install ISO to the virtual DVD drive

      Rather than just present a screenshot of the configuration, here's a listing of the VBoxManage showvminfo command for my first VM (Thing1):

      Name:            Thing1
      Groups:          /
      Guest OS:        Oracle (64 bit)
      UUID:            f7108f32-190f-431e-8310-19179ce73909
      Config file:     /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/Thing1.vbox
      Snapshot folder: /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/Snapshots
      Log folder:      /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/Logs
      Hardware UUID:   f7108f32-190f-431e-8310-19179ce73909
      Memory size:     2048MB
      Page Fusion:     off
      VRAM size:       8MB
      CPU exec cap:    100%
      HPET:            off
      Chipset:         piix3
      Firmware:        BIOS
      Number of CPUs:  1
      Synthetic Cpu:   off
      CPUID overrides: None
      Boot menu mode:  message and menu
      Boot Device (1): DVD
      Boot Device (2): HardDisk
      Boot Device (3): Not Assigned
      Boot Device (4): Not Assigned
      ACPI:            on
      IOAPIC:          on
      PAE:             on
      Time offset:     0ms
      RTC:             local time
      Hardw. virt.ext: on
      Hardw. virt.ext exclusive: off
      Nested Paging:   on
      Large Pages:     on
      VT-x VPID:       on
      State:           powered off (since 2012-12-19T07:57:37.202000000)
      Monitor count:   1
      3D Acceleration: off
      2D Video Acceleration: off
      Teleporter Enabled: off
      Teleporter Port: 0
      Teleporter Address:
      Teleporter Password:
      Tracing Enabled: off
      Allow Tracing to Access VM: off
      Tracing Configuration:
      Autostart Enabled: off
      Autostart Delay: 0
      Storage Controller Name (0):            IDE Controller
      Storage Controller Type (0):            PIIX4
      Storage Controller Instance Number (0): 0
      Storage Controller Max Port Count (0):  2
      Storage Controller Port Count (0):      2
      Storage Controller Bootable (0):        on
      Storage Controller Name (1):            SATA Controller
      Storage Controller Type (1):            IntelAhci
      Storage Controller Instance Number (1): 0
      Storage Controller Max Port Count (1):  30
      Storage Controller Port Count (1):      10
      Storage Controller Bootable (1):        on
      IDE (0, 0): /Users/jpiwowar/Downloads/Enterprise-R5-U6-Server-x86_64-dvd.iso (UUID: 43f3022e-fc22-44d0-bc86-8d82e3732d09)
      SATA (0, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacRoot.vdi (UUID: d3894bf3-aa74-4d61-b2f7-86b30f1a61db)
      SATA (1, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacORCL.vdi (UUID: 22565328-fe92-4467-b0a1-98d0bb71879d)
      NIC 1:           MAC: 08002768FFC0, Attachment: Host-only Interface 'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
      NIC 2:           MAC: 0800274C8D14, Attachment: Host-only Interface 'vboxnet1', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
      NIC 3:           MAC: 08002758D099, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
      NIC 3 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
      NIC 4:           disabled
      NIC 5:           disabled
      NIC 6:           disabled
      NIC 7:           disabled
      NIC 8:           disabled
      Pointing Device: PS/2 Mouse
      Keyboard Device: PS/2 Keyboard
      UART 1:          disabled
      UART 2:          disabled
      LPT 1:           disabled
    6. Boot the new VM (Thing1) in rescue mode from the install CD
      Enter “linux rescue” at the the boot: prompt to enter rescue mode:

      Select the keyboard and language preferences that suit you, and enable two network interfaces: eth0 and eth2 (for now, just select “use IPv4" and “DHCP” when configuring). There is no need to enable eth1, since only one of the host-only interface needs to be active for this exercise:


      (repeat the steps above for the NAT interface, eth2)
      After setting up the network interfaces, progress therough the menus (“Continue” and “OK” in my case) until you get to a linux prompt, and switch to the root volume as instructed: # chroot /mnt/sysimage
      Optional step: start the sshd service and connect to the VM from your host via ssh, instead of performing the next few steps from the console of the VM. Use ‘ifconfig eth0' to find the IP address to use. (Note: the root password for both VMs is ‘ovsroot’)
      # service sshd start

    7. Update a few configuration files

      The kernel modules that are loaded to support the Xen kernel are not going to work with the non-Xen kernel, so we need to update modprobe.conf to match our target kernel version:

      [root@localhost ~]# vi /etc/modprobe.conf
      "/etc/modprobe.conf" 3L, 77C written
      [root@localhost ~]# cat /etc/modprobe.conf
      alias eth0 e1000
      alias scsi_hostadapter ata_piix
      alias scsi_hostadapter ahci
      

      Prevent the server from repeatedly trying to spawn a console on a non-existent OVM server:

      [root@localhost ~]# perl -pi.orig -e 's/^(co)/#\1/' /etc/inittab
      [root@localhost ~]# tail /etc/inittab
      3:2345:respawn:/sbin/mingetty tty3
      4:2345:respawn:/sbin/mingetty tty4
      5:2345:respawn:/sbin/mingetty tty5
      6:2345:respawn:/sbin/mingetty tty6
      
      # Run xdm in runlevel 5
      x:5:respawn:/etc/X11/prefdm -nodaemon
      
      # Run a getty on the virtual console
      #co:2345:respawn:/sbin/agetty xvc0 9600 vt100-nav

      Remove the link to the init script that builds the VM template. We don't want that happening at boot time:

      [root@localhost ~]# rm /etc/rc3.d/S99oraclevm-template
      rm: remove symbolic link `/etc/rc3.d/S99oraclevm-template'? yes

      Update /etc/fstab and the cluster configuration scripts with correct references to disk device names (including a few disks we have't configured yet; that's coming)

      [root@localhost ~]# perl -pi.orig -e 's/xvd/sd/g' /etc/fstab
      [root@localhost ~]# cd /u01/racovm
      [root@localhost racovm]# perl -pi.orig -e 's/xvd/sd/g' params.ini netconfig.ini diskconfig.sh
      
    8. Install a new kernel and modify grub.conf

      This VM is configured with a Xen version of the Oracle Linux 5.5 kernel, so we need to grab a "vanilla" version of that kernel. We'll use the Oracle public yum server to accomplish this; that’s why we’ve configured and activated the NAT interface. Since you've set up your host to act as a DNS server already, you should not need to add a nameserver entry to resolv.conf. In my case, the VM was able to resolve the address for public-yum.oracle.com without any further configuration changes. If you have issues, try replacing the "nameserver" line in /etc/resolv.conf with "nameserver 8.8.8.8"

      [root@localhost ~]#  cd /etc/yum.repos.d/
      [root@localhost yum.repos.d]# cat /etc/resolv.conf
      nameserver 10.0.4.2 <i>--This worked for me, if it doesn't for you, try 8.8.8.8</i>
      [root@localhost yum.repos.d]# wget http://public-yum.oracle.com/public-yum-el5.repo
      --2012-12-18 15:59:56--  http://public-yum.oracle.com/public-yum-el5.repo
      Resolving public-yum.oracle.com... 141.146.44.34
      Connecting to public-yum.oracle.com|141.146.44.34|:80... connected.
      HTTP request sent, awaiting response... 200 OK
      Length: 3974 (3.9K) 1
      Saving to: `public-yum-el5.repo'
      
      100%[==========================================================================================>] 3,974       --.-K/s   in 0s
      

      Update the public-yum-el5.repo file and set enabled=0 for every source except ol5_u6_base:

      [root@localhost yum.repos.d]# vi public-yum-el5.repo
      "public-yum-el5.repo" 111L, 3974C written
      [root@localhost yum.repos.d]# grep -B5 'enabled=1' public-yum-el5.repo | grep ']'
      [el5_u5_base]
      

      Install the OEL5.5 kernel and kernel-devel packages from the Oracle puclic yum server. We'll need kernel-devel to install VirtualBox guest additions later

      [root@localhost yum.repos.d]# yum install kernel-2.6.18-194.el5 kernel-devel-2.6.18-194.el5
      Loaded plugins: security
      el5_u5_base                                                                                                  | 1.1 kB     00:00
      el5_u5_base/primary                                                                                          | 1.1 MB     00:02
      el5_u5_base                                                                                                               4372/4372
      Setting up Install Process
      Resolving Dependencies
      --> Running transaction check
      ---> Package kernel.x86_64 0:2.6.18-194.el5 set to be installed
      ---> Package kernel-devel.x86_64 0:2.6.18-194.el5 set to be installed
      --> Finished Dependency Resolution
      
      Dependencies Resolved
      
      ====================================================================================================================================
       Package                         Arch                      Version                             Repository                      Size
      ====================================================================================================================================
      Installing:
       kernel                          x86_64                    2.6.18-194.el5                      el5_u5_base                     20 M
       kernel-devel                    x86_64                    2.6.18-194.el5                      el5_u5_base                    5.5 M
      
      Transaction Summary
      ====================================================================================================================================
      Install       2 Package(s)
      Upgrade       0 Package(s)
      
      Total download size: 25 M
      Is this ok [y/N]: y
      Downloading Packages:
      (1/2): kernel-devel-2.6.18-194.el5.x86_64.rpm                                                                | 5.5 MB     00:20
      (2/2): kernel-2.6.18-194.el5.x86_64.rpm                                                                      |  20 MB     01:03
      ------------------------------------------------------------------------------------------------------------------------------------
      Total                                                                                               266 kB/s |  25 MB     01:37
      Running rpm_check_debug
      Running Transaction Test
      Finished Transaction Test
      Transaction Test Succeeded
      Running Transaction
        Installing     : kernel                                                                                                       1/2
        Installing     : kernel-devel                                                                                                 2/2
      
      Installed:
        kernel.x86_64 0:2.6.18-194.el5                                kernel-devel.x86_64 0:2.6.18-194.el5                              
      
      Complete!
      

      Create an initrd for the new kernel; this should also add a new stanza to grub.conf:

      [root@localhost yum.repos.d]# mkinitrd -v -f /boot/initrd-2.6.18-194.el5.img 2.6.18-194.el5
      Creating initramfs
      Looking for deps of module ehci-hcd
      Looking for deps of module ohci-hcd
      Looking for deps of module uhci-hcd
      Looking for deps of module ext3: jbd
      Looking for deps of module jbd
      Found root device sda2 for LABEL=/
      Looking for driver for device sda2
      Looking for deps of module pci:v00008086d00002829sv00000000sd00000000bc01sc06i01: scsi_mod libata ahci scsi_mod libata ahci
      Looking for deps of module scsi_mod
      Looking for deps of module sd_mod: scsi_mod
      Looking for deps of module libata: scsi_mod
      Looking for deps of module ahci: scsi_mod libata
      Looking for driver for device sda3
      Looking for deps of module pci:v00008086d00002829sv00000000sd00000000bc01sc06i01: scsi_mod libata ahci scsi_mod libata ahci
      Looking for deps of module ata_piix: scsi_mod libata
      Looking for deps of module ide-disk
      Looking for deps of module dm-mem-cache
      Looking for deps of module dm-region_hash: dm-mod dm-log
      Looking for deps of module dm-mod
      Looking for deps of module dm-log: dm-mod
      Looking for deps of module dm-message
      Looking for deps of module dm-raid45: dm-message dm-mod dm-mem-cache dm-log dm-region_hash
      Using modules:  /lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/ehci-hcd.ko /lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/ohci-hcd.ko /lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/uhci-hcd.ko /lib/modules/2.6.18-194.el5/kernel/fs/jbd/jbd.ko /lib/modules/2.6.18-194.el5/kernel/fs/ext3/ext3.ko /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/scsi_mod.ko /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/sd_mod.ko /lib/modules/2.6.18-194.el5/kernel/drivers/ata/libata.ko /lib/modules/2.6.18-194.el5/kernel/drivers/ata/ahci.ko /lib/modules/2.6.18-194.el5/kernel/drivers/ata/ata_piix.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-mem-cache.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-mod.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-log.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-region_hash.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-message.ko /lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-raid45.ko
      /sbin/nash -> /tmp/initrd.Tu1724/bin/nash
      /sbin/insmod.static -> /tmp/initrd.Tu1724/bin/insmod
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/ehci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/ehci-hcd.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/ohci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/ohci-hcd.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/usb/host/uhci-hcd.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/uhci-hcd.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/fs/jbd/jbd.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/jbd.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/fs/ext3/ext3.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/ext3.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/scsi/scsi_mod.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/scsi_mod.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/scsi/sd_mod.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/sd_mod.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/ata/libata.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/libata.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/ata/ahci.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/ahci.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/ata/ata_piix.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/ata_piix.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-mem-cache.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-mem-cache.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-mod.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-mod.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-log.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-log.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-region_hash.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-region_hash.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-message.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-message.ko' [elf64-x86-64]
      copy from `/lib/modules/2.6.18-194.el5/kernel/drivers/md/dm-raid45.ko' [elf64-x86-64] to `/tmp/initrd.Tu1724/lib/dm-raid45.ko' [elf64-x86-64]
      /sbin/dmraid.static -> /tmp/initrd.Tu1724/bin/dmraid
      /sbin/kpartx.static -> /tmp/initrd.Tu1724/bin/kpartx
      Adding module ehci-hcd
      Adding module ohci-hcd
      Adding module uhci-hcd
      Adding module jbd
      Adding module ext3
      Adding module scsi_mod
      Adding module sd_mod
      Adding module libata
      Adding module ahci
      Adding module ata_piix
      Adding module dm-mem-cache
      Adding module dm-mod
      Adding module dm-log
      Adding module dm-region_hash
      Adding module dm-message
      Adding module dm-raid45
      [root@localhost yum.repos.d]# head -30 /boot/grub/grub.conf
      # grub.conf generated by anaconda
      #
      # Note that you do not have to rerun grub after making changes to this file
      # NOTICE:  You have a /boot partition.  This means that
      #          all kernel and initrd paths are relative to /boot/, eg.
      #          root (hd0,0)
      #          kernel /vmlinuz-version ro root=/dev/xvda2
      #          initrd /initrd-version.img
      #boot=/dev/xvda
      timeout=9
      splashimage=(hd0,0)/grub/splash.xpm.gz
      hiddenmenu
      title Enterprise Linux Enterprise Linux Server (2.6.18-194.el5)
           root (hd0,0)
           kernel /vmlinuz-2.6.18-194.el5 ro root=LABEL=/ numa=off
           initrd /initrd-2.6.18-194.el5.img
      title Enterprise Linux Enterprise Linux Server (2.6.18-194.0.0.0.3.el5xen)
           root (hd0,0)
           kernel /vmlinuz-2.6.18-194.0.0.0.3.el5xen ro root=LABEL=/ numa=off
           initrd /initrd-2.6.18-194.0.0.0.3.el5xen.img
      

      Finally, add "divider=10" to the boot parameters in grub.conf to improve VM performance. This is often recommended as a way to reduce host CPU utilization when a VM is idle, but it also improves overall guest performance. When I tried my first run-through of this process without this parameter enabled, the cluster configuration script bogged down terribly, and failed midway through creating the database.

      [root@localhost yum.repos.d]# perl -pi.orig -e 's/(numa=off)/\1 divider=10/' /boot/grub/grub.conf
      [root@localhost yum.repos.d]# head -30 /boot/grub/grub.conf
      # grub.conf generated by anaconda
      #
      # Note that you do not have to rerun grub after making changes to this file
      # NOTICE:  You have a /boot partition.  This means that
      #          all kernel and initrd paths are relative to /boot/, eg.
      #          root (hd0,0)
      #          kernel /vmlinuz-version ro root=/dev/xvda2
      #          initrd /initrd-version.img
      #boot=/dev/xvda
      timeout=9
      splashimage=(hd0,0)/grub/splash.xpm.gz
      hiddenmenu
      title Enterprise Linux Enterprise Linux Server (2.6.18-194.el5)
           root (hd0,0)
           kernel /vmlinuz-2.6.18-194.el5 ro root=LABEL=/ numa=off divider=10
           initrd /initrd-2.6.18-194.el5.img
      title Enterprise Linux Enterprise Linux Server (2.6.18-194.0.0.0.3.el5xen)
           root (hd0,0)
           kernel /vmlinuz-2.6.18-194.0.0.0.3.el5xen ro root=LABEL=/ numa=off divider=10
           initrd /initrd-2.6.18-194.0.0.0.3.el5xen.img
      
    9. Reboot and install VirtualBox Guest Additions
      This is another recommended item to improve performance in your running VMs.
      • From the Devices menu, remove the Linux boot iso from the DVD drive
      • Reboot the VM by typing 'exit' twice in the console
      • Log in to the VM as root (password ovsroot)
      • Select "Install Guest Additions" from the Devices menu
      • Mount the Guest Additions media (attached to /dev/cdrom) from the console of the guest
      • Execute the VBoxLinuxAdditions.run script (add the --nox11 option, since we're not using X at this point)
      • After the Guest additions are installed (see screenshot below), shut down the guest (shutdown -h now)

    10. Clone the first VM to a second, and add the shared disks

      Big shout-out to Tim Hall here; apart from some minor variations, these are essentially his steps for adding shared disks to the cluster VMs. The primary difference is that we can use the clonevm functionality of VBoxManage, because we aren't attaching the (unformatted) shared disks until afterward. I've partially scripted these steps, to reduce the amount of copy/paste required.

      Please note that these commands assume a Unix-like shell. If you're running Windows and using a normal command shell (instead of something like Cygwin), you are no doubt a) exceedingly brave, b) very smart, and c) able to translate these commands to fit your environment. :)

      First, clone the "Thing1" VM to its partner, "Thing2":

      zathras:OVMRACTempl jpiwowar$ BASE_DIR=OVM_EL5U5_X86_64_11201RAC_PVM #this needs to be the location of your VM files, as specified in earlier steps!
      zathras:OVMRACTempl jpiwowar$ VMNAME1=Thing1
      zathras:OVMRACTempl jpiwowar$ VMNAME2=Thing2
      zathras:OVMRACTempl jpiwowar$ ASM_DISK_DIR=ASM  #the subdirectory under BASE_DIR where the shared disk files will be stored
      zathras:OVMRACTempl jpiwowar$ ASM_DISK_MB=4096  #size of each ASM disk, in MB.
      zathras:OVMRACTempl jpiwowar$ ASM_DISK_CNT=5    #number of ASM disks, must be 5 to match the OVM template cluster build script
      #Clone the VM...
      zathras:OVMRACTempl jpiwowar$ time VBoxManage clonevm $VMNAME1 --name $VMNAME2 --register --basefolder $BASE_DIR
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Machine has been successfully cloned as "Thing2"
      
      real	10m19.323s
      user	0m1.673s
      sys	0m2.312s
      

      Next, create and attach the shared disks:

      zathras:OVMRACTempl jpiwowar$ mkdir ${BASE_DIR}/${ASM_DISK_DIR}
      zathras:OVMRACTempl jpiwowar$ for i in `seq 1 $ASM_DISK_CNT`
      > do
      >    date
      >    VBoxManage createhd --filename ${BASE_DIR}/${ASM_DISK_DIR}/asm${i}.vdi --size $ASM_DISK_MB --format VDI --variant Fixed
      >    echo "Disk $i complete"
      > done
      Wed Dec 19 13:30:30 PST 2012
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Disk image created. UUID: 03cfd4f9-0f0b-4135-a9be-7e1a7f2dbfff
      Disk 1 complete
      Wed Dec 19 13:31:13 PST 2012
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Disk image created. UUID: e0bbcdf9-8ac6-4971-bc2c-f796196008e0
      Disk 2 complete
      Wed Dec 19 13:32:20 PST 2012
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Disk image created. UUID: 375212e5-5ce1-4b63-942e-14ea0a7c4450
      Disk 3 complete
      Wed Dec 19 13:33:15 PST 2012
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Disk image created. UUID: 2818b7cb-83e8-45c7-a005-a3d01abd901e
      Disk 4 complete
      Wed Dec 19 13:34:36 PST 2012
      0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
      Disk image created. UUID: 392f5b00-32ea-4201-b253-0bf6d11a60e7
      Disk 5 complete
      zathras:OVMRACTempl jpiwowar$ for i in `seq 1 $ASM_DISK_CNT`
      > do
      > VBoxManage modifyhd ${BASE_DIR}/${ASM_DISK_DIR}/asm${i}.vdi --type shareable
      > VBoxManage storageattach $VMNAME1 --storagectl "SATA Controller" --port $((i+1)) --device 0 --type hdd --medium ${BASE_DIR}/${ASM_DISK_DIR}/asm${i}.vdi --mtype shareable
      > VBoxManage storageattach $VMNAME2 --storagectl "SATA Controller" --port $((i+1)) --device 0 --type hdd --medium ${BASE_DIR}/${ASM_DISK_DIR}/asm${i}.vdi --mtype shareable
      > done
      #No output expected, unless something goes wrong. :-)

      You can run VBoxManage showvminfo $VMNAME1 and VBoxManage showvminfo $VMNAME2 to display configurations of both machines; they should match, and you should see new disks attached to the SATA controller:

      IDE (0, 0): /Applications/VirtualBox.app/Contents/MacOS/VBoxGuestAdditions.iso (UUID: edb51511-86a5-4f28-ac63-34a811446229)
      SATA (0, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacRoot.vdi (UUID: d3894bf3-aa74-4d61-b2f7-86b30f1a61db)
      SATA (1, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/Thing1/RacORCL.vdi (UUID: 22565328-fe92-4467-b0a1-98d0bb71879d)
      SATA (2, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/ASM/asm1.vdi (UUID: 03cfd4f9-0f0b-4135-a9be-7e1a7f2dbfff)
      SATA (3, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/ASM/asm2.vdi (UUID: e0bbcdf9-8ac6-4971-bc2c-f796196008e0)
      SATA (4, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/ASM/asm3.vdi (UUID: 375212e5-5ce1-4b63-942e-14ea0a7c4450)
      SATA (5, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/ASM/asm4.vdi (UUID: 2818b7cb-83e8-45c7-a005-a3d01abd901e)
      SATA (6, 0): /Users/jpiwowar/VMs/OVMRACTempl/OVM_EL5U5_X86_64_11201RAC_PVM/ASM/asm5.vdi (UUID: 392f5b00-32ea-4201-b253-0bf6d11a60e7)
    11. Finally, we get to build the RAC cluster!
      Start both VMs, and log in as root. Right now, neither machine has an "identity" on the network, so we need to configure them. On each node, execute the script /u01/racovm/netconfig.sh. These need to be run at the same time; the first node will wait until you've started the script on the second node. Answer 'YES' to "Is this the first node" question on node 1, 'NO' on node 2:

      Complete the network configuration "interview" on node1 to configure, and wait for changes to be propagated to node 2:

      At this point, both machines should be reachable from your host, via the hostnames and IPs you configured earlier. Now we can run the script that builds the RAC cluster and creates a small database called (what else?) ORCL. Connect to Thing1 (your first node) as root and run the script /u01/racovm/buildcluster.sh. I prefer to do this from an ssh session so I can keep track of the output, but you can just as easily run the script from the console. The build script took about 30 minutes on my laptop; your experience will probably be different.

      [root@thing1 ~]# /u01/racovm/buildcluster.sh
      Are you sure you want to install RAC?
      Do not run if software is already installed and/or running..  [yes|no]? yes
      Invoking on thing1 as root...
         Oracle RAC 11gR2 OneCommand (v1.2) for Oracle VM - (c) 2010-2011 Oracle Corporation
         Cksum: [1170221909 255000 racovm.sh] at Wed Dec 19 19:10:08 EST 2012
         Kernel: 2.6.18-194.el5 (x86_64) [1 processor(s)] 2011 MB
         Step(s): buildcluster
      [30-ish minutes later...]
      
      INFO (node:thing1): Running on: thing1 as oracle: export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1; /u01/app/oracle/product/11.2.0/dbhome_1/bin/srvctl status database -d ORCL
      Instance ORCL1 is running on node thing1
      Instance ORCL2 is running on node thing2
      
      INFO (node:thing1): Running on: thing1 as root: /u01/app/11.2.0/grid/bin/crsctl status resource -t
      --------------------------------------------------------------------------------
      NAME           TARGET  STATE        SERVER                   STATE_DETAILS
      --------------------------------------------------------------------------------
      Local Resources
      --------------------------------------------------------------------------------
      ora.DATA.dg
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      ora.LISTENER.lsnr
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      ora.asm
                     ONLINE  ONLINE       thing1                   Started
                     ONLINE  ONLINE       thing2                   Started
      ora.eons
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      ora.gsd
                     OFFLINE OFFLINE      thing1
                     OFFLINE OFFLINE      thing2
      ora.net1.network
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      ora.ons
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      ora.registry.acfs
                     ONLINE  ONLINE       thing1
                     ONLINE  ONLINE       thing2
      --------------------------------------------------------------------------------
      Cluster Resources
      --------------------------------------------------------------------------------
      ora.LISTENER_SCAN1.lsnr
            1        ONLINE  ONLINE       thing2
      ora.LISTENER_SCAN2.lsnr
            1        ONLINE  ONLINE       thing1
      ora.LISTENER_SCAN3.lsnr
            1        ONLINE  ONLINE       thing1
      ora.oc4j
            1        OFFLINE OFFLINE
      ora.orcl.db
            1        ONLINE  ONLINE       thing1                   Open
            2        ONLINE  ONLINE       thing2                   Open
      ora.scan1.vip
            1        ONLINE  ONLINE       thing2
      ora.scan2.vip
            1        ONLINE  ONLINE       thing1
      ora.scan3.vip
            1        ONLINE  ONLINE       thing1
      ora.thing1.vip
            1        ONLINE  ONLINE       thing1
      ora.thing2.vip
            1        ONLINE  ONLINE       thing2                                      
      
      INFO (node:thing1): For an explanation on resources in OFFLINE state, see Note:1068835.1
      2012-12-19 19:42:46:[clusterstate:Time :thing1] Completed successfully in 4 seconds (0h:00m:04s)
      2012-12-19 19:42:46:[buildcluster:Done :thing1] Build 11gR2 RAC Cluster
      2012-12-19 19:42:46:[buildcluster:Time :thing1] Completed successfully in 1958 seconds (0h:32m:38s)
      
      INFO (node:thing1): This entire build was logged in logfile: /u01/racovm/buildcluster1.log
      
    12. ...And we're done!
      Since this is still in the proof-of-concept phase for me, I haven't taken this cluster for a long test drive yet, just brief tours through asmctl and the ORCL database. As far as I can see, I have a healthy, happy RAC cluster on my workstation, but I welcome hearing about your experiences and any tweaks or optimizations. Please start a conversation in the comments!
  2. NOW they’re fixed…

    A few of our test EBS instances had developed a well-deserved reputation of taking a long time to start up. Here’s what we saw:

    1. Database ground to a near-halt shortly after running adstrtal.sh
    2. Most concurrent manager processes were being marked as “dead” in the internal manager log not long after startup, and the internal manager attempted to restart them.
    3. Consequence of #2: Three to four times as many FNDLIBR processes running on the app tier server as expected.
    4. Consequence of #2 and #3: The “Active sessions” graph in Grid Control resembled Mt. Kilimanjaro, and all three database server load average numbers (1, 5, and 15-minute) on were over 100.

    Needless to say, neither users (who could not connect to the test instance), developers (who could not connect to the Apps database), nor DBAs (who had to answer endless “when will the instance be up?” emails) were very happy.

    The culprit? This little query, executed for each FNDLIBR process as it started up, generating a ridiculous number of “control file sequential read” waits along the way:

    SELECT count(*)
      FROM v$thread;
    

    When I took a closer look at the query, this is what I found:

    SQL> select /*+ gather_plan_statistics */ count(*)
      2  from v$thread;
    
      COUNT(*)
    ----------
             2                                                                                                              
    
    Elapsed: 00:00:11.17
    SQL>  select *
      2  from table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------
    SQL_ID  057v054v2svhp, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */ count(*) from v$thread                                                             
    
    Plan hash value: 3150894624                                                                                             
    
    ------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                 | Name            | Starts | E-Rows | A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
    ------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT          |                 |      1 |        |      1 |00:00:11.16 |       |       |          |
    |   1 |  SORT AGGREGATE           |                 |      1 |      1 |      1 |00:00:11.16 |       |       |          |
    |   2 |   NESTED LOOPS            |                 |      1 |      1 |      2 |00:00:00.37 |       |       |          |
    |   3 |    MERGE JOIN CARTESIAN   |                 |      1 |      1 |   3969 |00:00:00.17 |       |       |          |
    |*  4 |     FIXED TABLE FULL      | X$KCCTIR        |      1 |      1 |     63 |00:00:00.09 |       |       |          |
    |   5 |     BUFFER SORT           |                 |     63 |      1 |   3969 |00:00:00.08 | 73728 | 73728 |          |
    |   6 |      FIXED TABLE FULL     | X$KCCCP         |      1 |      1 |     63 |00:00:00.05 |       |       |          |
    |*  7 |    FIXED TABLE FIXED INDEX| X$KCCRT (ind:1) |   3969 |      1 |      2 |00:00:11.00 |       |       |          |
    ------------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------                                                                     
    
       4 - filter("TR"."INST_ID"=USERENV('INSTANCE'))
       7 - filter(("RT"."INST_ID"=USERENV('INSTANCE') AND "RTNLF"<>0 AND "TIRNUM"="RTNUM" AND "CPTNO"="RTNUM"))
    

    That’s a lot of work to do for 2 rows. Clearly the internal manager was expecting a faster start time from its children, which explains why it kept attempting to start new ones. Repeatedly. Until we had over 200 FNDLIBR processes running instead of our expected 80-ish.

    After gathering fixed object statistics, the query behaved a bit better, and we therefore expect that our “slow to awaken” instance should be a bit more speedy:

    SQL> exec dbms_stats.gather_fixed_objects_stats
    
    PL/SQL procedure successfully completed.
    
    Elapsed: 00:03:11.45
    SQL> select /*+ gather_plan_statistics */ count(*) from v$thread
      2  ;
    
      COUNT(*)
    ----------
             2                                                                                                              
    
    Elapsed: 00:00:00.03
    SQL>  select *
      2  from table(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------
    SQL_ID  f1pmbmcstp1rj, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */ count(*) from v$thread                                                             
    
    Plan hash value: 93051267                                                                                               
    
    -----------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name             | Starts | E-Rows | A-Rows |   A-Time   |
    -----------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |                  |      1 |        |      1 |00:00:00.02 |
    |   1 |  SORT AGGREGATE            |                  |      1 |      1 |      1 |00:00:00.02 |
    |   2 |   NESTED LOOPS             |                  |      1 |      2 |      2 |00:00:00.02 |
    |   3 |    NESTED LOOPS            |                  |      1 |      2 |      2 |00:00:00.01 |
    |   4 |     FIXED TABLE FULL       | X$KCCRT          |      1 |      2 |      2 |00:00:00.01 |
    |   5 |     FIXED TABLE FIXED INDEX| X$KCCCP (ind:1)  |      2 |      1 |      2 |00:00:00.01 |
    |   6 |    FIXED TABLE FIXED INDEX | X$KCCTIR (ind:1) |      2 |      1 |      2 |00:00:00.02 |
    -----------------------------------------------------------------------------------------------
    
  3. Logging for non-existent listeners

    While working on an 11gR1 database server today, I fat-fingered the name of a secondary listener (you know, practicing my stock-trading skills), and in the process I noticed something that I hadn't considered before. Here's the setup:

    [oracle@11gr1srv ~]$ cd $DIAG_DIR/tnslsnr/11gr1srv
    [oracle@11gr1srv 11gr1srv]$ ls
    listener  listener_old
    [oracle@11gr1srv 11gr1srv]$ lsnrctl start LISTENER_ODL
    
    LSNRCTL for Linux: Version 11.1.0.7.0 - Production on 16-MAY-2010 13:09:10
    
    Copyright (c) 1991, 2008, Oracle.  All rights reserved.
    
    Starting /opt/oracle/app/oracle/product/11.1.0/dbhome_1/bin/tnslsnr: please wait...
    
    TNSLSNR for Linux: Version 11.1.0.7.0 - Production
    System parameter file is /opt/oracle/app/oracle/product/11.1.0/dbhome_1/network/admin/listener.ora
    Log messages written to /opt/oracle/app/oracle/diag/tnslsnr/11gr1srv/listener_odl/alert/log.xml
    TNS-01151: Missing listener name, LISTENER_ODL, in LISTENER.ORA
    
    Listener failed to start. See the error message(s) above...

    It looks like there are more consequences than brief embarrassment for that typo, though. My failed attempt to start the listener created a full Automatic Diagnostic Repository (ADR) directory structure for the non-existent listener:

    [oracle@11gr1srv 11gr1srv]$ ls
    listener  listener_odl  listener_old
    [oracle@11gr1srv 11gr1srv]$ ls -R listener_odl
    listener_odl:
    alert  cdump  incident  incpkg  lck  metadata  stage  sweep  trace
    
    listener_odl/alert:
    log.xml
    
    listener_odl/cdump:
    
    listener_odl/incident:
    
    listener_odl/incpkg:
    
    listener_odl/lck:
    AM_1096102193_3488045378.lck  AM_1744845641_3861997533.lck
    AM_1096102262_3454819329.lck  AM_3216668543_3129272988.lck
    
    listener_odl/metadata:
    ADR_CONTROL.ams       INC_METER_IMPT_DEF.ams
    ADR_INVALIDATION.ams  INC_METER_PK_IMPTS.ams
    
    listener_odl/stage:
    
    listener_odl/sweep:
    
    listener_odl/trace:
    listener_odl.log

    I looked through the Net Services docs, but wasn't able to find a way to disable this behavior, and my Google-fu failed me as well. This only seems to be happening for the 'start' command; other lsnrctl commands (reload, status, stop, etc) just return the expected "TNS-01101: Could not find service name" response.

    This sort of thing happens in 10g, too (and probably in older versions, I'm just too lazy to fire up my 9i test system), but the impact is much smaller: a single log file, not all of the ADR stuff that comes with 11g. Even in 11g, it's not a big deal; this just generates some light housecleaning work. At the extreme, I suppose it could be possible to launch the lamest DOS attack ever by chewing up lots of inodes, but there have to be more entertaining (and faster) ways to do that.

  4. Resolving ORA-4023 during a 10gR2->11gR1 upgrade

    Here's one for a hypothetical frustrated Googler. I just had something goofy happen to me while testing preparations for a database upgrade from 10gR2 to 11gR1. I'd already been through the process a few times on this server, so I wasn't expecting any problems. Since production upgrades usually happen during time windows that guarantee reduced mental capacity, however, I'd decided to perform one last dry run to verify my documentation and to test my "upgrade day" checklist. I cleaned out the database files and related configurations from the previous successful upgrade, and started over from step 0.

    Everything was going as expected until I actually ran DBUA. The first step after selecting the database to upgrade, "Gathering database information," was failing with the message:

    ORA-04023: Object SYS.STANDARD could not be validated or authorized

    At first, I thought I might have fallen prey to the conditions described in My Oracle Support Note 729909.1: Upgrading to 11.1.0 and DBUA reports ORA-4023 On SYS.STANDARD, but the diagnostic steps in the note did not match what I was seeing. Next stop: the DBUA logs in $ORACLE_BASE/cfgtoollogs/dbua/logs. The answer became obvious when I saw the following lines in the trace.log from the failed dbua session (some lines omitted for brevity):

    [main] [2:35:45:170] [DatabasesPage.initializePage:351]  Loading databases instance found in OraTab/Registry
    [main] [2:35:45:170] [OsUtilsUnix.enumerateSIDs:350]  checking sid: xxxx
    [main] [2:35:45:179] [Database.getStepSQLInterface:690]  OH in database: /opt/oracle/app/oracle/product/11.1.0/dbhome_1
    [main] [2:35:45:186] [Database.getStepSQLInterface:718]  sqlplus created with home:=/opt/oracle/app/oracle/product/11.1.0/dbhome_1 and sid:=xxxx
    [Thread-4] [2:35:49:44] [CompManager.setSelectedDatabase:1339]  setSelectedDatabase::oracleHome=/opt/oracle/app/oracle/product/11.1.0/dbhome_1 sid=xxxx
    [Thread-4] [2:35:49:45] [CompManager.setSelectedDatabase:1424]  Old home=/opt/oracle/app/oracle/product/11.1.0/dbhome_1
    [Thread-4] [2:35:49:47] [CompManager.setSelectedDatabase:1425]  New home=/opt/oracle/app/oracle/product/11.1.0/dbhome_1

    Sure enough, when I checked the same log file from a previous successful upgrade, the corresponding lines in the trace.log file all referred to the old (10.2.0) ORACLE_HOME. Despite all my careful cleanup, I had somehow forgotten to reset the value of the ORACLE_HOME for the database in /etc/oratab:

    [oracle@testsrv logs]$ grep product /etc/oratab
    xxxx:/opt/oracle/app/oracle/product/11.1.0/dbhome_1:N
    

    To make matters worse, I had also failed to notice that the ORACLE_HOME value was wrong in the DBUA interface. This, I suppose, is why we test. :-P

    After setting the ORACLE_HOME in /etc/oratab back to the 10gR2 value, I was able to complete the upgrade successfully.

  5. Friday mumblings: VanOUG, 11gR2, and EBS

    At the first meeting of the newly-reconstituted Vancouver Oracle Users Group this past week, we were treated to three great presentations by Caleb Small and Dan Morgan. They've made the content of their presentations available on the VanOUG web site (these links go to PDFs, if that sort of thing bugs you, consider yourself warned):

    I'm not going to go into a full recap of the presentations, but they were all full of really cool information. This post is an attempt to collect some of my mental notes, mostly cast in the context of one of my favorite topics, Oracle Applications.

    11gR2 HA Best Practices

    Caleb's presentation was very thorough and well-constructed. Dan gave him grief for boring the audience, but I think there was just so much new content to absorb that people were too busy processing to ask many questions on the fly. ;) Here are some one-liners from my notebook (anything that looks like an opinion is my commentary/interpretation, not Caleb's):

    • Lots more "moving parts" in 11gR2 Grid Infrastructure, clear "separation of duties" across three privileged OS accounts.
    • Service startup order is a little different now
    • Cluster status utilities show a lot more information, but need to learn to not rely upon crsstat as much
    • Proper networking configuration of 11gR2 GI not for the faint of heart. ;-)
    • Increased memory requirements will make this tougher to virtualize; I'm going to need a bigger laptop.
    • ACFS looks interesting; I wonder if it will be a valid option for an (shared application tier filesystem) for Oracle Applications. (Turns out the answer is "not currently planned," based on this exchange I had w/ Steven Chan on his blog later in the week).

    11g New Features

    There's a lot of really neat stuff going on in this presentation. I'd like to call out small nugget that, while far from the most important, is still pretty interesting on the surface: "deferred segment creation." When a table is created, no extents are actually allocated until rows are inserted. Seems like an odd feature, but one touted benefit is for large ERP systems like SAP and Oracle Applications, where lots of tables are created that may never be used, depending on what products are implemented. Those thousands of initial extents can certainly add up to real storage, and a more cluttered data dictionary. I can't speak to SAP implementations, but I don't see it as a huge win for EBS customers, given that:

    1. This feature is available only when tables are created, which means the benefit will only really be available when Oracle starts shipping Oracle Applications install media with an 11gR2 database. Anyone upgrading to the 11gR2 database will still be stuck with those empty extents.
    2. Given the overall footprint of an EBS database, the storage savings isn't such a big deal. For example, here's the potential savings from eliminating "empty" tables from an R12 Vision database:
        SYSTEM@R12VIS(11.1.0.7)>select sum(bytes)/1024/1024 potential_savings
      2  from dba_segments s
      3  where exists (select table_name
      4                  from dba_tables
      5                 where num_rows = 0
      6                   and table_name = s.segment_name
      7               )
      8  /
      
      POTENTIAL_SAVINGS
      -----------------
      3850.36719
      

      3.5(ish) GB out of 200GB is okay, I guess, but not a huge deal for a system that's only going to keep growing. FWIW, I'm going to wave my hands and pretend that the fact that a Vision database has way more populated tables than a "fresh-install" EBS database is balanced by the fact that my quick query doesn't account for the possibility that table stats are stale and some of those tables are actually populated. ;-)

    Of course, it's possible that I'm missing the point. It wouldn't be the first time! Maybe it really comes down more to a less-cluttered data dictionary. I mean, it can't be about tablespace fragmentation, since we're not supposed to care about that anymore, right?

    Edition Based Redefinition (EBR)

    This seemed like an interesting feature when I first heard about it last autumn, but I'll confess that I didn't quite comprehend the power of EBR until seeing Dan's demo (parts 1, 2, and 3 are on his Morgan's Library site, with part 4 still in the works). Setting aside the obvious benefits for home-grown applications, the potential benefits in an Oracle Applications environment are huge. Consider:

    1. There's already an option to create a staged Applications System to shorten patch downtime windows, allowing administrators to run the "copy" and "generate" portions of large EBS patches prior to applying the patch to production. With EBR, it could be possible to stage the "database" portion of a patch as well, and switch to a new default edition at patch time. You'd probably still want to do the database staging at a quiet time in the database, of course, but daring souls could accomplish "almost-no-downtime" patching if EBR were worked into the Oracle Applications patching framework. Wicked.
    2. EBR might even make it possible to truly have EBS patches that could be rolled back. The current patching process already backs up files that are replaced. Thoughtful application of cross-edition triggers might make it possible to revert to a previous edition without loss of data if a patch needs to be backed out. Granted, the process would have to be demonstrated to be pretty bullet-proof before I'd try it in production, but it could save restoring test and dev systems from backup in the event that a patch doesn't work out as expected.

    Just as I might be missing the point about deferred segment creation's advantages in EBS, I might be guilty of over-extended enthusiasm with respect to edition based redefinition. Or maybe I've decided to turn this into a science fiction blog. ;-) It's sure to be far more complicated to implement EBR in an Oracle Applications context than I'm implying above, and this is only speculation on my part, not anything that's actually promised by Oracle. Still, a nerd can dream...

    Thanks again to Caleb and Dan for the great presentations, and for your continued support in getting the user group launched!

  6. 32-bit to 64-bit database migration tips: OLAP upgrade

    A while ago, I had the opportunity to migrate an E-Business Suite database (Apps version 12.0.4, database version 10.2.0.4) from 32-bit Linux to 64-bit Linux. It's a straightforward process, outlined in My Oracle Support Note 471566.1: Migrating Oracle E-Business Suite R12 from Linux 32-bit to Linux 64-bit. Performing one of the critical migration steps, upgrading OLAP analytical workspaces (AWs), requires some careful reading, starting with the primary migration document for the database tier: Note 456197.1: Using Oracle E-Business Suite Release 12 with a Database Tier Only Platform on Oracle 10g Release 2. This document directs you to Note 352306.1: Upgrading OLAP from 32 to 64 bits, which covers the migration process for OLAP workspaces: export and delete from the the 32-bit system, then recreate on the 64-bit system and import the contents. The remainder of this blog post includes some embellishment of those four steps, from my migration notes. Examples were recreated on my test database; please don't look for these workspace names in an actual EBS database.

    Please note that I'm by no means an OLAP expert; if you have your own observations or experiences to share, including corrections to any errors I might have made, please leave a comment. The last thing I want to do is spread misinformation! And, as always, remember: test systems exist for a reason, and instructions from Oracle Support should trump anything you read in this blog entry :-)

    "No objects to export" error when exporting AWs

    The export process is explained thoroughly in Note 352306.1. You may encounter the following error, however, when attempting to export an empty workspace:

    BEGIN dbms_aw.execute('export all to eif file ''EXPORT_DIR/AWTEST.eif'''); END;
    *
    ERROR at line 1:
    ORA-33390: There are no objects to export.
    ORA-06512: at "SYS.DBMS_AW", line 93
    ORA-06512: at "SYS.DBMS_AW", line 122
    ORA-06512: at line 1
    

    An export file will not be created, since there's no data in the workspace. Nonetheless, you will still need to recreate the AW in the 64-bit database, which leads us to the next section...

    Before deleting AWs

    In addition to gathering the OLAP workspace's name, schema, and tablespace, make sure that you make a note of how the AW is partitioned. This will allow you to more faithfully reconstruct the AW in the 64-bit database. Again, the basics can be found in Note 352306.1, except for a discussion of workspace partitioning. According to the documentation for DBMS_AW.AW_CREATE, by default, analytic workspaces are created with 8 partitions. Querying dba_segments seemed to tell a different story:

    SYSTEM@mactest(10.2.0.4)>select segment_name
    2    , segment_type
    3    , count(*)
    4   from dba_segments
    5   where segment_name= 'AW$TESTDEFAULT'
    6   group by segment_name
    7   , segment_type;
    
    SEGMENT_NAME                   SEGMENT_TYPE           COUNT(*)
    ------------------------------ -------------------- ----------
    AW$TESTDEFAULT                 TABLE SUBPARTITION           16
    

    This initially confused me, until I found that the table created for the default workspace is actually comprised of two partitions, each comprised of 8 subpartitions. Apparently, "partition" means different things to different people:

    SYSTEM@mactest(10.2.0.4)>select table_name
    2  , partition_name
    3  , subpartition_count sub
    4  from all_tab_partitions
    5  where table_name = 'AW$TESTDEFAULT';
    
    TABLE_NAME           PARTITION_NAME        SUB
    -------------------- -------------- ----------
    AW$TESTDEFAULT       PTN1                    8
    AW$TESTDEFAULT       PTNN                    8
    

    So, before you delete the AWs in the 32-bit database, be sure to consult the data dictionary. In most cases, you'll probably see segment count of 16 in dba_segments (implying a default partitioning scheme). But there are exceptions...

    There's always one goofball

    One of those exceptions came when my query to get a count of AW segments returned a 1. Naturally, I was expecting an even number, so this came as a surprise. At first, I thought this might be a special case when specifying partnum=>1 when creating the workspace:

    SYSTEM@mactest(10.2.0.4)>exec dbms_aw.aw_create('JPTEST.TEST1PART','USERS',1);
    PL/SQL procedure successfully completed.
    
    SYSTEM@mactest(10.2.0.4)>select segment_name
    2  , segment_type
    3  ,count(*)
    4  from dba_segments
    5  where segment_name = 'AW$TEST1PART'
    6  group by segment_name
    7  , segment_type
    8  /
    
    SEGMENT_NAME                   SEGMENT_TYPE           COUNT(*)
    ------------------------------ -------------------- ----------
    AW$TEST1PART                   TABLE SUBPARTITION            2
    

    Then it occurred to me that zero is also a number... ;-)

    SYSTEM@mactest(10.2.0.4)>exec dbms_aw.aw_create('JPTEST.TEST0PART','USERS',0);
    PL/SQL procedure successfully completed.
    
    SYSTEM@mactest(10.2.0.4)>select segment_name
    2  , segment_type
    3  ,count(*)
    4  from dba_segments
    5  where segment_name = 'AW$TEST0PART'
    6  group by segment_type
    7  , segment_name
    8  /
    
    SEGMENT_NAME                   SEGMENT_TYPE           COUNT(*)
    ------------------------------ -------------------- ----------
    AW$TEST0PART                   TABLE                         1
    

    Please recall: Note 352306.1 recommends using the Analytic Workspace Manager (AWM) tool to recreate the AWs in the 64-bit database. If you want to create the AWs manually, I suggest engaging with Oracle Support to get their approval. The preceding examples are provided only for illustration of what's going on when the AW is created.

    Importing AWs

    I don't really have much to add here, other than:

    1. If you had any empty AWs in the 32-bit system, you won't have anything to import for those workspaces, though hopefully you recreated them in the 64-bit system
    2. Isn't this process the sort of thing that cries out to be scripted? Would I really resort to such cheap, obvious devices to foreshadow my next post? I would indeed!
  7. Keeping OPatch up-to-date: an object lesson

    I just did something silly while applying the January 2010 PSU to an 11gR1 ORACLE_HOME, and thought I'd share, in case someone else is Googling for the error message.

    [oracle@dbserv patches]$ cd 9209238
    [oracle@dbserv 9209238]$ opatch apply
    
    Invoking OPatch 11.1.0.6.2
    Oracle Interim Patch Installer version 11.1.0.6.2
    Copyright (c) 2007, Oracle Corporation. All rights reserved.
    
    (some OPatch output snipped)
    
    ApplySession failed: Patch ID is null.
    System intact, OPatch will not attempt to restore the system
    
    OPatch failed with error code 73
    

    A closer look at the installation prereqs reveals the following statement: "You must use the OPatch utility version 11.1.0.6.7 or later," which I confess that I missed the first time around. D'oh. After installing the latest version of OPatch, the "opatch apply" command worked as expected.

    Lessons reinforced:

    1. Even if you think you're up-to-date on prerequisites, triple-checking is a good idea
    2. Even humble tools like the OPatch utility can change pretty frequently
    3. Test systems are useful for testing your patching process, not just patches themselves

    (Everyone join in, now: "Thanks, Mr. Obvious, you're a life saver!") :-P

  8. Oracle 10gR2 RDBMS for Mac OS X (Intel) has arrived!

    This post was written before Snow Leopard (OS X 10.6) shipped. If you're interested in an installation guide for Oracle 10gR2 on Snow Leopard, Raimonds Simanovskis has provided instructions on his blog.

    In the wake of a small flurry of announcements yesterday about the release of Oracle 10gR2 for Mac OS X Intel (almost two weeks after April Fools' Day, so knock it off with the jokes already ;-) ), I decided to try a test installation on my Macbook Pro. The good news is that it works, even if you aren't running OS X Server 10.5.4 as specified in the release notes:

    OraDBCASuccess.jpg

    For the record, this is Mac OS X Leopard, desktop version 10.5.6. I'm still considering whether or not to post an "install guide," since everything more or less worked as advertised. And really, does the Internet need another dozen screenshots of OUI and DBCA doing their usual thing? I feel guilty enough for the screenshot above.

    Install guide updates, 12-13 April 2009:

    So now what?

    Apart from the novelty of it all, is this a worthwhile exercise? I'm not sure yet. I've been using the "Oracle in a VM" approach for a really long time, and I really like the flexibility of being able to play around with system configurations, etc. without worrying about doing any damage to my host system. With snapshots enabled, I don't even have to worry very much about permanently damaging the VM itself. Also, as stated in the release notes  (and called out in this thread on the OTN Apple forum, where hope springs eternal and no one is ever bitter or cranky), the list of supported features for the OS X version is a bit shorter than one might expect. On the other hand, "unsupported" doesn't always mean "won't work," and the memory footprint of a native database is a bit less than that of a full OS + database running in a virtualized environment. My expectation is that for quick testing and tinkering, the native database install could fit a niche, but VM still rules for more advanced stuff like RAC, ASM, 11g, etc.

    What's encouraging is that it appears that Oracle hasn't abandoned OS X as a platform. Feature set aside, I suspect they'll need to do more frequent releases before many people would be very comfortable using this in an enterprise setting. But this also opens the possibility of different licensing options...Express Edition for OS X, anyone? OSXXE? ;-)

    If someone manages to port this to iPhone, though, let me know. That would rock.

  9. Fixing file permission problems on Grid Control targets

    When installing Grid Control agents on a Linux or Unix platform, a recommended practice is to install as a user that doesn't own the ORACLE_HOMEs to be monitored. This poses a challenge when configuring monitoring for some targets, particularly those based on Oracle Application Server 10g. For security purposes, file and directory permissions on some ORACLE_HOMEs are fairly restrictive, and the Grid Control agent can't read all of the files necessary for target discovery and metric collection. The Grid Control release notes mention this complication, and Metalink Note 437078.1 provides additional suggestions for resolving discovery and metric collection errors related to file permission restrictions on a target ORACLE_HOME. I can't reveal the content of the Metalink My Oracle Support Note, and the information in the publicly-available Release Notes provides inadequate coverage, but the overall problem-solving method reduces to:

    1. Make sure that the owner of the agent software is in the same group as the owner of the monitored target software
    2. Check the agent log files for file permission errors, and fix them.

    My experience so far indicates that fixing some errors tends to reveal others. Who doesn't love a rousing game of log file whack-a-mole? In the interest of saving others a few rounds of this, I'm offering up a few additional permission change steps that could make your lives easier. For ease of navigation, I've broken things into the following sections:

    Disclaimer: When considering making the changes I describe below, please consider the trade-off you are making: more robust monitoring at the expense of some level of security. The perceived degree of sacrifice will vary from person to person and organization to organization. I would argue that the benefits outweigh the risks, and since Oracle itself suggests similar changes, this is clearly not a cut-and-dried issue. Hey, if doing the right thing were always easy, this job wouldn't be any fun, right? ;-)

    Please feel free to leave any similar tips (or links to your own posts about this topic) in the comments.

    Oracle Application Server

    Note 437078.1 is a good start, but there are a few additional items I found to ease metric collection and discovery errors for Oracle 10g App Server. ORACLE_HOME refers to the location of the 10gAS software. Depending on the application server components installed on your system, you may need to change more or fewer permissions.

    chmod g+rx $ORACLE_HOME/webcache
    chmod g+r $ORACLE_HOME/webcache/*.xml
    chmod g+rx $ORACLE_HOME/portal
    chmod g+rx $ORACLE_HOME/portal/conf
    chmod g+rx $ORACLE_HOME/discoverer
    chmod g+rx $ORACLE_HOME/discoverer/config
    chmod g+rx $ORACLE_HOME/forms
    chmod g+rx $ORACLE_HOME/forms/server
    chmod g+rx $ORACLE_HOME/ldap
    chmod g+rx $ORACLE_HOME/ldap/das
    chmod g+rx $ORACLE_HOME/uix
    chmod g+rx $ORACLE_HOME/ultrasearch
    chmod g+rx $ORACLE_HOME/ultrasearch/webapp
    chmod g+rx $ORACLE_HOME/ultrasearch/webapp/config
    chmod g+rx $ORACLE_HOME/jpi
    chmod g+rx $ORACLE_HOME/jpi/doc
    chmod g+rx $ORACLE_HOME/Apache/oradav
    chmod g+rx $ORACLE_HOME/Apache/oradav/conf
    chmod g+rx $ORACLE_HOME/Apache/jsp
    chmod g+rx $ORACLE_HOME/Apache/jsp/conf
    chmod g+rx $ORACLE_HOME/Apache/modplsql
    chmod g+rx $ORACLE_HOME/Apache/modplsql/conf
    chmod g+rx $ORACLE_HOME/sso
    chmod g+rx $ORACLE_HOME/sso/conf

    Database targets

    Some environments (for example, the 10gAS infrastructure database) will require permission changes to effectively monitor database and listener targets. You may find the scope of these changes to be a bit broad; at some point I got tired of chasing down individual files and just hit them all with the same hammer. :-) In this case, ORACLE_HOME refers to the ORACLE_HOME of the RDBMS software.

    find $ORACLE_HOME/admin -type d -exec chmod g+rx {} \;
    find $ORACLE_HOME/admin -type f -exec chmod g+r {} \;
    
    find $ORACLE_HOME/network -type d -exec chmod g+rx {} \;
    find $ORACLE_HOME/network -type f -exec chmod g+r {} \;
    
    find $ORACLE_HOME/rdbms -type d -exec chmod g+rx {} \;
    find $ORACLE_HOME/rdbms -type f -exec chmod g+r {} \;
    
    ## The following are needed for database health checks to work properly...
    chmod g+rx $ORACLE_HOME/dbs
    chmod g+rw $ORACLE_HOME/dbs/hc*
    

    Another general configuration note for the database listener, which I stumbled upon when setting up monitoring for a Secure Enterprise Search repository database: If sqlnet.ora is configured with TCP.VALIDNODE_CHECKING=yes, then the server that hosts the Grid Control OMS needs to be added to the list of hosts defined by TCP.INVITED_NODES. Otherwise, you'll get messages like this when attempting to add the database target, even if all monitoring credentials are correct:

    Failed to connect to the database: Io exception: Got minus one from a read call
    The Connect Descriptor was (description=(address=(host=ses.myorg.com)(protocol=tcp)(port=xxxx))
    (connect_data=(service_name=SESdb)(instance_name=SESdb)(UR=A)))</pre>
    And corresponding messages in the listener log:
    <pre>TNS-12546: TNS:permission denied
     TNS-12560: TNS:protocol adapter error
      TNS-00516: Permission denied
    

    Of course, after adding the OMS host to TCP.INVITED_NODES in sqlnet.ora, you need to tell the listener that you've made a change before the OMS will be able to connect to the database and add it as a target:

    oracle@ses:~> lsnrctl  reload
    
    LSNRCTL for Linux: Version 10.1.0.5.0 - Production on 05-DEC-2008 10:58:45
    
    Copyright (c) 1991, 2004, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ses.myorg.com)(PORT=xxxx)))
    The command completed successfully

    Oracle Collaboration Suite Calendar Server

    These commands will resolve some issues with metric collection for OCS Calendar Server. I suspect that the intersection of Collaboration Suite administrators (a small set) and readers of this blog (a vanishingly small set) is probably zero, but Google works in mysterious ways. ORACLE_HOME refers to the OCS software location, not the Grid Control agent home.

    chmod g+rx $ORACLE_HOME/ocal/bin
    chmod g+rx $ORACLE_HOME/ocal/bin/uniwho
    chmod g+rx $ORACLE_HOME/ocal/bin/unireqdump
    chmod g+rx $ORACLE_HOME/ocal/bin/uniping
    chmod g+rx $ORACLE_HOME/ocal/bin/unistatus
    chmod g+rx $ORACLE_HOME/ocal/sbin
    chmod g+rx $ORACLE_HOME/ocal/sbin/who
    chmod g+rx $ORACLE_HOME/ocal/sbin/reqdump
    chmod g+rx $ORACLE_HOME/ocal/sbin/ping
    chmod g+rx $ORACLE_HOME/ocal/sbin/status
    chmod g+rx $ORACLE_HOME/ocal/oem
    chmod g+rx $ORACLE_HOME/ocal/oem/scripts
    chmod g+rx $ORACLE_HOME/ocal/oem/scripts/ocal_ps.pl
    chmod g+rx $ORACLE_HOME/ocal/oem/scripts/ocal_dbsize.pl
    chmod g+rx $ORACLE_HOME/ocas
    chmod g+rx $ORACLE_HOME/ocas/linkdb
    chmod g+rx $ORACLE_HOME/ocas/sessiondb
    chmod g+rx $ORACLE_HOME/lib
    chmod g+rx $ORACLE_HOME/ocal/lib
    chmod g+r $ORACLE_HOME/ocal/lib/*.so
    chmod g+rx $ORACLE_HOME/ocal/db
    chmod g+rx $ORACLE_HOME/ocal/db/nodes
    find $ORACLE_HOME/ocal/db/nodes -type d -exec chmod g+rx {} \;
    find $ORACLE_HOME/ocal/db/nodes -type f -exec chmod g+r {} \;
    

    E-Business Suite

    Setting these permissions in an E-Business Suite R12 instance resolved some metric collection errors of the form

    Couldn't open INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/oafm/orion-application.xml: Permission denied at AGENT_HOME/sysman/admin/scripts/ias/simpleXPath.pm line 116

    These errors may not manifest in 11i, since that version's tech stack is not based on 10gAS. INST_TOP is the "Instance TOP" environment variable for the R12 Apps installation, and serves as a reminder to connect as the Apps owner (not database software owner) to run these commands.

    chmod g+r $INST_TOP/ora/10.1.3/j2ee/forms/config/jms.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/forms/application-deployments/forms/orion-application.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/forms/application-deployments/forms/formsweb/orion-web.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oacore/config/jms.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oacore/application-deployments/oacore/orion-application.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oacore/application-deployments/oacore/html/orion-web.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/config/jms.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/oafm/orion-application.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/oafm/webservices/orion-web.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/mapviewer/orion-application.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/mapviewer/web/orion-web.xml
    chmod g+r $INST_TOP/ora/10.1.3/j2ee/oafm/application-deployments/ascontrol/orion-application.xml
    
  10. Two OCP exam mini-reviews: 1Z0-040 (10g DBA New Features) and 1Z0-238 (R12 Install, Patch and Maintain)

    Here are my impressions of two Oracle certification exams I've had the opportunity to take in the past few months: 1Z0-040 (Oracle Database 10g: New Features for Administrators), and the beta version of 1Z0-238 (Oracle EBS R12: Install, Patch and Maintain Applications).

    1Z0-040: Less painful than upgrading to Vista (cheaper, too)

    For the most part, the upgrade path to 10g DBA OCP seems easier than starting from scratch. Compared to the two (soon to be three) exams and shelling out for classroom instruction required for the regular certification path, a single exam with no classroom instruction requirement seems like a piece of cake. Of course, to take the upgrade exam, you have to have a prior OCP DBA certification, which means you've already been down that "class plus several exams" road.

    What evens out the effort even further is this: It's an upgrade exam. You aren't going to be tested on just "standard DBA material" as you would be in the regular exam track. You have to care about the new stuff, and you have to care a lot. Don't use VPD or audit features? Still using normal filesystems and regular files to store your data, instead of ASM? Doesn't matter! For this exam, you need to concentrate on new features. You also need to care about the differences between versions of features you might not have been using in 9i or 10g.

    There's nothing wrong with that. It's only fair that reinforcing and testing knowledge of new features is the primary purpose of an upgrade exam. If you're already a 9i OCP DBA, the upgrade exam is an opportunity to demonstrate your skills as a 10g DBA, not a 9i DBA who uses 10g as if it were 9i. Otherwise, you're not using all the tools at your disposal, or, even worse, you could be using them wrong.

    Although the exam only covers new features, it's not a trivia quiz. You're much more likely to see questions about ASM, new RMAN features, and install/upgrade procedures than questions about obscure initialization parameters and system catalog views. There is a little bit of everything on the exam, though, so it's still useful to be prepared. Besides, there's a benefit to preparing for an exam that tests you exclusively on new features: you're more likely to learn new things. I even learned a thing or two about RMAN during my exam prep, a tool I thought I already knew pretty well. Speaking of prep material, here's what I used to get ready for the exam:

    1Z1-238: Oh, hey, just one more question...

    In a few days, the beta period for this exam expires, and 1Z1-238 becomes 1Z0-238. Some of the following commentary may therefore have a very short "relevance half-life." Update: As of 21 January 2009, the production version of this exam is now live. The Oracle Certification blog has more details.

    In some television game shows, there's the concept of a "lightning round": answer as many questions/perform as many tasks as rapidly as you can in a short period of time, with lots of dramatic buildup. Not much time to think, and the more correct completions you get, the greater the reward.

    Imagine a "lightning round" that goes on for three hours, and you have a rough idea of what it's like to take an Oracle Certification Beta Exam. Non-beta exams have a smaller time allotment, but also substantially fewer questions, such that one usually has an average of 1.3-1.5 minutes to spend on each question. Beta exams allow an average of less than a minute per question, and even though many of the questions can be answered very quickly, even a few "slow questions" can make the experience feel a bit rushed. Since the point of a beta exam is to "test the test," it's not surprising that there were a handful of questions that would rate as fiendishly difficult. There were also a few that were not difficult, just genuinely incomprehensible. In those cases, I used the comment feature to point out where a question could benefit from some editing.

    In contrast to the 10g Database New Features exam, the R12 Install, Patch, and Maintain exam is not an upgrade exam. If you've taken the 11i version of this exam, a lot of the material will be familiar. The twist, of course, is that there *are* differences between R12 and 11i. Naturally, some of the questions covered material that was new in R12, and others of the questions were clearly crafted to highlight the differences between the versions. There were a few times that I had to change my answers when I realized that I'd answered a question in a way that was correct for 11i, but not for R12. As the title of the exam suggests, you can expect lots of questions about patching procedures, Apps directory structures, and using the various AD utilities.

    One last odd thing about taking a beta exam is the lack of instant feedback on the exam. Though I feel pretty confident about my performance, I have no idea if I passed this exam. I certainly have less experience with R12 than with 11i, and with 200+ questions I quickly lost track of any "gut feel" for how many questions I was getting wrong. Regardless, with a 10-week wait after the close of the beta period, it's looking like I won't know the verdict until sometime next year. Ah well, good experience either way.

    That's a wrap

    If you have questions about these exams, please ask in the comments. Also, here are some excellent resources for those interested in a broader range of Oracle certification topics:

  1. 1
  2. Next ›
  3. Last »