DBPedias

Your Database Knowledge Community

berxblog

  1. Resetting Your Oracle User Password with SQL Developer on OSX

    This post started as a shameless copy of Jeff Smiths Resetting Your Oracle User Password with SQL Developer - but it ended in a trilogy about OSX, Oracle and SQLDeveloper.
    In my first two posts I made myself comfortable with Oracle instant client on my MacBook. Now I want to use it for something more serious, like changing passwords.

    Even with all the modifications from previous post SQLDeveloper still does not show the Reset Password... option. The reason: it just don't know about the library correctly.

    First the PATH as shown in previous post is not set in an application started via launcher - It does not contain the ORACLE_HOME at all. But even as I forced that with a temporary hack it did not help; so I just can say: with OSX it's more complicated to let SQLDeveloper use the proper libraries.

    As the PATH is not that important on OSX, the real issue is DYLD_LIBRARY_PATH. Regardless if it's set in /etc/profiles or ~/.profiles - they never reach an application started via launcher, probably for similar reasons like PATH shown above.

    But there is hope: ORACLE_HOME as set in /etc/launchd.conf is shown in the environment of SQLDeveloper. At the end, that's all we need, now just to check where to put this little flame of hope to let it grow to a bonfire.

    From now on I'm talking about SQLDeveloper 3.2.20.09! Future Versions might behave different!

    The script to check is
    /Applications/SQLDeveloper.app/Contents/Resources/sqldeveloper/ide/bin/launcher.sh. There is a function called CheckLibraryPath() which is supposed to check for a proper LD_LIBRARY_PATH (or it's equivalences) and if not set create a proper setting. Oracle just forgot to check about OSX! I write forgot on purpose, as HP-UX is checked with
    if [ `uname -s` = 'HP-UX' ] and SHLIB_PATH is set instead of LD_LIBRARY_PATH.

    So my solution is simple: I just enhance this function with this little patch:
    524a525,541
    >  elif [ `uname -s` = 'Darwin' ]
    >  then
    >   echo "OSX!"
    >   if [ "X$DYLD_LIBRARY_PATH" = "X" ]
    >  then
    >   DYLD_LIBRARY_PATH=$ORACLE_HOME
    >   #  only instant client exists on OSX
    >  else
    >   echo $DYLD_LIBRARY_PATH | egrep -e "(^|\:)$ORACLE_HOME($|\:)" > /dev/null
    >   if [ $? != 0 ]
    >   then
    >    DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:$ORACLE_HOME
    >   fi
    > 
    >  fi
    >  export DYLD_LIBRARY_PATH
    >  #echo "$DYLD_LIBRARY_PATH"
    


    With this little patch now I can change the password of my users with SQLDeveloper on OSX!
  2. setting environment variables for sqlplus on OSX

    In my previous post I showed how to create an universal binary for Oracle instant client so I don't need to care about 32/64 bit anymore.

    Now I try to show how to set some environment variables so I only need to enter sqlplus and can start my work without preparing anything first.

    I know I could do this in my local .profile, but I was looking for a more centralized way to provide them, maybe on a shared desktop where basic applications are installed just once.

    This is all about OSX Mountain Lion - I'm working on 10.8.2 right now.

    The variables I want to set/enhance are

    • PATH
      That is quite simple: I just added (as root) a new line in /etc/paths:
      /Users/berx/instantclient_11_2

    • ORACLE_HOME
      That was a little bit more tricky as I had to create (as root) the file first. In /etc/launchd.conf
      there is one line now:
      setenv ORACLE_HOME /Users/berx/instantclient_11_2

    • DYLD_LIBRARY_PATH
      If you are curious what this is used for, it's the equivalent to Linux LD_LIBRARY_PATH. This is needed for all applications which are linked dynamically. Even sqlplus is linked that way.
      Unfortunately it can not be set in /etc/launchd.conf. To be more precise, it CAN be set, but it will not show up in the terminal.
      Instead it will create this line at the top of each terminal window direct after it opens:
      dyld: DYLD_ environment variables being ignored because main executable (/usr/bin/login) is setuid or setgid
      That's due to a security setting in Mountain Lion.
      I decided to use the profile for this work. In my case ~/.profile, but /etc/profile also works:
      # ORACLE_HOME is set in /etc/launchd.conf !!
      export DYLD_LIBRARY_PATH=$ORACLE_HOME:$DYLD_LIBRARY_PATH

    With these 3 little steps now I can start sqlplus directly from my terminal without caring about the environment anymore.

    There is just one little drawback: all commands with setuid/setgid bit, like ps or sudo now throw a warning like this:
    dyld: DYLD_ environment variables being ignored because main executable (/bin/ps) is setuid or setgid
    But for me this is acceptable.

    The only workaround I could imagine is any kind of wrapper for sqlplus.
  3. Oracle client universal binaries for OSX

    Recently Oracle announced a new Oracle Instant Client for OSX. As always there is a 32-bit and a 64-bit version available.
    I was curious if they can be merged together to an universal binary. The short answer: yes, they can.

    It all starts with downloading the instant client zips for 32 and 64 binary. I loaded basic and sqlplus and unzipped them into 2 directories instantclient_11_2_32 and instantclient_11_2_64. I also created a target directory instantclient_11_2.

    Then the simple script
    for i in `ls instantclient_11_2_32`
    do 
    lipo -create ~/instantclient_11_2_{32,64}/$i -output ~/instantclient_11_2/$i || cp ~/instantclient_11_2_64/$i ~/instantclient_11_2/$i
    done
    


    did the work with this output:
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/BASIC_README
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/SQLPLUS_README
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/glogin.sql
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/ojdbc5.jar
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/ojdbc6.jar
    lipo: can't figure out the architecture type of: /Users/berx/instantclient_11_2_32/xstreams.jar
    


    here what a simple file * shows:

    BASIC_README:         ASCII text
    SQLPLUS_README:       ASCII text
    adrci:                Mach-O universal binary with 2 architectures
    adrci (for architecture i386): Mach-O executable i386
    adrci (for architecture x86_64): Mach-O 64-bit executable x86_64
    genezi:               Mach-O universal binary with 2 architectures
    genezi (for architecture i386): Mach-O executable i386
    genezi (for architecture x86_64): Mach-O 64-bit executable x86_64
    glogin.sql:           ASCII English text
    libclntsh.dylib.11.1: Mach-O universal binary with 2 architectures
    libclntsh.dylib.11.1 (for architecture i386): Mach-O dynamically linked shared library i386
    libclntsh.dylib.11.1 (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
    libnnz11.dylib:       Mach-O universal binary with 2 architectures
    libnnz11.dylib (for architecture i386): Mach-O dynamically linked shared library i386
    libnnz11.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
    libocci.dylib.11.1:   Mach-O universal binary with 2 architectures
    libocci.dylib.11.1 (for architecture i386): Mach-O dynamically linked shared library i386
    libocci.dylib.11.1 (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
    libociei.dylib:       Mach-O universal binary with 2 architectures
    libociei.dylib (for architecture i386): Mach-O bundle i386
    libociei.dylib (for architecture x86_64): Mach-O 64-bit bundle x86_64
    libocijdbc11.dylib:   Mach-O universal binary with 2 architectures
    libocijdbc11.dylib (for architecture i386): Mach-O bundle i386
    libocijdbc11.dylib (for architecture x86_64): Mach-O 64-bit bundle x86_64
    libsqlplus.dylib:     Mach-O universal binary with 2 architectures
    libsqlplus.dylib (for architecture i386): Mach-O dynamically linked shared library i386
    libsqlplus.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
    libsqlplusic.dylib:   Mach-O universal binary with 2 architectures
    libsqlplusic.dylib (for architecture i386): Mach-O bundle i386
    libsqlplusic.dylib (for architecture x86_64): Mach-O 64-bit bundle x86_64
    ojdbc5.jar:           Zip archive data, at least v1.0 to extract
    ojdbc6.jar:           Zip archive data, at least v1.0 to extract
    sqlplus:              Mach-O universal binary with 2 architectures
    sqlplus (for architecture i386): Mach-O executable i386
    sqlplus (for architecture x86_64): Mach-O 64-bit executable x86_64
    uidrvci:              Mach-O universal binary with 2 architectures
    uidrvci (for architecture i386): Mach-O executable i386
    uidrvci (for architecture x86_64): Mach-O 64-bit executable x86_64
    xstreams.jar:         Zip archive data, at least v1.0 to extract
    

    Ok, now all the files are merged (or just copied) together.
    Now let's check if 32 and 64 bit works.

    I need this environment set:

    export DYLD_LIBRARY_PATH=/Users/berx/instantclient_11_2
    export PATH=$PATH:/Users/berx/instantclient_11_2

    So let's thy the 32 bit version:

    arch -arch i386 ./sqlplus 
    
    SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 15 11:22:46 2013
    
    Copyright (c) 1982, 2012, Oracle.  All rights reserved.
    
    Enter user-name: 
    

    It looks good, just let's check if it's really running at 32 bit. Activity Monitor helps:


    And the same thing for 64 bit:

    arch -arch x86_64 sqlplus 
    
    SQL*Plus: Release 11.2.0.3.0 Production on Fri Feb 15 11:38:23 2013
    
    Copyright (c) 1982, 2012, Oracle.  All rights reserved.
    
    Enter user-name: 
    





    So this can be seen as a short example how we can use only one ORACLE_HOME for 32 and 64 bit binaries on OSX.
    But as always there is the usual disclaimer: It's not supported by Oracle; never do it on a production system (who is running anything Oracle-Related on OSX Server at all?); it's not tested with all the different applications which use oracle client.

    Update 1 (2012-02-17 21:50):

    If you want the binary installation more like a real ORACLE_HOME check Ronald Roods blog!

  4. side channel attack on ORA-00942

    Oracle Databases has a powerful set of grants and permissions. One of the easy philosophies behind it is just to hide anything a user is not allowed to see. Technically this leads to an error message
    ORA-00942: table or view does not exist.
    More precisely it should give a text like table or view does not exist or you are not allowed to access it.
    For an ordinary user/schema separation there might be no big difference: If user A can not read table B.TAB it's of no value for user A whether the object does not exist or is just not accessible. But from a security point of view it might be of some interest if there exists a particular object within the database. Even it's not exploitable yet, maybe it's worth to try to reach it, or there is a combined vector of attack together with other objects.
    To gain information about a system which should be kept hidden a well known method for physical cryptosystems is the side channel attack. At this attack extra information about the system is gained by measuring external channels like timings, power consumption and so on.
    I am showing a similar method to get many informations about objects within a database a user should not know in general.

    The described attack combines 2 well known concepts of every Oracle database: row cache and session statistics.

    The row rache (or data dictionary cache) ... is a collection of database tables and views containing reference information about the database, its structures, and its users. - It's important to know this cache is filled by SQL-statements against the data dictionary tables.

    session statistics just increase a counter for every session every time an action is executed.

    Now let's check if we can get some hidden information out of my test system.
    I did a alter system flush shared_pool; to make it slightly easier for the testcase, but even with a pre-filled row cache I am sure these informations can get collected with only little more effort.

    My measurement query is
    select ms.sid, ms.statistic#, sn.name, ms.value 
    from v$mystat ms, V$STATNAME sn 
    where sn.name in ('recursive calls', 'execute count') and ms.statistic# = sn.statistic# order by 2;
    
    with a result like
           SID STATISTIC# NAME                      VALUE
    ---------- ---------- -------------------- ----------
           221          9 recursive calls            3108
           221        588 execute count               947
    

    Now let's see where we can go. Something easy for the beginning. A select from a table which does not exist:
    SQL> select * from gfdsaqwert;
    select * from gfdsaqwert
                  *
    ERROR at line 1:
    ORA-00942: table or view does not exist
    
    with the measurement
           SID STATISTIC# NAME                      VALUE
    ---------- ---------- -------------------- ----------
           221          9 recursive calls            3111
           221        588 execute count               950
    
    this means, there where 3 recursive calls and 3 execute counts.

    Now for something more interesting - let's check if a user exists:
    SQL> select * from zzz.uztrfghj;
    select * from zzz.uztrfghj
                      *
    ERROR at line 1:
    ORA-00942: table or view does not exist
    
    20 recursive calls
     3 execute count

    SQL> select * from a.uztrfghj;
    select * from a.uztrfghj
                    *
    ERROR at line 1:
    ORA-00942: table or view does not exist
    
    21 recursive calls
     3 execute count

    In my test-DB no user ZZZ exists, but a user A exists. (but no table A.UZTRFGHJ) There is one more recursive call if the user exists. Not much, but a difference.

    SQL> select * from a.m ;
    select * from a.m
                    *
    ERROR at line 1:
    ORA-00942: table or view does not exist
    
    26 recursive calls
     7 execute count

    Now there are even more recursive calls and execute count. The Table A.M exists in my system.

    I'd say this shows the concept. The official answer from the database was ORA-00942: table or view does not exist all the time. But v$mystat gave us some more information we did not get officially.
  5. do not touch if you do not know for sure


    Oracle provides and documents a huge load of possibilities and functions for nearly every purpose. For me it is impossible to know all of them. Even to know such an area exists is hard.
    But still sometimes these functions Oracle does not document for customers purpose seems to be more attractive than those officially available.
    One of these attractive packages is DBMS_SYSTEM. You will not find any description of this package in the official Oracle documentation. There are some small traces available, but nothing really useful.
    Oracle also have quite clear words about using such unofficial, and hidden, packages:
    In How To Edit, Read, and Query the Alert.Log [ID 1072547.1] you can read:

    NOTE about DBMS_SYSTEM:
    This package should in fact only be installed when requested by Oracle Support.
    It is not documented in the server documentation.
    As such the package functionality may change without notice.
    It is to be used only as directed by Oracle Support and its use otherwise is not supported.

    Per internal Note 153324.1:
    Generally, if a package is not in the Oracle documentation at all, it is intentional, as it is not for end user use. Non-documented packages should be avoided by customers unless specifically instructed to use them by either Oracle Support or Oracle Development.

    For some reasons I'm one of those which likes to play with forbidden toys like these. I found a procedure in DBMS_SYSTEM which changed behavior slightly in 11gR2 (I've tested with 11.2.0.3 patchset - so maybe other patchsets behave quite different!)

    I'm talking about DBMS_SYSTEM.READ_EV. This procedure more or less calls directly the internal C-routine READ_EV_ICD. Common sense is, it should return the level of an event given. This is also quite true, just for one exception: the probably most known event in Oracle world: 10046 - or sql_trace.

    My test-script here
    VARIABLE lev number
    SET AUTOPRINT ON
    EXECUTE sys.dbms_system.read_ev(10046, :lev)
    
    ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
    
    SELECT sql_trace, sql_trace_waits, sql_trace_binds FROM v$session WHERE sid=userenv('sid')
    
    
    EXECUTE sys.dbms_system.read_ev(10046,:lev)
    
    oradebug setmypid
    oradebug eventdump session
    

    gives the expected result in one of my 10g test DBs:
    @test_read_ev.sql
    
    PL/SQL procedure successfully completed.
    
           LEV
    ----------
             0
    
    Session altered.
    
    PL/SQL procedure successfully completed.
    
           LEV
    ----------
    8
    
    Statement processed.
    10046 trace name CONTEXT level 8, forever
    

    but an unexpected result in my 11.2.0.3 test DB:

    @test_read_ev.sql
    
    PL/SQL procedure successfully completed.
    
           LEV
    ----------
             0
    
    Session altered.
    
    PL/SQL procedure successfully completed.
    
           LEV
    ----------
    0
    
    Statement processed.
    sql_trace level=8
    

    I guessed events with an ALIAS might be excluded somehow, but other tests with DEADLOCK==60 or DB_FILES==10222 shows this special behavior only with sql_trace.

    My todays conclusion is easy:
    If it's not there for you, don't guess you can play with it without any consequences.
  6. total abuse of technology



    I had a (for my environment) unusual request:
    After the migration of a Repository Database from 9i to latest 10g I was asked to keep a backup of the old DB for at least 3 years.
    This does not sound very unusual, but it's not that simple in our environment. We do only keep backups for weeks to some month, worst case. I also cannot just backup the datafiles at all: The old database run on Solaris, but we are switching to Linux right now. With just some bad luck I would not have any system to restore (or open) this database backup at all.
    This brought me to another solution; in my point of view it was not worth to write a blog about it, but I was asked by Leighton L. Nelson and so I write:

    1. export of the full database
      I run a simple export of the database. There is no expdp in 9i, so the choice was easy.

    2. compress the files
      the dump (and the logfile!) where tared together and compressed. Just to save space.

    3. prepare a proper store
      As mentioned above, there is no dedicated system for this purpose. So I had to prepare a place where the dump is safe. As a DBA, of course I know a good place to store data: A database!
      First a DBFS came to my mind. But the DB is in Version 10.2 - no DBFS.
      But it's quite simple to do the important steps manually:
      create tablespace old_dump datafile '+<DG>' autoextend on;
      create user old_dump identified by dump_old1 default tablespace old_dump;
      GRANT CONNECT, CREATE SESSION,  CREATE TABLE to old_dump;
      alter user old_tech_dump quota unlimited on old_dump;
      
      connect old_dump/dump_old1
      
      create table old_dump_store 
      (id integer primary key, description VARCHAR(2000), file_store BLOB) 
      LOB (file_store) STORE AS  SECUREFILE 
      (TABLESPACE old_dump DISABLE STORAGE IN ROW   NOCACHE LOGGING);
      

    4. insert the dump (and some metadata)

      There is a nice way in SQL Developer to load a file to a BLOB. It's just so simple.
      At last some words in the comment field are worth - so everyone knows what's inside the BLOB.
    It still might sound strange to save the dump of an old database into it's descendant. But at the end: do you know a better place to store data than a database?
  7. restore DBMS_SCHEDULER.CREATE_CREDENTIAL cleartext password


    If you want to use Oracle file watcher, you need to Create a Credential. As there a password needs to be stored in the database, Oracle tries to save it in a secure way. But as the password must be decrypted for the purpose to login on the file watchers agent side, it is not safe at all:
    The credentials are stored with DBMS_SCHEDULER.CREATE_CREDENTIAL. Here an example:

    exec DBMS_SCHEDULER.CREATE_CREDENTIAL(
      credential_name => 'local_credential',
      username => 'oracle',  password => 'welcome1');
    exec DBMS_SCHEDULER.CREATE_CREDENTIAL(
      credential_name => 'local_credential2',
      username => 'oracle2', password => 'welcome1');


    It's quite easy to see the values again:

    select o.object_name credential_name, username, password
     FROM SYS.SCHEDULER$_CREDENTIAL c, DBA_OBJECTS o
     WHERE c.obj# = o.object_id;

    CREDENTIAL_NAME    USERNAME PASSWORD
    ------------------ -------- ------------------------------------
    LOCAL_CREDENTIAL   oracle   BWVYxxK0fiEGAmtiKXULyfXXgjULdvHNLg==
    LOCAL_CREDENTIAL2  oracle2  BWyCCRtd8F0zAVYl44IhvVcJ2i8wNUniDQ==


    At least the password is somehow encrypted, and even the password was welcome1 for both credentials, the encrypted string is not identical.

    Nothing to blame here, but I mentioned, the password can be decrypted. So let's do so:

    SELECT u.name CREDENTIAL_OWNER, O.NAME CREDENTIAL_NAME, C.USERNAME,
      DBMS_ISCHED.GET_CREDENTIAL_PASSWORD(O.NAME, u.name) pwd
    FROM SYS.SCHEDULER$_CREDENTIAL C, SYS.OBJ$ O, SYS.USER$ U
    WHERE U.USER# = O.OWNER#
      AND C.OBJ#  = O.OBJ# ;

    CREDENTIAL_OWNER CREDENTIAL_NAME      USERNAME PWD
    ---------------- -------------------- -------- --------
    SYS              LOCAL_CREDENTIAL     oracle   welcome1
    SYS              LOCAL_CREDENTIAL2    oracle2  welcome1


    Can you see it? It's there. Try it at home!
    I don't blame anyone here. It's hard to store anything really safe in case you need to decrypt it also.
    But don't expect your password save, if you store it with DBMS_SCHEDULER.CREATE_CREDENTIAL.
    Maybe it's slightly to easy to use DBMS_ISCHED.GET_CREDENTIAL_PASSWORD (ok, only SYS can do so) but even it might be slightly more difficult in the future, the basic problem will still exist.

  8. Setting Up Oracle Connection Manager (without SOURCE_ROUTE)


    This post must be seen as a direct follow up to Arup Nandas Setting Up Oracle Connection Manager.
    As there are many references to this post, please read it first. Problem and Solution are quite similar, only the architecture is a little bit different:

    The Architecture

     The network diagram of the three machines is slightly different:


    There is a new needed connection: from the instance on dbhost1 to the connection manager on cmhost1.

    After changing the setup, you will need to rewrite the TNSNAMES.ORA in the following way:

    TNS_CM = 
      (DESCRIPTION = 
        (ADDRESS = 
          (PROTOCOL = TCP)(HOST = cmhost1)(PORT = 1950)
        )
        (CONNECT_DATA = 
          (SERVICE_NAME=srv1)
        )
      )

    You see, the (SOURCE_ROUTE = YES) disappeared as well as the ADDRESS of the listener on dbhost1.

    How it Works


    Note, all the special parameters and settings on the clients TNSNAMES.ORA disappeared. But the cman must know about the SERVICE_NAME it has to serve. As the cman can be seen as a special kind of listener, there is a common way a listener gets informed about a SERVICE_NAME: the Instance has to register the services to the listener. In general this is done by pmon at registering to logal_listener and remote_listener. In this case, remote_listener is the magic parameter.

    Setting Up


    You can follow step (1) to (9) as in Arups blog.
    But before (10) an additional step is required:

    (x) on the instance add the cman to remote_listener:

    Alter System Set remote_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=cmhost1)(PORT=1950))))' scope=both;

    If there is already an entry in remote_listener, e.g. in a RAC, you can separate the different connection strings by comma. An example can be

    Alter System Set remote_listener='SCAN-IP:1521,(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=cmhost1)(PORT=1950))))' scope=both;

    (For more details about SCAN I'd recommend this PDF)

    CMCTL Primer

    As we have now the services registered also on cman, we can see it there. The  SHOW command has a 2nd parameter services. Here an example

    Services Summary...
    Proxy service "cmgw" has 1 instance(s).
      Instance "cman", status READY, has 2 handler(s) for this service...
        Handler(s):
          "cmgw001" established:1 refused:0 current:0 max:256 state:ready
             <machine: 127.0.0.1, pid: 16786 >
             (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44391))
          "cmgw000" established:1 refused:0 current:0 max:256 state:ready
             <machine: 127.0.0.1,pid: 16784>
             (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44390))
    Service "INSTANCE1" has 1 instance(s).
      Instance "INSTANCE1", status READY, has 1 handler(s) for this service...
        Handler(s):
          "DEDICATED" established:0 refused:0 state:ready
             REMOTE SERVER
             (ADDRESS=(PROTOCOL=TCP)(HOST=dbhost1)(PORT=1521))
    Service "cmon" has 1 instance(s).
      Instance "cman", status READY, has 1 handler(s) for this service...
        Handler(s):
          "cmon" established:3 refused:0 current:1 max:4 state:ready
             <machine: 127.0.0.1, pid: 16759>
             (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=44374))
    The command completed successfully.

    Fine Tuning

    I try to create a dedicated service for all (or a well known set of) connections via the connection manager. By doing so it's sometimes easier to separate or identify different kinds of sessions.
  9. Who created that process?

    Figure 2-7
    Connection to
    a Dedicated
    Server Process
    For some reason I was really curios who created that process. It's not about a particular process in detail, mir a well known kind of processes. At least well known for DBAs.
    Which process? 
    It's one of these:

    oracle   13096     1  0 20:05 ?        00:00:00 oracleTTT071 (LOCAL=NO)

    Yes, it's a simple server process, nothing spectacular. Nevertheless, the Concepts guide is not very specific, who created that process. So I tried to find out in more detail.
    On my linux sandbox the first column of ps -ef shows the UID, the second is the PID, the third is the PPID. Unfortunately it's 1 here, and I'm quite sure, this process was not created by init. So this proces is somewhat orphaned, as the direct parent disappeared. Very sad!
    I decided to follow Figure 2-7 from the concepts guide. I used strace -f -p <PID_of_listener> to see what's going on. -f follows all forks, so also their actions are traced.
    The first 3 lines are
    Process 2979 attached with 3 threads - interrupt to quit
    [pid  2981] futex(0xae8dee4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
    [pid  2980] restart_syscall(<... resuming interrupted call ...> <unfinished ...>


    So we have 3 listener processes - it's good to know and probably worth to investigating this segregation of duties - but not in this post. There are so many interesting lines, but I'm searching for a process, so let's continue with

    [pid  2979] clone(Process 27028 attached
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aedd9914b80) = 27028
    [pid  2979] wait4(27028, Process 2979 suspended
     <unfinished ...>
    [pid 27028] clone(Process 27029 attached (waiting for parent)
    Process 27029 resumed (parent 27028 ready)
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aedd9914b80) = 27029
    [pid 27028] exit_group(0)               = ?
    Process 2979 resumed
    Process 27028 detached
    [pid  2979] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27028
    [pid 27029] close(15 <unfinished ...>
    [pid  2979] --- SIGCHLD (Child exited) @ 0 (0) ---
    [pid 27029] <... close resumed> )       = 0
    [pid  2979] close(14 <unfinished ...>
    [pid 27029] close(16 <unfinished ...>
    [pid  2979] <... close resumed> )       = 0
    [pid 27029] <... close resumed> )       = 0
    [pid  2979] close(17)                   = 0


    Here the listener ([pid  2979]) creates a new process by the first clone call. This new Process has the PID 27028. This new process has only one purpose: again clone a new Process: PID 27029 and use exit_group(0) to terminate directly afterwards. By this trick the listener is not shown as parent process for PID 27029. Directly after it's creation PID 27029 closes some file handles. As by the sequence of clone calls the new process inherited a table of all open file (and network) handles it seems it tries to get rid of any it does not need as early as possible. The next part
    [pid  2979] fcntl(16, F_SETFD, FD_CLOEXEC) = 0
    [pid 27029] setsid( <unfinished ...>
    [pid  2979] fcntl(15, F_SETFD, FD_CLOEXEC <unfinished ...>
    [pid 27029] <... setsid resumed> )      = 27029
    [pid  2979] <... fcntl resumed> )       = 0
    [pid 27029] geteuid()                   = 5831
    [pid  2979] fcntl(13, F_SETFD, FD_CLOEXEC) = 0
    [pid 27029] setsid()                    = -1 EPERM (Operation not permitted)
    [pid  2979] poll([{fd=8, events=POLLIN|POLLRDNORM}, {fd=11, events=POLLIN|POLLRDNORM}, {fd=12, events=POLLIN|POLLRDNORM}, {fd=16, events=POLLIN|POLLRDNORM}, {fd=15, events=0}], 5, -1 <unfinished ...>

    makes sure the file descriptos 16, 15 and 13 will remain after an execve(2) call.
    And here it goes:
    [pid 27029] execve("/appl/oracle/product/rdbms_112022_a/bin/oracle", ["oracleTTT051", "(LOCAL=NO)"], [/* 109 vars */]) = 0
    from the man page if execve:
    execve() executes the program pointed to by filename.
    ...
    execve() does not return on success, and the text, data, bss, and stack of the calling process are overwritten by that of  the  program  loaded.   The  program invoked inherits the calling process’s PID, and any open file descriptors that are not set to close-on-exec.  Signals pending on the calling process are cleared.  Any signals set to be caught by the calling process are reset  to  their default behaviour.  The SIGCHLD signal (when set to SIG_IGN) may or may not be reset to SIG_DFL.
           If the current program is being ptraced, a SIGTRAP is sent to it after a successful execve().
           If  the  set-user-ID  bit  is set on the program file pointed to by filename, and the calling process is not being ptraced, then the effective user ID of the calling process is changed to that of the owner of the program file.  i Similarly,  when  the  set-group-ID bit of the program file is set the effective group ID of the calling process is set to the group of the program file.
    From that point on there you can see how the server process comes to life. It's very interesting in some details, but not scope of this post. After some conversation between listener and server process using file descriptors 15 and 16 (I assume these are just sockets) both close these file descriptors. The listener also closes file descriptor 13 which seems to be the TCP connection to the client. From that point the 2 processes seems to be independent.

    Well, now I know (at least on my test-system) the simplest way, the listener creates the process - and it uses execve to do so. There still are many questions open, like what's going on at this redirection as shown in Figure 2-8.

  10. bzip2 twice?


    To check the performance of RMAN backup I recently started to trace it a little bit. As most of the time was not spent in any reading from disk or writing to media manager library event, it was on CPU. It's good to know the CPUs are of any good, but as I still want to know what's going on I tried to dig any deeper. CPU cycles are not just a magic black box where we put in a problem and the answer comes out after some times. At an abstraction layer it's a chain of functions where one is called by another, and only the last is the one doing anything. There is not much information in that fact per se, but developers are humans also, and they are giving the functions they code meaningful names.


    So I had just to find these names (and where most of the time is spent) to figure out what's going on. To save my time I remembered Tanel Poders Advanced Oracle Troubleshooting Guide, Part 9 – Process stack profiling from sqlplus using OStackProf. There he described his tool ostackprof. This did all the job for me, I just had to find a rman session.

    Here's the shortstack where most of the time was spent:
    (This backup was done with COMPRESSION ALGORITHM ‘BASIC’)
    ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kporpc()->kkxrpc()->prient()->prient2()->pricbr()->pricar()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_ICAL()->pevm_icd_call_common()->krbibpc()->krbbpc()->krbb3crw()->krbbcdo()->kgccdo()->kgccbz2pseudodo()->kgccbz2do()->kgccm()->kgccbuf()->kgccgmtf()->__sighandler()->->

    The naming convention for functions is not public documented by oracle, but for some reasons I'm sure functions starting with krb are related to backup, whereas kgcc is used for compression. Especially the working function kgccgmtf reads like generate Move To Front.

    At that point I had a lot more information than before, still I had no way how to improve the backup speed. As we have licensed advanced compression for that particular node, we tested with different other compression methods. LOW and MEDIUM where faster, with less compression than our previous BASIC. But HIGH was even slower!

    So again I used ostackprof and that's the topmost stack trace - for HIGH:
    ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kporpc()->kkxrpc()->prient()->prient2()->pricbr()->pricar()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_ICAL()->pevm_icd_call_common()->krbibpc()->krbbpc()->krbb3crw()->krbbcdo()->kgccdo()->__PGOSF209_kgccbzip2pseudodo()->kgccbzip2do()->BZ2_bzCompress()->handle_compress()->BZ2_compressBlock()->generateMTFValues()->__sighandler()->->


    Do you see the difference? Until kgccdo there is no! And even afterwards, the functions are somewhat similar. One more thing is worth to mention: the bzip2 implementation for HIGH does not use oracle internal naming convention. So it's worth to search for these names on the internet. one of my best hits was a compress.c File Reference.

    Did Oracle reinvent the wheel? No. For me it looks as if they tried their best first (by doing their own kgcc implementation) and afterwards preferred simple copy&paste. Maybe they should just skip either of these 2 - they still can use parameters to achieve different compression quality. 


    If someone is interested in our results:
    for a single datafile of 30GB (with 100% usage) we achieved on a production system - with all it ongoing tasks:


    Type min backup-size
    BASIC 13:32 5.8
    LOW 5:17 8
    MEDIUM 8:52 6.14
    HIGH 65:29 4.25

    We decided to choose MEDIUM.
  1. 1
  2. Next ›
  3. Last »