Archive for March, 2011

March 30, 2011

Using CIO/DIO for Oracle on AIX

Mount the file systems holding datafiles, redo logs, and control files with CIO (see the mount sketch below); do not use CIO for archived logs or the Oracle Home.
- On 9i and 10g, set filesystemio_options=setall; the default is asynch, and with the default Oracle will not request Concurrent I/O even if the file system is mounted with the CIO option.
- It is recommended to use JFS2 with CIO for a standalone instance, and GPFS for a clustered database. Do NOT use JFS with DIO or large-file-enabled JFS.
- Enable large page support.
- Disable the VMM from stealing pages from the SGA (lock_sga).
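A minimal sketch of the mount side (the logical volume name and mount point here are hypothetical); the CIO option can be given on the command line or made permanent in /etc/filesystems:

# mount -o cio /dev/fslv01 /gold/GLT/db

Or permanently, as a stanza in /etc/filesystems:

/gold/GLT/db:
        dev     = /dev/fslv01
        vfs     = jfs2
        mount   = true
        options = cio,rw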

Check whether CIO is being used by Oracle

Test 1) file system /gold/GLT/db is a JFS2 file system mounted with the CIO option,
while Oracle is set with filesystemio_options=ASYNC

-- start up instance in nomount state

SQL> startup nomount;
ORACLE instance started.

Total System Global Area 3221225472 bytes
Fixed Size                  2099856 bytes
Variable Size            1593836912 bytes
Database Buffers         1610612736 bytes
Redo Buffers               14675968 bytes

-- trace dbwr process

$ ps -ef |grep dbw
gltoracl 1175784       1   0 12:10:05      -  0:00 ora_dbw0_GOLDTEST

$ truss -a -t open -p 1175784
psargs: ora_dbw0_GOLDTEST
 (sleeping...)
 (sleeping...)
 (sleeping...)
 (sleeping...)
 (sleeping...)

-- open datafiles

open("/gold/GLT/db/oradata/GOLDTEST/system01.dbf", O_RDONLY|O_LARGEFILE) = 16
open("/gold/GLT/db/oradata/GOLDTEST/system01.dbf", O_RDWR|O_DSYNC|O_LARGEFILE) = 16
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDONLY|O_LARGEFILE) = 17
open("/etc/vfs", O_RDONLY)                      = 18
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDONLY|O_LARGEFILE) = 17
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDWR|O_DSYNC|O_LARGEFILE) = 17
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDONLY|O_LARGEFILE) = 18
open("/etc/vfs", O_RDONLY)                      = 19
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDONLY|O_LARGEFILE) = 18
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDWR|O_DSYNC|O_LARGEFILE) = 18
open("/gold/GLT/db/oradata/GOLDTEST/users01.dbf", O_RDONLY|O_LARGEFILE) = 19
open("/etc/vfs", O_RDONLY)                      = 20
open("/gold/GLT/db/oradata/GOLDTEST/users01.dbf", O_RDONLY|O_LARGEFILE) = 19

CIO is not being used by Oracle even though the file system is mounted with the CIO option.

Test 2) file system /gold/GLT/db is a JFS2 file system mounted with the CIO option,
while Oracle is set with filesystemio_options=SETALL

-- start up instance in nomount state

SQL> startup nomount;
ORACLE instance started.

Total System Global Area 3221225472 bytes
Fixed Size                  2099856 bytes
Variable Size            1593836912 bytes
Database Buffers         1610612736 bytes
Redo Buffers               14675968 bytes

-- trace dbwr process

$ ps -ef |grep dbw
gltoracl 1175784       1   0 12:10:05      -  0:00 ora_dbw0_GOLDTEST

$ truss -a -t open -p 1175784
psargs: ora_dbw0_GOLDTEST
(sleeping...)
(sleeping...)
(sleeping...)
(sleeping...)
(sleeping...)

-- when mounting database

SQL> alter database mount;

open("/gold/GLT/bin/oracle/product/10.2.0/db_1/dbs/lkGOLDTEST", O_RDWR|O_CREAT|O_EXCL, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) = -1
open("/gold/GLT/bin/oracle/product/10.2.0/db_1/dbs/lkGOLDTEST", O_RDWR) = 12
(sleeping...)
open("/gold/GLT/admin/GOLDTEST/control/control01.ctl", O_RDONLY|O_LARGEFILE) = -1
open("/gold/GLT/admin/GOLDTEST/control/control01.ctl", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 13
^^^^^
open("/gold/GLT/admin/GOLDTEST/control/control02.ctl", O_RDONLY|O_LARGEFILE) = -1
open("/gold/GLT/admin/GOLDTEST/control/control02.ctl", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 14
^^^^^
open("/gold/GLT/db/GOLDTEST/control/control03.ctl", O_RDONLY|O_LARGEFILE) = 15
open("/gold/GLT/db/GOLDTEST/control/control03.ctl", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 15
^^^^^

-- when opening database

SQL> alter database open;

open("/gold/GLT/db/oradata/GOLDTEST/system01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 16
open("/etc/vfs", O_RDONLY)                      = 17
open("/gold/GLT/db/oradata/GOLDTEST/system01.dbf", O_RDONLY|O_LARGEFILE) = 16
open("/gold/GLT/db/oradata/GOLDTEST/system01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 16
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 17
open("/etc/vfs", O_RDONLY)                      = 18
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDONLY|O_LARGEFILE) = 17
open("/gold/GLT/db/oradata/GOLDTEST/undotbs01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 17
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 18
open("/etc/vfs", O_RDONLY)                      = 19
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDONLY|O_LARGEFILE) = 18
open("/gold/GLT/db/oradata/GOLDTEST/sysaux01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 18
open("/gold/GLT/db/oradata/GOLDTEST/users01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 19
open("/etc/vfs", O_RDONLY)                      = 20
open("/gold/GLT/db/oradata/GOLDTEST/users01.dbf", O_RDONLY|O_LARGEFILE) = 19
open("/gold/GLT/db/oradata/GOLDTEST/users01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 19
open("/gold/GLT/db/oradata/GOLDTEST/centestdata01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 20
open("/etc/vfs", O_RDONLY)                      = 21
open("/gold/GLT/db/oradata/GOLDTEST/centestdata01.dbf", O_RDONLY|O_LARGEFILE) = 20
open("/gold/GLT/db/oradata/GOLDTEST/centestdata01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 20
open("/gold/GLT/db/oradata/GOLDTEST/centestidx01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 21
open("/etc/vfs", O_RDONLY)                      = 22
open("/gold/GLT/db/oradata/GOLDTEST/centestidx01.dbf", O_RDONLY|O_LARGEFILE) = 21
open("/gold/GLT/db/oradata/GOLDTEST/centestidx01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 21
open("/gold/GLT/db/oradata/GOLDTEST/goldblob.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 22
open("/etc/vfs", O_RDONLY)                      = 23
open("/gold/GLT/db/oradata/GOLDTEST/goldblob.dbf", O_RDONLY|O_LARGEFILE) = 22
open("/gold/GLT/db/oradata/GOLDTEST/goldblob.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 22
open("/gold/GLT/db/oradata/GOLDTEST/stktestdata01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 23
open("/etc/vfs", O_RDONLY)                      = 24
open("/gold/GLT/db/oradata/GOLDTEST/stktestdata01.dbf", O_RDONLY|O_LARGEFILE) = 23
open("/gold/GLT/db/oradata/GOLDTEST/stktestdata01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 23
open("/gold/GLT/db/oradata/GOLDTEST/stktestidx01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 24
open("/etc/vfs", O_RDONLY)                      = 25
open("/gold/GLT/db/oradata/GOLDTEST/stktestidx01.dbf", O_RDONLY|O_LARGEFILE) = 24
open("/gold/GLT/db/oradata/GOLDTEST/stktestidx01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 24
open("/gold/GLT/db/oradata/GOLDTEST/temp_01.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 25
open("/etc/vfs", O_RDONLY)                      = 26
open("/gold/GLT/db/oradata/GOLDTEST/temp_01.dbf", O_RDONLY|O_LARGEFILE) = 25
open("/gold/GLT/db/oradata/GOLDTEST/temp_01.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 25
open("/gold/GLT/db/oradata/GOLDTEST/temp_02.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 26
open("/etc/vfs", O_RDONLY)                      = 27
open("/gold/GLT/db/oradata/GOLDTEST/temp_02.dbf", O_RDONLY|O_LARGEFILE) = 26
open("/gold/GLT/db/oradata/GOLDTEST/temp_02.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 26
open("/gold/GLT/db/oradata/GOLDTEST/temp_03.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 27
open("/etc/vfs", O_RDONLY)                      = 28
open("/gold/GLT/db/oradata/GOLDTEST/temp_03.dbf", O_RDONLY|O_LARGEFILE) = 27
open("/gold/GLT/db/oradata/GOLDTEST/temp_03.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 27
open("/gold/GLT/db/oradata/GOLDTEST/temp_04.dbf", O_RDONLY|O_CIO|O_LARGEFILE) = 28
open("/etc/vfs", O_RDONLY)                      = 29
open("/gold/GLT/db/oradata/GOLDTEST/temp_04.dbf", O_RDONLY|O_LARGEFILE) = 28
open("/gold/GLT/db/oradata/GOLDTEST/temp_04.dbf", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 28
open("/gold/GLT/bin/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb", O_RDONLY) = 29

Conclusion

Although the file system is mounted with the CIO option, Oracle will ignore it if filesystemio_options=async (the default value).
In order to use both asynchronous I/O and Concurrent I/O, we have to set filesystemio_options=SETALL.
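A sketch of the switch-over; filesystemio_options is a static parameter, so it needs an instance restart:

SQL> alter system set filesystemio_options=setall scope=spfile;
SQL> shutdown immediate;
SQL> startup;
SQL> show parameter filesystemio_options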


Further reading:

Recommendations For using CIO/DIO for Filesystems containing Oracle Files on AIX [ID 960055.1]

Modified 17-MAR-2011     Type BULLETIN     Status PUBLISHED

In this Document
Purpose
Scope and Application
Recommendations For using CIO/DIO for Filesystems containing Oracle Files on AIX
The CIO/DIO recommendations for filesystems containing Oracle Files are:
References


Applies to:

Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.1.0.7 – Release: 10.1 to 11.1
IBM AIX on POWER Systems (64-bit)
IBM AIX Based Systems (64-bit)

Purpose

Recommendation for using CIO/DIO for Oracle Files on AIX

Scope and Application

All DBAs who work on AIX.

Recommendations For using CIO/DIO for Filesystems containing Oracle Files on AIX

From 10g onwards, if you set filesystemio_options=setall, then Oracle by default uses the CIO option to perform I/O on the Oracle files. There is no need to specify the cio flag explicitly when mounting the file systems used for the Oracle files.

You need to set the cio mount flag explicitly only if you are not using filesystemio_options=setall. The current mount options can be checked as shown below.
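A quick way to see whether a file system is currently mounted with CIO (the mount point is illustrative); look for cio in the options column:

$ mount | grep /gold/GLT/db
$ lsfs /gold/GLT/db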

The CIO/DIO recommendations for filesystems containing Oracle Files are:

Data Base Files (DBF)

++ Use CIO or no mount options for extremely sequential I/O

Redo Log/Control Files

++ Use CIO or DIO

Archive Log Files

++ Do not use CIO or DIO

Flashback Log Files

++ Flashback Log files should use CIO, DIO, or rbrw

Oracle Binaries

++ Do not use CIO or DIO

References

http://www.ibm.com/developerworks/wikis/download/attachments/53871900/Oracle_on_AIX_WebinarFeb2007.pdf?version=2

March 28, 2011

Diagnostics For Database Hang

nice article from http://askdba.org/weblog/?p=81

Diagnostics For Database Hang

Many times Oracle DBAs find themselves in a situation where the database is hung and does not seem to be responding. In some scenarios you cannot even connect to a sqlplus session. Most people restart the database (sometimes I wonder if this is because most of us started working with computers on Microsoft Windows) and then log a ticket with Oracle Support, who in turn happily inform us that they do not have any diagnostic information to diagnose and resolve the issue, and that we need to wait for the next occurrence to collect some.

Based on my experience, I am writing this article to assist my fellow Oracle DBAs in diagnosing the problem and collecting the required information. So let's start.

1) First of all, we need to ensure that this is really a database hang and not just a slow database. This can be done by asking the users some questions:

a) Is a particular user complaining of a database hang, or is it the condition for all users? If only one or a few users are reporting it, are those users executing a batch job?

b) Are you able to make new connections to the database?

c) Has any initialisation parameter been changed recently?

d) Is any Resource Manager plan in effect?

One more way to establish whether the database is hung is to query the v$session_wait view to find the events being waited on:

select sid,event,seq#,p1,p2,p3 from v$session_wait where wait_time=0 and event not like '%message%';

This will give the wait events for all waiting sessions. If you see something like 'log file switch (archiving required)', the problem is caused by an archiving issue; check whether there is free space in the archive destination (a quick check is sketched below).
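Here the archive destination path is illustrative:

SQL> archive log list;
SQL> show parameter log_archive_dest

$ df -g /gold/GLT/arch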

If instead this shows events like row cache enqueues or latches, we need to gather hanganalyze and systemstate dumps for Support.

Otherwise it could be that you are experiencing a slow database rather than a hang. In that case use AWR or statspack to diagnose the issue and look at the top timed events. If you see the library cache latch or shared pool latch consuming a lot of time, look at the Hard Parses per second figure in the Load Profile section. The standard report scripts are invoked as shown below.
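Both reports are generated by scripts shipped under $ORACLE_HOME:

-- AWR report (10g onwards; requires a Diagnostics Pack licence)
SQL> @?/rdbms/admin/awrrpt.sql

-- statspack report (if statspack is installed)
SQL> @?/rdbms/admin/spreport.sql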

2) Look at the database alert log and see whether any messages are present. If you are facing latching or enqueue issues, you might see errors like the ones below:

PMON failed to acquire latch, see PMON dump

Errors in file /u01/BDUMP/test10_pmon_12864.trc:

In this case you are required to upload the tracefile (reported in the alert log) to Oracle Support.

Note: Make sure that max_dump_file_size is set to unlimited so that the tracefile contains the complete data.

Coming back to hanganalyze and systemstate; details for both follow.

A) Hanganalyze

HANGANALYZE is used to determine if a session is waiting for a resource, and reports the relationships between blockers and waiters.

Use the following syntax and take hanganalyze dumps twice, at an interval of 1 minute:

$ sqlplus "/ as sysdba"

SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug hanganalyze 3
SQL> oradebug tracefile_name

The last command will report the tracefile name, which has to be uploaded to Oracle Support.

Alternatively you can use

SQL> ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME HANGANALYZE LEVEL 3';

In case you wish to understand how to interpret the hanganalyze tracefile, see Metalink Note 215858.1: Interpreting HANGANALYZE trace files to diagnose hanging and performance problems.

B) Systemstate

A systemstate is used to dump the process information, which is useful for Oracle Support to diagnose why the sessions are waiting.

For 9.2.0.6 and above, gather a systemstate as below:

SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266
SQL> oradebug tracefile_name

The last command will report the tracefile name, which has to be uploaded to Oracle Support. Perform this 2-3 times at intervals of 1 minute.

Again, you can use:

ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 266';

For Oracle 9.2.0.5 and earlier, use level 10 instead of 266:

ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 10';

Level 266 includes short stacks (Oracle function calls), which are useful for Oracle developers to determine which Oracle functions are causing the problem. This is also helpful in matching existing bugs.

In case you are unable to connect to the database, capture a systemstate using the note below:

Note 121779.1 - Taking a SYSTEMSTATE dump when you cannot CONNECT to Oracle.
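On 10g and later, the technique in that note amounts to a preliminary connection, which attaches to the SGA without creating a full session (a sketch):

$ sqlplus -prelim "/ as sysdba"

SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266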

Apart from this, the following information can also be captured:

a) Database alert log

b) AWR / statspack report covering 30-60 minutes of the database hang

c) Output of OS tools, to ensure that everything is fine at the OS level, e.g.

$ vmstat 2 20

This will capture 20 snapshots at 2-second intervals. Look for CPU contention or swapping issues.

In addition to the above, you can use a utility called LTOM, which has predefined rules based on which it determines that the database is hung, and which takes systemstate and hanganalyze dumps automatically.

Please refer to the following Metalink note for more details:

Note:352363.1: LTOM – The On-Board Monitor User Guide

In case you are able to narrow down to a blocking session manually, you can take an errorstack for the blocking process as below (here assuming the blocking session's OS pid is 1234):

connect / as sysdba

oradebug setospid 1234

oradebug unlimit

oradebug dump errorstack 3

wait 1 min

oradebug dump errorstack 3

wait 1 min

oradebug dump errorstack 3

oradebug tracefile_name

The last command will report the tracefile name, which has to be uploaded to Oracle Support.

If you are able to capture the above information, you stand a 99% chance of getting a solution. I have kept 1% for the cases where Oracle Support asks for some events to be set and for the next hang occurrence to be awaited to gather more information.

March 28, 2011

ERRORSTACK- Oracle Debugging Event

very informative article from http://askdba.org/weblog/2008/04/errorstack-oracle-debugging-event/

ERRORSTACK- Oracle Debugging Event

One more post arising out of my frequent visits to the Oracle forum. Basically there was a question on how to set events for ORA-1652 errors.

I had replied to the post, but Aman asked me if I could provide any notes regarding errorstack. I didn't have one, so I informed him that I would be posting on my blog (cheap publicity for my blog :) ).

OK, that was the background; now to the main thing. Errorstack is used as a debugging event for the following reasons:

  • To extract the failing SQL statement
  • To get the Oracle function call stack, used by Oracle Support to interpret the cause of an error (e.g. ORA-4030, ORA-4031). It is also used in database hang scenarios; refer to my earlier article on database hangs.
  • It also contains the execution plan in some cases, which can be used to diagnose ORA-1652 errors (excessive temp usage)

Errorstack can be set at System level and also at Session level.

e.g. to set errorstack for ORA-1652, the syntax is:

alter system set events '1652 trace name errorstack level 3';

If you want to set it at session level instead, then:

alter session set events '1652 trace name errorstack level 3';
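Once the trace has been captured, remember to switch the event off again at the same scope:

alter system set events '1652 trace name errorstack off';

or:

alter session set events '1652 trace name errorstack off';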

You can also use oradebug to take errorstack for particular process (say ospid 1234) as below

sqlplus "/ as sysdba"

oradebug setospid 1234

oradebug unlimit

oradebug dump errorstack 3

There are different levels for errorstack, with level 3 being the highest.

Note: There is no such level as 10 or 12, which you might sometimes find on the web, e.g.

event='1401 trace name errorstack, level 12'

(extracted from http://www.orafaq.com/faqdbain.htm). Specifying this will simply default to level 3.

Let's see one trace file for error ORA-942, i.e. "table or view does not exist".

SYS> alter session set events '942 trace name errorstack level 3';

Session altered.

SYS> select * from err;
select * from err
              *
ERROR at line 1:
ORA-00942: table or view does not exist

The trace file is generated in user_dump_dest, as it is a user process that has failed.

Trace file Header
/u01/app/oracle/admin/sql10g/udump/sql10g_ora_23773.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /u01/app/oracle/oracle/product/10.2.0
System name: SunOS
Node name: test
Release: 5.9
Version: Generic_117171-12
Machine: sun4u
Instance name: sql10g
Redo thread mounted by this instance: 1
Oracle process number: 28
Unix process pid: 23773, image:
oracle@test(TNS V1-V3)

Here we find information about the database version and the process id, along with the operating system.

Failing Sql
ksedmp: internal or fatal error
ORA-00942: table or view does not exist
Current SQL statement for this session:
select * from err

Here we can see that the failing SQL statement is "select * from err".

Call Stack

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+744 CALL ksedst() 000000840 ?
FFFFFFFF7FFF06FC ?
000000000 ?
FFFFFFFF7FFED1F0 ?
FFFFFFFF7FFEBF58 ?
FFFFFFFF7FFEC958 ?
ksddoa()+1148 PTR_CALL 0000000000000000 000106000 ? 106323304 ?
106323000 ? 000106323 ?
000106000 ? 106323304 ?
ksdpcg()+276 CALL ksddoa() 00000000F ? 10631DCD0 ?
105502DA8 ? 10631D990 ?
000000180 ? 10631E564 ?
ksdpec()+200 CALL ksdpcg() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
0000003AE ? 00010558F ?
ksfpec()+160 CALL ksdpec() 0000003AE ? 10631EDB0 ?

Immediately after the failing SQL we find the call stack, which consists of internal Oracle functions that Oracle Support/Development can use to find the reason for the error. You will sometimes find Metalink notes referencing these functions in order to match known bugs.

PROCESS STATE DUMP
PROCESS STATE
-------------
Process global information:
process: 38a1df2f8, call: 38a2b3218, xact: 0, curses: 38a264b38, usrses: 38a264b38
----------------------------------------
SO: 38a1df2f8, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=28, calls cur/top: 38a2b3218/3877b9820, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 38a2039c8
O/S info: user: oracle, term: pts/19, ospid: 23773
OSD pid info: Unix process pid: 23773, image: oracle@test (TNS V1-V3)

SO: 38a264b38, type: 4, owner: 38a1df2f8, flag: INIT/-/-/0x00
(session) sid: 71 trans: 0, creator: 38a1df2f8, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0001-001C-0000026A, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 3, prv: 0, sql: 385b129b0, psql: 383567fa0, user: 0/SYS
O/S info: user: oracle, term: pts/19, ospid: 23772, machine: test
program: sqlplus@test (TNS V1-V3)
application name: sqlplus@test (TNS V1-V3), hash value=0
last wait for 'SQL*Net message from client' blocking sess=0x0 seq=30 wait_time=24569781 seconds since wait started=23
driver id=62657100, #bytes=1, =0

SO: 387f804d0, type: 53, owner: 38a264b38, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=387f804d0 handle=385b129b0 mode=N
call pin=0 session pin=0 hpc=0000 hlc=0000
htl=387f80550[387f2d2e8,3877d6288] htb=387f2d2e8 ssga=387f2cce0
user=38a264b38 session=38a264b38 count=1 flags=[0000] savepoint=0x40
LIBRARY OBJECT HANDLE: handle=385b129b0 mutex=385b12ae0(0)
name=select * from err
hash=737b6fa6ce797a460e47cca17ef84a50 timestamp=04-17-2008 13:56:40

This shows the process details, such as the Oracle process id (pid) and the OS process id (ospid), along with the wait event the process was last waiting on. It also shows the memory objects (state objects, SO) associated with the process.

A systemstate dump, which is used to diagnose a database hang, contains such process state dumps for all processes in the database.

Execution Plan

As the execution failed because of ORA-942 during the semantic check, the trace does not contain an execution plan.

March 25, 2011

lock_sga on AIX 5L

AIX 5.3 ORA-27126: unable to lock shared memory segment in core

when setting lock_sga=true

This doesn't appear to be documented in many places, so I'll put it here. There can be a tangible performance increase from locking the SGA into pinned memory; paging space is, after all, on a slower medium (disk). On AIX 5.3 you also need to 'pin' shared memory (the SGA is shared memory).

First, you’ll need to set the ‘vmo’ option v_pinshm to 1. This tells AIX to ‘pin’ shared memory into RAM.

Second, you’ll need to give the ‘oracle’ user (or whatever user the database runs as) the following capabilities:

CAP_BYPASS_RAC_VMM and CAP_PROPAGATE

The commands to do this (as root on the system) are as follows:

***To set v_pinshm***
# vmo -p -o v_pinshm=1


***To give the "oracle" user the necessary capabilities***
# chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
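
You can verify both settings afterwards; vmo reports the current tunable value and lsuser lists the user's capabilities:

***To verify the settings***
# vmo -o v_pinshm
# lsuser -a capabilities oracle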

Large memory pages may also help; a sketch of the AIX side follows.
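To back the SGA with 16 MB large pages, a pool of large pages is reserved with vmo; the region count below is purely illustrative (lgpg_regions x lgpg_size must cover the SGA), so size it for your system:

***To reserve 256 x 16 MB large pages (illustrative sizing)***
# vmo -p -o lgpg_size=16777216 -o lgpg_regions=256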

March 24, 2011

Renaming datafile

Renaming datafile when the database is shut down


-- shut down database

SQL> shutdown immediate;

SQL> !mv "/gold/GLP/db/oradata/GOLDPROD/cenprod_data_03.dbf " /gold/GLP/db/oradata/GOLDPROD/cenproddata03.dbf

SQL> !mv /gold/GLP/db/oradata/GOLDPROD/cenprod_data_02.dbf  /gold/GLP/db/oradata/GOLDPROD/cenproddata02.dbf

SQL> !mv /gold/GLP/db/oradata/GOLDPROD/cenprod_idx_02.dbf /gold/GLP/db/oradata/GOLDPROD/cenprodidx02.dbf

SQL> startup mount;
ORACLE instance started.

Total System Global Area 8589934592 bytes
Fixed Size                  2110136 bytes
Variable Size            1006636360 bytes
Database Buffers         7566524416 bytes
Redo Buffers               14663680 bytes
Database mounted.
SQL> alter database rename file '/gold/GLP/db/oradata/GOLDPROD/cenprod_data_03.dbf ' to '/gold/GLP/db/oradata/GOLDPROD/cenproddata03.dbf';

Database altered.

SQL> alter database rename file '/gold/GLP/db/oradata/GOLDPROD/cenprod_data_02.dbf' to '/gold/GLP/db/oradata/GOLDPROD/cenproddata02.dbf';

Database altered.

SQL>  alter database rename file '/gold/GLP/db/oradata/GOLDPROD/cenprod_idx_02.dbf' to '/gold/GLP/db/oradata/GOLDPROD/cenprodidx02.dbf';

Database altered.

SQL> alter database open;

Database altered.

March 22, 2011

Bind variable is NULL in dynamic SQL

When we pass bind variables into a search condition (WHERE clause), some of them can be NULL. The right way to handle optional (possibly NULL) bind variables in dynamic SQL is demonstrated below:

e.g.


w_sql := ' select TSARPPRZOFF,     ' ||
 '        TSARPPRZOFFFID,  ' ||
 '        TSARPMXN,        ' ||
 '        TSARPMXNFID,     ' ||
 '        TSARPSCONTO,     ' ||
 '        TSARPTSCONTO,    ' ||
 '        TSARPSCONTOFID,  ' ||
 '        TSARPTSCONTOFID  ' ||
 ' from   tst_pma_articolo ' ||
 ' where  1 = 1 ';

IF p_id_articolo <> tesi_util.CARATTERE_CAMPO_NULLO THEN
 w_sql := w_sql || ' and tsarpid = ' || p_id_articolo || ' ';
END IF;

IF p_c_promozione <> tesi_util.CARATTERE_CAMPO_NULLO THEN
 w_sql := w_sql || ' and tsarptpnopr = ' || p_c_promozione || ' ';
END IF;

IF p_c_tema <> tesi_util.CARATTERE_CAMPO_NULLO THEN
 w_sql := w_sql || ' and tsarpnthem = ''' || p_c_tema ||''' ';
END IF;

IF p_c_prestazione <> tesi_util.CARATTERE_CAMPO_NULLO THEN
 w_sql := w_sql|| ' and tsarpnprs = ' || p_c_prestazione || ' ';
END IF;

OPEN w_cursor for w_sql;

LOOP
 ...
END LOOP;

should be like:


w_sql := ' select TSARPPRZOFF,     ' ||
 '        TSARPPRZOFFFID,  ' ||
 '        TSARPMXN,        ' ||
 '        TSARPMXNFID,     ' ||
 '        TSARPSCONTO,     ' ||
 '        TSARPTSCONTO,    ' ||
 '        TSARPSCONTOFID,  ' ||
 '        TSARPTSCONTOFID  ' ||
 ' from   tst_pma_articolo ' ||
 ' where  1 = 1 ';

 IF p_id_articolo is not null THEN
 w_sql := w_sql || ' and tsarpid = :B1 ' ;
 ELSE
 w_sql := w_sql || ' and :B1 is null ';
 END IF;

 IF p_c_promozione is not null THEN
 w_sql := w_sql || ' and tsarptpnopr = :B2 ' ;
 ELSE
 w_sql := w_sql || ' and :B2 is null ' ;
 END IF;

 IF p_c_tema is not null THEN
 w_sql := w_sql || ' and tsarpnthem = :B3 ' ;
 ELSE
 w_sql := w_sql || ' and :B3 is null ' ;
 END IF;

 IF p_c_prestazione is not null THEN
 w_sql := w_sql || ' and tsarpnprs = :B4 ' ;
 ELSE
 w_sql := w_sql || ' and :B4 is null ' ;
 END IF;

 OPEN w_cursor for w_sql using p_id_articolo, p_c_promozione, p_c_tema, p_c_prestazione;
 LOOP
 ...
 END LOOP;
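The reason each IF branch must still mention its placeholder: with OPEN ... FOR ... USING, arguments are bound by position, and the number of USING arguments has to match the placeholders present in the final SQL text. The 'and :Bn is null' branch keeps every placeholder in the statement no matter which conditions fire, so the single four-argument USING list always lines up.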

March 22, 2011

Articles from Kerry Osborne

Index

Creating Test Scripts With Bind Variables

GATHER_PLAN_STATISTICS

+++++Oracle Support Sanctions Manually Created SQL Profiles!

1) move a sql profile from test to production; 2) fix a good execution plan by sql_id and plan_hash_value

How to Attach a SQL Profile to a Different Statement – Take 2

1.move an execution plan to a different sql


March 21, 2011

Import DataPump: How to Import Table Data into a Table that has Different Name ?

Import DataPump: How to Import Table Data into a Table that has Different Name ? [ID 342314.1]

Modified 20-OCT-2010     Type HOWTO     Status PUBLISHED

In this Document
Goal
Solution


Applies to:

Enterprise Manager for RDBMS – Version: 10.1.0.2 to 11.2.0.2 – Release: 10.1 to 11.2
Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.2.0.2   [Release: 10.1 to 11.2]
Oracle Server – Personal Edition – Version: 10.1.0.2 to 11.2.0.2   [Release: 10.1 to 11.2]
Oracle Server – Standard Edition – Version: 10.1.0.2 to 11.2.0.2   [Release: 10.1 to 11.2]
Information in this document applies to any platform.

Goal

Checked for relevance on 10-20-2010

How to import table data into a target table that has the same structure as the original source table, but has a different name?

Solution

For Import DataPump using Oracle 10g, you can use the REMAP_SCHEMA parameter to remap the schema name during the import (similar to the FROMUSER and TOUSER parameters in the original import utility). However, there is no parameter to remap table names. This means that Import DataPump can only import data into a table with the same name as the original table.

The examples below are based on:

A. The demo schema SCOTT that is created with script: $ORACLE_HOME/rdbms/admin/scott.sql

B. The directory object my_dir that refers to an existing directory on the server where the Oracle RDBMS is installed. Example:

Windows:
CONNECT system/manager
CREATE OR REPLACE DIRECTORY my_dir AS 'D:\export';
GRANT read,write ON DIRECTORY my_dir TO public;
Unix:
CONNECT system/manager
CREATE OR REPLACE DIRECTORY my_dir AS '/home/users/export';
GRANT read,write ON DIRECTORY my_dir TO public;

C. Two tables with identical structure, but different names:

CONNECT scott/tiger
CREATE TABLE emp2 AS SELECT * FROM emp WHERE 1=2;
SELECT COUNT(*) FROM emp;
  COUNT(*)
----------
        14
SELECT COUNT(*) FROM emp2;
  COUNT(*)
----------
         0

D. You have an export dumpfile of table SCOTT.EMP that was created with:

expdp system/manager directory=my_dir dumpfile=exp_emp.dmp logfile=exp_emp.log tables=scott.emp

OBJECTIVE: import the data from the export dumpfile into table SCOTT.EMP2
1. If the original source table can be temporarily renamed:

-- temporarily rename the two tables:
CONNECT scott/tiger
RENAME emp TO emp_temp;
RENAME emp2 TO emp;

-- import data into table EMP:
impdp system/manager directory=my_dir dumpfile=exp_emp.dmp logfile=imp_emp.log full=y
table_exists_action=append content=data_only

-- rename tables back to their original names:
CONNECT scott/tiger
RENAME emp TO emp2;
RENAME emp_temp TO emp;

2. If the original source table cannot be renamed:

-- create a temporary user:
CONNECT system/manager
CREATE USER tempuser IDENTIFIED BY tempuser DEFAULT TABLESPACE users;
ALTER USER tempuser QUOTA UNLIMITED ON users;
GRANT create session, create table TO tempuser;

-- import data into user TEMPUSER:
impdp system/manager directory=my_dir dumpfile=exp_emp.dmp logfile=imp_emp.log full=y
remap_schema=scott:tempuser

-- rename the table in the TEMPUSER schema:
CONNECT tempuser/tempuser
RENAME emp TO emp2;

-- export data again, now from (renamed) table TEMPUSER.EMP2:
expdp system/manager directory=my_dir dumpfile=exp_emp2.dmp logfile=exp_emp2.log
tables=tempuser.emp2 content=data_only

-- import data into SCOTT.EMP2:
impdp system/manager directory=my_dir dumpfile=exp_emp2.dmp logfile=imp_emp2.log full=y
remap_schema=tempuser:scott table_exists_action=append

-- check results, and cleanup:
CONNECT system/manager
SELECT COUNT(*) FROM scott.emp;
  COUNT(*)
----------
        14
SELECT COUNT(*) FROM scott.emp2;
  COUNT(*)
----------
        14
DROP USER tempuser CASCADE;

Remarks:

(1) Note that within the same database, this could also be achieved with a single INSERT statement:

SQL> insert into scott.emp2 (select * from scott.emp);

(2) Note that with the original (non-DataPump) export and import utilities there was another solution if the tablename in the export dumpfile did not exist in the schema in the target database. E.g. based on previous example:

-- export from source database:
exp system/manager file=exp_emp.dmp log=exp_emp.log tables=scott.emp

-- on target database: ensure that table EMP doesn't exist:
CONNECT scott/tiger
CREATE TABLE emp2 AS SELECT * FROM emp WHERE 1=2;
DROP TABLE emp;

-- create synonym for table EMP2:
CREATE SYNONYM emp FOR emp2;

-- import data from EMP into table EMP2:
imp system/manager file=exp_emp.dmp log=imp_emp.log full=y ignore=y indexes=n constraints=n

(3) Starting with Oracle 11g, DataPump provides a new command REMAP_TABLE to import the table data into a new table name on the target database.   See the 11g Oracle Database Utilities for more on this new feature.

-- import data from EMPLOYEES into table EMPS:
impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expschema.dmp TABLES=hr.employees REMAP_TABLE=hr.employees:emps

March 21, 2011

Article on execution plan

Troubleshooting Bad Execution Plans

November 21, 2007

By Greg Rahn

One of the most common performance issues DBAs encounter is bad execution plans. Many try to resolve bad execution plans by setting optimizer related parameters or even hidden underscore parameters. Some even try to decipher a long and complex 10053 trace in hopes of finding an answer. While changing parameters or analyzing a 10053 trace might be useful for debugging at some point, I feel there is a much simpler way to start troubleshooting bad execution plans.

Verify The Query Matches The Business Question

This seems like an obvious thing to do, but I’ve seen numerous cases where the SQL query does not match the business question being asked. Do a quick sanity check verifying things like: join columns, group by, subqueries, etc. The last thing you want to do is consume time trying to debug a bad plan for an improperly written SQL query. Frequently I’ve found that this is the case for many of those “I’ve never got it to run to completion” queries.

What Influences The Execution Plan

I think it’s important to understand what variables influence the Optimizer in order to focus the debugging effort. There are quite a number of variables, but frequently the cause of the problem ones are: (1) non-default optimizer parameters and (2) non-representative object/system statistics. Based on my observations I would say that the most abused Optimizer parameters are:

  • OPTIMIZER_INDEX_CACHING
  • OPTIMIZER_INDEX_COST_ADJ
  • DB_FILE_MULTIBLOCK_READ_COUNT

Many see setting these as a solution to get the Optimizer to choose an index plan over a table scan plan, but this is problematic in several ways:

  1. This is a global change to a local problem
  2. Although it appears to solve one problem, it is unknown how many bad execution plans resulted from this change
  3. The root cause of why the index plan was not chosen is unknown, just that tweaking parameters gave the desired result
  4. Using non-default parameters makes it almost impossible to correctly and effectively troubleshoot the root cause

Object and system statistics can have a large influence on execution plans, but few actually take the time to sanity check them during triage. These statistics exist in views like:

  • ALL_TAB_COL_STATISTICS
  • ALL_PART_COL_STATISTICS
  • ALL_INDEXES
  • SYS.AUX_STATS$

Using GATHER_PLAN_STATISTICS With DBMS_XPLAN.DISPLAY_CURSOR

As a first step of triage, I would suggest executing the query with a GATHER_PLAN_STATISTICS hint followed by a call to DBMS_XPLAN.DISPLAY_CURSOR. The GATHER_PLAN_STATISTICS hint allows for the collection of extra metrics during the execution of the query. Specifically, it shows us the Optimizer’s estimated number of rows (E-Rows) and the actual number of rows (A-Rows) for each row source. If the estimates are vastly different from the actual, one probably needs to investigate why. For example: In the below plan, look at line 8. The Optimizer estimates 5,899 rows and the row source actually returns 5,479,000 rows. If the estimate is off by three orders of magnitude (1000), chances are the plan will be sub-optimal. Do note that with Nested Loop Joins you need to multiply the Starts column by the E-Rows column to get the A-Rows values (see line 10).

select /*+ gather_plan_statistics */ ... from ... ;
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
------------------------------------------------------------------------------------------
| Id  | Operation                               | Name        | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------------------------
|   1 |  SORT GROUP BY                          |             |      1 |      1 |      1 |
|*  2 |   FILTER                                |             |      1 |        |  1728K |
|   3 |    NESTED LOOPS                         |             |      1 |      1 |  1728K |
|*  4 |     HASH JOIN                           |             |      1 |      1 |  1728K |
|   5 |      PARTITION LIST SINGLE              |             |      1 |   6844 |   3029 |
|*  6 |       INDEX RANGE SCAN                  | PROV_IX13   |      1 |   6844 |   3029 |
|   7 |      PARTITION LIST SINGLE              |             |      1 |   5899 |  5479K |
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID | SERVICE     |      1 |   5899 |  5479K |
|*  9 |        INDEX SKIP SCAN                  | SERVICE_IX8 |      1 |   4934 |  5479K |
|  10 |     PARTITION LIST SINGLE               |             |  1728K |      1 |  1728K |
|* 11 |      INDEX RANGE SCAN                   | CLAIM_IX7   |  1728K |      1 |  1728K |
------------------------------------------------------------------------------------------

Using The CARDINALITY Hint

Now that I’ve demonstrated how to compare the cardinality estimates to the actual number of rows, what are the debugging options? If one asserts that the Optimizer will choose the optimal plan if it can accurately estimate the number of rows, one can test using the not so well (un)documented CARDINALITY hint. The CARDINALITY hint tells the Optimizer how many rows are coming out of a row source. The hint is generally used like such:

select /*+ cardinality(a 100) */ * from dual a;
--------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100 |   200 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL | DUAL |   100 |   200 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

In this case I told the Optimizer that DUAL would return 100 rows (when in reality it returns 1 row), as seen in the Rows column from the autotrace output. The CARDINALITY hint is one tool one can use to give the Optimizer accurate information. I usually find this the best way to triage a bad plan as it is not a global change; it only affects a single execution of a statement in my session. If luck has it that using a CARDINALITY hint yields an optimal plan, one can move on to debugging where the cardinality is being miscalculated. Generally the bad cardinality is the result of non-representative table/column stats, but it also may be due to data correlation or other factors. This is where it pays off to know and understand the size and shape of the data. If the Optimizer still chooses a bad plan even with the correct cardinality estimates, it's time to place a call to Oracle Support as more in-depth debugging is likely required.

Where Cardinality Can Go Wrong

There are several common scenarios that can lead to inaccurate cardinality estimates. Some of those on the list are:

  1. Data skew: Is the NDV inaccurate due to data skew and a poor dbms_stats sample?
  2. Data correlation: Are two or more predicates related to each other?
  3. Out-of-range values: Is the predicate within the range of known values?
  4. Use of functions in predicates: Is the 5% cardinality guess for functions accurate?
  5. Stats gathering strategies: Is your stats gathering strategy yielding representative stats?

Some possible solutions to these issues are:

  1. Data skew: Choose a sample size that yields accurate NDV. Use DBMS_STATS.AUTO_SAMPLE_SIZE in 11g.
  2. Data correlation: Use Extended Stats in 11g (a sketch follows this list). If <= 10.2.0.3 use a CARDINALITY hint if possible.
  3. Out-of-range values: Gather or manually set the statistics.
  4. Use of functions in predicates: Use a CARDINALITY hint where possible.
  5. Stats gathering strategies: Use AUTO_SAMPLE_SIZE. Adjust only where necessary. Be mindful of tables with skewed data.
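
For the data-correlation case, 11g extended statistics (a column group) might look like this; the table and column names are hypothetical:

-- create a column group on two correlated columns, then regather stats
SQL> select dbms_stats.create_extended_stats(user, 'SALES', '(CUST_ID,PROD_ID)') from dual;
SQL> exec dbms_stats.gather_table_stats(user, 'SALES', method_opt => 'for all columns size auto')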

How To Best Work With Oracle Support

If you are unable to get to the root cause on your own, it is likely that you will be in contact with Oracle Support. To best assist the support analyst I would recommend you gather the following in addition to the query text:

  1. Output from the GATHER_PLAN_STATISTICS and DBMS_XPLAN.DISPLAY_CURSOR
  2. SQLTXPLAN output. See Metalink Note 215187.1
  3. 10053 trace output. See Metalink Note 225598.1
  4. DDL for all objects used (and their dependencies) in the query. This is best obtained as a Data Pump export (expdp) using CONTENT=METADATA_ONLY, which will also include the object statistics.
  5. Output from: select pname, pval1 from sys.aux_stats$ where sname='SYSSTATS_MAIN';
  6. A copy of your init.ora

Having this data ready before you even make the call (or create the SR on-line) should give you a jump on getting a quick(er) resolution.

Summary

While this blog post is not meant to be a comprehensive troubleshooting guide for bad execution plans, I do hope that it does help point you in the right direction the next time you encounter one. Many of the Optimizer issues I’ve seen are due to incorrect cardinality estimates, quite often due to inaccurate NDV or the result of data correlation. I believe that if you use a systematic approach you will find that debugging bad execution plans may be as easy as just getting the cardinality estimate correct.

March 15, 2011

excessive redo log generation

select * from
(select sid, round(value/1024/1024/1024) REDO_IN_GB from v$sesstat
where statistic#=139 order by value desc)
where rownum < 11;
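
The hardcoded statistic#=139 assumes one particular version's mapping for the 'redo size' statistic; resolving it by name through v$statname is safer across versions:

select * from
(select st.sid, round(st.value/1024/1024/1024) redo_in_gb
 from v$sesstat st, v$statname sn
 where st.statistic# = sn.statistic#
 and sn.name = 'redo size'
 order by st.value desc)
where rownum < 11;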

To find sessions generating lots of redo, you can use either of the following methods. Both methods examine the amount of undo generated: when a transaction generates undo, it will automatically generate redo as well.

The methods are:

1) Query V$SESS_IO. This view contains the column BLOCK_CHANGES, which indicates how many blocks have been changed by the session. High values indicate a session generating lots of redo.

The query you can use is:
SELECT s.sid, s.serial#, s.username, s.program,
i.block_changes
FROM v$session s, v$sess_io i
WHERE s.sid = i.sid
ORDER BY 5 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence
of BLOCK_CHANGES. Large deltas indicate high redo generation by the session.

2) Query V$TRANSACTION. This view contains information about the amount of
undo blocks and undo records accessed by the transaction (as found in the
USED_UBLK and USED_UREC columns).

The query you can use is:
SELECT s.sid, s.serial#, s.username, s.program,
t.used_ublk, t.used_urec
FROM v$session s, v$transaction t
WHERE s.taddr = t.addr
ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence
of USED_UBLK and USED_UREC. Large deltas indicate high redo generation by
the session.

Use the first query when you need to check for programs generating lots of redo across more than one transaction. The latter query can be used to find out which particular transactions are generating redo.


Reference:

Diagnosing excessive redo generation [ID 199298.1]
Excessive Archives / Redo Logs Generation Troubleshooting [ID 832504.1]
Troubleshooting High Redo Generation Issues [ID 782935.1]
How to Disable (Temporary) Generation of Archive Redo Log Files [ID 177218.1]