Posts tagged ‘control file’

January 27, 2011

Some top causes for unplanned outages

I read a blog post by Mr. Gavin Soorma and found it very useful for protecting a database from unplanned outages. I am just recording it here…

Some top causes for unplanned outages

I have quickly thought about some of the events which caused unplanned downtime at my past organisation. The idea is to share them and to make sure our current monitoring infrastructure takes the aspects discussed below into account – it is better to learn from others’ experiences than to experience them yourself.

  • Listener log file grew > 2GB in a Red Hat environment. This prevented user sessions from connecting even though the listener was running. Not sure if there is an OS file size limit in AIX, but if logging is enabled, a good practice is to truncate the log on a regular basis.
  • Controlfile was multiplexed over three mount points and one of the mount points became 100% full. The error thrown by the application, and recorded in the alert log, is ORA-00600 with a number of arguments, which will not specifically indicate a disk space issue when looked up on Metalink.
  • Temporary tablespaces created with a large size for the tempfiles. At creation time, Oracle does not check whether sufficient disk space is available. The problem manifests itself when an operation such as a sort or index build starts using temporary tablespace disk space and the disk space is not adequate.
  • MAXEXTENTS value reached for tables and indexes. An application outage occurred when new tables and indexes were created and a developer hard-coded a very small value for MAXEXTENTS instead of leaving it at the default value, which is UNLIMITED.
  • Disk holding archive logs suddenly filled up due to a spurt in redo activity caused by a month-end or year-end batch job. A good practice is to have an automated RMAN job which backs up and deletes archive log files if a certain disk space usage threshold is crossed (see the RMAN sketch after this list).
  • Mount point holding the Oracle software binaries became full. This will prevent any SQL*Plus connections directly on the host machine. Since the trace file and core dump directories are usually on the same mount point as the ORACLE_HOME, we should ensure that effective daily housekeeping jobs are in place, not only to clean up old trace files but also to rename the alert log on a daily basis so that it only holds entries for one day. The alert log is normally visited when a problem occurs, and letting it accumulate entries for the past six months, or grow so large that it takes a long time to open in vi, is not a good practice.
  • Procedures or packages became INVALID because some DDL activity was done on another object with dependencies. A good practice is to ensure that no INVALID objects exist in the database and to run a job which polls the database for any objects that have become invalidated (a sample query follows this list). This also serves an auditing purpose, as it may identify whether anyone is making unauthorised modifications to the database.
  • Data Guard log shipping terminated because of some activity on the Primary database, such as a password change for SYS, or a datafile added or resized on the Primary where the corresponding mount point on the standby machine did not have adequate space to accommodate the change. Best practice is to ensure that the disk space allocation and mount point mapping are the same on both the Primary and the Standby machines.
  • Public database links were used by the application, and the password of the user account used by the database link was changed as part of a monthly password change. Best practice is to never change the password of the user account that is used to connect via the database link.
  • Limit for the SESSIONS or PROCESSES parameters defined in the init.ora was crossed. The only way to fix this is to kill sessions at the OS level or to change the parameters and restart the database. A good practice would be to have an alerting mechanism which warns you when the threshold is being approached (see the v$resource_limit query after this list). That way we can be proactive and kill any idle or inactive sessions so that this upper limit is not reached.
  • Planned outage for a data movement exercise was extended because an import of data failed due to lack of free space in a tablespace of the target database. Consequently, the whole exercise had to be repeated. Best practice is to use the RESUMABLE parameter while doing such operations so that they can be resumed from the point of failure, and also to write a database event trigger which can alert the DBA in case such an event happens (see the sketch after this list).
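
A couple of these points lend themselves to small scripts. For the archive log housekeeping, the RMAN part of such a job might look like the sketch below; it assumes the archive logs are first backed up to disk and that an external scheduler only invokes it once the archive destination crosses the chosen disk usage threshold:

    RMAN> BACKUP ARCHIVELOG ALL NOT BACKED UP 1 TIMES;
    RMAN> DELETE NOPROMPT ARCHIVELOG ALL BACKED UP 1 TIMES TO DEVICE TYPE DISK;

The first command only backs up logs that have no backup yet; the second only deletes logs that are already backed up, so a failed backup never costs an archive log.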
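
For the INVALID objects check, a polling query against the data dictionary is enough; a scheduled job could run something like the following and alert the DBA whenever it returns rows:

    SELECT owner, object_type, object_name
      FROM dba_objects
     WHERE status = 'INVALID'
     ORDER BY owner, object_type, object_name;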
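
For the SESSIONS and PROCESSES limits, the current and high-water utilisation can be compared with the configured limits using V$RESOURCE_LIMIT, so an alerting job can warn well before the ceiling is hit:

    SELECT resource_name, current_utilization, max_utilization, limit_value
      FROM v$resource_limit
     WHERE resource_name IN ('sessions', 'processes');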
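
For the resumable import point, a rough sketch of the two pieces follows: enabling resumable space allocation for the session doing the load, and an AFTER SUSPEND trigger that notifies the DBA when a statement gets suspended. The trigger body is left as a placeholder comment because the notification mechanism (mail, alert table, etc.) depends on what is available at the site:

    -- suspend instead of fail on space errors, for up to one hour
    ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600 NAME 'data movement exercise';

    CREATE OR REPLACE TRIGGER alert_dba_on_suspend
    AFTER SUSPEND ON DATABASE
    BEGIN
      NULL;  -- e.g. send mail or insert into an alert table here
    END;
    /
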
January 25, 2011

Default file locations in OMF

The following parameters enable the database Oracle-Managed Files (OMF) feature (a short example of setting them follows the list):

  • DB_CREATE_FILE_DEST

Default location for datafiles or tempfiles if no file specification is given in the creation operation. Also used as the default location for redo log files and control files if DB_CREATE_ONLINE_LOG_DEST_n is not specified.

  • DB_CREATE_ONLINE_LOG_DEST_n

Default location for redo log files and control files if no file specification is given in the creation operation. You can use this initialization parameter multiple times, where n specifies a multiplexed copy of the redo log or control file. You can specify up to five multiplexed copies.

  • DB_RECOVERY_FILE_DEST

Default location of RMAN backups if no “format” option is used, of archived logs if no other local destination (LOG_ARCHIVE_DEST_n) is configured, and of flashback logs. Also used as the default location for redo logs and control files if DB_CREATE_ONLINE_LOG_DEST_n is not specified.
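
A minimal example of enabling OMF from SQL*Plus might look like this (the paths and sizes are assumptions for illustration, and SCOPE=BOTH assumes the instance uses an spfile; note that DB_RECOVERY_FILE_DEST requires DB_RECOVERY_FILE_DEST_SIZE to be set first):

    ALTER SYSTEM SET db_create_file_dest         = '/u01/oradata' SCOPE=BOTH;
    ALTER SYSTEM SET db_create_online_log_dest_1 = '/u02/oradata' SCOPE=BOTH;
    ALTER SYSTEM SET db_create_online_log_dest_2 = '/u03/oradata' SCOPE=BOTH;
    ALTER SYSTEM SET db_recovery_file_dest_size  = 10G            SCOPE=BOTH;
    ALTER SYSTEM SET db_recovery_file_dest       = '/u04/fra'     SCOPE=BOTH;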

In an OMF configuration, the locations of the various database files are decided in the following order (a short demonstration follows the lists below):

  • location of datafiles and tempfiles
    1. as specified in the creation operation;
    2. if not specified at file creation, created in the location defined by DB_CREATE_FILE_DEST;
    3. if DB_CREATE_FILE_DEST is not defined, created in $ORACLE_HOME/dbs.
  • location of redo logs and control files
    1. as defined in DB_CREATE_ONLINE_LOG_DEST_n;
    2. if multiple DB_CREATE_ONLINE_LOG_DEST_n destinations are defined, redo logs and control files are multiplexed across them;
    3. if NONE of the DB_CREATE_ONLINE_LOG_DEST_n parameters is defined, they are created in DB_CREATE_FILE_DEST (if defined), or in DB_RECOVERY_FILE_DEST (if defined), or in both locations if both DB_CREATE_FILE_DEST and DB_RECOVERY_FILE_DEST are defined;
    4. if neither DB_CREATE_FILE_DEST nor DB_RECOVERY_FILE_DEST is defined, you will receive an error:
    alter database add standby logfile group 4 size 50M;
    ERROR at line 1:
    ORA-02236: invalid file name
    
  • location of archived logs
    1. as defined in LOG_ARCHIVE_DEST_n; if multiple LOG_ARCHIVE_DEST_n destinations are specified, archived logs are multiplexed across them;
    2. if NONE of the LOG_ARCHIVE_DEST_n parameters is defined, archived logs are created in DB_RECOVERY_FILE_DEST (if defined);
    3. if DB_RECOVERY_FILE_DEST is not defined, archived logs are created in $ORACLE_HOME/dbs/arch.
  • location of backup files
    1. as specified with the “format” option of the backup command;
    2. if “format” is not specified, they are created in DB_RECOVERY_FILE_DEST (if defined);
    3. if DB_RECOVERY_FILE_DEST is not defined, they are stored in $ORACLE_HOME/dbs.
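
To see these rules in action, here is a tiny demonstration (the tablespace name is just an example), assuming the OMF parameters shown earlier are set. With DB_CREATE_FILE_DEST defined the datafile clause can be omitted entirely, and with two DB_CREATE_ONLINE_LOG_DEST_n destinations the new log group is multiplexed across both automatically:

    -- Oracle picks the datafile name, location and default size itself
    CREATE TABLESPACE app_data;

    -- one member is created in each DB_CREATE_ONLINE_LOG_DEST_n location
    ALTER DATABASE ADD LOGFILE;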