otak-otak | favorite evening food

December 30, 2011

ORA-600 17182 Block corruption

Filed under: troubleshoot — ora62 @ 2:12 pm

Suddenly one of my customer make a call at night about ora-600 of their core system…

logging into machine and grep the alert log, found the ora-600; database can be opened for a while (about 2 minutes) and then terminate.

Doing block recovery for file 7 block 15312
Block recovery from logseq 147867, block 3707 to scn 3789052035
Recovery of Online Redo Log: Thread 1 Group 6 Seq 147867 Reading mem 0
Mem# 0: /redolog1/dbname/redo06.log
Block recovery stopped at EOT rba 147867.3776.16
Block recovery completed at rba 147867.3776.16, scn 0.3789052035
Thu Dec 29 21:35:26 2011
Completed: alter database open
Thu Dec 29 21:35:29 2011
Starting background process CJQ0
Thu Dec 29 21:35:29 2011
CJQ0 started with pid=45, OS id=192
Thu Dec 29 21:35:29 2011
Sweep Incident[95434]: completed
Errors in file /oracle/diag/diag/rdbms/dbnameg/dbname/trace/dbname_smon_29913.trc:
SVR4 Error: 2: No such file or directory
Additional information: 3
ORACLE Instance dbname (pid = 13) – Error 600 encountered while recovering transaction (203, 27) on object 79208.
Errors in file /oracle/diag/diag/rdbms/dbnameg/dbname/trace/dbname_smon_29913.trc:
ORA-00600: internal error code, arguments: [17182], [0xFFFFFFFF7B81D750], [], [], [], [], [], []
Errors in file /oracle/diag/diag/rdbms/dbnameg/dbname/trace/dbname_smon_29913.trc (incident=95435):
ORA-00600: internal error code, arguments: [KSMFPG2], [0xFFFFFFFF7B81D000], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0xFFFFFFFF7B81D750], [], [], [], [], [], []
Incident details in: /oracle/diag/diag/rdbms/dbnameg/dbname/incident/incdir_95435/dbname_smon_29913_i95435.trc
Thu Dec 29 21:35:32 2011
Trace dumping is performing id=[cdmp_20111229213532]
Thu Dec 29 21:35:33 2011
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /oracle/diag/diag/rdbms/dbnameg/dbname/trace/dbname_smon_29913.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0xFFFFFFFF7B81D000], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0xFFFFFFFF7B81D750], [], [], [], [], [], []
SMON (ospid: 29913): terminating the instance due to error 474
Instance terminated by SMON, pid = 29913

from the alert log I found there are corruption at block level, but oracle actualy can recovered it automatically, but it seems when there is undo application for a transaction got stuck due to block corrupt at the other block.

find the object name as informed by alertlog:

select owner,object_name from dba_objects where object_id=79208;

now time to scan all the logical corruption by rman validate command:

run {
allocate channel d1 type disk;
backup check logical validate database;
release channel d1;
}

confirmed there are 2 block corruption on the table:

Error backing up file 7, block 15312: logical corruption
Error backing up file 7, block 18714: logical corruption

another check by dbms_repair to investigate the real corruption, and marked it as corrupt:
declare
begin
dbms_repair.admin_tables(
table_name => ‘REPAIR_TABLE_99’,
table_type => dbms_repair.repair_table,
action => dbms_repair.create_action,
tablespace => ‘USERS’);
end;
/
select owner,object_name,object_type
from dba_objects
where object_name like ‘%REPAIR_TABLE%’;

declare
begin
dbms_repair.admin_tables(
table_name => ‘ORPHAN_REP_TBL_99’,
table_type => dbms_repair.orphan_table,
action => dbms_repair.create_action,
tablespace => ‘USERS’);
end;
/

select owner,object_name,object_type
from dba_objects
where object_name like ‘%ORPHAN%’;

set serveroutput on

declare
rpr_count int;
begin
rpr_count :=0;

dbms_repair.check_object(
schema_name => ‘TEST’,
object_name => ‘TBL1’,
repair_table_name => ‘REPAIR_TABLE_99’,
corrupt_count => rpr_count);

dbms_output.put_line(‘rep cnt : ‘||to_char(rpr_count));
end;
/

select object_name,block_id,corrupt_type,marked_corrupt,corrupt_description,repair_description from repair_table_99;

OBJECT_NAME BLOCK_ID CORRUPT_TYPE MARKED_COR
—————————— ———- ———— ———-
CORRUPT_DESCRIPTION
——————————————————————————–
REPAIR_DESCRIPTION
——————————————————————————–
TBL1 15312 6130 FALSE
Block Checking: DBA = 29375440, Block Type = KTB-managed data block
data header at 0x3d85b007c
kdbchk: fseo(223) < fsbo(282)
mark block software corrupt
TBL1 18714 6110 FALSE
Block Checking: DBA = 29378842, Block Type = KTB-managed data block
data header at 0x3d756e0ac
kdbchk: the amount of space used is not equal to block size
used=7403 fsc=0 avsp=769 dtl=8016
mark block software corrupt

mark those block as corrupted:

declare
l_fix_count int;
begin
l_fix_count:=0;

dbms_repair.fix_corrupt_blocks(
schema_name => ‘TEST’,
object_name => ‘TBL1’,
object_type => dbms_repair.table_object,
repair_table_name => ‘REPAIR_TABLE_99’,
fix_count => l_fix_count);

dbms_output.put_line(‘fix cnt : ‘||to_char(l_fix_count));
end;
/

select object_name,block_id,corrupt_type,marked_corrupt,corrupt_description,repair_description from repair_table_99;

OBJECT_NAME BLOCK_ID CORRUPT_TYPE MARKED_COR
—————————— ———- ———— ———-
CORRUPT_DESCRIPTION
——————————————————————————–
REPAIR_DESCRIPTION
——————————————————————————–
TBL1 15312 6130 TRUE
Block Checking: DBA = 29375440, Block Type = KTB-managed data block
data header at 0x3d85b007c
kdbchk: fseo(223) < fsbo(282)
mark block software corrupt

TBL1 18714 6110 TRUE
Block Checking: DBA = 29378842, Block Type = KTB-managed data block
data header at 0x3d756e0ac
kdbchk: the amount of space used is not equal to block size
used=7403 fsc=0 avsp=769 dtl=8016
mark block software corrupt

now time to recover the table
skip corrupted block from reading, and get the good data:

alter session set event=’10231 trace name context forever, level 10’;
create table test.tmp as select * from test.tbl1;

ok done, now the application guy who have to look into the data and fix them
have a nice sleep then :)

Advertisements

May 9, 2011

resource manager 80% cpu only

Filed under: troubleshoot — ora62 @ 7:53 am

here the steps of configuring resource manager to only gives oracle 80% cpu only for all the user sessions belongs to GROUP_OLTP.

set serverout on size 5555
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN ='';
exec dbms_resource_manager.clear_pending_area();
exec dbms_resource_manager.create_pending_area();
exec dbms_resource_manager.DELETE_PLAN('PLAN_OLTP');
exec dbms_resource_manager.DELETE_CONSUMER_GROUP('GROUP_OLTP');
exec DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
exec DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
begin
dbms_resource_manager.clear_pending_area();
dbms_resource_manager.create_pending_area();
dbms_resource_manager.create_consumer_group(consumer_group=>'GROUP_OLTP',comment=>'oltp rsrc');
dbms_resource_manager.create_plan(plan=>'PLAN_OLTP',comment=>'batch plan');
dbms_resource_manager.create_plan_directive(plan=>'PLAN_OLTP',group_or_subplan=>'GROUP_OLTP',comment=>'plan dir night',cpu_p1=>80,parallel_degree_limit_p1=>16);
dbms_resource_manager.create_plan_directive(plan=>'PLAN_OLTP',group_or_subplan=>'OTHER_GROUPS',comment=>'plan dir night',cpu_p3=>80,parallel_degree_limit_p1=>8);
dbms_resource_manager.validate_pending_area();
dbms_resource_manager.submit_pending_area();
end;
/
commit
/
begin
dbms_resource_manager_privs.grant_switch_consumer_group(grantee_name=>'SCOTT',consumer_group=>'GROUP_OLTP',grant_option=>false);
dbms_resource_manager_privs.grant_switch_consumer_group(grantee_name=>'SH',consumer_group=>'GROUP_OLTP',grant_option=>false);
dbms_resource_manager.set_initial_consumer_group(user=>'SCOTT',consumer_group=>'GROUP_OLTP');
dbms_resource_manager.set_initial_consumer_group(user=>'SH',consumer_group=>'GROUP_OLTP');
end;
/
alter system set resource_manager_plan=PLAN_OLTP scope=both
/
alter session set resource_manager_plan=PLAN_OLTP scope=both
/

set lines 200
COL consumer_group      FORMAT A24  HEADING 'Consumer Group'
COL cpu_method          FORMAT A18  HEADING 'CPU Method'
COL status              FORMAT A10  HEADING 'Status'
COL mandatory           FORMAT A06  HEADING 'Manda-|tory?'
COL comments            FORMAT A32  HEADING 'Comments'
SELECT
     consumer_group
    ,cpu_method
    ,status
    ,mandatory
    ,comments
  FROM dba_rsrc_consumer_groups;
 
COL plan                      FORMAT A18  HEADING 'Resource Plan'
COL num_plan_directives       FORMAT 9999 HEADING '# of|Plan|Dirs'
COL cpu_method                FORMAT A18  HEADING 'CPU Method'
COL active_sess_pool_mth      FORMAT A32  HEADING 'Active|Session|Pool|Method'
COL parallel_degree_limit_mth FORMAT A32  HEADING 'Parallel|Limit|Method'
COL queueing_mth              FORMAT A18  HEADING 'Queueing|Method'
COL status                    FORMAT A10  HEADING 'Status'
COL mandatory                 FORMAT A06  HEADING 'Manda-|tory?'
COL comments                  FORMAT A32  HEADING 'Comments'
SELECT
     plan
    ,num_plan_directives
    ,cpu_method
    ,active_sess_pool_mth
    ,parallel_degree_limit_mth
    ,queueing_mth
    ,status
    ,mandatory
  FROM dba_rsrc_plans
;
 
COL plan                      FORMAT A18  HEADING 'Resource Plan'
COL group_or_subplan          FORMAT A12  HEADING 'Group or|SubPlan'
COL type                      FORMAT A15  HEADING 'Type'
COL cpu_p1                    FORMAT 999  HEADING 'CPU|1 %'
COL cpu_p2                    FORMAT 999  HEADING 'CPU|2 %'
COL cpu_p3                    FORMAT 999  HEADING 'CPU|3 %'
COL status                    FORMAT A08  HEADING 'Status'
COL mandatory                 FORMAT A06  HEADING 'Manda-|tory?'
COL comments                  FORMAT A24  HEADING 'Comments'

SELECT
     plan
    ,group_or_subplan
    ,type
    ,cpu_p1
    ,cpu_p2
    ,cpu_p3
    ,status
  FROM dba_rsrc_plan_directives
 WHERE plan = 'PLAN_OLTP'
;

COL initial_rsrc_consumer_group  FORMAT A24  HEADING 'Resource|Consumer|Group'
COL username                     FORMAT A12  HEADING 'User Name'
SELECT
	 initial_rsrc_consumer_group,
    username
  FROM dba_users
 ORDER BY 1,2;

SELECT se.sid sess_id, co.name consumer_group,
 se.state, se.consumed_cpu_time cpu_time, se.cpu_wait_time, se.queued_time
 FROM v$rsrc_session_info se, v$rsrc_consumer_group co
 WHERE se.current_consumer_group_id = co.id;

select username,resource_CONSUMER_GROUP,count(*) from v$session group by username,resource_CONSUMER_GROUP;

August 13, 2008

TM deadlock and FK indexing

Filed under: troubleshoot — ora62 @ 8:19 am

this RAC deadlock message in udump directory:

*** 2008-08-13 12:54:39.383
ENQUEUE DUMP REQUEST: from 0.26495 on [0x23a9][0x0],[TM] for reason 2 mtype 0
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x23a9][0x0],[TM]
———-resource 0x3cb43d330———————-
resname       : [0x23a9][0x0],[TM]
Local node    : 1
dir_node      : 1
master_node   : 1
hv idx        : 8
hv last r.inc : 4
current inc   : 4
hv status     : 0
hv master     : 1
open options  : dd cached
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_node   : 1
vbreq_state   : 0
state         : x0
resp          : 3cb43d330
On Scan_q?    : N
Total accesses: 1017
Imm.  accesses: 885
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 3d613b9d0 gl KJUSEREX rp 3cb43d330 [0x23a9][0x0],[TM]
master 1 gl owner 3db861760 possible pid 25585 xid 2013-013F-00000022 bast 0 rseq 2 mseq 0 history 0xa51449a5
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 3d828b628 gl KJUSERNL rl KJUSEREX rp 3cb43d330 [0x23a9][0x0],[TM]
master 1 owner 0  bast 1 rseq 5 mseq 0x560056 history 0xd8d7d8da
convert opt KJUSERGETVALUE KJUSERNOTDDVICTIM
———-enqueue 0x3d613b9d0————————
lock version     : 3199
Owner node       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : 0
resp             : 3cb43d330
procp            : 3dbc1c070
pid              : 25585
proc version     : 15
oprocp           : 0
opid             : 0
group lock owner : 3db861760
possible pid     : 25585
xid              : 2013-013F-00000022
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE KJUSERNOTDDVICTIM
History          : 0xa51449a5
Msg_Seq          : 0x0
res_seq          : 2
valblk           : 0x00000000000000000000000000000000 .
Potential blocker (pid=25585) on resource TM-000023A9-00000000
DUMP LOCAL BLOCKER: initiate state dump for TIMEOUT
possible owner[319.25585]
Submitting asynchronized dump request [28]

your friend should be calculator with hexa to decimal conversion :)

the first time to investiage this by converting TM-000023A9-00000000, 23A9 to decimal and then query the dba_objects to find the object name, object type.if you sure that the object is child from master-detail table relationship, make sure the column which designate as foreign key should be indexed.

Ujang | Jakarta | Indonesia

Create a free website or blog at WordPress.com.