Archive for the ‘Group Blog Posts’ Category

My broken Planet MySQL feed..

Сентябрь 2nd, 2011

As those of you with an interest in MySQL may have seen, I’ve recently surged into the list of “Most Active Authors” on Planet MySQL, alas I’m not as prolific a blogger as it may seem, and this is purely down to an error with the feed I submitted to Planet MySQL.

Having recently joined Pythian’s MySQL team, in anticipation of my first blog post, I was eager to add my blog feed to Planet MySQL for aggregation.  Unfortunately in my haste I pasted the link to my blog as opposed to the blog feed and as the link was erroneous, Planet MySQL recursed back up the url until it reached a valid feed.  Unfortunately for me, that was the main Pythian Blog feed and as a result all Pythian blog content is now getting aggregated under my name.

As there’s no way for me to fix this directly, I’m working with Oracle to resolve the issue asap and will hopefully have a resolution soon. Until then I apologize for any noise caused by posts that aren’t directly MySQL related.

The moral of the story here is to use the Feed Validator helpfully linked to by Planet MySQL, and heed it’s advice if it tells you there may be an issue with your feed!


PlanetMySQL Voting: Vote UP / Vote DOWN

A review of Tungsten Replicator: Part 1 – Installation

Сентябрь 1st, 2011

I’ve been following the development of Tungsten Replicator for quiet some time now, and recently was fortunate enough to find the time to take a look at the product in more detail.

If you haven’t heard of Tungsten Replicator yet, it’s an open source database replication engine that can be used to complement or completely replace native MySQL Replication.  In addition to providing standard replication functionality, Tungsten Replicator introduces exciting new features such as global transaction IDs, heterogeneous replication from MySQL to Oracle and Postgres, parallel replication, and the ability to replicate from multiple masters to a single slave.

Giuseppe Maxia and Robert Hodges have been writing some excellent blog posts on Tungsten, providing great detail on both the architecture and functionality of the product.  One of Giuseppe’s recent posts (here) detailed a new simplified installation procedure so this seemed like a natural place to begin…

Just as an aside, I’m a big fan of VirtualBox when it comes to testing on my laptop.  I’ve created several base images which I select from depending on the task, and simply create a new virtual machine by cloning one of these images.  I used to do this using a combination of VBoxManage clonehd and the VirtualBox gui, but with the release of VirtualBox 4.1 this has been simplified with the introduction of the command clonevm, and it can now be achieved through a simple gui wizard or by executing the following command:

VBoxManage clonevm VirtualMachineName --name NewVMName --register

Additionally VirtualBox provides functionality to take a snapshot of a virtual machine, which makes testing different scenarios and rolling back changes very simple.

Anyway back to installing Tungsten.. ..I looked at Giuseppe’s instructions, and began working through the list of prerequisites.

The first prerequisite was to select a Unix-like operating system, using VirtualBox, I chose a 64bit install of CentOS 5.5 as my base image, and cloned and fired up three vm’s.  The image already had MySQL 5.5 installed, and to simplify testing I set SELinux to permissive and turned off iptables, also for ease, I assigned each vm a hostname (tungsten1, tungsten2, & tungsten3) and added the references to the /etc/hosts file.

The second and third prerequisites specify some required packages, I downloaded and installed the 64bit jre rpm from www.java.com/en/download/, and installed ruby and ruby-libs using the CentOS updates repository.

Next on the list was to create a user account to install and run Tungsten.  The prereq’s specify the user needs sudo access, ssh access to the other hosts involved, and read access to the MySQL binary logs.

For the purposes of the test, as the root user I created an additional user on each machine with access to the required group as follows:

[root@tungsten1 ~]# useradd tungsten -Gmysql
[root@tungsten1 ~]# passwd tungsten

I then edited /etc/sudoers and added:

tungsten ALL=(ALL)    NOPASSWD: ALL

Then logging in as the user “tungsten”, I setup the ssh access:

[tungsten@tungsten1 ~]$ ssh-keygen
[tungsten@tungsten1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub tungsten2
[tungsten@tungsten1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub tungsten3

Finally I created the mysql user and ensured the binary log was enabled.

mysql> create user tungstenmysql identified by 'tungsten';
mysql> grant all privileges on *.* to tungstenmysql with grant option;
mysql> flush privileges;
mysql> show global variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
1 row in set (0.00 sec)

With the prerequisites complete, I downloaded and extracted the Tungsten Replicator installation files.

[tungsten@tungsten1 ~]$ wget http://tungsten-replicator.googlecode.com/files/tungsten-replicator-2.0.3.tar.gz
[tungsten@tungsten1 ~]$ tar -xzvf tungsten-replicator-2.0.3.tar.gz

I then tried to complete the installation using the command mentioned in Giuseppe’s post:

[tungsten@tungsten1 ~]$ cd tungsten-replicator-2.0.3
[tungsten@tungsten1 tungsten-replicator-2.0.3]$ ./tools/tungsten-installer \
--master-slave \
--master-host=tungsten1 \
--datasource-user=tungstenmysql \
--datasource-password=tungsten \
--service-name=rep1 \
--home-directory=/opt/continuent/ \
--cluster-hosts=tungsten1,tungsten2,tungsten3 \
--start

Take 1: The moment of truth:

[tungsten@tungsten1 ~]$ cd /opt/continuent/
[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services

NAME              VALUE
----              -----
appliedLastSeqno: -1
appliedLatency  : -1.0
role            : master
serviceName     : rep1
serviceType     : unknown
started         : true
state           : OFFLINE:ERROR
Finished services command...

Hmmm time for some research.. ..the Tungsten Replicator project is hosted on Google code here, and documentation can be found here. By taking a look through the documentation and exploring the filesystem, I found there’s a replicator log called trepsvc.log that can be found in the <home-directory>/tungsten/tungsten-replicator/log/ folder. At first glance the log appears a bit noisy, but a simple grep on ERROR and then WARN showed there was an issue reading the binary log files. It seemed to be looking for mysql-bin.index even though I knew from the my.cnf I’d used, the binary logs took the format of mysqld-bin.

Taking a look at the tungsten-installer tool on the project wiki here, I found the extended list of options, and saw I could specify a –datasource-log-pattern option (the default for which was as expected, mysql-bin). Additionally I can see options available to specify the –master-log-file and –master-log-pos, which would obviously be useful if setting this up in an already running production environment.

Take 2: I removed the current installation and reran the install with the additional –datasource-log-pattern option.

[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/replicator stop
[tungsten@tungsten1 continuent]$ rm -Rf /opt/continuent/*

mysql> drop database tungsten_rep1;
mysql> reset master;

[tungsten@tungsten1 continuent]$ ~/tungsten-replicator-2.0.3/tools/tungsten-installer \
--master-slave \
--master-host=tungsten1 \
--datasource-user=tungstenmysql \
--datasource-password=tungsten \
--datasource-log-pattern=mysqld-bin \
--service-name=rep1 \
--home-directory=/opt/continuent/ \
--cluster-hosts=tungsten1,tungsten2,tungsten3 \
--start

Checking the master:

[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services

NAME VALUE
---- -----
appliedLastSeqno: 1
appliedLatency : 0.901
role : master
serviceName : rep1
serviceType : local
started : true
state : ONLINE
Finished services command...

Success!

Checking the slaves:

[tungsten@tungsten2 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services

NAME VALUE
---- -----
appliedLastSeqno: -1
appliedLatency : -1.0
role : slave
serviceName : rep1
serviceType : unknown
started : true
state : OFFLINE:ERROR
Finished services command...

Hmmm time for some more research.. ..checking the log I could see there was an issue with setting the NO_ENGINE_SUBSTITUTION sql_mode, it looked like it was actually trying to set it to MODE_NO_ENGINE_SUBSTITUTION, this sounded a little buggy. I decided to search all issues listed on the Google code project site, and sure enough I found issue number 112 which described the behavior exactly – it actually looks like there’s a fix been submitted already. For now, just so I could proceed with the installation, I decided to remove the NO_ENGINE_SUBSTITION sql-mode option from the my.cnf.

Take 3: I removed the current installation and reran the install with the additional –datasource-log-pattern option and without NO_ENGINE_SUBSTITUTION.

[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/replicator stop
[tungsten@tungsten1 continuent]$ rm -Rf /opt/continuent/*

mysql> drop database tungsten_rep1;
mysql> reset master;

[tungsten@tungsten1 continuent]$ ~/tungsten-replicator-2.0.3/tools/tungsten-installer \
--master-slave \
--master-host=tungsten1 \
--datasource-user=tungstenmysql \
--datasource-password=tungsten \
--datasource-log-pattern=mysqld-bin \
--service-name=rep1 \
--home-directory=/opt/continuent/ \
--cluster-hosts=tungsten1,tungsten2,tungsten3 \
--start

Checking the master:

[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services

NAME VALUE
---- -----
appliedLastSeqno: 1
appliedLatency : 0.602
role : master
serviceName : rep1
serviceType : local
started : true
state : ONLINE
Finished services command...

Success!

Checking the slaves:

NAME VALUE
---- -----
appliedLastSeqno: 1
appliedLatency : 0.0
role : slave
serviceName : rep1
serviceType : local
started : true
state : ONLINE
Finished services command...

Success!

Whilst it wasn’t totally seamless, the installation and troubleshooting were both pretty straightforward, especially considering this was literally the first time I’ve looked at Tungsten in any detail. I’m now able to create tables, and insert test data on the master, and see it replicate to the slaves. Watching replication in action and seeing data actually move from a to b, is truly one of those things that’s a joy to see!

Having explored the basics of setting up Tungsten Replicator, in the next post, I’ll look to explore some of the commands are available to administer the engine.


PlanetMySQL Voting: Vote UP / Vote DOWN

Curious case with MySQL replication

Сентябрь 1st, 2011

MySQL Replication is a powerful tool and it’s hard to find a production system not using it. On the other hand debugging replication issues can be very hard and time consuming. Especially if your replication setup is not straightforward and you are using filtering of some kind.

Recently we got an alert from our monitoring system that replication stopped on production slave with the following error:

Can't find record in 'some_table', Error_code: 1032;
handler error HA_ERR_KEY_NOT_FOUND;
the event's master log binlog.000011, end_log_pos 735766642

This means that a ROW-based replication event was going to be applied on slave, but could not find the row it was supposed to be applied to. This is something I like about ROW format — it allows you to catch such data synchronization issues right away. In this particular case MIXED format was used, but if this event was written in STATEMENT format, slave would just apply it with no error. Event was just an UPDATE and in STATEMENT format when update doesn’t affect any rows on slave it just proceeds to next event, etc.

This error was very concerning because this was a new production machine which was built out from latest hot backup several hours ago. Two ideas which come into mind in such situations are: something went wrong during switch over to new master server or something was left running on old production master (which now became slave) and is changing data there.

The first idea about initial replication misconfiguration was proven not valid when we checked the timestamp of the row which existed on master, but was missing on slave. The row was added several hours after switchover happened, so it was very unlikely that new slave was just pointed to wrong binary log position.

To check second idea about something still writing to slave we scanned slave binary logs in search of statements with server_id equal slave server id, but there was nothing there. There were also no connections except replication in slave’s SHOW PROCESSLIST. Further investigation has shown that there were many more rows missing on slave, with several different tables affected.

So when all the simple potential root causes are checked and rejected, it’s time to start digging into binary logs. Unfortunately, there not many tools available out there to help you with this. All you have is mysqlbinlog utility and Linux text processing tools like sedand grep. To start with something I took first row which caused issues on slave, I had it’s primary key value and created_at timestamp.

After some time of master’s binary logs debugging I could see the following picture: there were several UPDATE statements, but I couldn’t find original INSERT statement where the row was initially added. Check on several other missing rows gave similar results: there were some UPDATEs in master’s binary log, but no initial INSERT. Another interesting fact was that all the missing rows were created on master around the same time in 5-10 minutes interval.

At this point it became clear that some rows were added on master, but were never written to binary log. How could this happen? I had another set of possibilities: someone was running a manual session on master with SQL_LOG_BIN=0, we were hitting some exotic MySQL bug (new master was built with MySQL 5.5) or something else happened and I just had not enough facts available.

I tried to find correlation between missing rows, but no luck and I knew customer had pretty strict access policy to this servers, so most of the people didn’t have write access to the database. Checking the MySQL bug database also didn’t reveal anything relevant and I would expect a bug which prevents some events from being written to binary log to be found very soon. And then I remembered something, something that was implemented on this system several month ago.

There are some stored procedures that are being executed on daily basis. These stored procedures create several temporary tables, fill them with data, massage the data in some way and then return result set back to client. While there procedures do not modify production data in any way, all operations related to temporary table creation are written to the binary log and thus replicated to slaves, creating absolutely unnecessary additional load on slaves.
To avoid this a decision was made to disable binary logging inside the stored procedure — it was safe as no tables other than temporary were updated. Binary logging was disabled at very beginning of the procedure and was enabled when procedure ended:

CREATE DEFINER = 'user'@'localhost' PROCEDURE proc1()
BEGIN
  SET SESSION SQL_LOG_BIN = 0;
  .............
  .............
  .............
  SET SESSION SQL_LOG_BIN = 1;
END;

This approach worked well, until something went wrong.

We did have logging enabled for stored procedures using our own framework to log every stored procedure step and overall execution status in a separate logging table. When I check this log table I found out that one of the stored procedures failed around the time we start seeing missing rows on slave. All pieces came together and it became clear what actually happened.

When stored procedure failed in the middle (due to space issue in MySQL temp directory) it’s connection was left in state where binary log was disabled but was never enabled back. Then application connection pooling came into play and this “altered” connection was reused. This means for some time one of the application threads was using a connection with disabled binary logging. I didn’t have any facts that would be pointing exactly to this, and I’m not sure if it is possible at all to obtain such evidence, but all other facts were pointing that this was the root cause.

Conclusion. I would like to say do not use custom replication filtering and this solves potential issues, but first of all there are many cases where replication filters are very useful and as with any more or less complex incident there were several factors that in particular combination triggered the issue. MySQL temp directory was placed on a smaller volume on the new master and this fact was overlooked. When decision was made to disable binary logging for stored procedures not all test cases were considered, like what happens if procedure fails. Finally, I would expect connection pool to be able to save connection session variables and restore them when connection is recycled.

This incident had serious impact on the customer’s production system, but it was also a good lesson on how even very unlikely scenarios can happen and how looking at the problem from different angles can help to reveal the real root cause.


PlanetMySQL Voting: Vote UP / Vote DOWN

RDBMS Online Patching

Август 30th, 2011

If you support Oracle RDBMS 11.2.0.2 and want to zero downtime applying
patches to databases then it is time to have a look at new possibility
of Online Patching delivered with 11.2.0.2 version and described here
RDBMS Online Patching Aka Hot Patching [ID 761111.1]

The syntax is “opatch apply online -connectString … ” but I could not find description
of the online option (only -connectString) using opatch -help (version 11.2.0.1.6)
although based on the note it is available since 11.1.0.6 version of opatch

And while going through README for the patch I found the blog
about online patching – Applying online patch on 11gr2
which helped me greatly to get proper syntax for online patching.

I had test RAC database with applied PSU3 and checked if there
were any online patches available on MOS. I found quite a few and downloaded
several of them to check what information “opatch query” will show.

Surprisingly, “opatch query” did not show any of them to be applicable online
but all of them had online directory with .pch file underneath and all
of them had “Online Installable Patch” in their README
for example 10040035: AROLTP-D: INSTANCE DIED WITH ORA-00600 [KCLPDC_21]

[oracle@r1 10040035]$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch query .
Invoking OPatch 11.2.0.1.6
...
 Need to shutdown Oracle instances: true
 Patch is roll-backable: true
 Patch is a "Patchset Update": false
 Patch is a rolling patch: true
 Patch has sql related actions: false
 Patch is an online patch: false
 Patch is a portal patch: false
 Patch is an "auto-enabled" patch: false

 List of platforms supported:
   226: Linux x86-64

 List of prereq patches:
   12419331

 List of overlay patches:
   12419331

 List of bugs to be fixed:
   10040035: AROLTP-D: INSTANCE DIED WITH ORA-00600 [KCLPDC_21]
...
OPatch succeeded.
[oracle@r1 10040035]$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch query -is_online_patch .
Invoking OPatch 11.2.0.1.6

Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.6
OUI version       : 11.2.0.2.0
Log file location : /u01/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-29_01-41-03AM.log

--------------------------------------------------------------------------------
 Patch is an online patch: false

OPatch succeeded.
[oracle@r1 10040035]$ tree .
.
|-- README.txt
|-- etc
|   |-- config
|   |   |-- actions.xml
|   |   |-- deploy.xml
|   |   `-- inventory.xml
|   `-- xml
|       |-- GenericActions.xml
|       `-- ShiphomeDirectoryStructure.xml
|-- files
|   `-- lib
|       `-- libserver11.a
|           `-- kcl.o
`-- online
    |-- etc
    |   |-- config
    |   |   |-- actions.xml
    |   |   |-- deploy.xml
    |   |   `-- inventory.xml
    |   `-- xml
    |       |-- GenericActions.xml
    |       `-- ShiphomeDirectoryStructure.xml
    `-- files
        `-- hpatch
            `-- bug10040035.pch

I have been able to apply all of them without any downtime and all
of them appeared to be ENABLED on both instances:

r1221> oradebug patch list

Patch File Name                                   State
================                                =========
bug9795214.pch                                   ENABLED
bug10040035.pch                                  ENABLED
bug10222480.pch                                  ENABLED
bug11670161.pch                                  ENABLED
bug11853815.pch                                  ENABLED
bug9829397.pch                                   ENABLED

Interesting enough that either for RAC or for standalone database
“opatch apply online” was converted to RAC mode “-all_nodes”
although patches were applied properly for both clustered and non-clustered
environments.
RAC database:

[oracle@r1 10040035]$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch apply \
online -connectString r1221:sys:pwd:r1,r1222:sys:pwd:r2
Invoking OPatch 11.2.0.1.6

Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.6
OUI version       : 11.2.0.2.0
Log file location : /u01/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-29_02-07-19AM.log

The patch should be applied/rolled back in '-all_nodes' mode only.
Converting the RAC mode to '-all_nodes' mode.
Applying interim patch '10040035' to OH '/u01/app/oracle/product/11.2.0/dbhome_1'
Verifying environment and performing prerequisite checks...

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name:

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y
Backing up files...

Patching component oracle.rdbms, 11.2.0.2.0...
The patch will be installed on active database instances.
Installing and enabling the online patch 'bug10040035.pch', on database 'r1221'.

Patching in all-node mode.

Updating nodes 'r2'
   Apply-related files are:
     FP = "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/
copy_files.txt.instantiated" by replacing $ORACLE_HOME
in "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/copy_files.txt"
with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/
copy_dirs.txt.instantiated" by replacing $ORACLE_HOME
in "/u01/app/oracle/product/11.2.0/dbhome_1/.patch_storage/10040035_Jul_25_2011_06_02_21/rac/copy_dirs.txt"
with actual path.
Propagating directories to remote nodes...
Installing and enabling the online patch 'bug10040035.pch', on database 'r1222' on node 'r2'.

Patch 10040035 successfully applied
Log file location: /u01/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-29_02-07-19AM.log

OPatch succeeded.

non-RAC database

[oracle@oel5 11800959]$ /u02/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch apply online -connectString d122:sys:pwd:
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /u02/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /u02/app/oracle/product/11.2.0/dbhome_1/oui
Log file location : /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-29_15-27-30PM.log

Patch history file: /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch_history.txt

The patch should be applied/rolled back in '-all_nodes' mode only.
Converting the RAC mode to '-all_nodes' mode.
ApplySession applying interim patch '11800959' to OH '/u02/app/oracle/product/11.2.0/dbhome_1'

Running prerequisite checks...

OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

Backing up files and inventory (not for auto-rollback) for the Oracle Home
Backing up files affected by the patch '11800959' for restore. This might take a while...
Backing up files affected by the patch '11800959' for rollback. This might take a while...

Patching component oracle.rdbms, 11.2.0.2.0...
The patch will be installed on active database instances.
Installing and enabling the online patch 'bug11800959.pch', on database 'd122'.

ApplySession adding interim patch '11800959' to inventory

Verifying the update...
Inventory check OK: Patch ID 11800959 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 11800959 are present in Oracle Home.

OPatch succeeded.

I think that conversion a little bit misleading for RAC database patching
because credentials should be entered for all instances of clustered database
and if you do not type it for one of them, there would not be any errors,
execution will be finished but patch will be applied only for nodes which credentials
are provided.

Online patches can be easily managed by oradebug and can be turned
on and off at each instance of RAC database or for the whole non-RAC database.
However “opatch lsinventory” does not show status of online patch
only if it is applied.

d122> oradebug patch list

Patch File Name                                   State
================                                =========
bug11800959.pch                                  ENABLED

d122> oradebug patch disable bug11800959.pch
Statement processed.
d122> oradebug patch list

Patch File Name                                   State
================                                =========
bug11800959.pch                                  DISABLED

d122> oradebug patch enable bug11800959.pch
Statement processed.
d122> oradebug patch list

Patch File Name                                   State
================                                =========
bug11800959.pch                                  ENABLED

d122> oradebug patch disable bug11800959.pch term
Statement processed.
d122> oradebug patch list

Patch File Name                                   State
================                                =========
bug11800959.pch                                  DISABLED

[oracle@oel5 u01]$ /u02/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch lsinventory
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /u02/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /u02/app/oracle/product/11.2.0/dbhome_1/oui
Log file location : /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2011-08-30_09-36-16AM.log

Patch history file: /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch_history.txt

Lsinventory Output file location : /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2011-08-30_09-36-16AM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1):

Oracle Database 11g                                                  11.2.0.2.0
There are 1 products installed in this Oracle Home.

Interim patches (1) :

Patch (online) 11800959: applied on Mon Aug 29 15:27:55 EST 2011
Unique Patch ID:  13914499
   Created on 8 Jul 2011, 03:51:56 hrs PST8PDT
   Bugs fixed:
     11800959

--------------------------------------------------------------------------------

d122> oradebug patch enable bug11800959.pch
Statement processed.
d122> oradebug patch list

Patch File Name                                   State
================                                =========
bug11800959.pch                                  ENABLED

Oracle Support encourages to apply online patches only in cases when
downtime can not be scheduled immediately but strongly recommends to replace
them with normal patches that requires shutdown of instances.

And all my tests were done on test databases so I have not
taken any risk yet to do it on production. But the feature is very useful
and number of patches being steadily increased over last releases.

Have a good day!


PlanetMySQL Voting: Vote UP / Vote DOWN

Alter waiting on Select statement to complete

Август 29th, 2011

A few days ago, we faced an interesting problem on one of our customer’s slave mysqld servers.  An Alter for adding a new column was run on master server took 542 seconds where as it took few hours on the slave server to complete due to a SELECT blocking the Alter was not allowed to complete.

Here is the the select on the master server and it’s execution time from the binary logs.

# at 825737566
#110720 19:55:21 server id 31415  end_log_pos 825737730         Query   thread_id=83250629      exec_time=542   error_code=0
use collect/*!*/;
SET TIMESTAMP=1311216921/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id`
/*!*/;

Alter statement completed well on the master and it got blocked by a SELECT on the slave where as the time frames of the threads in the processlist which were executing the Alter and the SELECT resembles the ALTER was started first and then the SELECT next. As per the binary logs on the master the Alter was a single statement by the thread_id and the slave started lagging.

mysql> show processlist ;
+---------+-------------+----------------------------+-------------+-------------+--------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id      | User        | Host                       | db          | Command     | Time   | State                                                          | Info                                                                                                 |
+---------+-------------+----------------------------+-------------+-------------+--------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
|     864     | system user |                            | NULL         | Connect     | 479352 | Waiting for master to send event                               | NULL                                                                                                 |
|     865     | system user |                            | collect       | Connect     |   5383 | rename result table                                          | ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id |
| 3277392 | ios               | intreport:56045   | collect       | Query         |   4850 | Copying to tmp table                                       | SELECT /* Reporting Query 357 */
fdr.service_provider_code AS 'Provider',
fdr.service_provider_count |
| 3279284 | uios             | solr02:47871       | collect       | Query         |   4503 | Waiting for table                                              | select track_id,
vendor_track_id,
track_name,
if(promo = 'Y', 'true', 'false') as promo_track,
if( |
.........
.........

 Seconds_Behind_Master: 5131

We did the possible checks to find the status of the Alter and why is it waiting at the RENAME RESULT TABLE to complete. The file level checks shows that the creation of the temp table for Alter was completed while ago.

pythian@db04:~$ ls -ltr /ssd2/mysql/collect/track.*
-rwxr-xr-x 1 mysql mysql      19153 2011-07-15 06:11 /ssd2/mysql/collect/track.frm
-rwxr-xr-x 1 mysql mysql       8980 2011-07-15 06:11 /ssd2/mysql/collect/track.TRG
-rwxr-xr-x 1 mysql mysql 2646605824 2011-07-20 20:40 /ssd2/mysql/collect/track.ibd

pythian@db04:~$ ls -ltr /mysql/data/collect/track.*
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.TRG -> /ssd2/mysql/collect/track.TRG
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.ibd -> /ssd2/mysql/collect/track.ibd
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.frm -> /ssd2/mysql/collect/track.frm

pythian@db04:~$ ls -ltr /mysql/data/collect/*#*
-rwxr-xr-x 1 mysql mysql      19205 2011-07-20 20:04 /mysql/data/collect/#sql-7130_361.frm
-rwxr-xr-x 1 mysql mysql 2499805184 2011-07-20 20:23 /mysql/data/collect/#sql-7130_361.ibd

pythian@db04:~$ date
Wed Jul 20 22:13:27 PDT 2011

The show engine innodb status shows that the Alter was not started yet though it was started, created the temp table and waiting at the the RENAME TABLE operation. The SELECT query which is ACTIVE for 9168 seconds blocked the Alter statement and not allowing it to complete.

---TRANSACTION 379ED1F1F, not started, process no 28976, OS thread id 1259583824
mysql tables in use 1, locked 2
MySQL thread id 865, query id 221087068 rename result table
ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id`
---TRANSACTION 379ED1883, ACTIVE 9168 sec, process no 28976, OS thread id 1279453520 fetching rows, thread declared inside InnoDB 49
mysql tables in use 8, locked 0
MySQL thread id 3277392, query id 221086895 intreport 10.1.4.151 collect Copying to tmp table
SELECT /* Reporting Query 357 */
fdr.service_provider_code AS 'Provider',

A few guesses from mates saying that

1) The select has been started first so it is blocking Alter. ( Not true in this case )
2) ALTER started first and before it completes, SELECT fires, fast index creation of secondary indexes allows concurrent SELECTS but still has to get global lock during RENAME as per the explanation in docs in this blog post. But the Alter in this case is not a secondary index rebuild.

A little more digging into documentation revealed that ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions, but at the end ALTER still has to acquire global lock on data dictionary to swap original and new table copies, that is why we could see other threads in “Waiting for table” state. Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates. So what happened in this situation was: ALTER started first, then concurrent SELECT started on the same table, when ALTER finished copy to a temp table it tried to RENAME table, but failed to get global lock on data dictionary. All the threads that started after this point had to wait on ALTER to finish.

More details on the Alter operation can be found here.


PlanetMySQL Voting: Vote UP / Vote DOWN

Alter waiting on Select statement to complete

Август 29th, 2011

A few days ago, we faced an interesting problem on one of our customer’s slave mysqld servers.  An Alter for adding a new column was run on master server took 542 seconds where as it took few hours on the slave server to complete due to a SELECT blocking the Alter was not allowed to complete.

Here is the the select on the master server and it’s execution time from the binary logs.

# at 825737566
#110720 19:55:21 server id 31415  end_log_pos 825737730         Query   thread_id=83250629      exec_time=542   error_code=0
use collect/*!*/;
SET TIMESTAMP=1311216921/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id`
/*!*/;

Alter statement completed well on the master and it got blocked by a SELECT on the slave where as the time frames of the threads in the processlist which were executing the Alter and the SELECT resembles the ALTER was started first and then the SELECT next. As per the binary logs on the master the Alter was a single statement by the thread_id and the slave started lagging.

mysql> show processlist ;
+---------+-------------+----------------------------+-------------+-------------+--------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id      | User        | Host                       | db          | Command     | Time   | State                                                          | Info                                                                                                 |
+---------+-------------+----------------------------+-------------+-------------+--------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
|     864     | system user |                            | NULL         | Connect     | 479352 | Waiting for master to send event                               | NULL                                                                                                 |
|     865     | system user |                            | collect       | Connect     |   5383 | rename result table                                          | ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id |
| 3277392 | ios               | intreport:56045   | collect       | Query         |   4850 | Copying to tmp table                                       | SELECT /* Reporting Query 357 */
fdr.service_provider_code AS 'Provider',
fdr.service_provider_count |
| 3279284 | uios             | solr02:47871       | collect       | Query         |   4503 | Waiting for table                                              | select track_id,
vendor_track_id,
track_name,
if(promo = 'Y', 'true', 'false') as promo_track,
if( |
.........
.........

 Seconds_Behind_Master: 5131

We did the possible checks to find the status of the Alter and why is it waiting at the RENAME RESULT TABLE to complete. The file level checks shows that the creation of the temp table for Alter was completed while ago.

pythian@db04:~$ ls -ltr /ssd2/mysql/collect/track.*
-rwxr-xr-x 1 mysql mysql      19153 2011-07-15 06:11 /ssd2/mysql/collect/track.frm
-rwxr-xr-x 1 mysql mysql       8980 2011-07-15 06:11 /ssd2/mysql/collect/track.TRG
-rwxr-xr-x 1 mysql mysql 2646605824 2011-07-20 20:40 /ssd2/mysql/collect/track.ibd

pythian@db04:~$ ls -ltr /mysql/data/collect/track.*
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.TRG -> /ssd2/mysql/collect/track.TRG
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.ibd -> /ssd2/mysql/collect/track.ibd
lrwxrwxrwx 1 mysql mysql 26 2011-07-15 08:08 /mysql/data/collect/track.frm -> /ssd2/mysql/collect/track.frm

pythian@db04:~$ ls -ltr /mysql/data/collect/*#*
-rwxr-xr-x 1 mysql mysql      19205 2011-07-20 20:04 /mysql/data/collect/#sql-7130_361.frm
-rwxr-xr-x 1 mysql mysql 2499805184 2011-07-20 20:23 /mysql/data/collect/#sql-7130_361.ibd

pythian@db04:~$ date
Wed Jul 20 22:13:27 PDT 2011

The show engine innodb status shows that the Alter was not started yet though it was started, created the temp table and waiting at the the RENAME TABLE operation. The SELECT query which is ACTIVE for 9168 seconds blocked the Alter statement and not allowing it to complete.

---TRANSACTION 379ED1F1F, not started, process no 28976, OS thread id 1259583824
mysql tables in use 1, locked 2
MySQL thread id 865, query id 221087068 rename result table
ALTER TABLE `track` ADD `source_track_id` int(10) UNSIGNED NULL DEFAULT NULL  AFTER `vendor_track_id`
---TRANSACTION 379ED1883, ACTIVE 9168 sec, process no 28976, OS thread id 1279453520 fetching rows, thread declared inside InnoDB 49
mysql tables in use 8, locked 0
MySQL thread id 3277392, query id 221086895 intreport 10.1.4.151 collect Copying to tmp table
SELECT /* Reporting Query 357 */
fdr.service_provider_code AS 'Provider',

A few guesses from mates saying that

1) The select has been started first so it is blocking Alter. ( Not true in this case )
2) ALTER started first and before it completes, SELECT fires, fast index creation of secondary indexes allows concurrent SELECTS but still has to get global lock during RENAME as per the explanation in docs in this blog post. But the Alter in this case is not a secondary index rebuild.

A little more digging into documentation revealed that ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions, but at the end ALTER still has to acquire global lock on data dictionary to swap original and new table copies, that is why we could see other threads in “Waiting for table” state. Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates. So what happened in this situation was: ALTER started first, then concurrent SELECT started on the same table, when ALTER finished copy to a temp table it tried to RENAME table, but failed to get global lock on data dictionary. All the threads that started after this point had to wait on ALTER to finish.

More details on the Alter operation can be found here.


PlanetMySQL Voting: Vote UP / Vote DOWN

Viewing RMAN jobs status and output

Август 26th, 2011

Yesterday I was discussing with a fellow DBA about ways to check the status of existing and/or past RMAN jobs. Good backup scripts usually write their output to some sort of log file so, checking the output is usually a straight-forward task. However, backup jobs can be scheduled in many different ways (crontab, Grid Control, Scheduled Tasks, etc) and finding the log file may be tricky if you don’t know the environment well.
Furthermore, log files may also have already been overwritten by the next backup or simply just deleted. An alternative way of accessing that information, thus, may come handy.

Fortunately, RMAN keeps the backup metadata around for some time and it can be accessed through the database’s V$ views. Obviously, if you need this information because your database just crashed and needs to be restored, the method described here is useless.

Backup jobs’ status and metadata

A lot of metadata about the RMAN backup jobs can be found in the V$RMAN_% views. These views show past RMAN jobs as well as jobs currently running. Once the jobs complete backup sets, metadata about the sets and pieces are also added to the control file and can be accessed through the V$BACKUP_% views.

For the queries in this post I need only four of those views:

NOTE: I haven’t tested the below in Oracle 10g or earlier.

In the query below I used these views to combine in a single query the information I’m usually interested in when verifying backup jobs:

set lines 220
set pages 1000
col cf for 9,999
col df for 9,999
col elapsed_seconds heading "ELAPSED|SECONDS"
col i0 for 9,999
col i1 for 9,999
col l for 9,999
col output_mbytes for 9,999,999 heading "OUTPUT|MBYTES"
col session_recid for 999999 heading "SESSION|RECID"
col session_stamp for 99999999999 heading "SESSION|STAMP"
col status for a10 trunc
col time_taken_display for a10 heading "TIME|TAKEN"
col output_instance for 9999 heading "OUT|INST"
select
  j.session_recid, j.session_stamp,
  to_char(j.start_time, 'yyyy-mm-dd hh24:mi:ss') start_time,
  to_char(j.end_time, 'yyyy-mm-dd hh24:mi:ss') end_time,
  (j.output_bytes/1024/1024) output_mbytes, j.status, j.input_type,
  decode(to_char(j.start_time, 'd'), 1, 'Sunday', 2, 'Monday',
                                     3, 'Tuesday', 4, 'Wednesday',
                                     5, 'Thursday', 6, 'Friday',
                                     7, 'Saturday') dow,
  j.elapsed_seconds, j.time_taken_display,
  x.cf, x.df, x.i0, x.i1, x.l,
  ro.inst_id output_instance
from V$RMAN_BACKUP_JOB_DETAILS j
  left outer join (select
                     d.session_recid, d.session_stamp,
                     sum(case when d.controlfile_included = 'YES' then d.pieces else 0 end) CF,
                     sum(case when d.controlfile_included = 'NO'
                               and d.backup_type||d.incremental_level = 'D' then d.pieces else 0 end) DF,
                     sum(case when d.backup_type||d.incremental_level = 'D0' then d.pieces else 0 end) I0,
                     sum(case when d.backup_type||d.incremental_level = 'I1' then d.pieces else 0 end) I1,
                     sum(case when d.backup_type = 'L' then d.pieces else 0 end) L
                   from
                     V$BACKUP_SET_DETAILS d
                     join V$BACKUP_SET s on s.set_stamp = d.set_stamp and s.set_count = d.set_count
                   where s.input_file_scan_only = 'NO'
                   group by d.session_recid, d.session_stamp) x
    on x.session_recid = j.session_recid and x.session_stamp = j.session_stamp
  left outer join (select o.session_recid, o.session_stamp, min(inst_id) inst_id
                   from GV$RMAN_OUTPUT o
                   group by o.session_recid, o.session_stamp)
    ro on ro.session_recid = j.session_recid and ro.session_stamp = j.session_stamp
where j.start_time > trunc(sysdate)-&NUMBER_OF_DAYS
order by j.start_time;

The output of the query above looks like the one below:

SESSION      SESSION                                             OUTPUT                                       ELAPSED TIME                                            OUT
  RECID        STAMP START_TIME          END_TIME                MBYTES STATUS     INPUT_TYPE    DOW          SECONDS TAKEN          CF     DF     I0     I1      L  INST
------- ------------ ------------------- ------------------- ---------- ---------- ------------- --------- ---------- ---------- ------ ------ ------ ------ ------ -----
  35877    759180320 2011-08-14 19:25:29 2011-08-14 19:26:09        213 COMPLETED  ARCHIVELOG    Saturday          40 00:00:40
  35880    759180438 2011-08-14 19:27:28 2011-08-14 19:28:02        217 COMPLETED  ARCHIVELOG    Saturday          34 00:00:34
  35883    759180496 2011-08-14 19:28:25 2011-08-14 19:28:57        219 COMPLETED  ARCHIVELOG    Saturday          32 00:00:32
  35886    759182786 2011-08-14 20:06:26 2011-08-14 20:24:30     14,100 FAILED     DB FULL       Saturday        1084 00:18:04
  35888    759184489 2011-08-14 20:34:50                              0 FAILED     DB FULL       Saturday
  35890    759189013 2011-08-14 21:50:13                              0 FAILED     ARCHIVELOG    Saturday
  35892    759203414 2011-08-15 01:50:14 2011-08-15 01:53:55        875 COMPLETED  ARCHIVELOG    Sunday           221 00:03:41
  35895    759207985 2011-08-15 03:08:15                              0 FAILED     ARCHIVELOG    Sunday
  35899    759213359 2011-08-15 04:36:27 2011-08-15 04:38:53      3,335 COMPLETED  DATAFILE FULL Sunday           146 00:02:26
  35907    759217815 2011-08-15 05:50:15 2011-08-15 05:50:52        229 COMPLETED  ARCHIVELOG    Sunday            37 00:00:37
  35911    759218083 2011-08-15 05:54:43 2011-08-15 10:15:49    193,016 COMPLETED  DB FULL       Sunday         15666 04:21:06
  35913    759232215 2011-08-15 09:50:15 2011-08-15 09:51:39        628 COMPLETED  ARCHIVELOG    Sunday            84 00:01:24
  ...
  36209    760053012 2011-08-24 21:50:13 2011-08-24 21:53:21      2,422 COMPLETED  ARCHIVELOG    Tuesday          188 00:03:08                                          1
  36212    760067412 2011-08-25 01:50:12 2011-08-25 01:54:17      2,514 COMPLETED  ARCHIVELOG    Wednesday        245 00:04:05                                          1
  36215    760081812 2011-08-25 05:50:12 2011-08-25 05:51:40        907 COMPLETED  ARCHIVELOG    Wednesday         88 00:01:28                                          1
  36218    760096215 2011-08-25 09:50:15 2011-08-25 09:51:36        737 COMPLETED  ARCHIVELOG    Wednesday         81 00:01:21        0      0      0      0      9     1
  36221    760110617 2011-08-25 13:50:17 2011-08-25 13:54:58      3,524 COMPLETED  ARCHIVELOG    Wednesday        281 00:04:41        0      0      0      0      9     1
  36224    760114813 2011-08-25 15:00:13 2011-08-25 19:34:55    209,491 COMPLETED  DB FULL       Wednesday      16482 04:34:42        3    203      0      0     17     1
  36228    760125016 2011-08-25 17:50:16 2011-08-25 17:52:33      1,464 COMPLETED  ARCHIVELOG    Wednesday        137 00:02:17        0      0      0      0      9     1
  36237    760139412 2011-08-25 21:50:12 2011-08-25 21:52:44      1,879 COMPLETED  ARCHIVELOG    Wednesday        152 00:02:32        1      0      0      0      8     1
  36240    760153813 2011-08-26 01:50:13 2011-08-26 01:57:04      5,123 COMPLETED  ARCHIVELOG    Thursday         411 00:06:51        1      0      0      0     17     1

Most of the columns above have either a obvious meaning or derive directly from the the V$ views, so they are explained in the Oracle documentation. The few columns I added to the output that need some additional explaination are the aggregations below:

    CF: Number of controlfile backups included in the backup set
    DF: Number of datafile full backups included in the backup set
    I0: Number of datafile incremental level-0 backups included in the backup set
    I1: Number of datafile incremental level-1 backups included in the backup set
    L: Number of archived log backups included in the backup set
    OUT INST: Instance where the job was executed and the output is available (see below)

Please note that the aggregations are only shown for the recent backup jobs in the example above, since they are purged from the catalog after a few days.

Another important thing to note is that in a RAC environment some fields for a RUNNING backup job may contain invalid information until the backup job is finished. To get consistent information, run this query on the node where the backup is running.

Backup set details

Once you found the general information about the backup sets available, you may need to get more information about the backup sets for one particular backup job. Each backup job is uniquely identified by (SESSION_RECID, SESSION_STAMP), which are listed by the query above.

The query below retrieves details for a backup job, given a pair of values for (SESSION_RECID, SESSION_STAMP):

set lines 220
set pages 1000
col backup_type for a4 heading "TYPE"
col controlfile_included heading "CF?"
col incremental_level heading "INCR LVL"
col pieces for 999 heading "PCS"
col elapsed_seconds heading "ELAPSED|SECONDS"
col device_type for a10 trunc heading "DEVICE|TYPE"
col compressed for a4 heading "ZIP?"
col output_mbytes for 9,999,999 heading "OUTPUT|MBYTES"
col input_file_scan_only for a4 heading "SCAN|ONLY"
select
  d.bs_key, d.backup_type, d.controlfile_included, d.incremental_level, d.pieces,
  to_char(d.start_time, 'yyyy-mm-dd hh24:mi:ss') start_time,
  to_char(d.completion_time, 'yyyy-mm-dd hh24:mi:ss') completion_time,
  d.elapsed_seconds, d.device_type, d.compressed, (d.output_bytes/1024/1024) output_mbytes, s.input_file_scan_only
from V$BACKUP_SET_DETAILS d
  join V$BACKUP_SET s on s.set_stamp = d.set_stamp and s.set_count = d.set_count
where session_recid = &SESSION_RECID
  and session_stamp = &SESSION_STAMP
order by d.start_time;

And the output is:

                                                                               ELAPSED DEVICE              OUTPUT SCAN
    BS_KEY TYPE CF?   INCR LVL  PCS START_TIME          COMPLETION_TIME        SECONDS TYPE       ZIP?     MBYTES ONLY
---------- ---- --- ---------- ---- ------------------- ------------------- ---------- ---------- ---- ---------- ----
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES         812 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES         812 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES       1,187 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES         812 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES         812 NO
     33430 L    NO                1 2011-08-25 15:03:11 2011-08-25 15:03:28         17 DISK       YES          89 NO
     33436 D    NO               45 2011-08-25 15:03:38 2011-08-25 19:05:51      14533 DISK       YES      91,898 NO
     33436 D    NO               45 2011-08-25 15:03:38 2011-08-25 19:05:51      14533 DISK       YES      91,898 NO
     33443 D    NO               52 2011-08-25 15:03:38 2011-08-25 19:31:51      16093 DISK       YES     106,166 NO
     33443 D    NO               52 2011-08-25 15:03:38 2011-08-25 19:31:51      16093 DISK       YES     106,166 NO
     33437 D    NO                2 2011-08-25 19:05:58 2011-08-25 19:16:24        626 DISK       YES       2,999 NO
     33438 D    NO                3 2011-08-25 19:16:28 2011-08-25 19:29:51        803 DISK       YES       4,948 NO
     33439 D    NO                1 2011-08-25 19:29:52 2011-08-25 19:29:52          0 DISK       YES           0 NO
     33440 D    NO                1 2011-08-25 19:29:53 2011-08-25 19:29:53          0 DISK       YES           0 NO
     33441 D    NO                1 2011-08-25 19:29:54 2011-08-25 19:29:54          0 DISK       YES           0 NO
     33442 D    NO                1 2011-08-25 19:29:55 2011-08-25 19:29:55          0 DISK       YES           0 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         677 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         618 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         677 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         618 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         677 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         618 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         618 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         677 NO
     33447 D    YES               1 2011-08-25 19:34:19 2011-08-25 19:34:20          1 DISK       NO           24 NO
     33448 D    YES               1 2011-08-25 19:34:48 2011-08-25 19:34:49          1 DISK       YES           3 NO
     33449 D    YES               1 2011-08-25 19:34:49 2011-08-25 19:34:50          1 DISK       NO           24 NO

Backup job output

And finally, sometimes it may be helpful to retrieve the job’s output from the metadata kept by the instance.
It maight be that the original log on disk, if any, may have been overwritten by a more recent backup, or just that selecting it from a V$ view may be easier than connecting to a server to find out were the log file is.

The tricky thing here, though, is that the view that contains the output, V$RMAN_OUTPUT, exists in memory only; the job’s output is not stored in the controlfile or anywhere else in the database. Thus, if the instance gets restarted, the contents of that view are reset.

To retrieve the job output for a specific backup job, identified by the (SESSION_RECID, SESSION_STAMP) pair, you can use the following query:

set lines 200
set pages 1000
select output
from GV$RMAN_OUTPUT
where session_recid = &SESSION_RECID
  and session_stamp = &SESSION_STAMP
order by recid;

Which shows:

OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
connected to target database: ORCL (DBID=4132479105)
using target database control file instead of recovery catalog

echo set on

backup archivelog all not backed up delete all input;

Starting backup at 26-AUG-11
current log archived
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=233 instance=ORCL devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=261 instance=ORCL devtype=DISK
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
...
...
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23561.462.933333703 recid=213400 stamp=933333707
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23562.412.933335505 recid=213402 stamp=933335506
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23563.632.933337017 recid=213404 stamp=933337020
Finished backup at 26-AUG-11

Starting Control File and SPFILE Autobackup at 26-AUG-11
piece handle=/fra/ORCL/autobackup/2011_08_26/o1_mf_s_933337284_34h856x2_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 26-AUG-11

exit;

PlanetMySQL Voting: Vote UP / Vote DOWN

Log Buffer #208, A Carnival of the Vanities for DBAs

Декабрь 6th, 2010

Welcome to Log Buffer, the weekly round up of news and happenings in the database world.

We’re planning our publishing calendar for 2011. Happy to announce that we’ll have a few guest hosts in the New Year. Don’t forget if you’d like to host or edit a future edition of Log Buffer on your own blog, send a note to the Log Buffer coordinator.

We’ve had several contributions of favorite reads from the team this week. Enjoy this issue, Log Buffer #208.

Gwen Shapira’s picks:

Iggy Fernandez uses GraphViz to visualize his explain plans – he thinks it makes them easier to read, but Gwen’s not sure she agrees. In the comments, Tim Hall and Charles Hooper give a lot of information on how to read explain plans correctly and are worth reading.

Jonathan Lewis, on Oracle Scratchpad, blogs about optimizer issues with collection types and suggests a work-around.

Asif Momen updates that Oracle released a nifty little tool for looking up DBA views and background processes.

Jared Stills ran into interesting date format issues while working on his latest book.

Pythian’s Alex, Christo and Dan were blogging live from UKOUG 2010. It looked like they were having so much fun, I’m not sure why they call it work! Welcome home, Paul and team – you made it, despite the snow.

Vadim Tkachenko blogs about a very scary InnoDB bug that can corrupt your data and crash your database. It can even allow your users to do it to you! Read and take steps to protect yourself.

In DB2 news, Fahd Mirza suggests:

Henrik Loeser expounding as how to build a full text index on PDF documents in DB2.

Raul F. Chong gives the chance to experience the next version of DB2 today!

Willie Favero appreciates the security offered by the DB2 10.

Edwin Sarmiento writes his second post in a series on HADR, further building on his point that a good HADR strategy is more than just the underlying technology.

Guiseppe Maxia, the Data Charmer, starts a lively discussion on MySQL forks, and points out 5 arguments in favor of them.

Hard to believe it’s December already.


PlanetMySQL Voting: Vote UP / Vote DOWN

Log Buffer #206, A Carnival of the Vanities for DBAs

Ноябрь 15th, 2010

Welcome to Log Buffer, the weekly news blog about blogs in the datasphere… As we kick off Log Buffer #206, our own Gwen Shapira shares a few of her weekly favorites:

Oracle:

Arup Nanda posted an excellent script on how to summarize backup information from the rman catalog. He also posted a tool for automatically purging time-based partitions.

Pythian’s resident Exadata expert, Marc Fielding posted links to the latest recording of his Exadata webinars.

Gary Mayers, on Sydney Oracle Lab explains Oracle user security model.

Riyaj Shamsudeen shows what really happens when you update a query with the same values it already had.

Jonathan Lewis explains how he tests list partitions, a post which gives an excellent review on how one should go about testing new features in general.

MySQL:

At MySQL Performance Blog, Aurimas posts a war story about a very unusual data recovery. Test your backups, or you may have to go through the same ordeal.

In same blog, Morgan Tucker gives another reason why you shouldn’t use mysqldump to take backups.

SQL Server:
MVP Edwin Sarmiento is blogging live from SQL PASS 2010 in Seattle.

Aaron Bertrand introduces a feature in SQL Server 2011, called SEQUENCE that most Oracle DBAs would recognize too.

And lastly a few general posts. One from Michael Lopp, who, on the “Rands in Repose” blog, explains how people follow conversations, with some amusing examples.

And the second from Singer Wang who reports that RedHat 6 is finally out after 4 years.

A big shout out to this week’s contributors wraps up this issue. Don’t forget, if you’re interested in hosting your own issue of Log Buffer – send an email to the Log Buffer Coordinator.


PlanetMySQL Voting: Vote UP / Vote DOWN

Log Buffer #205, A Carnival of the Vanities for DBAs

Ноябрь 5th, 2010

A very warm welcome to the Log Buffer, the premier medley of fresh information culled from the blogs related to the technology which stores the world, yes, the databases.

In this edition, the Log Buffer #205, we have yet again found the pulse of the industry.

Oracle:

On the Oracle front, leading Oracle technologist Andrey Goryunov carries on his hands-on experiments of newest version of the Oracle database. This time he slices away chopt.

It’s always very informative and exciting to know about internals of RAC Stuff like what actually is maintained in the Voting Disk . Riyaj has it here.

Jonathan Lewis does a little thought experiment with list partitioning.

HugesPages almost always provide value to the Oracle databases on the Linux Systems, and many people wonder why they are not the default. Kevin Closson touches some points regarding HugesPages, and he also notes down some finer points like the dislike of AMM and Hugespage for each other.

Hardly anyone would refuse a gift consisting of chocolate, ice cream, flower and designer watch. Yes, now you can have Tanel Poder, Cary Millsap, Jonathan Lewis, and Kerry Osborne at one Virtual Oracle Conference.

Tim Hall, like many other people is perturbed over the plagiarism of his articles.

Oracle recommends that you use JRockit JDK with your Oracle products and the reasons are described by JaySenSharma at his Weblogic Wonder blog.

DB2:

Now, RPM and DEB packages for DB2 Express-C are available for the download. Get it from here.

Troy Coleman, blogs about the ripple-creating news that Last week IBM announced the general availability of DB2 10 for z/OS.

The keynote session for the third day of the IOD conference features the authors of Freakonomics, Steven Levitt and Stephen Dubner, and Craig Mullins highly recommend that.

SQL Server:

If you are curious about the title “Plumbing The Depths of SQL Server / PowerShell Integration, then don’t miss SQL Server Connections conference on Nov 1-4 in Las Vegas and attend the session by Bob Beauchemin.

Though not earth-shattering and sky-ripping, but very valuable nonetheless this post by Jeff about calculating the median.

And following is the ever-green SQL Server Myth buster posts by Euan.

MySQL:

Zack Urlocker rambles on how open source software, cloud and software as a service are helping to bring about the consumerization of IT.

Here is one more effort where the bencmarking of MariaDB is being done.

Have a nice weekend.


PlanetMySQL Voting: Vote UP / Vote DOWN