Archive for the ‘restore’ Category

Viewing RMAN jobs status and output

Август 26th, 2011

Yesterday I was discussing with a fellow DBA about ways to check the status of existing and/or past RMAN jobs. Good backup scripts usually write their output to some sort of log file so, checking the output is usually a straight-forward task. However, backup jobs can be scheduled in many different ways (crontab, Grid Control, Scheduled Tasks, etc) and finding the log file may be tricky if you don’t know the environment well.
Furthermore, log files may also have already been overwritten by the next backup or simply just deleted. An alternative way of accessing that information, thus, may come handy.

Fortunately, RMAN keeps the backup metadata around for some time and it can be accessed through the database’s V$ views. Obviously, if you need this information because your database just crashed and needs to be restored, the method described here is useless.

Backup jobs’ status and metadata

A lot of metadata about the RMAN backup jobs can be found in the V$RMAN_% views. These views show past RMAN jobs as well as jobs currently running. Once the jobs complete backup sets, metadata about the sets and pieces are also added to the control file and can be accessed through the V$BACKUP_% views.

For the queries in this post I need only four of those views:

NOTE: I haven’t tested the below in Oracle 10g or earlier.

In the query below I used these views to combine in a single query the information I’m usually interested in when verifying backup jobs:

set lines 220
set pages 1000
col cf for 9,999
col df for 9,999
col elapsed_seconds heading "ELAPSED|SECONDS"
col i0 for 9,999
col i1 for 9,999
col l for 9,999
col output_mbytes for 9,999,999 heading "OUTPUT|MBYTES"
col session_recid for 999999 heading "SESSION|RECID"
col session_stamp for 99999999999 heading "SESSION|STAMP"
col status for a10 trunc
col time_taken_display for a10 heading "TIME|TAKEN"
col output_instance for 9999 heading "OUT|INST"
select
  j.session_recid, j.session_stamp,
  to_char(j.start_time, 'yyyy-mm-dd hh24:mi:ss') start_time,
  to_char(j.end_time, 'yyyy-mm-dd hh24:mi:ss') end_time,
  (j.output_bytes/1024/1024) output_mbytes, j.status, j.input_type,
  decode(to_char(j.start_time, 'd'), 1, 'Sunday', 2, 'Monday',
                                     3, 'Tuesday', 4, 'Wednesday',
                                     5, 'Thursday', 6, 'Friday',
                                     7, 'Saturday') dow,
  j.elapsed_seconds, j.time_taken_display,
  x.cf, x.df, x.i0, x.i1, x.l,
  ro.inst_id output_instance
from V$RMAN_BACKUP_JOB_DETAILS j
  left outer join (select
                     d.session_recid, d.session_stamp,
                     sum(case when d.controlfile_included = 'YES' then d.pieces else 0 end) CF,
                     sum(case when d.controlfile_included = 'NO'
                               and d.backup_type||d.incremental_level = 'D' then d.pieces else 0 end) DF,
                     sum(case when d.backup_type||d.incremental_level = 'D0' then d.pieces else 0 end) I0,
                     sum(case when d.backup_type||d.incremental_level = 'I1' then d.pieces else 0 end) I1,
                     sum(case when d.backup_type = 'L' then d.pieces else 0 end) L
                   from
                     V$BACKUP_SET_DETAILS d
                     join V$BACKUP_SET s on s.set_stamp = d.set_stamp and s.set_count = d.set_count
                   where s.input_file_scan_only = 'NO'
                   group by d.session_recid, d.session_stamp) x
    on x.session_recid = j.session_recid and x.session_stamp = j.session_stamp
  left outer join (select o.session_recid, o.session_stamp, min(inst_id) inst_id
                   from GV$RMAN_OUTPUT o
                   group by o.session_recid, o.session_stamp)
    ro on ro.session_recid = j.session_recid and ro.session_stamp = j.session_stamp
where j.start_time > trunc(sysdate)-&NUMBER_OF_DAYS
order by j.start_time;

The output of the query above looks like the one below:

SESSION      SESSION                                             OUTPUT                                       ELAPSED TIME                                            OUT
  RECID        STAMP START_TIME          END_TIME                MBYTES STATUS     INPUT_TYPE    DOW          SECONDS TAKEN          CF     DF     I0     I1      L  INST
------- ------------ ------------------- ------------------- ---------- ---------- ------------- --------- ---------- ---------- ------ ------ ------ ------ ------ -----
  35877    759180320 2011-08-14 19:25:29 2011-08-14 19:26:09        213 COMPLETED  ARCHIVELOG    Saturday          40 00:00:40
  35880    759180438 2011-08-14 19:27:28 2011-08-14 19:28:02        217 COMPLETED  ARCHIVELOG    Saturday          34 00:00:34
  35883    759180496 2011-08-14 19:28:25 2011-08-14 19:28:57        219 COMPLETED  ARCHIVELOG    Saturday          32 00:00:32
  35886    759182786 2011-08-14 20:06:26 2011-08-14 20:24:30     14,100 FAILED     DB FULL       Saturday        1084 00:18:04
  35888    759184489 2011-08-14 20:34:50                              0 FAILED     DB FULL       Saturday
  35890    759189013 2011-08-14 21:50:13                              0 FAILED     ARCHIVELOG    Saturday
  35892    759203414 2011-08-15 01:50:14 2011-08-15 01:53:55        875 COMPLETED  ARCHIVELOG    Sunday           221 00:03:41
  35895    759207985 2011-08-15 03:08:15                              0 FAILED     ARCHIVELOG    Sunday
  35899    759213359 2011-08-15 04:36:27 2011-08-15 04:38:53      3,335 COMPLETED  DATAFILE FULL Sunday           146 00:02:26
  35907    759217815 2011-08-15 05:50:15 2011-08-15 05:50:52        229 COMPLETED  ARCHIVELOG    Sunday            37 00:00:37
  35911    759218083 2011-08-15 05:54:43 2011-08-15 10:15:49    193,016 COMPLETED  DB FULL       Sunday         15666 04:21:06
  35913    759232215 2011-08-15 09:50:15 2011-08-15 09:51:39        628 COMPLETED  ARCHIVELOG    Sunday            84 00:01:24
  ...
  36209    760053012 2011-08-24 21:50:13 2011-08-24 21:53:21      2,422 COMPLETED  ARCHIVELOG    Tuesday          188 00:03:08                                          1
  36212    760067412 2011-08-25 01:50:12 2011-08-25 01:54:17      2,514 COMPLETED  ARCHIVELOG    Wednesday        245 00:04:05                                          1
  36215    760081812 2011-08-25 05:50:12 2011-08-25 05:51:40        907 COMPLETED  ARCHIVELOG    Wednesday         88 00:01:28                                          1
  36218    760096215 2011-08-25 09:50:15 2011-08-25 09:51:36        737 COMPLETED  ARCHIVELOG    Wednesday         81 00:01:21        0      0      0      0      9     1
  36221    760110617 2011-08-25 13:50:17 2011-08-25 13:54:58      3,524 COMPLETED  ARCHIVELOG    Wednesday        281 00:04:41        0      0      0      0      9     1
  36224    760114813 2011-08-25 15:00:13 2011-08-25 19:34:55    209,491 COMPLETED  DB FULL       Wednesday      16482 04:34:42        3    203      0      0     17     1
  36228    760125016 2011-08-25 17:50:16 2011-08-25 17:52:33      1,464 COMPLETED  ARCHIVELOG    Wednesday        137 00:02:17        0      0      0      0      9     1
  36237    760139412 2011-08-25 21:50:12 2011-08-25 21:52:44      1,879 COMPLETED  ARCHIVELOG    Wednesday        152 00:02:32        1      0      0      0      8     1
  36240    760153813 2011-08-26 01:50:13 2011-08-26 01:57:04      5,123 COMPLETED  ARCHIVELOG    Thursday         411 00:06:51        1      0      0      0     17     1

Most of the columns above have either a obvious meaning or derive directly from the the V$ views, so they are explained in the Oracle documentation. The few columns I added to the output that need some additional explaination are the aggregations below:

    CF: Number of controlfile backups included in the backup set
    DF: Number of datafile full backups included in the backup set
    I0: Number of datafile incremental level-0 backups included in the backup set
    I1: Number of datafile incremental level-1 backups included in the backup set
    L: Number of archived log backups included in the backup set
    OUT INST: Instance where the job was executed and the output is available (see below)

Please note that the aggregations are only shown for the recent backup jobs in the example above, since they are purged from the catalog after a few days.

Another important thing to note is that in a RAC environment some fields for a RUNNING backup job may contain invalid information until the backup job is finished. To get consistent information, run this query on the node where the backup is running.

Backup set details

Once you found the general information about the backup sets available, you may need to get more information about the backup sets for one particular backup job. Each backup job is uniquely identified by (SESSION_RECID, SESSION_STAMP), which are listed by the query above.

The query below retrieves details for a backup job, given a pair of values for (SESSION_RECID, SESSION_STAMP):

set lines 220
set pages 1000
col backup_type for a4 heading "TYPE"
col controlfile_included heading "CF?"
col incremental_level heading "INCR LVL"
col pieces for 999 heading "PCS"
col elapsed_seconds heading "ELAPSED|SECONDS"
col device_type for a10 trunc heading "DEVICE|TYPE"
col compressed for a4 heading "ZIP?"
col output_mbytes for 9,999,999 heading "OUTPUT|MBYTES"
col input_file_scan_only for a4 heading "SCAN|ONLY"
select
  d.bs_key, d.backup_type, d.controlfile_included, d.incremental_level, d.pieces,
  to_char(d.start_time, 'yyyy-mm-dd hh24:mi:ss') start_time,
  to_char(d.completion_time, 'yyyy-mm-dd hh24:mi:ss') completion_time,
  d.elapsed_seconds, d.device_type, d.compressed, (d.output_bytes/1024/1024) output_mbytes, s.input_file_scan_only
from V$BACKUP_SET_DETAILS d
  join V$BACKUP_SET s on s.set_stamp = d.set_stamp and s.set_count = d.set_count
where session_recid = &SESSION_RECID
  and session_stamp = &SESSION_STAMP
order by d.start_time;

And the output is:

                                                                               ELAPSED DEVICE              OUTPUT SCAN
    BS_KEY TYPE CF?   INCR LVL  PCS START_TIME          COMPLETION_TIME        SECONDS TYPE       ZIP?     MBYTES ONLY
---------- ---- --- ---------- ---- ------------------- ------------------- ---------- ---------- ---- ---------- ----
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES         812 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES         812 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES       1,187 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES         812 NO
     33429 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:25        181 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES       1,187 NO
     33428 L    NO                1 2011-08-25 15:00:24 2011-08-25 15:03:08        164 DISK       YES         812 NO
     33430 L    NO                1 2011-08-25 15:03:11 2011-08-25 15:03:28         17 DISK       YES          89 NO
     33436 D    NO               45 2011-08-25 15:03:38 2011-08-25 19:05:51      14533 DISK       YES      91,898 NO
     33436 D    NO               45 2011-08-25 15:03:38 2011-08-25 19:05:51      14533 DISK       YES      91,898 NO
     33443 D    NO               52 2011-08-25 15:03:38 2011-08-25 19:31:51      16093 DISK       YES     106,166 NO
     33443 D    NO               52 2011-08-25 15:03:38 2011-08-25 19:31:51      16093 DISK       YES     106,166 NO
     33437 D    NO                2 2011-08-25 19:05:58 2011-08-25 19:16:24        626 DISK       YES       2,999 NO
     33438 D    NO                3 2011-08-25 19:16:28 2011-08-25 19:29:51        803 DISK       YES       4,948 NO
     33439 D    NO                1 2011-08-25 19:29:52 2011-08-25 19:29:52          0 DISK       YES           0 NO
     33440 D    NO                1 2011-08-25 19:29:53 2011-08-25 19:29:53          0 DISK       YES           0 NO
     33441 D    NO                1 2011-08-25 19:29:54 2011-08-25 19:29:54          0 DISK       YES           0 NO
     33442 D    NO                1 2011-08-25 19:29:55 2011-08-25 19:29:55          0 DISK       YES           0 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         677 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         618 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         677 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         618 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         677 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         618 NO
     33445 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:11        116 DISK       YES         618 NO
     33446 L    NO                1 2011-08-25 19:32:15 2011-08-25 19:34:18        123 DISK       YES         677 NO
     33447 D    YES               1 2011-08-25 19:34:19 2011-08-25 19:34:20          1 DISK       NO           24 NO
     33448 D    YES               1 2011-08-25 19:34:48 2011-08-25 19:34:49          1 DISK       YES           3 NO
     33449 D    YES               1 2011-08-25 19:34:49 2011-08-25 19:34:50          1 DISK       NO           24 NO

Backup job output

And finally, sometimes it may be helpful to retrieve the job’s output from the metadata kept by the instance.
It maight be that the original log on disk, if any, may have been overwritten by a more recent backup, or just that selecting it from a V$ view may be easier than connecting to a server to find out were the log file is.

The tricky thing here, though, is that the view that contains the output, V$RMAN_OUTPUT, exists in memory only; the job’s output is not stored in the controlfile or anywhere else in the database. Thus, if the instance gets restarted, the contents of that view are reset.

To retrieve the job output for a specific backup job, identified by the (SESSION_RECID, SESSION_STAMP) pair, you can use the following query:

set lines 200
set pages 1000
select output
from GV$RMAN_OUTPUT
where session_recid = &SESSION_RECID
  and session_stamp = &SESSION_STAMP
order by recid;

Which shows:

OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
connected to target database: ORCL (DBID=4132479105)
using target database control file instead of recovery catalog

echo set on

backup archivelog all not backed up delete all input;

Starting backup at 26-AUG-11
current log archived
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=233 instance=ORCL devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=261 instance=ORCL devtype=DISK
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
...
...
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23561.462.933333703 recid=213400 stamp=933333707
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23562.412.933335505 recid=213402 stamp=933335506
archive log filename=+ALOGS/ORCL/archivelog/2011_08_26/thread_2_seq_23563.632.933337017 recid=213404 stamp=933337020
Finished backup at 26-AUG-11

Starting Control File and SPFILE Autobackup at 26-AUG-11
piece handle=/fra/ORCL/autobackup/2011_08_26/o1_mf_s_933337284_34h856x2_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 26-AUG-11

exit;

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL data backup: going beyond mysqldump

Март 29th, 2011

A user on a linux user group mailing list asked about this, and I was one of the people replying. Re-posting here as I reckon it’s of wider interest.

> [...] tens of gigs of data in MySQL databases.
> Some in memory tables, some MyISAM, a fair bit InnoDB. According to my
> understanding, when one doesn’t have several hours to take a DB
> offline and do dbbackup, there was/is ibbackup from InnoBase.. but now
> that MySQL and InnoBase have both been ‘Oracle Enterprised’, said
> product is now restricted to MySQL Enterprise customers..
>
> Some quick searching has suggested Percona XtraBackup as a potential
> FOSS alternative.
> What backup techniques do people employ around these parts for backups
> of large mixed MySQL data sets where downtime *must* be minimised?
>
> Has your backup plan ever been put to the test?

You should put it to the test regularly, not just when it’s needed.
An untested backup is not really a backup, I think.

At Open Query we tend to use dual master setups with MMM, other replication slaves, mysqldump, and XtracBackup or LVM snapshots. It’s not just about having backups, but also about general resilience, maintenance options, and scalability. I’ll clarify:

  • XtraBackup and LVM give you physical backups. that’s nice if you want to recover or clone a complete instance as-is. But if anything is wrong, it’ll be all stuffed (that is, you can sometimes recover InnoDB tablespaces and there are tools for it, but time may not be on your side). Note that LVM cannot snapshot between multiple volumes consistently, so if you have your InnoDB ibdata/IBD files and iblog files on separate spindles, using LVM is not suitable.
  • mysqldump for logical (SQL) backups. Most if not all setups should have this. Even if the file(s) were to be corrupted, they’re still readable since it’s plain SQL. You can do partial restores, which is handy in some cases. It’ll be slower to load so having *only* an SQL dump of a larger dataset is not a good idea.
  • some of the above backups can and should *also* be copied off-site. that’s for extra safety, but in terms of recovery speed it may not be optimal and should not be relied upon.
  • having dual masters is for easier maintenance without scheduled outages, as well as resilience when for instance hardware breaks (and it does).
  • slaves. You can even delay a slave (Maatkit has a tool for this), so that would give you a live correct image even in case of a user error, provided you get to it in time. Also, you want enough slack in your infra to be able to initialise a new slave off an existing one. Scaling up at a time when high load is already occurring can become painful if your infra is not prepared for it.

A key issue to consider is this… if the dataset is sufficiently large, and the online requirements high enough, you can’t afford to just have backups. Why? Because, how quickly can you deploy new suitable hardware, install OS, do restore, validate, put back online?

In many cases one or more aspects of the above list simply take too long, so my summary would be “then you don’t really have a backup”. Clients tend to argue with me on that, but only fairly briefly, until they see the point: if a restore takes longer than you can afford, that backup mechanism is unsuitable.

So, we use a combination of tools and approaches depending on needs, but in general terms we aim for keeping the overall environment online (individual machines can and will fail! relying on a magic box or SAN to not fail *will* get you bitten) to vastly reduce the instances where an actual restore is required.
Into that picture also comes using separate test/staging servers to not have developers stuff around on live servers (human error is an important cause of hassles).

In our training modules, we’ve combined the backups, recovery and replication topics as it’s clearly all intertwined and overlapping. Discussing backup techniques separate from replication and dual master setups makes no sense to us. It needs to be put in place with an overall vision.

Note that a SAN is not a backup strategy. And neither is replication on its own.


PlanetMySQL Voting: Vote UP / Vote DOWN

Simple Backup Restore Trick

Май 3rd, 2010
I don't usually post these simple tricks, but it came to my attention today and it's very simple and have seen issues when trying to get around it. This one tries to solve the question: How do I restore my production backup to a different schema? It looks obvious, but I haven't seen many people thinking about it.

Most of the time backups using mysqldump will include the following line:

USE `schema`;

This is OK when you're trying to either (re)build a slave or restore a production database. But what about restoring it to a test server in a different schema?

The actual trick


Using vi (or similar) editors to edit the line will most likely result in the editor trying to load the whole backup file into memory, which might cause paging or even crash the server if the backup is big enough (I've seen it happen). Using sed (or similar) might take some time with a big file. The quick and dirty trick I like is:

grep "USE \`schema\`" backup.sql | mysql -u user -p new_schema

Adapt the mysql command options to your needs. It's necessary to escape the backticks (`), otherwise the shell might interpret it as your trying to execute schema and use the output as the actual schema name. Also, make sure that new_schema already exists in the server.

This method is quick and dirty and leaves the original backup intact. I hope you find it useful.

PlanetMySQL Voting: Vote UP / Vote DOWN

Applying binary logs without adding to the binary log

Февраль 25th, 2010

Applying binary logs to a MySQL instance is not particularly difficult, using the mysqlbinlog command line utility:

$> mysqlbinlog mysql-bin.000003 > 03.sql
$> mysql < 03.sql

Turning off binary logging for a session is not difficult, from the MySQL commandline, if you authenticate as a user with the SUPER privilege:

mysql> SET SESSION sql_log_bin=0;

However, sometimes you want to apply binary logs to a MySQL instance, without having those changes applied to the binary logs themselves. One option is to restart the server binary logging disabled, and after the load is finished, restart the server with binary logging re-enabled. This is not always possible nor desirable, so there’s a better way, that works in at least versions 4.1 and up:

The mysqlbinlog utility has the --disable-log-bin option. All the option does is add the SET SESSION sql_log_bin=0; statement to the beginning of the output, but it is certainly much better than restarting the server twice!

Here’s the manual page for the --disable-log-bin option of mysqlbinlog: http://dev.mysql.com/doc/refman/5.1/en/mysqlbinlog.html#option_mysqlbinlog_disable-log-bin


PlanetMySQL Voting: Vote UP / Vote DOWN

Tool of the Day: rsnapshot

Сентябрь 17th, 2009

rsnapshot is a filesystem snapshot utility for making backups of local and remote systems, based on rsync. Rather than just doing a complete copy every time, it uses hardlinks to create incrementals (which are from a local perspective a full backup also). You can specify how long to keep old backups, and all the other usual jazz. You’d generally have it connect over ssh. You’ll want/need to run it on a filesystem that supports hardlinks, so that precludes NTFS.

In the context of MySQL, you can’t just do a filesystem copy of your MySQL data/logs, that would be inconsistent and broken. (amazingly, I still see people insisting/arguing on this – but heck it’s your business/data to gamble with, right?)

Anyway, if you do a local mysqldump also, or for instance use XtraBackup to take a binary backup of your InnoDB tablespace/logs, then rsnapshot can be used to automate the transfer of those files to a different geographical location.

Two extra things you need to do:

  • Regularly test your backups. They can fail, and that can be fatal. For XtraBackup, run the prepare command and essentially start up a MySQL instance on it to make sure it’s all happy. Havint this already done also saves time if you need to restore.
  • For restore time, you need to include the time needed to transfer files back to the target server.

PlanetMySQL Voting: Vote UP / Vote DOWN