Designing and implementing a foolproof backup strategy has long been an important topic for companies. With data volumes now reaching terabytes, organizations are looking for foolproof backup solutions that keep their services online and available to users, with minimal performance impact during the backup window.
Historically, database administrators have relied on the online backup tools and techniques provided by their databases. Oracle, for example, has supported online (hot) backups through the traditional begin backup and end backup statements for many years. RMAN is now also available and can be integrated with backup software such as TSM or NetBackup to provide online backup solutions for Oracle databases.
The main problem arises when the database is very large (terabytes). In that case the time the database must spend in online backup mode becomes a problem. While tablespaces are in backup mode the database stays open, but it generates additional redo, which can degrade performance for end users. Most organizations therefore want to keep this period as short as possible. This is where snapshot techniques come in. Snapshot tools, most of which are provided at the storage hardware level, are a comprehensive way to resolve this problem and form the foundation of strategic backup solutions for such huge databases.
Nearly all high-end IBM storage subsystems provide a snapshot tool of this kind. In IBM terminology it is commonly known as “FlashCopy”, and it is available as a separately licensed feature for the IBM DS4000, DS6000 and DS8000 storage subsystem series.
This feature is a data snapshot technique that copies data bit by bit at the storage hardware level without any performance impact on the server itself. A normal FlashCopy operation on an IBM DS6000 or DS8000 storage subsystem usually takes no more than a few seconds to snapshot a source database that is terabytes in size. Anyone using FlashCopy as an integrated part of an online backup solution is therefore left with only one task: making the snapshot available to the operating system (so that it can be written to tape cartridges, for example) in the shortest possible time.
In this article I cover different aspects of the IBM DS8000 FlashCopy feature, along with its implementation and integration into a comprehensive, fully automated online backup solution for a very large Oracle database (~1.2 terabytes). Although the FlashCopy feature is technically the same across the IBM storage subsystem series, its implementation varies from series to series, because IBM uses different user interfaces to manage each series. For example, the DS6000 and DS8000 are managed by DS Storage Manager running on Windows and on the Storage HMC (Linux) respectively, while the DS4000 is managed by the FAStT Storage Manager software, which can be installed on a variety of operating systems including AIX and Windows. Similarly, the CLI (command line interface) for the DS6000/DS8000 has many commands that differ from the CLI used for the DS4000 series. This article therefore concentrates on developing an automated online backup solution using DScli for the DS8000 storage subsystem.
Advanced Copy Services from IBM
The DS8000 series Advanced Copy Services are powerful data backup, remote mirroring and recovery functions that help protect data from unforeseen events. Copy Services run on the IBM TotalStorage DS series and are designed to support a wide range of servers, including IBM pSeries, iSeries and zSeries environments.
Comparable Copy Services functions are also available on the IBM TotalStorage Enterprise Storage Server (ESS) Models 800 and 750, as well as on the DS6000 series. Copy Services include the following types of functions:
o IBM TotalStorage FlashCopy®, a point-in-time copy function
o Remote mirror and copy functions, which include:
    o IBM TotalStorage Metro Mirror (previously known as Synchronous PPRC)
    o IBM TotalStorage Global Mirror (previously known as Asynchronous PPRC)
You can manage Copy Services functions through the DS8000 series CLI, as well as through the GUI provided by the IBM TotalStorage DS Storage Manager, which is available on the S-HMC (the Linux-based server supplied with the DS8000 for storage management).
What is FlashCopy Technology
The FlashCopy feature is designed to create full volume copies of data at the storage hardware level. When you set up a FlashCopy operation, a relationship is established between the source and target volumes and a bitmap of the source volume is created. Once the relationship and the bitmap exist, the target volume can be accessed as though all the data had been physically copied. While the relationship exists, a background process copies the tracks from the source to the target volume. FlashCopy therefore appears to provide an instant point-in-time flash of the LUNs on DS storage subsystems. This flash contains a consistent snapshot of the source data (as of a specific point in time) only if the necessary measures have been taken at the operating system and database level. This matters because the snapshot is taken at the hardware level, and the application or database has no knowledge that a snapshot is in progress. The success of any backup solution built on IBM FlashCopy (and, in general, on any storage or hardware snapshot technique) therefore depends on the data consistency measures taken during the actual snapshot operation.
Activating FlashCopy Feature on DS8000 Storage Subsystems
As a premium feature, FlashCopy requires a separate license, which can be bought along with the DS storage subsystem or ordered as an upgrade (called an MES in IBM terminology) for an existing DS storage subsystem.
For DS6000 and DS8000 storage subsystems it is mandatory to apply the license activation codes, or at least the Operating Environment License (OEL) code. This can be done through the DS SMC or through the DS CLI console. Other advanced features such as FlashCopy (or PPRC) can be activated once the OEL is active.
To activate the FlashCopy feature on a DS8000, first gather the following information:
- What is the machine signature of the DS8000? This is the most important piece of information needed to activate the FlashCopy feature. The machine signature can be found with the following DScli commands:
dscli> lssi
Date/Time: March 30, 2005 6:53:05 PM CEST IBM DSCLI Version: 5.0.1.99
Name ID               Storage Unit     Model WWNN             State  ESSNet
============================================================================
-    IBM.2107-7520431 IBM.2107-7520430 922   5005076303FFC19D Online Enabled
dscli> showsi IBM.2107-7520431
Date/Time: March 30, 2005 6:53:11 PM CEST IBM DSCLI Version: 5.0.1.99 DS: IBM.2107-7520431
Name -
desc -
ID IBM.2107-7520431
Storage Unit IBM.2107-7520430
Model 922
WWNN 5005076303FFC19D
Signature 896e-c0a3-38e9-5702
State Online
- What is the machine serial number? The serial number of the DS8000 can be read from the front of the base frame (lower right corner). On the DS command line interface you can also use the lssu command for this purpose.
- What are the order confirmation codes (OCC)? The order confirmation code is printed on the DS8000 series order confirmation code document, which is usually sent to the client’s contact person together with the delivery of the machine.
After noting down the machine serial number, the machine signature and the OCC, you can access the following IBM internet site to generate the activation codes for FlashCopy:
https://www-03.ibm.com/storage/dsfa/index.jsp
On this website, after entering this information for your DS storage, you are redirected to the View Activation Codes window, where you can download your activation codes, highlight then copy and paste them, or write them down. If you select Download now, you are prompted to select a file location. The downloaded file is a very small XML file.
We opted to write the activation codes in our small notebook; no doubt the handier approach!
In our case the FlashCopy activation code we got from the above website was 234-1934-J153-10DC-01FC-CA7D-5678-5678, so the next step was simply to apply it. We did this with the DScli applykey command:
dscli> applykey -key 234-1934-J153-10DC-01FC-CA7D-5678-5678 IBM.2107-7520431
Date/Time: 2 May 2005 14:47:06 IBM DSCLI Version: 5.0.3.5 DS: IBM.2107-7520431
CMUC00199I applykey: License Machine Code successfully applied to storage image IBM.2107-7520431
We then verified the activation of FlashCopy on the DS8000 using the lskey command:
dscli> lskey IBM.2107-7520431
Date/Time: March 30, 2005 6:53:30 PM CEST IBM DSCLI Version: 5.0.1.99 DS: IBM.2107-7520431
Activation Key Capacity (TB) Storage Type
================================================
Flashcopy 5 FB
Operating Environment 5 All
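The same check can be scripted, for example at the start of a backup run. The following is only a minimal sketch: it assumes the lskey listing keeps the layout shown above, and has_flashcopy_key is a hypothetical helper name, not a DScli command. It reads the listing on standard input, so the demonstration runs against a copy of the sample output rather than a live storage unit.

```shell
#!/bin/sh
# Sketch: report whether an lskey listing contains a FlashCopy activation key.
# has_flashcopy_key is a hypothetical helper; it reads the listing on stdin.
has_flashcopy_key() {
    # Case-insensitive match on the first column of each row.
    awk 'tolower($1) == "flashcopy" { found = 1 } END { exit !found }'
}

# Sample listing copied from the lskey output above.
sample_listing='Activation Key Capacity (TB) Storage Type
================================================
Flashcopy 5 FB
Operating Environment 5 All'

if printf '%s\n' "$sample_listing" | has_flashcopy_key; then
    echo "FlashCopy key present"
else
    echo "FlashCopy key missing"
fi
```

Against a real system the helper would presumably be fed by the command itself, as in /opt/ibm/dscli/dscli lskey IBM.2107-7520431 | has_flashcopy_key.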
Starting with DScli
DScli is a very powerful tool for managing IBM DS storage subsystems. Because it is interactive and also supports a scripting mode, it is very handy and can easily be used to automate backup solutions built on DS8000 flash services.
We started building our backup solution by installing DScli. For the DS8000, DScli supports nearly every major operating system, including AIX 5L and Windows. We selected one of our LPARs on a p570 to act as the DScli management station. Every DS8000 Storage HMC has an external Ethernet interface that is meant to be attached to the customer network. We set up a separate VLAN for this external network (172.17.20.xx) and assigned the IP addresses 172.17.20.100 and 172.17.20.101 to the external Ethernet interfaces of the DS8000 Storage HMCs using the DS manager interface. We then assigned the IP address 172.17.20.102 to one of the Ethernet interfaces of our AIX LPAR using “smitty chinet” and tested TCP/IP connectivity to both Storage HMCs of the DS8000. We also created a user with admin privileges on the SHMC so that dscli commands could be executed under this account.
We then installed DScli on our AIX LPAR as the root user. Java 1.4.1 or higher must be installed on the system in a standard directory. The DS CLI installer checks the standard directories for Java 1.4.1 or higher, and the installation fails if it is not found there. We therefore set up our shell environment correctly (including the JAVA_HOME environment variable), mounted the DScli installation CD and, as root, executed the command
setupaix.bin -console
This installs DScli in its default directory for AIX, which is /opt/ibm/dscli.
We then created the DScli profile, a text file named dscli.profile, and entered the Storage HMC IP addresses (as hmc1 and hmc2) along with the user name and password.
Below is the content of the DScli profile used in our scenario:
----------------------------------------------------------------------
#DS CLI Profile
#
# Management Console/Node IP Address (es) are specified using the hmc parameter
# hmc1 and hmc2 are equivalent to -hmc1 and -hmc2 command line options.
# hmc1 is first SHMC for DS8000
hmc1:172.16.5.100
hmc2:172.16.5.101
username: dsadmin
# The password for dsadmin:
password: passw6sd
# Default target Storage Image ID
devid: IBM.2107-7520431
--------------------------------------------------------------------
We then tested DScli from our AIX LPAR as follows:
/opt/ibm/dscli/dscli lsuser
If everything is configured correctly, this command lists all users on the SHMC without prompting for a password.
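Before relying on password-less DScli calls inside a backup script, it can help to confirm that the profile really contains the entries those calls depend on. The sketch below assumes the simple key: value layout of the profile shown earlier; profile_get is a hypothetical helper, not part of DScli, and the demonstration reads a throwaway file rather than the real dscli.profile.

```shell
#!/bin/sh
# Sketch: read one setting from a dscli profile laid out as "key: value".
# profile_get is a hypothetical helper, not part of DScli.
profile_get() {
    # $1 = key name, $2 = profile file; comment lines starting with # are skipped.
    awk -v key="$1" '
        /^[ \t]*#/ { next }
        {
            pos = index($0, ":")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }' "$2"
}

# Demonstration against a throwaway copy of the profile entries shown above.
demo_profile=$(mktemp)
cat > "$demo_profile" <<'EOF'
# hmc1 is first SHMC for DS8000
hmc1:172.16.5.100
hmc2:172.16.5.101
username: dsadmin
devid: IBM.2107-7520431
EOF

for key in hmc1 hmc2 username devid; do
    value=$(profile_get "$key" "$demo_profile")
    if [ -n "$value" ]; then
        echo "$key = $value"
    else
        echo "$key is missing from the profile" >&2
    fi
done
rm -f "$demo_profile"
```

A backup script could run such a check first and stop early with a clear message, instead of failing later at an interactive password prompt.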
Once the DScli setup is done, a lot of other storage configuration remains to be done on the DS8000 (creating array sites, arrays, volume groups, host systems and open systems volumes, configuring the I/O port topology, and so on). This is beyond the scope of this article, but good details on the subject can be found in the IBM Redbooks listed in the references at the end.
In our implementation we created three DS8000 open systems volumes (to hold our Oracle data filesystems and archive log filesystems) and assigned them to the AIX node (bkkwt) using the volume group concept of the DS8000 storage hierarchy. Later, three more open systems volumes of the same sizes were created in the same volume group to serve as target volumes in the FlashCopy relationships.
SDD 1.6.0, IBM's multipathing software, was also installed on the AIX host, together with the proper host attachment script for the DS8000. With this software the DS8000 LUNs appear as vpath devices (rather than hdisks) on the AIX operating system.
Joining all pieces together – Automated Backup Solution Implementation
Our requirement was to develop an online backup solution for a 2 TB Oracle 9.2 database environment running on AIX 5.3 and HACMP 5.2. We achieved this by integrating DScli and IBM FlashCopy with UNIX shell scripting for this specific environment; in general, however, the same solution can be used (with some scenario-specific changes) for any database that supports online backups.
DS8000 FlashCopy creation and deletion commands are called from shell scripts, and specific AIX LVM commands are then used to make the target LUNs available at the operating system level. Because the source filesystems are mounted when the flash operation is performed, the automated solution takes special measures to ensure that no write activity happens during the flash. This is the only way to ensure backup consistency. At the database level, the Oracle begin backup and end backup SQL commands keep the tablespaces in hot backup mode for the duration of the flash, so that the copied datafiles can later be recovered using the archived redo logs. At the AIX level, the “freeze” option of the chfs command ensures that all data in the filesystem cache is written to disk before the FlashCopy operation starts. This “freeze” option, which is available only for AIX JFS2 filesystems, removes the need for the AIX “sync” command, which serves nearly the same purpose for JFS filesystems but does not guarantee it. We measured the time required for the actual FlashCopy operation (approximately 20 seconds in our case), so we freeze our JFS2 filesystems (containing data and archive logs) for 45 seconds so that no write activity occurs at the OS level during the FlashCopy operation. As soon as the FlashCopy operation completes, the filesystems are thawed and the Oracle end backup statements are executed.
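The ordering described above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the production script from the appendix: DRY_RUN, run and flash_sequence are hypothetical names, the filesystem list is shortened, and the rule that thaw and end backup always run even after a failed step is an addition the original script does not have. With DRY_RUN left at 1, the commands are only printed.

```shell
#!/bin/sh
# Sketch of the ordering described above: hot backup mode, freeze, flash,
# thaw, end backup. DRY_RUN, run and flash_sequence are hypothetical names.
DRY_RUN=${DRY_RUN:-1}            # 1 = only print the commands (default here)
FREEZE_SECONDS=60                # freeze timeout passed to chfs
FILESYSTEMS="/oracle/data1 /oracle/archivelogs"   # shortened list
PAIRS="1100:1105 1101:1106"      # source:target volume pairs

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

flash_sequence() {
    status=0
    run su - bnkora -c "sqlplus /nolog < /scripts/begin_backup.sql" || return 1
    for fs in $FILESYSTEMS; do
        run chfs -a freeze="$FREEZE_SECONDS" "$fs" || status=1
    done
    # Take the flash only if every filesystem froze successfully.
    if [ "$status" -eq 0 ]; then
        for pair in $PAIRS; do
            run /opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp "$pair" || status=1
        done
    fi
    # Always thaw and leave backup mode, even after a failure above.
    for fs in $FILESYSTEMS; do
        run chfs -a freeze=off "$fs" || status=1
    done
    run su - bnkora -c "sqlplus /nolog < /scripts/end_backup.sql" || status=1
    return "$status"
}

flash_sequence
```

Keeping the thaw and end backup steps unconditional means a failed mkflash leaves neither the filesystems frozen until the timeout nor the tablespaces stuck in backup mode.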
We then used powerful AIX LVM commands (including recreatevg) to make the target filesystems available on the same AIX server that holds the source filesystems. Source and target filesystems are therefore mounted on the same server in my implementation (although the target filesystems could have been mounted on a different AIX node). The target filesystems were then backed up to the TSM server using the TSM B/A client for AIX with the help of the TSM scheduler.
We created two shell scripts that put all these pieces together; they are included in the appendix. One script, “flashcreate.sh”, creates the FlashCopy, while the other, “flashdelete.sh”, deletes the target FlashCopy drives and cleans up all ODM information before the flash creation process is repeated.
We selected the -nocp option of the mkflash command. When establishing FlashCopy relationships on the DS8000, you can choose between two modes: background copy and no background copy (nocp). The -nocp parameter determines whether the data of the source volume is copied to the target volume in the background. If -nocp is not used, all data is copied from source to target in the background. With -nocp, only updates to the source volume cause writes to the target volume, to preserve the time-zero data there. This option is therefore useful in solutions where an instant copy has to be made available for backup purposes.
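The difference between the two modes comes down to one flag on the command line. The sketch below only prints the mkflash commands for each mode; mkflash_cmd is a hypothetical helper, and the device ID and volume pairs are the ones used elsewhere in this article.

```shell
#!/bin/sh
# Sketch: build mkflash command lines with or without background copy.
# mkflash_cmd is a hypothetical helper; it only prints, it does not call dscli.
DSCLI=/opt/ibm/dscli/dscli
DEVID=IBM.2107-7520431

mkflash_cmd() {
    # $1 = "nocp" for no background copy, "copy" for a full background copy
    # $2 = source:target volume pair
    case "$1" in
        nocp) echo "$DSCLI mkflash -dev $DEVID -nocp $2" ;;
        copy) echo "$DSCLI mkflash -dev $DEVID $2" ;;
        *)    echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

for pair in 1100:1105 1101:1106 1102:1104; do
    mkflash_cmd nocp "$pair"
done
```

Printing the commands first makes it easy to review the pairs before anything is sent to the storage unit; the DScli lsflash command can then be used to inspect the relationships that were established.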
In our solution the target flash filesystems have to be mounted on the same AIX system as the source filesystems, so they are mounted (and archived daily to the TSM server by the TSM scheduler) under mount points different from the originals. In our implementation the mount points are prefixed with /flash. As a result, when restoring from the TSM server, the target filesystems must be created and mounted with the same mount points (such as /flash/oracle/data1). Once restored (say to /flash/oracle/data1) on the DR server, the mount points can easily be changed back to /oracle/data1 with the chfs command before the application or database is started on the DR system. You can also create a post-schedule script for the TSM scheduler that uses chfs to change the mount points after each scheduled TSM restore.
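A post-restore step of this kind might look like the following sketch. It only prints the chfs commands instead of running them; original_mount_point and FLASH_PREFIX are hypothetical names, and the filesystem list is an example. The -m flag of chfs sets a new mount point, and the filesystems would presumably be unmounted before the change and remounted afterwards.

```shell
#!/bin/sh
# Sketch: print chfs commands that move restored filesystems from /flash/...
# back to their original mount points. Names here are hypothetical.
FLASH_PREFIX=/flash

original_mount_point() {
    # Strip the leading /flash prefix: /flash/oracle/data1 -> /oracle/data1
    case "$1" in
        "$FLASH_PREFIX"/*) echo "${1#"$FLASH_PREFIX"}" ;;
        *) echo "not under $FLASH_PREFIX: $1" >&2; return 1 ;;
    esac
}

for fs in /flash/oracle/data1 /flash/oracle/archivelogs; do
    new_mp=$(original_mount_point "$fs") || continue
    # chfs -m records the new mount point for the filesystem.
    echo "chfs -m $new_mp $fs"
done
```

Deriving the target from the /flash prefix keeps the script free of a hard-coded mapping, so adding another data filesystem needs no change here.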
Appendix A - Scripts
------------------------------------------------------------------------------------------------------------------
-- Written : For R3BNKORA AIX node
-- Date    : August 2006
-- Script  : begin_backup.sql
-- Purpose : Places all Oracle tablespaces into begin backup mode
--           and hence ensures database consistency before the online
--           backup is taken using the FlashCopy technique.
--------------------------------------------------------------------------------------------------------------------
connect / as sysdba
alter tablespace BNKORABTABD begin backup;
alter tablespace BNKORABTABI begin backup;
alter tablespace BNKORACLUD begin backup;
alter tablespace BNKORALOADD begin backup;
alter tablespace BNKORALOADI begin backup;
alter tablespace BNKORAPOOLD begin backup;
alter tablespace BNKORAPOOLI begin backup;
alter tablespace BNKORAPROTD begin backup;
alter tablespace BNKORAPROTI begin backup;
alter tablespace BNKORAROLL begin backup;
alter tablespace BNKORASOURCED begin backup;
alter tablespace BNKORASOURCEI begin backup;
alter tablespace BNKORASTABD begin backup;
alter tablespace BNKORASTABI begin backup;
alter tablespace BNKORATEMP begin backup;
alter tablespace BNKORAUSER1D begin backup;
alter tablespace BNKORAUSER1I begin backup;
alter tablespace SYSTEM begin backup;
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
--------------------------------------------------------------------------------------------------------------
--
-- Written : For R3BNKORA AIX node
-- Date    : August 2006
-- Script  : end_backup.sql
-- Purpose : To bring all Oracle tablespaces back to normal state
-----------------------------------------------------------------------------------------------------------
connect / as sysdba
alter tablespace BNKORABTABD end backup;
alter tablespace BNKORABTABI end backup;
alter tablespace BNKORACLUD end backup;
alter tablespace BNKORALOADD end backup;
alter tablespace BNKORALOADI end backup;
alter tablespace BNKORAPOOLD end backup;
alter tablespace BNKORAPOOLI end backup;
alter tablespace BNKORAPROTD end backup;
alter tablespace BNKORAPROTI end backup;
alter tablespace BNKORAROLL end backup;
alter tablespace BNKORASOURCED end backup;
alter tablespace BNKORASOURCEI end backup;
alter tablespace BNKORASTABD end backup;
alter tablespace BNKORASTABI end backup;
alter tablespace BNKORATEMP end backup;
alter tablespace BNKORAUSER1D end backup;
alter tablespace BNKORAUSER1I end backup;
alter tablespace SYSTEM end backup;
-----------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------
# Script name: flashrecreate.sh
#
# Written for: R3BNKORA AIX node
# Date       : December 2005
# Created by : Khurram Shiraz
# Purpose    : Shell script for creating flash snapshots and making them
#              available on AIX so that the TSM client can back up the
#              flashed filesystems to the TSM server.
-----------------------------------------------------------------------
#!/bin/ksh
TSTFL="/scripts/lockfile"
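# The lock file is created by the flash removal script; if it is missing,
# the previous FlashCopy pairs have not been removed yet.
if [ ! -f "$TSTFL" ];
then
echo "Please ensure that FlashCopy pairs are already removed before this script execution"
echo "It seems that they are still in place"
echo "therefore exiting!!!!"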
exit 1
else
echo Putting Oracle into hot backup Mode
echo please wait ............................
#
su - bnkora -c "sqlplus /nolog < /scripts/begin_backup.sql"
sleep 10
# Freeze the JFS2 filesystems (60 second timeout) while the flash is taken
chfs -a freeze=60 /oracle/data1
chfs -a freeze=60 /oracle/data2
chfs -a freeze=60 /oracle/data3
chfs -a freeze=60 /oracle/data4
chfs -a freeze=60 /oracle/archivelogs
# Execution of DScli commands
/opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1100:1105
/opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1101:1106
/opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1102:1104
# Thaw the filesystems again
chfs -a freeze=off /oracle/data1
chfs -a freeze=off /oracle/data2
chfs -a freeze=off /oracle/data3
chfs -a freeze=off /oracle/data4
chfs -a freeze=off /oracle/archivelogs
# Putting Oracle back to normal Mode
su - bnkora -c "sqlplus /nolog < /scripts/end_backup.sql"
# Now working for Flashed Data.......
#
cfgmgr
# Starting preparation of LVM & VGs for mounting of filesystems
chdev -l vpath0 -a pv=clear
chdev -l vpath1 -a pv=clear
chdev -l vpath2 -a pv=clear
recreatevg -y flashvg1 -Y flash -L /flash vpath0
recreatevg -y flashvg2 -Y flash -L /flash vpath1
recreatevg -y flashvg3 -Y flash -L /flash vpath2
echo "....... now running fsck & mounting fs"
fsck -y /flash/oracle/data1
mount /flash/oracle/data1
fsck -y /flash/oracle/data2
mount /flash/oracle/data2
fsck -y /flash/oracle/data3
mount /flash/oracle/data3
fsck -y /flash/oracle/data4
mount /flash/oracle/data4
fsck -y /flash/oracle/archivelogs
mount /flash/oracle/archivelogs
cd /scripts
rm lockfile
exit 0
fi
------------------------------------------------------------------------------------------------------------
# Script  : flashdisable.sh
#
# Written : For R3BNKORA AIX node
# Date    : December 2005
# Purpose : Shell script for disabling flash target drives from the TSM client
#           node and removing all related OS information.
-----------------------------------------------------------------------
#!/bin/ksh
# Unmount all filesystems which are created during FlashCopy operation
#
unmount /flash/oracle/data1
unmount /flash/oracle/data2
unmount /flash/oracle/data3
unmount /flash/oracle/data4
unmount /flash/oracle/archivelogs
# Varyoff all Flashcopy volume Groups
#
varyoffvg flashvg1
varyoffvg flashvg2
varyoffvg flashvg3
# Export all Flashcopy volume groups
exportvg flashvg1
exportvg flashvg2
exportvg flashvg3
# Remove all snapshot logical drives (vpaths and associated hdisks)
rmdev -dl vpath0
rmdev -dl vpath1
rmdev -dl vpath2
rmdev -dl hdisk21
rmdev -dl hdisk22
rmdev -dl hdisk23
rmdev -dl hdisk24
rmdev -dl hdisk11
rmdev -dl hdisk13
rmdev -dl hdisk17
rmdev -dl hdisk19
rmdev -dl hdisk29
rmdev -dl hdisk31
rmdev -dl hdisk33
rmdev -dl hdisk35
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1100:1105
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1101:1106
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1102:1104
cd /scripts
touch lockfile
exit 0
References:
IBM White Paper “Storage Solutions for Oracle Database: Snapshot Backup and Recovery with IBM Total Storage”
IBM Redbook “IBM TotalStorage DS8000 Series: Copy Services in Open Environments”, SG24-6788-00
IBM Redbook “The IBM TotalStorage DS8000 Series: Concepts and Architecture”, SG24-6471-00
About Author: Khurram Shiraz is senior system Administrator at KMEFIC,
Note: This article was originally published in October 2007 by SysAdmin Magazine US (www.samag.com). Many electronic versions of this article are now available on the internet.