Wednesday, 6 May 2009

An Online Backup Solution using Advanced Features on IBM DS8000

Designing and implementing a foolproof backup strategy has been an important topic for companies over the years. With data volumes now reaching terabytes, organizations are looking for foolproof backup solutions that keep their services online and available to users, with minimal performance impact during the backup window.

Historically, database administrators have relied on the online backup tools and techniques provided by their databases. Oracle, for example, has supported online (hot) backups through the traditional begin backup and end backup statements for many years. RMAN is now also available and can be integrated with backup software such as TSM or NetBackup to provide online backup solutions for Oracle databases.

The main problem arises when the database is very large (terabytes). In that case, the time the database has to spend in online backup mode becomes a problem. While the database is in online backup mode it stays open to end users, but it carries extra overhead (Oracle, for instance, generates additional redo for changed blocks), so most organizations want to keep this period as short as possible. This is where snapshot techniques come in. Snapshot tools, most of which are provided at the storage hardware level, address this problem comprehensively and form the foundation of strategic backup solutions for such huge databases.

Nearly all high-end IBM storage subsystems provide a snapshot tool of this kind. In IBM terminology it is known as “FlashCopy”, which is available as a separately licensed feature for the IBM DS4000, DS6000 and DS8000 storage subsystem series.

This feature is a data snapshot technique that copies data bit by bit at the storage hardware level without any performance impact on the server itself. A normal FlashCopy operation on a DS6000 or DS8000 storage subsystem usually takes no more than a few seconds to snapshot a source database of several terabytes. Anyone who uses FlashCopy as an integrated part of an online backup solution is therefore left with only one task: making the snapshot available to the operating system (so that it can be written to tape cartridges, for example) in the shortest possible time.

In this article, I cover different aspects of the IBM DS8000 FlashCopy feature, along with its implementation and integration into a comprehensive and fully automated online backup solution for a very large Oracle database (~1.2 Terabytes). Although the FlashCopy feature is technically the same across IBM storage subsystem series, its implementation varies from series to series, because IBM uses different user interfaces to manage each series. For example, the DS6000 and DS8000 are managed by DS Storage Manager, running on Windows and on the Storage HMC (Linux) respectively, while the DS4000 is managed by the FAStT Storage Manager software, which can be installed on a variety of operating systems including AIX and Windows. Similarly, the command line interface (CLI) for the DS6000/DS8000 has many commands that differ from those of the CLI used for the DS4000 series. This article therefore concentrates on developing an automated online backup solution using DScli for the DS8000 storage subsystem.

Advanced Copy Services from IBM

The DS8000 series Advanced Copy Services are powerful data backup, remote mirroring and recovery functions that can help protect data from unforeseen events. Copy Services runs on the IBM TotalStorage DS series and is designed to support a wide range of servers, including IBM pSeries, iSeries and zSeries environments.

Comparable Copy Services functions are also available on the IBM TotalStorage Enterprise Storage Server (ESS) Models 800 and 750, as well as on the DS6000 series. Copy Services includes the following types of functions:

o IBM TotalStorage FlashCopy®, a point-in-time copy function

o Remote mirror and copy functions, which include:

    o IBM TotalStorage Metro Mirror (previously known as Synchronous PPRC)

    o IBM TotalStorage Global Mirror (previously known as Asynchronous PPRC)

You can manage Copy Services functions through the DS8000 series CLI, as well as through the GUI provided by the IBM TotalStorage DS Storage Manager, which is available on the S-HMC (the Linux-based servers supplied with the DS8000 for storage management).

What is FlashCopy Technology

The FlashCopy feature is designed to create full volume copies of data at the storage hardware level. When you set up a FlashCopy operation, a relationship is established between the source and target volumes, and a bitmap of the source volume is created. Once the relationship and the bitmap exist, the target volume can be accessed as though all the data had been physically copied. While the relationship exists, a background process copies the tracks from the source to the target volume. FlashCopy therefore appears to provide an instant point-in-time copy of the LUNs on DS storage subsystems. This point-in-time copy contains a consistent snapshot of the original source data (as of a specific point in time) only if the necessary measures have been taken at the operating system and database level. This matters because the snapshot is taken at the hardware level, and the application or database has no knowledge that a snapshot is in progress. The success of any backup solution built on IBM FlashCopy (and, in general, on any storage or hardware snapshot technique) therefore depends on the data consistency measures taken during the actual snapshot operation.

Activating FlashCopy Feature on DS8000 Storage Subsystems

FlashCopy is a premium feature that requires a separate license, which can be bought along with the DS storage subsystem or ordered as an upgrade (called an MES in IBM terminology) for existing DS storage subsystems.

For DS6000 and DS8000 storage subsystems, it is mandatory to apply the license activation codes (or at least the Operating Environment License code, OEL). This can be done through the DS SMC or through the DS CLI console. Other advanced features such as FlashCopy (or PPRC) can be activated once the OEL is active.

To activate the FlashCopy feature on a DS8000, you must first gather the following information:

  1. What is the machine signature of the DS8000? This is the most important piece of information needed to activate your FlashCopy feature. The machine signature can easily be found with the following DScli commands:

dscli> lssi
Date/Time: March 30, 2005 6:53:05 PM CEST IBM DSCLI Version: 5.0.1.99
Name ID               Storage Unit     Model WWNN             State  ESSNet
============================================================================
-    IBM.2107-7520431 IBM.2107-7520430 922   5005076303FFC19D Online Enabled

dscli> showsi IBM.2107-7520431
Date/Time: March 30, 2005 6:53:11 PM CEST IBM DSCLI Version: 5.0.1.99 DS: IBM.2107-7520431
Name         -
desc         -
ID           IBM.2107-7520431
Storage Unit IBM.2107-7520430
Model        922
WWNN         5005076303FFC19D
Signature    896e-c0a3-38e9-5702
State        Online

  2. What is the machine serial number? The serial number of the DS8000 can be read from the front of the base frame (lower right corner). On the DS command line interface you can also use the lssu command for this purpose.

  3. What are the order confirmation codes (OCC)? The order confirmation code is printed on the DS8000 series order confirmation code document, which is usually sent to the client’s contact person together with the delivery of the machine.

After noting down the machine serial number, machine signature and OCC, you can access the following IBM internet site to generate the activation codes for FlashCopy.

https://www-03.ibm.com/storage/dsfa/index.jsp

On this website, after entering all of this information for your DS storage, you are redirected to the View Activation Codes window, where you can download, highlight and then copy and paste, or write down your activation codes. If you select Download now, you are prompted to select a file location. The file you download is a very small XML file.

We opted to write the activation codes in our small notebook; no doubt the handier approach!

In our case, the activation code for FlashCopy that we obtained from the above web site was 234-1934-J153-10DC-01FC-CA7D-5678-5678, so the next step was simply to apply this activation code. We did this with the DScli applykey command:

dscli> applykey -key 234-1934-J153-10DC-01FC-CA7D-5678-5678 IBM.2107-7520431
Date/Time: 2 May 2005 14:47:06 IBM DSCLI Version: 5.0.3.5 DS: IBM.2107-7520431
CMUC00199I applykey: License Machine Code successfully applied to storage image IBM.2107-7520431

We then verified the activation of FlashCopy on the DS8000 using the lskey command:

dscli> lskey IBM.2107-7520431
Date/Time: March 30, 2005 6:53:30 PM CEST IBM DSCLI Version: 5.0.1.99 DS: IBM.2107-7520431
Activation Key        Capacity (TB) Storage Type
================================================
Flashcopy             5             FB
Operating Environment 5             All
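For unattended use, the same verification can be scripted. The sketch below is only an illustration: it assumes that lskey prints output shaped like the sample above, and has_flashcopy_key is a hypothetical helper name, not a DScli command.

```shell
#!/bin/ksh
# has_flashcopy_key: hypothetical helper. Reads lskey output on stdin and
# succeeds if a line starting with "Flashcopy" is present (case-insensitive).
# Assumption: the output layout matches the sample shown in this article.
has_flashcopy_key() {
    grep -i '^flashcopy[[:space:]]' >/dev/null
}

# Demonstration with the sample output. On a real system, pipe in the
# output of "/opt/ibm/dscli/dscli lskey IBM.2107-7520431" instead.
sample_output='Activation Key Capacity (TB) Storage Type
================================================
Flashcopy 5 FB
Operating Environment 5 All'

if printf '%s\n' "$sample_output" | has_flashcopy_key; then
    echo "FlashCopy key is active"
else
    echo "FlashCopy key not found"
fi
```

A backup script could run such a check first and stop early if the license is missing, rather than failing later at the mkflash step.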

Starting with DScli

DScli is a very powerful tool for managing IBM DS storage subsystems. Because it is interactive and also supports a scripting mode, it is very handy and can easily be used to automate backup solutions built on DS8000 flash services.

We started building our backup solution by installing DScli. For the DS8000, DScli supports nearly every major operating system, including AIX 5L and Windows. We selected one of our LPARs on a P570 to act as the DScli management station. Every DS8000 Storage HMC has an external Ethernet interface that is meant to be attached to the customer network. We established a separate VLAN for this external network (172.17.20.xx) and assigned the IP addresses 172.17.20.100 and 172.17.20.101 to the external Ethernet interfaces of the DS8000 Storage HMCs using the DS manager interface. We then assigned the IP address 172.17.20.102 to one of the Ethernet interfaces of our AIX LPAR using “smitty chinet” and tested TCP/IP connectivity to both Storage HMCs of the DS8000. We also created a user on the SHMC with admin privileges so that dscli commands could be executed with this account.

We then installed DScli on our AIX LPAR as the root user. You must have Java 1.4.1 or higher installed on your system in a standard directory. The DS CLI installer checks the standard directories to determine whether Java 1.4.1 or higher exists on your system; if it is not found there, the installation fails. We therefore set up our shell environment correctly (with the correct JAVA_HOME environment variable), mounted the DScli installation CD and executed the setupaix.bin -console command as the root user. This installs DScli in its default directory for AIX, which is /opt/ibm/dscli.

We then created the dscli profile, which is the dscli.profile text file. In it we specified the IP addresses of the Storage HMCs (as hmc1 and hmc2) along with the user name and password.

Below is the content of the dscli profile used in our scenario:

----------------------------------------------------------------------
#DS CLI Profile
#
# Management Console/Node IP Address(es) are specified using the hmc parameter
# hmc1 and hmc2 are equivalent to -hmc1 and -hmc2 command line options.
# hmc1 is first SHMC for DS8000
hmc1:172.16.5.100
hmc2:172.16.5.101
username: dsadmin
# The password for dsadmin:
password: passw6sd
# Default target Storage Image ID
devid: IBM.2107-7520431
--------------------------------------------------------------------

We then tested DScli functionality from our AIX Lpar as follows:

/opt/ibm/dscli/dscli lsuser

If everything is configured correctly, this command should list all users on the SHMC without prompting for a password.
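Because a scheduled backup depends on the profile being complete, it can help to sanity-check the file before calling dscli. The following is a minimal sketch, not part of DScli: profile_value and check_profile are hypothetical helper names, and the parsing assumes the simple "key: value" layout of the profile shown above.

```shell
#!/bin/ksh
# profile_value KEY FILE: print the value of the first "KEY: value" line.
# Hypothetical helper; comment lines start with "#" and are never matched.
profile_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*:[[:space:]]*//p" "$2" | head -n 1
}

# check_profile FILE: fail if one of the keys this article relies on is empty.
check_profile() {
    for key in hmc1 username devid; do
        if [ -z "$(profile_value "$key" "$1")" ]; then
            echo "missing $key in $1" >&2
            return 1
        fi
    done
    echo "profile $1 looks complete"
}

# Demonstration against a throwaway copy of the settings used in this article.
sample=${TMPDIR:-/tmp}/dscli.profile.sample.$$
cat > "$sample" <<'EOF'
# hmc1 is first SHMC for DS8000
hmc1:172.16.5.100
username: dsadmin
devid: IBM.2107-7520431
EOF
check_profile "$sample"
profile_value devid "$sample"
rm -f "$sample"
```

Since the profile stores a password in clear text, restricting its permissions to the account that runs the backup scripts is also worth considering.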

Once the DScli setup is done, a lot of other storage configuration work remains on the DS8000 (creation of array sites, arrays, volume groups, host systems and open systems volumes, configuration of the I/O port topology, and so on). These tasks are beyond the scope of this article, but good details on the subject can be found in the IBM Redbooks listed in the references at the end of this article.

In our implementation we created three DS8000 open system volumes (to hold our Oracle data filesystems along with the archive log filesystems) and assigned these volumes to the AIX node (bkkwt) using the volume group concept of the DS8000 storage hierarchy. Later, three more open system volumes (of the same sizes as the previously created ones) were created in the same volume group so that they could be used as target volumes in FlashCopy relationships.

SDD 1.6.0, the multipathing software from IBM, was also installed on the AIX host with the proper host attachment script for the DS8000. This software causes DS8000 LUNs to appear as vpath devices (rather than hdisks) on the AIX operating system.

Joining all pieces together – Automated Backup Solution Implementation



Our requirement was to develop an online backup solution for a 2 TB Oracle 9.2 database environment running on AIX 5.3 and HACMP 5.2. We achieved this by integrating DScli and IBM FlashCopy with UNIX shell scripting for this specific environment; in general, however, the same solution can be used (with some scenario-specific changes) for any database that supports online backups.

The DS8000 FlashCopy creation and deletion commands were called from shell scripts, and specific AIX LVM commands were then used to make the target LUNs available at the operating system level. Because the source filesystems are mounted while the flash operation is performed, special measures were taken in this automated solution to ensure that no write activity reached the disks during the flash operation. This is the only way to ensure backup consistency. At the database level, the Oracle begin backup and end backup SQL commands were used to place the tablespaces in hot backup mode, so that datafiles copied while the database is open can be recovered with the archived redo logs. At the AIX level, the “freeze” option of the chfs command was used to ensure that all data in the filesystem cache was written to disk before the FlashCopy operation started. This “freeze” option, which is available only for AIX JFS2 filesystems, removes the need for the AIX “sync” command, which serves nearly the same purpose for JFS filesystems but does not guarantee it. We measured the time required to complete the actual FlashCopy operation (approximately 20 seconds in our case), so we froze our JFS2 filesystems (containing data and archive logs) for 45 seconds so that no write activity took place at the OS level during the FlashCopy operation. As soon as the FlashCopy operation completed, the filesystems were thawed and the Oracle end backup statements were executed.
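The ordering described above (hot backup mode, freeze, flash, thaw, end backup) can be sketched as a small ksh function. This is an illustrative outline rather than the production script from the appendix: DRY_RUN, run, thaw_all and flash_window are hypothetical names, and by default the sketch only prints the commands it would execute. The 60-second freeze value follows the appendix script. Unlike a straight-line script, the sketch always thaws the filesystems and ends backup mode, even if a freeze or mkflash step fails.

```shell
#!/bin/ksh
# Sketch only. With DRY_RUN=1 (the default) nothing is executed; each
# command is printed with a leading "+". Set DRY_RUN=0 to run for real.
DRY_RUN=${DRY_RUN:-1}
FREEZE_SECS=60
DEVID="IBM.2107-7520431"
FILESYSTEMS="/oracle/data1 /oracle/data2 /oracle/data3 /oracle/data4 /oracle/archivelogs"
PAIRS="1100:1105 1101:1106 1102:1104"

# run: hypothetical wrapper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# thaw_all: release the freeze on every filesystem.
thaw_all() {
    for fs in $FILESYSTEMS; do
        run chfs -a freeze=off "$fs"
    done
}

# flash_window: begin backup, freeze, flash, thaw, end backup.
flash_window() {
    rc=0
    run su - bnkora -c "sqlplus /nolog < /scripts/begin_backup.sql" || return 1
    for fs in $FILESYSTEMS; do
        run chfs -a freeze="$FREEZE_SECS" "$fs" || rc=1
    done
    # Only flash if every filesystem was frozen successfully.
    if [ "$rc" -eq 0 ]; then
        for pair in $PAIRS; do
            run /opt/ibm/dscli/dscli mkflash -dev "$DEVID" -nocp "$pair" || rc=1
        done
    fi
    # Always thaw and leave hot backup mode, whatever happened above.
    thaw_all
    run su - bnkora -c "sqlplus /nolog < /scripts/end_backup.sql" || rc=1
    return $rc
}

flash_window
```

In dry-run mode the output is simply the command sequence, which makes it easy to review the order of operations before the script is ever run against a live database.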

We then used powerful AIX LVM commands (including the recreatevg command) to make the target filesystems available on the same AIX server that holds the source filesystems. In my implementation, therefore, the source and target filesystems are mounted on the same server (although it would have been possible to mount the target filesystems on a different AIX node). The target filesystems were then backed up to the TSM server using the TSM B/A client for AIX with the help of the TSM scheduler.

We created two shell scripts that put all these pieces together. These scripts are included in the appendix. One of them, named “flashcreate.sh”, creates the FlashCopy, while the other, “flashdelete.sh”, deletes the target FlashCopy drives and cleans up all ODM information before the flash creation process is repeated.

We did not find it mandatory to run the fsck command against the target flash filesystems before mounting them on the AIX server, because the freeze option of JFS2 had already ensured that all data in the filesystem cache was written to disk before the FlashCopy operation started. For implementations using plain JFS filesystems, however, it is mandatory to run fsck against the target filesystems before mounting them on AIX. In our scenario the backup window allowed us to run fsck on the target filesystems, so we adopted it as an additional measure to ensure data consistency at the operating system level. A thorough fsck -y run against all target filesystems (almost 1 Terabyte) took almost 45 minutes when the fsck commands were run sequentially.
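Since the checks in our setup ran one after another, one possible variation (not used in the solution described here, and untested on this environment) is to start the checks as background jobs and wait for all of them. The sketch below shows the shell pattern only; check_all is a hypothetical helper, and FSCK_CMD defaults to a harmless echo so that the example can be run anywhere. Parallel checks also put more I/O load on the storage, so whether the window really shrinks would have to be measured.

```shell
#!/bin/ksh
# FSCK_CMD defaults to a dry-run stand-in; set FSCK_CMD="fsck -y" on AIX.
FSCK_CMD=${FSCK_CMD:-"echo fsck -y"}

# check_all FS...: run the check for each filesystem in the background,
# then wait for every job and report failure if any check failed.
check_all() {
    pids=""
    for fs in "$@"; do
        $FSCK_CMD "$fs" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=1
    done
    return $failed
}

# Dry-run demonstration; the order of the two output lines may vary.
check_all /flash/oracle/data1 /flash/oracle/data2
```

The filesystems would still be mounted only after check_all returns successfully.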

We selected the nocp option of the mkflash command. When establishing FlashCopy relationships on the DS8000, you can choose one of two modes: background copy or no background copy (nocp). The -nocp parameter determines whether the data of the source volume is copied to the target volume in the background. If -nocp is not used, all data is copied from source to target in the background. With -nocp selected, only updates to the source volume cause writes to the target volume, in order to preserve the time-zero data there. This option is therefore useful in solutions where an instant copy has to be made available for backup purposes.
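The difference between the two modes comes down to a single flag on the command line. As a small illustration, the hypothetical helper build_mkflash below only prints the mkflash command for either mode, using the storage image ID and one volume pair from this article; it does not contact the storage subsystem.

```shell
#!/bin/ksh
# build_mkflash MODE DEV PAIR: print the dscli command for the given mode.
# MODE is "nocp" (no background copy) or "copy" (full background copy).
# Hypothetical helper for illustration; it does not execute dscli.
build_mkflash() {
    mode=$1
    dev=$2
    pair=$3
    case "$mode" in
        nocp) echo "/opt/ibm/dscli/dscli mkflash -dev $dev -nocp $pair" ;;
        copy) echo "/opt/ibm/dscli/dscli mkflash -dev $dev $pair" ;;
        *)    echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

build_mkflash nocp IBM.2107-7520431 1100:1105
build_mkflash copy IBM.2107-7520431 1100:1105
```

For a daily backup where the target is discarded after the TSM run, the nocp form avoids copying the whole source volume each time.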

In our solution, because the target flash filesystems have to be mounted on the same AIX system as the source filesystems, they are mounted (and hence archived daily to the TSM server by the TSM scheduler) under mount points that differ from the original ones. In our implementation, the mount points are prefixed with /flash. As a result, when restoring from the TSM server, the target filesystems must be created and mounted with the same mount points (such as /flash/oracle/data1). Once restored (say to /flash/oracle/data1) on the DR server, the mount points can easily be changed back to /oracle/data1 with the chfs command before the application or database is started on the DR system. You may also create a post-restore script for the TSM scheduler that uses the chfs command to change the mount points after each scheduled restoration.
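The mount point change can be scripted with the -m flag of chfs, which sets a new mount point for an existing filesystem. The following sketch only prints the chfs commands for the filesystems used in this article; chfs_rename_cmd and FLASH_PREFIX are hypothetical names, and it assumes that the filesystems are unmounted when the printed commands are actually run on the DR server.

```shell
#!/bin/ksh
FLASH_PREFIX=/flash

# chfs_rename_cmd MOUNTPOINT: print the chfs command that strips the
# /flash prefix from a mount point. Hypothetical helper; prints only.
chfs_rename_cmd() {
    old=$1
    case "$old" in
        "$FLASH_PREFIX"/*)
            new=${old#"$FLASH_PREFIX"}
            echo "chfs -m $new $old"
            ;;
        *)
            echo "skipping $old: not under $FLASH_PREFIX" >&2
            return 1
            ;;
    esac
}

for fs in /flash/oracle/data1 /flash/oracle/data2 /flash/oracle/data3 \
          /flash/oracle/data4 /flash/oracle/archivelogs; do
    chfs_rename_cmd "$fs"
done
```

Printing the commands first, and running them only after review, keeps a mistyped prefix from renaming the wrong filesystem.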


Appendix A - Scripts

------------------------------------------------------------------------------------------------------------------
-- Written : For R3BNKORA AIX node
-- Date    : August 2006
-- Script  : begin_backup.sql
-- Purpose : It will place all Oracle tablespaces into begin backup mode
--           and hence will ensure database consistency before the online
--           backup is taken using the FlashCopy technique.
--------------------------------------------------------------------------------------------------------------------
connect / as sysdba
alter tablespace BNKORABTABD begin backup;
alter tablespace BNKORABTABI begin backup;
alter tablespace BNKORACLUD begin backup;
alter tablespace BNKORALOADD begin backup;
alter tablespace BNKORALOADI begin backup;
alter tablespace BNKORAPOOLD begin backup;
alter tablespace BNKORAPOOLI begin backup;
alter tablespace BNKORAPROTD begin backup;
alter tablespace BNKORAPROTI begin backup;
alter tablespace BNKORAROLL begin backup;
alter tablespace BNKORASOURCED begin backup;
alter tablespace BNKORASOURCEI begin backup;
alter tablespace BNKORASTABD begin backup;
alter tablespace BNKORASTABI begin backup;
alter tablespace BNKORATEMP begin backup;
alter tablespace BNKORAUSER1D begin backup;
alter tablespace BNKORAUSER1I begin backup;
alter tablespace SYSTEM begin backup;
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;

--------------------------------------------------------------------------------------------------------------
-- Written : For R3BNKORA AIX node
-- Date    : August 2006
-- Script  : end_backup.sql
-- Purpose : To bring all Oracle tablespaces back to normal state
-----------------------------------------------------------------------------------------------------------
connect / as sysdba
alter tablespace BNKORABTABD end backup;
alter tablespace BNKORABTABI end backup;
alter tablespace BNKORACLUD end backup;
alter tablespace BNKORALOADD end backup;
alter tablespace BNKORALOADI end backup;
alter tablespace BNKORAPOOLD end backup;
alter tablespace BNKORAPOOLI end backup;
alter tablespace BNKORAPROTD end backup;
alter tablespace BNKORAPROTI end backup;
alter tablespace BNKORAROLL end backup;
alter tablespace BNKORASOURCED end backup;
alter tablespace BNKORASOURCEI end backup;
alter tablespace BNKORASTABD end backup;
alter tablespace BNKORASTABI end backup;
alter tablespace BNKORATEMP end backup;
alter tablespace BNKORAUSER1D end backup;
alter tablespace BNKORAUSER1I end backup;
alter tablespace SYSTEM end backup;
-----------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------
# Script name : flashrecreate.sh
# Written for : R3BNKORA AIX node
# Date        : December 2005
# Created by  : Khurram Shiraz
# Purpose     : Shell script for creating flash snapshots and making them
#               available on AIX so that the TSM client can back up the
#               flashed filesystems to the TSM server.
-----------------------------------------------------------------------
#!/bin/ksh
TSTFL="/scripts/lockfile"

# The lock file is created by the flash removal script. If it is missing,
# the previous FlashCopy pairs are assumed to be still in place.
if [ ! -f "$TSTFL" ]
then
    echo "Please ensure that FlashCopy pairs are already removed before this script execution"
    echo "It seems that they are still in place"
    echo "therefore exiting!!!!"
    exit 1
else
    echo "Putting Oracle into hot backup mode"
    echo "please wait ............................"
    su - bnkora -c "sqlplus /nolog < /scripts/begin_backup.sql"
    sleep 10

    # Freeze the JFS2 filesystems (automatic thaw after 60 seconds)
    chfs -a freeze=60 /oracle/data1
    chfs -a freeze=60 /oracle/data2
    chfs -a freeze=60 /oracle/data3
    chfs -a freeze=60 /oracle/data4
    chfs -a freeze=60 /oracle/archivelogs

    # Execution of DScli commands
    /opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1100:1105
    /opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1101:1106
    /opt/ibm/dscli/dscli mkflash -dev IBM.2107-7520431 -nocp 1102:1104

    # Thaw the filesystems
    chfs -a freeze=off /oracle/data1
    chfs -a freeze=off /oracle/data2
    chfs -a freeze=off /oracle/data3
    chfs -a freeze=off /oracle/data4
    chfs -a freeze=off /oracle/archivelogs

    # Putting Oracle back to normal mode
    su - bnkora -c "sqlplus /nolog < /scripts/end_backup.sql"

    # Now working on the flashed data: discover the target LUNs
    cfgmgr

    # Starting preparation of LVM & VGs for mounting of filesystems:
    # clear the copied PVIDs, then recreate the volume groups with
    # "flash" as LV name prefix (-Y) and /flash as mount point prefix (-L)
    chdev -l vpath0 -a pv=clear
    chdev -l vpath1 -a pv=clear
    chdev -l vpath2 -a pv=clear
    recreatevg -y flashvg1 -Y flash -L /flash vpath0
    recreatevg -y flashvg2 -Y flash -L /flash vpath1
    recreatevg -y flashvg3 -Y flash -L /flash vpath2

    echo "... now running fsck & mounting fs"
    fsck -y /flash/oracle/data1
    mount /flash/oracle/data1
    fsck -y /flash/oracle/data2
    mount /flash/oracle/data2
    fsck -y /flash/oracle/data3
    mount /flash/oracle/data3
    fsck -y /flash/oracle/data4
    mount /flash/oracle/data4
    fsck -y /flash/oracle/archivelogs
    mount /flash/oracle/archivelogs

    # Remove the lock file: FlashCopy pairs are now in place
    cd /scripts
    rm lockfile
    exit 0
fi

------------------------------------------------------------------------------------------------------------
# Script  : flashdisable.sh
# Written : For R3BNKORA AIX node
# Date    : December 2005
# Purpose : Shell script for disabling flash target drives from TSM client
#           node and removing all related OS information.
-----------------------------------------------------------------------
#!/bin/ksh

# Unmount all filesystems which are created during FlashCopy operation
unmount /flash/oracle/data1
unmount /flash/oracle/data2
unmount /flash/oracle/data3
unmount /flash/oracle/data4
unmount /flash/oracle/archivelogs

# Vary off all FlashCopy volume groups
varyoffvg flashvg1
varyoffvg flashvg2
varyoffvg flashvg3

# Export all FlashCopy volume groups
exportvg flashvg1
exportvg flashvg2
exportvg flashvg3

# Remove all snapshot logical drives (vpaths and associated hdisks)
rmdev -dl vpath0
rmdev -dl vpath1
rmdev -dl vpath2
rmdev -dl hdisk21
rmdev -dl hdisk22
rmdev -dl hdisk23
rmdev -dl hdisk24
rmdev -dl hdisk11
rmdev -dl hdisk13
rmdev -dl hdisk17
rmdev -dl hdisk19
rmdev -dl hdisk29
rmdev -dl hdisk31
rmdev -dl hdisk33
rmdev -dl hdisk35

# Remove the FlashCopy relationships on the DS8000
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1100:1105
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1101:1106
/opt/ibm/dscli/dscli rmflash -dev IBM.2107-7520431 -quiet 1102:1104

# Create the lock file: FlashCopy pairs are removed
cd /scripts
touch lockfile
exit 0

References:

IBM White Paper “Storage Solutions for Oracle Database: Snapshot Backup and Recovery with IBM TotalStorage Enterprise Storage Server”

IBM Redbook “IBM TotalStorage DS8000 Series: Copy Services in Open Environments”, SG24-6788-00

IBM Redbook “The IBM TotalStorage DS8000 Series: Concepts and Architecture”, SG24-6471-00

About the Author: Khurram Shiraz is a senior system administrator at KMEFIC, Kuwait. In his eight years of IT experience, he has worked mainly with IBM technologies and products, especially AIX, HACMP clustering, Tivoli and IBM SAN/NAS storage. He has also worked with the IBM Integrated Technology Services group. His areas of expertise include the design and implementation of high availability and DR solutions based on pSeries, Linux and Windows infrastructure. He can be reached at aix_tiger@yahoo.com.



Note: This article was originally published in October 2007 by Sys Admin Magazine US (www.samag.com). Many electronic versions of this article are now available on the internet.
