Friday 31 July 2009

How to work with IBM IVM?

Fortunately, I recently got an opportunity to work with IBM IVM, though of course in a very difficult customer situation.

When, on very short notice, I had to logically partition a pSeries p550 server into an AIX partition and a Linux partition, I initially thought it was going to be a very difficult task: the customer had no HMC at all, and IVM documentation is scarce and not particularly clear.

So here are the basic steps to implement IVM on an IBM pSeries server and partition it with the help of IVM.

1. First, make sure that your pSeries server comes with the standard virtualization license. Even if it does, there is a possibility that it is not enabled. Here the concept of ASMI comes in.

2. ASMI is a web-based tool which comes by default with IBM Power systems. Intel people can compare it with HP iLO functionality, as you can perform a large number of tasks at the hardware (or firmware) level. You can even restart the pSeries server or configure the speed of the serial ports through ASMI.

If you want to verify the virtualization capabilities of your pSeries server, access it through ASMI. There are two HMC-related ports (HMC1 and HMC2) which have default IP addresses (169.254.2.147 and 169.254.3.147), so you just have to assign an IP address from the same subnet to your laptop and use a straight Ethernet cable to connect to HMC1 or HMC2. Use the admin user with the admin password to log in to ASMI for the first time. Please note that until you change the default password of the admin user, ASMI will not allow you to perform any other task.

To access ASMI, point your PC browser (Netscape, IE or Mozilla) to https://169.254.2.147

3. Once you log in to ASMI, check the virtualization license on the IBM pSeries. It is also a good idea to shut down the server and reset it to the factory default settings.

4. Now boot the server into SMS mode, insert the VIO Server CD and start the VIO installation. Note that IVM is nothing but the VIO server itself. I used one of the internal disks to install the VIO server.

5. Once the installation is done, you have to accept the license on the VIO server, set the date and, if required, configure virtual Ethernet as well. Finally, assign an IP address to the VIO server using the mktcpip command, as sketched below.
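A minimal sketch of that sequence from the padmin restricted shell; the date values, hostname, interface and addresses are placeholders, and the exact chdate flags can vary by VIOS level:

license -accept
chdate -month 7 -day 31 -year 2009 -hour 10 -minute 0
mktcpip -hostname vios1 -inetaddr 192.168.1.10 -interface en0 -netmask 255.255.255.0 -gateway 192.168.1.1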

6. Now you can access IVM through a browser, by pointing it to http:// followed by the IP address you just assigned to the VIO server.

7. Using the IVM web-based interface, you can create new LPARs with different settings for processing units and memory. You can assign HEA ports to these LPARs for TCP/IP access. If present, you can assign dedicated FC and Ethernet adapters to these LPARs as well.

8. Finally, if you have SEA adapters on the VIO server, you can enable virtual Ethernet bridging and use these virtual Ethernet adapters on the client LPARs. A possible command-line equivalent is sketched below.
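IVM exposes this as a simple virtual Ethernet bridging checkbox in the GUI, but roughly the same result can be achieved from the VIOS command line; here ent0 is assumed to be the physical adapter and ent2 the virtual adapter, so adjust the names to your own configuration:

mkvdev -sea ent0 -vadapter ent2 -default ent2 -defaultid 1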

9. For disks, IVM allows you to create storage pools as well. Once IVM is installed, a default rootvg pool is created on one of the internal disks. If you have another internal disk, it is a good idea to mirror the VIO OS disk onto it (see the sketch below).
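A sketch of doing this from the padmin shell, assuming hdisk1 is the spare internal disk; mirrorios prompts to reboot the VIO server to finish the operation:

extendvg -f rootvg hdisk1
mirrorios hdisk1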

10. For client LPARs, you can use either internal disks or SAN disks. If you go for SAN disks through dedicated adapters, there is no need to create virtual disks at the VIO level. On the other hand, if you want to use internal disks (as in my case), you have to create another storage pool on one of the internal disks, then create virtual disks in that pool and assign them to the client LPARs. The client LPARs will access these virtual disks through virtual SCSI adapters (see the sketch below).
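This can all be done through the IVM GUI, but the command-line equivalent is a useful sketch; clientpool, hdisk2, lpar1_rootvg and vhost0 are placeholder names, so substitute your own pool, disk and virtual SCSI adapter:

mksp -f clientpool hdisk2
mkbdsp -sp clientpool 20G -bd lpar1_rootvg -vadapter vhost0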

11. Finally, you can assign the CD drive to the client LPARs, boot them into SMS mode one by one, and install the OS of your choice on these LPARs.

My own observation about IVM is that it has many limitations compared to the HMC, but it is still not a bad tool. Being based on WebSphere and Java, its performance is slightly slow; sometimes it times out and sometimes it crashes. Another thing is that an experienced HMC user still has to adjust his HMC knowledge to the IVM functionality and interface. This generally requires some time, so don't panic when you see the IVM interface for the first time. Be patient, spend some time using IVM, and soon you will be able to handle it with even more comfort than the HMC.

Thursday 30 July 2009

How to dynamically extend a VIOS-based rootvg!

POWER6 LPAR:
• Unmirror your rootvg and remove hdisk0 from the rootvg volume group (see the sketch below). If there are any paging or dump devices on this disk, you may need to remove them first before you can remove hdisk0 from the rootvg volume group.
• Once the disk has been removed from the rootvg, remove it from the LPAR by executing the following:
rmdev -l hdisk0 -d
• Now execute the bosboot command and update your bootlist, since hdisk0 has been removed and is no longer part of the system:
bosboot -a -d /dev/hdisk1
bootlist -o -m normal hdisk1
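A sketch of the unmirror and removal step from the first bullet above, assuming rootvg was mirrored across hdisk0 and hdisk1 and any paging or dump devices on hdisk0 have already been dealt with:

unmirrorvg rootvg hdisk0
reducevg rootvg hdisk0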
VIO Server (where hdisk0 was created):
• Remove the device from the VIO server using the rmdev command
rmdev -dev bckcnim_hdisk0
• Next you will need to access the AIX OS part of the VIO server by executing:
oem_setup_env
• Now you have two options: you can extend the existing logical volume or create a new one if there is enough disk space left. In this example I will be using bckcnim_lv.
smitty extendlv (and add additional LPs) or smitty mklv
• Exit it out of oem_setup_env by just typing exit at the OS prompt.
• Now that you are back within the restricted shell of the VIO server, execute the following command. You can use whatever device name you wish; I used bckcnim_hdisk0 just for example purposes:
mkvdev -vdev bckcnim_lv -vadapter <vhost#> -dev bckcnim_hdisk0
POWER6 LPAR:
• Execute cfgmgr to add the new hdisk0 back to the LPAR.
• Add hdisk0 back to the rootvg volume group using the extendvg command or smitty extendvg.
• Mirror rootvg using the mirrorvg command or smitty mirrorvg.
• Sync the mirrors in the background and wait for the process to complete.
• Now execute bosboot again and update the bootlist:
bosboot -a
bootlist -o -m normal hdisk0 hdisk1
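As a final sanity check on the LPAR, the following should show rootvg spanning both disks again and each logical volume with two copies:

lsvg -p rootvg
lsvg -l rootvg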

Saturday 25 July 2009

Should we consider using SSDs or not?

Until the costs drop even further, performance-boosting solid-state drives (SSDs) won't make economic sense for every type of application, so it's important to follow best practices to ensure they are working for your storage shop. Here are 10 SSD best practices to follow:
1. Identify I/O-intensive applications that will benefit from faster data storage.
Types of applications that may be well suited to SSD technology include databases, data mining, data warehousing, analytics, trading, high-performance computing, server virtualization, Web serving and email systems.
A check of how many enterprise-grade 15,000 rpm and 10,000 rpm hard disk drives are in use and how much money has been spent on DRAM for performance will help to determine if SSDs will be worth the investment.
Study application workloads and assess hot vs. warm vs. cold data sets. Active data can be directed to the flash solid-state drives, and the less frequently accessed data can go on Fibre Channel (FC) drives or SATA drives.
"If you have a good understanding of that, then you can understand how much solid-state storage you're likely to need to be able to optimize the performance of your system," said Jim Handy, an analyst who focuses on memory chips and SSDs at Objective Analysis.
2. Ensure that applications, especially those that are custom-written, can handle the faster solid-state drives.
"For most applications, this should not really be a problem, but depending on coding and timings, you can have the chance to have things done too quickly which can throw off timing a bit, as well as the processors actually jumping due to it not having to wait for the disk anymore," said Jon-Carlos Mayes, IT director at CCP hf, a Reykjavik, Iceland-based online game producer. CCP uses both DRAM and NAND flash SSD technology from Texas Memory Systems Inc.
3. When evaluating solid-state drive technology, concentrate on cost per IOPS, not cost per gigabyte.
"Focus on what would be the lowest overall system cost to get the throughput that you require," Handy said. "If you focus on cost per gigabyte, then a solid-state drive will always look bad because it ranges from 20 times [more than] the cost per gigabyte of a hard disk drive."
4. Make sure the performance and reliability of a vendor's SSDs can be measured in terms of random reads and writes across small blocks and pages.
"Vendors will quote you whatever they can do the best in the lab, and that may not be what you're actually running," said Joseph Unsworth, a research director at Gartner Inc. Once you determine which supplier can deliver the results you need, he added, have the vendor do a proof of concept and make sure your service-level agreement is tailored specifically to your application needs.
5. Determine which type of solid-state drive will be the best fit.
The chief SSD choice confronting users today is NAND flash or DRAM. DRAM SSDs are significantly faster and perform reads and writes equally well, but they're also considerably more expensive and consume more power. NAND flash SSDs -- whether single-level cell (SLC) or multilevel cell (MLC) -- perform better on reads than writes, and they wear out over time.
"Are you just accessing data? Or are you doing a lot of programming of data? That will determine whether or not you can go with a cheaper but less robust multilevel cell-based SSD, or you need to go with single-level cell," Unsworth said. "If you're doing a mix of both [reads and writes], then you're going to want to make sure that you're using the single-level cell technology over the multilevel cell."
The more expensive SLC solid-state drives are better suited to the enterprise because the wear life is longer for continuous writing than it is with MLC-based drives.
NAND Flash SSDs can be especially helpful for a read-intensive database table, for example, whereas DRAM SSDs -- whether in an appliance, in cache or in DRAM combined with flash -- would be a better option for transaction logs or journal files, where you're recording a copy of what's changing, said Greg Schulz, founder and analyst at StorageIO Group.
"I tell folks if they're going to need to have super transaction rate capability because they're running some data mining application, then one of the high-performance boxes like Violin or Texas Memory Systems has could be a pretty economical way to go," said Gene Ruth, a senior storage analyst at Burton Group. "The alternative would be to build out a huge hard disk drive-based system with all the power and space and maintenance and failure rates that go along with having lots of hard disks."
6. Consider NAND flash solid-state drives for caching purposes as a way to augment application performance.
Write caching is typically done at the storage device with DRAM cache that persists the writes to back-end storage. When that approach isn't fast enough, DRAM solid-state drives have been used to accelerate write-intensive applications. Now that lower cost flash is available, SLC SSDs will increasingly become the preferred option for write caching, especially when budgets are tight.
The effectiveness of any read caching layer depends on the size of the data set being accessed, the frequency with which the data is read and the performance of the cache. If the data set is small and being read on a frequent basis, server RAM usually suffices. But if the data set is large and the reads are random, flash SSD can work well. Although SLC and the less expensive MLC are both options, the more durable SLC is generally favored.
7. Consider solid-state drive over short-stroking.
Formatting a hard disk drive so that data is written only to the outer sector of the disk's platter can increase performance in high I/O environments, since it reduces the time the drive actuator needs to locate the data. But that practice, known as short-stroking, leaves a substantial percentage of the disk drive's capacity unused.
"You're deliberately not using what you bought. Because we're so used to it, people think that's how it has to be," said Mark Peters, an analyst at Enterprise Strategy Group. Even though SSD technology is nascent, he said, it already makes financial sense in scenarios where users now short-stroke. "Everyone says, 'I don't want to buy solid state because it's 10 times the cost [of hard disk drives],'" Peters said. "But you could take that 5 GB from 20 [hard disk] drives and put it on one solid-state [drive]."
8. Determine how much power your data center is consuming.

According to Peters, "You're either in parts of the country or a data center where you have oodles of power, or you're in parts of the country or a data center where actually you don't have much power left and you're going to hit the wall at some point. If you don't know how much [power] you're using, how can you know when that's going to happen?"
Because SSD technology is more energy efficient than hard disk drives, it can help to extend the life of a data center with power constraints. But, Peters said, many IT organizations have no clue what their electric usage is.
9. Experiment with solid-state drive technology in the lab.
"They definitely want to bring in, if not the individual disk devices, possibly a subsystem that's based on SSDs," Burton Group's Ruth said, possibly to target a particular application for test purposes. Even using SSDs in a laptop can help to illustrate the potential advantages, he added. "People are familiar with hard disks. They get that. They need to develop that comfort level with solid-state disks as well."
10. Make sure long-term planning takes into account a potential solid-state storage tier.
Solid-state drives will be integrated into storage systems as standard fare going forward, so IT organizations shouldn't lock themselves into a hard disk drive-only strategy. Instead, they need to entertain the possibility of an SSD tier, or tier 0, for their most I/O-intensive servers.

Friday 24 July 2009

When will the future of my country change?

I am a Pakistani professional and I love my country like all other Pakistani professionals. Even if I become a Canadian citizen or an Australian national, my roots will remain with my country ... with its green colors... There is no doubt about that...

I don't know about my next generation, who may not know a single word of the Pakistani national anthem, but for me it is still the song of my heart... I cannot change my roots... However, like millions of Pakistani professionals, I still wonder: why should I change my roots? The answer is probably that I am not optimistic about my own country's future...

When I see millions of illiterate Pakistanis from the rural areas of Sindh, Punjab, Balochistan and NWFP, looking for a single drop of clean water and running after UNO trucks carrying food aid, I feel sadness in my heart. When I receive calls from my relatives who live in the biggest city of Pakistan, complaining about excessive load-shedding, I feel ashamed... What am I currently doing for my fellow men? Nothing?

On the other hand, when I see the sons and daughters of our ruling-class politicians becoming our so-called "LEADERS", OUR PRIME MINISTERS and OUR PRESIDENTS, I feel that we Pakistanis are all responsible for what is currently happening to us. If we cannot select our leaders the right way, then what else can we do? Nothing?

Can anybody tell me what characteristics of these elite sons and daughters make them our next rulers? Nothing, except that they are the sons and daughters of some martyred or living leaders and they have studied in leading US/UK universities? Are they professional? Are they geniuses? Are they extraordinary in any sense? No, but they are the sons and daughters of Mr. X and Ms. X, and that's it!!!!
And this quality is enough to make them the next Pakistani presidents and prime ministers?

My nation is very talented. Every young man of my nation is professional in his work, but when election time comes, he cannot make the right decision... he cannot select the right leaders... Punjabis select Punjabi leaders, Sindhis select Sindhi leaders, NWFP people select Pushtoon leaders and Muhajirs select Urdu-speaking leaders... We cannot vote on the basis of the character, education, nobleness and professionalism of leaders, and that is why most international presidents and prime ministers are real professionals while ours are just...

When we will change this attitude of ours, I don't know, but one thing is sure: if we don't change it, we will remain the same... A talented nation, but with no future... surviving one night after another, but without light.

Saturday 18 July 2009

A visit to 360Mall in Kuwait



Yes, we had been planning to go to the newly opened 360 Mall (which is now the biggest mall in Kuwait) for the last three weeks, and we finally managed to go there yesterday.

OK, in my opinion, 360 Mall is a huge mall and its interior is really beautiful and fascinates its visitors, but as far as space is concerned I still consider the Avenues the most spacious mall in Kuwait.

There are a few very good things inside 360 Mall. First of all, the concept of having gardens inside the mall is really great. Second, the interior is really beautiful. They have laid down big carpets inside the mall (how they will manage these carpets with the passage of time, God knows). You will also find some really good traditional Arabic work, which will capture your mind and soul with its beauty.

The shops are still not fully open; however, you can see big brands present there. So hopefully this mall will be full of foreigners who will be happy to see their favorite brands present in Kuwait.

The food court is relatively small, and I think the mall management will have to think about expanding it in the near future; otherwise it will become difficult to accommodate so many people in this small food court.

I would recommend going there at least once just to have a look. For weekly grocery shopping, I would still say the Carrefour inside the Avenues is the better option.
Let's see whether, with the passage of time, this opinion of mine about 360 Mall changes or not.

Using Oracle RMAN to create backups on disk

Taking a physical backup of an Oracle database is no simple task. For years, DBAs depended on complex shell scripts that would extract lists of all the files that needed backup, build SQL and Unix commands to put tablespaces into backup mode and copy files, and monitor the process for exceptions. Backups for large databases could run for hours because the process simply made a copy of each database file, and there was no provision for incremental backups. Perhaps the deepest, darkest fear of any DBA was to perform a database recovery and realize that the entire plan was invalid due to the lack of one small, yet crucial, file.
Enter Recovery Manager, or RMAN, introduced with Oracle 8. RMAN is a feature-rich tool that allows an Oracle database to be backed up through a series of simple commands. It also interfaces easily with tape backup devices, and has provisions for five levels of incremental backup, among other things.
Although a full treatment of RMAN is enough to fill a book and thus well beyond the scope of this article, I will describe the basic steps required to use RMAN to take a full or incremental backup of a database, which will work with any version of Oracle -- from Oracle 8 through Oracle10g. I've successfully used these techniques to perform backups of Oracle databases on a variety of platforms, including HP-UX, Solaris, and Red Hat Linux.
RMAN vs. Scripted Backups
RMAN offers several advantages over scripted backups, the most significant of which is that its syntax is simpler and easier to learn and use. And because RMAN is supported by Oracle, any problems you encounter with RMAN backups or commands can be taken directly to Oracle for support. Simply put, RMAN is a less complex, proven, and more reliable method of achieving the same goal. Other advantages include:
• Reduced overhead. With scripted backups of online tablespaces, you must first place the tablespace into backup mode. Doing so causes any activity on that tablespace to be recorded as redo activity. For busy or large databases, this could become quite significant. With RMAN, that necessity is eliminated and the overhead is reduced.
• Smaller backup footprint. RMAN backups are often smaller than scripted file copies, because RMAN only backs up changed data blocks below the high water mark of a datafile.
• Support for incremental backups. There is no way with a simple file copy strategy to backup only the data that has changed since the last backup. With RMAN, five levels of incremental backups are available, potentially saving hours of backup time each week.
• Automated recovery. RMAN knows which files it needs to recover a database and restores only what it needs. Contrast this to a manual recovery, where queries against the database dictionary must be performed to identify which files are needed, followed by manual restores of those files. RMAN will simply recover your database in less time, with less intervention, and do it more safely than you could with a scripted backup of the database.
• All files are included automatically in a full backup. With scripted backups, the database files that need to be backed up must be manually coded in your backup scripts, or queries against database tables must be performed to identify which files need to be backed up. Any change to the database represents a change in your backup scripts that must somehow be managed. With RMAN, that file management is internalized and automatic.
• Verification of your backup's veracity. With RMAN, you can simulate a recovery without actually restoring your files and be confident that, should disaster strike, your database will be recoverable.
• Automated management of your backup files. By setting a retention policy, RMAN can purge obsolete backups from your system and even be used to manage archived log files. Using a retention policy, RMAN will never remove a backup set that is required to recover your database fully. So even if a backup fails and the last valid files are beyond the retention period, RMAN will leave them intact. RMAN can also be used to manage your archive log files.
Logging into RMAN
RMAN is invoked through the rman executable under ORACLE_HOME/bin and is part of the standard installation of Oracle. You must provide RMAN with some basic information before you can get started. RMAN needs the login information to the target database (which is the database we'll perform backups against) and information about the recovery catalog that will be used.
A recovery catalog is an Oracle database that stores information about RMAN backups. In the absence of a recovery catalog, recovery information is stored in the database control file. Most Oracle installations do not use a separate recovery catalog, and the included scripts reflect this. However, it is a simple matter of changing the connection strings to utilize a recovery catalog if you wish to do so.
If ORACLE_HOME is in your PATH, invoke RMAN as:
rman target user/password nocatalog (without a catalog)
Or:
rman target user/password catalog rmanuser/rman_password@catalog (with a catalog)
Those familiar with SQL*Plus will recognize the connect strings that identify the username, password, and database alias. Note that the login to the target database must be as a user with SYSDBA privileges.
RMAN also supports the use of OS-authenticated logins. OS-authenticated users provide an extra layer of security and convenience for batch operations. Consider the following login string, which can be executed at either the command prompt or within a shell script:
rman target system/manager nocatalog
Now, from another session, execute:
bash-2.03$ ps -ef | grep rman | grep -v grep
oracle 14547 13399 0 12:01:57 pts/1 0:01 rman target system/manager nocatalog
There's the username and password for the world to see! If this is being run via a script, somewhere on the system will be a file with a hard-coded password. With an OS-authenticated account, a login to RMAN is simply:
rman target / nocatalog
No password information shows up under ps -ef, and access is controlled by OS authentication, where it is much easier to enforce strong passwords.
Backing Up a Database
Now that we've successfully logged into RMAN, let's take a look at a typical RMAN login and backup command:
bash-2.03$ rman target / nocatalog

Recovery Manager: Release 10.1.0.2.0 - Production
Copyright © 1995, 2004, Oracle. All rights reserved.

Connected to target database: ORA10G (DBID=3190394834)
Using target database control file instead of recovery catalog

RMAN> run {
2> allocate channel d1 type disk maxpiecesize=2000M;
3> backup database plus archivelog;
4> }
The first thing we see is the run block. In versions prior to Oracle9i, it is necessary to enclose most RMAN commands within a run { } block. (Beginning in Oracle9i, the run { } block is still supported syntax but is not necessary.) There is no performance loss incurred by using the run block, so I prefer this syntax because it is compatible across all installations.
The first step within the run block is to allocate a channel for RMAN to use for the backup command itself. Allocating a channel allows us to limit the size of any file created (in the example it is limited to 2,000MB) and define the location to which the files will be backed up. It also determines whether the output of the following commands will be directed to disk or to a tape drive. Allocating a second or third channel will cause RMAN to parallelize the operation whenever possible (provided you're using the Enterprise Edition of the database).
Each channel allocated will also result in one or more backup sets being created. A backup set is a file that contains backup information for one or more Oracle datafiles or archive log files. The elegance of RMAN is that a backup set is often smaller than the sum of its parts. For example, consider a datafile that is 500MB, but only contains 250MB of data. A straight file copy results in a 500MB file. But in RMAN, that file will be significantly smaller because the empty space is not written -- only the data.
The backup command can be used to backup an entire database, a control file, archive log files, or one or more individual datafiles or tablespaces. Where once pages of scripts took tablespaces in and out of backup mode and copied files, a backup of your entire database using RMAN is now potentially as simple as:
backup database plus archivelog;
Of course, although it could be that simple, it usually isn't. Often we'll want to extend more control over the backup process. This is a more typical set of commands to back up a database:
run {
allocate channel c1 type disk;
allocate channel c2 type disk;
backup full
filesperset 5
tag full_backup
format
'rman_%d_%t_%U.bus' database include current controlfile;
sql 'alter system archive log current';
backup filesperset 50
archivelog all
format
'rman_%d_%t_%U.bar';
release channel c1;
release channel c2;
}
Although slightly more complicated than the previous example, this set of commands is still fairly straightforward. The differences between this and the previous simple run block are the addition of the filesperset, tag, and format options, the inclusion of the control file in the backup set, the addition of a SQL command, and a separate command to back up the archive log files.
None of these are Earth-shattering changes. filesperset limits the number of datafiles included in each backup set. The tag is a way of naming a backup so that queries against the recovery catalog will make a little more sense to the operator. The format command defines how the files generated by the backup will be named. In this case, there are three wildcard options used in the filename. %d and %t insert the database SID and current time, respectively, and %U adds a unique identifier. RMAN offers several other wildcards besides these as well.
The include current controlfile clause does just that -- it includes the current control file in the database backup set. The control file contains information about the current state of Oracle's database files and is required to restore the database. I suggest that you include it in any backup that you perform, since changes to the physical and logical structure of the database are included in the control file. You will be unable to restore your database if you're using an older or obsolete version of the control file that does not reflect such changes.
As shown in the example, we can include regular SQL commands within RMAN, too. The SQL command included here causes the current redo log file to be written to disk, so the backup of all the archive log files will enable a full and consistent recovery of the database. Once the redo logs are flushed to the archive logs, we take a backup of all the archive log files.
Note that RMAN is a very verbose tool. Most successful commands will generate a large volume of informational messages, which is often disquieting to users new to RMAN. If you're used to SQL*Plus or the Oracle alert log, which return ORA-nnnnn errors to indicate a problem with the database or failure of a command, RMAN will definitely surprise you. Don't be alarmed. You can identify a failure in an RMAN command by looking for the RMAN-00569 condition, which is a general indication that an error stack will follow. When running RMAN from within a shell script, you simply test the exit status for a non-zero value. Listing 1 provides an example of testing the exit status of RMAN from within a shell script.
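Listing 1 itself is not reproduced here, but the core idea of testing RMAN's exit status from a shell script looks something like the following; the command file, log path and mail address are placeholders:

rman target / nocatalog cmdfile=/home/oracle/dba/full_backup.rcv > /home/oracle/logs/rman_backup.log 2>&1
if [ $? -ne 0 ]; then
    mailx -s "RMAN backup failed on `hostname`" dba@example.com < /home/oracle/logs/rman_backup.log
fi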
Incremental Backups
Note that the database backup command in the previous run block specifies "full". It is also possible to perform incremental backups of an Oracle database using RMAN. RMAN offers five incremental levels of backup. Level 0 is the equivalent of a full backup. Levels 1 through 4 provide fine-grained control over the amount of data that is backed up. For example, a level 1 backup only records changes made to the database since the last full backup. A level 2 backup only backs up data changed since the last level 1 backup, and so on.
Incremental backups allow you to perform full backups less frequently, while still being protected against data loss and minimizing both the time required to back up the database and the space on disk or tape to store the backup.
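For instance, a weekly level 0 and nightly level 1 scheme might use run blocks along these lines (the channel settings and tags here are just illustrative):

run {
allocate channel d1 type disk;
backup incremental level 0 tag weekly_level0 database;
}

run {
allocate channel d1 type disk;
backup incremental level 1 tag nightly_level1 database;
}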
File Copies
If you are used to or prefer file copies, RMAN can accommodate you as well. Without RMAN, to get a consistent copy of a datafile, you would have to place the tablespace into backup mode, copy the file using OS commands, then take the tablespace out of backup mode.
With RMAN, the whole process is handled automatically and simply through the copy command:
RMAN> copy datafile '/u02/oradata/ora10g/system01.dbf'
2> to '/u12/backups/ora10g/system01.dbf';
The copy command can only be used to copy files to disk, and it copies the file bit-for-bit, so the advantages of potentially smaller backup sizes from the backup command are lost. However, if you prefer file copy operations, the copy command is far superior to the alternative hot backup method.
Putting It to Work
Listing 1, rman_backup, includes a basic RMAN run block as part of a shell script that can be used to perform a full or incremental hot backup of your database. It accepts three command-line parameters. The first is the database SID and is required. The second is the destination for the backups. This can be a full path to a disk directory or simply the value TAPE if you are writing the backups to a tape drive. Third is the incremental backup level. The default is to perform a full (level 0) backup if no value is provided. It can be called from a cron job using syntax similar to the following:
00 1 * * 0 /home/oracle/dba/rman_backup ora10g TAPE 0 > \
/home/oracle/logs/rman_backup.log
Included near the top of the script is an email address to which administrative alerts are to be sent. If either of the RMAN commands in the script fails, the DBA (or other responsible party) is notified via email or pager alert.
The script verifies that the database SID being called is legitimate, then lists information about the environment, including database connections, disk space (for disk backups), and memory before beginning the backup itself. The backup command utilized is very similar to the one shown above. Once the backup is complete, the same diagnostic information is listed for comparison purposes and can be used to identify problems with the RMAN process.
Listing and Validation
Once the backup is complete, the script invokes RMAN a second time to list and validate backups on the system. The listing of the backups is informational only, but the following command:
restore database validate
causes RMAN to emulate a restore of the database to make sure that the backup is complete and valid. This is useful as a sanity check of the process and verifies that the backup sets that have been created could actually be used to recover your database if needed.
This script calls the file in Listing 2, oracle_env, which sets up the environment for Oracle when the parent script is called through a cron job.
Space Management
One feature that makes Oracle such a robust database is its ability to recover right up until a failure. Oracle uses archive log files to store every change made to a database. These are written to a location on disk known as the archive log destination. Left unattended, they will eventually consume all the space on the drive. If this happens, Oracle will, by design, stop working until space has been freed up on the system.
Oracle DBAs used to either manually purge these files or write a shell script to automatically delete files older than a few days, but there was no check to make sure that the files deleted were genuinely obsolete. RMAN, however, offers a few options to more efficiently manage archive log files.
Beginning in Oracle9i, the DBA can set a retention policy for backup sets as a whole. This includes database backups and archive log backups:
RMAN> configure retention policy to recovery window of 3 days;
When a backup becomes older than three days, RMAN will mark it as obsolete. Deleting these old backups is as simple as allocating a maintenance channel (similar to allocating a disk or tape channel) and telling RMAN to delete them. The beauty of this method is that RMAN knows whether a backup older than three days is still needed for a full recovery. If so, it won't be marked as obsolete (and won't be deleted):
RMAN> run {
2> allocate channel for maintenance device type disk;
3> delete obsolete;
4> }
This method is convenient if you're storing backups and archive log files on disk.
The second method works well when your backup policy writes files to tape and is available for all versions of RMAN. It simply requires the addition of the option delete input to the archive log backup command. For example:
backup filesperset 50
archivelog all
format
'rman_%d_%t_%U.bar' delete input;
Once the archive log file is backed up, Oracle deletes the files automatically. This command can be added to the rman_backup script, which can be run at a regular interval not only to back up the database and archive log files but to keep the archive log destination clean.
Another solution is included in Listing 3, which mimics the functionality of setting a retention policy for pre-Oracle9i installations. It requires two parameters -- the database SID, and a retention period (in days) inside which files are to be kept. This script calls SQL*Plus to identify files that are outside the retention period and that are not needed to fully back up the database in the event of failure. It then removes these files using the OS rm command.
Once the files are removed from the system, they need to be removed from the recovery catalog. This is accomplished by allocating a maintenance channel then performing the two crosscheck commands. Crosscheck verifies that the files in the recovery catalog are actually available in their expected location. When a backup piece is missing, RMAN marks it as "EXPIRED".
The RMAN delete command can then be used to remove recovery catalog references to expired files. In the script, this is accomplished with three commands:
delete noprompt expired backup of database;
delete noprompt expired backup of controlfile;
delete noprompt expired backup of archivelog all;
Summary
Oracle's RMAN is a robust and feature-rich application that simplifies backing up Oracle databases and managing the files created by the backups. It should be the tool of choice for those responsible for managing backup and recovery of Oracle databases. RMAN's ability to interface with popular media management tools, such as those offered by Veritas and HP, makes it as easy to back up to tape as to disk.
The multiple levels of incremental backups make RMAN an excellent tool for managing backups of large databases, as well as high-transaction environments where a minimal performance impact from backups is required. The simplicity of commands makes it accessible to every DBA, regardless of their level of experience, and puts a successful backup policy within their grasp.

Friday 17 July 2009

Unix Versus Windows- Which OS to choose?

1. Availability:

UNIX: The UNIX operating system and kernel are, by design, largely free from viruses, worms and malicious system threats, including spyware.

WINDOWS: The Windows operating system is inherently susceptible to a large number of viruses, worms and Trojan horses. These are potential threats to system stability and may crash even a powerful production server running on Windows, or even corrupt the production data.


Additional Comments:
According to Dr. Nic Peeling and Dr. Julian Satchell's Analysis of the Impact of Open Source Software: "There are about 60,000 viruses known for Windows, 40 or so for the Macintosh, about 5 for commercial Unix versions, and perhaps 40 for Linux. Most of the Windows viruses are not important, but many hundreds have caused widespread damage. Two or three of the Macintosh viruses were widespread enough to be of importance. None of the Unix or Linux viruses became widespread - most were confined to the specific files."


2. Downtime Costs:

UNIX: Downtime costs are low in the case of UNIX.

WINDOWS: Downtime costs are higher in the case of the Windows OS.


Additional Comments:
Windows server downtime costs companies two to three times as much as Linux/UNIX server downtime. This is not due to any inherent flaws in the Windows server OS, but rather reflects the crucial nature of the data and applications running on Windows servers. Windows application servers racked up the biggest downtime expenses: $5,624 per hour versus $1,168 in hourly downtime costs for comparable Linux servers.
Source :
http://www.iaps.com/Linux-Windows-TCO-Survey-2005.04.html



3. Stability:

UNIX: Unix is a more stable operating system and does not go down as often as Windows does, and therefore requires less administration and maintenance.

WINDOWS: The blue screen is a common Windows dilemma. System crashes are quite common, due to many possible reasons, both hardware and software.

Additional Comments:

According to the latest server reliability survey (2008, http://www.iaps.com/2008-server-reliability-survey.html), UNIX has a clear edge over Windows in terms of reliability and availability. According to the same survey report, IBM AIX is the leading UNIX variant in terms of least downtime, with HP-UX at number two.


4. Performance:

UNIX: Unix systems typically deliver much greater processing power than Windows systems.

WINDOWS: The Windows operating system runs on CISC-architecture machines and is a bit slower. Moreover, recent Microsoft attempts to increase OS hardening have also reduced Windows performance a little.

Most of the leading UNIX servers are RISC-architecture based and are therefore much more performance oriented as compared to Windows servers.


5. System Upgrades:

UNIX: Software upgrades are straightforward and do not require many prerequisites. Moreover, the time required for patch management is not high.

WINDOWS: Software upgrades from Microsoft often require the user to purchase new or additional hardware or prerequisite software. The time required for patch management is quite high due to the large number of patches released by Microsoft.

6. Security:

UNIX: Unix has greater built-in security and permission features than Windows.


Additional Comments: For example, on a Windows system, programs installed by a non-Administrative user can still add DLLs and other system files that can be run at a level of permission that damages the system itself. Even worse, the collection of files on a Windows system - the operating system, the applications, and the user data - can't be kept apart from each other. Things are intermingled to a degree that makes it unlikely that they will ever be satisfactorily sorted out in any sensibly secure fashion.


7. Manageability:

UNIX: UNIX system administration is available through the command line as well as through GUIs (KDE, the X Window System, GNOME).

WINDOWS: Windows administration is more user friendly and is available through a GUI.


Traditionally, UNIX has been considered a command-line-oriented operating system; however, the availability of the X Window System, GNOME and the like has changed that perspective, and UNIX administration is far easier than it was ten years ago. Still, in terms of user friendliness, Windows has an edge.

8. System Recovery:

UNIX: System recovery time is slightly higher in the case of UNIX systems.

WINDOWS: System recovery is faster on Windows systems.

According to a survey, Linux servers take nearly four hours, or 30% longer, to recover from a security attack than a similar Windows server. The survey respondents revealed that it took them 17 hours on average for their Linux servers to recover from a security attack, compared to an average recovery time of 13.2 hours for Windows servers.
Source: http://www.iaps.com/Linux-Windows-TCO-Survey-2005.04.html

9. Virtualization:

UNIX: Virtualization is built in for most of the latest commercially available UNIX variants. AIX, for example, supports LPARs, Micro-Partitioning, PLM and WPARs for virtualization.

WINDOWS: Virtualization is available through third-party tools like VMware; however, it has certain limitations.

Thursday 9 July 2009

Building a highly available solution using RHEL cluster suite

“When mission critical applications fail, so does your business.” This is a true statement in today’s environments, where most organizations are spending millions of dollars on making their services available 24x7x365. Organizations, regardless of whether they are serving external or internal customers, are deploying high availability solutions to make their applications highly available.

In view of the growing demand for high availability solutions, almost every IT vendor now provides a high availability solution for its specific platform. Well-known commercial high availability solutions include IBM's HACMP, Veritas Cluster Server and HP Serviceguard.

If you go for a commercially sold high availability solution on Red Hat Enterprise Linux, probably the best choice would be Red Hat Cluster Suite itself.
In early 2002 Red Hat introduced the first member of its Red Hat Enterprise Linux family of products, Red Hat Enterprise Linux AS (originally called Red Hat Linux Advanced Server). Since then the family of products has grown steadily and now includes Red Hat Enterprise Linux ES (for entry/mid-range servers) and Red Hat Enterprise Linux WS (for desktops/workstations). These products are designed specifically for use in enterprise environments to deliver superior application support, performance, availability and scalability.
The original release of Red Hat Enterprise Linux AS, version 2.1, included a high availability clustering feature as part of the base product. This feature was not included in the smaller Red Hat Enterprise Linux ES product. However, with the success of the Red Hat Enterprise Linux family it became clear that high availability clustering was a feature that should be made available for both AS and ES server products. Consequently, with the release of Red Hat Enterprise Linux, version 3, in October 2003, the high availability clustering feature was packaged into an optional layered product, called Red Hat Cluster Suite, and certified for use on both the Enterprise Linux AS and Enterprise Linux ES products.

It should be noted that RHEL Cluster Suite is a separately licensed product and must be purchased from Red Hat on top of the base Red Hat Enterprise Linux ES license.


1.0 Red Hat Cluster Suite Overview


Red Hat Cluster Suite comprises two major features: one is Cluster Manager, which provides high availability, while the other is called IP Load Balancing (originally called Piranha). Cluster Manager and IP Load Balancing are complementary high availability technologies that can be used separately or in combination, depending on application requirements. Both of these technologies are integrated in Red Hat Cluster Suite.


In this article, we will focus on Cluster Manager, as it is the feature mainly used for building high availability solutions.

1.1 Software Components

From a software component and subsystem point of view, the following are the major components of RHEL Cluster Manager:

• Fence (fenced): provides the fencing infrastructure for specific hardware platforms.
• DLM (libdlm, dlm-kernel): contains the distributed lock management (DLM) library.
• Cman (cman): contains the Cluster Manager (CMAN), which is used for managing cluster membership, messaging, and notification.
• GFS and related locks (lock_nolock): contains shared filesystem support that can be concurrently mounted on multiple nodes.
• GULM (gulm): contains the GULM lock management userspace tools and libraries (an alternative to using CMAN and DLM).
• Rgmanager (clurgmgrd, clustat): manages cluster services and resources.
• CCS (ccsd, ccs_test and ccs_tool): contains the cluster configuration services daemon (ccsd) and associated files.
• Cluster Configuration Tool (system-config-cluster): used to graphically configure the cluster and display the current status of the nodes, resources, fencing agents, and cluster services.
• Magma (magma and magma-plugins): contains an interface library for cluster lock management and the required plug-ins.
• IDDEV (iddev): contains libraries used to identify the file system (or volume manager) in which a device is formatted.


1.2 Shared Storage & Data Integrity

Lock management is a common cluster-infrastructure service that provides a mechanism for other cluster infrastructure components to synchronize their access to shared resources. In a Red Hat cluster, DLM (Distributed Lock Manager) or, alternatively, GULM (Grand Unified Lock Manager) are the possible lock manager choices. GULM is a server-based unified cluster/lock manager for GFS, GNBD, and CLVM, and it can be used in place of CMAN and DLM. A single GULM server can be run in stand-alone mode but introduces a single point of failure for GFS. Three or five GULM servers can also be run together, in which case the failure of one or two servers, respectively, can be tolerated. GULM servers are usually run on dedicated machines, although this is not a strict requirement.
In my cluster implementation, I used DLM, which runs on each cluster node. DLM is a good choice for small clusters (up to two nodes) as it removes the quorum requirements imposed by the GULM mechanism.
Based on DLM or GULM locking functionality, there are two basic techniques that a RHEL cluster can use to ensure data integrity in concurrent-access environments. The traditional way is the use of CLVM, which works well in most RHEL cluster implementations with LVM-based logical volumes.

Another technique is GFS. It is a cluster file system which allows a cluster of nodes to simultaneously access a block device that is shared among the nodes. It employs distributed metadata and multiple journals for optimal operation in a cluster. To maintain file system integrity, GFS uses a lock manager (DLM or GULM) to coordinate I/O. When one node changes data on a GFS file system, that change is immediately visible to the other cluster nodes using that file system.

Hence, when you are implementing a RHEL cluster with a concurrent data access requirement (as in the case of an Oracle RAC implementation), you can use either GFS or CLVM. In most such Red Hat cluster implementations, GFS is used with a direct-access configuration to the shared SAN from all cluster nodes. However, for the same purpose, you can also deploy GFS in a cluster that is connected to a LAN with servers that use GNBD (Global Network Block Device) or iSCSI (Internet Small Computer System Interface) devices.
It must be noted that both GFS and CLVM use locks from the lock manager. However, GFS uses these locks to synchronize access to file system metadata (on shared storage), while CLVM uses them to synchronize updates to LVM volumes and volume groups (also on shared storage).
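As a rough illustration of the GFS route (which I did not use in this particular setup), creating and mounting a GFS filesystem with DLM locking looks roughly like this; the cluster name, filesystem name, journal count, device and mount point are all placeholders:

gfs_mkfs -p lock_dlm -t mycluster:gfsdata -j 2 /dev/sda3
mkdir -p /shared/data
mount -t gfs /dev/sda3 /shared/data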
For non-concurrent RHEL cluster implementations, you can rely on CLVM or use the native RHEL filesystems (such as ext2 and ext3). As data integrity issues are minimal for non-concurrent-access clusters, I tried to keep my cluster implementation simple by using the native RHEL OS techniques.

1.3 Fencing Infrastructure

Fencing is also an important component of every RHEL-based cluster implementation. The main purpose of fencing is to ensure data integrity in a clustered environment.
In fact, to ensure data integrity, only one node can run a cluster service and access the cluster-service data at a time. The use of power switches in the cluster hardware configuration enables a node to power-cycle another node before restarting that node's cluster services during the failover process. This prevents two systems from simultaneously accessing the same data and corrupting it. It is strongly recommended that fence devices (hardware or software solutions that remotely power, shut down, and reboot cluster nodes) are used to guarantee data integrity under all failure conditions. Software-based watchdog timers are an alternative way to ensure correct operation of cluster service failover; however, in most RHEL cluster implementations hardware fence devices are used, such as HP iLO, APC power switches, IBM BladeCenter devices and the Bull NovaScale Platform Administration Processor (PAP) interface.
It must be noted that for RHEL cluster solutions with shared storage, implementation of the fence infrastructure is a mandatory requirement.


2.0 Step-by-Step Implementation of a RHEL Cluster

Implementation of a RHEL cluster starts with the selection of proper hardware and connectivity. In most implementations (without IP load balancing), shared storage is used with two or more servers running the RHEL operating system and RHEL Cluster Suite.
A properly designed cluster, whether you are building a RHEL-based cluster or an IBM HACMP-based cluster, should not contain any single point of failure. Keeping this in mind, you have to remove any single points of failure from your cluster design. For this purpose, you can place your servers physically in two separate racks with redundant power supplies. You also have to remove single points of failure from the network infrastructure used for the cluster. Ideally, you should have at least two network adapters on each cluster node, and two network switches should be used for building the network infrastructure for the cluster implementation.



2.1 Software Installation

Building a RHEL cluster starts with the installation of RHEL on the two cluster nodes. In my setup I have two HP ProLiant (DL740) servers with shared fiber storage (an HP MSA1000 storage array).
I started with the RHEL v4 installation on both nodes. It is always better to install the latest available operating system version and update; I selected v4 update 4, which was the latest version of RHEL while I was building this cluster. If you have a valid software subscription from Red Hat, you can log in to Red Hat Network and go to the software channels to download the latest update available there. Once you have downloaded the ISO images, you can burn them to CDs using any appropriate software.
During the RHEL OS installation you will go through various configuration selections, the most important of which are the date and time zone configuration, the root user password, the firewall settings and the OS security level selection. Another important configuration option is the network settings; these can be left for a later stage, especially when building a high availability solution with EtherChannel (Ethernet bonding) configuration.

After the OS installation, it is always a good idea to install the necessary drivers and hardware support packages. In my case, since an HP hardware platform was used, I downloaded the RHEL support package for the DL740 servers (the HP ProLiant Support Pack, which is available from http://h18004.www1.hp.com/products/servers/linux/dl740-drivers-cert.html).

The next step is the installation of the cluster software package itself. This package is again available from Red Hat Network, and you should select the latest available cluster package there. I selected rhel-cluster-2.4.0.1 for my setup, which was the latest cluster suite available at that time.
Once downloaded, it will be in tar format. Extract it and then install at least the following RPMs so that a RHEL cluster with DLM can be installed and configured (see the example rpm command after this list):

magma and magma-plugins
perl-Net-Telnet
rgmanager
system-config-cluster
dlm and dlm-kernel
dlm-kernel-hugemem and SMP support for DLM
iddev and ipvsadm
cman, cman-smp, cman-hugemem and cman-kernelheaders
ccs
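
A rough example of installing them in one go from the directory where the cluster suite tarball was extracted (package file names and versions will differ, so treat this as a sketch):

cd /tmp/rhel-cluster-2.4.0.1
rpm -ivh magma-*.rpm magma-plugins-*.rpm perl-Net-Telnet-*.rpm rgmanager-*.rpm \
    system-config-cluster-*.rpm dlm-*.rpm dlm-kernel-*.rpm iddev-*.rpm \
    ipvsadm-*.rpm cman-*.rpm ccs-*.rpm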

It is always a good idea to restart both RHEL cluster nodes after installing the vendor hardware support drivers and the RHEL cluster suite.

2.2 Network Configuration

For network configuration, the best way is to use the network configuration GUI. However, if you plan to use Ethernet channel bonding, the configuration steps are slightly different.

Ethernet channel bonding allows for a fault-tolerant network connection by combining two Ethernet devices into one virtual device. The resulting channel-bonded interface ensures that if one Ethernet device fails, the other device becomes active. Ideally, the connections from these Ethernet devices should go to separate Ethernet switches or hubs, so that the single point of failure is eliminated even at the switch and hub level.
To configure two network devices for channel bonding, perform the following on node1:
1. Create the bonding device in /etc/modules.conf. For example, I used the following entries on each cluster node:
alias bond0 bonding
options bonding miimon=100 mode=1
2. This loads the bonding device with the bond0 interface name, as well as passes options to the bonding driver to configure it as an active-backup master device for the enslaved network interfaces.
3. Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 configuration file for eth0 and /etc/sysconfig/network-scripts/ifcfg-eth1 for eth1, so that both files show identical contents, as shown below:
DEVICE=ethx
USERCTL= no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
4. This will enslave ethX (replace X with the assigned number of the Ethernet devices) to the bond0 master device.
5. Create a network script for the bonding device (for example, /etc/sysconfig/network-scripts/ifcfg-bond0), which would appear like the following example:
DEVICE=bond0
USERCTL=no
ONBOOT=yes
BROADCAST=172.16.2.255
NETWORK=172.16.2.0
NETMASK=255.255.255.0
GATEWAY=172.16.2.1
IPADDR=172.16.2.182
6. Reboot the system for the changes to take effect.

Similarly, on node 2, repeat the same steps, with the only difference being that the /etc/sysconfig/network-scripts/ifcfg-bond0 file will contain an IPADDR entry with the value 172.16.2.183.

As a result of these configuration steps, you end up with two RHEL cluster nodes with IP addresses 172.16.2.182 and 172.16.2.183 assigned to the virtual Ethernet channels (each backed by two physical Ethernet adapters).
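To confirm that bonding is active and to see which slave interface is currently carrying the traffic, you can check the bonding driver's status file on each node:

cat /proc/net/bonding/bond0
ifconfig bond0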

Now you can easily use the network configuration GUI on the cluster nodes to set the other network configuration details, such as the hostname and the primary/secondary DNS servers. I set Commsvr1 and Commsvr2 as the hostnames for the cluster nodes and also ensured that name resolution of both long and short names works fine from both the DNS server and the /etc/hosts file.

Note that the RHEL cluster, by default, uses /etc/hosts for node name resolution. The cluster node name needs to match the output of uname -n or the value of HOSTNAME in /etc/sysconfig/network.
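A quick way to double-check this on node1, for example:

uname -n                                # should print Commsvr1
grep HOSTNAME /etc/sysconfig/network    # should show HOSTNAME=Commsvr1 (or its fully qualified form)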

############################################################
The contents of the /etc/hosts file on each server are as follows:
############################################################
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
172.16.2.182 Commsvr1 Commsvr1.kmefic.com.kw
172.16.2.183 Commsvr2
172.16.1.186 Commilo1 Commilo1.kmefic.com.kw
172.16.1.187 Commilo2 Commilo2.kmefic.com.kw
172.16.2.188 Commserver
192.168.10.1 node1
192.168.10.2 node2
172.16.2.4 KMETSM


################################################################

If you have an additional Ethernet interface in each cluster node, it is a good idea to configure a separate IP network as an additional network for heartbeats between the cluster nodes. Note that the RHEL cluster uses eth0 on the cluster nodes for heartbeats by default; however, it is still possible to use other interfaces for additional heartbeat exchanges.
For this type of configuration, simply use the network configuration GUI to assign IP addresses such as 192.168.10.1 and 192.168.10.2 on eth2 and have them resolved from the /etc/hosts file.


2.3 Setup of Fencing Device



In my case, as HP hardware was being used, I relied on HP ILO devices as the fencing devices for my cluster. You may, however, consider other fencing devices, depending on the hardware being used in your cluster configuration.

To configure HP ILO, reboot your servers and press the F8 key to enter the ILO configuration menus. Basic configuration is relatively simple: you just have to assign IP addresses and names to the ILO devices. I assigned 172.16.1.100 with the name Commilo1 to the ILO device on node1, and 172.16.1.101 with the name Commilo2 on node2. Be sure, however, to connect the Ethernet cables to the ILO adapters, which are usually marked clearly on the back side of HP servers.
Once rebooted, you can use a browser on your Linux servers to access the ILO devices. The default username is Administrator, with a password that is usually printed on a hard-copy tag physically attached to the HP server. Later, you can change the Administrator password to one of your own choice, using the same web-based ILO administration interface.
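Once the ILO devices are reachable over the network, it is worth confirming that the cluster's fence agent can actually talk to them. The sketch below assumes the fence_ilo agent shipped with the cluster suite; the option letters follow the usual fence-agent conventions, so verify them against fence_ilo -h on your installation:

fence_ilo -a Commilo1 -l Administrator -p <ilo-password> -o status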

2.4 Setup of Shared storage drive and Quorum partitions


In my environment, I used an HP fibre-based shared storage array, an MSA1000. I configured a RAID-1 array of 73.5 GB using the HP Smart Array utility and then assigned it to both of my cluster nodes using the selective host presentation feature.
After rebooting both nodes, I used HP fibre utilities such as hp_scan so that both servers could see the array physically.
To verify the physical availability of the shared storage to both cluster nodes, you can look into the /proc/partitions file and look for entries such as sda or sdb, depending on your environment.
Once you find your shared storage at the OS level, you have to partition it according to your cluster storage requirements. I used the parted tool on one of my cluster nodes to partition the shared storage.

I created two small primary partitions to hold raw devices, and a third primary partition to hold the shared data filesystem:


Parted> select /dev/sda
Parted> mklabel msdos
Parted> mkpart primary ext3 0 20
Parted> mkpart primary ext3 20 40
Parted> mkpart primary ext3 40 40000

I rebooted both cluster nodes and then created the /etc/sysconfig/rawdevices file with the following contents:
#####################################################################
/dev/raw/raw1 /dev/sda1
/dev/raw/raw2 /dev/sda2
#######################################################################
A restart of the rawdevices service on both nodes will configure the raw devices as quorum partitions:


/home/root> service rawdevices restart
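You can then verify the raw device bindings on each node:

raw -qa    # should show /dev/raw/raw1 and /dev/raw/raw2 bound to the /dev/sda1 and /dev/sda2 devices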

I then created an ext3 (journaled) filesystem on the third primary partition using the mke2fs command. However, the corresponding entry should not be put in the /etc/fstab file on either cluster node, as this shared filesystem will be under the control of rgmanager of the cluster suite.

/home/root> mke2fs -j -b 4096 /dev/sda3

You can now create a directory called /shared/data on both nodes and verify the accessibility of the shared filesystem by mounting it on each cluster node in turn (mount /dev/sda3 /shared/data). However, never try to mount this filesystem on both cluster nodes simultaneously, as doing so might corrupt the filesystem itself.
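For example, on node1 (and afterwards, once it is unmounted, the same check on node2):

mkdir -p /shared/data
mount /dev/sda3 /shared/data
df -h /shared/data
umount /shared/data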



2.5 Cluster Configuration

Almost everything required for the cluster infrastructure is now in place, so the next step is configuration of the cluster itself.

A RHEL cluster can be configured in many ways. However, the easiest way is to use the RHEL GUI (System Management -> Cluster Management -> Create a Cluster).
I created a cluster named Commcluster, with node names Commsvr1 and Commsvr2.
I added fencing to both nodes, using the fencing devices Commilo1 and Commilo2 respectively, so that each node has one fence level with one fence device. If you have multiple fence devices in your environment, you can add another fence level with more fence devices to each node.
I also added a shared IP address of 172.16.2.188, which will be used as the service IP address for this cluster. This is the IP address that should also be used as the service IP address for applications or databases (for example, for listener configuration if you are going to use an Oracle database in the cluster).
I added a failover domain named Kmeficfailover with priorities given in the following sequence:
Commsvr1
Commsvr2

I added a service called CommSvc and then put that service in the failover domain defined above. The next step is to add resources to this service. I added a private resource of filesystem type, with device=/dev/sda3, a mount point of /shared/data and a mount type of ext3.
I also added a private resource of script type (/root/CommS.sh) to the CommSvc service. This script starts my C-based application, and therefore has to be present in the /root directory on both cluster nodes. It is very important that the script has correct root ownership and permissions; otherwise you can expect unpredictable behavior during cluster startup and shutdown.
Note that application or database startup and shutdown scripts are very important for the proper functioning of a RHEL-based cluster. The RHEL cluster uses these same scripts to provide application/database monitoring and high availability, so every application script used in a RHEL cluster should follow a specific format.
All such scripts should at least have start and stop subsections, along with a status subsection. When the application or database is available and running, the status subsection should return a value of 0; when it is not running or available, it should return 1. The script should also contain a restart subsection, which tries to restart the services if the application is found to be dead.
It should be noted that the RHEL cluster always tries to restart the application on the node that was its previous owner before trying to move it to the other cluster node.
A sample application script, which was used in my RHEL cluster implementation (to provide high availability for a legacy C-based application), follows:

##################################################################
#Script Name: CommS.sh
#Script Purpose: To provide application start/stop/status under Cluster
#Script Author: Khurram Shiraz
##################################################################
#!/bin/sh
basedir=/home/kmefic/KMEFIC/CommunicationServer
case $1 in
'start')
        cd $basedir
        su kmefic -c "./CommunicationServer -f Dev-CommunicationServer.conf"
        exit 0
        ;;
'stop')
        z=`ps -ef | grep Dev-CommunicationServer | grep -v "grep" | awk '{ print $2 }'`
        if [[ -n "$z" ]]
        then
                kill -9 $z
                fuser -mk /home/kmefic
                exit 0
        fi
        ;;
'restart')
        $0 stop
        sleep 2
        echo "Now starting......"
        $0 start
        echo "restarted"
        ;;
'status')
        ps -U kmefic | grep CommunicationSe 1>/dev/null
        if [[ $? -eq 0 ]]
        then
                exit 0
        else
                exit 1
        fi
        ;;
esac

################################################################




Finally, you have to add the shared IP address (172.16.2.188) to the service in our failover domain, so that the service ultimately contains three resources: two private resources (one filesystem and one script) and one shared resource, which is the service IP address for the cluster.

The last step is synchronization of the cluster configuration across the cluster nodes. The RHEL cluster administration and configuration tool provides a "save configuration to cluster" option, but it only appears once you have started the cluster services. Hence, for the first synchronization, it is better to copy the cluster configuration file manually to all cluster nodes. You can easily use the scp command to synchronize the /etc/cluster/cluster.conf file across the cluster nodes:

/home/root> scp /etc/cluster/cluster.conf Commsvr2:/etc/cluster/cluster.conf


Once synchronized, you can start the cluster services on both cluster nodes. There is a specific sequence in which the RHEL cluster services should be started and stopped.
To start:
service ccsd start
service cman start
service fenced start
service rgmanager start
To stop:
service rgmanager stop
service fenced stop
service cman stop
service ccsd stop

(Please note that if you use GFS, then startup/shutdown of the gfs and clvmd services has to be included in this sequence.)

I therefore prepared three simple shell scripts that can start and stop the RHEL cluster services and also report status information about them.
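A minimal sketch of such a wrapper, following the start and stop order shown above, could look like the following (the script name and layout are just an illustration, not the original scripts):

#####################################################################
#!/bin/sh
# rhcs-services.sh - start/stop/status wrapper for RHEL cluster services
START_ORDER="ccsd cman fenced rgmanager"
STOP_ORDER="rgmanager fenced cman ccsd"

case "$1" in
start)
        for svc in $START_ORDER; do
                service $svc start
        done
        ;;
stop)
        for svc in $STOP_ORDER; do
                service $svc stop
        done
        ;;
status)
        for svc in $START_ORDER; do
                service $svc status
        done
        ;;
*)
        echo "Usage: $0 start|stop|status"
        exit 1
        ;;
esac
#####################################################################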

2.6 Additional Considerations

In my environment, I decided not to start the cluster services at RHEL boot time and not to shut them down automatically when the RHEL box shuts down. However, depending on the 24x7 availability requirements of your business, you can easily enable this with the chkconfig command, for example as shown below.
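For example, to have the cluster services come up automatically in runlevels 3, 4 and 5:

chkconfig --level 345 ccsd on
chkconfig --level 345 cman on
chkconfig --level 345 fenced on
chkconfig --level 345 rgmanager on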

Another consideration is logging cluster messages to a separate log file. By default, all cluster messages go into the RHEL messages file (/var/log/messages), which makes cluster troubleshooting somewhat difficult in some scenarios.
For this purpose, I edited the /etc/syslog.conf file so that the cluster logs events to a file different from the default log file, adding the following line:

daemon.* /var/log/cluster

To apply this change, I restarted syslogd with the service syslog restart command. Another important step is to specify the rotation period for the cluster log file. This can be done by adding the cluster log file name to the /etc/logrotate.conf file (the default is weekly rotation):
-------------------------------------------------------------------------------------------------
/var/log/messages /var/log/secure /var/log/maillog /var/log/spooler
/var/log/boot.log /var/log/cron /var/log/cluster {
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}

-------------------------------------------------------------------------------------------------

You also have to pay special attention to keeping UIDs and GIDs synchronized across the cluster nodes. This is important for maintaining proper permissions, especially with reference to the shared data filesystem.

GRUB also needs to be configured to suit environment-specific needs. For instance, many system administrators in a RHEL cluster environment reduce the GRUB selection timeout to a lower value, such as 2 seconds, to accelerate system restarts (see the excerpt below).
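A minimal excerpt from /boot/grub/grub.conf illustrating this (the rest of the file stays unchanged):

# /boot/grub/grub.conf (excerpt)
default=0
timeout=2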


3.0 Database Integration with RHEL Cluster


The same RHEL cluster infrastructure can be used to provide high availability for databases such as Oracle, MySQL and IBM DB2.
The most important thing to remember is to base your database-related services on the shared IP address; for example, you have to configure the Oracle listener based on the shared service IP address.
In the last section of this article, I will walk through simple steps that demonstrate how an already configured RHEL cluster can be used to provide high availability for a MySQL database server, which is no doubt one of the most commonly used databases on RHEL.

I am assuming that the MySQL-related rpms are installed on both cluster nodes and that the RHEL cluster is already configured with a service IP address of 172.16.2.188.
You then simply define a failover domain using the cluster configuration tool (with the cluster node of your choice having a higher priority than the other). This failover domain will have the MySQL service, which in turn will have two private resources and one shared resource (the service IP address).

One of the private resources should be of filesystem type (in my configuration, with a mount point of /shared/mysqld), while the other private resource should be of script type, pointing to the /etc/init.d/mysql.server script. The contents of this script, which should be available on both cluster nodes, are as follows:



##################################################################

#!/bin/sh
# Copyright Abandoned 1996 TCX DataKonsult AB & Monty Program KB & Detron HB
# This file is public domain and comes with NO WARRANTY of any kind

# MySQL daemon start/stop script.

# Usually this is put in /etc/init.d (at least on machines SYSV R4 based
# systems) and linked to /etc/rc3.d/S99mysql and /etc/rc0.d/K01mysql.
# When this is done the mysql server will be started when the machine is
# started and shut down when the systems goes down.

# Comments to support chkconfig on RedHat Linux
# chkconfig: 2345 64 36
# description: A very fast and reliable SQL database engine.
###################################################################
# Comments to support LSB init script conventions
### BEGIN INIT INFO
# Provides: mysql
# Required-Start: $local_fs $network $remote_fs
# Required-Stop: $local_fs $network $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: start and stop MySQL
# Description: MySQL is a very fast and reliable SQL database engine.
### END INIT INFO

# If you install MySQL on some other places than /usr/local/mysql, then you
# have to do one of the following things for this script to work:
#
# - Run this script from within the MySQL installation directory
# - Create a /etc/my.cnf file with the following information:
# [mysqld]
# basedir=
# - Add the above to any other configuration file (for example ~/.my.ini)
# and copy my_print_defaults to /usr/bin
# - Add the path to the mysql-installation-directory to the basedir variable
# below.
#
# If you want to affect other MySQL variables, you should make your changes
# in the /etc/my.cnf, ~/.my.cnf or other MySQL configuration files.

# If you change base dir, you must also change datadir. These may get
# overwritten by settings in the MySQL configuration files.

basedir=
datadir=

# The following variables are only set for letting mysql.server find things.

# Set some defaults
pid_file=
server_pid_file=
use_mysqld_safe=1
user=mysql
if test -z "$basedir"
then
basedir=/usr/local/mysql
bindir=./bin
if test -z "$datadir"
then
datadir=/shared/mysqld/data
#datadir=/usr/local/mysql/data
fi
sbindir=./bin
libexecdir=./bin
else
bindir="$basedir/bin"
if test -z "$datadir"
then
datadir="$basedir/data"
fi
sbindir="$basedir/sbin"
libexecdir="$basedir/libexec"
fi

# datadir_set is used to determine if datadir was set (and so should be
# *not* set inside of the --basedir= handler.)
datadir_set=

#
# Use LSB init script functions for printing messages, if possible
#
lsb_functions="/lib/lsb/init-functions"
if test -f $lsb_functions ; then
source $lsb_functions
else
log_success_msg()
{
echo " SUCCESS! $@"
}
log_failure_msg()
{
echo " ERROR! $@"
}
fi

PATH=/sbin:/usr/sbin:/bin:/usr/bin:$basedir/bin
export PATH

mode=$1 # start or stop

case `echo "testing\c"`,`echo -n testing` in
*c*,-n*) echo_n= echo_c= ;;
*c*,*) echo_n=-n echo_c= ;;
*) echo_n= echo_c='\c' ;;
esac

parse_server_arguments() {
for arg do
case "$arg" in
--basedir=*) basedir=`echo "$arg" | sed -e 's/^[^=]*=//'`
bindir="$basedir/bin"
if test -z "$datadir_set"; then
datadir="$basedir/data"
fi
sbindir="$basedir/sbin"
libexecdir="$basedir/libexec"
;;
--datadir=*) datadir=`echo "$arg" | sed -e 's/^[^=]*=//'`
datadir_set=1
;;
--user=*) user=`echo "$arg" | sed -e 's/^[^=]*=//'` ;;
--pid-file=*) server_pid_file=`echo "$arg" | sed -e 's/^[^=]*=//'` ;;
--use-mysqld_safe) use_mysqld_safe=1;;
--use-manager) use_mysqld_safe=0;;
esac
done
}

parse_manager_arguments() {
for arg do
case "$arg" in
--pid-file=*) pid_file=`echo "$arg" | sed -e 's/^[^=]*=//'` ;;
--user=*) user=`echo "$arg" | sed -e 's/^[^=]*=//'` ;;
esac
done
}

wait_for_pid () {
i=0
while test $i -lt 35 ; do
sleep 1
case "$1" in
'created')
test -s $pid_file && i='' && break
;;
'removed')
test ! -s $pid_file && i='' && break
;;
*)
echo "wait_for_pid () usage: wait_for_pid created|removed"
exit 1
;;
esac
echo $echo_n ".$echo_c"
i=`expr $i + 1`
done

if test -z "$i" ; then
log_success_msg
else
log_failure_msg
fi
}

# Get arguments from the my.cnf file,
# the only group, which is read from now on is [mysqld]
if test -x ./bin/my_print_defaults
then
print_defaults="./bin/my_print_defaults"
elif test -x $bindir/my_print_defaults
then
print_defaults="$bindir/my_print_defaults"
elif test -x $bindir/mysql_print_defaults
then
print_defaults="$bindir/mysql_print_defaults"
else
# Try to find basedir in /etc/my.cnf
conf=/etc/my.cnf
print_defaults=
if test -r $conf
then
subpat='^[^=]*basedir[^=]*=\(.*\)$'
dirs=`sed -e "/$subpat/!d" -e 's//\1/' $conf`
for d in $dirs
do
d=`echo $d | sed -e 's/[ ]//g'`
if test -x "$d/bin/my_print_defaults"
then
print_defaults="$d/bin/my_print_defaults"
break
fi
if test -x "$d/bin/mysql_print_defaults"
then
print_defaults="$d/bin/mysql_print_defaults"
break
fi
done
fi

# Hope it's in the PATH ... but I doubt it
test -z "$print_defaults" && print_defaults="my_print_defaults"
fi

#
# Read defaults file from 'basedir'. If there is no defaults file there
# check if it's in the old (deprecated) place (datadir) and read it from there
#

extra_args=""
if test -r "$basedir/my.cnf"
then
extra_args="-e $basedir/my.cnf"
else
if test -r "$datadir/my.cnf"
then
extra_args="-e $datadir/my.cnf"
fi
fi

parse_server_arguments `$print_defaults $extra_args mysqld server mysql_server mysql.server`

# Look for the pidfile
parse_manager_arguments `$print_defaults $extra_args manager`

#
# Set pid file if not given
#
if test -z "$pid_file"
then
pid_file=$datadir/mysqlmanager-`/bin/hostname`.pid
else
case "$pid_file" in
/* ) ;;
* ) pid_file="$datadir/$pid_file" ;;
esac
fi
if test -z "$server_pid_file"
then
server_pid_file=$datadir/`/bin/hostname`.pid
else
case "$server_pid_file" in
/* ) ;;
* ) server_pid_file="$datadir/$server_pid_file" ;;
esac
fi

# Safeguard (relative paths, core dumps..)
cd $basedir

case "$1" in
start)
# Start daemon
manager=$bindir/mysqlmanager
if test -x $libexecdir/mysqlmanager
then
manager=$libexecdir/mysqlmanager
elif test -x $sbindir/mysqlmanager
then
manager=$sbindir/mysqlmanager
fi

echo $echo_n "Starting MySQL"
if test -x $manager -a "$use_mysqld_safe" = "0"
then
# Give extra arguments to mysqld with the my.cnf file. This script may
# be overwritten at next upgrade.
$manager --user=$user --pid-file=$pid_file >/dev/null 2>&1 &
wait_for_pid created

# Make lock for RedHat / SuSE
if test -w /var/lock/subsys
then
touch /var/lock/subsys/mysqlmanager
fi
elif test -x $bindir/mysqld_safe
then
# Give extra arguments to mysqld with the my.cnf file. This script
# may be overwritten at next upgrade.
pid_file=$server_pid_file
$bindir/mysqld_safe --datadir=$datadir --pid-file=$server_pid_file >/dev/null 2>&1 &
wait_for_pid created

# Make lock for RedHat / SuSE
if test -w /var/lock/subsys
then
touch /var/lock/subsys/mysql
echo "mysql.server" > /var/lock/mysql
fi
else
log_failure_msg "Couldn't find MySQL manager or server"
fi
echo "I was here `date`" >> /var/log/rhcs.debug
;;

stop)
# Stop daemon. We use a signal here to avoid having to know the
# root password.

# The RedHat / SuSE lock directory to remove
lock_dir=/var/lock/subsys/mysqlmanager

# If the manager pid_file doesn't exist, try the server's
if test ! -s "$pid_file"
then
pid_file=$server_pid_file
lock_dir=/var/lock/subsys/mysql
fi

if test -s "$pid_file"
then
mysqlmanager_pid=`cat $pid_file`
echo $echo_n "Shutting down MySQL"
kill $mysqlmanager_pid
echo "stopped" > /var/lock/mysql
# mysqlmanager should remove the pid_file when it exits, so wait for it.
wait_for_pid removed

# delete lock for RedHat / SuSE
if test -f $lock_dir
then
rm -f $lock_dir
fi
else
log_failure_msg "MySQL manager or server PID file could not be found!"
fi
;;

restart)
# Stop the service and regardless of whether it was
# running or not, start it again.
$0 stop
$0 start
;;

reload)
if test -s "$server_pid_file" ; then
mysqld_pid=`cat $server_pid_file`
kill -HUP $mysqld_pid && log_success_msg "Reloading service MySQL"
touch $server_pid_file
else
log_failure_msg "MySQL PID file could not be found!"
fi
;;

status)
(mysql -e "select 1" > /var/mysql)||exit 1
state=`cat /var/mysql|head -1`
#echo "state is $state"
#if [$state=1]; then
if [ "$state" = "1" ]; then
touch /var/lock/subsys/mysql
echo "mysql.server" > /var/lock/mysql
cat /var/lock/mysql
exit 0
fi
;;
*)
# usage
echo "Usage: $0 start|stop|status|restart|reload"
exit 1
;;
esac

########################################################################

As you can see, this script sets the data directory to /shared/mysqld/data, which is in fact available on our shared RAID array and should be accessible from both cluster nodes.

Testing the high availability of the MySQL database can easily be carried out with the help of any MySQL client. I used SQLyog, a Windows-based MySQL client, to connect to the MySQL database on Commsvr1, and then crashed this cluster node using the halt command. As a result of this system crash, RHEL cluster events were triggered and the MySQL database automatically restarted on Commsvr2. This whole failover process took one to two minutes and happened quite seamlessly.
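You can also watch and drive the failover from the command line using the rgmanager tools; a quick sketch (replace <service-name> with whatever name you gave the MySQL service in the cluster configuration):

clustat                                    # shows cluster members and which node currently owns each service
clusvcadm -r <service-name> -m Commsvr2    # manually relocate the service to Commsvr2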


Summary: RHEL clustering technology no doubt provides a reliable, highly available infrastructure that can be used to meet 24x7 business requirements for databases as well as legacy applications. The most important thing to keep in mind is that it is always better to plan carefully before the actual implementation, and to test your cluster and all possible failover scenarios thoroughly before going live. A well-documented cluster test plan can also be very helpful in this regard.






About the Author: Khurram Shiraz is a senior system administrator at KMEFIC, Kuwait. In his eight years of IT experience, he has worked mainly with IBM technologies and products, especially AIX, HACMP clustering, Tivoli and IBM SAN/NAS storage. He has also worked with the IBM Integrated Technology Services group. His areas of expertise include the design and implementation of high-availability and DR solutions based on pSeries, Linux and Windows infrastructure. He can be reached at aix_tiger@yahoo.com.

Note: This is one of my printed articles; it was published in Linux Journal, October 2007.

Wednesday 8 July 2009

Five Pillars of Islam

Whenever I start writing something about Islam, I find it very difficult to write about my beloved religion. The main reason is that I feel guilty that, besides practicing the fundamentals of Islam, I don't do much for my religion.

But through the articles in this category, my main purpose is to present Islam in its simplest form to all visitors of my website... and if a single person is saved from punishment and becomes inclined towards this holy truth, my whole life's objective will be achieved.

OK, let's start with the five pillars of Islam. These are the pillars around which the whole of Islam revolves. What are they? Let's have a look.

1. "Tauheed".. This is first basic teaching of Islam..which says " SAY LORD IS ONLY ONE and HE IS WHO CREATED WHOLE UNIVERSE..HE IS ALONE..NO ONE CAN COMPETE WITH HIM.HE HAS NO WIFE AND NO SON. "

This teaching is not unique to Islam. All holy religions, such as Christianity and Judaism, originally taught the same faith; however, the teachings of these religions have since been altered and now include wrong things and beliefs.

And as Allah said in the holy QURAN, "HE WILL FORGIVE everything except SHIRK", which is a sin that can never be forgiven by Allah: making anyone a companion or partner of ALLAH (God forbid).

2. Belief in Angels and Prophets:
This is belief in the holy angels and in all the holy prophets, including the prophets Adam, Christ, Moses, Suliman, Nooh, Yaqub, Ibrahim, Marium (PBUH) and the Prophet Muhammad (PBUH). We also have to believe that the Prophet Muhammad (PBUH) was the last prophet of God and that nobody after him will come as a prophet of God.


3. Prayer:
All Muslims have to pray five times a day. They have to worship Allah in these prayers and remind themselves that the main purpose of their arrival in this world is to worship Allah and nobody else.


3. Fasting: All Muslims have to fast for 30 days in the holy month of Ramadan. This is to refresh the souls of Muslims and to train them with the tools through which they can protect themselves against sins for the whole year. So, as you can observe, Muslims fasting in Ramadan try their best to keep themselves away from all kinds of sins and bad things.


4. Zakat: This is an annual amount of money which Muslims are obliged to give to the poor people of their society. Besides this mandatory amount, Muslims can give "sadqa" and other contributions to the poor.


5. Hajj: This means that everybody who believes in Allah and the Prophet Muhammad (PBUH) has to visit the Khan-e-Kaaba in Mecca (Saudi Arabia) at least once in his life.

Tuesday 7 July 2009

AIX with HP EVA Storage subsystem

A few days ago I was asked to take control of a situation where an AIX server had to be connected to an HP EVA storage subsystem. It was an EVA 8100, which is considered a mid-range subsystem from HP.

The customer was trying to install AIX (a SAN boot configuration) on HP EVA LUNs, and after two or three tries he was successful in installing AIX; but when the system came up, rootvg was appearing on 8 hdisks.

So far so good, but when he tried to install any additional software on AIX, he started getting bosboot verification errors.

OK, at this stage I got involved. When I tried to search for the AIX and HP EVA storage combination, I was shocked: there were not many hits on the IBM side, and even Google was not able to find much. The only important information I found was that if your HP EVA firmware is below 6.xx, you will face severe performance problems with AIX, because in that case your AIX hdisks will have a queue depth of one, which in essence results in a single path to each LUN.

First of all, I found that HP provides two or three multipath drivers for AIX. In fact, the AIX built-in MPIO device drivers cannot work successfully on their own with HP EVA; they have to be used in conjunction with additional multipath drivers for AIX, which are available on the HP website. So I downloaded these drivers from HP.


As a first step, I asked the customer to upgrade the EVA firmware to the latest level. Once that was done, I switched the AIX host to a single-path configuration and then, with only one disk (hdisk0), reinstalled AIX 6.1.

After the machine rebooted, I installed the HP MPIO device driver on AIX using smitty installp and then rebooted again. I was still getting one hdisk with rootvg, but now the AIX MPIO command lspath was reporting eight paths (as expected). The HP MPIO command hsvcpath was also showing eight paths (four of them active, and the remaining four inactive, waiting for the others to fail before taking over).

Now, by executing lsattr -El hdisk0, I was getting a queue_depth value of eight, which is 100% correct. The task was done successfully.
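For reference, the two checks mentioned above boil down to the following (shown for hdisk0):

lspath -l hdisk0                    # should list eight paths, all in Enabled state
lsattr -El hdisk0 -a queue_depth    # should report queue_depth 8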
