    Recovering Failed Fault Tolerant Disks

    If you have your data on a single drive or a non-fault tolerant volume such as a spanned volume or a striped volume, you expect to lose data if a drive fails. Because disk failure is an unavoidable fact of computer life, I assume that you have a good backup system and a plan for restoring data quickly. If a single disk in a striped volume fails, for example, you must delete the volume from the remaining disks, replace the disk, and rebuild the volume.

    On the other hand, if you don't want to deal with masses of panicked users and their crazed managers who will gather outside the server room like enraged French revolutionaries looking for guillotine fodder, you'll want to put your data on a fault tolerant subsystem of one form or another. This topic covers putting a system back in a stable condition following a disk failure and then recovering the system to normal operation. This includes the following operations:

    • Replacing a failed disk in a RAID 5 volume

    • Building a fault tolerant boot floppy

    • Replacing a failed disk in a mirrored volume

    • Moving dynamic disks between computers

    Replacing a Failed Disk in a RAID 5 Volume

    When a disk fails in a RAID 5 volume, you will get a very small and very temporary information balloon from a drive icon in the system tray. The message states, "A disk that is part of a fault-tolerant volume can no longer be accessed." The message comes from a process called FT Orphan, a special process that logically disconnects the drive from the system to eliminate the possibility of data corruption.

    The file system on the volume with the failed disk continues to be active. Your only indication of the failure (unless you have installed a third-party utility to alert you of error log entries) is a slight decrease in I/O performance.

    When you discover that you have a failed disk, open the Disk Management console. You'll get a display that looks something like that in Figure 14.7. Each disk for the volume shows a Failed Redundancy status and the failed drive shows a red Stop indicator.

    Figure 14.7. Disk Management console showing failed disk in RAID 5 volume.

    Thanks to the fault tolerant nature of RAID 5, the system remains operational. However, you have now entered a statistical universe where the numbers are not in your favor. The next drive crash will cause data loss. If the drives were all manufactured in the same batch, your time might run out very quickly, depending on the cause of the crash.
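
    To see why a RAID 5 volume survives one failure but not two, it helps to look at the parity arithmetic. The following Python sketch is an illustration only, not the way the LDM implements it. It treats one stripe as a list of equal-sized byte blocks, computes the parity block as the XOR of the data blocks, and rebuilds a single missing block from the survivors. The function names and sample bytes are made up for the example, and a real RAID 5 volume also rotates the parity block across the disks from stripe to stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the single missing block (marked None) in a stripe.

    The stripe holds the data blocks plus the parity block. XORing all
    surviving blocks yields the missing one. With two or more blocks
    missing there is not enough information, so an error is raised.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks missing")
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)


# Hypothetical three-disk stripe: two data blocks and one parity block.
data = [b"\x10\x20\x30", b"\x0f\x0e\x0d"]
parity = xor_blocks(data)
stripe = [data[0], None, parity]      # the second disk has failed
assert rebuild_missing(stripe) == data[1]
```

    Marking a second block as None makes rebuild_missing raise an error, which is the arithmetic behind the warning that the next drive crash will cause data loss.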

    Obtain a spare drive that has at least as much capacity as the drive you are replacing. It should be configured for the same SCSI ID to simplify installation, although this is not a requirement.

    Use the Disk Management console to check the SCSI ID assigned to the dead drive. Right-click the status block and select PROPERTIES from the flyout menu. The SCSI ID (called the Target ID) and the Logical Unit Number (LUN) are listed. I recommend that you paste a screen print of this window on the server so that you have a reference when you replace the disk. The snarl of SCSI cables inside the machine can lead you astray unless you have a good map. Nothing is quite so embarrassing as replacing the wrong drive.

    After you have the replacement drive in your hands and your users have left for the day, you're ready to get to work. Down the server and replace the drive. Test the drive operability using any IDE or SCSI hardware utilities you like.

    Now restart and let the operating system load. The RAID 5 volume will initialize and the file system should mount. Open the Disk Management console. The display should look something like that in Figure 14.8.

    Figure 14.8. Disk Management console showing replacement disk with Unknown status and the RAID 5 array with a Failed Redundancy status.

    The RAID 5 volume still shows a Failed Redundancy status. A status block for the missing disk still appears because its information is contained in the LDM database on the other disks. The replacement disk is brand new, so it does not have a fault tolerant signature or a Master Boot Record. The system lists its status as Unknown. Follow Procedure 14.13.

    Delays in Updating Disk Management Display

    It sometimes happens that the LDM does not initialize correctly when loading the Disk Management console. The RAID 5 volume may show Healthy even though it is not. If this happens, select ACTION | RESCAN DISKS from the menu, close the Disk Management console, and open it again. You may need to do this a couple of times to get the display to show a Failed Redundancy status.

    Procedure 14.13 Replacing a Failed Disk in a RAID 5 Volume

    1. Write a signature to the new disk by following the wizard instructions.

    2. Upgrade the disk to a dynamic disk.

    3. Right-click the RAID 5 volume and select REPAIR VOLUME from the flyout menu.

    4. Select the new disk to use as a replacement for the failed disk. The new disk now becomes part of the RAID 5 volume and the system begins regenerating. This can take a long time, sometimes hours. It will take much longer if users access the drive.

    5. While the regeneration is in progress, right-click the status block for the missing disk and select REMOVE DISK from the flyout menu. (Make absolutely sure you have the correct disk.) The status block disappears and the graphic display rearranges to show the new drive configuration.

    Building a Fault Tolerant Boot Floppy

    If you mirror your boot volume, which is the most popular fault tolerant choice, one of the most important tools you have for recovering from a failure is a fault tolerant boot floppy. The secondary drive is not necessarily bootable, so you need a way to boot the system to the mirrored volume on the secondary drive if the primary drive fails.

    Even if the secondary drive is bootable, you or a colleague may have forgotten to modify the Boot.ini file to point at the secondary volume.

    A fault tolerant boot floppy also comes in handy if you experience problems with the MBR or boot sector on a server that prevents the machine from booting. Viruses are one common cause for this problem.

    A fault tolerant boot floppy does not boot Windows Server 2003 on a floppy. It uses the system files that are normally found at the root of the hard drive to bring up the operating system.

    Procedure 14.14 shows a brief set of steps for creating a fault tolerant boot floppy. Chapter 3, "Adding Hardware," contains information about ARC paths and Boot.ini entries.

    Procedure 14.14 Building a Fault Tolerant Boot Floppy

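    1. Format a floppy on a Windows Server 2003 machine. You cannot use a factory-preformatted floppy because the boot sector must look for Ntldr, and only a floppy formatted by an NT-family operating system has such a boot sector. A disk formatted on an NT4 machine also works.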

    2. Copy the system files to the root of the A: drive. These files are as follows:

      • Ntldr

      • Ntdetect.com

      • Boot.ini

      • Ntbootdd.sys (if required)

    3. Use ATTRIB to remove the read-only attribute from Boot.ini.

    4. Edit the Boot.ini file on the floppy to include the ARC path of the boot volume on the second drive. This would look something like this:

      
      multi(0)disk(0)rdisk(1)partition(1)\Windows="Windows Server 2003 Mirrored Secondary Disk" /fastdetect
      

      You might also want to change the timeout setting to -1. This disables the countdown, so the boot menu waits until you make a selection.

    5. Restart the computer and boot from the fault tolerant boot floppy.

    6. When the BOOT menu appears, select the second disk. The system will boot from the secondary disk. At this point, the floppy is no longer needed. Remove it from the drive.
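
    As a reference for step 4, here is a minimal Python sketch that assembles the text of a Boot.ini file for the floppy. The [boot loader] and [operating systems] section names and the timeout and default keys follow the standard Boot.ini layout. The function name, the menu descriptions, and the assumption that both disks use partition 1 with a \Windows directory are mine, so adjust them to match your system.

```python
def build_boot_ini(entries, timeout=-1):
    """Return Boot.ini text for a list of (arc_path, description) pairs.

    The first entry becomes the default. A timeout of -1 disables the
    countdown so the menu waits for a selection.
    """
    if not entries:
        raise ValueError("at least one boot entry is required")
    lines = [
        "[boot loader]",
        f"timeout={timeout}",
        f"default={entries[0][0]}",
        "[operating systems]",
    ]
    for arc_path, description in entries:
        lines.append(f'{arc_path}="{description}" /fastdetect')
    # Boot.ini is a DOS-style text file, so use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical two-entry menu: primary disk first, mirrored secondary second.
text = build_boot_ini([
    (r"multi(0)disk(0)rdisk(0)partition(1)\Windows",
     "Windows Server 2003 Primary Disk"),
    (r"multi(0)disk(0)rdisk(1)partition(1)\Windows",
     "Windows Server 2003 Mirrored Secondary Disk"),
])
print(text)
```

    Listing both disks gives you a menu choice for either drive, which is handy when you are not sure which one failed.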

    Replacing a Failed Disk in a Mirrored Volume

    If you lose a disk that is part of a mirrored volume, the system responds as it did for a failed disk in a RAID 5 volume. When the system attempts to write to the volume and fails to get a response from the disk, the FT Orphan process disconnects the system from the drive and announces this via a System Tray icon. The FT Orphan process locks the Registry on the failed drive, if possible, so that even if you get the drive back in service, the system will refuse to load the operating system from it.

    When you open the Disk Management console following the drive failure, you'll get a display like that in Figure 14.9. The failed drive has a Missing status. The mirrored volume shows a Failed Redundancy status. The secondary drive moves to the top of the drive list. This may be different in your system, depending on your SCSI ID configuration.

    Figure 14.9. Disk Management console showing failed primary drive in a mirrored volume.

    As you can see by the figure, it can be difficult to determine exactly which drive failed. Keep careful records of the SCSI IDs or IDE controller numbers. As with the RAID 5 failure, you do not need to take immediate corrective action. As many administrators will attest, however, you take a big chance if you wait too long.

    Obtain a new disk that is at least the size of the one you're replacing. Configure it for the same SCSI ID or IDE master/slave configuration to simplify recovery. When you're ready to replace the drive, follow Procedure 14.15.

    Procedure 14.15 Replacing a Failed Disk in a Mirrored Volume

    1. Down the server and replace the drive.

    2. Restart and boot using a fault tolerant boot floppy. If you replaced the drive using the same SCSI ID, pick the Boot.ini menu item corresponding to the original rdisk() value of the secondary drive. If you used a different SCSI ID, you need to figure out the rdisk() value based on the SCSI scan order. Use your SCSI adapter's configuration utility to see the scan order, then modify the Boot.ini file on the fault tolerant boot floppy accordingly.

    3. After the operating system finishes loading, open the Disk Management console. The new drive does not have a fault tolerant signature or a copy of the LDM database, so an Initialize and Convert Disk Wizard opens to walk you through applying the signature and converting the disk to a dynamic disk.

    4. Once you've completed the Wizard, the Disk Management console is visible. You might be surprised to see that the old disk still appears in the display along with the new disk. This is to remind you of the original configuration. Figure 14.10 shows an example.

      Figure 14.10. Disk Management console following disk replacement of a failed mirrored drive prior to regenerating the new disk.

    5. Right-click the mirrored volume and select REMOVE MIRROR from the flyout menu. The Remove Mirror window opens.

    6. Select the missing disk from the list and click Remove Mirror. The system prompts for verification. Click Yes. The remaining disk now shows a Healthy status.

    7. Right-click the status block for the missing disk and select REMOVE DISK from the flyout menu. The disk disappears immediately.

    8. If you have verified that the new primary disk is bootable, remirror the volume to the new drive using the instructions in Procedure 14.8, "Creating Mirrored Volumes." If the new primary disk is not bootable, you'll need to get a good backup and then reinstall the operating system and recover from tape to get a bootable primary disk. After this is done, remirror the volume.
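
    Step 2 asks you to work out the rdisk() value from the SCSI scan order. The Python sketch below shows that bookkeeping under a simplifying assumption: the adapter numbers the disks it finds in scan order, starting at 0, and that ordinal is the rdisk() value in a multi() ARC path. The function names and the sample scan order are hypothetical. Confirm the real order in your adapter's configuration utility before you edit the floppy.

```python
import re

# Matches multi() style ARC paths such as the one used on the boot floppy.
ARC_PATTERN = re.compile(
    r"multi\((\d+)\)disk\((\d+)\)rdisk\((\d+)\)partition\((\d+)\)(\\.*)"
)


def rdisk_for(scsi_id, scan_order):
    """Return the rdisk() ordinal for a SCSI ID given the scan order."""
    return scan_order.index(scsi_id)


def retarget_arc_path(arc_path, new_rdisk):
    """Return arc_path with its rdisk() value replaced by new_rdisk."""
    match = ARC_PATTERN.fullmatch(arc_path)
    if match is None:
        raise ValueError(f"not a multi() ARC path: {arc_path!r}")
    multi, disk, _old_rdisk, partition, directory = match.groups()
    return (f"multi({multi})disk({disk})rdisk({new_rdisk})"
            f"partition({partition}){directory}")


# Hypothetical adapter that scans IDs 0, 1, 3: the disk at ID 3 is rdisk(2).
scan_order = [0, 1, 3]
path = retarget_arc_path(
    r"multi(0)disk(0)rdisk(1)partition(1)\Windows",
    rdisk_for(3, scan_order),
)
assert path == r"multi(0)disk(0)rdisk(2)partition(1)\Windows"
```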

    Moving Dynamic Disks Between Computers

    It sometimes happens that a server or workstation goes to that big byte bucket in the sky. (This usually happens about a half hour before your plane is due to leave on that vacation you've been planning for the past year.) If the problem is not with the storage system, one quick recovery method that might get you to the plane on time is to move the data disks to a new machine.

    Moving disks between machines can cause other problems. Windows Server 2003 has a lot of information in the Registry that is hardware-dependent. If you move the boot disk (the disk with the operating system files) to a different platform, expect to see lots of Plug and Play (PnP) activity when you start the machine. You may need to supply hardware drivers. You may get blue screen stop errors if the memory management subsystem cannot interpret the chipset or memory configuration. You will certainly get a failure if the new server requires a different Hardware Abstraction Layer (HAL).

    Moving data disks between machines is a much simpler matter. If the disk is a basic disk, the system sees the new disk, reads the partition table, and assigns the next available drive letters to any partitions it finds.
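
    The letter assignment for a basic disk can be pictured with a short Python sketch. It is a deliberate simplification: it hands out the lowest unused letters from C through Z in partition order and ignores details such as letter assignments the system has already recorded for a disk. The function name and the sample data are invented for the illustration.

```python
import string


def assign_drive_letters(letters_in_use, partition_count):
    """Return the next free drive letters, C through Z, for new partitions."""
    used = {letter.upper() for letter in letters_in_use}
    free = [l for l in string.ascii_uppercase[2:] if l not in used]
    if partition_count > len(free):
        raise ValueError("not enough free drive letters")
    return free[:partition_count]


# Hypothetical server already using C:, D:, and E: receives a
# basic disk that holds two partitions.
assert assign_drive_letters(["C", "D", "E"], 2) == ["F", "G"]
```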

    Moving dynamic disks, however, especially dynamic disks that contain volumes that span disks, is a bit more complicated. You'll need to merge the LDM database entries for the disks into the LDM database of the machine where you install them.

    The operating system identifies disks with an outside disk group name as foreign disks. The purpose of the steps shown in Procedure 14.16 is to import the LDM information on those disks so that the disk group name can be changed and the system will accept the new entries.

    Procedure 14.16 Moving a Fault Tolerant Volume to Another Computer

    1. Down the two servers and transfer the drives. You might have to make room for new drives. You might need to rework the terminators and assign new SCSI IDs and so forth. The objective is to keep the drives together, if possible, although this is not absolutely necessary.

    2. After the drives have been installed, test them to make sure that they are connected and that you know the order of their installation. The LDM permits the sequence of disks to be changed, but you make your job more difficult if the Disk Management console display has the foreign disks distributed willy-nilly.

    3. Boot the operating system and make sure that the system loads. The data on the new drives will not be available until you import the disks. Also, any share points you have for directories on the drives will need to be recreated.

    4. Open the Disk Management console. After initialization, the graphical display looks something like that in Figure 14.11. The disks from the other computer are flagged as Foreign.

      Figure 14.11. Disk Management console showing a dynamic disk moved from another machine and introduced into a new machine as a foreign disk.

    5. Right-click the status block of the foreign disk or disks and select IMPORT FOREIGN DISKS from the flyout menu. The Import Foreign Disks window opens showing the name of the original server from where the disks came. This information comes from the LDM database at the end of the disks.

    6. Click OK. The system analyzes the disks, and then the Verify Volumes on Foreign Disks window opens.

      The system may report the Data Condition as Data Incomplete. This indicates that you did not move all the disks in the disk group. This is expected if the boot/system disk in the original server was a dynamic disk, or if there were other dynamic disks in the original server that you intentionally didn't move. Make sure that you have all the disks that are in the shared volume. You are permitted to move a subset of a disk group, but you'll need to do a few more steps.

    7. Click OK. The system warns you that it might not be able to recover data if you had a Data Incomplete status in the preceding window.

    8. Click OK to acknowledge the warning. The system imports the disks and then attempts to build the volumes and initialize the file system. The status may go to Failed on the disks. Don't worry (at least not yet). This is normal if you did not include all disks in the disk group in the transfer.

    9. Right-click the status block for any of the new disks and select REACTIVATE DISK from the flyout menu. The system will think for a long time and you'll hear lots of disk activity. If the reactivation is successful, a drive letter for the volume appears and the status changes to Regenerating. This regeneration takes a long time and consumes many CPU cycles. The file system is active during this time and you can access files, but this is not recommended because it slows down regeneration. After regeneration has completed, the new volumes show a status of Healthy.

    10. There is a chance that any existing dynamic disks in the new machine will show an Error status after the import in the status block of the disk. This is because their copy of the LDM database has values that they cannot interpret. If this happens, right-click the status block for the disk and select REACTIVATE DISK from the flyout menu. This should immediately correct the problem.
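
    The Foreign and Data Incomplete conditions in this procedure come down to set comparisons, which the following Python sketch models. It is a conceptual model only and does not read the LDM database. Each disk is represented by a hypothetical record holding a disk name and a disk group name, a volume is listed by the names of the disks it spans, and the "OK" label is a placeholder for the healthy case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    name: str          # hypothetical label, such as "Disk 2"
    disk_group: str    # disk group name recorded on the disk


def foreign_disks(disks, local_group):
    """Return disks whose disk group name differs from the local one."""
    return [d for d in disks if d.disk_group != local_group]


def data_condition(volume_disks, imported):
    """Report whether every disk a volume spans is among the imported disks."""
    present = {d.name for d in imported}
    missing = sorted(set(volume_disks) - present)
    return ("Data Incomplete", missing) if missing else ("OK", [])


# Hypothetical move: two of three RAID 5 members arrive in the new server.
installed = [
    Disk("Disk 0", "NewSrvDg0"),
    Disk("Disk 1", "OldSrvDg0"),
    Disk("Disk 2", "OldSrvDg0"),
]
foreign = foreign_disks(installed, local_group="NewSrvDg0")
status = data_condition(["Disk 1", "Disk 2", "Disk 3"], foreign)
assert [d.name for d in foreign] == ["Disk 1", "Disk 2"]
assert status == ("Data Incomplete", ["Disk 3"])
```

    In this model a RAID 5 volume with one member left behind reports Data Incomplete, which corresponds to the case where the import succeeds but the volume must regenerate after you reactivate the disks.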
