Disaster Recovery for SAP NetWeaver HA deployment with Azure Shared Disk on Windows using ASR
Overview
You have set up the SAP system on Windows to be highly available with Azure shared disk, following the steps in Cluster SAP ASCS/SCS instance on Windows Server Failover Cluster (WSFC) using shared disk in Azure. This setup makes the SAP system resilient to platform maintenance or hardware failure within an Azure region, but it doesn’t safeguard applications from a large-scale regional disaster. With the public preview of Azure Site Recovery (ASR) for Azure shared disk, you can now configure disaster recovery (DR) for your highly available SAP ASCS/SCS running on WSFC with Azure shared disk.
NOTE: For DR of your Windows SAP system with File share, see Disaster Recovery for SAP NetWeaver high availability deployment with File Share on Windows using ASR for details.
IMPORTANT NOTES:
The example shown in this article was tested with the following versions, cluster share, and quorum options:
SAP ASCS/ERS OS version: Windows Server 2019 Datacenter.
Enqueue Server version: ENSA1.
Quorum: Cloud witness.
Cluster share: Azure shared disks.
Shared disk type: ZRS.
As ASR for Azure shared disk is still in public preview, we don’t advise implementing the scenario for critical production workloads. Carefully review the Support matrix for shared disks in Azure VM disaster recovery (preview) – Azure Site Recovery.
This article focuses on the central services and application server components of the SAP system. For the database DR approach, see Disaster Recovery recommendation for SAP workload.
Failover of other dependent services like Domain Name System (DNS) or Active Directory (AD) is not covered in this article.
To replicate VMs using ASR for DR, review supported regions.
ASR doesn’t replicate the Azure load balancer that is used as the virtual IP for the SAP ASCS/ERS cluster configuration in the source site. You need to manually create one in the DR site before or during the failover event.
The cloud witness uses Azure blob storage, so you need a separate storage account in the DR region before or during the failover event.
The procedure described here has not been tested with different OS releases. So, make sure you test and document the entire procedure thoroughly in your environment.
Read Disaster Recovery overview and infrastructure guidelines for SAP workload and Disaster Recovery recommendation for SAP workload for general guidance, strategies, and factors to consider when designing DR for SAP workload.
Figure: Disaster recovery architecture of SAP ASCS/ERS with Azure shared disk.
DR architecture for SAP workload on Windows with Azure shared disks
The following figure shows how ENSA1 high availability for the SAP ASCS (sapnw6cl1) and SAP ERS (sapnw6cl2) instances is set up using WSFC, with an Azure shared disk attached to both VMs. The cluster uses a cloud witness as its quorum option. To achieve DR for this setup, ASR replicates the SAP ASCS/ERS VMs across the sites, including the OS disk and the Azure shared disk. In the same way, for the application servers (sapnw6a01 and sapnw6a02), which have OS and data disks (premium managed disks), set up ASR to replicate the VMs to the DR site.
NOTE:
This article describes steps related to the ENSA1 architecture. The same DR process can also be applied to the ENSA2 architecture.
This article does not cover the use of SMB volumes on Azure Files or Azure NetApp Files in your SAP system, for interfaces or any other purpose. If you use them, ensure that they are replicated to the DR region as well.
To have a similar highly available SAP system setup in the DR site, make sure that all the components that are part of the SAP system are replicated.
Components | DR setup
SAP ASCS/ERS VMs (includes OS disk and Azure shared disk) | Replicate VMs using Azure Site Recovery
Storage used for cloud witness | Create separate storage in the DR region
Load balancer used for cluster virtual IP | Create a separate load balancer in the DR region
SAP Application Servers VMs (include OS and data disk that uses premium managed disks) | Replicate VMs using Azure Site Recovery
IMPORTANT: Use of Azure Site Recovery for SAP databases isn’t recommended. For more details on the DR recommendation for databases, refer to SAP database servers DR guidelines.
Disaster Recovery (DR) site preparation
To achieve a similar SAP system setup on the DR site, including the high availability setup of SAP ASCS/ERS, make sure that all the components are replicated and available in the event of a failover.
Configure ASR for SAP ASCS/ERS and application server VMs
Set up the resource group, virtual network, subnet, and Recovery Services vault in the secondary site that you will use for DR. To learn more about networking, see prepare networking for Azure VM disaster recovery.
Before enabling ASR on the SAP ASCS and SAP ERS VMs, make sure that WSFC is configured and that the Azure shared disk is managed by the cluster.
Configure ASR for SAP ASCS and ERS VMs with Azure shared disk by following the steps in the Shared disks in Azure Site Recovery document. Follow Configure replication for Azure VMs in Azure Site Recovery to configure ASR for SAP application servers.
When you use ASR to set up DR for VMs, the VM’s OS, data disks, and Azure shared disk (for ASCS/ERS VMs) are copied to the DR site.
NOTE: With Azure shared disk, the SAP ASCS and ERS VMs are grouped together in ASR. This way, the VMs in the group replicate together and share an app-consistent recovery snapshot. In the event of a failover, the VMs fail over as a group.
After the VMs are replicated, the status of the protected cluster (sapnw6) and the individual VMs (sapnw6cl1 and sapnw6cl2) changes to “Protected” and the replication health shows “Healthy”.
Configure the cloud witness for SAP ASCS/ERS in the DR site
Tip: Depending on your DR strategy, you can perform this step either while preparing your DR site (for example, when setting up ASR) or during the DR failover process.
Create an Azure storage account in the DR site to use as the cloud witness.
Site | Storage cloud witness
Primary | nw6cloudwitness
DR | nw6cloudwintess-dr
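Azure storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits, so it can be worth checking the name planned for the DR cloud witness before creating the account. The following Python sketch performs that check locally. The function name `is_valid_storage_account_name` and the candidate `nw6cloudwitnessdr` are hypothetical, and the check does not contact Azure or verify that the name is globally unique. Note that a hyphenated name such as the DR entry in the table above would not pass this check as written.

```python
import re

# Azure storage account naming rule: 3-24 characters,
# lowercase letters and digits only (no hyphens or uppercase).
_NAME_RULE = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the storage account naming rule.

    Hypothetical helper: a local syntax check only, not an availability check.
    """
    return _NAME_RULE.fullmatch(name) is not None


if __name__ == "__main__":
    # "nw6cloudwitnessdr" is an illustrative candidate, not a name from the article.
    for candidate in ("nw6cloudwitness", "nw6cloudwintess-dr", "nw6cloudwitnessdr"):
        print(candidate, is_valid_storage_account_name(candidate))
```

Running a check like this while preparing the DR site avoids discovering a rejected name in the middle of a failover.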
Configure standard load balancer for SAP ASCS/ERS in the DR site
Tip: Depending on your DR strategy, you can perform this step either while preparing your DR site (for example, when setting up ASR) or during the DR failover process.
Create an Azure standard load balancer in the DR site, similar to the one you created in your primary site. If you create the load balancer in advance, you can’t add VMs to the backend pool because the VMs don’t exist yet in the DR site, so create the backend pool as an empty pool. This still allows you to define the load balancing rules. Add the VMs to the backend pool after the DR failover of the VMs through ASR has completed.
Keep the probe port of the DR site load balancer the same as in the primary site.
When VMs without public IP addresses are placed in the backend pool of an internal standard load balancer, they have no outbound connectivity unless additional configuration is performed to allow routing to public endpoints. For details on how to achieve outbound connectivity, see public endpoint connectivity for Azure VMs & Standard ILB in SAP HA scenarios.
Site | Frontend IP
Primary – ASCS | 10.52.0.16
DR – ASCS | 10.150.0.9
NOTE: This example uses the ENSA1 setup. For ASR configuration on the ENSA2 architecture, you need to configure an additional frontend IP and load balancing rules as described in prepare Azure infrastructure for SAP HA with WSFC.
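The load balancer requirements above (same probe port as the primary site, backend pool possibly empty until after failover) can be captured in a small pre-failover check. The following Python sketch is illustrative only: `LoadBalancerConfig` and `check_dr_load_balancer` are hypothetical names, the probe port 62000 is an assumed value that does not come from this article, and nothing here queries Azure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoadBalancerConfig:
    """Hypothetical, hand-maintained description of one load balancer."""
    frontend_ip: str
    probe_port: int
    backend_vms: List[str] = field(default_factory=list)


def check_dr_load_balancer(primary: LoadBalancerConfig,
                           dr: LoadBalancerConfig) -> List[str]:
    """Return a list of findings; an empty list means no issue was detected."""
    findings = []
    # The DR probe port should match the primary site.
    if dr.probe_port != primary.probe_port:
        findings.append(
            f"probe port differs: primary={primary.probe_port}, dr={dr.probe_port}"
        )
    # An empty pool is expected before failover, but must be filled afterwards.
    if not dr.backend_vms:
        findings.append("DR backend pool is empty; add the ASCS/ERS VMs after failover")
    return findings


if __name__ == "__main__":
    # Frontend IPs are taken from the table above; 62000 is an assumed probe port.
    primary = LoadBalancerConfig("10.52.0.16", 62000, ["sapnw6cl1", "sapnw6cl2"])
    dr = LoadBalancerConfig("10.150.0.9", 62000)
    for finding in check_dr_load_balancer(primary, dr):
        print(finding)
```

Before failover the only expected finding is the empty backend pool; after the VMs are added, the list should be empty.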
Disaster Recovery (DR) failover event
[A] – Applicable to SAP ASCS Node, [B] – Applicable to SAP ERS Node, [C] – Applicable to SAP Dialog Nodes.
Use the following procedure for SAP ASCS/ERS with Azure shared disk and for the SAP application servers in the event of a DR failover. The procedure assumes that the system in the primary site is unreachable or unavailable for some reason, so the DR failover process is started. The VMs in the primary site stay down after the failover to the DR region is triggered.
NOTE: The exact steps and the order of recovery of your SAP system must be tested, documented and fine-tuned regularly.
Perform the failover of SAP ASCS/ERS and all application server VMs that are configured in ASR to the DR region.
Central Services: If both SAP ASCS/ERS VMs (sapnw6cl1 and sapnw6cl2) that have Azure shared disk(s) in the protected cluster are up and running in the primary site, and the recovery points are consistent across both VMs, follow run a failover – recovery point is consistent across all the VMs to perform the failover.
Central Services: If one of the VMs (sapnw6cl1 or sapnw6cl2) is down in the primary site and you need to start a failover to the DR site, follow the run a failover – recovery point is consistent only for a few VMs document. In this case, the VM that is down won’t be part of the cluster recovery point. Instead, you need to select an individual recovery point for that VM to initiate the failover.
Application Servers: To perform the failover of application server VMs, see Tutorial to fail over Azure VMs to a secondary region for disaster recovery with Azure Site Recovery.
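The choice between the two central services failover paths above can be written down as a small decision helper for a runbook. The following Python sketch only encodes the two cases described in this article; the function name, the constants, and the returned dictionary layout are hypothetical, and the VM states are supplied by the operator rather than read from ASR.

```python
from typing import Dict, List

# Labels for the two ASR procedures referenced above (wording is illustrative).
ALL_VMS_CONSISTENT = "recovery point is consistent across all the VMs"
FEW_VMS_CONSISTENT = "recovery point is consistent only for a few VMs"


def choose_failover_procedure(vm_running: Dict[str, bool],
                              recovery_points_consistent: bool) -> Dict[str, object]:
    """Pick the failover procedure for the protected cluster.

    vm_running maps a VM name to True if it is up in the primary site.
    """
    down: List[str] = sorted(vm for vm, up in vm_running.items() if not up)
    if not down and recovery_points_consistent:
        # Both VMs are up and share a cluster recovery point.
        return {"procedure": ALL_VMS_CONSISTENT, "individual_recovery_point_vms": []}
    # VMs that are down are not part of the cluster recovery point, so each
    # needs an individually selected recovery point. If all VMs are up but the
    # recovery points are inconsistent, identify the affected VMs in the portal.
    return {"procedure": FEW_VMS_CONSISTENT, "individual_recovery_point_vms": down}


if __name__ == "__main__":
    print(choose_failover_procedure({"sapnw6cl1": True, "sapnw6cl2": False}, True))
```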
After the failover is completed, the status of the replicated items in the Recovery Services vault is shown in the following figure.
Change the IP address of VMs in DNS or in host files (if used). In this example, change the IP address for SAP ASCS/ERS, and all application servers. The Windows cluster also registers the ASCS/ERS server name in DNS. So, you need to change the IP address of ASCS/ERS server name in DNS or in host files too.
Entries in DNS | Primary Site | DR site
nw6clust.internal.contoso.net | 10.52.0.10, 10.52.0.11 | 10.150.0.5, 10.150.0.4
nw6ascscl | 10.52.0.16 (LB frontend IP) | 10.150.0.9 (LB frontend IP)
sapnw6cl1 | 10.52.0.10 | 10.150.0.5
sapnw6cl2 | 10.52.0.11 | 10.150.0.4
sapnw6a01 | 10.52.0.12 | 10.150.0.6
sapnw6a02 | 10.52.0.13 | 10.150.0.7
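If host files are used instead of DNS, the DR entries can be generated from the mapping above rather than typed by hand during a failover. The following Python sketch assumes the addresses from the table in this example; the names `DNS_ENTRIES` and `hosts_lines` are hypothetical, and the script only builds the lines, it does not modify any hosts file or DNS zone.

```python
from typing import Dict, List

# Hostname -> IP addresses per site, copied from the table above.
DNS_ENTRIES: Dict[str, Dict[str, List[str]]] = {
    "nw6clust.internal.contoso.net": {
        "primary": ["10.52.0.10", "10.52.0.11"],
        "dr": ["10.150.0.5", "10.150.0.4"],
    },
    "nw6ascscl": {"primary": ["10.52.0.16"], "dr": ["10.150.0.9"]},  # LB frontend IP
    "sapnw6cl1": {"primary": ["10.52.0.10"], "dr": ["10.150.0.5"]},
    "sapnw6cl2": {"primary": ["10.52.0.11"], "dr": ["10.150.0.4"]},
    "sapnw6a01": {"primary": ["10.52.0.12"], "dr": ["10.150.0.6"]},
    "sapnw6a02": {"primary": ["10.52.0.13"], "dr": ["10.150.0.7"]},
}


def hosts_lines(entries: Dict[str, Dict[str, List[str]]], site: str) -> List[str]:
    """Build hosts-file style lines ("IP<TAB>hostname") for the given site."""
    lines = []
    for hostname, ips_by_site in entries.items():
        for ip in ips_by_site[site]:
            lines.append(f"{ip}\t{hostname}")
    return lines


if __name__ == "__main__":
    # Print the entries to apply after failing over to the DR site.
    print("\n".join(hosts_lines(DNS_ENTRIES, "dr")))
```

Keeping both sites in one mapping means the same helper can produce the primary entries again when failing back.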
If you created an Azure standard load balancer in the DR site beforehand with an empty backend pool, add the ASCS/ERS VMs to the backend pool.
[A] Update the IP address of ASCS server name resource configured in the cluster to the frontend IP of load balancer (the one provisioned in DR site).
IMPORTANT: For ENSA2, you would need to change two IP addresses (one for ASCS, and one for ERS) to the respective frontend IP that you set up in Azure load balancer.
[A] Change the quorum to the cloud witness storage account created on the DR site.
[A] Start cluster role.
[C] Update the user store on all application server instances with the hostname of the database that is running in the DR region. See SAP Note 1852017 for details on how to update the ‘hdbuserstore’ on Windows.
[C] Start all dialog instances.
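Because the order of these steps matters and each tagged step applies to a specific node type, it can help to keep the sequence as data and filter it per role when documenting or rehearsing the failover. The following Python sketch is one possible layout; `Step`, `RUNBOOK`, and `steps_for` are hypothetical names, the step wording is abbreviated from this article, and a role of `None` marks steps that are not tied to a node type.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    order: int
    role: Optional[str]  # "A" = ASCS node, "B" = ERS node, "C" = dialog node
    action: str


# Steps abbreviated from the failover procedure above, in execution order.
RUNBOOK = [
    Step(1, None, "Fail over ASCS/ERS and application server VMs with ASR"),
    Step(2, None, "Update IP addresses in DNS or host files"),
    Step(3, None, "Add ASCS/ERS VMs to the DR load balancer backend pool"),
    Step(4, "A", "Set the ASCS server name resource IP to the DR load balancer frontend IP"),
    Step(5, "A", "Change the quorum to the DR cloud witness storage account"),
    Step(6, "A", "Start the cluster role"),
    Step(7, "C", "Update the user store with the DR database hostname"),
    Step(8, "C", "Start all dialog instances"),
]


def steps_for(role: Optional[str]) -> List[str]:
    """Return the actions tagged with `role`, in runbook order."""
    ordered = sorted(RUNBOOK, key=lambda step: step.order)
    return [step.action for step in ordered if step.role == role]


if __name__ == "__main__":
    for action in steps_for("A"):
        print("[A]", action)
```

In this ENSA1 example no step is tagged for the ERS node, so `steps_for("B")` returns an empty list.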
Failback to the former primary site
Before you begin to fail back VMs to the former primary site, ensure that you have committed the failover and that the status of your virtual machines is “failover committed”.
Re-protect the failed-over protected cluster (sapnw6) and the application server VMs. For more details, see re-protect VMs with Azure shared disk to the primary site with ASR, and re-protect VMs to the primary site with ASR.
In the event of a failure, follow the same post-failover steps described above.