2017-07-15

[SQL Server] AlwaysOn, Error 41131, failover failed,容錯移轉 失敗

SQL Server AlwaysOn Availability Group 執行容錯移轉時,因故發生失敗,若收到的錯誤代碼是:41131

有可能是 [NT AUTHORITY\SYSTEM] 帳戶的權限不足所導致,可賦予以下權限來解決問題:
  1. Alter Any Availability Group
  2. Connect SQL
  3. View server state
[NT AUTHORITY\SYSTEM]  帳戶是 SQL Server AlwaysOn 「健全狀況」的偵測用帳戶。
若因故權限不足,則將造成:
  • 無法啟動 AlwaysOn 「健全狀況」 
  • 造成 AlwaysOn 可用性群組 無法執行 「容錯移轉(Failover) 」

系統顯示的錯誤訊息是:

訊息 41131,層級 16,狀態 0,行 13
無法讓可用性群組 'AGDBG01' 上線。作業逾時。
請確認本機 Windows Server 容錯移轉叢集 (WSFC) 節點已上線。
然後,確認可用性群組資源存在 WSFC 叢集中。
如果此問題持續發生,您可能需要卸除可用性群組,然後再次建立它。

Msg 41131, Level 16, State 0, Line 2
Failed to bring availability group 'AGDBG01' online.  The operation timed out.
Verify that the local Windows Server Failover Clustering (WSFC) node is online. 
Then verify that the availability group resource exists in the WSFC cluster. 
If the problem persists, you might need to drop the availability group and create it again.

-- 100_Error_41131




-- 101_錯誤訊息_41131



示範版本:
SQL Serve 2012、2014



錯誤
41131 的解決方案:賦予權限

步驟01. 檢查帳戶:[NT AUTHORITY\SYSTEM] 帳戶是否存在。

若不存在,請使用以下語法重建此帳戶

-- 01. 檢查帳戶:NT AUTHORITY\SYSTEM,是否存在。若不存在,請使用以下語法重建此帳戶
-- To create the [NT AUTHORITY\SYSTEM] account

/****** Object:  Login [NT AUTHORITY\SYSTEM] ******/
USE [master]
GO
IF NOT EXISTS (SELECT * FROM sys.server_principals WHERE name = N'NT AUTHORITY\SYSTEM')
BEGIN
 CREATE LOGIN [NT AUTHORITY\SYSTEM] FROM WINDOWS WITH DEFAULT_DATABASE=[master]
END
GO

-- 01_ 檢查帳戶_NT AUTHORITY_SYSTEM,是否存在



步驟02. 若[NT AUTHORITY\SYSTEM] 帳戶已經存在,查詢其權限

-- 02. 若 [NT AUTHORITY\SYSTEM] 帳戶已經存在,查詢其權限
-- Listing effective permissions of the user
USE [master]
GO
EXECUTE AS LOGIN = N'NT AUTHORITY\SYSTEM';
SELECT 
 permission_name AS [Permission]
FROM fn_my_permissions(NULL, N'SERVER')
ORDER BY permission_name;
REVERT;

-- 02_1_若帳戶 NT AUTHORITY_SYSTEM 已經存在,查詢其權限



只具備兩項權限:
  1. CONNECT SQL
  2. VIEW ANY DATABASE

-- 02_2_缺少ALTER ANY AVAILABILITY GROUP



[NT AUTHORITY\SYSTEM] 帳戶,在 伺服器層級 應該具備的權限有:
  1. Alter Any Availability Group
  2. Connect SQL
  3. View server state

經過比對,缺少 2 項權限,分別是:
  1. Alter Any Availability Group
  2. View server state

步驟03. 授予 [NT AUTHORITY\SYSTEM] 帳戶必要的權限

-- 03_授予 [NT AUTHORITY\SYSTEM] 帳戶必要的權限
-- To grant the permissions to the [NT AUTHORITY\SYSTEM] account
use [master]
GO
GRANT ALTER ANY AVAILABILITY GROUP TO [NT AUTHORITY\SYSTEM]
GO
GRANT CONNECT SQL TO [NT AUTHORITY\SYSTEM]
GO
GRANT VIEW SERVER STATE TO [NT AUTHORITY\SYSTEM]
GO 

-- 03_授予 [NT AUTHORITY_SYSTEM] 帳戶必要的權限



步驟04. 再度查詢  NT AUTHORITY\SYSTEM 帳戶的權限

-- 04. 再度查詢  NT AUTHORITY\SYSTEM 帳戶的權限
-- Listing effective permissions of the user
USE [master]
GO
EXECUTE AS LOGIN = N'NT AUTHORITY\SYSTEM';
SELECT 
 permission_name AS [Permission]
FROM fn_my_permissions(NULL, N'SERVER')
ORDER BY permission_name;
REVERT;

-- 04_1再度查詢  NT AUTHORITY_SYSTEM 帳戶的權限



-- 04_2_SSMS_已經擁有ALTER ANY AVAILABILITY GROUP





經過比對後,權限已調整為


權限增加為
  1. ALTER ANY AVAILABILITY GROUP
  2. CONNECT SQL
  3. CREATE AVAILABILITY GROUP
  4. VIEW ANY DATABASE
  5. VIEW SERVER STATE
既有權限
  1. CONNECT SQL
  2. VIEW ANY DATABASE
先前賦予的權限
  1. ALTER ANY AVAILABILITY GROUP
  2. VIEW SERVER STATE
系統自動多給一個權限
  1. CREATE AVAILABILITY GROUP




[NT AUTHORITY\SYSTEM]  帳戶 的功能

是 SQL Server AlwaysOn 「健全狀況」的偵測用帳戶。
若因故權限不足,則無法啟動 AlwaysOn 「健全狀況」 ,也將造成 AlwaysOn 可用性群組 無法執行 「容錯移轉(Failover) 」

The [NT AUTHORITY\SYSTEM] account is used by SQL Server AlwaysOn health detection to connect to the SQL Server computer and to monitor health.

When you create an availability group, health detection is initiated when the primary replica in the availability group comes online.
If the [NT AUTHORITY\SYSTEM] account does not exist or does not have sufficient permissions, health detection cannot be initiated, and the availability group cannot come online during the creation process.

Make sure that these permissions exist on each SQL Server computer that could host the primary replica of the availability group.

Note The Resource Host Monitor Service process (RHS.exe) that hosts SQL Resource.dll can be run only under a System account.

The SQL Server Database Engine resource DLL connects to the instance of SQL Server that is hosting the primary replica by using ODBC in order to monitor health.

The logon credentials that are used for this connection are the local SQL Server NT AUTHORITY\SYSTEM login account.
By default, this local login account is granted the following permissions:
Alter Any Availability Group
Connect SQL
View server state

If the NT AUTHORITY\SYSTEM login account lacks any of these permissions on the automatic failover partner (the secondary replica), then SQL Server cannot start health detection when an automatic failover occurs.

Therefore, the secondary replica cannot transition to the primary role.



參考訊息

Cannot create a high-availability group in Microsoft SQL Server 2012
https://support.microsoft.com/en-us/help/2847723/cannot-create-a-high-availability-group-in-microsoft-sql-server-2012

sys.fn_my_permissions (Transact-SQL)
https://docs.microsoft.com/en-us/sql/relational-databases/system-functions/sys-fn-my-permissions-transact-sql

沒有留言:

張貼留言