Translation | Tutorial | Editor: 楊鵬連 | 2020-12-09 10:40:43.720
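Overview: Phil Factor demonstrates the use of temporary tables and table variables, and offers a few simple rules for deciding whether a table variable is a better choice than a temporary table (ST011) or vice versa (ST012).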
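SQL Prompt is a practical SQL code-completion tool. It searches database object names, syntax, and code snippets automatically and offers suitable completions as you type. Automatic script formatting keeps code simple and readable, which is especially useful when a developer is not very familiar with a script. SQL Prompt works as soon as it is installed and can greatly improve coding efficiency, and it can be customized to behave the way you want.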
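People can, and do, argue at length about the relative merits of table variables and temporary tables. Sometimes, as when writing functions, you have no choice. When you do have a choice, you will find that both have their uses, and it is easy to find examples where either one is faster. In this article, I'll explain the main factors involved in choosing one or the other, and demonstrate a few simple "rules" for getting the best performance.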
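Assuming you follow the basic rules of engagement, you should consider table variables as a first choice when working with relatively small data sets. They are easier to work with than temporary tables, and they trigger fewer recompiles in the routines that use them. Table variables also require fewer locking resources, because they are "private" to the process and batch that created them. SQL Prompt implements this recommendation as code analysis rule ST011 (Consider using table variable instead of temporary table).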
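If you are doing more complex processing on temporary data, or need to hold more than a reasonably small amount of data, then a local temporary table is likely to be the better choice. Based on this recommendation, SQL Code Guard includes code analysis rule ST012 (Consider using temporary table instead of table variable), but SQL Prompt does not currently implement it.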
Pros and cons of table variables and temporary tables
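Table variables tend to get a bad press, because queries that use them occasionally result in very inefficient execution plans. However, if you follow a few simple rules, they are a good choice for intermediate "working" tables, and for passing results between routines, where the data sets are small and the processing required is relatively straightforward.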
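Table variables are very simple to use, mainly because they are "zero maintenance". They are scoped to the batch or routine in which they are created, and are removed automatically once it completes execution, so using them within a long-lived connection does not risk "resource hogging" problems in tempdb. If a table variable is declared in a stored procedure, it is local to that stored procedure and cannot be referenced in a nested procedure. There are no statistics-based recompiles for table variables and you cannot ALTER one, so routines that use them tend to incur fewer recompiles than those that use temporary tables. They are also not fully logged, so creating and filling them is faster and requires less space in the transaction log. When they are used in stored procedures, there is less contention on system tables under conditions of high concurrency. In short, it is easier to keep things neat and tidy.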
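The scoping rule can be seen with a minimal sketch such as the following. The names @Scratch and #Scratch are made up for illustration, and the script assumes a client such as SSMS or sqlcmd that treats GO as a batch separator.

```sql
-- Batch 1: create a table variable and a local temporary table.
DECLARE @Scratch TABLE (id INT NOT NULL PRIMARY KEY);
INSERT INTO @Scratch (id) VALUES (1), (2), (3);

CREATE TABLE #Scratch (id INT NOT NULL PRIMARY KEY);
INSERT INTO #Scratch (id) VALUES (1), (2), (3);
GO
-- Batch 2, same connection.
-- The temporary table survives until it is dropped or the session ends.
SELECT Count(*) AS RowsInTempTable FROM #Scratch;

-- The table variable went out of scope when batch 1 finished, so the
-- next line would fail if uncommented ("Must declare the table variable").
-- SELECT Count(*) FROM @Scratch;

-- Temporary tables need explicit tidying up; table variables do not.
DROP TABLE #Scratch;
```

The table variable needs no cleanup at all, which is the "zero maintenance" property described above.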
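When working with relatively small data sets, they are faster than the comparable temporary table. However, as the number of rows increases (beyond approximately 15,000 rows, though this varies with context), you can run into difficulties, mainly because of their lack of support for statistics. Even the indexes that enforce PRIMARY KEY and UNIQUE constraints on table variables have no statistics. The optimizer therefore uses a hard-coded estimate of 1 row returned from a table variable, and so tends to choose operators that are optimal for small data sets (such as the Nested Loops operator for joins). The more rows there are in the table variable, the larger the discrepancy between estimate and reality, and the less efficient the optimizer's plan choices become. The resulting plan is sometimes frightful.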
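An experienced developer or DBA will be on the lookout for this sort of problem, and be ready to add the OPTION (RECOMPILE) query hint to the statement that uses the table variable. When we submit a batch containing a table variable, the optimizer first compiles the batch, at which point the table variable is empty. When the batch starts executing, the hint causes only that single statement to recompile, at which point the table variable is populated and the optimizer can use the real row count to compile a new plan for that statement. Sometimes, but rarely, even this does not help. Also, over-reliance on this hint will to some extent negate the advantage that table variables have of causing fewer recompiles than temporary tables.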
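As a rough illustration of where the hint goes, the following sketch fills a hypothetical table variable, @Numbers, with 20,000 rows and runs the same self-join with and without the hint. The row source (a cross join of sys.all_columns) is only an assumption chosen to generate enough rows; comparing the two actual execution plans should show the different row estimates.

```sql
-- A hypothetical table variable holding 20,000 integers.
DECLARE @Numbers TABLE (n INT NOT NULL PRIMARY KEY);

INSERT INTO @Numbers (n)
  SELECT TOP (20000) Row_Number() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns AS a
      CROSS JOIN sys.all_columns AS b;

-- Compiled with the rest of the batch, while @Numbers was still empty,
-- so the plan is based on the fixed low row estimate.
SELECT Count(*) AS MatchesWithoutHint
  FROM @Numbers AS x
    INNER JOIN @Numbers AS y
      ON y.n = x.n;

-- Recompiled when the statement runs, by which time @Numbers is
-- populated, so the optimizer can use the actual row count.
SELECT Count(*) AS MatchesWithHint
  FROM @Numbers AS x
    INNER JOIN @Numbers AS y
      ON y.n = x.n
  OPTION (RECOMPILE);
```

Both queries return the same count; only the plan, and therefore the timing, is expected to differ.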
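Secondly, certain index limitations of table variables become more of a factor when dealing with large data sets. Although you can now use the inline index creation syntax to create non-clustered indexes on a table variable, there are some restrictions, and there are still no associated statistics.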
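For reference, the inline syntax looks roughly like this. The table and index names are invented for illustration, and the sketch assumes SQL Server 2014 or later, where inline index definitions are available.

```sql
-- A hypothetical table variable with a clustered primary key and an
-- additional non-clustered index declared inline.
DECLARE @Orders TABLE
  (OrderID INT NOT NULL PRIMARY KEY CLUSTERED,
   CustomerID INT NOT NULL,
   OrderDate DATE NOT NULL,
   INDEX IX_Orders_Customer NONCLUSTERED (CustomerID, OrderDate));

INSERT INTO @Orders (OrderID, CustomerID, OrderDate)
  VALUES (1, 10, '2020-01-05'),
         (2, 10, '2020-02-11'),
         (3, 20, '2020-02-12');

-- The index can support this lookup, but it still has no statistics.
SELECT OrderID, OrderDate
  FROM @Orders
  WHERE CustomerID = 10;
```

The index must be part of the DECLARE statement, because a table variable cannot be altered after it is declared.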
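Even with relatively modest row counts, you can encounter query performance problems if you run a query that is a join and have forgotten to define a PRIMARY KEY or UNIQUE constraint on the column used for the join. Without the metadata that these provide, the optimizer has no knowledge of the logical order of the data, or of whether the join column contains duplicate values, and will probably choose inefficient join operations, resulting in slow queries. If you are working with a table variable heap, you can only use it as a simple list that is likely to be processed in a single gulp (a table scan). If you combine the OPTION (RECOMPILE) hint, for accurate cardinality estimates, with a key on the join column, to give the optimizer useful metadata, then for smaller data sets you can often achieve query speeds similar to, or better than, those of a local temporary table.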
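Once row counts increase beyond a table variable's comfort zone, or you need to do more complex data processing, you are best switching to temporary tables. Here you have the full range of indexing options, and the optimizer has the luxury of statistics for each of these indexes. The downside, of course, is that temporary tables come with a higher maintenance cost. You need to clean up after yourself to avoid tempdb congestion. If you alter a temporary table, or modify the data in it, you may incur recompiles of the parent routine.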
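A minimal sketch of the temporary-table alternative follows, with an extra index created after the table and an explicit cleanup at the end. The table, column, and index names are hypothetical.

```sql
-- A hypothetical working table in tempdb.
CREATE TABLE #Readings
  (ReadingID INT NOT NULL PRIMARY KEY,
   SensorID INT NOT NULL,
   Reading DECIMAL(9, 2) NOT NULL);

-- Unlike a table variable, a temporary table can be indexed after it
-- has been created, and the index has statistics.
CREATE NONCLUSTERED INDEX IX_Readings_Sensor
  ON #Readings (SensorID) INCLUDE (Reading);

INSERT INTO #Readings (ReadingID, SensorID, Reading)
  VALUES (1, 100, 20.50), (2, 100, 21.25), (3, 200, 18.00);

SELECT SensorID, Avg(Reading) AS AverageReading
  FROM #Readings
  GROUP BY SensorID;

-- Clean up explicitly so a long-lived connection does not hold on to
-- tempdb space.
IF Object_Id('tempdb..#Readings') IS NOT NULL
  DROP TABLE #Readings;
```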
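Temporary tables are better when there is a requirement for a large number of deletions and insertions (rowset sharing). This is especially true if the data must be removed from the table entirely, because only temporary tables support truncation. If the data is volatile, the compromises in the design of table variables, such as the lack of statistics and recompiles, work against them.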
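The difference in how the two are emptied can be sketched as follows; #Staging and @Staging are illustrative names only.

```sql
-- Temporary table: can be emptied with TRUNCATE TABLE.
CREATE TABLE #Staging (id INT NOT NULL);
INSERT INTO #Staging (id) VALUES (1), (2), (3);
TRUNCATE TABLE #Staging;

-- Table variable: TRUNCATE TABLE is not accepted, so the rows have to
-- be removed with DELETE instead.
DECLARE @Staging TABLE (id INT NOT NULL);
INSERT INTO @Staging (id) VALUES (1), (2), (3);
-- TRUNCATE TABLE @Staging;  -- syntax error if uncommented
DELETE FROM @Staging;

SELECT (SELECT Count(*) FROM #Staging) AS RowsInTempTable,
       (SELECT Count(*) FROM @Staging) AS RowsInTableVariable;

DROP TABLE #Staging;
```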
When it pays to use table variables
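We'll start with an example where a table variable is ideal and gives better performance. We will produce a list of Adventureworks employees, the departments they work in, and the shifts they work. We are dealing with a small data set (291 rows).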
USE AdventureWorks2016;
--initialise our timer
DECLARE @log TABLE
  (TheOrder INT IDENTITY(1,1),
   WhatHappened varchar(200),
   WHENItDid Datetime2 DEFAULT GETDATE());
CREATE TABLE #Employees
  (Employee NATIONAL CHARACTER VARYING(500) NOT NULL);
----start of timing
INSERT INTO @log(WhatHappened)
  SELECT 'Starting My_Section_of_code'; --place at the start

--start by using a table variable for workpad
DECLARE @WorkPad TABLE
  (NameOfEmployee NATIONAL CHARACTER VARYING(100) NOT NULL,
   BusinessEntityID INT PRIMARY KEY NOT NULL,
   NationalIDNumber NATIONAL CHARACTER VARYING(15) NOT NULL);
INSERT INTO @WorkPad (NameOfEmployee, BusinessEntityID, NationalIDNumber)
  SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName + ' '
         + Coalesce(Person.MiddleName + ' ', '') + Person.LastName + ': '
         + Coalesce(Person.Suffix, '') + Employee.JobTitle,
         Employee.BusinessEntityID, Employee.NationalIDNumber
    FROM HumanResources.Employee
      INNER JOIN Person.Person
        ON Person.BusinessEntityID = Employee.BusinessEntityID;
INSERT INTO #Employees(Employee)
  SELECT TheList.NameOfEmployee + ' - '
         + Coalesce(
             Stuff(
               (SELECT ', ' + Department.Name + ' (' + Department.GroupName + ') '
                       + Convert(CHAR(5), Shift.StartTime) + ' to '
                       + Convert(CHAR(5), Shift.EndTime)
                  FROM HumanResources.EmployeeDepartmentHistory
                    INNER JOIN HumanResources.Department
                      ON Department.DepartmentID = EmployeeDepartmentHistory.DepartmentID
                    INNER JOIN HumanResources.Shift
                      ON Shift.ShiftID = EmployeeDepartmentHistory.ShiftID
                  WHERE EmployeeDepartmentHistory.BusinessEntityID = TheList.BusinessEntityID
                FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
               1, 2, ''), '?') AS Department
    FROM @WorkPad TheList;
INSERT INTO @log(WhatHappened)
  SELECT 'The use of a Table Variable took '; --where the routine you want to time ends

--now use a temp table for workpad instead
CREATE TABLE #WorkPad
  (NameOfEmployee NATIONAL CHARACTER VARYING(100) NOT NULL,
   BusinessEntityID INT PRIMARY KEY NOT NULL,
   NationalIDNumber NATIONAL CHARACTER VARYING(15) NOT NULL);
INSERT INTO #WorkPad (NameOfEmployee, BusinessEntityID, NationalIDNumber)
  SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName + ' '
         + Coalesce(Person.MiddleName + ' ', '') + Person.LastName + ': '
         + Coalesce(Person.Suffix, '') + Employee.JobTitle,
         Employee.BusinessEntityID, Employee.NationalIDNumber
    FROM HumanResources.Employee
      INNER JOIN Person.Person
        ON Person.BusinessEntityID = Employee.BusinessEntityID;
INSERT INTO #Employees(Employee)
  SELECT TheList.NameOfEmployee + ' - '
         + Coalesce(
             Stuff(
               (SELECT ', ' + Department.Name + ' (' + Department.GroupName + ') '
                       + Convert(CHAR(5), Shift.StartTime) + ' to '
                       + Convert(CHAR(5), Shift.EndTime)
                  FROM HumanResources.EmployeeDepartmentHistory
                    INNER JOIN HumanResources.Department
                      ON Department.DepartmentID = EmployeeDepartmentHistory.DepartmentID
                    INNER JOIN HumanResources.Shift
                      ON Shift.ShiftID = EmployeeDepartmentHistory.ShiftID
                  WHERE EmployeeDepartmentHistory.BusinessEntityID = TheList.BusinessEntityID
                FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
               1, 2, ''), '?') AS Department
    FROM #WorkPad TheList;
INSERT INTO @log(WhatHappened)
  SELECT 'The use of a temporary Table took '; --where the routine you want to time ends
DROP TABLE #Employees;
DROP TABLE #WorkPad;

/* now we see how long each took. */
SELECT ending.WhatHappened,
       DateDiff(ms, starting.WHENItDid, ending.WHENItDid) AS ms
  FROM @log AS starting
    INNER JOIN @log AS ending
      ON ending.TheOrder = starting.TheOrder + 1;
--list out all the timings

Here is a typical result on my slow test machine:
Problems of scale and forgetting to provide a key or a hint
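What is the performance like if we join two table variables? Let's try it out. For this example, we need two simple tables: one containing all the common words in the English language (CommonWords), and the other containing a list of all the words in Bram Stoker's "Dracula" (WordsInDracula). The TestTVsAndTTs download includes the script to create these two tables and populate each one from its associated text file. There are 60,000 common words, but Bram Stoker used only 10,000 of them. The former is well beyond the break-even point, where one starts to prefer temporary tables.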
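We'll use four simple outer join queries, testing the result for NULL values, to find the common words that are not in Dracula, the common words that are in Dracula, the words in Dracula that are uncommon, and finally another query to find the common words in Dracula, but joining in the opposite direction. You'll see the queries shortly, when I show the code for the test rig.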
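Following are the results of the initial test runs. In the first run, both table variables have primary keys, and in the second they are both heaps, just to see whether I am exaggerating the problems of not declaring an index on a table variable. Finally, we run the same queries with temporary tables. All tests were run, deliberately, on a slow development server, for purposes of illustration; you will get very different results on a production server.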
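I won't delve into the details of the execution plans behind these performance metrics, other than to give a few broad explanations of the main differences. For the temporary table queries, the optimizer, armed with full knowledge of cardinality and the metadata from the primary key constraints, chooses an efficient Merge Join operator to perform the join. For the table variables with primary keys, the optimizer knows the order of the rows in the join column, and that they contain no duplicates, but it assumes it is dealing with only one row, and so chooses a Nested Loops join instead. Here, it scans one table and then, for each row returned, performs an individual seek of the other table. This becomes less efficient the larger the data sets, and is especially bad in the cases where it scans the CommonWords table variable, because that results in over 60K seeks of the Dracula table variable. The Nested Loops join reaches "peak inefficiency" in the two ten-minute queries that use table variable heaps, because it entails thousands of table scans of CommonWords. Interestingly, the two "common words in Dracula" queries perform much better, because for those two the optimizer chose a Hash Match join instead.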
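Overall, the temporary tables look to be the best choice, but we are not finished yet! Let's add the OPTION (RECOMPILE) hint to the queries that use the table variables with primary keys, and rerun the tests for these queries, along with the original queries that use the temporary tables. We will leave out the poor heaps for the time being.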
What happens if you give those poor heaps the OPTION (RECOMPILE) hint too? Lo, the story changes for them, so that all three timings are much closer.
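Interestingly, the two "common words in Dracula" queries that were fast even on heaps are now much slower. Armed with the correct row counts, the optimizer changes its strategy, but because it still has none of the useful metadata that is available when we define constraints and keys, it makes a bad choice. It scans the CommonWords heap and then attempts a "partial aggregation", estimating that it will aggregate down from 60K rows to a few hundred. It does not know that there are no duplicates, so in fact it does not aggregate down at all, and the aggregation and the subsequent join spill to tempdb.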
The Test Rig
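Please note that this is the test rig in its final form, showing roughly equal performance for the three different types of table. You will need to remove the OPTION (RECOMPILE) hints to get back to the original.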
USE PhilFactor;
--create the working table with all the words from Dracula in it
DECLARE @WordsInDracula TABLE
  (word VARCHAR(40) NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO @WordsInDracula(word)
  SELECT WordsInDracula.word FROM dbo.WordsInDracula;
--create the other working table with all the common words in it
DECLARE @CommonWords TABLE
  (word VARCHAR(40) NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO @CommonWords(word)
  SELECT commonwords.word FROM dbo.commonwords;
--create a timing log
DECLARE @log TABLE
  (TheOrder INT IDENTITY(1, 1),
   WhatHappened VARCHAR(200),
   WhenItDid DATETIME2 DEFAULT GetDate());
----start of the timing (never reported)
INSERT INTO @log(WhatHappened)
  SELECT 'Starting My_Section_of_code'; --place at the start

---------------section of code using table variables
--first timed section of code using table variables
SELECT Count(*) AS [common words not in Dracula]
  FROM @CommonWords AS c
    LEFT OUTER JOIN @WordsInDracula AS d
      ON d.word = c.word
  WHERE d.word IS NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'common words not in Dracula: Both table variables with primary keys ';
--where the routine you want to time ends

--Second timed section of code using table variables
SELECT Count(*) AS [common words in Dracula]
  FROM @CommonWords AS c
    LEFT OUTER JOIN @WordsInDracula AS d
      ON d.word = c.word
  WHERE d.word IS NOT NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'common words in Dracula: Both table variables with primary keys ';
--where the routine you want to time ends

--third timed section of code using table variables
SELECT Count(*) AS [uncommon words in Dracula ]
  FROM @WordsInDracula AS d
    LEFT OUTER JOIN @CommonWords AS c
      ON d.word = c.word
  WHERE c.word IS NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'uncommon words in Dracula: Both table variables with primary keys ';
--where the routine you want to time ends

--last timed section of code using table variables
SELECT Count(*) AS [common words in Dracula ]
  FROM @WordsInDracula AS d
    LEFT OUTER JOIN @CommonWords AS c
      ON d.word = c.word
  WHERE c.word IS NOT NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'more common words in Dracula: Both table variables with primary keys ';
--where the routine you want to time ends

---------------section of code using heap variables
DECLARE @WordsInDraculaHeap TABLE(word VARCHAR(40) NOT NULL);
INSERT INTO @WordsInDraculaHeap(word)
  SELECT WordsInDracula.word FROM dbo.WordsInDracula;
DECLARE @CommonWordsHeap TABLE(word VARCHAR(40) NOT NULL);
INSERT INTO @CommonWordsHeap(word)
  SELECT commonwords.word FROM dbo.commonwords;
INSERT INTO @log(WhatHappened)
  SELECT 'Test Rig Setup ';
--where the routine you want to time ends

--first timed section of code using heap variables
SELECT Count(*) AS [common words not in Dracula]
  FROM @CommonWordsHeap AS c
    LEFT OUTER JOIN @WordsInDraculaHeap AS d
      ON d.word = c.word
  WHERE d.word IS NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'common words not in Dracula: Both Heaps ';
--where the routine you want to time ends

--second timed section of code using heap variables
SELECT Count(*) AS [common words in Dracula]
  FROM @CommonWordsHeap AS c
    LEFT OUTER JOIN @WordsInDraculaHeap AS d
      ON d.word = c.word
  WHERE d.word IS NOT NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'common words in Dracula: Both Heaps ';
--where the routine you want to time ends

--third timed section of code using heap variables
SELECT Count(*) AS [uncommon words in Dracula ]
  FROM @WordsInDraculaHeap AS d
    LEFT OUTER JOIN @CommonWordsHeap AS c
      ON d.word = c.word
  WHERE c.word IS NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'uncommon words in Dracula: Both Heaps ';
--where the routine you want to time ends

--last timed section of code using heap variables
SELECT Count(*) AS [common words in Dracula ]
  FROM @WordsInDraculaHeap AS d
    LEFT OUTER JOIN @CommonWordsHeap AS c
      ON d.word = c.word
  WHERE c.word IS NOT NULL
  OPTION(RECOMPILE);
INSERT INTO @log(WhatHappened)
  SELECT 'common words in Dracula: Both Heaps ';
--where the routine you want to time ends

---------------section of code using Temporary tables
CREATE TABLE #WordsInDracula (word VARCHAR(40) NOT NULL PRIMARY KEY);
INSERT INTO #WordsInDracula(word)
  SELECT WordsInDracula.word FROM dbo.WordsInDracula;
CREATE TABLE #CommonWords (word VARCHAR(40) NOT NULL PRIMARY KEY);
INSERT INTO #CommonWords(word)
  SELECT commonwords.word FROM dbo.commonwords;
INSERT INTO @log(WhatHappened)
  SELECT 'Temp Table Test Rig Setup ';
--where the routine you want to time ends

--first timed section of code using Temporary tables
SELECT Count(*) AS [common words not in Dracula]
  FROM #CommonWords AS c
    LEFT OUTER JOIN #WordsInDracula AS d
      ON d.word = c.word
  WHERE d.word IS NULL;
INSERT INTO @log(WhatHappened)
  SELECT 'common words not in Dracula: Both Temp Tables ';
--where the routine you want to time ends

--Second timed section of code using Temporary tables
SELECT Count(*) AS [common words in Dracula]
  FROM #CommonWords AS c
    LEFT OUTER JOIN #WordsInDracula AS d
      ON d.word = c.word
  WHERE d.word IS NOT NULL;
INSERT INTO @log(WhatHappened)
  SELECT 'common words in Dracula: Both Temp Tables ';
--where the routine you want to time ends

--third timed section of code using Temporary tables
SELECT Count(*) AS [uncommon words in Dracula ]
  FROM #WordsInDracula AS d
    LEFT OUTER JOIN #CommonWords AS c
      ON d.word = c.word
  WHERE c.word IS NULL;
INSERT INTO @log(WhatHappened)
  SELECT 'uncommon words in Dracula:Both Temp Tables ';
--where the routine you want to time ends

--last timed section of code using Temporary tables
SELECT Count(*) AS [common words in Dracula ]
  FROM #WordsInDracula AS d
    LEFT OUTER JOIN #CommonWords AS c
      ON d.word = c.word
  WHERE c.word IS NOT NULL;
INSERT INTO @log(WhatHappened)
  SELECT 'common words in Dracula: Both Temp Tables ';
--where the routine you want to time ends

DROP TABLE #WordsInDracula;
DROP TABLE #CommonWords;

SELECT ending.WhatHappened AS [The test that was run],
       DateDiff(ms, starting.WhenItDid, ending.WhenItDid) AS [Time Taken (Ms)]
  FROM @log AS starting
    INNER JOIN @log AS ending
      ON ending.TheOrder = starting.TheOrder + 1;
--list out all the timings

Listing 2
Conclusions
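There is nothing reckless about using table variables. When used for the purposes for which they were intended, they give better performance, and they clean up after themselves. At a certain point, the compromises that give them better performance (not triggering recompiles, not providing statistics, no rollback, no parallelism) become their downfall.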
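The "no rollback" compromise is easy to overlook, so here is a minimal sketch of it, using hypothetical names @tv and #tt. Rows inserted into a table variable inside a transaction remain after the transaction is rolled back, whereas the insert into the temporary table is undone.

```sql
DECLARE @tv TABLE (id INT NOT NULL);
CREATE TABLE #tt (id INT NOT NULL);

BEGIN TRANSACTION;
  INSERT INTO @tv (id) VALUES (1);
  INSERT INTO #tt (id) VALUES (1);
ROLLBACK TRANSACTION;

-- The table variable keeps its row (1); the temporary table's insert
-- was rolled back (0).
SELECT (SELECT Count(*) FROM @tv) AS RowsInTableVariable,
       (SELECT Count(*) FROM #tt) AS RowsInTempTable;

DROP TABLE #tt;
```

This behaviour can be handy for logging inside a transaction that may fail, but it is a trap if you expect the table variable to be restored along with everything else.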
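Often, the SQL Server pundit will give sage advice about the size of result that will cause problems for a table variable. The results I have shown in this article suggest that this oversimplifies the issue. There are two important factors. If you have a result of more than, let us say, 1000 rows (this figure depends on context), then you need a PRIMARY KEY or UNIQUE key for any query that joins to the table variable. At a certain point, you will also need to trigger a recompile to get a decent execution plan, which has its own overhead.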
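Even then, performance can suffer badly, especially if you are performing more complex processing, because the optimizer still has no access to statistics, and so has no knowledge of the selectivity of any query predicate. In such cases, you will need to switch to temporary tables.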