Differences between SQL Server 2000, 2005 and 2008
The main differences among SQL Server 2000, 2005 and 2008 are shown in the following table.
S.NO
|
SQL SERVER 2000
| SQL SERVER 2005 |
SQL SERVER 2008
|
1 | Query Analyzer and Enterprise manager are separate. | Both are combined as SQL Server management Studio (SSMS). | Both are combined as SQL Server management Studio (SSMS). |
2 | No XML datatype is used. | XML datatype is introduced. | XML datatype is used. |
3 | We can create a maximum of 65,535 databases. | We can create 2(pow(20))-1 databases. | We can create 2(pow(20))-1 databases. |
4 | Nill | Exception Handling | Exception Handling |
5 | Nill | Varchar(Max) data type | Varchar (Max) data type |
6 | Nill | DDL Triggers | DDL Triggers |
7 | Nill | DataBase Mirroring | DataBase Mirroring |
8 | Nill | RowNumber function for paging | RowNumber function for paging |
9 | Nill | Table fragmentation | Table fragmentation |
10 | Nill | Nill | Nill |
11 | Nill | Bulk Copy Update | Bulk Copy |
12 | Nill | Can't encrypt | Encrypt the entire database introduced in 2008. |
13 | Can't compress the tables and indexes. | Can compress tables and indexes (introduced in 2005 SP2). | Compress indexes. |
14 | Datetime datatype is used for both date and time. | Datetime is used for both date and time. | Date and time are separately used for date and time datatype. |
15 | No table datatype is included. | No table datatype is included. | Table datatype introduced. |
16 | No SSIS is included. | SSIS is started using. | SSIS avails in this version. |
17 | CMS is not available. | CMS is not available. | Central Management Server (CMS) was introduced. |
18 | PBM is not available. | PBM is not available. | Policy based management (PBM) server was introduced. |
19 | PIVOT and UNPIVOT functions are not used. | PIVOT and UNPIVOT functions are used. | PIVOT and UNPIVOT functions are used. |
Overview and advantages of SQL Server 2008 depending on our need
SQL Server 2008 can be a data storage backend for various varieties of data: XML, time/calendar, file, document, spatial, and so on as well as perform search, query, analysis, sharing, and synchronization across all data types.
Other new data types include specialized date and time types and a Spatial data type for location-dependent data. Better support for unstructured and semi-structured data is provided using the new FILESTREAM data type, that can be used to reference any file stored on the file system. Structured data and metadata about the file is stored in SQL Server database, whereas the unstructured component is stored in the file system. Such files can be accessed both via Win32 file handling APIs as well as via SQL Server using T-SQL.
SQL Server 2008 delivers a rich set of integrated services that enable you to do more with your data, such as query, search, synchronize, report, and analyze.
SQL Server 2008 provides the highest levels of security, reliability, and scalability for your business-critical applications.
SQL Server 2008 also natively supports hierarchical data, and includes T-SQL constructs to directly deal with them, without using recursive queries.
Other new data types include specialized date and time types and a Spatial data type for location-dependent data. Better support for unstructured and semi-structured data is provided using the new FILESTREAM data type, that can be used to reference any file stored on the file system. Structured data and metadata about the file is stored in SQL Server database, whereas the unstructured component is stored in the file system. Such files can be accessed both via Win32 file handling APIs as well as via SQL Server using T-SQL.
SQL Server 2008 delivers a rich set of integrated services that enable you to do more with your data, such as query, search, synchronize, report, and analyze.
SQL Server 2008 provides the highest levels of security, reliability, and scalability for your business-critical applications.
SQL Server 2008 also natively supports hierarchical data, and includes T-SQL constructs to directly deal with them, without using recursive queries.
Data Compression
Data compression reduces the amount of storage space needed to store tables and indexes, that enables more efficient storage of data. Data Compression does not require changes to be made to applications in order to be enabled.
Saves disk storage.
Enable the compression option for individual tables or indexes.
Configuration is easy using the Data Compression wizard.
Applications do not need to be reconfigured, since SQL Server handles compression and decompression of data.
Compression can improve disk I/O and memory utilization.
Transparent Data Encryption
Transparent Data Encryption enables data to be stored securely by encrypting the database files. If the disks that contain database files become compromised then data in those files is protected because that data can only be de-encrypted by an authorized agent. SQL Server performs the encryption and de-encryption directly, so the process is entirely transparent to connecting applications. Applications can continue to read and write data to and from the database as they normally would. Backup copies of encrypted database files are also automatically encrypted.
Implements strong encryption keys and certificates to secure data.
Applications do not need to be modified to support Transparent Data Encryption.
Enables compliance with data privacy regulations.
Does not increase the size of the database.
Backup Compression
Backup compression enables the backup of a database to be compressed without the need to compress the database itself. All backup types, including log backups, are supported and data is automatically uncompressed upon restore.
- Saves storage space
- Compressed backups can be stored on tape or on disk
- Simple configuration using SQL Server Management Studio
- The default state of all backups on a server to be compressed can be configured
Table-Valued Parameters
Table-Valued Parameters (TVPs) allows Stored Procedures to accept and return lists of parameters. Developers can write applications that pass sets of data into Stored Procedures rather than just one value at a time. Table-valued parameters make the development of Stored Procedures that manipulate data more straightforward and can improve performance by reducing the number of times a procedure needs to call a database.
Table-Valued Parameters (TVPs) allows Stored Procedures to accept and return lists of parameters. Developers can write applications that pass sets of data into Stored Procedures rather than just one value at a time. Table-valued parameters make the development of Stored Procedures that manipulate data more straightforward and can improve performance by reducing the number of times a procedure needs to call a database.
- Eliminates the need to use less efficient methods of processing data sets, such as passing XML data into Stored Procedures
- Reduces complexity and simplifies the process of working with data sets for developers
In 2008, Microsoft introduced "Table Value Parameters" (TVP).
CREATE TYPE PeepsType AS TABLE (Name varchar(20), Age int)
DECLARE @myPeeps PeepsType
INSERT @myPeeps SELECT 'murphy', 35
INSERT @myPeeps SELECT 'galen', 31
INSERT @myPeeps SELECT 'samuels', 27
INSERT @myPeeps SELECT 'colton', 42
exec sp_MySproc2 @myPeeps
And the sproc would look like this:
CREATE PROCEDURE sp_MySproc2(@myPeeps PeepsType READONLY) ...
The advantage here is that you can treat the Table Type as a regular table, use it in joins, and so on.
LINQ for SQL
Application developers who are not database developers have long wished for a more tightly bound way to express SQL in their source code. Dynamic SQL can't be checked at compile time, and Stored Procedures aren't visible in application source code. Enter LINQ for SQL, or "Language Integrated Query". LINQ for SQL allows developers to code their queries against local pre-generated classes, not remote database objects. The classes isolate the developer from the specifics of where the data is stored and in what format, and offer a higher degree of type fidelity and developer productivity.
Virtual Earth Integration
Use the new spatial data types in SQL Server 2008 with Microsoft Virtual Earth to deliver rich graphical representations of the physical locations stored in a database. Use Virtual Earth support to create applications that display data about locations in Desktop maps or web pages. For example, SQL Server 2008 makes it easy to show the locations of all company sites that are less than 50 kilometers from Denver.
Spatial data types enable geographical data to be stored natively in SQL Server 2008 databases.
Microsoft Office 2007
SQL Server 2008 can tightly integrate with Microsoft Office 2007. For example, in SQL Server Reporting Server, reports can now export directly to Word. In addition, both Word and Excel can be used as templates for SSRS reports by using the Report Authoring tool. Excel SSAS performance has been improved and there is a data mining add-in.
Merge
SQL 2008 includes the TSQL command MERGE. This statement allows a single statement to UPDATE, INSERT, or DELETE a row depending on its condition. The example below demonstrates the MEGRE being used to combine a new list of Inventory Item descriptions into the existing Inventory Master. In addition to new Descriptions, there are some new parts included in the NewInventory table. Without the Merge statement, two commands would need to run. The first would look for a match then upgrade the Description. The second statement would look for non matches and then INSERT. With MERGE, one statement can perform both tasks as in the following.
And the sproc would look like this:
CREATE PROCEDURE sp_MySproc2(@myPeeps PeepsType READONLY) ...
The advantage here is that you can treat the Table Type as a regular table, use it in joins, and so on.
LINQ for SQL
Application developers who are not database developers have long wished for a more tightly bound way to express SQL in their source code. Dynamic SQL can't be checked at compile time, and Stored Procedures aren't visible in application source code. Enter LINQ for SQL, or "Language Integrated Query". LINQ for SQL allows developers to code their queries against local pre-generated classes, not remote database objects. The classes isolate the developer from the specifics of where the data is stored and in what format, and offer a higher degree of type fidelity and developer productivity.
Virtual Earth Integration
Use the new spatial data types in SQL Server 2008 with Microsoft Virtual Earth to deliver rich graphical representations of the physical locations stored in a database. Use Virtual Earth support to create applications that display data about locations in Desktop maps or web pages. For example, SQL Server 2008 makes it easy to show the locations of all company sites that are less than 50 kilometers from Denver.
Spatial data types enable geographical data to be stored natively in SQL Server 2008 databases.
Microsoft Office 2007
SQL Server 2008 can tightly integrate with Microsoft Office 2007. For example, in SQL Server Reporting Server, reports can now export directly to Word. In addition, both Word and Excel can be used as templates for SSRS reports by using the Report Authoring tool. Excel SSAS performance has been improved and there is a data mining add-in.
Merge
SQL 2008 includes the TSQL command MERGE. This statement allows a single statement to UPDATE, INSERT, or DELETE a row depending on its condition. The example below demonstrates the MEGRE being used to combine a new list of Inventory Item descriptions into the existing Inventory Master. In addition to new Descriptions, there are some new parts included in the NewInventory table. Without the Merge statement, two commands would need to run. The first would look for a match then upgrade the Description. The second statement would look for non matches and then INSERT. With MERGE, one statement can perform both tasks as in the following.
MERGE
InventoryMaster AS im
USING
(SELECT InventoryID, Descr FROM NewInventory) AS src
ON
im. InventoryID = src. InventoryID
WHEN
MATCHED THEN
UPDATE
SET im.Descr = src.Descr
WHEN
NOT MATCHED THEN
INSERT
(InventoryID, Descr) VALUES (src. InventoryID, src.Descr);
Inline variable assignment
Instead of:
Inline variable assignment
Instead of:
DECLARE @myVar int
SET @myVar = 5
you can do it in one line:
DECLARE @myVar int = 5
New Feature and datatype of SQL Server 2008 over 2000
Common table expressions ( CTE)
Common table expressions are temporary result sets defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or CREATE VIEW statement. It's like a derived table that lasts only as long as the session of the query.
Advantages of CTE
Common table expressions are temporary result sets defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or CREATE VIEW statement. It's like a derived table that lasts only as long as the session of the query.
Advantages of CTE
- Can be used to create a recursive query
- Can be substituted for a view
- Allow grouping by a column that might be derived from a scalar subset
- Can reference itself multiple times
Recently I recommended to our development group the use of CTE instead of a temporary table or table variable and the benefits of CTE. We implemented CTE in production to help leverage execution time for a particular query.
CTEs provide an easy way of writing and reviewing queries. Another unique and defining element of CTE is that it can self-reference, meaning it can refer to itself as many times as required in the same query. Microsoft developed the Common Table Expression for SQL Server 2005 based on the ANSI SQL-99 standard. Before its introduction, SQL 2000 users were using derived and temporary tables.
Temporary tables as used in SQL 2000, allow them to be called repeatedly from within a statement, whereas in a Common Table Expression, it can be called immediately after stating it. Hence, if you write syntax similar to the following example, there won't be any response to the CTE you call.
CTEs provide an easy way of writing and reviewing queries. Another unique and defining element of CTE is that it can self-reference, meaning it can refer to itself as many times as required in the same query. Microsoft developed the Common Table Expression for SQL Server 2005 based on the ANSI SQL-99 standard. Before its introduction, SQL 2000 users were using derived and temporary tables.
Temporary tables as used in SQL 2000, allow them to be called repeatedly from within a statement, whereas in a Common Table Expression, it can be called immediately after stating it. Hence, if you write syntax similar to the following example, there won't be any response to the CTE you call.
USE AdventureWorks
GO
WITH MyCTE (LP, SP) AS
(
SELECT LP, LP * .95 FROM itemlist.item
)
SELECT * FROM itemlist.item
SELECT * FROM My_CTE
GO
Both a CTE and a temporary table can be called by name using the SELECT * statement.
Derived tables in SQL 2000 and CTEs in SQL 2005 are very similar. Derived tables serve in simple queries. They however suffer from two drawbacks; they can be used only once and you cannot refer to them by name. CTEs score over derived tables when it comes to complex statements. In fact, CTEs can efficiently handle simple queries too, since they can be rewritten as derived tables as in the following:
Both a CTE and a temporary table can be called by name using the SELECT * statement.
Derived tables in SQL 2000 and CTEs in SQL 2005 are very similar. Derived tables serve in simple queries. They however suffer from two drawbacks; they can be used only once and you cannot refer to them by name. CTEs score over derived tables when it comes to complex statements. In fact, CTEs can efficiently handle simple queries too, since they can be rewritten as derived tables as in the following:
SELECT * FROM
(
SELECT LP, (LP * .95) AS SP
FROM itemlist.item
)
MyDerivedTable
A recursive query used in SQL Server 2005 refers to a recursive CTE. It allows you to set up arbitrary levels of queries on data that have a tree-like structure (hierarchical). For instance, the employee reporting structure in an organization is something that was not possible in SQL 2000.
EXAMPLE OF CTE
The following example shows the components of the CTE structure: expression name, column list, and query. The CTE expression Sales_CTE has three columns (SalesPersonID, NumberOfOrders, and MaxDate) and is defined as the total number of sales orders and the most recent sales order date in the SalesOrderHeader table for each salesperson. When the statement is executed, the CTE is referenced two times: one time to return the selected columns for the salesperson, and again to retrieve similar details for the salesperson's manager. The data for both the salesperson and the manager are returned in a single row.
A recursive query used in SQL Server 2005 refers to a recursive CTE. It allows you to set up arbitrary levels of queries on data that have a tree-like structure (hierarchical). For instance, the employee reporting structure in an organization is something that was not possible in SQL 2000.
EXAMPLE OF CTE
The following example shows the components of the CTE structure: expression name, column list, and query. The CTE expression Sales_CTE has three columns (SalesPersonID, NumberOfOrders, and MaxDate) and is defined as the total number of sales orders and the most recent sales order date in the SalesOrderHeader table for each salesperson. When the statement is executed, the CTE is referenced two times: one time to return the selected columns for the salesperson, and again to retrieve similar details for the salesperson's manager. The data for both the salesperson and the manager are returned in a single row.
WITH Sales_CTE (SalesPersonID, NumberOfOrders, MaxDate)
AS
( SELECT SalesPersonID, COUNT(*), MAX(OrderDate)
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID
)
SELECT E.EmployeeID, OS.NumberOfOrders, OS.MaxDate,
E.ManagerID, OM.NumberOfOrders, OM.MaxDate
FROM HumanResources.Employee AS E
JOIN Sales_CTE AS OS
ON E.EmployeeID = OS.SalesPersonID
LEFT OUTER JOIN Sales_CTE AS OM
ON E.ManagerID = OM.SalesPersonID
ORDER BY E.EmployeeID;
Another Example Of CTE
Instead of using a function, I'm hoping to use a CTE to split a string into individual records. I have most of it, except that the last record is the word-before last (see the sample results below for reference). You will notice that "delimiter" is the new rowdata, and "a" is the itemdata. I want another row that has "" as row data, and "delimiter" as itemdata.
Another Example Of CTE
Instead of using a function, I'm hoping to use a CTE to split a string into individual records. I have most of it, except that the last record is the word-before last (see the sample results below for reference). You will notice that "delimiter" is the new rowdata, and "a" is the itemdata. I want another row that has "" as row data, and "delimiter" as itemdata.
DECLARE @RowData varchar(2000)
DECLARE @SplitOn varchar(1)
SELECT
@RowData = 'This is a test for splitting using a CTE to break a varchar into records based on a delimiter',
@SplitOn = ' '
;WITH CTE (RowData, SplitOn, ItemData, Cnt) AS
(
SELECT
SUBSTRING(@RowData, CHARINDEX(@SplitOn, @RowData) + 1, LEN(@RowData)),
@SplitOn,
RTRIM(LTRIM(SUBSTRING(@RowData, 1, CHARINDEX(@SplitOn, @RowData) - 1))),
UNION ALL
SELECT
SUBSTRING(RowData, CHARINDEX(SplitOn, RowData) + 1, LEN(RowData)),
SplitOn,
RTRIM(LTRIM(SUBSTRING(RowData, 1, CHARINDEX(SplitOn, RowData) - 1))),
Cnt + 1
FROM CTE B
WHERE
CHARINDEX(SplitOn, RowData) > 0
)
SELECT * FROM CTE
Change Data Capture (CDC)
Change Data Capture makes database auditing easier to create and maintain. While DML auditing has been available via triggers in SQL Server for a few versions, and DDL auditing has been available since SQL Server 2005, the solutions need to be customized by a developer with reasonably intimate details of the database's underpinnings. With Change Data Capture, system Stored Procedures are used to mark which types of objects you want to audit, and the Stored Procedures take care of how the auditing occurs.
While setting up these auditing procedures is typically a one-time deal, business requirements change over time; it is usually easier to use system Stored Procedures to handle changes, plus it's less error prone than changing your custom auditing code. Unfortunately, when you use Change Data Capture, you will lose the auditing ability that you currently have with your own auditing solution.
Change data capture enables SQL Server administrators and developers to capture insert, update and delete events in a SQL Server table as well as the details of the event that caused data change on the relevant database table. When you apply a Change Data Capture feature on a database table, a mirror of the tracked table is created that reflects the same column structure of the original table and additional columns that include metadata that is used to summarize what is the change in the database table row.
So enabling the Change Data Capture feature on a database table, you can track the activity on modified rows or records in the related table. Change Data Capture (CDC) can be considered as the Microsoft solution for data capture systems in SQL Server 2008 and later versions. There were samples of data capture solutions implemented for Microsoft SQL Server 2000 and SQL Server 2005 by using after update/insert or after delete triggers. But CDC enables SQL Server developers to build SQL Server data archiving without a necessity to create triggers on tables for logging. SQL Server database administrators or programmers can also easily monitor the activity for the logged tabled.
How to enable a SQL Server database for Data Change Capture
Before applying the data change capture (CDC) on a SQL Server database table, the database should be enabled for Data Change Capture. To see whether data change capture (CDC) is enabled over a database, you can run and check the "is_cdc_enabled" column value in the "sys.databases" catalog view by running the following query.
Change Data Capture (CDC)
Change Data Capture makes database auditing easier to create and maintain. While DML auditing has been available via triggers in SQL Server for a few versions, and DDL auditing has been available since SQL Server 2005, the solutions need to be customized by a developer with reasonably intimate details of the database's underpinnings. With Change Data Capture, system Stored Procedures are used to mark which types of objects you want to audit, and the Stored Procedures take care of how the auditing occurs.
While setting up these auditing procedures is typically a one-time deal, business requirements change over time; it is usually easier to use system Stored Procedures to handle changes, plus it's less error prone than changing your custom auditing code. Unfortunately, when you use Change Data Capture, you will lose the auditing ability that you currently have with your own auditing solution.
Change data capture enables SQL Server administrators and developers to capture insert, update and delete events in a SQL Server table as well as the details of the event that caused data change on the relevant database table. When you apply a Change Data Capture feature on a database table, a mirror of the tracked table is created that reflects the same column structure of the original table and additional columns that include metadata that is used to summarize what is the change in the database table row.
So enabling the Change Data Capture feature on a database table, you can track the activity on modified rows or records in the related table. Change Data Capture (CDC) can be considered as the Microsoft solution for data capture systems in SQL Server 2008 and later versions. There were samples of data capture solutions implemented for Microsoft SQL Server 2000 and SQL Server 2005 by using after update/insert or after delete triggers. But CDC enables SQL Server developers to build SQL Server data archiving without a necessity to create triggers on tables for logging. SQL Server database administrators or programmers can also easily monitor the activity for the logged tabled.
How to enable a SQL Server database for Data Change Capture
Before applying the data change capture (CDC) on a SQL Server database table, the database should be enabled for Data Change Capture. To see whether data change capture (CDC) is enabled over a database, you can run and check the "is_cdc_enabled" column value in the "sys.databases" catalog view by running the following query.
select [name], database_id, is_cdc_enabled from sys.databases
If the "is_cdc_enabled" column value for the related database is false, then you can enable this change data capture by running the "sys.sp_cdc_enable_db_change_data_capture" system Stored Procedure in the related database.
exec sys.sp_cdc_enable_db_change_data_capture
When CDC is enabled on a database, a new user named "cdc", a schema named "cdc" and the following tables are created on the related database.
- cdc.captured_columns
- cdc.change_tables
- cdc.ddl_history
- cdc.index_columns
- cdc.lsn_time_mapping
How to enable a database table for Data Change Capture
After enabling the database for CDC, you can configure the tables to track the modifications upon by running the "sys.sp_cdc_enable_table_change_data_capture" system Stored Procedure.
Before running the s"ys.sp_cdc_enable_table_change_data_capture" Stored Procedure you can check whether the related table is enabled prior for data change capture by querying the "is_tracked_by_cdc" column value in the "sys.tables" for the related database table.
select [name], is_tracked_by_cdc from sys.tables
Also for Change Data Capture to be successful, SQL Server Agent should be running. Otherwise, you may get the following error message:
SQLServerAgent is not currently running so it cannot be notified of this action.
exec sys.sp_cdc_enable_table_change_data_capture @source_schema = N'dbo', @source_name = N'Country', @role_name = N'cdc_Admin'
SQL Server Agent plays an important role in the database design or implementation of data capture solution for Microsoft SQL Server databases since for each database two CDC related jobs are created automatically when "sys.sp_cdc_enable_table_change_data_capture" is first executed for a database. These jobs are named "cdc.database_name_capture" and "cdc.database_name_cleanup".
The "cdc.database_name_capture" SQL Server job runs the SQL Server system Stored Procedure "sys.sp_MScdc_capture_job" to start the Change Data Capture Collection Agent.
For more details inside views of SQL Server for CDC you can examine the Stored Procedure "sp_cdc_scan" to see how the CDC scan is executed on a data capture enabled SQL database.
The "cdc.database_name_cleanup" SQL Server job runs the SQL Server system Stored Procedure "sys.sp_MScdc_cleanup_job" to clean up database changes tables.
After you have run the T-SQL command above, you can view the "cdc.change_tables" for the newly created record. "capture_instance" is a parameter that can be set by passing it as a parameter to the "sys.sp_cdc_enable_table_change_data_capture" procedure. If you have not passed the capture instance as a parameter then you can get the value created by default from the "capture_instance column" value of the "cdc.change_tables table".
After you have enabled CDC for a table, for example for Country table just as I did, another table is created for keeping changed data and information about the changes.
As you will see there are five additional fields to the mirrored original table as in the following:
After enabling the database for CDC, you can configure the tables to track the modifications upon by running the "sys.sp_cdc_enable_table_change_data_capture" system Stored Procedure.
Before running the s"ys.sp_cdc_enable_table_change_data_capture" Stored Procedure you can check whether the related table is enabled prior for data change capture by querying the "is_tracked_by_cdc" column value in the "sys.tables" for the related database table.
select [name], is_tracked_by_cdc from sys.tables
Also for Change Data Capture to be successful, SQL Server Agent should be running. Otherwise, you may get the following error message:
SQLServerAgent is not currently running so it cannot be notified of this action.
exec sys.sp_cdc_enable_table_change_data_capture @source_schema = N'dbo', @source_name = N'Country', @role_name = N'cdc_Admin'
SQL Server Agent plays an important role in the database design or implementation of data capture solution for Microsoft SQL Server databases since for each database two CDC related jobs are created automatically when "sys.sp_cdc_enable_table_change_data_capture" is first executed for a database. These jobs are named "cdc.database_name_capture" and "cdc.database_name_cleanup".
The "cdc.database_name_capture" SQL Server job runs the SQL Server system Stored Procedure "sys.sp_MScdc_capture_job" to start the Change Data Capture Collection Agent.
For more details inside views of SQL Server for CDC you can examine the Stored Procedure "sp_cdc_scan" to see how the CDC scan is executed on a data capture enabled SQL database.
The "cdc.database_name_cleanup" SQL Server job runs the SQL Server system Stored Procedure "sys.sp_MScdc_cleanup_job" to clean up database changes tables.
After you have run the T-SQL command above, you can view the "cdc.change_tables" for the newly created record. "capture_instance" is a parameter that can be set by passing it as a parameter to the "sys.sp_cdc_enable_table_change_data_capture" procedure. If you have not passed the capture instance as a parameter then you can get the value created by default from the "capture_instance column" value of the "cdc.change_tables table".
After you have enabled CDC for a table, for example for Country table just as I did, another table is created for keeping changed data and information about the changes.
As you will see there are five additional fields to the mirrored original table as in the following:
- __$start_lsn
- __$end_lsn
- __$seqval
- __$operation
- __$update_mask
- __$operation identifies the Data Manipulation Language (DML) operation that caused Change Data Capture to fetch this process.
- An __$operation column value 2 refers to an Insert statement
If an Update statement is run against a row in the table then a row with a "__$operation" column value of 3 is created referring to the column values before update and a row with a "__$operation" column value 4 is created referring to the column values after the update statement is run.
- An __$operation column value which is equal to 1 refers to a Delete statement.
- __$update_mask identifies the columns that are affected by the DML command. If an Insert command (where __$operation is equal to 2) or a Delete command (where __$operation is equal to 1) since all the columns are effected __$update_mask has a value formed of bit value 1's for each column.
For example, the following insert command results in a binary value of "111" for each column having 1's in order:
INSERT INTO Country SELECT N'TR', N'TĆ¼rkiye', N'Ankara'
The __$update_mask becomes 0x07 where we have "111" in binary. The binary digit with the least value is the first column in the table.
If an update command is run and the Capital column value is updated then 2 new rows are created in the CDC table of the Country table.
UPDATE Country SET Capital = N'Tunus' WHERE Code = 'TU'
The first row indicates the values prior to the Update statement. This row has the "__$operation" equal to 3 and the "__$update_mask" value equal to 0x04 which is "100" in binary. You can see the updated column value from the table referencing the related column. The second row indicated the values after the update statement. This second row has "__$operation" equal to 4 and again the same "__$update_mask" column value with the new value of the related columns.
You can also get the changes on a table by calling the "cdc.fn_cdc_get_all_changes_capture_instance" function. The "capture_instance" in the function name is the "capture_instance" column value that is declared or determined when the table is enabled for capturing data changes.
You can also find the capture_instance for a table that is tracked and logged for data changes in the "cdc.change_tables" table as the "capture_instance" column value.
You can execute the following SQL statement to fetch the "capture_instance" value to use as a parameter to the "cdc.fn_cdc_get_all_changes_capture_instance" function:
select capture_instance from cdc.change_tables
Sample Demonstration on Change Data Capture with DML queries
Assume that we have an empty dbo.Country table as shown in the following figure.
Assume that on the sample table (that is initially empty) and that the CDC feature is enabled, the following DML queries are executed in order.
insert into Country select 'TR', 'Turkey', 'Ankara'
update Country Set Name = N'TĆ¼rkiye' where Code= 'TR'
delete from Country where code = 'TR'
If you want to see the effects of the preceding insert, update and delete queries on the capture instance of the Country table, run the following select query.
select * from cdc.dbo_Country_CT
How to get a list of all the data changes for a SQL Server database table
So you can run the following SELECT query from the CDC table to get all the changes.
select * from cdc.dbo_Country_CT
Or you can run the following SELECT query from the user function that returns rows only whose "__$start_lsn" value is between @from_lsn and @to_lsn:
DECLARE @begin_time datetime
DECLARE @end_time datetime
DECLARE @from_lsn binary(10)
DECLARE @to_lsn binary(10);
SET @begin_time = GETDATE()-1;
SET @end_time = GETDATE();
SELECT @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time);
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
select * from cdc.fn_cdc_get_all_changes_dbo_Country(@from_lsn, @to_lsn, 'all');
The LSN values "__$start_lsn" are the Log Sequence Number associated with the commit transaction for the changes.
Log sequence numbers for a date-time value can be obtained from the "cdc.lsn_time_mapping" table.
The SQL syntax for "sys.fn_cdc_map_time_to_lsn" to get the mapped lsn value for a datetime value is as follows:
sys.fn_cdc_map_time_to_lsn ( '<relational_operator>' , tracking_time )
Tracking time is in datetime format and the possible relational_operator values are:
The LSN values "__$start_lsn" are the Log Sequence Number associated with the commit transaction for the changes.
Log sequence numbers for a date-time value can be obtained from the "cdc.lsn_time_mapping" table.
The SQL syntax for "sys.fn_cdc_map_time_to_lsn" to get the mapped lsn value for a datetime value is as follows:
sys.fn_cdc_map_time_to_lsn ( '<relational_operator>' , tracking_time )
Tracking time is in datetime format and the possible relational_operator values are:
- largest less than
- largest less than or equal
- smallest greater than
- smallest greater than or equal
To disable the Change Data Capture feature on a database table, you can use the system procedure:
sys.sp_cdc_disable_table_change_data_capture. The syntax of the sys.sp_cdc_disable_table_change_data_capture procedure is as follows;
sys.sp_cdc_disable_table_change_data_capture [ @source_schema = ] 'source_schema' , [ @source_name = ] 'source_name' [, [ @capture_instance = ] 'capture_instance' | 'all' ]
Change's in the DATE and TIME DataTypes
Understanding Of Date Time
In SQL Server 2005 and earlier, SQL Server only offered two date and time data types: DATETIME and SMALLDATETIME. While they were useful in many cases, they had many limitations, including:
sys.sp_cdc_disable_table_change_data_capture. The syntax of the sys.sp_cdc_disable_table_change_data_capture procedure is as follows;
sys.sp_cdc_disable_table_change_data_capture [ @source_schema = ] 'source_schema' , [ @source_name = ] 'source_name' [, [ @capture_instance = ] 'capture_instance' | 'all' ]
Change's in the DATE and TIME DataTypes
Understanding Of Date Time
In SQL Server 2005 and earlier, SQL Server only offered two date and time data types: DATETIME and SMALLDATETIME. While they were useful in many cases, they had many limitations, including:
- Both the date value and the time value are part of both of these data types, and you can't choose to store just one of them. This often causes a lot of wasted storage (because you store data you don't need or want); adds unwanted complexity to many queries because the data types often need to be converted to a different form to be useful and often reduces performance because WHERE clauses with these date and time data types often need to include functions to convert them to a more useful form, preventing these queries from using indexes.
- They are not time-zone aware, that often requires extra coding for time-aware applications.
- Precision is only .333 seconds, that is often not granular enough for some applications.
- The range of supported dates is not adequate for some applications, and the range does not match the range of .NET CLR DATETIME data type, that requires additional conversion code.
To overcome these problems, SQL Server 2008 introduces new date and time data types, that include:
- DATE: As you can imagine, the DATE data type only stores a date in the format of YYYY-MM-DD. It has a range of 0001-01-01 through 9999-12-32, that should be adequate for most business and scientific applications. The accuracy is 1 day, and it only takes 3 bytes to store the date.
- TIME: Time is stored in the format: hh:mm:ss.nnnnnnn, with a range of 00:00:00.0000000 through 23:59:59:9999999 and is accurate to 100 nanoseconds. Storage depends on the precision and scale selected, and runs from 3 to 5 bytes.
DATETIME2: DATETIME2 is very similar to the older DATETIME data type, but has a greater range and precision. The format is YYYY-MM-DD hh:mm:ss:nnnnnnnm with a range of 0001-01-01 00:00:00.0000000 through 9999-12-31 23:59:59.9999999, and an accuracy of 100 nanoseconds. Storage depends on the precision and scale selected, and runs from 6 to 8 bytes.
- All of these new date and time data types work with SQL Server 2008 date and time functions, that have been enhanced in order to properly understand the new formats. In addition, some new date and time functions have been added to get the advantage of the new capabilities of these new data types.
In SQL Server 2005, there were DATETIME or SMALLDATETIME data types to store datetime values but there was no specific datatype to store a date or time value separtely. In addition, search functionality doesn't work on DATETIME or SMALLDATETIME fields if you only specify a data value in the where clause. For example the following SQL query will not work in SQL Server 2005 since you have only specified the date value in the where clause.
SELECT * FROM tblMyDate Where [MyDateTime] = '2010-12-11'
To make it work you need to specify both date and time components in the where clause.
SELECT * FROM tblMyDate Where [MyDateTime] = '2010-12-11 11:00 PM'
With introduction of DATE datatype the above problem is resolved in SQL Server 2008. See the following example.
DECLARE @mydate as DATE
SET @ mydate = getdate()
PRINT @ mydate
The output from the preceding SQL query is the present date only (such as 2010-12-11), no time component is added with the output.
The TIME datatype is also introduced in SQL Server 2008. See the following query using the TIME datatype.
DECLARE @mytime as TIME
SET @mytime = getdate ()
PRINT @mytime
The output of the preceding SQL script is a time only value. The range for the TIME datatype is 00:00:00.0000000 through 23:59:59.9999999.
SQL Server 2008 also introduced a new datatype called DATETIME2. In this datatype, you will have an option to specify the number of fractions (minimum 0 and maximum 7). The following example shows how to use the DATETIME2 datatype.
The output of the preceding SQL script is a time only value. The range for the TIME datatype is 00:00:00.0000000 through 23:59:59.9999999.
SQL Server 2008 also introduced a new datatype called DATETIME2. In this datatype, you will have an option to specify the number of fractions (minimum 0 and maximum 7). The following example shows how to use the DATETIME2 datatype.
DECLARE @mydate7 DATETIME2 (7)
SET @mydate7 = Getdate()
PRINT @mydate7
The result of the script above is 2010-12-11 22:11:19.7030000.
Sparse columns
Sparse columns that optimize storage for null values was introduced in SQL Server 2008. When a column value contains a substantial number of null values, defining the column as sparse saves a significant amount of disk space. In fact, a null value in a sparse column doesn't take any space.
If you decide to implement a sparse column then it must be nullable and cannot be configured with IDENTITY properties and cannot include a default. In addition, you cannot define a column as sparse if it is configured with certain datatypes, such as TEXT, IMAGE, or TIMESTAMP.
The following SQL script shows how to create a table with sparse columns:
Create table mysparsedtable
(
column1 int primary key,
column2 int sparse,
column3 int sparse,
column4 xml column_set for all_sparse_columns
)
Passing tables to functions or procedures using the new Table-Value parameters
SQL Server 2008 introduces a new feature to pass a table datatype into Stored Procedures and functions. The table parameter feature greatly helps to reduce the development time because developers no longer need to worry about constructing and parsing long XML data.
Using this feature, you can also allow the client-side developers (using .NET code) to pass data tables from client-side code to the database. The following example shows how to use Table-Value parameter in Stored Procedures.
In the first step, I created a Student table using the following script:
GO
CREATE TABLE [dbo].[TblStudent]
(
[StudentID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[StudentName] [varchar](30) NOT NULL,
[RollNo] [int] NOT NULL,
[Class] [varchar](10) NOT NULL
)
GO
Next, I have created table datatype for Student table.
GO
CREATE TYPE TblStudentTableType AS TABLE
(
[StudentName] [varchar](30) NOT NULL,
[RollNo] [int] NOT NULL,
[Class] [varchar](10) NOT NULL
)
GO
Then I created a Stored Procedure with the table datatype as an input parameter and to insert the data into the Student table.
Then I created a Stored Procedure with the table datatype as an input parameter and to insert the data into the Student table.
GO
CREATE PROCEDURE sp_InsertStudent
(
@TableVariable TblStudentTableType READONLY
)
AS
BEGIN
INSERT INTO [TblStudent]
(
[StudentName] , [RollNo] , [Class]
)
SELECT
StudentName , RollNo , Class FROM @TableVariable WHERE StudentName = 'Tapas Pal'
END
GO
In the last step, I have entered one sample student record in the table variable and executed the Stored Procedure to enter a sample record in the TblStudent table.
DECLARE @DataTable AS TblStudentTableType
INSERT INTO @DataTable(StudentName , RollNo , Class)
VALUES ('Tapas Pal','1', 'Xii')
EXECUTE sp_InsertStudent
@TableVariable = @DataTable
New MERGE command for INSERT, UPDATE and DELETE operations
SQL Server 2008 provides the MERGE command that is an efficient way to perform multiple Data Manipulation Language (DML) operations at the same time. In SQL Server 2000 and 2005, we needed to write separate SQL statements for INSERT, UPDATE, or DELETE data based on certain conditions, but in SQL Server 2008, using the MERGE statement we can include the logic of similar data modifications in one statement based on a where condition match and mismatch. In the following example, I have created two tables (TblStudent and TblStudentMarks) and inserted sample data to show how a MERGE command works.
New MERGE command for INSERT, UPDATE and DELETE operations
SQL Server 2008 provides the MERGE command that is an efficient way to perform multiple Data Manipulation Language (DML) operations at the same time. In SQL Server 2000 and 2005, we needed to write separate SQL statements for INSERT, UPDATE, or DELETE data based on certain conditions, but in SQL Server 2008, using the MERGE statement we can include the logic of similar data modifications in one statement based on a where condition match and mismatch. In the following example, I have created two tables (TblStudent and TblStudentMarks) and inserted sample data to show how a MERGE command works.
GO
CREATE TABLE [dbo].[TblStudent]
(
[StudentID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[StudentName] [varchar](30) NOT NULL,
[RollNo] [int] NOT NULL,
[Class] [varchar](10) NOT NULL
)
GO
CREATE TABLE TblStudentMarks
(
StudentID INTEGER REFERENCES TblStudent,
StudentMarks INTEGER
)
GO
INSERT INTO TblStudent VALUES('Tapas', '1', 'Xii')
INSERT INTO TblStudent VALUES('Vinod', '2', 'Xiv')
INSERT INTO TblStudent VALUES('Tamal', '3', 'Xii')
INSERT INTO TblStudent VALUES('Tapan', '4', 'Xiii')
INSERT INTO TblStudent VALUES('Debabrata', '5', 'Xv')
INSERT INTO TblStudentMarks VALUES(1,230)
INSERT INTO TblStudentMarks VALUES(2,280)
INSERT INTO TblStudentMarks VALUES(3,270)
INSERT INTO TblStudentMarks VALUES(4,290)
INSERT INTO TblStudentMarks VALUES(5,240)
Now to perform the following operations, I have written a single SQL statement.
Now to perform the following operations, I have written a single SQL statement.
- Delete Record with student name 'Tapas'
- Update Marks and Set to 260 if Marks is <= 230
- Insert a record in TblStudentMarks table if the record doesn't exist
MERGE TblStudentMarks AS stm
USING (SELECT StudentID,StudentName FROM TblStudent) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND sd.StudentName = 'Tapas' THEN DELETE
WHEN MATCHED AND stm.StudentMarks <= 230 THEN UPDATE SET stm.StudentMarks = 260
WHEN NOT MATCHED THEN
INSERT(StudentID,StudentMarks)
VALUES(sd.StudentID,25);
GO
Optimizing MERGE Statement Performance
Spatial datatypes
Optimizing MERGE Statement Performance
Spatial datatypes
Spatial is the new data type introduced in SQL Server 2008 that represents the physical location and shape of any geometric object. Using spatial data types, you can represent countries, roads and so on. The Spatial data type in SQL Server 2008 is implemented as .NET Common Language Runtime (CLR) data type. There are two types of spatial data types available, geometry and geography data type. Let me show you example of a geometric object.
DECLARE @point geometry;
SET @point = geometry::STGeomFromText ('POINT (4 9)', 0);
SELECT @point.STX; -- Will show output 4
SELECT @point.STY; -- Will show output 5
You can use the methods STLength, STStartPoint, STEndPoint, STPointN, STNumPoints, STIsSimple, STIsClosed and STIsRing with geometric objects.
Manage your files and documents efficiently by implanting FILESTREAM datatype
SQL Server 2000 and 2005 do not provide much for storing videos, graphic files, Word documents, Excel spreadsheets and other unstructured data. In SQL Server 2005 you can store unstructured data in VARBINARY (MAX) columns but the maximum limit is 2 GB. To resolve the unstructured files storage issue, SQL Server 2008 introduced the FILESTREAM storage option. The FILESTREAM storage is implemented in SQL Server 2008 by storing VARBINARY (MAX) binary large objects (BLOBs) outside the database and in the NTFS file system. Before implementing FILESTREAM storage, you need to perform the following procedure:
- Enable your SQL Server database instance to use FILESTREAM (enable it using the sp_filestream_configure Stored Procedure. sp_filestream_configure @enable_level = 3)
- Enable your SQL Server database to use FILESTREAM
- Create the "VARBINARY (MAX) FILESTREAM" datatype column in your database
Faster queries and reporting with grouping sets
SQL Server 2008 implements grouping set, an extension to the GROUP BY clause that helps developers to define multiple groups in the same query. Grouping sets help dynamic analysis of aggregates and make querying/reporting easier and faster. The following is an example of grouping set.
SELECT StudentName, RollNo, Class , Section
FROM dbo.tbl_Student
GROUP BY GROUPING SETS ((Class), (Section))
ORDER BY StudentName
ROW_NUMBER,RANK(),DENSE_RANK()
No comments:
Post a Comment