My Favorite Blogs

Saturday, February 22, 2014

Most useful MDW queries I've written but didn't get chance to share long time but now here they are...

Hmm... after a long gap writing this post on my blog... not sounds right... but I was little engaged with both personal and official work which blocked me to spend more time on my writing..

In this write up, I've provided couple of MDW queries which helps DBAs very much when they need to investigate some issues. You can tune up these queries depends on your requirement.

This morning reported an issue that every day at 6PM top of the hour there is a performance degradation on the application, the issue came to me for investigation with a set of given sprocs, I've checked the sproc performance with the collected data and don’t see any performance bottleneck on both db side or SAN side. I've been using MDW for a while and written couple of queries as well, with some additional tune up in the existing one came up with the below query which helped to find the hourly connections on the specific database. This query need to be run on the specific MDW database where you are collecting data for the SQL instance. I've passed the DBName, SQLInstanceName and the collection_time value to get the data for last 5 days and found that every day 6PM the call factor on the db is between 4X to 5X. Then realized that this is not a DB issue but it is volume issue. What else now app team is breaking their heads.... :)

 You can tune this query as needed and MDW has lot more details for your investigations.

-- To find Connections hourly
select  sii.instance_name ServerName,DATEPART(Month, acs.collection_time) CMonth,DATEPART(Day, acs.collection_time)CDay,DATEPART(hour, acs.collection_time) CHour ,COUNT(*) ConsCount
from snapshots.active_sessions_and_requests acs
INNER JOIN core.snapshots_internal si ON acs.snapshot_id=si.snapshot_id
INNER JOIN core.source_info_internal sii ON sii.source_id=si.source_id
where acs.database_name='DBName'
AND sii.instance_name = 'SQLInstanceName'
and collection_time BETWEEN  'From DateTime' AND 'To DateTime'
GROUP BY  sii.instance_name,DATEPART(Month, acs.collection_time),DATEPART(Day, acs.collection_time),DATEPART(hour, acs.collection_time)
Order BY 1,2

Below formatted result set got from the above query output and copied in excel sheet and made a graph out of it give a better understanding to the application teams.



The following query will have with hostname added, I mean the hourly connection list grouped by connection hosts, this will help you to nail down which host is making more calls during the issue time.

-- To find Connections hourly with Hostwise
select  sii.instance_name ServerName,acs.host_name,
DATEPART(Month, acs.collection_time) CMonth,DATEPART(Day, acs.collection_time)CDay,DATEPART(hour, acs.collection_time) CHour ,COUNT(*) ConsCount
from snapshots.active_sessions_and_requests acs
INNER JOIN core.snapshots_internal si ON acs.snapshot_id=si.snapshot_id
INNER JOIN core.source_info_internal sii ON sii.source_id=si.source_id
where acs.database_name='DBName'
AND sii.instance_name = 'SQL InstanceName'
and collection_time BETWEEN  '2014-02-15 08:00:00.9230000 -08:00' AND '2014-02-21 15:00:00.9230000 -08:00'
GROUP BY  sii.instance_name,host_name, DATEPART(Month, acs.collection_time),DATEPART(Day, acs.collection_time),DATEPART(hour, acs.collection_time)
Order BY 6 desc


The following query may be need little more tuning, but this one I've written to find blocking history on a specific database on specific time. The is_blocking column can help you to find the culprit or the head blocker if the value is 1. The blocking_session_id is the column will list all the victims. From this result set you can find the object from sql_text and check wait_type and description to deep dive into the issue give recommendation to the devs or fix.

-- Find Blocking Culprits and Victims

SELECT
collection_time ,Host_name
--, program_name, database_name, login_name
, session_last_request_start_time,session_last_request_end_time, command, object_id, object_name
--, substring(sql_text, 1, 50) -- Sproc or statement details.
,wait_type, wait_duration_ms,wait_resource,resource_description,request_logical_reads,request_total_elapsed_time
,session_id
,blocking_session_id,is_blocking,blocking_exec_context_id
--,*
from snapshots.active_sessions_and_requests acs
INNER JOIN core.snapshots_internal si ON acs.snapshot_id=si.snapshot_id
INNER JOIN core.source_info_internal sii ON sii.source_id=si.source_id
INNER JOIN snapshots.notable_query_text snq ON snq.sql_handle=acs.sql_handle
where acs.database_name='DBName'
AND sii.instance_name = 'SQL InstanceName'
and collection_time BETWEEN  '2014-02-20 17:00:00.9230000 -08:00' AND '2014-02-20 18:00:00.9230000 -08:00'
AND acs.host_name like 'hostname%'
--AND is_blocking = 1 -- Head blocker/ Culprit
AND object_name='Sproc/objectname'
Order by collection_time

The following query is a really helpful one to find a database growth status for a period, you can run the query on the MDW database where you are collecting data for a specific database.

--To find DB growth trend...

SELECT
    c.instance_name
  , a.dbsize / 128 AS 'DBSizeMB'
  , a.logsize / 128 AS 'LogSizeMB'
  , a.ftsize
  , a.database_name
  , convert(datetime,a.collection_time,102)
  ,a.reservedpages / 128 AS 'RpageMB'
  ,a.usedpages / 128 AS 'UsedpagesMB'
  ,a.pages / 128 AS 'PagesMB'
FROM
    MDW.snapshots.disk_usage a INNER JOIN
    MDW.core.snapshots_internal b
    ON a.snapshot_id = b.snapshot_id INNER JOIN
    MDW.core.source_info_internal c
    ON b.source_id = c.source_id
WHERE
      a.database_name='DBName'
ORDER BY
  --  c.instance_name
  --, a.database_name
   a.collection_time DESC

Hoping this is helpful for DBA community... 

Saturday, February 23, 2013

Msg 5184, Level 16, State 2, Line 1

Today I was working on one of the SQL instance on a four node cluster. There are 4 data drives which were mounted already and yesterday I imported 2 new drives (Data5 and 6) in the cluster, they are all mounted drives (Eg. D:\Data1, D:\Data2, where D:\ is a smaller size SAN on it mount all the required drives) The plan was to add new data files on the new drives and cap the existing filled drives to avoid SCOM drive free space alerts. When I tried to add new files on a database in the newly imported drives it didnt succeeded where as it thrown below errors.


Msg 5184, Level 16, State 2, Line 1
Cannot use file 'D:\Data5\SQLDBOps_data5.ndf' for clustered server. Only formatted files on which the cluster resource of the server has a dependency can be used. Either the disk resource containing the file is not present in the cluster group or the cluster resource of the Sql Server does not have a dependency on it.
Msg 5009, Level 16, State 8, Line 1
One or more files listed in the statement could not be found or could not be initialized.

This error occurred due to the newly added drives are not dependencies of the SQL server resource in the cluster resource group. This error i encountered one ago so I just want to share in my blog it would help others too...

Here are the steps you have to do to fix it.

1. Go to Failover Cluster Manager --> Click on the correct SQL resouce group --> Right click on the SQL Resource --> Go to the Dependencies tab
2. Check the SQL Server resource decencies on the cluster resource group.Sure you wont find the newly imported drives in it.
3. Add the new driver(s) in the Resource column and click apply and then OK.
4. Repeat the step 3 if you need to add any more drives as SQL Server dependencies. Make sure all the Data,Log and TempDB drives are in the dependencies.

Once this is completed, I tried to add new files in the new data drives, which works without any errors.


Hope this helps...



Friday, June 15, 2012

Cannot find server certificate with thumbprint Errors

When I'm trying to setup log shipping for a database, I got the below certificate error and I understood that the database got TDE enabled and encrypted with Master key (private key). After struggled for a while and got the steps to fix it.

So before restore that database on the secondary database backup the certificate with master key (private key) and restore it on the secondary  server. Without certificate with Master key on any server and if you try to restore the TDE enabled database you will encounter the below errors.


Message
Executed as user: Domain\ServiceAccount. Cannot find server certificate with thumbprint '0x62A3B87C122C170DFEEBFC6AFCB14B4E5BB5543B'. [SQLSTATE 42000] (Error 33111)  RESTORE DATABASE is terminating abnormally. [SQLSTATE 42000] (Error 3013).  The step failed.

Backup the certificate with master key on the Primary Server
BACKUP CERTIFICATE  [EncryptionCertificate] 
TO FILE = 'Certificate File path'
WITH PRIVATE KEY (FILE = 'Master Key File path.dat', ENCRYPTION BY PASSWORD = 'password')


Restore the certificate with master key password on the Secondary Server
CREATE CERTIFICATE [EncryptionCertificate]
FROM FILE='Certificate File path'
WITH PRIVATE KEY ( FILE  'Master Key File path.dat' , DECRYPTION BY PASSWORD = 'password')

This will resolve the issue and can restore the database with out any errors.

Thursday, May 3, 2012

ARE YOU USING THE RIGHT SQL SERVER PERFORMANCE METRICS?

This article talks about the five important SQL metrics when you need to help in solving performance problems.
Source: http://thomaslarock.com

Saturday, February 4, 2012

In SQL Management Studio resgiter multiple servers using Powershell

Recently our team builted around 60+ DR servers in our new datacenter and I thought to register them in SQL management studio, of course it will take some time to register them all together and thought that if a script does the same that would be wonderful.

After searching the indernet for sometime I found a peice of code and tweaked it to register a server and getting an input parameter from a text file.

Here is the code.
param (
 [string] $filename
 )
 add-pssnapin SqlServerProviderSnapin100
add-pssnapin SqlServerCmdletSnapin100
  $readbuffer=get-content -path $filename
 $i=0;
 foreach ($line in $readbuffer)
{
write-output "$i : $line"
sl "SQLSERVER:\SQLRegistration\Database Engine Server Group\SQLDBOps" ;
##remove-Item $(Encode-Sqlname "$line") -Force
new-Item $(Encode-Sqlname "$line") -itemtype registration -Value “server=$line;integrated security=true”;
 $i++;
 }
remove-pssnapin SqlServerProviderSnapin100
remove-pssnapin SqlServerCmdletSnapin100

In the path name I’ve specified \SQLDBOps is which I’ve created under Registered ServersàDatabase EngineàLocal Server Groups and the \SQLRegistration\Database Engine Server Group is the default existing Local Servers Groups.
Code snap

Save the above script in a poweshell file as RegisterServer.ps1 and save your SQL instance names in a txt file as below.
SQLInstance1
SQLInstance2
SQLInstance3
SQLInstance4
SQLInstance5
SQLInstance6
SQLInstance7
SQLInstance8
SQLInstance9
SQLInstance10

Open the Powershell command window and navigate to the path where the script and text file exists and execute the file like  .\RegisterServer.ps1 SQLSrvrs.txt, the servers will be registered under your group as below.


Saturday, January 7, 2012

Transactional Log Full on Mirrored with Log shipped database


Today I worked on a Transactional log full issue on one of the critical production database server, the database involved in this issue is configured both Mirroring and Log shipping.  From the mirrored database a scheduled job generating snapshot for every 30 mins.  The production database is involved high volume of ecommerce OLTP, everything were working fine till midnight and after some changes happened in the front end database started receiving huge volume of data and the Tlog backup size turned to 20GB+ from 500MB normal value, this continued for 3 hours and the logs have been reduced and back to normal but the mirroring didn’t caught up so fast it was crawling till the morning 7.30AM and went into disconnected state.

The Tlog full issue started at 1.00AM and my fellow DBA assumed that issue was causing by a trim job which runs every 8 hours and the 1AM schedule usually runs more than 6hrs, but that was not the true cause of the issue, it was mirroring, the mirroring was absolutely slow, since the transactions were not transferred to the mirrored database, the unsent logs staying in the principal and waiting to be sent, of course this is causing not to flush the transactional logs though the log shipping Tlog backup job running every 5 minutes.
Finally figured out the issue, since the mirroring went into disconnected state I ended up to break the mirroring completely and changed the database recovery model from FULL to SIMPLE, which brought the Tlog usage 99% to less than 1% drastically. Then I shrunk the log to required size.

Once all back to normal I setup the mirroring and log shipping once again... oh yeah a day work for a big database... :)

When I did the investigation,  used the sp_dbmmonitorresults  stored procedure and dbm_monitor_data system table on msdb database on the principal database instance.

As per the below table, the mirroring was fine till 7.32AM and it broken 7.33AM, if you see the unsent_log column value in the table it started piling up.

sp_dbmmonitorresults dbname, 4, 0;


Mirroring the following table also has almost same but little more, you can see the log_flush_rate and send_queue_size columns started increasing from 7.33AM.

SELECT * FROM dbm_monitor_data
where database_id=5




Thursday, January 5, 2012

Msg 15110, Level 16, State 1, Line 1 The proposed new database owner is already a user or aliased in the database

              Today I got an user request to change  database owner for couple of databases on a lab environment. The user tried to change the owner by himself  but he got "Msg 15110, Level 16, State 1, Line 1" error, because the user name is already exists in the database which is erroring out. I looked into that and tried to change the db owner and got the same error as below.


-- Step 1
sp_changedbowner 'SQL\DBOPS'
Msg 15110, Level 16, State 1, Line 1
The proposed new database owner is already a user or aliased in the database.

 Based on the above error I decided to  drop the user from the db but I got different error now as below.

drop user [SQL\DBOPS]
Msg 15138, Level 16, State 1, Line 1
The database principal owns a schema in the database, and cannot be dropped.

 Based So I checked that the user owns which schema on the db using the below statement and found that the user have a schema with the same name which is blocking to drop the user.

-- Step 2
select * from information_schema.SCHEMATA
where schema_owner ='SQL\DBOPS'

CATALOG_NAME
SCHEMA_NAME
SCHEMA_OWNER
DEFAULT_CHARACTER_SET_CATALOG
DEFAULT_CHARACTER_SET_SCHEMA
DEFAULT_CHARACTER_SET_NAME
DBOps_UsageAndHealthDB
SQL\DBOPS
SQL\DBOPS
NULL
NULL
iso_1


In order to drop the schema, first I've to remove/change the schema owner  to something else, so I've to  run the below statement, the below statement change the schema owner to dbo from SQL\DBOPS user.

--Step 3
Alter authorization on SCHEMA:: [SQL\DBOPS]  to dbo
Command(s) completed successfully.

 Run the below SELECT statement again to make sure no other schemas owned by the user.  
--Step 4
select * from information_schema.schemata
where schema_owner ='SQL\DBOPS'
--Result empty

Now I'm good to drop the user and change the owner of the db. 
--Step 5
drop user [SQL\DBOPS]
Command(s) completed successfully.

--Step 6
sp_changedbowner 'SQL\DBOPS'
Command(s) completed successfully.
--succeeded



Issue resolved now... :)