Category Archives: Administration

SQL Saturday Dublin 2014 – I’m speaking!

I have been fortunate enough to have been chosen to speak at SQL Saturday Dublin. I am especially excited about this one for two reasons:

  1. I will be delivering a different topic to my other speaking engagements of late, although not too far from the usual.  This time my session will be “Replication Troubleshooting and Monitoring“. This is a topic that I wrestle with on a regular basis and know how difficult it can be when you are staring at seemingly useless error messages.  I hope to be able to show some common errors that I have encountered and how to go about solving them.

  2. I will be returning to Dublin! I spent an excellent weekend in Dublin in Decemeber 2013, where I got the chance to take part in a mini-conference at the Dublin SQL Server User Group along with André Kamman ( t | b ) and Mladen Prajdić ( t | b ). Going by that weekend, SQL Sat 310 should be one hell of a time.

So, now I have to get back to making these demos work. I’ll see you in Dublin on the 20th of September!

Speaking at SQL Saturday 313 in Sankt Augustin (Germany) – June 28 2014

sqlsatrheinland

Just a quick post to say that I have been chosen to speak at SQL Saturday 313 in Sankt Augustin near Bonn in Germany!

This is my first speaking engagement in my home country and I am really looking forward to it.  I will be doing a talk on Replication Troubleshooting where I hope to explain how replication can go wrong and how to go about fixing it.

I am unsure whether to try out a session in German or English, I will make the choice on the day – depending on the audience,

Either way, I’m sure it will be a fun day.  There are a lot of really good speakers and sessions lined up, so you are guaranteed to have a good time.  If you are attending, come and see my session or just have a chat.

Bis in 5 Wochen!

SQL Hangout with Boris Hristov

I spoke at SQL Saturday in Slovenia just before Christmas and had a great time.  Whilst I was there I made a new SQL friend – Boris Hristov ( t | b ) a SQL Server expert from Bulgaria. He attended my session on SQL Server consolidation and joined in to an interesting discussion towards the end of the session.

After Christmas, Boris got in touch with me and asked if I would like to be a guest on his regular SQL Server Hangout “show” and talk with him about consolidating SQL Servers. I liked the sound of that, so we met up on G+ and had a pretty informal/unplanned chat about our experiences with consolidation and what we think needs special attention.

Boris said he will be posting more details about the Hangout on his blog, but in the mean time, here is the recording:

http://www.youtube.com/watch?v=LFbnOXPKLYI

Please make sure to check out the other Hangouts in the series, as Boris has managed to record some really interesting topics with experts from all over the world.

Speaking at SQLSaturday #269 in Exeter

I received some great news last night; I have been chosen to speak at SQL Saturday #269 in Exeter on the 22nd of March.

I will be presenting my session “SQL Server Consolidation – Resistance is futile” where I will be going over some details and stumbling blocks when trying to approach a consolidation project.

This is a return to my speaking roots – I gave my first ever public speaking session at last year’s SQL Saturday in Exeter.  I enjoyed myself so much that I went on to speak a further 4 times last year and have been bitten by the speaking bug,

I am really looking forward to coming back to Exeter – the hosts Jonathan Allen ( t | b ) and his wife Annette ( t ) have been good friends and mentors to me over the past few years, so seeing them is always good.  Speaking at SQL Saturdays is really fun and I really enjoy getting to meet people in the community and discussing all things SQL, so if you spot me there, come and have a chat.

If you want some really good free training (not forgetting the pre-con day the day before!) and would like to see a bit of the South-West of the UK, head on down to SQL Saturday #269 in Exeter on the 21st and 22nd of March., going by last year’s event you will *not* be disappointed!

I hope to see you there!

SQL Server Consolidation – Resistance is Futile: Remote presentation for SQL Southwest, 21st November 2013

I am booked to speak at the SQL Southwest User Group in the UK on the 21st of November at some time from 18:00 (GMT) onwards.

This will be a new session on SQL Server consolidation titled “SQL Server Consolidation – Resistance is Futile!”.  The session will cover the reasons for and against consolidation and hopefully help you to understand how consolidation can be performed in your environment. Due to the subject there won’t be many demos (if at all) and it will be more of a high-level discussion.

The session will be run remotely using a yet to be decided medium (GoToMeeting, WebEx or something similar), which will be a first for me.  A suitable link or login details will be made available via the SQL Southwest website and I will tweet the details ahead of time too.  I have been informed that there will be some sort of question/answer capability which will also be communicated in advance and during the session.

I hope to see you there!

2 MVPs & 1 DBA – See me speak in Dublin

2 MVPs & 1 DBA

You live in or around Dublin and don’t have any plans for Friday the 29th of November? (or maybe you want to go but haven’t managed to get there yet)

Then get yourself down to the Microsoft Auditorium in Dublin for a FREE 4 hour “Half Day Conference” starring Mladen Prajdić ( b | t ), André Kamman ( b | t ) (2 MVPs) and myself (DBA).

Registration for the event: Register

We’ll most likely be speaking on (session voting is still in progress):

  • Extended Events (Mladen)
  • Building a monitoring system with Powershell Remoting (André)
  • SQL Server Consolidation (Me)

The conference will run from 14:00 until 18:00 with an open end in terms of discussions. We may get kicked out of the Microsoft building, but we can continue talking over a pint afterwards.

I look forward to seeing you in Dublin!

P.S. Its for FREE!

Problems creating an SMB share on a windows cluster

Problems creating an SMB share on a windows cluster

TLDR: If you experience the error “New SMB shared folder cannot be created. The object already exists.” on a clustered windows server, ensure that storage dependencies to the Fileserver service are correctly registered.  The cluster virtual machine name and the clustered disk holding the share have to be in the dependency list.

I was recently trying (and failing) to add a windows share to a folder on a Windows cluster.  Windows kept throwing the following error message:

New SMB shared folder cannot be created. The object already exists

New SMB shared folder cannot be created. The object already exists

“New SMB shared folder cannot be created. The object already exists.”

I knew that the folder was not already shared and checked in the “Share and Storage Management” console to ensure that there wasn’t another folder using that share name.  The question was now “Why can’t I create a windows share on a clustered disk?”

I dug around in the failover cluster manager and checked all areas, trying to find something that wasn’t right.  Low and behold, there was a missing dependency between the clustered fileserver service and the physical disk that it was serving.

When you run a cluster and want to serve files from it, a separation between the physical storage resource and the logical storage location is made (to allow failovers and multi-node access to the storage).  Windows adds a file server service into the cluster which then serves the underlying storage to user requests.  As the fileserver service can only serve data when the storage is physically available, the service has a dependency on that storage.  This, and any other dependency, is stored as a property of the service that is running.  You can see these dependencies by viewing the properties of any clustered service, application or resource.

The dependencies of a cluster resource

The dependencies of a cluster resource

For easier viewing, you can also run a dependency report at the service level.  This will create a detailed html style report showing which dependencies exist.

A Cluster Dependency report

A Cluster Dependency report

I had recently moved a lot of storage subsystems around on this particular cluster, which removed some drives and added some new ones.  Through this housekeeping work, the dependencies of the cluster had been removed.  This didn’t have any adverse effects, as the dependencies to other services ensured that the fileserver always had access to the underlying drives.

As soon as the missing dependencies had been re-instated I was able to create new shares on folders stored on those disks.

The error message thrown by windows is very misleading.  It states that the share already exists, when what really should be thrown is an error saying the storage dependencies are not OK.

Lesson learned: Always check your dependencies when you are making changes to cluster resources!

Windows Failover Cluster Patching using PowerShell

PowerShell is the new standard scripting and automation language for Microsoft products. In this post I will be describing how I used two new Powershell commands to help semi-automate the installation of Windows Updates to a Windows 2012 Failover Cluster.

I run a number of clusters on both Windows Server 2008 and Windows Server 2008R2 and I must say that although Windows Clustering was a new thing to me about 5 years ago, my overall experience has been pretty painless during that time. From what I hear/read Server 2008 was the first outing of a much updated Windows Failover Clustering (including PowerShell integration), so I managed to body-swerve the more painful versions before it.

Be that as it may, Windows Server 2008 was always lacking in terms of scripting/automation and didn’t get much better with Windows Server 2008R2. The previous incarnation of the automation tooling was “cluster.exe” which was a command line tool that allowed for quite detailed control of a cluster. This was a tool that has “grown” over time to cover many areas of Windows Clustering, but was not designed to be as flexible or programmable as PowerShell. The PowerShell command-set for Clustering covered just about everything that “cluster.exe” covered and then some more – a good overview can be found at the Windows File Server Team Blog.

As Windows has evolved, the PowerShell offerings for each feature within Windows has also evolved. I recall hearing that the with Windows Server 2012 you can control the entire O/S from within PowerShell and that the GUI basically constructs and runs these PowerShell commands in the background (my brain may be playing tricks on me though).

With this in mind, I am preparing a server migration/replacement which will move a cluster from Windows Server 2008 to Windows Server 2012.  I took another look at the PowerShell commands available to me on Server 2012 and was rather pleased to see two improvements in the Failover Clustering command-set. The major command addition that I discovered, and had to write about here, was the “Suspend-ClusterNode” command and its brother “Resume-ClusterNode”.

Suspend-ClusterNode allows you to temporarily “pause” a cluster node. This basically removes any currently running resources/services from the specified node and the cluster will not assign any workload to that node until it has been re-instated as an active cluster participant. Resume-ClusterNode brings a previously suspended node back online within the cluster.

You may ask “Why would you want to do this?” or be thinking “This was possible with Windows Server 2008R2”; well dear reader, let’s take a look at those two points.

“Why would you want to do this?”

The short answer: Server maintenance with minimal downtime.

The slightly longer answer: Imagine you have a 3 node cluster. It is patchday and you want to cleanly patch the nodes, one after the other, with a minimum of downtime. You can afford for one node of the three to be offline for maintenance at any one time. This means that you can suspend node 3 while nodes 1 and 2 remain online. The cluster then knows that node 3 is down, but it is not down not due to a failure (so will not start panicking). You can then patch the node, reboot as necessary (who am I kidding, this is a windows server, you’ll have to reboot it!) and then the node is ready to re-join the cluster as an active/available member. You are then able repeat this process on the remaining nodes to install updates across the whole cluster.

“This was possible with Windows Server 2008R2”

The short answer: Yes it was, but required some manual intervention.

The long answer: Windows Server 2008R2 offered the ability to suspend a cluster node, but without the ability to control how the cluster dealt with any resources on the node being suspended. This is where the real magic in the new PowerShell command “Suspend-ClusterNode” comes into play.

Let’s take a quick look at the syntax of both so we can compare:

Suspend-ClusterNode

Old Syntax

Suspend-ClusterNode [[-Name] ] [[-Cluster] ]

New Syntax

Suspend-ClusterNode [[-Name] ] [[-Cluster] ] [[-TargetNode] ] [-Drain] [-ForceDrain] [-Wait] [-Confirm] [-WhatIf]

As we can see, the new syntax offers quite a few extra parameters over the old; the main ones to note are [TargetNode] and [Drain]. [TargetNode] allows us to specify where any resources should be moved to and [Drain] initiates the move of the resources/services. This allows for a much finer control over resources within a cluster during maintenance operations. With the new command it is really easy to perform the 3 node cluster maintenance outlined earlier. We suspend one node after the other, moving any resources they should have to another node of our choosing and can then resume the node after the maintenance has completed. If we now take a look at Resum-ClusterNode, we will see another level of brilliance that becomes available to us that further eases node maintenance work:

Resume-ClusterNode

Let’s compare old vs. new:

Old Syntax

Resume-ClusterNode [[-Name] ] [-Cluster ]

New Syntax

Resume-ClusterNode [[-Name] ] [-Cluster ] [[-Failback] {Immediate | NoFailback | Policy}]

Again, we can see that there is more to decide upon with the new syntax. When you resume a suspended node, you can decide what happens to the resources that were previously running on that node.

The parameter [Failback] has three options:

  • “Immediate” is pretty obvious and will immediately take control of the resources that were previously running on that node before suspension.
  • “NoFailback” will resume the node but leave all resources where they currently are – this is a good idea if you have already failed over to an updated node and don’t want another service outage in this maintenance period.
  • Finally, “Policy” would postpone any failback until a pre-defined failover timeframe is reached.

Once we see how Suspend-ClusterNode and Resume-ClusterNode have been extended, we can understand how the extensions open up better scripting control of clusters and their resources.  I have prepared a script that can be run in our maintenance window that will suspend a node and push all resources to another node.  The suspended node can then receive all Windows Updates and be rebooted if necessary and finally a script can be run to bring the node back online.  The same set of scripts are then run against the second node, reversing the flow back to the node we have just patched.  Using Suspend-ClusterNode and Resume-ClusterNode only reduces the overall code in the PowerShell scripts by a few more lines, but the complexity is reduced drastically.  This makes code maintenance easier and the code is just easier on the eye.

After seeing the improvement to my local code, I am certain that the extension to the PowerShell command-set was internally driven at Microsoft. Imagine how many clusters they have running that needed this extension to improve their automation. As far as I can see it, this is an obvious knock-on effect of their drive to “The Cloud” and a very positive one at that!

As always, think before running any scripts and please don’t blindly setup scripts to suspend and resume nodes in a cluster. This is an aid to your daily work and not a robot you should build to run any/all updates onto a cluster 🙂

Have fun!

SQL Server Connection Encryption and .Net Framework 4.5

SQL Server has offered the ability to encrypt network connections for quite a while. This can be achieved using the self-generated certificate that SQL Server generates when installed, or using a certificate generated outside SQL Server (Your own Enterprise Certificate Authority in your Active Directory or a third-party Certificate Authority). These certificates, once registered to the server, can be loaded inside SQL Server and used to open up SSL encrypted connections between clients and the server.   In general you would use encryption as an additional layer of security when opening up a database server to the world (e.g. an internet facing DB-Server). However, there are cases when the network connection between a client and the server are required to be encrypted inside the confines of a corporate network.   SQL Server will check for the availability of a server certificate at start-up. If it finds a certificate issued by a certificate authority it will load and use it for encryption purposes. If, however, no certificate can be found, then a self-generated certificate will be used.   You can find out if you are using a self-generated certificate or a third-party certificate by reading the SQL Server error log and looking through the start-up messages at the beginning of the current log-file. A message about the encryption certificate is written after the initial information about start-up parameters and initialisation of the system DBs: A self-generated certificate generates the following log entry:

.

A third-party generated certificate on the other hand:

My Issue: I stumbled across an edge-case (for me at least) when working with encrypted connections to SQL Server using the .Net Framework v. 4.5 recently that I wrestled with for a few days before coming to a conclusion and thought it at least noteworthy.   As of v. 4.5 of the .Net Framework the authentication of a certificate is checked more thoroughly than in previous versions. As stated in the MSDN Library (http://msdn.microsoft.com/en-us/library/ex6y04yf.aspx), when the connection property “Encrypt” is set to true and the property “TrustServerCertificate” is set to false (default value), the server name (or Virtual Name for Clustered Instances) in the connection string must *exactly* match the common name (CN) specified in the certificate. If this is not the case, then the connection attempt will fail:   As you can see above, the principal name did not match so no connection could be established. This applies to the following scenarios:

  1. Not using fully qualified domain names in the connection string (FQDN):
    • You have a company domain (mydomain.com) You have a SQL Server (server1), FQDN = server1.mydomain.com
    • You have an SSL certificate generated for the FQDN (server1.mydomain.com) which is registered and loaded by SQL Server at startup
    • Your DB connections only use the server name, not the FQDN: “Server=server1; Initial Catalog=Northwind; Integrated Security=true; Encrypt=True”

In previous versions of the .Net Framework the connection would be opened and the servername “server1” would still be accepted when compared to the servername on the certificate, if the name was the same and the domain was not included, then the connection would still be accepted. As of v. 4.5 this certificate verification is basically doing what it should – identifying the name provided against the name in the certificate as strictly as expected.

  1. Using a DNS Alias to identify and connect to the SQL Server:
    • You have a company domain (mydomain.com) You have a SQL Server (server1), FQDN = server1.mydomain.com
    • You have an SSL certificate generated for the FQDN (server1. mydomain.com) which is registered and loaded by SQL Server at startup
    • You have a DNS Alias (mssql) created as an abstraction layer to your physical SQL Server (allows for easier infrastructure changes).
    • Your DB connections use the Alias name, not the Servername or FQDN: “Server=mssql; Initial Catalog=Northwind; Integrated Security=true; Encrypt=True”

As in scenario 1, previous .Net Framework versions were more than happy to honour the alias redirect to the actual server and accept encrypted connections   The same error occurs as in scenario 1 when trying to connect using .Net Framework 4.5, the reasons are the same – the alias name is not registered in the standard certificate so cannot be strictly identified with the name in the certificate. The on-going theme here is that name resolution between the connection string and the server name in the certificate has become much stricter and also made alias usage (scenario 2) impossible with a standard certificate requested through the certificate authority.   The solution is simple enough – either change your software to use the fully qualified name of the server (as is present in the certificate itself), or request a certificate that incorporates all possible “versions” of the server name and/or alias that you wish to use. These additional names are stored as part of the certificate in a portion of the certificate called the “Subject Alternative Name”.   As I want to continue to use aliases to keep our infrastructure “invisible” to any deployed applications, I went for door number two and created a certificate request that incorporates all name variations and aliases for the system that I plan on using. Solution:   Here are the steps for creating a certificate request within your domain that will fulfill the requirements by using the Subject Alternative Name field to store and use the alias when connecting to the SQL Server using SSL encryption (done on the server you are wanting to set up SSL on). This is an example using a certificate authority within a domain, the workflow may be different when using an external authority (I have never used them for this work before).   I have included a video showing how to create the custom certificate request using the Microsoft Management Console (mmc) and the certificate snap-in. The steps in the video should be simple to follow and will create an offline certificate request .txt file that will contain the certificate request information required at the certificate authority.   There are a few things to note from the video:

  • The friendly name and description are purely for reference.
  • The information entered is there to make your life easier when trying to identify all the different certificates that are installed on the machine.
  • The tab “Subject” is where the actual server information for the certificate will be entered.
  • The “Subject Name” needs to be the Fully Qualified Domain Name (FQDN) of the server you are creating this certificate for entered as the Common Name (CNAME) type.
  • The “Alternative Name” section is where all the alias names will be entered that you want to use to access the server. In this example I have entered a fully qualified alias name and also the short version (I am not sure if this is really necessary, I’m just hedging my bets with regard to the .Net 4.5 issues I was experiencing).

Notice that you can enter as many different aliases here as you feel like – This is the magic to allow you to offer up different aliases to your “customers” so they don’t necessarily know that they are all on one server. You could even use a wildcard here if you are happy for the certificate to be used by *any* server – I’m not too keen on this as *any* server can use the certificate then, so you lose a little control over the certificate usage (maybe this isn’t really an issue, but I prefer to assign single certificates instead). Once you have the request in a text file you then connect to the certificate authority site and send the contents of the text file to the authority system. The response is a certificate that is valid within your domain for the server named in the subject and also for the alternatives listed in the alternatives list. This process is listed below:   The text file created in the video should look something like this: First of all you need to call up the certificate authority site (ask your domain administrator team for the address). You will be confronted with a rather snazzy Microsoft website for Active Directory Certificate Services:

Your choice will be to request a certificate. The next page should offer two possibilities:

  1. “Create and submit a certificate request”
  2. “Submit a certificate request using a bas-64-encoded CMC or PKCS #10 file, or submit a renewal request by using a base64-encoded PKCS #7 file.”

Quite a mouthful (and written verbatim here for Google searches) but we want to take the second option as we have already created the certificate request and only want to submit it. You now open your previously created text file and copy the contents into the corresponding text box on the webpage:

There is a good possibility that your domain/certificate admins have created a template to include all other necessary settings (e.g. certificate expiry). These templates are offered up in the “Certificate Template” dropdown. Once this information has been entered you can submit the request. Assuming that the information is created correctly, your request will be successfully processed and a certificate will be available for download.   The certificate can be downloaded to the SQL server and “installed”. This is done by importing the certificate into the local computer certificate store (the MMC snap-in loaded at the beginning of this blog post). To import you need to navigate to personal certificate store, right-click and choose “Import”. A new wizard will prompt for the certificate file and then import this into the local certificate store. You will now have a new entry in the personal certificate folder with the friendly name used in the certificate request. As stated at the beginning of this blog post: At start-up, SQL Server will automatically look in the local machine certificate store to see if a valid SSL certificate is available. The certificate will be automatically loaded if the certificate subject exactly matches the server name (or the virtual name for a clustered server). You will be able to see this in the SQL Server log in the start-up messages:

You now have a SQL Server that can accept SSL encrypted connections, even if you use an alias. You can also then decide whether you want to force encryption of all SQL Server traffic, which is done through the SQL Server Configuration Manager under “SQL Server Network Configuration” and the properties of the Protocols:

This is also where you can see which certificate SQL Server will use for SSL encryption:

So that was how I solved my problem of wanting to use SSL with server aliases and .Net 4.5 on a SQL Server. I hope it could help you if you ever need to do the same in the future.

As always, comments/suggestions/improvements/corrections are more than welcome, just get in touch!

Reducing data footprints on test systems

Picture the scene:

You are a DBA. You have to make everyone happy, including DEVs (yes, including them too). Part of that is to ensure that your DEVs have a fresh copy of your production data so that they can test their programming with current data and structures. The issue with that is your production database is multiple hundreds of GB in size and your test system doesn’t have the capacity to store a full copy of your database. Your IT Budget will also not stretch to purchasing more storage! What can you do?

I had a similar situation recently and took a look at the possibilties and here is what I came up with:

1. Try out tools like SQL Virtual Restore to allow you to “mount” a backup from a remote server avoiding the space issue all together. This is, of course, a solution that requires a license (no budget maybe?)

2. Use the great features offered to you by SQL Server Developer Edition: data compression to the rescue!

So of course, I chose door number 2.

As you may know, Developer Edition offers the same features as Enterprise Edition with the caveat that the instance cannot be used for production purposes in any way. This means that the seriously powerful and useful data compression becomes available for your test servers! This counts even if you only use Standard Edition in production – joy of joys! The good thing being that Developer Edition only costs around $50 (or less if you have an MSDN subscription) – even more joys!

If you have never seen/used compression take a quick look over on MSDN to see what it all is (Data Compression). Basically, you can compress data in indexes and tables at the storage level avoiding any query rewrites and still profiting from the storage savings. This can be a major advantage in terms of raw storage needs, but can also benefit you in high read environments with low RAM sizes. The data remains compressed when held in the buffer pool and is only de-compressed when being accessed. This means that you can keep more data in the buffer pool and reduce hard disk I/O (obviously not for free, compressing and de-compressing costs CPU cycles).  This may be acceptable on a test system with extremely constrained storage space.

The usage in my example scenario is now an even better proposition, not only can I reduce my storage footprint, I can also potentially increase test system performance on the I/O side of the equation (who doesn’t have I/O issues, especially on a test box!).

The next hurdle is of course identifying which indexes and tables that are in the database you want to squeeze down. This is possible via SSMS’ object explorer, but only if you want to spend an eternity doing so! The best way is to take a look at the meta-data tables/views/functions to interrogate the system objects. These are really interesting on their own (at least they are to me), but after writing my Index Script Creator, I realised the potential for using the information in these system objects to allow me to automate certain tasks like this one.

Similarly to the Index Script Creator, my Index Compressor Script (uploaded as a doc  but it is a .sql file really – stupid wordpress restrictions!) runs off and investigates all tables and indexes (down to the partition level) and then generates an index rebuild command to compress any currently non-compressed partitions of indexes / tables. The resulting commands can then be run and space can be reclaimed – just be aware that rebuilding indexes does cause log activity.

After compressing your tables and indexes you should have a much smaller amount of data (depending on how well your data compresses) and be able to reclaim some disk space.

I realise that this method will mean that you have to have a large enough disk to have the database in its uncompressed state to begin with, but you can end up with a cost effective solution to a budget-constrained test environment.

Happy zipping!