SQL Server Special Ops Tour

If you’ve been following recent SQL Server news, you know that the next version of SQL Server has lost its “Denali” code name[1] and has been officially named SQL Server 2012.[0] This is actually sort of old news, but I’m mentioning it here because it segues so nicely into the main topic of this post: the upcoming Special Ops tour.

The SQL Server Special Ops tour is a 12-city[2] tour being organized by the SQL Server team. You’ve probably seen tours like this before, with technical experts providing deep insight into the features and capabilities of a new software release. But this is a tour with a twist, and the difference is you.[3] To be more specific, it’s members of the SQL Server community who will select the cities for the tour, and who will help select the topics. You can go to www.specialopstour.com today to cast your votes. There will be even more ways to participate, so check out the site today and watch for more news.

And while you’re there, you may also want to check out just what it is that members of the SQL Server team get up to when they’re not at the office…

Do you see any familiar faces?

[0] You also know that SQL Server 2012 Release Candidate 0 is now available for download, as of this morning.

[1] Although I expect most people on the SQL Server team will be calling it “Denali” for many years to come. After joining the SSIS team this summer I had to dredge through my long-term memory to recall what “Yukon” and “Katmai” and “Kilimanjaro” were.

[2] See the clever tie-in there? SQL Server 2012. 12 cities? Ahhh…

[3] Yes, I’m playing this up a little bit. I’m home sick today (which is probably the only reason why I have time to blog during business hours) and am feeling a bit punchy…


SQL Server 2012 RC 0 Available for Download

If you’re working with the pre-release versions of SQL Server 2012, or if you’re thinking about preparing for its release, today is a big day. The first release candidate has been published on the Microsoft Download Center:

http://www.microsoft.com/download/en/details.aspx?id=28145

RC0 is a major release for SQL Server Integration Services, with lots of improvements since the CTP3 release in July. If you’ve been waiting to kick the tires, you may not want to wait too much longer…


Your Window Improvements May Vary

I love my new job. This morning I’m sitting in a training session organized by the SQL Server team to bring Microsoft team members up to speed on new Business Intelligence capabilities in SQL Server “Denali”. The session is about to begin, and I’m reviewing the SSIS slides so I can be prepared for questions as they arise. (Did I mention I love my new job?)

I also love ambiguity.[1]

One of the features listed in the presentation is “Variable Window Improvements”. Reading this, all I could think is “we’ve really improved the windows, but they’re different all the time, so we can’t tell you what they are.”[2]

Of course, that’s not what the slide was about. The topic was the specific improvements made to the Variables Window for the SSIS package designer in SQL Server “Denali” Business Intelligence Development Studio. This may sound less than exciting[3], and even in the context of a two-day training event it only rates one slide bullet. But despite this, these tool improvements address a significant real-world need, and are certainly worthy of a blog post or two.

Before we look at the new “Denali” goodness, let’s look at a scenario in pre-“Denali” SSIS today[4]:

  1. You’re editing an Execute SQL task in your package, and realize that you need a variable in which to store the results of the query you’re running.
  2. You exit the task editor, and in the Variables window you create a new variable.
  3. Later on, you’re looking for the variable, and you can’t find it. You frown, scratch your head, and re-create the variable.
  4. When you run your package, you find that although the package (and the Execute SQL task) runs successfully, the variable is never populated with the value from the query.
  5. You get yourself another coffee and settle down for a long troubleshooting session…

Experienced SSIS developers will probably be saying something along the lines of “well, of course, you needed to click on the package Control Flow design surface between steps 1 and 2. Duh!”

Those who are newer to SSIS may be missing the key point in this scenario: When you create a new variable in SSIS, the variable is created at the scope of whatever container or task was selected in the designer. Although this behavior is consistent and documented, it often comes as a surprise, because nobody actually reads the documentation.[5] It also often comes as a frustration, because most of the time variables should be created at the package level in order to be useful.[6]
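
If you’d rather see this in code than in the designer, here’s a minimal sketch against the SSIS .NET API (Microsoft.SqlServer.Dts.Runtime). The task and variable names are mine, and this is only the object-model equivalent of the designer behavior described above – a sketch, not the designer’s own code:

    using System;
    using Microsoft.SqlServer.Dts.Runtime;

    class VariableScopeSketch
    {
        static void Main()
        {
            Package package = new Package();

            // Add an Execute SQL task to the control flow.
            TaskHost sqlTask = (TaskHost)package.Executables.Add("STOCK:SQLTask");
            sqlTask.Name = "Load Row Count";

            // Created on the package: visible to every task and container.
            package.Variables.Add("PackageScopedResult", false, "User", 0);

            // Created on the task: visible only within that task's scope --
            // roughly what happens in the designer when the task is still selected.
            sqlTask.Variables.Add("TaskScopedResult", false, "User", 0);

            Console.WriteLine(package.Variables.Contains("PackageScopedResult")); // True
            Console.WriteLine(package.Variables.Contains("TaskScopedResult"));    // False -- the "where did it go?" moment
            Console.WriteLine(sqlTask.Variables.Contains("TaskScopedResult"));    // True
        }
    }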

In SQL Server “Denali” SSIS, all new variables are created at the package scope. This change ensures that the “what the heck” moments we experienced in previous versions of SSIS will no longer occur, and that all new variables will be defined at the most frequently used scope.

What about those situations when you actually do want a variable at a different scope? In previous versions of SSIS, variable scope could not be changed – you needed to delete the old variable and create a new one at the new scope.[7]
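
For completeness, here’s what that delete-and-recreate dance looks like as a sketch against the same API. The names are mine, the property copying is abbreviated, and BIDS Helper is far more thorough than this:

    using Microsoft.SqlServer.Dts.Runtime;

    class MoveVariableTheHardWay
    {
        static void Main()
        {
            Package package = new Package();
            TaskHost sqlTask = (TaskHost)package.Executables.Add("STOCK:SQLTask");

            // A variable that ended up at the task's scope by accident...
            sqlTask.Variables.Add("RowCount", false, "User", 0);

            // ...gets "moved" to package scope the pre-"Denali" way:
            // remember its value, delete it, and recreate it at the new scope.
            // (Expressions and other settings would need to be copied too.)
            object savedValue = sqlTask.Variables["RowCount"].Value;
            sqlTask.Variables.Remove("RowCount");
            package.Variables.Add("RowCount", false, "User", savedValue);
        }
    }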

In SQL Server “Denali” SSIS, you can move or copy variables between scopes. Odds are you will need this functionality less than you needed it in earlier versions of SSIS (you know, when it wasn’t there) but when you need it, you’ll love it.

And that’s that. As per usual, I wrote two pages when I could have written two sentences (the ones I highlighted in red), but there you have it. The facts are always more meaningful when presented in the context of a story, and hopefully the story helped turn these two bullets into something more than a “yawn” moment. I’m sure you’ll let me know…


[1] In case you’re interested: http://en.wikipedia.org/wiki/Syntactic_ambiguity.

[2] Yes, I crack myself up.

[3] And less funny than my ambiguity joke, at least a little.

[4] This is true with SSIS in SQL Server 2005, 2008 and 2008 R2.

[5] Except you. I know you do, I meant everybody else. You know those guys…

[6] One of the “lightweight best practices” I have included in my SSIS presentations over the years is “always right-click on the package design surface and choose Variables from the context menu” because it ensures that the new variable is at the package scope regardless of what task you were working on beforehand.

[7] This functionality does exist in the amazing BIDS Helper add-in, but it was not included in SSIS.


Back in the SSISaddle

Testing…

Testing…

Is this thing on?

Although I still get the occasional comment on older blog posts, I assume that pretty much nobody subscribes to my blog anymore for SSIS information. When I joined Microsoft Learning in October 2008, the SQL/BI/SSIS side of my life got pushed to the side. I kept working with the tools, but it was no longer a core part of my job, and one of the things I largely gave up was blogging and related technical community efforts. It wasn’t a change I wanted to make, but I needed to ensure that I rocked my new job as much as possible. The end result was that over the last 30 months I’ve posted fewer than ten SSIS-related posts.[1]

This will now change.

This week I joined the SQL Server Integration Services team as a Senior Program Manager.[2]

In the months leading up to the SQL Server “Denali” release, I hope to blog regularly on “Denali”-related features, enhancements and techniques. The “Denali” release is a major release for SSIS, with a ton of exciting new capabilities for new users and seasoned SSIS professionals alike.[3] If you are using SSIS already, SSIS in “Denali” is going to make your life easier in many ways. If you’re just starting off with SSIS, the “Denali” release will ease your learning curve, and help you swear less. In short, I’ll have a lot to blog about.

With this said, it may still be a few weeks before I post again. Not only will I be getting up to speed on my new team and my new role, I will also be relocating my family from New York to Redmond.[4] The relocation will be consuming most or all of my free time for the next few weeks, but once I’m settled in you should expect to hear from me more often.

It’s good to be back.

Back in the SSISaddle indeed.

[1] And that I rocked the new job pretty darned hard, but that’s a longer story. Ask your friendly neighborhood Microsoft Certified Trainer.

[2] I wish you could see the smile on my face as I typed that sentence.

[3] If you want an overview of these new things, please be sure to download and view Matt Masson’s presentation from TechEd North America. We’ll have more announcements coming closer to release, but this session is more than enough to whet your appetite.

[4] I have two children, two cats and one wife, and the effort and stress of moving the whole family is staggering. I thought I knew what I was getting into, but boy was I wrong.


Transformations are NOT Tasks

This is another one of those “I thought I blogged this years ago” posts. I was having a conversation with a fellow MCT a few days ago, and he described SSIS data flow transformations as “tasks on the data flow tab.” I didn’t say anything then, but this phrase has been sticking in my mind ever since – not unlike a nasty splinter – and I think I need to get it written down before my brain lets me move on.

Why is this seemingly innocuous phrase causing me so much mental distress?

Because transformations are not tasks!

Let’s list a few characteristics of tasks in SSIS. Tasks:

  • Are used in the SSIS Control Flow.
  • Are Containers, with all of the properties (such as the ability to control transaction scope, or to be enabled or disabled) of containers.
  • Support property expressions.
  • Support breakpoints and debugging.
  • Can be individually configured for logging.
  • Are presented as boxes connected by arrows on the BIDS design surface.

The only characteristic that data flow transformations share with tasks is the last one: they’re presented as boxes, connected by arrows.

But the similarities end there. Transformations are metadata-bound functions that operate on data stored in the memory buffers managed by the data flow engine. They are subcomponents of the Data Flow task, which is just another task. It’s an important task, and it gets its own designer because of this, but it is just another task nonetheless.
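
If the distinction still feels abstract, the SSIS object model makes it concrete. Here’s a rough sketch – the component ProgID and the interface version vary by SQL Server release, and all the input/output wiring is omitted, so treat the specifics as illustrative rather than definitive:

    using Microsoft.SqlServer.Dts.Runtime;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    class TasksVersusTransformations
    {
        static void Main()
        {
            Package package = new Package();

            // A task is an Executable in the control flow, hosted by a TaskHost.
            TaskHost dataFlowTask = (TaskHost)package.Executables.Add("STOCK:PipelineTask");
            dataFlowTask.Name = "Load Customers";

            // A transformation is not an Executable at all -- it is a component
            // that lives inside the Data Flow task's pipeline (MainPipe).
            MainPipe pipeline = (MainPipe)dataFlowTask.InnerObject;
            IDTSComponentMetaData100 derivedColumn = pipeline.ComponentMetaDataCollection.New();
            derivedColumn.ComponentClassID = "DTSTransform.DerivedColumn.2"; // version-specific ProgID
            derivedColumn.Name = "Derived Column";

            // (A real data flow would go on to instantiate the component's
            // design-time interface, wire up paths and map columns -- omitted here.)
        }
    }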

Why is this important to me? Some (ok, most) may argue that I’m just being pedantic. And perhaps they’re right, to some degree. But any technical field has its own terminology that allows its practitioners to communicate efficiently and precisely. Hearing an SSIS developer use “task” for “transformation” is kind of like hearing a surgeon use “abdomen” for “stomach” or a .NET developer use “component” for “assembly.” The terms are related, but their meanings are very different. In each case it’s likely that the speaker knows the difference and is simply not speaking carefully, but sometimes (particularly with the surgeon 😉) you just want to be sure.

So the next time someone tells you that he’s using “the Lookup Task” please send him here.

P.S. I should also be sure to mention that the MCT who used this offending phrase probably does not need this rant to let him know what’s going on. MCTs tend to have a solid grasp of the technologies they teach; you tend to crash and burn in front of the classroom if this is not the case. But I knew that I was going to have this stuck in my mind forever if I didn’t say something somewhere…


SQL Server MVP Deep Dives

I’m not a SQL Server Most Valuable Professional anymore[1], but I was when this amazing project got underway, and I am proud to have contributed as an author on SQL Server Integration Services. SQL Server MVP Deep Dives is a unique book. It has:

  • 53 authors
  • 59 chapters
  • Coverage of database design, development, administration and performance tuning
  • Coverage of the SQL Server BI suite: SSIS, SSAS and SSRS

But that’s not the coolest thing about this book. The coolest thing is where the money goes. All author royalties (that’s right – 100%) go to War Child International, a network of independent organizations working across the world to help children affected by war.[2] So not only will this book help you become a better SQL Server practitioner, it will also help children around the world.

You can pre-order the book online today at the SQL Server MVP Deep Dives web site, and get immediate access to a pre-release PDF copy of the book. (Note: if you order before the end of September and use the code pop0928 at the checkout you will also save 50%!) Also, if you’re attending the PASS Summit in Seattle in early November, you can buy a copy at the conference bookstore and have it signed by many of the authors. I’ll be there, and based on the email I’ve seen, a majority of the authors and editors should be there as well.

If you only buy one SQL Server book this year, why not make it this book? And since all proceeds go to benefit a great cause, why not buy two?

I’ll see you at PASS!

[1] You can’t be an MVP and a Microsoft employee at the same time, so when I joined Microsoft they took away my MVP status. Which was OK.

[2] Not, as some may think, a non-profit organization dedicated to helping aging Jethro Tull fans.


SSIS Package Templates: The Good, The Bad and The Ugly.

This is one of those “I could have sworn I blogged on this topic years ago” posts. I speak to this topic whenever I’m presenting on SSIS development best practices, and with my half-day mini-seminar in Stockholm last week[1] I was thinking of it – and speaking about it – again. Although SSIS is a great platform and provides pretty much all the tools you need to get the job done, it’s not always easy to know which tools to use and how to use them.

Which brings us, of course, to package templates. So before we continue, please read this Knowledge Base article on how to create a package template in BIDS.

Done? Good – let’s review, and paraphrase while we do it.

Package templates are pretty much the entire reuse story in SSIS.[2] You define the baseline package, including connection managers, log providers, configurations, variables and expressions, tasks, containers and precedence constraints and sometimes data flow components as well. When you create a new package you create a new copy of the template with this pre-defined starting point instead of starting with a blank package, so you can be sure that the package is just like the baseline you created.
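
Just to make “baseline package” concrete, here’s roughly what such a starting point amounts to if you express it in code against the SSIS API instead of clicking it together in BIDS. This is only a sketch – the connection string, names and file path are placeholders, not a recommendation:

    using Microsoft.SqlServer.Dts.Runtime;

    class BuildTemplatePackage
    {
        static void Main()
        {
            Package template = new Package();
            template.Name = "TeamTemplate";

            // Common "plumbing" every package should share: a connection manager...
            ConnectionManager dwConnection = template.Connections.Add("OLEDB");
            dwConnection.Name = "DW";
            dwConnection.ConnectionString =
                "Data Source=(local);Initial Catalog=DW;Provider=SQLNCLI10.1;Integrated Security=SSPI;";

            // ...and a standard set of variables.
            template.Variables.Add("AuditBatchID", false, "User", 0);

            // Save the baseline; new packages start as copies of this file.
            new Application().SaveToXml(@"C:\Projects\ETL\Template.dtsx", template, null);
        }
    }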

In keeping with the theme established in the post title, this is the “good” side of SSIS package templates. They’re valuable, and when you’re building a large number of packages for the ETL subsystem of a DW/BI project, they’re invaluable, in that the common “plumbing” required by all packages can be defined once and used consistently everywhere. And this is particularly vital when working in a team environment, where different developers (each of whom inherently has his own style and own way of building packages) are working on the same project, which is typically the case on ETL projects of anything but the most trivial scale.

Which brings us to the “bad” side of package templates, at least as implemented when following the guidance in the KB article referenced above. This guidance is firmly targeted at the lone developer; did you notice how all the instructions about creating a package and saving it in the Visual Studio install folder never mentioned on what computer you should be performing these steps? That’s because the article assumes the context is “your machine” – the lone developer’s machine or – at scale – on every developer’s machine.

Why is this bad, you ask? Surely it can’t be so much work to copy a DTSX file to the same folder on a few computers, can it?

Of course not. The “bad” side of following this approach comes not when creating the templates, but when the real world comes into play and the project continues to grow and evolve. In a perfect world, when the needs of the project are understood 100% before development begins, and the needs of the project never change after development begins, this approach will work just fine for team development. Your mileage may vary, but I’ve yet to work on an SSIS project[3] like this.

In the real world, most successful BI projects are iterative and agile, just like most other successful development projects. If you look at SQL Server MVP John Welch or MVP and VSTS Ranger Ted Malone, their blogs focus strongly on agile BI. If you read Kimball, you will see an overarching theme of agile and iterative development. And if you’ve ever seen former SSIS program manager Michael Entin present on SSIS performance tuning[4] you’ve noticed this best practice under the heading of “Principles of Good Package Design”: “Follow Microsoft Development Guidelines – Iterative design, development & testing.”

Ok, so why did I just go off on that tangent and drop so many names? Regular readers will quickly note that my writing style practically revolves around tangents and will likely seize on this as the reason, but today it is to drive home a point: The real world is neither pretty nor neat, and people who know more about BI than I will ever forget[5] have gone out of their way to point this out.

Why is this bad, you ask yet again? Surely it can’t be so much work to copy a DTSX file to the same folder on a few computers, can it? And can you please just get to the point, Matthew?

Maybe. Picture this story:

A small team of five or six SSIS developers are working together on a project. The lead developer has created a package template for all developers on the team to use moving forward. He emails the DTSX file to each team member along with a reference to KB article 908018 for instructions on how to install and use the template.

A week later the lead developer identifies a problem with the package template.[6] He updates the template package and emails it out to his team with instructions to update their local machines with the new package right away.

Unfortunately, one of the team members is a “heads-down” kind of developer who turns off email during the day, and finishes a few more packages before he gets the message, even though he does save the new template package once he checks his email.

And another developer is on vacation when the email is sent, and by the time he gets back he has too many new messages in his inbox to ever get through, so he decides that if anything is really important then the sender will re-send it, and he just deletes anything that is unread to get a fresh start with no email backlog.[7] So he keeps on developing packages using the outdated template until the next update is sent out a few weeks later…

You see where this is going, right? It’s going to the “ugly” side of SSIS package templates. The ugly part comes downstream when integration testing begins. The packages developed by the individual team members need to be deployed into a test environment and tested together, and of course at that point things start to fall apart. Everyone works extra hours and deadlines slip by while people try to track down problems they knew had been solved long ago. Although to be honest, the worst case scenario is that the test cases don’t find the problem at this point, and the ETL solution is deployed into production without the errors being resolved.

Unfortunately, I have seen this story play out in one way or another on several different projects over the years. The details are always different, but the ending is always the same: extra work, unhappy clients, and pain.

Now picture this story retold, with a few different details:

The lead developer works with his team to establish Team Foundation Server (TFS) source code control[8] before any development begins. He develops a set of packages that serve as “template examples” and checks them into TFS.

He also establishes standards with his team, so that each team member knows to Get Latest whenever starting work on a new package. This ensures that – no matter what happens – each developer always has the most recent versions of all packages with no need for “special case” tasks like updating a template.

The lead developer also instructs each development team member to create any new packages by right-clicking in Solution Explorer and selecting Add Existing Package, and then browsing to a current file from TFS.[9] This ensures that whenever a new package is created, it is being created as a new instance of a “rolling template” that is stored in TFS and as such the same version is always available for use by all developers.

Then, whenever problems are identified in existing packages, the “problem packages” are updated immediately, and the fixes are applied to any existing packages created from a now-outdated template.[10]

And whenever anyone goes on vacation, problems are found earlier, their impact is lessened, deadlines are met, and everyone lives happily ever after.

Ok, so the results are probably not so extreme, but in my experience on multiple projects this “rolling template” approach is the best practice for SSIS package templates. Any attempt to utilize the Microsoft guidance in a team environment runs into problem after problem, and at the worst extreme I have seen a development team manually re-implement dozens of SSIS packages using a new template after churning for a full week while trying to reconcile and resolve the problems disseminated by using the “official” template approach.
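
As an aside on the batch updates mentioned in footnote [10]: whether you use PacMan or roll your own, the heart of such a tool is just the SSIS runtime Application class – load each package, patch it, save it back. Here’s a bare-bones sketch, in which the folder path and the particular “fix” being applied are hypothetical:

    using System;
    using System.IO;
    using Microsoft.SqlServer.Dts.Runtime;

    class RollingTemplateFixup
    {
        static void Main()
        {
            Application app = new Application();

            // Hypothetical local working folder containing the team's packages.
            foreach (string path in Directory.GetFiles(@"C:\Projects\ETL\Packages", "*.dtsx"))
            {
                Package package = app.LoadPackage(path, null);

                // Hypothetical fix: add the variable that was missing from the
                // outdated template, but only where it hasn't been added already.
                if (!package.Variables.Contains("AuditBatchID"))
                {
                    package.Variables.Add("AuditBatchID", false, "User", 0);
                }

                app.SaveToXml(path, package, null);
                Console.WriteLine("Updated " + path);
            }
        }
    }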

And to be fair, the definition and implementation of policies and procedures can help make following the “official” approach less bad, but the starting point is inherently flawed for team development. If anyone were to suggest to me that following KB 908018 was the right way to approach package templates, I would have to politely[11] disagree.

What do you think? Even though I have been around the block a few times with SSIS, I don’t claim to know it all. I’d love to hear your thoughts on how you use package templates – especially if I took your name in vain earlier on – including what works and what does not work in your team environment.

 

[1] I never seem to find the time to do tourist stuff when traveling for work, but for some reason I just could not say no when I was invited to speak on SSIS best practices when in Sweden for a friend’s wedding. I have the world’s most understanding wife.

[2] I say “pretty much” because the other parts are using small, modular packages and executing them via the Execute Package task (or some other client), or using the SSIS .NET API to dynamically construct and execute packages. In SQL Server 2005 and SQL Server 2008 there is not a lot to say about reuse in SSIS.
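
A minimal sketch of that last option, just to show how little ceremony is involved – a real package built this way would obviously have tasks, connection managers and error handling added before the Execute call, and the variable here is purely illustrative:

    using System;
    using Microsoft.SqlServer.Dts.Runtime;

    class BuildAndRun
    {
        static void Main()
        {
            // Build a (trivial) package entirely in code and run it --
            // no .dtsx file is ever saved to disk.
            Package package = new Package();
            package.Variables.Add("GeneratedAt", true, "User", DateTime.Now.ToString());

            // ...tasks, connection managers and precedence constraints would go here...

            DTSExecResult result = package.Execute();
            Console.WriteLine(result); // Success or Failure
        }
    }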

[3] Or projects using any other technology, for that matter. If you know of any projects like this and are looking for a consultant, I’m pretty sure I can work out a moonlighting arrangement with Microsoft. 😉

[4] Or have seen me blatantly plagiarize his slide decks for my own talks.

[5] No Ted, this doesn’t include you. Sorry.

[6] I won’t go into the details of the problem because it isn’t really important to the story, but it could be anything along the lines of:

  a) A missing variable, which needed to be added.
  b) A variable defined at an incorrect scope, hiding a variable at the package scope when accessed from a Script task.
  c) A log provider needed to be added to support new project requirements.

You get the idea, right?

[7] Yes, people really do this. Yes, I’m thinking about doing it right now. I’ve been on vacation for a week and have been checking email at least once a day to respond to high priority items, but I still have nearly 700 unread messages in my inbox. ;-(

[8] Or Visual SourceSafe or pretty much any modern SCC system. But TFS has so many additional benefits, it’s hard to recommend anything else.

[9] Just to be clear, I’m not implying that there is any ability to browse TFS repositories in BIDS, but instead am saying that the developers all have the most recent packages from TFS and can browse to them on their local file systems.

[10] Odds are they use a tool like PacMan to batch update the existing packages. 😉

[11] Or perhaps not so politely depending on the hour…
