How to Leverage SSIS in ADF (Packt)

Learn how to leverage SSIS on premises to interact with cloud data in ADF v2 in this tutorial by Christian Cote, Michelle Gutzait, and Giuseppe Ciaburro, the authors of Hands-On Data Warehousing with Azure Data Factory.

SQL Server Integration Services (SSIS) has been Microsoft's ETL tool of choice for more than a decade. Many enterprises have used SSIS to load their on-premises data warehouses since its inception in SQL Server 2005.

In the last couple of years, IT departments have had to deal with different kinds of data and with specific toolsets to process them. SSIS has been able to access cloud data from on-premises ETL servers since 2015, thanks to the Azure Feature Pack (https://docs.microsoft.com/en-us/sql/integration-services/azure-feature-pack-for-integration-services-ssis?view=sql-server-2017). However, issues arise when most of the ETL runs in the cloud and SSIS is only a small part of the chain. And, up until now, it was very complex to use ADF v1 as the orchestrator in the cloud with some SSIS package calls in the pipeline.

The following sections will describe how SSIS on-premises can be successfully leveraged to interact with cloud data in ADF v2.

Sample setup

The first thing that you need to do is set up an SSIS solution. You can use Microsoft's WideWorldImporters samples, which can be found at https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0.

From the list, you can download the databases and the SSIS solution:

  • Daily.ETL.ispac: This is the SSIS solution that loads data from the WideWorldImporters database into the WideWorldImportersDW database
  • WideWorldImporters-Full.bacpac: This is the transactional database that is used as a source for the data warehouse
  • WideWorldImportersDW-Full.bacpac: This is the sample data warehouse database that you’ll use in this article

Sample databases

The first step is to set up sample databases. For this, you’ll need to install SQL Server Management Studio (SSMS). This program is freely available from Microsoft at https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms.

Start SSMS to connect to your server and right-click on the Databases folder in Object Explorer. Select Import Data-tier Application… from the contextual menu as shown in the following screenshot:

The Import Data-tier Application wizard will open. Now click on Next.

In the Import Settings step, make sure that the Import from local disk option is selected and click on Browse…. In the Open window, navigate to the location where you previously downloaded the .bacpac files and click on Open, as shown in the following screenshot:

Click on Next on the Import Settings screen.

You can now proceed to the Database Settings step. Make sure that the entry in the New database name textbox is WideWorldImporters and click on Next.

The Summary will be displayed as shown in the following screenshot. Click on Finish to start the BACPAC import process:

The database is created, as well as all the objects in it. This process may take a few minutes. Click on Close to finish:

Repeat the same process for the other BACPAC file, WideWorldImportersDW-Full.bacpac. Make sure that you specify WideWorldImportersDW as the database name.
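Once both imports have completed, a quick query in SSMS can confirm that the two sample databases are online (a minimal sanity check, using the database names from this article):

```sql
-- Verify that both sample databases were imported and are online.
SELECT name, state_desc
FROM sys.databases
WHERE name IN (N'WideWorldImporters', N'WideWorldImportersDW');
```

Both rows should report ONLINE before you move on to the SSIS part.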

SSIS components

An SSIS setup usually requires an existing installation of SQL Server with the Integration Services feature selected. With ADF v2, this step is no longer strictly necessary, since you can deploy the packages to an Azure database.

That being said, it's always better to test your SSIS solution on premises before deploying it to Azure. The next section describes how to set up a sample SSIS solution.

Integration Services Catalog setup

This step is necessary for brand new SQL Server installations where you may want to deploy SSIS packages. When logged in to the server with SSMS, simply right-click on the Integration Services Catalogs folder and select Create Catalog…

The Create Catalog window will open up. You’ll need to fill in the properties and click on OK to create the integration services catalog:

Next, you need to create a folder for your project. In Object Explorer, expand the Integration Services Catalogs node, right-click on SSISDB, and select Create Folder… from the contextual menu:

The Create Folder window will open up. Set the properties as shown in the following screenshot and click on OK:
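If you prefer scripting these steps, the same folder can be created with the SSISDB catalog's public stored procedures (a minimal sketch, assuming the catalog from the previous step already exists and the folder is named DWDailyRefresh, as used later in this article):

```sql
-- Create the project folder in the SSIS catalog via T-SQL.
USE SSISDB;
GO
EXEC catalog.create_folder @folder_name = N'DWDailyRefresh';
```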

You’re now ready to open the solution in Visual Studio. Here, you can deploy it to your newly created on-premises Integration Services Catalogs folder.

Sample solution in Visual Studio

For this section, you'll need an existing installation of Visual Studio 2015 or 2017 with SQL Server Data Tools (SSDT) installed. SSDT is available at https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt?view=sql-server-2017.

Once SSDT is opened, click on FILE | New | Project…:

The New Project window will open up. Select Integration Services Import Project Wizard, type WWImportersDaily in the textbox beside Name, and click on OK:

The Integration Services Import Project Wizard window will open; click on Next to proceed to the next step:

In the Select Source window, select the Project deployment file radio button and click on Browse…. This will open up the Open file window. Navigate to the folder where the SSIS sample file, Daily.ETL.ispac, was downloaded and click on Open:

Once back to the wizard, click on Next to proceed to the next step.

In the Review window, click on the Import button.

Once the project has been imported, you can see it in Solution Explorer:

You’ll have to modify the two connection managers if your server is not the default instance. To do so, double-click on WWI_DW_Destination_DW.conmgr and WWI_Source_DB.conmgr, and change their server properties.

If the connections are online, you can run the package to see whether everything is working fine.

Deploying the project on premises

You’re now ready to deploy and test your project on your local server:

1. From the Solution Explorer, right-click on the WWImportersDaily project and select Deploy. The Integration Services Deployment Wizard will open up as shown in the following screenshot. Click on Next to proceed to the next step:

2. Connect to the SQL Server instance where the Integration Services catalog was created earlier and click on Browse…. The Browse for Folder or Project window will open.

3. Expand SSISDB and select the DWDailyRefresh folder.

4. Click on OK to close the window:

5. On the Select Destination window, click on Next.

6. The next step is Validate; however, on-premises deployments don't use it.

7. Click on Next to proceed to the Review window. Here, you can review whether you have selected the right .ispac file and confirm that you’re going to deploy it to the right folder in SSIS:

8. Now click on Deploy to start the deployment of the .ispac file.

9. The last step, Results, shows the progress of your deployment. If everything is successful (as it should be), click on the Close button:

You’re now done with the deployment.
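The wizard's work can also be done in T-SQL with the catalog.deploy_project stored procedure. The following is a hedged sketch: the .ispac path shown is hypothetical, so adjust it to wherever you downloaded the sample file.

```sql
USE SSISDB;
GO
DECLARE @project_stream VARBINARY(MAX),
        @operation_id  BIGINT;

-- Read the deployment file from disk (hypothetical path; adjust as needed).
SELECT @project_stream = BulkColumn
FROM OPENROWSET(BULK N'C:\Temp\Daily.ETL.ispac', SINGLE_BLOB) AS ispac;

-- Deploy the project into the folder created earlier.
EXEC catalog.deploy_project
    @folder_name    = N'DWDailyRefresh',
    @project_name   = N'WWImportersDaily',
    @project_stream = @project_stream,
    @operation_id   = @operation_id OUTPUT;
```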

10. The next step is to execute the package via the catalog. You can switch to SQL Server Management Studio for this.

11. In the Object Explorer, expand Integration Services Catalogs and navigate to the DailyETLMain.dtsx package.

12. Right-click on it and select Execute… from the contextual menu to open up the execution window:

13. Go to the Connection Managers tab, check the Connection String property for both connectors, and adjust them if necessary. If everything is okay, click on OK to start the package execution:

14. Click on Yes to see the report. Once the report opens, you might notice that the package is still running. When you refresh it, though, you'll see that the package has executed successfully:
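Instead of the Execute… dialog, the same run can be scripted against the catalog. This is a sketch using the standard SSISDB procedures; the SYNCHRONIZED property makes the call wait until the package finishes, so the final status query reflects the completed run.

```sql
USE SSISDB;
GO
DECLARE @execution_id BIGINT;

-- Create an execution for the deployed package.
EXEC catalog.create_execution
    @folder_name  = N'DWDailyRefresh',
    @project_name = N'WWImportersDaily',
    @package_name = N'DailyETLMain.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Run synchronously so the batch returns only when the package is done.
EXEC catalog.set_execution_parameter_value
    @execution_id,
    @object_type     = 50,
    @parameter_name  = N'SYNCHRONIZED',
    @parameter_value = 1;

EXEC catalog.start_execution @execution_id;

-- Check the outcome (status 7 = succeeded in catalog.executions).
SELECT execution_id, status
FROM catalog.executions
WHERE execution_id = @execution_id;
```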

You have successfully leveraged SSIS on-premises. If you found this article to be an interesting read, you can explore Hands-On Data Warehousing with Azure Data Factory to learn how to build your own ETL solutions and address the key challenges faced while building them.

Cheers!

P.S. This post was provided by the good people at Packt Publishing. Thank you, Ron!

4th SQL Server UG meeting at MIC Varazdin

Today we had the honor of hosting the 4th meeting of the SQL Server User Group. The main theme of this meeting was “SQL Server 2008 Disaster Recovery and High Availability 101“. Dean Vitner (MVP) was the main presenter, and I was there to fill the gap with some demos until the so-called P’n’P part of the meeting… 🙂

It was nice to see so many people at the meeting again (to all of them: thanks for coming!); it seems that SQL and high availability are popular themes… For those who didn’t make it to Varazdin, we arranged a virtual meeting (via Live Meeting). All the info (and hopefully the video) will be available at the Microsoft Community Croatia portal soon.

I think the meeting went well, and I hope the attendees are happy (we definitely need more slots for these topics). I hope to host another user group meeting at the Microsoft Innovation Center Varazdin soon…

 

1st SQL UG meeting (report)

Yesterday in Zagreb, as announced, the 1st meeting of the Croatian SQL user group was held. In short, it was standing room only: the Dante room (in the Microsoft office) wasn’t big enough for everyone interested, which is a very good thing, because it shows there is a lot of interest in the SQL group (and not just in Zagreb).

After the initial information about the founding of the group, the sponsors, and upcoming meetings, two excellent talks were given by two of our MVPs.

First up was Dean Vitner, who talked about the transaction log. Even though I’m not a DBA, I found the talk quite interesting (the only downside being that it had to end just as it got interesting, because the pizzas arrived). In the second talk, Marko Čulo spoke about data warehouses, which was also quite an interesting session.

I think the group’s first meeting went more than well. I hope the leads are satisfied too, and that they’ll need more and more space/seats for each upcoming meeting…

Since I recorded the talks, the recording should (along with all the other materials) become available somewhere (I don’t know the exact location at the moment). For now, you can see the photos on Gordon’s SkyDrive (here), and we’ll soon collect all the materials and put them in one place…

A new user group in Croatia – SQL UG!

I’m very glad to finally be able to share this news with you: the Microsoft community in Croatia is slowly growing, because database administrators (and other SQL fans) have (finally) come into their own and gotten their own separate user group. The first meeting of the group will be held in Zagreb (Microsoft Hrvatska, Turinina 3/IV, the Dante room), and the idea is to hold future meetings in other places as well…

If you’re free on Tuesday, February 1, 2011 (at 17:30), drop by and see what Dean Vitner and Marko Čulo will be talking about (and be part of “making history”). You can register at http://sqlserverug.eventbrite.com/, and feel free to recommend it to anyone who might be interested…

The original text of the invitation:

Dear colleagues,

I’m glad that we can finally invite you to the first meeting of the first Croatian SQL Server user group!

When? February 1, 2011, from 17:30.

Where? Zagreb, the Microsoft Hrvatska office, Turinina 3. In the future, we intend to hold meetings in other centers in Croatia as well, not just in Zagreb.

Why? The need for a separate DBA/DBDev community has existed for quite a long time. A critical mass of people has now gathered who are interested enough to invest their effort and free time into creating it.

Who? You are invited by Marko Čulo and Dean Vitner, SQL Server MVPs.

Agenda:

17:15 – 17:30     Gathering

17:30 – 17:45     Welcome and introductory remarks: where we are and where we’re going

17:45 – 18:45     SQL Server transaction log (Dean Vitner)
The transaction log is the most important file in a database. Why do we say that? What’s in it, and what is it for? What exactly are transactions, what is recovery, what is a checkpoint…? Which tools can we use to monitor the contents of the log?

18:45 – 19:15     P’n’P

19:15 – 20:15     Data warehousing basics (Marko Čulo)
Data warehouses and their design are the foundation of BI implementations, as well as the basis for quality business analysis. A well-designed warehouse, and consequently a well-designed data processing and transfer (ETL) process, reduces the problems that typically appear. The talk is based on best practices and examples of solving implementation and design problems. The examples themselves are meant as a starting point for discussion, since the rule says there is no perfect design, only starting points for discussion.

Please register at http://sqlserverug.eventbrite.com/. The number of seats is limited.

If you’re not from Zagreb, or can’t attend the meeting for some other reason, but would like to hear the talks and take part in the discussion, contact us at [email protected]. If there’s enough interest, we’ll organize streaming of the talks.