Polybase configuration on SQL Server 2017 Part II

Nowadays your precious data can be stored anywhere: not just on several servers running different SQL Server versions, but probably spread across the cloud as well. Storing data in the cloud with Stretch Database is a good way to free your local disks from excess data while still being able to query it, use it in your SSIS and BI environment, and keep ETL times acceptable. With Microsoft’s PolyBase you can access, import and export structured, semi-structured and unstructured data on the Hadoop platform and Azure Blob Storage using T-SQL.

The best business knowledge comes from the data you collect, so it is a good idea to put that data to good use. Businesses collect lots of data, but in most cases this is also where it ends. Those who have read my posts before know I am all about combining various sources with linked servers. Since SQL Server 2014, lots of new features have become available for using all your data on business intelligence platforms.

In my last post, we had a first look at a PolyBase installation and did some troubleshooting. This time we are going to configure and use PolyBase in SQL Server 2017. I’m going to use Blob storage on Azure to demonstrate how you can implement this solution in your (local) SQL database.

First things first: now that you’ve got your PolyBase installation ready, check that the PolyBase services exist and are running.

Services: ‘SQL Server PolyBase Data Movement’ and ‘SQL Server PolyBase Engine’
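Besides looking at the Windows services, you can ask the instance itself whether the feature is present. A minimal sketch, assuming you are connected to the SQL Server 2017 instance in SSMS; the column alias is my own:

```sql
-- Returns 1 when the PolyBase feature is installed on this instance, 0 when it is not.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
```

This only confirms that the feature is installed. Whether the two services are actually running still has to be checked in the Services console or in SQL Server Configuration Manager.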

You need to configure PolyBase before you can start using it. Fire up SSMS, open a new query window and type:

EXEC sp_configure 'hadoop connectivity', 4;

RECONFIGURE;

Option 4 covers Azure Blob Storage (WASB[S]). For more info on the available PolyBase connectivity options, take a look here. Run the query, then restart both PolyBase services on the machine to finish the configuration.
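To double-check that the setting was picked up, the current value can be read back from sys.configurations. A small sketch: value is the configured setting and value_in_use is what the instance is running with at the moment, so both should read 4 once the configuration is complete.

```sql
-- Read back the PolyBase connectivity setting.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'hadoop connectivity';
```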

To start using Blob storage you need an Azure storage account. If you don’t have an Azure account yet, create one here.

Log in to Azure and, in the menu on the left, create a new storage account.

Give it some time. Once the storage account is created, you also need to create a container in it.

To connect your local database to Azure storage, you need to get the storage key from the Azure storage account you just created and put it in the configuration file of your SQL Server installation.
Look for the core-site.xml file in the installation path of SQL Server.
The path looks similar to this: “C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf”
This is the config directory that contains the core-site.xml file.

Open the file in Notepad and add the property below before the block of code mentioning Kerberos.

Fill in the storage account name (in my case polybasedemo) and the storage key, then save the file.

<property>
  <name>fs.azure.account.key.storagename.blob.core.windows.net</name>
  <value>storagekey</value>
</property>

Now we have to create an external data source in SSMS. Replace containername@storagename with the names you created on Azure.

CREATE EXTERNAL DATA SOURCE PolyBaseDemo
WITH (
    TYPE = HADOOP,
    -- Format: wasbs://containername@storagename.blob.core.windows.net/
    LOCATION = 'wasbs://containername@storagename.blob.core.windows.net/'
);
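As an alternative to keeping the storage key in core-site.xml, SQL Server also lets you attach the key to the data source through a database scoped credential. The sketch below is not what this demo uses. The names AzureStorageCredential and PolyBaseDemoWithCredential, the password and the key placeholder are assumptions you would replace with your own values:

```sql
-- A database master key is required before a database scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<a strong password>';

-- IDENTITY is not used for authentication against Azure storage; SECRET holds the storage account key.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user', SECRET = '<your Azure storage key>';

-- Same location as before, but the key now comes from the credential.
CREATE EXTERNAL DATA SOURCE PolyBaseDemoWithCredential
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://containername@storagename.blob.core.windows.net/',
    CREDENTIAL = AzureStorageCredential
);
```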

Next up, we create the external file format, which describes the layout of the external data on Azure Blob Storage. This needs to be done before the external table can be created.

CREATE EXTERNAL FILE FORMAT PolybaseFormat 
WITH ( FORMAT_TYPE = DELIMITEDTEXT , FORMAT_OPTIONS ( FIELD_TERMINATOR = ',' ) );
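To confirm that both objects exist, you can query the catalog views. A quick sketch, to be run in the same database where the objects were created:

```sql
-- External data sources defined in the current database.
SELECT name, type_desc, location
FROM sys.external_data_sources;

-- External file formats defined in the current database.
SELECT name, format_type, field_terminator
FROM sys.external_file_formats;
```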

This creates two new objects in SSMS, and all that is left is to create the external table itself. In this demo I use an Excel sheet with some irrelevant data, saved as CSV, to have some test data available.

USE [DemoPolybase];
GO

CREATE EXTERNAL TABLE [dbo].[Customers] (
    [Name] VARCHAR(255) NULL,
    [adres] VARCHAR(255) NULL,
    [postalcode] VARCHAR(6) NULL
)
WITH (
    LOCATION = N'/Customer_Export.csv',
    DATA_SOURCE = PolyBaseDemo,
    FILE_FORMAT = PolybaseFormat,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 10
);
GO

And to see if this worked, just query the data 🙂

SELECT * FROM [Customers]
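Since the intro mentioned importing, here is a minimal sketch of copying the external data into a regular local table with SELECT ... INTO. The target table name Customers_Local is hypothetical:

```sql
-- Copy the rows from the external table on Azure into a new local table.
SELECT [Name], [adres], [postalcode]
INTO [dbo].[Customers_Local]
FROM [dbo].[Customers];

-- From here on the data can be queried without going to Azure.
SELECT COUNT(*) AS ImportedRows
FROM [dbo].[Customers_Local];
```

Keep in mind that rows that cannot be parsed count against the REJECT_VALUE of 10 set on the external table. Once more rows than that are rejected, the query fails.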

Now put this knowledge into action yourself with some real data!


Polybase installation on SQL Server 2017 Part I: Oracle JRE 7 Update 51 (64-bit) or higher is required

A fresh new year, so a good time to check out the newest SQL Server! So far the installation process of SQL Server 2017 brings no big surprises. Just like with SQL Server 2016, SSMS is an optional, separate download from the Microsoft website; the link is provided once the installation has finished.

Next, the installation and configuration starts. I’ll highlight the one pain in the ass I encountered this time.

I already talked about the PolyBase feature in a podcast in early 2016, but this time you get an install and setup walkthrough, plus a warning for everyone bravely installing Oracle’s newest version of Java.

When you select PolyBase for installation and pay close attention, or have already used it in the 2016 edition, you will know that you need the Oracle Java SE Runtime Environment.

If this is not already installed on your computer, the installation will fail with this message:

---------------------------
 Rule Check Result
 ---------------------------
 Rule "Oracle JRE 7 Update 51 (64-bit) or higher is required for Polybase" failed.

This computer does not have the Oracle Java SE Runtime Environment Version 7 Update 51 (64-bit) or higher installed. The Oracle Java SE Runtime Environment is software provided by a third party. Microsoft grants you no rights for such third-party software. You are responsible for and must separately locate, read and accept applicable third-party license terms. To continue, download the Oracle SE Java Runtime Environment from https://go.microsoft.com/fwlink/?LinkId=526030.
 ---------------------------
 OK
 ---------------------------

 

You need to head over to oracle.com and install version 7 Update 51 or higher. Currently 9.0.1 is the highest, so it seems legit to install that one.

Once you have downloaded the correct product (in my case I chose Windows Offline), run the Java installer and return to your SQL Server setup for a re-run.

Wait, what? Same message! “Requires JRE 7 update 51 or higher”. I had just installed the latest JRE version, did a restart, and Java was up and running.

So, this is the moment you ask yourself: do I really, really want the PolyBase feature that badly? The answer is yes! To start troubleshooting, I decided to go back a version. The oldest version available from the site without using my Oracle account is 8 Update 151, and guess what… this did the trick!

So stay away from the newest version 9 for as long as possible.

The next post will cover the setup and configuration of PolyBase.