Looking for a good intro and tutorial for Hadoop.
Look no further: Hadoop Tutorial at Yahoo!.
Why Yahoo! - they were effectively the first major company to embrace and extend Hadoop, helping to make it into what it is today.
Monday, October 31, 2016
Overview of R versions: CRAN, MRO, and R Server
Wondering just how the different versions of R compare? And now doubt you have many of these questions:
- Should I use:
- pure open source community supported CRAN version,
- enhanced Microsoft R Open, or
- R Server
- Multi-threaded
- which R option, if any, supports multi-threading
- In-memory constraint
- which R option, if any, supports big data, and breaks out of the in-memory road block
Frank Banin has written up an excellent summary that compares the various R versions currently available. You can find it on SQLServerCentral: Advanced Analytics with R & SQL: Part I - R Distributions
Thursday, October 27, 2016
SSRS support for Power BI - in technical Preview!
If you love Power BI, but want a localized server option using SSRS, well its coming.
Announced on October 25, 2016 on the Power BI blog, they are releasing it into technical preview. For this first review, you'll need an Azure account. No mention of when a local, on-premises preview will be available, but still this is great news.
See: Technical Preview of Power BI reports in SQL Server Reporting Services
Video - here's a video showing Power BI against SSRS 2016
Create a modern enterprise reporting and mobile BI solution with SQL Server 2016
Ok, what you really want is to run a localized VM version. Christopher Finlan has added a posting that outlines how you can download the VM from Azure and get it up and running as a localized VM. Find that post here: How to run the Technical Preview of Power BI Reports in SQL Server Reporting Services on-prem using Hyper-V
Announced on October 25, 2016 on the Power BI blog, they are releasing it into technical preview. For this first review, you'll need an Azure account. No mention of when a local, on-premises preview will be available, but still this is great news.
See: Technical Preview of Power BI reports in SQL Server Reporting Services
Video - here's a video showing Power BI against SSRS 2016
Create a modern enterprise reporting and mobile BI solution with SQL Server 2016
Can I try it out on-premises?
Ok, what you really want is to run a localized VM version. Christopher Finlan has added a posting that outlines how you can download the VM from Azure and get it up and running as a localized VM. Find that post here: How to run the Technical Preview of Power BI Reports in SQL Server Reporting Services on-prem using Hyper-V
Azure - Business Continuity with Geo-Replication
One great reason to move to Azure is the ease with which you can setup and manage disaster recovery (DR), high availability (HA) and your overall business continuity plan (BCP). And in April of 2016, Microsoft announced support for geo-replication for all service tiers.
Great, so where to get started? SimpleTalk.com has posted a great beginner article that discusses Geo-Replication. You can think of Geo-Replication as either log-shipping or mirroring for the Azure SQL Database environment, but either way, one benefit is that the passive secondary databases have three distinct benefits:
- Readability! the secondaries are readable.
- Multiple passive secondaries
- A possible lower cost that the primary
You can find Simple-Talk's report at SQL Database: How to Configure Active Geo-Replication.
Tuesday, October 25, 2016
Master Data Services - an overview of the mechanics
Looking for an overview of the mechanics of Master Data Services. One that covers the internals, and other mechanical basics? Then check out this posting on Simple-Talk: Master Data Services – The Basics.
Warning for Custom Visualizations in Power BI can now be suppressed!
Kryptonite No More!
Early adopters of Power BI have long had to endure the pain of seeing a warning message whenever a user was reviewing a dashboard or report with a custom visual. Well, no more. Here is the official posting by Will Thompson (Product Manager, Power BI):
Hi everyone. We’ve added an option to turn off the warning for custom visuals. You’ll find it under the Security tab of the Options dialog. You can also remove unwanted visuals from your report by clicking the … in the Visualizations pane and selecting ‘Delete a Custom Visual’.
This is great news. Having a warning pop-up as one of your executives is looking over a Power BI dashboard is not what most of us want. You can read more about it here:
Friday, October 21, 2016
Data Center Comics, by Diane Alber
Looking for just that right data center comic? We'll look no further.
You can find some great comics at the site http://www.kipandgary.com/.
You can find some great comics at the site http://www.kipandgary.com/.
Thursday, October 13, 2016
Analytic database comparison
We found an in-depth article that compares several of the leading analytic databases. So if you are in the market for an analytic database, this might be a good place to do some comparison shopping. We have personally taken a look at Exasol and found that it performs well. Find our postings about Exasol here: Exasol postings at Realized Design.
Below are the summary results from the web site. If you want the full details, you can find them here: [The site is in Russian, so you'll want to enable translation.]
Development → Comparison of analytical in-memory database.
The database engines compare include:
- Greenplum
- Yandex Clickhouse
- SAP HANA
- EXASOL
- MemSQL
- Impala
Below are the summary results from the web site. If you want the full details, you can find them here: [The site is in Russian, so you'll want to enable translation.]
Development → Comparison of analytical in-memory database.
Results reported are in seconds - lower is better:
Inquiry
|
Greenplum
|
Exasol
|
Clickhouse
|
Memsql
|
SAP Hana
|
Impala
|
N1
|
14
|
<1
|
-
|
108
|
6
|
78
|
N2
|
131
|
eleven
|
-
|
-
|
127
|
Error
|
N3
|
67
|
85
|
-
|
-
|
122
|
733
|
T1
|
14
|
1.8
|
64
|
70
|
20
|
100
|
T2
|
17
|
4.2
|
86
|
105
|
20
|
127
|
D1
|
1393
|
284
|
-
|
45
|
1500
|
-
|
D2
|
> 7200
|
1200
|
-
|
> 7200
|
Error
|
-
|
Clickhouse Benchmarks
On the clickhouse.yandex site, they list several benchmark tests comparing Clickhouse to several other analytic databases. But - none of the queries JOIN another table. So sadly, not a very useful comparison. See: https://clickhouse.yandex/benchmark.htmlMemSQL vs. MySQL (w/ InnoDB)
Good read with thoughtful comments on comparing the two: https://dom.as/2012/06/26/memsql-rage/EXASOL release version 6 and a Free Small Business Edition
This week, EXASOL released version 6 of their analytic database. We have had good experience with version 5, so we are expecting version 6 to continue to excel. While the latest version of EXASOL has numerous enhancements, the two we especially found interesting was the addition of a data virtualization framework and the rebranding of the Community Edition to the Free Small Business Edition.
You can read more about it here:
LinkedIn - Analyzing data fast - Sean Jackson
EXASOL Free Trail
Data Virtualization
Analytic databases like EXASOL serve as a double edge sword for massive amounts of data. Their scalability provide the ability to store large repositories of data -yet, for many analytic tasks, users require much smaller, more targeted data sets. The addition of the data virtualization framework just might be what is needed to segment data for targeted end users.Free Small Business Edition
We suspect that is simply a legal rephrasing to allow organizations to feel they have the legal ability to run production systems using their Community Edition. To that end, we are pleased that this was done.Other Items
- Increased support for Hadoop
- Improved connectivity
- Expanded support for programming languages
- Increased performance and scalability
You can read more about it here:
LinkedIn - Analyzing data fast - Sean Jackson
EXASOL Free Trail
Wednesday, October 5, 2016
Best Practice for PolyBase Table Location - Use a Folder
Use Folders! For production, create a new folder for each file type. Then point your LOCATION= parameter to just the folder, and not the specific file.
1) Add more files to the directory, and Polybase External table will automagically read them.
2) Do INSERTS and UPDATES from PolyBase back to your files in Hadoop.
( See PolyBase - Insert data into a Hadoop Hue Directory ,
PolyBase - Insert data into new Hadoop Directory ).
3) It's cleaner.
Here is a typical data folder in Hortonworks:
Why?
1) Add more files to the directory, and Polybase External table will automagically read them.
2) Do INSERTS and UPDATES from PolyBase back to your files in Hadoop.
( See PolyBase - Insert data into a Hadoop Hue Directory ,
PolyBase - Insert data into new Hadoop Directory ).
3) It's cleaner.
Here is a typical data folder in Hortonworks:
And here is the corresponding Create External Table:
CREATE EXTERNAL TABLE [dbo].AWDW_CSV_Sales_Date (
ModelName nvarchar(200) NULL ,
ShipDate datetime NULL ,
ExtendedAmt money NULL ,
OrderQty smallint NULL
)
WITH (LOCATION='/user/hue/AWDW2012_SalesData',
DATA_SOURCE = hdp23 ,
FILE_FORMAT = CSVfile
);
Tuesday, October 4, 2016
SQL Server 2016 Version Differences
Updated Nov 16, 2016:
With the release of SP 1 for 2016, all things have changed, and in a good way!
Technet posted a detailed comparison of the different SQL Server 2016 versions (see below), dated May 26, 2016. Alas, the web page was not well formatted. So, below is the same data (again as of May 26, 2016) but with improved formatting.
Source: https://technet.microsoft.com/en-us/windows/cc645993(v=sql.90)
Naturally, reference Microsoft for any changes, adjustments or inaccuracies.
With the release of SP 1 for 2016, all things have changed, and in a good way!
- Many features only on Enterprise are now on all, or almost all editions
- they did a better job formatting the page, so go to:
https://msdn.microsoft.com/en-us/library/cc645993.aspx
to get all the updated features in Ent, Std, Web and Express.
Technet posted a detailed comparison of the different SQL Server 2016 versions (see below), dated May 26, 2016. Alas, the web page was not well formatted. So, below is the same data (again as of May 26, 2016) but with improved formatting.
Source: https://technet.microsoft.com/en-us/windows/cc645993(v=sql.90)
Naturally, reference Microsoft for any changes, adjustments or inaccuracies.
Cross-Box Scale Limits
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Maximum compute capacity used by a single instance - SQL Server
Database Engine1
|
Operating system maximum
|
Limited to lesser of 4 sockets or 24 cores
|
Limited to lesser of 4 sockets or 16 cores
|
Limited to lesser of 1 socket or 4 cores
|
Limited to lesser of 1 socket or 4 cores
|
Limited to lesser of 1 socket or 4 cores
|
Maximum compute capacity used by a single instance - Analysis
Services or Reporting Services
|
Operating system maximum
|
Limited to lesser of 4 sockets or 24 cores
|
Limited to lesser of 4 sockets or 16 cores
|
Limited to lesser of 1 socket or 4 cores
|
Limited to lesser of 1 socket or 4 cores
|
Limited to lesser of 1 socket or 4 cores
|
Maximum memory utilized per instance of SQL Server Database
Engine
|
Operating System Maximum
|
128 GB
|
64 GB
|
1 GB
|
1 GB
|
1 GB
|
Maximum memory utilized per instance of Analysis Services
|
Operating System Maximum
|
Tabular: 16 GB
|
N/A
|
N/A
|
N/A
|
N/A
|
MOLAP: 64 GB
|
||||||
Maximum memory utilized per instance of Reporting Services
|
Operating System Maximum
|
64 GB
|
64 GB
|
4 GB
|
N/A
|
N/A
|
Maximum relational database size
|
524 PB
|
524 PB
|
524 PB
|
10 GB
|
10 GB
|
10 GB
|
1. Enterprise Edition
with Server + Client Access License (CAL) based licensing
|
||||||
(not available for new agreements) is
limited to a maximum of 20 cores per SQL Server instance.
|
||||||
There are no limits under
the Core-based Server Licensing model. For more information,
|
||||||
see Compute Capacity
Limits by Edition of SQL Server.
|
||||||
RDBMS High Availability
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Server core support 1
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Log shipping
|
Yes
|
Yes
|
Yes
|
No
|
No
|
No
|
Database mirroring
|
Yes
|
Yes
|
Witness only
|
Witness only
|
Witness only
|
Witness only
|
Full safety only
|
||||||
Backup compression
|
Yes
|
Yes
|
||||
Database snapshot
|
Yes
|
|||||
Always On failover cluster instances
|
Yes
|
Yes
|
||||
Number of nodes is the operating system maximum
|
Support for 2 nodes
|
|||||
Always On availability groups
|
Yes
|
|||||
Up to 8 secondary replicas, including 2 synchronous secondary
replicas
|
||||||
Basic availability groups
|
Yes
|
|||||
Support for 2 nodes
|
||||||
Connection director
|
Yes
|
|||||
Online page and file restore
|
Yes
|
|||||
Online indexing
|
Yes
|
|||||
Online schema change
|
Yes
|
|||||
Fast recovery
|
Yes
|
|||||
Mirrored backups
|
Yes
|
|||||
Hot add memory and CPU
|
Yes
|
|||||
Database recovery advisor
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Encrypted backup
|
Yes
|
Yes
|
||||
Smart backup
|
Yes
|
Yes
|
||||
1 For more information on
installing SQL Server 2016 on Server Core, see Install SQL Server 2016 on
Server Core.
|
||||||
RDBMS Scalability and
Performance
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Operational analytics
|
Yes
|
|||||
In-Memory Column Store
|
Yes
|
|||||
Stretch Database
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Persistent Main Memory
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Multi-instance support
|
50
|
50
|
50
|
50
|
50
|
50
|
Table and index partitioning
|
Yes
|
|||||
Data compression
|
Yes
|
|||||
Resource Governor
|
Yes
|
|||||
Partition Table Parallelism
|
Yes
|
|||||
Multiple Filestream containers
|
Yes
|
|||||
NUMA Aware and Large Page Memory and Buffer Array Allocation
|
Yes
|
|||||
Buffer Pool Extension
|
Yes
|
Yes
|
||||
IO Resource Governance
|
Yes
|
|||||
In Memory OLTP
|
Yes
|
|||||
Delayed Durability
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
RDBMS Security
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express
|
Express with Advanced Services
|
Express with Tools
|
Row-level security
|
Yes
|
Yes
|
No
|
|||
Always Encrypted
|
Yes
|
|||||
Dynamic data masking
|
Yes
|
Yes
|
||||
Basic auditing
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Fine grained auditing
|
Yes
|
|||||
Transparent database encryption
|
Yes
|
|||||
Extensible key management
|
Yes
|
|||||
User-defined roles
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Contained databases
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Encryption for backups
|
Yes
|
Yes
|
||||
Replication
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Heterogeneous subscribers
|
Yes
|
Yes
|
||||
Merge replication
|
Yes
|
Yes
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Oracle publishing
|
Yes
|
|||||
Peer to peer transactional replication
|
Yes
|
|||||
Snapshot replication
|
Yes
|
Yes
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
SQL Server change tracking
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Transactional replication
|
Yes
|
Yes
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Yes (Subscriber only)
|
Transactional replication to Azure
|
Yes
|
Yes
|
Yes
|
|||
Transactional replication updateable subscription
|
Yes
|
|||||
Management Tools
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
SQL Management Objects (SMO)
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SQL Configuration Manager
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SQL CMD (Command Prompt tool)
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SQL Server Management Studio
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
|
Distributed Replay - Admin Tool
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
|
Distribute Replay - Client
|
Yes
|
Yes
|
Yes
|
|||
Distributed Replay - Controller
|
Yes (Up to 16 clients)
|
Yes (1 client)
|
Yes (1 client)
|
|||
SQL Profiler
|
Yes
|
Yes
|
No 1
|
No 1
|
No 1
|
No 1
|
SQL Server Agent
|
Yes
|
Yes
|
Yes
|
|||
Microsoft System Center Operations Manager Management Pack
|
Yes
|
Yes
|
Yes
|
|||
Database Tuning Advisor (DTA)
|
Yes
|
Yes 2
|
Yes 2
|
|||
Deploy a SQL Server Database to a Windows Azure VM Wizard
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SQL Server Data Files in Windows Azure
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
1 SQL Server Web, SQL
Server Express, SQL Server Express with Tools, and SQL Server Express with
Advanced Services can be profiled using SQL Server Standard and SQL Server
Enterprise editions.
|
||||||
2 Tuning enabled only on
Standard edition features
|
||||||
RDBMS Manageability
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
User instances
|
Yes
|
Yes
|
Yes
|
|||
LocalDB
|
Yes
|
Yes
|
||||
Dedicated admin connection
|
Yes
|
Yes
|
Yes
|
Yes with trace flag
|
Yes with trace flag
|
Yes with trace flag
|
PowerShell scripting support
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SysPrep support 1
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Support for data-tier application component operations - extract,
deploy, upgrade, delete
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Policy automation (check on schedule and change)
|
Yes
|
Yes
|
Yes
|
|||
Performance data collector
|
Yes
|
Yes
|
Yes
|
|||
Able to enroll as a managed instance in multi-instance
management
|
Yes
|
Yes
|
Yes
|
|||
Standard performance reports
|
Yes
|
Yes
|
Yes
|
|||
Plan guides and plan freezing for plan guides
|
Yes
|
Yes
|
Yes
|
|||
Direct query of indexed views (using NOEXPAND hint)
|
Yes
|
Yes
|
Yes
|
|||
Automatic indexed views maintenance
|
Yes
|
Yes
|
Yes
|
|||
Distributed partitioned views
|
Yes
|
Partial. Distributed partitioned views are not updatable
|
Partial. Distributed partitioned views are not updatable
|
Partial. Distributed partitioned views are not updatable
|
Partial. Distributed partitioned views are not updatable
|
Partial. Distributed partitioned views are not updatable
|
Parallel indexed operations
|
Yes
|
|||||
Automatic use of indexed view by query optimizer
|
Yes
|
|||||
Parallel consistency check
|
Yes
|
|||||
SQL Server Utility Control Point
|
Yes
|
|||||
Contained databases
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Buffer pool extension
|
Yes
|
Yes
|
||||
1 For more information,
see Considerations for Installing SQL Server Using SysPrep.
|
||||||
Development Tools
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Microsoft Visual Studio integration
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Intellisense (Transact-SQL and MDX)
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
SQL Server Data Tools (SSDT)
|
Yes
|
Yes
|
Yes
|
Yes
|
||
SQL query edit and design tools
|
Yes
|
Yes
|
||||
Version control support
|
Yes
|
Yes
|
||||
MDX edit, debug, and design tools
|
Yes
|
Yes
|
||||
Programmability
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Basic R integration
|
Yes
|
Yes
|
Yes
|
Yes
|
||
Advanced R integration
|
Yes
|
|||||
R Server (Standalone)
|
Yes
|
|||||
Polybase compute node
|
Yes
|
Yes
|
||||
Polybase head node
|
Yes
|
|||||
JSON
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Query Store
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Temporal
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Common Language Runtime (CLR) Integration
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Native XML support
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
XML indexing
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
MERGE & UPSERT capabilities
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
FILESTREAM support
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
FileTable
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Date and Time datatypes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Internationalization support
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Yes
|
Full-text and semantic search
|
Yes
|
Yes
|
Yes
|
Yes
|
||
Specification of language in query
|
Yes
|
Yes
|
Yes
|
Yes
|
||
Service Broker (messaging)
|
Yes
|
Yes
|
No (Client only)
|
No (Client only)
|
No (Client only)
|
No (Client only)
|
Transact-SQL endpoints
|
Yes
|
Yes
|
Yes
|
|||
Data Warehouse
|
||||||
Feature
|
Enterprise
|
Standard
|
Web
|
Express with Advanced Services
|
Express with Tools
|
Express
|
Create cubes without a database
|
Yes
|
Yes
|
||||
Auto-generate staging and data warehouse schema
|
Yes
|
Yes
|
||||
Change data capture
|
Yes
|
|||||
Star join query optimizations
|
Yes
|
|||||
Scalable read-only Analysis Services configuration
|
Yes
|
|||||
Parallel query processing on partitioned tables and indexes
|
Yes
|
|||||
xVelocity memory optimized columnstore indexes
|
Yes
|
|||||
Global batch aggregation
|
Yes
|
|||||
Subscribe to:
Posts (Atom)