Wednesday, February 28, 2018


Qubole + Snowflake: Getting Started with Machine Learning, Big Data, and Cloud Data Warehouses [1 of 3]
Qubole and Snowflake have partnered to bring a new level of integrated product capabilities that make it easier and faster to build and deploy machine learning (ML) and artificial intelligence (AI) models in Apache Spark using data stored in Snowflake and other big data sources.
Through this product integration, data engineers can also use Qubole to read Snowflake data, perform advanced data preparation to create refined data sets, and write the results back to Snowflake, enabling new analytic use cases.
In this three-part blog series we cover the use cases directly served by the Qubole-Snowflake integration. First, we discuss how to get started with ML in Apache Spark using data stored in Snowflake. Parts two and three will cover reading and transforming data in Apache Spark, extracting data from other sources, training ML models, and loading the results into Snowflake.
How the Process Works
Advanced analytics allows companies to derive maximum value from the critical information they store in their Snowflake cloud data warehouses. Among the many tools at the disposal of data teams, Apache Spark and ML are ideal for carrying out these value-generating advanced analytics.
The Qubole-Snowflake integration allows data scientists and other users to:
• Leverage the scalability of the cloud. Both Snowflake and Qubole separate compute from storage, allowing organizations to scale computing resources up or down as data processing needs change. Qubole’s workload-aware autoscaling automatically determines the optimal size of the Apache Spark cluster based on the workload.
• Securely store connection credentials. When Snowflake is added as a Qubole data store, credentials are stored encrypted and never need to appear in plain text in notebooks, enabling secure collaboration.
• Configure and start up Apache Spark clusters hassle-free. The Snowflake Connector comes preloaded on Qubole Apache Spark clusters, eliminating the manual steps of bootstrapping or loading Snowflake JAR files into Apache Spark.
As the figure below illustrates, the process begins by adding Snowflake as a data store through the Qubole interface. Users can then select their preferred language for reading data from Snowflake, process the data with machine learning algorithms, and write the results back to Snowflake or to any other storage or application, including dashboards, mobile apps, and more.
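As a rough sketch of that read/train/write loop, the snippet below assembles the connection options a Spark job would pass to the Snowflake Connector. The database, warehouse, and table names are illustrative assumptions, not taken from the post; the option keys follow the open-source Snowflake Connector for Spark, and in Qubole the credentials themselves come from the configured data store rather than the notebook.

```python
def snowflake_options(database, schema, warehouse, table):
    """Assemble the options dict passed to spark.read.format("snowflake").

    Credentials are intentionally absent: on Qubole they are supplied by
    the configured data store, not written into notebook code.
    """
    return {
        "sfDatabase": database,    # Snowflake database to read from
        "sfSchema": schema,        # schema within that database
        "sfWarehouse": warehouse,  # virtual warehouse that executes the query
        "dbtable": table,          # table name (or use "query" for SQL)
    }

# In a Spark notebook, the read -> train -> write loop would then look
# roughly like this (hypothetical names, shown as comments only):
#
#   df = (spark.read.format("snowflake")
#             .options(**snowflake_options("SALES", "PUBLIC",
#                                          "ANALYTICS_WH", "CUSTOMER_EVENTS"))
#             .load())
#   model = ml_pipeline.fit(df)            # train on the Snowflake data
#   scores = model.transform(df)
#   (scores.write.format("snowflake")
#          .options(**snowflake_options("SALES", "PUBLIC",
#                                       "ANALYTICS_WH", "CUSTOMER_SCORES"))
#          .mode("overwrite").save())
```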
Adding Snowflake as a Data Store in Qubole
If you’re ready to dive into the Qubole-Snowflake integration for your own analytics projects, getting started is straightforward. The first step is to connect to a Snowflake virtual data warehouse from the Qubole Data Service (QDS). Administrators can do this by adding Snowflake as a Data Store; individual users can add a number of Data Stores to a Qubole account. These Data Stores are direct connections to Snowflake databases whose configured credentials are encrypted and protected by QDS. Non-administrator users cannot see or edit the credentials.
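The credential model described above can be illustrated with a rough Python analogy (this is not the Qubole API): notebook code refers to a connection only by its data store name, while the secrets live outside the notebook, stood in for here by environment variables, and non-administrator users see only a redacted view.

```python
import os

def resolve_data_store(name):
    """Look up credentials for a named data store without writing them inline.

    Stand-in for an encrypted credential store: here the secrets come from
    environment variables named DATASTORE_<NAME>_USER / _PASSWORD.
    """
    prefix = "DATASTORE_" + name.upper() + "_"
    creds = {
        "user": os.environ.get(prefix + "USER"),
        "password": os.environ.get(prefix + "PASSWORD"),
    }
    if not all(creds.values()):
        raise KeyError("data store '%s' is not configured" % name)
    return creds

def redacted(creds):
    """What a non-administrator user would see: no secrets in plain text."""
    return {key: "****" for key in creds}
```

The point of the sketch is the shape of the workflow, not the mechanism: notebooks share the data store name freely, and only the platform ever handles the decrypted values.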
