The Sandbox Concept
A Purpose-Built Cloud Environment to Solve Complex Business Problems
Every modern business must continuously assess itself. Whether the task is solving a throughput problem in the supply chain, optimizing a sophisticated network of operating units, or forecasting demand under uncertainty, making multi-dimensional decisions from data is the oxygen of corporate life.
Most firms assemble spreadsheets from data gleaned from corporate networks. Yet spreadsheets are limited in computational power and, once finished, are usually filed away on a hard drive, never to be seen again. That is an ineffective and inefficient way to solve problems, which is the most important business function of an organization. There must be a better way.
A Sandbox is an environment specifically designed for business problem solving using data analytics
Enter the “sandbox” concept. A real sandbox is a walled-off area scattered with toys, where children can dream, build, and play without hurting anything (as long as they stay in the sandbox). For our purposes, it is a “builder’s space” for data scientists and analysts. A proper sandbox should have all of the toys (tools) needed to conduct any conceivable analysis. Here’s how it works:
A secure “private cloud” environment is set up inside the organization, completely behind the corporate firewall. This provides the advantages of a seamless cloud architecture without extraordinary security compromises. The private cloud is separate from the production IT environment but interconnected with it for data sharing purposes.
In the cloud, users can create simple functions or sophisticated models using a language that is purpose-built for mathematical problem solving. This technical computing language serves a wide range of computing needs that go well beyond spreadsheets.
The uniform environment promotes code re-use. It is expected that users will take advantage of code fragments that have been created before, snapping them together in “Lego-like” fashion to enhance the speed of development and to reduce errors.
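As a rough illustration of the “Lego-like” re-use described above, the sketch below snaps two small, independently written fragments into a single model. Python stands in here for whatever technical computing language the sandbox provides, and the names moving_average, scale, and pipeline are hypothetical, invented only for this example.

```python
from functools import reduce
from typing import Callable, List

Series = List[float]
Step = Callable[[Series], Series]


def moving_average(window: int) -> Step:
    """Reusable fragment: smooth a series with a trailing moving average."""
    def step(values: Series) -> Series:
        return [
            sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))
        ]
    return step


def scale(factor: float) -> Step:
    """Reusable fragment: multiply every value by a constant factor."""
    def step(values: Series) -> Series:
        return [v * factor for v in values]
    return step


def pipeline(*steps: Step) -> Step:
    """Snap fragments together; each step feeds its output to the next."""
    def run(values: Series) -> Series:
        return reduce(lambda acc, s: s(acc), steps, values)
    return run


# A second analyst builds a new model purely from existing fragments.
smoothed_and_doubled = pipeline(moving_average(3), scale(2.0))
result = smoothed_and_doubled([10.0, 20.0, 30.0, 40.0])
print(result)
```

Because every fragment shares the same input and output shape, a new analysis is mostly a matter of choosing and ordering pieces that have already been tested.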
Where Does The Data Come From?
There are several options for getting data into the sandbox. The first is a direct connection to corporate databases using some form of ETL (extract, transform, load) mechanism. Another is a Data Federation approach, in which an ontology is used to create a virtual layer that presents a normalized database to the sandbox while the data actually remains in its raw form across a constellation of internal and external databases. This latter approach has the advantage of letting the data reside in its “natural” repositories undisturbed while still maintaining a clean, uniform database for the sandbox users to work with.
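The Data Federation idea can be sketched in a few lines. This is a toy model under stated assumptions, not a description of any particular federation product: the two in-memory lists stand in for separate databases, and ONTOLOGY, SOURCES, and federated_orders are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterator, List

# Hypothetical raw sources, each keeping its own field names.
ERP_ORDERS: List[Dict[str, object]] = [{"cust_id": 1, "amt_usd": 250.0}]
CRM_SALES: List[Dict[str, object]] = [{"customerNumber": 2, "total": 99.5}]

SOURCES = {"erp": ERP_ORDERS, "crm": CRM_SALES}

# Hypothetical ontology: maps each source's raw fields to canonical names.
ONTOLOGY = {
    "erp": {"cust_id": "customer_id", "amt_usd": "amount"},
    "crm": {"customerNumber": "customer_id", "total": "amount"},
}


def federated_orders() -> Iterator[Dict[str, object]]:
    """Virtual layer: yield normalized rows on demand.

    The raw records are read in place and never modified or copied
    into a central store.
    """
    for name, rows in SOURCES.items():
        mapping = ONTOLOGY[name]
        for row in rows:
            yield {canonical: row[raw] for raw, canonical in mapping.items()}


rows = list(federated_orders())
total = sum(r["amount"] for r in rows)
print(rows, total)
```

Sandbox users see only the canonical fields (customer_id, amount), while each source keeps its original schema. An ETL approach would instead copy and reshape the records into a database owned by the sandbox.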
Why Do Companies Need Sandboxes?
Corporations solve complex problems each and every day. Increasingly, those problems are addressed with a dose of data and analytics. That in turn implies a computing environment for models (financial, operational, forecasting) that tell the story behind a complex problem. Spreadsheets are fine for the simplest problems, but they are no match for the complexity of many of the key decisions an organization faces routinely.
The sandbox gives the analyst a “safe” place to build models and run experiments as a means of understanding how the company works (or would work under a certain set of conditions). Moreover, one analyst’s work can build on another’s, as a single shared sandbox by its nature tends to support code re-use. Done well, an organization gains richer content in its sandbox over time, making it a vital asset.
Where Did The Ideas For The Sandbox Concept Come From?
Much of the inspiration for the sandbox concept comes from the book
Profit from Science, by George E. Danner