Google is adding another product to its range of big data services on the Google Cloud Platform today. The new Google Cloud Dataproc service sits between managing the Spark data processing engine or Hadoop framework directly on virtual machines and using a fully managed service like Cloud Dataflow, which lets you orchestrate your data pipelines on Google’s platform.
As Google’s director of product management for Google Cloud Platform Greg DeMichillie told me, Dataproc users will be able to spin up a Hadoop cluster in under 90 seconds — significantly faster than other services — and Google will only charge 1 cent per virtual CPU/hour in the cluster. That’s on top of the usual cost of running virtual machines and data storage, but as DeMichillie noted, you can add Google’s cheaper preemptible instances to your cluster to save a bit on compute cost. Billing is per-minute, with a ten-minute minimum.
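To make the pricing concrete, here is a minimal Python sketch of the Dataproc surcharge described above. The function and constant names are hypothetical, and rounding partial minutes up to the next whole minute is an assumption, since the article only says billing is per-minute with a ten-minute minimum. The figure excludes the usual virtual machine and storage costs.

```python
import math

# Figures quoted in the article.
DATAPROC_RATE_PER_VCPU_HOUR = 0.01  # dollars: 1 cent per virtual CPU per hour
MINIMUM_BILLED_MINUTES = 10         # ten-minute minimum


def dataproc_fee(vcpus: int, runtime_minutes: float) -> float:
    """Estimate the Dataproc surcharge in dollars for one cluster run.

    Excludes the underlying VM and storage charges.
    """
    # Assumption: a partial minute is billed as a full minute.
    billed_minutes = max(MINIMUM_BILLED_MINUTES, math.ceil(runtime_minutes))
    return vcpus * DATAPROC_RATE_PER_VCPU_HOUR * billed_minutes / 60


if __name__ == "__main__":
    # A 24-vCPU cluster that runs for 30 minutes.
    print(f"${dataproc_fee(24, 30):.2f}")
```

Under these assumptions, a 24-vCPU cluster running for half an hour would add about 12 cents on top of compute and storage, and the same cluster running for only three minutes would still be billed for ten.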
Because Dataproc can spin up clusters this fast, users will be able to set up ad-hoc clusters when needed, and because the service is managed, Google will handle the administration for them.
This article has been extracted from http://techcrunch.com. To read the article in full, visit http://techcrunch.com/2015/09/23/google-launches-cloud-dataproc-a-managed-spark-and-hadoop-big-data-service/