CarWale BigQuery

About

CW BigQuery is a service that enables interactive analysis of very large datasets stored in Cassandra. It allows users to submit and schedule Spark jobs using only SQL-like syntax.

Architecture

The architecture of CW BigQuery consists of three components:

  • Query Proxy
  • Mist Server
  • Spark

Query Proxy

Query Proxy is the most important component of the system. It acts as a mediator (as the name Proxy suggests) between the web application and the Mist Server.

It receives a request from the user, validates it and, if the queue is empty, pushes the job request onto the queue and returns the job ID to the user.
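The submission step described above can be sketched roughly as follows. This is a minimal illustration, not the actual Query Proxy code: the names `job_queue`, `validate` and `submit_job`, the in-memory queue, the SELECT-only check, and the choice to reject a request when the queue is not empty are all assumptions made for the example.

```python
import queue
import uuid

# Hypothetical in-memory queue standing in for the real job queue.
job_queue = queue.Queue()


def validate(sql):
    """Illustrative validation only: reject empty or non-SELECT queries."""
    if not sql or not sql.strip():
        raise ValueError("query must not be empty")
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("this sketch accepts only SELECT queries")


def submit_job(sql):
    """Validate the request, enqueue it if the queue is empty, return a job ID."""
    validate(sql)
    if not job_queue.empty():
        # Assumed reading of "if queue is empty": refuse while a job is waiting.
        raise RuntimeError("queue is busy, try again later")
    job_id = uuid.uuid4().hex  # random ID used later to name the response file
    job_queue.put({"job_id": job_id, "sql": sql})
    return job_id
```

A caller would keep the returned job ID and use it later to ask for the result.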

After receiving the response for a job, Query Proxy saves it to a file named after the job ID.

When the user requests the result, Query Proxy generates it using pagination and returns it to the user.

Query Proxy is also responsible for cleaning up response files after a certain amount of time.
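The result handling described above (one response file per job ID, paginated reads, and periodic cleanup) might look something like the sketch below. The JSON file format, the function names, the 1-based page numbering, the default page size, and the age-based cleanup rule are all assumptions for illustration; the real Query Proxy may do this differently.

```python
import json
import time
from pathlib import Path


def save_response(result_dir, job_id, rows):
    """Write a job's rows to <result_dir>/<job_id>.json and return the path."""
    result_dir.mkdir(parents=True, exist_ok=True)
    path = result_dir / f"{job_id}.json"
    path.write_text(json.dumps(rows))
    return path


def get_page(result_dir, job_id, page, page_size=100):
    """Return one page of a saved result; pages are numbered from 1."""
    rows = json.loads((result_dir / f"{job_id}.json").read_text())
    start = (page - 1) * page_size
    return rows[start:start + page_size]


def clean_old_responses(result_dir, max_age_seconds, now=None):
    """Delete response files older than max_age_seconds; return how many."""
    now = time.time() if now is None else now
    removed = 0
    for path in result_dir.glob("*.json"):
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed += 1
    return removed
```

Reading the whole file on every page request keeps the sketch short; for large results a real implementation would more likely stream or index the file.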


Mist Server

MistServer is a full-featured, next-generation streaming media toolkit for OTT (internet streaming). It takes care of all the annoying little problems you come across in media streaming projects, allowing you to focus on what makes your product or service unique. The MistServer software and our accompanying services allow anyone to quickly gain and keep a competitive edge.

MistServer is written entirely in C++ and comes with an easy to use and integrate JSON API.

To learn more about what a media server is, check out the MistServer documentation.


Spark

Apache Spark™ is a fast and general engine for large-scale data processing.

Spark runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python and R shells.

For more, read the Spark quick start guide.
