Analytics - I2IT Data Science Platform

A comprehensive Hospital Management System, with modules for every hospital department and continuous product upgrades to keep up with emerging industry trends.


Ideas2IT HIPAA-Compliant Analytics Framework

Our analytics framework provides a platform for quickly building analytical solutions. It supplies the basic building blocks for deploying data transformations and data science models, so teams do not have to build the underlying plumbing themselves.


Data Acquisition

Data is acquired from sources such as HMS, EMR, public health data and lab records, and is fed into the pipeline. Our technology choice for the data pipeline is Luigi; other pipeline tools such as AWS Data Pipeline, Azkaban and Pinball can also be considered based on customer preference. Luigi is preferred because dependencies between tasks are declared directly in code, which eliminates the need for complex configuration files. The pipeline may need an intermediate data store to use while it processes the data; the choice of store depends on the type and volume of data. The data pipeline orchestrates the tasks involved in cleansing and transforming the data.
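To illustrate what code-level dependency mapping looks like, here is a minimal sketch in plain Python. It is not Luigi itself and not our production pipeline; it only mirrors the pattern Luigi tasks follow, where each task names its upstream tasks in a requires() method. The task names (Acquire, Cleanse, Transform), the build() helper and the sample records are all hypothetical.

```python
class Task:
    """Base class: a task names its upstream tasks in code via requires()."""

    def requires(self):
        return []

    def run(self, inputs):
        raise NotImplementedError


class Acquire(Task):
    def run(self, inputs):
        # Hypothetical raw records, as they might arrive from a source system.
        return [
            {"patient_id": " 17 ", "hb": "13.5"},
            {"patient_id": "", "hb": "12.0"},    # missing id
            {"patient_id": "42", "hb": "n/a"},   # unparseable value
        ]


class Cleanse(Task):
    def requires(self):
        return [Acquire()]

    def run(self, inputs):
        (records,) = inputs
        # Drop records without an id and strip stray whitespace.
        return [
            {**r, "patient_id": r["patient_id"].strip()}
            for r in records
            if r["patient_id"].strip()
        ]


class Transform(Task):
    def requires(self):
        return [Cleanse()]

    def run(self, inputs):
        (records,) = inputs
        typed = []
        for r in records:
            try:
                hb = float(r["hb"])
            except ValueError:
                continue  # skip values that cannot be converted
            typed.append({"patient_id": int(r["patient_id"]), "hb": hb})
        return typed


def build(task, log=None):
    """Run every upstream task first (depth-first), then the task itself."""
    inputs = [build(dep, log) for dep in task.requires()]
    if log is not None:
        log.append(type(task).__name__)
    return task.run(inputs)


if __name__ == "__main__":
    order = []
    print(build(Transform(), order))
    print(order)
```

The execution order falls out of the requires() declarations alone; no separate configuration file describes the graph. This toy runner does not cache results, so a task shared by two downstream tasks would run twice.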

Master Data Store

Transformed data is streamed to an immutable master store: existing data is never modified, and new data can only be appended. Several storage options are available depending on the type, structure and volume of data; in this instance, Amazon S3 is our data store of choice.
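The append-only rule can be sketched as follows. This is an in-memory stand-in written for illustration only; it does not use or imitate the Amazon S3 API, and the class name AppendOnlyStore, the exception name and the key layout are assumptions.

```python
class ImmutableWriteError(Exception):
    """Raised when a caller tries to overwrite an existing object."""


class AppendOnlyStore:
    """In-memory stand-in for an immutable master store."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Existing objects are never replaced; only new keys are accepted.
        if key in self._objects:
            raise ImmutableWriteError(f"{key!r} already exists")
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def keys(self):
        return sorted(self._objects)


if __name__ == "__main__":
    store = AppendOnlyStore()
    store.put("labs/2020-01-01/part-0", b"first batch")
    store.put("labs/2020-01-02/part-0", b"second batch")
    try:
        store.put("labs/2020-01-01/part-0", b"rewrite attempt")
    except ImmutableWriteError as err:
        print("rejected:", err)
    print(store.keys())
```

Under this rule a correction is written as a new object under a new key rather than as an edit to an old one, so the full history stays available for recomputation.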

Lambda Architecture

The processing portion of the analytics framework is built on the Lambda Architecture, in which data follows two paths, a speed path and a batch path, whose outputs are merged in a serving layer.

  • Speed Layer – Streaming data is fed into the speed layer, where previously built models are applied and records are scored as they arrive. This provides the most current view of the data, without past history. Our choice of technology for this layer is Spark Streaming, on Amazon EMR.
  • Batch Layer – This runs on the immutable master data store described above, recalculating and adjusting model outputs based on the entire data set. Because these jobs are time-intensive, they typically run periodically rather than continuously. Our technology choice for the batch layer is Spark SQL, on Amazon EMR.
  • Serving Layer – Output from the Batch and Speed Layers is combined in this layer to create the most comprehensive view. Our choice of technology for the Serving Layer is Apache Drill.
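The way the three layers fit together can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Spark or Apache Drill code: the event shape, the per-ward admission count and the names batch_view, SpeedView and serve are all hypothetical.

```python
from collections import Counter


def batch_view(master_events):
    """Batch layer: recompute the view from the entire master data set."""
    return Counter(event["ward"] for event in master_events)


class SpeedView:
    """Speed layer: update incrementally as each new event arrives."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        self.counts[event["ward"]] += 1


def serve(batch, speed):
    """Serving layer: merge the batch and speed views into one answer."""
    return dict(batch + speed)


if __name__ == "__main__":
    # Events already in the master store at the last batch run.
    master = [{"ward": "icu"}, {"ward": "icu"}, {"ward": "er"}]
    # Events that arrived after that batch run.
    recent = [{"ward": "er"}, {"ward": "lab"}]

    speed = SpeedView()
    for event in recent:
        speed.update(event)

    print(serve(batch_view(master), speed.counts))
```

The batch view is accurate but only as fresh as the last run; the speed view covers only what arrived since. Merging the two is what gives the serving layer a view that is both complete and current.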

Features of Our Analytics Framework

  • A platform to quickly prototype analytical solutions.
  • Built in a modular fashion using industry-standard architecture.
  • Future-ready: as technology evolves, any component can be replaced by a newer, better one without reworking the rest of the framework.