System Integration

If you have been in the IT industry for some time, you have probably come across dashboards in software applications that display metrics such as system health, stock market rates, or traffic information. Traditionally, the refresh intervals for such dashboards have been measured in hours.

So let me start with what Apache Kafka and Apache Spark are.
  • Apache Kafka: It's a high-throughput distributed messaging system. Its strengths are as follows:
      • High Throughput & Low Latency: Even with very modest hardware, Kafka can support hundreds of thousands of messages per second, with latencies as low as a few milliseconds.
      • Scalability: A Kafka cluster can be elastically and transparently expanded without downtime.
      • Durability & Reliability: Messages are persisted on disk and replicated within the cluster to prevent data loss.
      • Fault Tolerance: The cluster keeps operating when individual machines in it fail.
      • High Concurrency: Kafka can handle a large number (thousands) of diverse clients writing to and reading from it simultaneously.
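The durability and throughput points above usually show up in practice as producer settings. Here is a minimal sketch of what such settings might look like. The helper name durable_producer_config and the broker address localhost:9092 are assumptions made for illustration, and the specific values are examples rather than recommendations. The keys themselves are standard Kafka producer configuration names.

```python
def durable_producer_config(bootstrap_servers: str) -> dict:
    """Build producer settings that favor durability while still batching.

    Hypothetical helper for illustration; the values are example choices.
    """
    return {
        # Where the client first connects to discover the cluster.
        "bootstrap.servers": bootstrap_servers,
        # Wait until all in-sync replicas have acknowledged each write.
        "acks": "all",
        # Avoid duplicate messages when the producer retries a send.
        "enable.idempotence": True,
        # Wait a few milliseconds so that messages can be sent in batches.
        "linger.ms": 5,
        # Compress batches to reduce network and disk usage.
        "compression.type": "lz4",
    }


# Assumed local broker address, for illustration only.
config = durable_producer_config("localhost:9092")
for key, value in sorted(config.items()):
    print(f"{key}={value}")
```

The block only builds and prints a dictionary, so it runs without a broker. With a Kafka client such as confluent-kafka installed and a broker reachable, a dictionary like this could be passed to the client's producer constructor. How many copies of each message the cluster keeps is a topic-level setting (the replication factor) rather than a producer setting.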