Are you having difficulty keeping up with the frequent changes and updates in the streaming data space? Then the 'Streaming Data Monthly Digest' (updated daily!) has the solution you’re looking for. Below is a list of web resources related to streaming data in general for January 2017.
I update this list daily, without focusing on any particular tool, whether open source or commercial. Many of the web resources listed below are upcoming events such as meetups and conference talks; related slides and videos will be added as they become available.
No single streaming data processor can claim to be a silver bullet! Every streaming data processor has its own strengths and weaknesses, and each has a sweet spot for particular use cases.
January 2nd, 2017
- [Blog] Apache Flink: A New Wave to Real-time Stream Processing
- [Article] Big Data: Spark 2.1 bringt Neuerungen für Streaming und maschinelles Lernen (in German: "Big Data: Spark 2.1 brings new features for streaming and machine learning")
- [Blog] Using Kafka With JUnit
- [Blog] Behind Data Streaming, Hugo Picado
January 3rd, 2017
- [Blog] Monitoring Wikipedia Edit Streams using Apache Flink and Packaging the Application with Dependencies
- [Blog] Managing IoT devices with Kafka and MQTT, Marcello Vitaletti
- [Blog] The Battle of the Crawlers : Apache Nutch vs StormCrawler
- [Blog] Closure on Apache Nifi
- [Article] 8 data trends on our radar for 2017
- [Blog] Overview: Apache Spark on HDInsight Linux
- [Blog] Kafka 0.10 Compression Benchmark
January 4th, 2017
- [Blog] Asynchronous Processing and Multithreading in Apache Samza, Part I: Design and Architecture
- [Blog] What 2017 Will Bring: 10 More Big Data Predictions. Alex Woodie
- [Blog] Databricks and Apache Spark 2016 Year in Review
- [Presentation] IoT Project Flogo - How to Build an Apache Kafka Connector / Adapter. Kai Wähner Video Slides
- [Blog] Join: Storm in a Teacup, Continued..
- [Blog] Applying Machine Learning to Real Time Streaming Analytics
- [Whiteboard Walkthrough] A Better Way to Build a Fraud Detector: Streaming Data and Microservices Architecture
- [Blog] Kafka Summit 2017 Talk Proposal
- [Conference talk] The Role of Data Virtualization in IoT Integration. Slides Video
January 5th, 2017
- [Blog] Proof of concept using KafkaStreams and KTables
- [Blog] Monitoring Real-Time Uber Data Using Spark Machine Learning, Streaming, and the Kafka API (Part 2)
- [Blog] Log Compaction: Highlights in the Apache Kafka and Stream Processing Community - January 2017. Gwen Shapira
- [Blog] Microservices messaging on Oracle Cloud using Apache Kafka
- [Video] Where Does Apache Geode Fit in CQRS Architectures?
- [Blog] Apache Nifi Installation on Ubuntu
January 6th, 2017
- [Blog] Asynchronous Processing and Multithreading in Apache Samza, Part II: Experiments and Evaluation. Xinyu Liu.
- [Blog] Kafka Avro Scala Example
- [Tutorial] How to use Flume in IOP with Message Hub?
- [Blog] How to Build a Custom Flogo Adapter
January 7th, 2017
- [Meetup] A Deep-dive into Structured Streaming / Predictive Analytics with SparkR. Istanbul Spark Meetup
- [Article] Big Data Processing with Apache Spark - Part 3: Spark Streaming
- [Article] The Impact of Data Grids in IoT
January 8th, 2017
- [Video + Slides] Spring and Big Data. Thomas Risberg
January 9th, 2017
- [Blog] Better Complex Event Processing at Scale Using a Microservices-based Streaming Architecture (Part 1). Mathieu Dumoulin
- [Blog] Release 0.4.0 adds a runner for Apache Apex. Thomas Weise
January 10th, 2017
- [Blog] Real-time Smart City Traffic Monitoring Using Microservices-based Streaming Architecture (Part 2). Mathieu Dumoulin
- [Blog] Google Lauds Outside Influence on Apache Beam, Alex Woodie
- [News] The Apache Software Foundation Announces Apache® Beam™ as a Top-Level Project
- [Blog] Apache Beam graduates to a top-level project
- [Blog] Apache Beam established as a new top-level project, Davor Bonaci
- [News] Google must be Beaming as Apache announces its new top-level projects
- [Blog] Apache Beam graduates to a top-level project, Tyler Akidau. Google Open Source Blog.
- [Blog] Apache Beam graduates from incubation: Try it today on Google Cloud Dataflow, Frances Perry, Google.
- [Blog] What's new in StormCrawler 1.3, Julien Nioche.
January 11th, 2017
- [Meetup] Introduction to Kafka Streams with a Real-Life Example, Apache Kafka DC Slides
- [Webinar] Top 5 IoT Use Cases, Vijay Raja & Dave Shuman
- [Blog] Apache Software Foundation announces two more top-level open source projects. Mike Wheatley
- [Article] Apache Beam unifies batch and streaming for big data, Serdar Yegulalp
- [Slides] Stream Processing as a Foundational Paradigm and Apache Flink's Approach to It. Stephan Ewen.
- [Blog] Kafka vs. MapR Streams: Why MapR? Ian Downard
- [News] Apache Spark 2.1 Improves Structured Streaming, David Ramel.
January 12th, 2017
- [Webinar] January 12: Business insight in minutes with Oracle Stream Analytics
- [Meetup] Processing IoT data with Apache Kafka. Matt Howlett, Confluent. Bay Area Full Stack, Mountain View, CA. Slides
- [Video + Slides] Spring for Apache Kafka, Gary Russell.
- [Blog] Getting Started with Spark Streaming, Python, and Kafka, Robin Moffatt.
- [Video + Slides] Architecting for Cloud Native Data: Data Microservices Done Right Using Spring Cloud. Fred Melo
- [Article] Apache Beam and Spark: New coopetition for squashing the Lambda Architecture? Tony Baer, Ovum.
- [Blog] Streaming Analytics for Chain Monitoring, Natalino Busa.
- [Presentation] Staging Reactive Data Pipelines using Kafka as the Backbone. Manchester Geek Nights Video
January 13th, 2017
- [Blog] Apache Kafka: Getting started. G.
- [Blog] Importing JSON into Hadoop via Kafka. By Nuria Ruiz, Andrew Otto, Wikimedia Foundation.
- [Blog] Developing Transactional Microservices Using Aggregates, Event Sourcing and CQRS - Part 2. Part 1
- [Blog] Data Processing and Enrichment in Spark Streaming with Python and Kafka, Robin Moffatt
- [Blog] The Future of Apache Beam, Now a Top-Level Apache Software Foundation Project, Jean-Baptiste Onofre, Talend.
- [Blog] SQL on Apache Apex. Chinmay Kolhatkar
- [Blog] Creating An Email Bot in Apache NiFi, Timothy Spann
January 14th, 2017
- [Slides + Video] Reactive Kafka, Rajini Sivaram. Pivotal
January 15th, 2017
- [Blog] Scaling Kafka with Docker Containers, Jorge Quilcate
January 16th, 2017
- [Presentation] Streaming Real-Time from On-Premise Databases to Big Data in the Amazon Web Services Cloud Slides Video
- [Blog] Updating Materialized Views and Caches Using Kafka. Zach Cox
- [Blog] Ingest Remote Camera Images from Raspberry Pi via MQTT and FTP in Apache NiFi, Timothy Spann.
January 17th, 2017
- [Meetup] Fast Data: Selecting The Right Streaming Technologies For Never Ending Data Sets, Chicago Real-Time Streaming Analytics
- [Meetup] Spark Discussion with Dr. Alex Liu, IBM's Chief Data Scientist. Chicago Spark Users
- [Meetup] Sensor Data Ingestion and Processing with NiFi and Spark. Future of Data, Amsterdam.
- [Blog] Performance Tuning of an Apache Kafka/Spark Streaming System. Mathieu Dumoulin
- [Webinar] Solving the Really Big Tech Problems with IoT
- [Presentation] Building Reactive Fast Data & the Data Lake with Akka, Kafka, Spark, Todd Fritz
- [Blog] Data in Motion Evolution: Where We’ve Been…Where We Need to Go, Girish Pancha
- [Blog] We All Need Direction: Hello, Apache NiFi! Eric Kavanagh
- [Blog] Ruby Scripting in NiFi, Sebastian Carroll
- [Webinar] Big Data Hadoop Streaming ETL: template for Kafka-Filter-HDFS, Mohit Jotwani and Deepak Narkhede, DataTorrent. Slides Video
January 18th, 2017
- [Meetup] Apache Kafka Meetup with Walmart Labs and Confluent, Apache Kafka Bay Area
- [Meetup] Instrumenting Apache Kafka, Apache Kafka London Slides
- [Meetup] Understanding Big Data Streaming and Apache Flink. Fremont Big Data and Cloud Meetup
- [Meetup] Running Kafka in production. streamprocessing.be Meetup Slides
- [Meetup] TensorFlow & TensorFrames w/ Apache Spark + Deep-dive into Structured Streaming, Apache Spark and more, Milano. Slides
- [Presentation] Reactive integrations with Akka Streams, Konrad Malawski
- [Blog] Ingest Data into Splunk with StreamSets Data Collector, Pat Patterson
- [Blog] Capitalizing on IoT using Oracle Stream Analytics - Oil&Gas In Action! Issam Hijazi
- [Presentation] A Deep Dive into Structured Streaming in Apache Spark, Burak Yavuz
- [Presentation] Reactive more than just streams, Konrad Malawski, Lightbend.
- [Presentation] Introduction to Structured Streaming, Manish Mishra, Knoldus Software LLC.
- [Blog] Announcing Spring Cloud Data Flow 1.1: Cloud-Native Architecture for Enterprise Data, Sabby Anandan
January 19th, 2017
- [Meetup] gRPC, Kubernetes, Mesos, Spark ML, Structured Streaming, Tensorflow, HDFS, Kafka. Advanced Spark and TensorFlow Meetup, San Francisco, CA
- [Meetup] Fast Data / Event-Driven Architecture with Kafka Streams
- [Webinar] Exploring Reactive Integrations with Akka Streams, Alpakka and Kafka. Slides Video
- [Meetup] Real time product recommendations, Montreal Apache Spark Meetup
- [Presentation] Intro to Big Data AppHub: Demo of HDFS to Kafka and Kafka to HDFS templates. Ashwin Putta, Sanjay Pujare, Devendra Tagare, DataTorrent Slides Video
- [Blog] Apache Flink® User Survey 2016 Results, Part 1, Michael Winters
- [Blog] Building a Kafka that doesn’t depend on ZooKeeper, Travis Jeffery
- [Blog] The Infrastructure Behind Twitter: Scale, Mazdak Hashemi
- [Blog] Real-time Streaming ETL with Structured Streaming in Apache Spark 2.1, by Tathagata Das, Michael Armbrust and Tyson Condie
- [Blog] Getting Started with Kafka REST Proxy for MapR Streams, Tugdual Grall
- [Blog] About Akka Streams, by Ivan Yurchenko
- [Presentation] Apache Kafka: a distributed streaming platform, Paolo Castagna, Confluent.
January 20th, 2017
- [Blog] Getting Started Guide to Apache Kafka, Karan Shah
- [Article] How These Banking, Energy, and Pharma Firms Use Spark, Alex Woodie.
- [Blog] Ingesting data into Couchbase using StreamSets Data Collector, Pat Patterson
- [Presentation] Getting started with Azure Event Hubs and Stream Analytics services, Vladimir Bychkov
January 21st, 2017
- [Slides + Video] Spring with Apache NiFi, Oleg Zhurakousky, Hortonworks.
January 23rd, 2017
- [Meetup] 19th Swiss Big Data User Group Meeting
- [Blog] Towards a realtime streaming architecture, Alice Kaerast
- [Interview] In a way, Apache Beam is the glue that connects many big data systems together, Kypriani Sinaris
- [Conference Talk] Drinking from a Firehose with Apache Spark Streaming and Flink, Topconf Tallinn 2016
January 24th, 2017
- [Meetup] Distributed Reactive Applications & Dynamic load balancing with Akka Streams. Krakow Scala User Group.
- [Blog] Apache Flink® User Survey 2016 Results, Part 2, Michael Winters
- [Presentation] SQL on Apache Apex, Chinmay Kolhatkar
- [Presentation] Storm over Gearpump, Tianlun Zhang, Intel
- [Blog] ODP: An Infrastructure for On-Demand Service Profiling
- [Article] Calculating Movies Ratings Distribution With Apache Flink, Ivan Mushketyk
January 25th, 2017
- [Meetup] Apache Kafka with Jay Kreps and Michael Noll. Talk 1: Introducing Kafka's Streams API, Slides. Talk 2: Apache Kafka lessons learned @PAYBACK, Slides. Talk 3: Apache Kafka at Trivago, Clemens Valiente, Slides Video
- [Meetup] Real time data ingestion & streaming: talks from Avvo, Expedia and Confluent, Seattle Apache Kafka Meetup. Talk 1: Streaming Data Ecosystems with Brandon O'Brien, Slides. Talk 2: State of the Streaming Platform 2017: An Overview of Apache Kafka and the Confluent Platform, Slides. Talk 3: Apache Flume – Real time data ingestion into HDFS, Slides.
- [Article] Streaming hot: Real-time big data architecture matters, George Anadiotis
- [Blog] How Storm SQL is executed
- [Blog] Drinking from the industrial IoT data fire hose, Andy Oram
January 26th, 2017
- [Webinar] Streaming Data Analytics with Apache Spark Streaming. IBM Analytics
- [Meetup] Kafka Connect & Repeatable deployment of Kafka Streams topologies on Kubernetes. Kafka Meetup Utrecht, Netherlands. Video recordings: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7
- [Meetup] Evolution of Streaming Systems@Twitter - Apache Storm to Twitter Heron, Karthik Ramasamy
- [Article] Chaperone - A Kafka Auditing Tool from the Uber Engineering Team
January 27th, 2017
- [Presentation] Kafka Connect, Andrew Stevenson, Data Mountaineer.
- [Blog] Stream processing and the IBM Open Platform
January 28th, 2017
- [Article] 5 Solid Use Cases of IOT Analytics that Makes it Truly Innovative!
- [Blog] From MQTT to Kafka with Connect and Stream Reactor, Marios Andreopoulos
January 29th, 2017
- [Article] Dipping Into Java 8 Streams, Dan Newton
- [Video + Slides] Streaming Live Data and the Hadoop Ecosystem
- [Video + Slides] I Can't Believe It's Not a Queue: Using Kafka with Spring, Joe Kutner, Heroku.
January 30th, 2017
- [Interview] Expert Interview (Part 1): Co-Founder Neha Narkhede on Origins of Confluent and Kafka
- [Video] Reactive Kafka, Rajini Sivaram.
- [Video] Real-Time Dashboards for Apache Apex (Next Gen Hadoop) apps, Sasha Parfenov, DataTorrent.
- [Slides] What no one tells you about writing a streaming app, Mark Grover and Ted Malaska
- [Slides] Lightbend Fast Data Platform, Dean Wampler.
- [News] Spring Boot 1.5.1 released
- [Video] Kafka Streams & Datio Fenrir
- [Blog] OSTMap - Open Source Tweet Map
- [Whiteboard Walkthrough] Anomaly Detection Using Metrics and Exception Logs, Ted Dunning
- [Blog] Open NLP Example Apache NiFi Processor, Timothy Spann
January 31st, 2017
- [Meetup] Microservices, The Data Dichotomy: Rethinking Data & Services with Streams, Ben Stopford, Confluent. London Dev Community
- [Article] Spring Boot 1.5 bringt nativen Support für Apache Kafka (in German: "Spring Boot 1.5 brings native support for Apache Kafka")
- [Interview] Expert Interview (Part 2): Confluent’s Neha Narkhede on Schema Registry Strategy and Purpose
- [Blog] NiFi and OAuth 2.0 to request WordPress API
- [Blog] Kafka - Rewind Consumer Offsets