mdmamunhasan/streamsql: Apache Spark Consuming Kafka, Processing PostgreSQL To Redshift

Requirements

Maven
Apache Spark
Scala
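
Before building, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch, assuming the tools are invoked as mvn, spark-submit, and scala; those command names are assumptions and your installation may expose them differently. check_tools is a hypothetical helper and is not part of this repo.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are on PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed command names for Maven, Apache Spark, and Scala.
check_tools mvn spark-submit scala || echo "Install the missing tools before building."
```

The function only checks that each command resolves; it does not check versions, so a successful run does not guarantee the build will work.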

Clone the repo

Use the following commands:

sudo yum install git
git clone https://github.com/mdmamunhasan/streamsql.git
cd streamsql

Install the code

Use the following command: mvn clean install
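
Maven conventionally writes build artifacts to a target/ directory under the project root. The sketch below lists any JAR files found there after the build. list_jars is a hypothetical helper, and no specific artifact name is assumed because that depends on pom.xml.

```shell
#!/bin/sh
# Hypothetical helper: list JAR files directly under a build directory.
# Defaults to Maven's conventional target/ directory.
list_jars() {
    dir="${1:-target}"
    if [ ! -d "$dir" ]; then
        echo "no build directory: $dir" >&2
        return 1
    fi
    for jar in "$dir"/*.jar; do
        # The glob stays literal when nothing matches, so test each entry.
        if [ -f "$jar" ]; then
            echo "$jar"
        fi
    done
}

# Run from the repository root after the Maven build.
list_jars target || echo "Run mvn clean install first."
```

An empty listing with no error means target/ exists but contains no JAR, which usually points to a failed or incomplete build.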

Reference

This post demonstrates how to set up Apache Kafka on Amazon EC2, use Spark Streaming on Amazon EMR to process data arriving on Apache Kafka topics, and query the streaming data using Spark SQL on Amazon EMR.

This repo provides:

An AWS CloudFormation stack to set up Apache Kafka on Amazon EC2
Scripts/code to create the Apache Kafka topic and producer
Spark Streaming and Spark SQL code to run on Amazon EMR

For more information about how to set everything up, see the post.

https://github.com/awslabs/aws-big-data-blog.git

