
byh0215/pyspark-tutorial

 
 


PySpark Tutorial

  • PySpark is the Python API for Spark.
  • The purpose of this PySpark tutorial is to provide basic distributed algorithms using PySpark.
  • PySpark has an interactive shell ($SPARK_HOME/bin/pyspark) for basic testing and debugging; it is not intended for production environments.
  • Use the $SPARK_HOME/bin/spark-submit command to run PySpark programs; it is suitable for both testing and production environments.
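
As a rough illustration of the kind of program the points above refer to, here is a minimal word-count sketch. It is not taken from this repository: the file name word_count.py, the tokenize helper, and the application name are hypothetical, chosen only for this example. The Spark calls it relies on (SparkSession.builder, textFile, flatMap, map, reduceByKey, take) are standard PySpark API.

```python
import re
import sys


def tokenize(line):
    """Split a line into lowercase words (pure Python, no Spark needed)."""
    return re.findall(r"[a-z0-9']+", line.lower())


def main(input_path):
    # Imported here (an assumption of this sketch) so that tokenize()
    # can be tested without a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()

    # Read the input as an RDD of lines.
    lines = spark.sparkContext.textFile(input_path)

    # Classic distributed word count: emit (word, 1) pairs, then sum per word.
    counts = (
        lines.flatMap(tokenize)
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )

    # Print a small sample of (word, count) pairs; order is not guaranteed.
    for word, count in counts.take(10):
        print(word, count)

    spark.stop()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: spark-submit word_count.py <input_path>")
```

Assuming the script is saved as word_count.py, it could be run with $SPARK_HOME/bin/spark-submit word_count.py followed by an input path. The same transformations can also be tried line by line in the interactive shell, where the spark session and the sc context are already created for you.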

PySpark Examples and Tutorials

PySpark Tutorial and References...

Questions/Comments

Thank you!

Best regards,
Mahmoud Parsian

Data Algorithms with Spark

PySpark Algorithms Book

Data Algorithms Book

About

PySpark-Tutorial provides basic algorithms using PySpark


Languages

  • Python 74.6%
  • Shell 25.4%