FAQ

The most common problems encountered in the Data Dojo study group.

Run pyspark with ipython:

IPYTHON=1 ./bin/pyspark

On Spark 2.0 and later the IPYTHON variable is no longer supported; set PYSPARK_DRIVER_PYTHON instead:

PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark

Windows paths:

In Python string literals, directories are separated by \\ (an escaped backslash), not a single \.
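
As a minimal sketch of why the doubled backslash matters, the snippet below compares three spellings of the same Windows path in plain Python. The path C:\temp\new.txt is hypothetical and chosen only because \t and \n happen to be escape sequences; the raw-string form and ntpath.join are shown as alternatives, not as something the study group prescribed.

```python
import ntpath  # Windows path rules; importable on any operating system

# Hypothetical path, used only for illustration.
escaped = "C:\\temp\\new.txt"   # every backslash escaped
raw = r"C:\temp\new.txt"        # raw string: backslashes are taken literally
broken = "C:\temp\new.txt"      # \t and \n silently become a tab and a newline

# The escaped and raw spellings describe the same path.
print(escaped == raw)

# The single-backslash spelling no longer contains the intended directories.
print("\t" in broken, "\n" in broken)

# Building the path from its parts avoids typing separators by hand.
joined = ntpath.join("C:\\temp", "new.txt")
print(joined == escaped)
```

Either the escaped or the raw spelling can then be passed wherever a file path is expected.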

Switch off verbose logs

Run cluster with more memory

To fix a java.lang.OutOfMemoryError, run Spark with the --driver-memory argument:

./bin/pyspark --driver-memory 1G