This is a cache of https://developer.ibm.com/components/jupyter/. It is a snapshot of the page as it appeared on 2026-02-23T17:18:26.879+0000.
Jupyter - IBM Developer

Jupyter

An open source project that supports interactive data science and scientific computing across all programming languages

JupyterLab is an open source web-based IDE for notebooks, code, and data. Jupyter Notebook is the original open source web application for notebooks, code, and data.

20 January 2020

Tutorial

Getting started with PySpark

This tutorial covers big data processing with PySpark (a Python package for Spark programming). We explain SparkContext by using the map and filter methods with Python lambda functions. We also create RDDs from objects and external files, apply transformations and actions to RDDs and pair RDDs, and use SparkSession to build PySpark DataFrames from RDDs and external files. In addition, we run SQL queries against DataFrames by using the Spark SQL module. Finally, we cover machine learning with the PySpark MLlib library.
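As a rough illustration of the map and filter style that the tutorial description mentions, the sketch below runs the same small pipeline twice: once with Python's built-in filter and map, and once on an RDD. It is not taken from the tutorial itself. The function names, the sample data, and the "lambda-demo" application name are hypothetical, and the Spark variant assumes that pyspark and a Java runtime are installed locally.

```python
numbers = [1, 2, 3, 4, 5, 6]  # hypothetical sample data


def squares_of_evens(values):
    """Pure-Python version: keep the even values, then square them."""
    evens = filter(lambda x: x % 2 == 0, values)
    return list(map(lambda x: x * x, evens))


def squares_of_evens_spark(values):
    """Same pipeline on an RDD (assumes pyspark and Java are available)."""
    # Imported lazily so the pure-Python part runs without Spark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster("local[*]").setAppName("lambda-demo")
    sc = SparkContext.getOrCreate(conf)
    rdd = sc.parallelize(values)  # build an RDD from a Python object
    # filter and map are lazy transformations; collect is the action
    # that triggers the computation and returns a Python list.
    return rdd.filter(lambda x: x % 2 == 0).map(lambda x: x * x).collect()


local_result = squares_of_evens(numbers)
print(local_result)
```

Calling squares_of_evens_spark(numbers) on a machine with Spark set up should produce the same values as the pure-Python version, because both apply the same two lambda functions in the same order.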
