Tools in Data Science

    • May 2025: Tools in Data Science

    Spreadsheet: Excel, Google Sheets

    You’ll use spreadsheets for data cleaning and exploration. The most popular spreadsheet program is Microsoft Excel, followed by Google Sheets.

    You may already be familiar with these. If not, make sure to learn the basics of both.

    Go through the Microsoft Excel video training and make sure you cover:

    • Intro to Excel
    • Rows & columns
    • Cells
    • Formatting
    • Formulas & Functions
    • Tables
    • PivotTables

    Watch this video for an introduction to Google Sheets:

    Google Sheets Tutorial for Beginners (49 min)
