Scikit-Learn Boss in 90 Days

Day 2: Setting Up Environment

Python Environment Setup

📑 Table of Contents

  1. 🌟 Welcome to Day 2
  2. 🖥️ Installing Python and Dependencies
    • Python Installation
    • Package Managers (pip, conda)
    • Virtual Environments
  3. 🔧 Setting Up Your Machine Learning Workspace
    • Anaconda Distribution
    • Miniconda
    • Using Requirements Files
  4. 💻 IDEs and Editors
    • VS Code
    • PyCharm
    • Jupyter Notebook
  5. Organizing Your Projects
    • Directory Structure
    • Version Control (Git)
  6. 🧩 Hands-On Exercises
  7. 📚 Resources
  8. 💡 Tips and Tricks

1. 🌟 Welcome to Day 2

Welcome to Day 2 of your journey to becoming a Scikit-Learn Boss in 90 Days! πŸŽ‰ Today’s focus is on setting up a clean and efficient environment for your machine learning projects. A well-structured environment ensures smoother development, reproducibility, and maintains consistency across projects. Let’s get everything ready to tackle advanced concepts in the days ahead! πŸš€


2. 🖥️ Installing Python and Dependencies

Python Installation

  • Official Python Website: Download and install the latest stable version of Python from python.org.
  • Check Installation (on Windows, the command may be python or py instead of python3):
    python3 --version
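
The same check can also be done from inside Python, which confirms which interpreter is actually running. The sketch below is illustrative only: the 3.9 floor in MIN_VERSION is an assumption borrowed from the Python version used in Exercise 2, not a requirement stated by this course.

```python
import sys

# Assumed minimum version (hypothetical; adjust to whatever your project needs).
MIN_VERSION = (3, 9)


def python_is_recent_enough(minimum=MIN_VERSION, current=None):
    """Return True if the (major, minor) version is at least `minimum`."""
    # Default to the interpreter that is running this script.
    current = sys.version_info[:2] if current is None else tuple(current)[:2]
    return current >= tuple(minimum)


if __name__ == "__main__":
    # sys.executable shows which Python binary is in use.
    print("Running Python", sys.version.split()[0], "from", sys.executable)
    print("Meets assumed minimum:", python_is_recent_enough())
```

If the path printed for sys.executable is not the one you expect, a different Python installation is probably first on your PATH.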
    

Package Managers (pip, conda)

  • pip: The default Python package manager.
    pip install numpy pandas scikit-learn
    
  • conda: An environment and dependency manager that comes with Anaconda or Miniconda.
    conda install numpy pandas scikit-learn
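
After either install command, it can help to confirm what actually landed in the current environment. Here is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_versions is made up for this example, and packages installed by conda are usually, though not always, visible this way.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            result[name] = None
    return result


if __name__ == "__main__":
    report = installed_versions(["numpy", "pandas", "scikit-learn"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
```

Note that the lookup uses the distribution name (scikit-learn), not the import name (sklearn).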
    

Virtual Environments

  • Creation:
    python3 -m venv my_env
    
  • Activation:
    • Linux/macOS:
      source my_env/bin/activate
      
    • Windows (Command Prompt):
      my_env\Scripts\activate
    • Windows (PowerShell):
      my_env\Scripts\Activate.ps1
      
  • Deactivation:
    deactivate
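
A quick way to tell whether activation worked is to ask the interpreter itself. The sketch below relies on the fact that inside a venv, sys.prefix points at the environment while sys.base_prefix points at the base installation. The function name in_virtualenv is hypothetical, and this check does not detect conda environments.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # Outside a venv the two prefixes are identical.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Environment prefix:", sys.prefix)
```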
    

3. 🔧 Setting Up Your Machine Learning Workspace

Anaconda Distribution

  • All-In-One Package: Comes pre-installed with Python, Jupyter, and popular data science libraries.
  • Installation: Download from anaconda.com and follow the setup instructions.

Miniconda

  • Lightweight Alternative: Provides a minimal environment and conda for managing packages and environments.
  • Installation: Download from docs.conda.io.

Using Requirements Files

  • requirements.txt:
    pip install -r requirements.txt
    
  • environment.yml (for conda):
    conda env create -f environment.yml
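
The commands above consume an existing file. To produce one, the usual route is pip freeze > requirements.txt, which pins everything in the environment. As an alternative sketch, the snippet below pins only the packages you name; pinned_requirements and write_requirements are illustrative helper names, not part of pip.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def pinned_requirements(packages):
    """Return 'name==version' lines for the named packages that are installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Skip anything not installed rather than guessing a version.
            continue
    return lines


def write_requirements(packages, path="requirements.txt"):
    """Write the pinned lines to `path` and return the Path that was written."""
    target = Path(path)
    target.write_text("\n".join(pinned_requirements(packages)) + "\n", encoding="utf-8")
    return target


# Example call (writes a file in the current directory):
# write_requirements(["numpy", "pandas", "scikit-learn"])
```

Pinning exact versions favors reproducibility over flexibility; loosen the pins if you want newer releases to be picked up.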
    

4. 💻 IDEs and Editors

VS Code

  • Extensions: Python, Jupyter, Pylance.
  • Integrated Terminal: Manage environments and run Python files directly.

PyCharm

  • Professional Environment: Robust tools for testing, debugging, and refactoring.
  • Integrated Tools: Virtual environment setup and version control within the IDE.

Jupyter Notebook

  • Interactive Environment: Perfect for exploration, experimentation, and quick prototyping.
  • Run in Browser:
    jupyter notebook
    

5. Organizing Your Projects

Directory Structure

  • Example:
    project_name/
      data/
      notebooks/
      src/
      tests/
      README.md
      requirements.txt
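
The layout above can be created by hand, or scripted so every new project starts the same way. Here is a minimal sketch with pathlib; scaffold_project is a made-up helper name, and the folder and file names simply mirror the example tree.

```python
from pathlib import Path


def scaffold_project(root):
    """Create the example layout under `root` and return the root Path."""
    root = Path(root)
    # Folders from the example tree; exist_ok makes re-running safe.
    for folder in ("data", "notebooks", "src", "tests"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    # Empty placeholder files, left untouched if they already exist.
    for filename in ("README.md", "requirements.txt"):
        (root / filename).touch(exist_ok=True)
    return root


# Example call (creates ./project_name in the current directory):
# scaffold_project("project_name")
```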
    

Version Control (Git)

  • Initialize Repo:
    git init
    
  • Commit Changes:
    git add .
    git commit -m "Initial commit"
    

6. 🧩 Hands-On Exercises

πŸ“ Exercise 1: Create and Activate a Virtual Environment

  • Task: Create a virtual environment named ml_env and activate it. Install numpy, pandas, and scikit-learn inside it.
    python3 -m venv ml_env
    source ml_env/bin/activate   # on Windows: ml_env\Scripts\activate
    pip install numpy pandas scikit-learn
    

Exercise 2: Set Up a Conda Environment

  • Task: Using Anaconda or Miniconda, create a conda environment named ml_conda_env with Python 3.9 and install matplotlib and seaborn.
    conda create --name ml_conda_env python=3.9
    conda activate ml_conda_env
    conda install matplotlib seaborn
    

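Exercise 3: Start a Jupyter Notebook

  • Task: From within your activated environment, start a Jupyter Notebook and verify you can import sklearn. Jupyter and scikit-learn must both be installed in that environment.
    pip install notebook scikit-learn   # skip if both are already installed
    jupyter notebook
    # In a new notebook cell:
    import sklearn
    print(sklearn.__version__)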
    

7. 📚 Resources

8. 💡 Tips and Tricks

💡 Pro Tip

Isolate Your Projects: Keep a separate environment for each project to avoid dependency conflicts. Beyond venv and conda, these tools can help:

  • Poetry: For dependency management and packaging.
  • Docker: For containerized, reproducible environments.

🚀 Speed Up Your Setup

  • Use Environment Files: Automate environment creation with requirements.txt or environment.yml.
  • Quick Switching: Use conda activate <env_name> and conda deactivate to move between environments.

Debugging Setup Issues

  • Check PATH: Ensure Python and package managers are on your PATH.
  • Reinstall: If issues persist, reinstall Python or Anaconda.
  • Community Forums: Leverage Stack Overflow and official docs for troubleshooting.
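
For the PATH check in particular, the standard library's shutil.which reports where a command resolves, much as the shell would. A small sketch follows; the helper name find_on_path and the default command list are choices made for this example.

```python
import shutil


def find_on_path(commands=("python3", "python", "pip", "conda")):
    """Map each command name to its resolved location, or None if not on PATH."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    for name, location in find_on_path().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

A command that resolves to an unexpected location often explains why packages appear to be missing: they were installed for a different interpreter.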