Embark on Python Journey

Embark on Your Python Journey: Mastering Classes, Abstraction, Packages, and Poetry for Aspiring Developers!

Introduction

In this guide, aspiring developers will explore the foundational elements of Python that are crucial for building robust applications. From understanding classes and abstraction to organizing code with packages and modules, you’ll gain practical insights into structuring your projects effectively.

Package and dependency management

Python package management is the process of managing the packages and dependencies a Python project uses. This includes installing, upgrading, and removing packages, keeping track of their versions and requirements, and handling the Python version itself. Popular tools include Pyenv, Poetry, uv, and Conda.

A - Manage Python versions with Pyenv

Pyenv is a tool that allows you to manage multiple Python versions on your computer. It helps in installing and switching between different Python versions easily, allowing developers to work with various projects that require specific Python versions.

To install Pyenv, go to the GitHub repo and follow the steps for your environment to install the build dependencies.

Installation

curl https://pyenv.run | bash

This will install pyenv along with a few plugins that are useful:

  • pyenv: The actual pyenv application
  • pyenv-virtualenv: Plugin for pyenv and virtual environments
  • pyenv-update: Plugin for updating pyenv
  • pyenv-doctor: Plugin to verify that pyenv and build dependencies are installed
  • pyenv-which-ext: Plugin to automatically lookup system commands

List all available versions of Python

After installing Pyenv, you can start managing Python versions on your machine. To list all available versions of Python that can be installed with Pyenv, run:

pyenv install --list

To install a specific version of Python, use the install command followed by the desired version number. For example, to install Python 3.13.0, run:

pyenv install 3.13.0

Once installed, you can list all Python versions installed on your system with:

pyenv versions

If, for example, you want to use version 3.13.0 by default, you can use the global command:

pyenv global 3.13.0
python -V #to confirm python version

With Pyenv, managing Python versions and creating virtual environments becomes much more straightforward, allowing you to work efficiently on various projects with different Python requirements.
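The same check can also be done from inside Python, which is handy for scripts that should warn when they run on an interpreter that is too old. This is a minimal sketch: the function name meets_minimum and the (3, 10) threshold are illustrative assumptions, not part of Pyenv.

```python
import sys


def meets_minimum(minimum=(3, 10), current=None):
    """Return True if the interpreter version is at least `minimum`."""
    # sys.version_info behaves like a tuple: (major, minor, micro, ...)
    current = sys.version_info if current is None else current
    return tuple(current[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print("Running on Python", sys.version.split()[0])
    if not meets_minimum():
        print("Warning: this project expects Python 3.10 or newer.")
```

Passing `current` explicitly makes the helper easy to test without switching interpreters.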

B - Manage Python project dependencies and packaging with Poetry

Poetry is a Python tool designed for dependency management and packaging of Python projects. It aims to streamline the process of installing, managing, and sharing Python packages by providing an easy-to-use command-line interface.

Installation

You can install Poetry using pip or with the official installer, as shown below:

curl -sSL https://install.python-poetry.org | python3 -

Basic usage

To create a new project named poetry-demo:

poetry new poetry-demo
# Created package poetry_demo in poetry-demo

This creates the poetry-demo directory and the poetry_demo Python package. The directory tree looks like this:

poetry-demo
├── README.md
├── poetry_demo
│   └── __init__.py
├── pyproject.toml
└── tests
    └── __init__.py

The pyproject.toml file is the most important piece here. It orchestrates your project and its dependencies. For now, it looks like this:

[tool.poetry]
name = "poetry-demo"
version = "0.1.0"
description = ""
authors = ["James Kokou GAGLO <[email protected]>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.10"


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

The Python version is only a declaration: Poetry will not install the Python version mentioned here. A compatible Python version must already be available on your system.
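To make the caret constraint concrete: python = "^3.10" accepts any version from 3.10 up to, but not including, 4.0. The sketch below is a simplified illustration of that rule, not Poetry's own implementation. The helper names are hypothetical, and it only covers constraints whose major version is 1 or higher (caret rules for 0.x versions are different).

```python
def _parse(text):
    """Turn '3.10' into (3, 10, 0) so versions compare as tuples."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies_caret(version, base):
    """Check `version` against a caret constraint such as ^3.10.

    Simplified: for a base X.Y.Z with X >= 1, the allowed range is
    >= X.Y.Z and < (X+1).0.0.
    """
    v, b = _parse(version), _parse(base)
    if b[0] < 1:
        raise ValueError("this sketch only covers major versions >= 1")
    upper = (b[0] + 1, 0, 0)
    return b <= v < upper


print(satisfies_caret("3.13.0", "3.10"))  # True
print(satisfies_caret("3.9.18", "3.10"))  # False
print(satisfies_caret("4.0.0", "3.10"))   # False
```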

Dependencies management

To add a dependency, run:

poetry add playwright # add playwright dependency

Virtual env management

Poetry can manage your virtual environments. You need to have the Python executable in your PATH to be able to switch between environments.

poetry env use python3.12 # create virtual env with python3.12
poetry env list # list all virtual envs
poetry env info

Poe: a Poetry plugin for task management

Poe the Poet is a useful task runner that integrates with Poetry and helps run project tasks.

pip install poethepoet

We can define some tasks in our pyproject.toml:

[tool.poe.tasks]
py-version = "python --version"

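We can then run the task through Poetry. This form requires Poe the Poet to be installed as a Poetry plugin; with a standalone pip install, run poe py-version instead.

poetry poe py-version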

C - Python module and package

In Python, a module is a single file that contains Python code and definitions. It can be imported into other Python scripts or programs using the import statement. A package, on the other hand, is a directory that contains one or more Python modules along with a special file called __init__.py.

Every module has a built-in attribute called __name__, which is set to the name of the module. If a script is run directly, __name__ equals "__main__". Consider the poetry_demo package tree below:

poetry-demo
├── README.md
├── poetry.lock
├── poetry_demo
│   ├── __init__.py
│   ├── main.py
│   ├── module1.py
│   ├── subpackage1
│   │   ├── __init__.py
│   │   └── module2.py
│   └── subpackage2
│       ├── __init__.py
│       └── module3.py
├── pyproject.toml
└── tests
    └── __init__.py

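Here are the imports used to load modules from these packages in poetry_demo/main.py. These top-level imports resolve when main.py is run directly as a script (python poetry_demo/main.py), because Python then puts the script's directory on sys.path: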

# main.py
import module1
from subpackage1 import module2
from subpackage2 import module3

print(module1.func1())  # Output: This is func1 from poetry_demo.module1
print(module2.func2())  # Output: This is func2 from poetry_demo.subpackage1.module2
print(module3.func3())  # Output: This is func3 from poetry_demo.subpackage2.module3
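To see how __name__ reflects the import path, here is a self-contained sketch. It builds a throwaway package in a temporary directory and imports a module from it. The package name demo_pkg and the body of func1 are assumptions made for illustration; they are not the article's actual module contents.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build demo_pkg/__init__.py and demo_pkg/module1.py on the fly
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "module1.py").write_text(
        "def func1():\n"
        "    return f'This is func1 from {__name__}'\n"
    )

    # Make the temporary directory importable, then import the module
    sys.path.insert(0, tmp)
    try:
        module1 = importlib.import_module("demo_pkg.module1")
        message = module1.func1()
    finally:
        sys.path.remove(tmp)

print(message)  # This is func1 from demo_pkg.module1
```

Inside an imported module, __name__ is the dotted import path; only the file started directly by the interpreter sees "__main__".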

D - Python and OOP concepts

Inheritance

Here is how to extend classes and override methods.

class Animal:
    def speak(self):
        return "I make a sound."

class Dog(Animal):
    def speak(self):
        return "Woof!"

dog = Dog()
print(dog.speak())  # Output: "Woof!"

Use super() to access parent class methods.
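A short sketch of super() in practice, reusing the Animal and Dog names from above. The name and breed attributes are added here purely for illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "I make a sound."


class Dog(Animal):
    def __init__(self, name, breed):
        # Let the parent class initialise the attributes it owns
        super().__init__(name)
        self.breed = breed

    def speak(self):
        # Extend the parent behaviour instead of replacing it entirely
        return f"{super().speak()} Woof!"


dog = Dog("Rex", "Labrador")
print(dog.name)     # Rex
print(dog.speak())  # I make a sound. Woof!
```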

Polymorphism

Use methods with the same name in different classes:

class Cat:
    def speak(self):
        return "Meow!"

class Dog:
    def speak(self):
        return "Woof!"

animals = [Cat(), Dog()]
for animal in animals:
    print(animal.speak())

Abstraction

The abc (Abstract Base Classes) module is used to define methods that subclasses are required to implement:

from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def speak(self):
        pass

class Dog(Animal):
    def speak(self):
        return "Woof!"
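The practical effect of @abstractmethod is that a class with unimplemented abstract methods cannot be instantiated. The sketch below demonstrates this. Fish and can_instantiate are hypothetical names introduced only for the demonstration.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def speak(self):
        """Return the sound this animal makes."""


class Dog(Animal):
    def speak(self):
        return "Woof!"


class Fish(Animal):
    # Hypothetical subclass that forgets to implement speak()
    pass


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(Dog))     # True
print(can_instantiate(Animal))  # False
print(can_instantiate(Fish))    # False
```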

Class and Static Methods

@classmethod: This decorator tells Python that the following function is intended to be used as a class method. def from_string(cls, arg): defines the method from_string, which belongs to the class. The cls parameter is a reference to the class itself, not to an instance of the class. Class methods can be called on the class itself or on an instance of the class. You typically use class methods when you need to operate on class attributes or when you want to define a factory method for creating instances of the class.

@staticmethod: This is another decorator, which indicates that the following method is intended to be a static method. def is_valid(arg): defines the method is_valid, which belongs to the class but receives neither the class nor an instance. You use static methods when you want to group logically related functions together, or when you need a function that doesn’t depend on the state of the class or its instances.

class MyClass:
    def __init__(self, value):
        # Needed so that cls(arg) in from_string can build an instance
        self.value = value

    @classmethod
    def from_string(cls, arg):
        # Create an instance of MyClass based on the string argument
        return cls(arg)

    @staticmethod
    def is_valid(arg):
        # Perform a validation check based on the argument
        return arg > 0
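The difference between the two decorators shows up when they are called. The Temperature class below is a hypothetical example written for this purpose, with an assumed "21.5C" input format.

```python
class Temperature:
    """Hypothetical class used only to illustrate the two decorators."""

    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_string(cls, text):
        # Factory method: parse a string such as "21.5C" and build an instance
        return cls(float(text.rstrip("Cc")))

    @staticmethod
    def is_valid(celsius):
        # Needs neither the class nor an instance: a plain helper function
        return celsius >= -273.15


t = Temperature.from_string("21.5C")
print(t.celsius)                   # 21.5
print(Temperature.is_valid(-300))  # False
print(t.is_valid(t.celsius))       # True
```

Because from_string builds the object through cls rather than naming Temperature directly, a subclass calling it gets an instance of the subclass.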

E - Pydantic: a library for data validation

Pydantic is a data validation and settings management library in Python, which uses Python type annotations to validate the input data. It’s widely used for creating data models in web applications, especially when working with FastAPI, but it can be utilized in any application requiring robust data validation.

Key Features of Pydantic:

  • Data Validation: Automatically validates that data matches the expected types and constraints.
  • Type Annotations: Uses Python type hints to define expected input/output for models.
  • Serialization/Deserialization: Converts between complex objects and native Python data types (e.g., JSON).
  • Default Values: Supports default values for fields if they are not provided in input data.
  • Custom Validators: Allows custom validation logic using validator methods.

Defining a Model

Pydantic models are defined by subclassing pydantic.BaseModel:

from typing import Optional

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: Optional[str] = None  # optional field, defaults to None

Validating Data: Create an instance of the model with input data, which is validated on construction:

user_data = {
    "name": "Alice",
    "age": 30,
    # 'email' is optional as it has a default value of None
}

try:
    user = User(**user_data)
    print(user.name)  # Output: Alice
    print(user.age)   # Output: 30
    print(user.email) # Output: None
except ValueError as e:
    print(e)
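The example above only exercises the success path. The sketch below shows what happens when a value cannot be validated. The validate_user helper is a hypothetical wrapper introduced here, and the behaviour described is Pydantic's default (non-strict) mode.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


def validate_user(data):
    """Return (user, errors); exactly one of the two is None."""
    try:
        return User(**data), None
    except ValidationError as exc:
        # exc.errors() is a list of dicts, one per invalid field
        return None, exc.errors()


good_user, good_errors = validate_user({"name": "Alice", "age": "30"})
print(good_user.age)  # 30 (the numeric string is converted to int)

bad_user, bad_errors = validate_user({"name": "Bob", "age": "abc"})
print(bad_errors[0]["loc"])  # ('age',)
```

Collecting exc.errors() instead of printing the exception gives structured data that an API can return to its caller.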

Custom Validators

Custom validation logic is added with the @field_validator decorator (Pydantic v2; in Pydantic v1 the equivalent decorator was @validator):

from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator('age')
    @classmethod
    def check_age(cls, value):
        if value < 0:
            raise ValueError("Age must be positive")
        return value

user_data = {"name": "Bob", "age": -5}

try:
    user = User(**user_data)
except ValueError as e:
    print(e)  # Output begins: 1 validation error for User
              # age
              #   Value error, Age must be positive [type=value_error, ...]

Serialization

The model instance can be converted to a dictionary or to JSON. The method names below are the Pydantic v2 ones; Pydantic v1 used dict() and json().

user_dict = user.model_dump()
user_json = user.model_dump_json()

print(user_dict)  # Output: {'name': 'Alice', 'age': 30, 'email': None}
print(user_json)  # Output: {"name":"Alice","age":30,"email":null}

F - Pydantic Settings: load settings and configuration

Pydantic Settings provides optional Pydantic features for loading a settings or config class from environment variables or secrets files.

Define a Settings Model

You’ll need to define a model that represents the settings or configuration for your application. This is done by subclassing BaseSettings. In Pydantic v2, BaseSettings lives in the separate pydantic-settings package (pip install pydantic-settings).

from pydantic import ValidationError
from pydantic_settings import BaseSettings, SettingsConfigDict

class MyAppSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file='.env',      # Optionally specify a .env file for loading environment variables
        env_prefix='MYAPP_',  # Environment variable prefix
    )

    database_url: str
    debug_mode: bool = False  # default value if not provided
    max_connections: int = 10

# You can add validators or other methods to the settings model as needed.

Load Settings from Environment Variables

With the model defined, you can load your configuration settings from environment variables. Pydantic will automatically map the environment variables based on the field names and the prefix specified in model_config.

import os

# Set example environment variables for demonstration purposes
os.environ['MYAPP_DATABASE_URL'] = 'sqlite:///mydb.sqlite'
os.environ['MYAPP_DEBUG_MODE'] = 'True'

try:
    settings = MyAppSettings()
    print(settings.model_dump_json(indent=2))  # Display the loaded configuration as JSON
except ValidationError as e:
    print("Error loading settings:", e)

Access Settings in Your Application

You can then use these settings throughout your application:

def connect_to_database(url, max_connections):
    # Receives what it needs as arguments instead of relying on a global
    print(f"Connecting to database at {url} with max connections: {max_connections}")

def main():
    try:
        settings = MyAppSettings()
    except ValidationError as e:
        print("Error loading settings:", e)
        return

    if settings.debug_mode:
        print("Debug mode is enabled.")

    # Use settings in the application logic
    connect_to_database(settings.database_url, settings.max_connections)

if __name__ == "__main__":
    main()

G - Click: a package for creating command-line interfaces

Click is a popular library for creating command-line interfaces (CLIs) in Python. It provides decorators to add commands, options, and arguments to your CLI application with minimal boilerplate code. Here is a simple example of using Click to create a basic command-line application:

import click

@click.command()
@click.option('--name', default='World', help='The person to greet.')
def hello(name):
    """Simple program that greets NAME."""
    click.echo(f'Hello {name}!')

if __name__ == '__main__':
    hello()

  • @click.command(): This decorator turns the function into a Click command.
  • @click.option(...): This adds an option to the command. In this case, it adds a --name option with a default value of 'World'.
  • def hello(name): The decorated function receives the values of the options and arguments given on the command line.

We can run the program from the terminal like so:

python script.py --name James
python script.py # the default value World will be used
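Click commands can also be exercised without a terminal by using the CliRunner helper from click.testing, which is convenient in unit tests. The sketch below repeats the hello command so that it runs on its own.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option('--name', default='World', help='The person to greet.')
def hello(name):
    """Simple program that greets NAME."""
    click.echo(f'Hello {name}!')


runner = CliRunner()

# Invoke the command in-process, as if typed on the command line
result = runner.invoke(hello, ['--name', 'James'])
print(result.exit_code)        # 0
print(result.output.strip())   # Hello James!

# With no arguments the default value is used
default_result = runner.invoke(hello, [])
print(default_result.output.strip())  # Hello World!
```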

H - SQLAlchemy: SQL ORM

SQLAlchemy ORM (Object-Relational Mapping) is a powerful and flexible toolkit for working with databases in Python. It allows developers to map database tables to Python classes, enabling them to work with data in an object-oriented manner rather than writing raw SQL queries.

Key Concepts:

  1. Declarative Base:

    • You define your models (which correspond to database tables) by subclassing a base class provided by SQLAlchemy.
    • The Base class is usually created by subclassing DeclarativeBase (SQLAlchemy 2.0 style) or, in older code, by calling declarative_base().
  2. Mapping:

    • Each model corresponds to a table in the database, and each instance of a model represents a row in that table.
    • Columns in the tables are represented as attributes on the model classes.
  3. Session:

    • The Session object acts as a staging zone for all objects loaded or associated with it during its lifespan.
    • It handles transactions and provides methods to query, add, delete, or update records.
  4. Querying:

    • SQLAlchemy ORM provides a high-level querying API that allows you to construct SQL queries in a Pythonic way.
    • Queries are executed by the Session object, which returns results as model instances.

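Here’s an example demonstrating how to define and use models with SQLAlchemy ORM. It targets PostgreSQL and reads its connection settings from the project's own crawler.settings module: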

import datetime
import uuid
from typing import Generic, Type, TypeVar

from loguru import logger
from sqlalchemy import MetaData, create_engine, func, select
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column, registry

from crawler.settings import settings


_database = create_engine(settings.db_url, echo=settings.db_debug)
mapper_registry = registry()
my_metadata = MetaData()

T = TypeVar('T', bound='PsqlBaseModel')  # Ensuring T is bound to PsqlBaseModel

class PsqlBaseModel(DeclarativeBase,Generic[T]):
    metadata = my_metadata
    id = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    created_at: Mapped[datetime.datetime] = mapped_column(server_default=func.now())
    updated_at: Mapped[datetime.datetime] = mapped_column(server_default=func.now(),server_onupdate=func.now())
    def __eq__(self, value: object) -> bool:
        if not isinstance(value, self.__class__):
            return False
        return self.id == value.id
    
    @staticmethod
    def bulk_insert(objects):
        """Bulk insert objects into the database.
        Args:
            objects (List[object]): List of objects to insert.
        """
        with Session(_database) as session:
            session.bulk_save_objects(objects)
            session.commit()
    
    @classmethod
    def get_by_hash(cls: Type[T],hash: str):
        """Get object by hash.
        Args:
            hash (str): Hash to search for.
        Returns:
            object: Object found.
        """
        logger.info(f"Searching class {cls} object with hash: {hash}")
        with Session(_database) as session:
            return session.query(cls).filter(cls.hash == hash).first()

To create all the tables:

PsqlBaseModel.metadata.create_all(_database)

Integrating SQLAlchemy with PostgreSQL functions and triggers can be a powerful way to extend the capabilities of your database schema while maintaining ORM-based application logic. Here’s how you can achieve this:

import datetime
from typing import List, Optional
from sqlalchemy.orm import Mapped,mapped_column
from sqlalchemy import Column, String, DateTime, DECIMAL, func, event, text
from sqlalchemy.dialects.postgresql import UUID
from .category import ProductCategory
from crawler.domain.base.psql import PsqlBaseModel
from sqlalchemy.orm import DeclarativeBase

class Product(PsqlBaseModel):
    __tablename__ = "products"
    name: Mapped[str] = mapped_column(nullable=True)
    hash: Mapped[str] = mapped_column(index=True)
    price: Mapped[int] = mapped_column(index=True,type_=DECIMAL(10,2),nullable=True,default=0)
    url: Mapped[str]
    website: Mapped[str]
    image: Mapped[str] = mapped_column(nullable=True)
    website_updated_at: Mapped[datetime.datetime]
    crawler_updated_at: Mapped[datetime.datetime]

    def __str__(self):
        return repr(self)

    def __repr__(self):
        return f'Product(name={self.name}, price={self.price}, url={self.url}, website={self.website})'
    
class ProductPriceQueue(PsqlBaseModel):
    __tablename__ = "products_queue"
    id: Mapped[int] = mapped_column(primary_key=True,autoincrement=True)
    product_id = mapped_column(UUID(as_uuid=True),default=func.uuid_generate_v4(),unique=True)
    action: Mapped[str]
    status: Mapped[str] = mapped_column(default="pending")
    url: Mapped[str]
    processed_at: Mapped[datetime.datetime] = mapped_column(nullable=True)
    
# PostgreSQL function and trigger as raw SQL
detect_price_change_sql = """
CREATE OR REPLACE FUNCTION detect_price_change()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.price IS NULL OR NEW.price = 0 THEN
        INSERT INTO products_queue (product_id, action,url, status)
        VALUES (NEW.id, 'missing_price', NEW.url, 'pending')
        ON CONFLICT (product_id) DO NOTHING ;
    END IF;

    IF NEW.website_updated_at IS DISTINCT FROM OLD.website_updated_at THEN
        INSERT INTO products_queue (product_id, action, url, status)
        VALUES (NEW.id, 'update', NEW.url,'pending')
        ON CONFLICT (product_id) DO NOTHING;
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
"""

create_trigger_sql = """
CREATE TRIGGER trigger_price_change
AFTER INSERT OR UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION detect_price_change();
"""

# Event listener to execute after table creation
@event.listens_for(PsqlBaseModel.metadata, "after_create")
def create_postgresql_function(target, connection, **kw):
    connection.execute(text(detect_price_change_sql))
    connection.execute(text(create_trigger_sql))

I - Alembic: SQLAlchemy migrations

In progress

J - FastAPI: a web framework for building APIs

In progress