Pythonic Code: From Good to Great

Python's simplicity is deceptive — writing clean, maintainable Python requires knowing its idioms and conventions. Whether you're building APIs, scripts, or data pipelines, these best practices will level up your Python code.

Project Structure

Standard Python Project Layout

my-project/
├── pyproject.toml          # Project config and dependencies
├── README.md
├── src/
│   └── my_project/
│       ├── __init__.py
│       ├── main.py
│       ├── config.py
│       ├── models/
│       │   ├── __init__.py
│       │   └── user.py
│       ├── services/
│       │   ├── __init__.py
│       │   └── auth.py
│       └── utils/
│           ├── __init__.py
│           └── helpers.py
├── tests/
│   ├── __init__.py
│   ├── conftest.py         # Shared fixtures
│   ├── test_models.py
│   └── test_services.py
└── scripts/
    └── seed_db.py

Use `pyproject.toml` Over `setup.py`

[project]
name = "my-project"
version = "1.0.0"
requires-python = ">=3.12"
dependencies = [
    "fastapi>=0.110",
    "sqlalchemy>=2.0",
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
    "ruff>=0.3",
    "mypy>=1.9",
]

[tool.ruff]
line-length = 88
target-version = "py312"

[tool.mypy]
strict = true

Code Style

Follow PEP 8 and Use a Formatter

Use ruff for both linting and formatting:

# Format code
ruff format .

# Lint and auto-fix
ruff check --fix .

Naming Conventions

# Variables and functions: snake_case
user_name = "Alice"
def get_active_users():
    pass

# Classes: PascalCase
class UserService:
    pass

# Constants: UPPER_SNAKE_CASE
MAX_RETRIES = 3
DATABASE_URL = "postgresql://..."

# Private: leading underscore
class User:
    def __init__(self):
        self._internal_state = {}  # Convention: "don't touch"

    def _validate(self):  # Internal method
        pass

Use f-Strings for Formatting

# Bad
greeting = "Hello, " + name + "! You are " + str(age) + " years old."
greeting = "Hello, {}! You are {} years old.".format(name, age)

# Good
greeting = f"Hello, {name}! You are {age} years old."

# f-strings support expressions
message = f"Total: ${price * quantity:.2f}"
debug = f"{user=}"  # Evaluates to something like "user=User(name='Alice')"
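
The format specification after the colon covers most everyday formatting needs. The sketch below uses made-up values (all variable names are illustrative only) to show a few common specifiers:

```python
from datetime import date

price = 1234567.891
ratio = 0.875
name = "Alice"
released = date(2024, 3, 15)

thousands = f"{price:,.2f}"      # comma as thousands separator, two decimals
percent = f"{ratio:.1%}"         # multiplies by 100 and appends a percent sign
padded = f"[{name:>10}]"         # right-aligns inside a 10-character field
stamp = f"{released:%Y-%m-%d}"   # date objects accept strftime codes
debug = f"{name=}"               # variable name plus its repr

print(thousands, percent, padded, stamp, debug, sep="\n")
```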

Type Hints

Use Type Hints Everywhere

def get_user(user_id: int) -> User | None:
    """Fetch a user by ID."""
    ...

def create_user(name: str, email: str, age: int = 25) -> User:
    """Create a new user."""
    ...

from collections import Counter

def process_items(items: list[str]) -> dict[str, int]:
    """Count occurrences of each item."""
    return dict(Counter(items))  # One pass, instead of items.count() per item

Use `TypedDict` for Structured Dictionaries

from typing import TypedDict

class UserDict(TypedDict):
    id: int
    name: str
    email: str
    is_active: bool

def process_user(user: UserDict) -> str:
    return f"{user['name']} ({user['email']})"

Use Pydantic for Validation

# EmailStr needs the optional email-validator package: pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr, field_validator

class CreateUserRequest(BaseModel):
    name: str
    email: EmailStr
    age: int

    @field_validator("age")
    @classmethod
    def validate_age(cls, v: int) -> int:
        if v < 0 or v > 150:
            raise ValueError("Age must be between 0 and 150")
        return v

# Automatic validation
user = CreateUserRequest(name="Alice", email="alice@example.com", age=30)

Functions

Keep Functions Small and Focused

# Bad - does too many things
def process_order(order_data):
    # validate
    if not order_data.get("items"):
        raise ValueError("No items")
    # calculate
    total = sum(item["price"] * item["qty"] for item in order_data["items"])
    tax = total * 0.2
    # save
    db.orders.insert({"total": total + tax, **order_data})
    # notify
    send_email(order_data["email"], f"Order total: {total + tax}")

# Good - separated concerns
def validate_order(order_data: dict) -> None:
    if not order_data.get("items"):
        raise ValueError("No items")

def calculate_total(items: list[dict]) -> float:
    subtotal = sum(item["price"] * item["qty"] for item in items)
    return subtotal * 1.2  # Including tax

def process_order(order_data: dict) -> None:
    validate_order(order_data)
    total = calculate_total(order_data["items"])
    save_order(order_data, total)
    notify_customer(order_data["email"], total)
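
One payoff of the split is testability. As a rough sketch of one possible variation, the side effects can be passed in as callables so the whole flow runs without a database or mail server. The `save` and `notify` parameters and the `TAX_RATE` constant are assumptions made for this example:

```python
TAX_RATE = 0.2  # assumed rate, mirroring the 20% used above


def calculate_total(items: list[dict], tax_rate: float = TAX_RATE) -> float:
    subtotal = sum(item["price"] * item["qty"] for item in items)
    return round(subtotal * (1 + tax_rate), 2)


def process_order(order_data: dict, *, save, notify) -> float:
    """Coordinate the steps; side effects are injected as callables."""
    if not order_data.get("items"):
        raise ValueError("No items")
    total = calculate_total(order_data["items"])
    save(order_data, total)
    notify(order_data["email"], total)
    return total


# In a test, plain lists can stand in for the database and the mailer.
saved, sent = [], []
order = {"email": "alice@test.com", "items": [{"price": 10.0, "qty": 3}]}
total = process_order(
    order,
    save=lambda data, amount: saved.append(amount),
    notify=lambda email, amount: sent.append((email, amount)),
)
```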

Use `*` to Force Keyword Arguments

# Bad - easy to mix up positional args
def create_user(name, email, active, admin):
    ...
create_user("Alice", "alice@test.com", True, False)  # What's True? What's False?

# Good - keyword-only after *
def create_user(name: str, email: str, *, active: bool = True, admin: bool = False):
    ...
create_user("Alice", "alice@test.com", active=True, admin=False)
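
To see the enforcement in action, here is a small self-contained sketch. The function returns a plain dict purely for illustration:

```python
def create_user(name: str, email: str, *, active: bool = True, admin: bool = False) -> dict:
    return {"name": name, "email": email, "active": active, "admin": admin}


user = create_user("Alice", "alice@test.com", admin=True)

try:
    # Passing the flags positionally is rejected at call time
    create_user("Alice", "alice@test.com", True, False)
except TypeError as exc:
    error = str(exc)
    print(error)
```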

Return Early to Reduce Nesting

# Bad
def get_discount(user):
    if user is not None:
        if user.is_premium:
            if user.years > 5:
                return 0.3
            else:
                return 0.2
        else:
            return 0.05
    else:
        return 0

# Good
def get_discount(user) -> float:
    if user is None:
        return 0.0
    if not user.is_premium:
        return 0.05
    if user.years > 5:
        return 0.3
    return 0.2

Data Structures

Use Dataclasses for Simple Data Containers

from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    email: str
    age: int
    tags: list[str] = field(default_factory=list)

    @property
    def is_adult(self) -> bool:
        return self.age >= 18

user = User(name="Alice", email="alice@test.com", age=30)
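
Two details are worth seeing in isolation: `default_factory` gives every instance its own mutable default, and `frozen=True` makes instances immutable and hashable. A minimal sketch with illustrative class names:

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass
class User:
    name: str
    tags: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Coordinates:
    lat: float
    lon: float


a, b = User("Alice"), User("Bob")
a.tags.append("admin")        # each instance gets its own list

home = Coordinates(52.52, 13.405)
frozen_rejected = False
try:
    home.lat = 0.0            # frozen instances reject assignment
except FrozenInstanceError:
    frozen_rejected = True

visited = {home}              # frozen dataclasses can be set members or dict keys
```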

Use Comprehensions

# List comprehension
squares = [x ** 2 for x in range(10)]
active_users = [u for u in users if u.is_active]

# Dict comprehension
user_map = {u.id: u for u in users}
words = text.split()
word_counts = {word: words.count(word) for word in set(words)}  # Count whole words, not substrings

# Set comprehension
unique_domains = {email.split("@")[1] for email in emails}

# Generator expression for large datasets (lazy evaluation)
total = sum(order.total for order in orders)
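
The difference between a list comprehension and a generator expression is easiest to see by size. This sketch compares the two on 100,000 numbers; exact byte counts vary by Python version, so treat them as rough indications:

```python
import sys

numbers = range(100_000)

as_list = [n * n for n in numbers]    # builds every result up front
as_gen = (n * n for n in numbers)     # builds nothing until iterated

list_bytes = sys.getsizeof(as_list)   # hundreds of kilobytes for the list alone
gen_bytes = sys.getsizeof(as_gen)     # a few hundred bytes at most

# Aggregates such as sum() and any() consume a generator directly.
total = sum(n * n for n in range(10))
has_big = any(n > 5 for n in range(10))
```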

Use `collections` for Specialized Data Structures

from collections import defaultdict, Counter, deque

# defaultdict - no KeyError
word_groups = defaultdict(list)
for word in words:
    word_groups[word[0]].append(word)

# Counter - counting made easy
word_counts = Counter(text.split())
top_10 = word_counts.most_common(10)

# deque - fast append/pop from both ends
recent_items = deque(maxlen=100)
recent_items.append(new_item)
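
Here are the same three containers run against concrete sample data (the word list is invented for the example):

```python
from collections import Counter, defaultdict, deque

words = ["apple", "avocado", "banana", "blueberry", "cherry", "apple"]

# Group words by first letter without checking whether the key exists
by_letter: defaultdict[str, list[str]] = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

counts = Counter(words)
top = counts.most_common(1)

recent: deque[int] = deque(maxlen=3)
for n in range(5):
    recent.append(n)    # once full, the oldest item falls off the left
```

A `Counter` also returns 0 for keys it has never seen, so `counts["kiwi"]` does not raise.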

Error Handling

Catch Specific Exceptions

# Bad - catches everything including KeyboardInterrupt
try:
    result = process_data()
except:
    pass

# Bad - too broad
try:
    result = process_data()
except Exception:
    log.error("Something went wrong")

# Good - specific exceptions
try:
    result = process_data()
except ValueError as e:
    log.warning(f"Invalid data: {e}")
    result = default_value
except ConnectionError as e:
    log.error(f"Connection failed: {e}")
    raise
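
When translating a low-level error into one of your own, `raise ... from` keeps the original cause attached, and `log.exception` records the traceback. A minimal sketch, where `ConfigError` and both functions are hypothetical names chosen for the example:

```python
import json
import logging

log = logging.getLogger(__name__)


class ConfigError(Exception):
    """Hypothetical application-level error used for this example."""


def parse_config(raw: str) -> dict:
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        # "from e" keeps the original error as __cause__ in the traceback
        raise ConfigError("Config is not valid JSON") from e


def load_or_default(raw: str) -> dict:
    try:
        return parse_config(raw)
    except ConfigError:
        # log.exception records the message plus the full traceback
        log.exception("Falling back to default config")
        return {}
```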

Create Custom Exceptions

class AppError(Exception):
    """Base exception for the application."""
    pass

class NotFoundError(AppError):
    def __init__(self, resource: str, id: str):
        self.resource = resource
        self.id = id
        super().__init__(f"{resource} with id '{id}' not found")

class ValidationError(AppError):
    def __init__(self, field: str, message: str):
        self.field = field
        super().__init__(f"Validation error on '{field}': {message}")

# Usage
raise NotFoundError("User", "123")
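
A shared base class lets a single boundary handle every application error. The sketch below restates the hierarchy so it runs on its own; `find_user` and `handle_request` are hypothetical, and the attribute is named `resource_id` here to avoid shadowing the built-in `id`:

```python
class AppError(Exception):
    """Base exception for the application."""


class NotFoundError(AppError):
    def __init__(self, resource: str, resource_id: str):
        self.resource = resource
        self.resource_id = resource_id
        super().__init__(f"{resource} with id '{resource_id}' not found")


def find_user(user_id: str) -> dict:
    # Hypothetical lookup that always misses, to exercise the error path
    raise NotFoundError("User", user_id)


def handle_request(user_id: str) -> tuple[int, str]:
    """Map application errors to (status, message) pairs in one place."""
    try:
        return 200, find_user(user_id)["name"]
    except NotFoundError as e:
        return 404, str(e)
    except AppError as e:      # any other application error
        return 400, str(e)
```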

Use Context Managers

# Bad - might not close on error
f = open("data.txt")
data = f.read()
f.close()

# Good - always closes
with open("data.txt") as f:
    data = f.read()

# Custom context manager
import time
from contextlib import contextmanager

@contextmanager
def timer(label: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report the timing even if the block raises
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f}s")

with timer("data processing"):
    process_large_dataset()
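
In a generator-based context manager, cleanup code runs after an exception only if it sits in a `finally` block. The sketch below uses a hypothetical `temp_env` helper and an arbitrary variable name to show state being restored even when the body fails:

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def temp_env(name: str, value: str) -> Iterator[None]:
    """Set an environment variable for the duration of the block."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Runs on normal exit and when the block raises
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


os.environ.pop("PYTHONIC_DEMO_MODE", None)   # start from a known state

try:
    with temp_env("PYTHONIC_DEMO_MODE", "test"):
        inside = os.environ["PYTHONIC_DEMO_MODE"]
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

after = os.environ.get("PYTHONIC_DEMO_MODE")
```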

Async Python

Use `asyncio` for I/O-Bound Work

import asyncio
import httpx

async def fetch_user(client: httpx.AsyncClient, user_id: int) -> dict:
    response = await client.get(f"/api/users/{user_id}")
    response.raise_for_status()
    return response.json()

async def fetch_all_users(user_ids: list[int]) -> list[dict]:
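    # Relative paths such as "/api/users/1" need a base_url; this host is a placeholder
    async with httpx.AsyncClient(base_url="https://api.example.com") as client: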
        tasks = [fetch_user(client, uid) for uid in user_ids]
        return await asyncio.gather(*tasks)

# Run
users = asyncio.run(fetch_all_users([1, 2, 3, 4, 5]))
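
Launching every request at once can overwhelm a server, so it is common to cap concurrency. The following standard-library sketch uses `asyncio.sleep` as a stand-in for a network call; the function names and the limit of 2 are arbitrary choices for the example:

```python
import asyncio


async def fetch(item_id: int, limit: asyncio.Semaphore) -> dict:
    async with limit:                  # at most max_concurrent coroutines pass at once
        await asyncio.sleep(0.01)      # stand-in for a network call
        return {"id": item_id}


async def fetch_all(ids: list[int], max_concurrent: int = 2) -> list[dict]:
    limit = asyncio.Semaphore(max_concurrent)
    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*(fetch(i, limit) for i in ids))


results = asyncio.run(fetch_all([1, 2, 3, 4, 5]))
```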

Testing

Use `pytest` with Clear Test Names

import pytest

def test_create_user_with_valid_data():
    user = create_user(name="Alice", email="alice@test.com")
    assert user.name == "Alice"
    assert user.email == "alice@test.com"

def test_create_user_raises_on_invalid_email():
    with pytest.raises(ValidationError, match="Invalid email"):
        create_user(name="Alice", email="not-an-email")

@pytest.mark.parametrize("age,expected", [
    (17, False),
    (18, True),
    (65, True),
])
def test_is_adult(age: int, expected: bool):
    user = User(name="Test", email="t@t.com", age=age)
    assert user.is_adult == expected

Use Fixtures for Setup

# conftest.py
import pytest
import pytest_asyncio  # Plain pytest cannot run async fixtures; this assumes the pytest-asyncio plugin

@pytest.fixture
def sample_user():
    return User(name="Alice", email="alice@test.com", age=30)

@pytest_asyncio.fixture
async def db_session():
    session = await create_test_session()
    yield session
    await session.rollback()
    await session.close()

Performance

Use Generators for Large Datasets

# Bad - loads everything into memory
def read_large_file(path: str) -> list[str]:
    with open(path) as f:
        return [line.strip() for line in f]  # Could be millions of lines

# Good - processes one line at a time
from collections.abc import Iterator

def read_large_file(path: str) -> Iterator[str]:
    with open(path) as f:
        for line in f:
            yield line.strip()

for line in read_large_file("huge_file.txt"):
    process(line)
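
Generators also compose into pipelines in which each stage pulls only what it needs. A small sketch, with a list of strings standing in for an open file and illustrative function names:

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def non_empty(lines: Iterable[str]) -> Iterator[str]:
    for line in lines:
        stripped = line.strip()
        if stripped:
            yield stripped


def parse_ints(lines: Iterable[str]) -> Iterator[int]:
    for line in lines:
        yield int(line)


raw = ["10\n", "\n", "20\n", "  \n", "30\n", "40\n"]   # stand-in for an open file

pipeline = parse_ints(non_empty(raw))    # nothing has been read yet
first_two = list(islice(pipeline, 2))    # pulls only as much input as needed
rest = list(pipeline)                    # a generator resumes where it stopped
```

Note that a generator can be consumed only once; iterating `pipeline` again yields nothing.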

Use `functools.lru_cache` for Expensive Computations

from functools import lru_cache

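@lru_cache(maxsize=128)
def get_user_permissions(user_id: int) -> frozenset[str]:
    # Expensive database query. The cached object is shared between callers,
    # so return an immutable value; entries stay until evicted or cleared.
    return frozenset(db.fetch_permissions(user_id))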
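
The decorated function exposes `cache_info()` and `cache_clear()`, which help when checking hit rates or invalidating stale entries. A toy sketch, where `slow_square` and the `calls` list exist only to make the behavior visible:

```python
from functools import lru_cache

calls: list[int] = []   # records real executions, not cache hits


@lru_cache(maxsize=128)
def slow_square(n: int) -> int:
    calls.append(n)
    return n * n


slow_square(4)
slow_square(4)                 # served from the cache
info = slow_square.cache_info()

slow_square.cache_clear()      # invalidate when the underlying data changes
slow_square(4)                 # executes again after the clear
```

Arguments must be hashable, so lists and dicts cannot be passed to a cached function directly.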

Use `slots` for Memory-Efficient Classes

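from dataclasses import dataclass

@dataclass(slots=True)  # Requires Python 3.10+
class Point:
    x: float
    y: float
    z: float

# Slotted instances have no per-instance __dict__, so they use noticeably less
# memory; the exact saving depends on the Python version and number of fields.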

Quick Reference

Practice	Why
Type hints everywhere	Catch bugs early, better IDE support
`ruff` for formatting and linting	Fast, consistent code style
f-strings for formatting	Readable, fast string formatting
Dataclasses for data containers	Less boilerplate than manual `__init__`
Comprehensions over loops	Concise, Pythonic, often faster
Specific exception handling	Don't hide bugs
Context managers (`with`)	Guaranteed resource cleanup
`pytest` with parametrize	Thorough, readable tests
Generators for large data	Memory efficient
`lru_cache` for expensive calls	Automatic memoization

Summary

Writing Pythonic code comes down to:

Be explicit — type hints, descriptive names, keyword arguments
Be idiomatic — comprehensions, context managers, dataclasses
Handle errors properly — specific exceptions, custom error classes
Test thoroughly — pytest, fixtures, parametrize
Optimize when needed — generators, caching, slots