A few days ago, I was coding casually while re-watching some episodes of Bob’s Burgers. Inspired by the show, and with a vague memory of an old SpongeBob web game I used to play as a kid, I decided to create a simple burger-flipping game. While the game itself isn’t particularly significant (I may end up sharing it later), what is important is that I needed a lot of images. Since I have Google’s Gemini, I started using it. I would sketch a basic idea on paper and then prompt Gemini to re-draw it in a different style. As you can imagine, this process became quite tedious.
Instead of continuing with the Gemini-based approach, I decided to build my own tool called Imagenation. Initially, it was a command-line tool, but I soon realized that I would likely need to reuse it in other projects. Therefore, I decided to learn how to create my own package and eventually upload it to PyPI. As a Python enthusiast, this was a long-standing curiosity of mine.
At a high level, the tool needed to:
- Generate images from text prompts (Text-to-Image).
- Modify existing images with text prompts (Text+Image-to-Image).
- Run as a simple CLI tool that can chew through a CSV or JSON file of prompts.
- Work as an importable Python library for other projects.
Git Repo: https://github.com/pmsosa/Imagenation
PyPI: https://pypi.org/project/imagenation
Leveraging Google's Imagen
First things first: just talking to the API. Google's generative AI tools are incredibly powerful, and getting a single image generated in Python is refreshingly simple.
You set up your API key, initialize the model, and essentially just… ask for what you want.
First, go pick up your API key from: https://aistudio.google.com/api-keys
import os

import google.generativeai as genai
import PIL.Image

# Make sure GOOGLE_API_KEY is set as an environment variable
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel('gemini-pro-vision')  # Or your model of choice
response = model.generate_content(
    [
        "A beautiful sunset over mountains",
        PIL.Image.open("my-image.png"),  # Optional input image
    ]
)
# ...then you have to process the response, get the bytes, save to a file...
The first time this works, it feels like magic. But the second time you run it—and the third, fourth, fifth, and sixth time, all within a minute—you hit the real boss: Rate Limiting.
The free tier for the API (at the time of writing) allows about 5 requests per minute. If you send that sixth request, you get a 429 error, and your script comes to a crashing halt.
To stay under the limit, I added a --delay parameter that sets the pause between requests. It defaults to 12 seconds (60 seconds / 5 requests, so just enough for the free tier), but you can set it to 0.2 if you're on a paid tier (300 requests/min) or 15.0+ if you're running a massive batch and want to be extra careful.
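For illustration, here is a minimal sketch of how that kind of delay could be enforced. It is not the code from Imagenation itself: the names delay_for, Throttle, call_with_retry, and RateLimitError are hypothetical, and RateLimitError stands in for whatever exception your API client raises on a 429.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception an API client raises on HTTP 429."""


def delay_for(requests_per_minute):
    """Seconds to wait between requests to stay at or under a per-minute quota."""
    return 60.0 / requests_per_minute


class Throttle:
    """Spaces successive calls at least `delay` seconds apart."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if there was one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def call_with_retry(func, throttle, max_retries=3, sleep=time.sleep):
    """Run func() behind the throttle; on a rate-limit error, back off and retry."""
    for attempt in range(max_retries + 1):
        throttle.wait()
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise
            # Exponential backoff on top of the normal spacing between calls.
            sleep(throttle.delay * (2 ** attempt))
```

With this shape, Throttle(delay_for(5)) gives the 12-second spacing for the free tier, and the retry wrapper means a stray 429 pauses the batch instead of crashing it.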
From Script to Package
The project could have ended there. I had an imagenation.py script that worked for me. But I've been doing this long enough to know that "future me" is going to want to use this again, and he's not going to want to hunt for a stray .py file. He's going to want to pip install it.
This was the part I was most excited about: turning this pile of code into a real, distributable PyPI package.
It turns out, with modern Python tooling, this is way more accessible than I thought. The process was basically:
- Restructure: Stop putting everything in one file. I split the logic into:
  - imagenation/generator.py: An ImagenationGenerator class that holds the core logic (generating, rate limiting, saving files).
  - imagenation/cli.py: All the argparse stuff to handle command-line inputs.
  - imagenation/__main__.py: To make the package runnable with python -m imagenation.
- Configure pyproject.toml: This is the new-ish, all-in-one config file. You define your project name, version, dependencies, and entry points. This file (along with setup.py or setup.cfg for compatibility) tells tools like pip and twine (the uploader) what your package is and how to build it.
- Build & Upload: Once the structure was right, the magic commands were:
# Build the package
python -m build
# Upload to TestPyPI first!
# Get your API keys at: https://test.pypi.org
python -m twine upload --repository testpypi dist/*
# Upload to PyPI
# Get your API keys at: https://pypi.org
python -m twine upload dist/*
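To make the cli.py and __main__.py split from the restructure step more concrete, here is a rough sketch of what the argparse side could look like. The flags (-it, -o, --csv, --json, --delay) are the ones used in this post, but the dest names, the default output filename, the help text, and the build_parser and main functions are assumptions for illustration, not the actual Imagenation source.

```python
import argparse


def build_parser():
    """Build the command-line parser (hypothetical layout for imagenation/cli.py)."""
    parser = argparse.ArgumentParser(
        prog="imagenation",
        description="Generate images from text prompts.",
    )
    parser.add_argument("-it", dest="prompt", help="text prompt for a single image")
    parser.add_argument("-o", dest="output", default="output.png",
                        help="where to save the generated image (assumed default)")
    parser.add_argument("--csv", help="CSV file of prompts to process as a batch")
    parser.add_argument("--json", help="JSON file of prompts to process as a batch")
    parser.add_argument("--delay", type=float, default=12.0,
                        help="seconds to wait between API requests")
    return parser


def main(argv=None):
    """Parse arguments and report what would be done (the real work is omitted)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    batch_file = args.csv or args.json
    if batch_file:
        print(f"Would process batch file {batch_file!r} with a {args.delay}s delay")
    elif args.prompt:
        print(f"Would generate {args.output!r} from prompt {args.prompt!r}")
    else:
        parser.print_help()
    return 0


# imagenation/__main__.py would then only need something like:
#     from imagenation.cli import main
#     raise SystemExit(main())
```

Keeping main(argv=None) as a plain function is a deliberate choice: the same function can back python -m imagenation and a console-script entry point in pyproject.toml, and it is easy to call from tests with an explicit argument list.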
The first time I typed pip install imagenation and it actually worked... that was a good feeling.
Now the project exists in both of its intended forms.
As a library you can just import:
from imagenation import ImagenationGenerator
# Initialize with a faster delay for a paid tier
gen = ImagenationGenerator(rate_limit_delay=0.2)
# Generate a text-to-image
gen.generate_text_to_image("A serene lake at sunset", "lake.png")
# Generate a text+image-to-image
gen.generate_text_image_to_image(
"Make this photo look vintage",
"input.jpg",
"vintage_output.jpg"
)
And as the CLI tool it was always meant to be:
# Generate a single image
./start.sh -it "A raccoon flying a Cessna 152" -o flying_raccoon.png
# Process a whole batch of ideas from a CSV file
./start.sh --csv batch_data.csv
# Or from a JSON file
./start.sh --json batch_data.json
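The post does not show what goes inside those batch files, so the following is only a sketch of how a CSV or JSON batch could be loaded into a list of jobs. The column names prompt, output, and input_image, the fallback filename pattern, and the load_jobs function are all hypothetical; check the repo's README for the format the tool actually expects.

```python
import csv
import json
from pathlib import Path


def load_jobs(path):
    """Read a CSV or JSON batch file into a list of job dicts.

    Assumed (hypothetical) fields per row: "prompt" (required),
    "output" (optional), and "input_image" (optional).
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
    elif suffix == ".json":
        # Assumes the JSON file holds a list of objects.
        rows = json.loads(path.read_text(encoding="utf-8"))
    else:
        raise ValueError(f"Unsupported batch file type: {path.suffix}")

    jobs = []
    for index, row in enumerate(rows, start=1):
        prompt = (row.get("prompt") or "").strip()
        if not prompt:
            continue  # skip rows without a prompt
        jobs.append({
            "prompt": prompt,
            # Fall back to a numbered filename when no output is given.
            "output": row.get("output") or f"image_{index:03d}.png",
            "input_image": row.get("input_image") or None,
        })
    return jobs
```

Each job could then be handed to generate_text_to_image or generate_text_image_to_image, depending on whether input_image is set.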
Just a Fun Project
In the end, this was one of those perfect micro-projects. It solved a genuine (albeit minor) problem I had, provided me with a tangible reason to learn two new things (the Imagen API and PyPI publishing), and resulted in a tool that I’ll actually use again.
It may not be a grand project, but it serves as a simple pretext to familiarize myself with PyPI. It’s not intended for mastery, but rather for exploration. And, hey, it’s significantly more enjoyable than spending an entire day copying and pasting prompts.
