Local LLMs Let Hobbyist Prototype Dream Game on RTX 4070 Ti

On a quiet Saturday, a hobbyist developer turned a plain RTX 4070 Ti into a sandbox for a wedding simulator, producing a fully playable text‑based prototype that runs entirely offline. The project shows how running open‑weight large language models locally can lower the barrier for individual game designers.

The first step was installing Ollama, a lightweight platform that manages local LLMs. Through Ollama, the developer downloaded a quantized version of Google DeepMind’s Gemma 4 (e4b). The model fits comfortably in the 12 GB of VRAM on the RTX 4070 Ti, and in a side‑by‑side test against Alibaba’s Qwen 3.5, Gemma 4 delivered faster responses and higher‑quality dialogue for the game’s narrative.
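A side‑by‑side test of this kind can be sketched in a few lines of Python. This is a minimal sketch, not the developer's actual harness: the helper names `compare_models` and `fastest` are hypothetical, and the `generate` argument stands in for whatever function sends a prompt to a locally hosted model. It measures latency only; dialogue quality still has to be judged by reading the replies.

```python
import time
from typing import Callable, Dict, List


def compare_models(
    generate: Callable[[str, str], str],
    models: List[str],
    prompt: str,
) -> Dict[str, dict]:
    """Send the same prompt to each model and record latency and the reply.

    `generate(model, prompt)` is a hypothetical callable supplied by the
    caller, e.g. a thin wrapper around a local LLM server.
    """
    results = {}
    for model in models:
        start = time.perf_counter()
        reply = generate(model, prompt)
        elapsed = time.perf_counter() - start
        results[model] = {"seconds": elapsed, "reply": reply}
    return results


def fastest(results: Dict[str, dict]) -> str:
    """Return the model name with the lowest recorded latency."""
    return min(results, key=lambda name: results[name]["seconds"])


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without any model installed.
    def fake_generate(model: str, prompt: str) -> str:
        return f"[{model}] reply to: {prompt}"

    out = compare_models(fake_generate, ["model-a", "model-b"], "Greet the guests.")
    for name, row in out.items():
        print(f"{name}: {row['seconds']:.4f}s  {row['reply']}")
```

A single prompt is a weak benchmark; averaging over several representative narrative prompts would give a steadier comparison.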

Before writing any code, the developer built a spreadsheet database of characters, events, and decision points. Using Python 3.12.10 and the OpenPyXL 3.1.5 library, the spreadsheet was parsed into data structures that the game engine could consume. The game itself runs on Pygame 2.6.1, a popular Python library for 2‑D game development.
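Parsing a spreadsheet into game data might look roughly like the following. It is a sketch under stated assumptions: the article does not describe the workbook layout, so the code assumes each sheet has a header row followed by one record per row, and the function names, file name, and sheet name are hypothetical. The OpenPyXL calls used (`load_workbook` and `iter_rows(values_only=True)`) are part of the library.

```python
from typing import Any, Dict, Iterable, List, Sequence


def rows_to_records(rows: Iterable[Sequence[Any]]) -> List[Dict[str, Any]]:
    """Turn a header row plus data rows into a list of dicts, skipping blank rows."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return []
    # Columns without a header get an empty key and are dropped below.
    keys = [str(h).strip() if h is not None else "" for h in header]
    records = []
    for row in rows:
        if all(cell is None for cell in row):
            continue  # ignore fully empty spreadsheet rows
        records.append({k: v for k, v in zip(keys, row) if k})
    return records


def load_sheet(path: str, sheet_name: str) -> List[Dict[str, Any]]:
    """Read one worksheet into records using OpenPyXL."""
    # Imported here so rows_to_records can be used without OpenPyXL installed.
    from openpyxl import load_workbook

    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        ws = wb[sheet_name]
        return rows_to_records(ws.iter_rows(values_only=True))
    finally:
        wb.close()
```

With a hypothetical workbook, `load_sheet("game_data.xlsx", "characters")` would return one dictionary per character, keyed by the column headers.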

To turn the data into interactive gameplay, the developer fed a clear description of the desired loop (three turns per session, decision points, and narrative branching) into Claude Opus 4.8. The model produced Python code that wired the database to Pygame’s event loop, yielding a playable text adventure that can be restarted repeatedly and makes no cloud calls while it runs.
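The generated code itself is not published, but the loop described here can be illustrated with a small sketch. Everything below is an assumption made for illustration: the `Session` class, the layout of `scenes`, the key bindings, and the window size are hypothetical. Only the three‑turn limit, the decision points, and the branching come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

TURNS_PER_SESSION = 3  # the article describes three turns per session


@dataclass
class Session:
    """Tracks one playthrough: current scene, turn count, and choices made.

    `scenes` maps a scene id to (text, [(option label, next scene id), ...]).
    This layout is assumed for the sketch, not taken from the real game.
    """
    scenes: Dict[str, Tuple[str, List[Tuple[str, str]]]]
    current: str = "start"
    turn: int = 0
    history: List[str] = field(default_factory=list)

    def options(self) -> List[Tuple[str, str]]:
        return self.scenes[self.current][1]

    @property
    def finished(self) -> bool:
        return self.turn >= TURNS_PER_SESSION or not self.options()

    def choose(self, index: int) -> None:
        """Apply the option at `index`, following the branch and counting the turn."""
        if self.finished:
            return
        label, next_scene = self.options()[index]
        self.history.append(label)
        self.current = next_scene
        self.turn += 1


def run(session: Session) -> None:
    """Minimal Pygame front end: number keys pick an option, R restarts."""
    import pygame  # imported lazily so Session can be tested without a display

    pygame.init()
    screen = pygame.display.set_mode((800, 400))
    font = pygame.font.Font(None, 28)
    clock = pygame.time.Clock()
    start = session.current
    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
            elif event.type == pygame.KEYDOWN:
                if event.key == pygame.K_r:
                    session.current, session.turn = start, 0
                    session.history.clear()
                else:
                    index = event.key - pygame.K_1  # key "1" maps to option 0
                    if 0 <= index < len(session.options()):
                        session.choose(index)
        text, options = session.scenes[session.current]
        lines = [text]
        if session.finished:
            lines.append("Session over. Press R to restart.")
        else:
            lines += [f"{i + 1}. {label}" for i, (label, _) in enumerate(options)]
        screen.fill((0, 0, 0))
        for row, line in enumerate(lines):
            screen.blit(font.render(line, True, (255, 255, 255)), (20, 20 + 30 * row))
        pygame.display.flip()
        clock.tick(30)
    pygame.quit()
```

Keeping the turn and branching logic in `Session`, separate from the rendering in `run`, means the rules can be exercised without opening a window.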

A second version of the game was built for the developer’s fiancée. While the core world and characters stayed the same, the narrative perspective and decision options shifted to reflect a bride’s viewpoint. The developer notes that the same underlying world behaves differently when the protagonist changes.
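One way to get two perspectives from a single world is to store the shared event once and attach role‑specific narration and options to it. The sketch below is purely illustrative: the article does not say how the two versions are structured, and the event record, its field names, and the `view_for` helper are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical event record: one shared scene with per-role narration and options.
EVENT: Dict[str, Any] = {
    "id": "seating_chart",
    "narration": {
        "groom": "Your mother hands you the seating chart.",
        "bride": "Your sister asks who should sit at the head table.",
    },
    "options": [
        {"label": "Ask the planner for advice", "roles": ["groom", "bride"]},
        {"label": "Let your mother decide", "roles": ["groom"]},
        {"label": "Redraw the chart yourself", "roles": ["bride"]},
    ],
}


def view_for(event: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return the narration and option labels visible to the given protagonist."""
    return {
        "id": event["id"],
        "narration": event["narration"][role],
        "options": [o["label"] for o in event["options"] if role in o["roles"]],
    }


if __name__ == "__main__":
    for role in ("groom", "bride"):
        print(role, view_for(EVENT, role))
```

Because both views read from the same record, a change to the shared world shows up in both versions of the game.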

The prototype remains in early stages. It is a proof‑of‑concept rather than a finished product. Currently the game relies on text output, but the developer has already experimented with integrating Stable Diffusion to generate low‑poly artwork between scenes, and with AI‑generated sound effects and background music.

Running the LLM locally eliminates subscription fees that would otherwise be incurred with cloud‑based services. The developer can iterate on dialogue and mechanics freely, restarting the simulation as many times as needed.

Gemma 4 was released in April 2026 as an open‑weight model derived from Google’s Gemini research. The e4b variant is specifically designed for on‑device deployment, making it suitable for consumer GPUs such as the RTX 4070 Ti.

Ollama, the platform used to host the model, provides a command‑line interface, a local REST API, and integration tools for coding assistants. The developer’s setup is fully local: the LLM, the game engine, and the data processing all run on the same machine.
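A call to the local REST API can be made with nothing but the Python standard library. The sketch below assumes Ollama is listening on its default address, `http://localhost:11434`, and uses its `/api/generate` endpoint with streaming turned off. The function names are hypothetical, and the model tag in the usage note is an assumption that should be checked against the output of `ollama list`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires Ollama to be running)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `generate("gemma4:e4b", "Describe the wedding venue in two sentences.")` would return the model's reply as a string; the tag `gemma4:e4b` is a guess at how the model is named locally.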

The project is intended as a personal experiment rather than a commercial release. The developer states that the goal is to bring an idea that had existed for years into a playable form.

This work illustrates how hobbyists can leverage recent advances in open‑weight AI to prototype game concepts without the need for large cloud budgets. By combining a local LLM, a spreadsheet‑based database, and a lightweight game engine, the project suggests that even branching narrative systems can be assembled quickly and at low cost.

At present, the game remains a text‑only experience. Future iterations may incorporate more sophisticated visuals and audio, but the core proof‑of‑concept demonstrates the feasibility of local AI‑assisted game development.

The developer’s experiment underscores the growing accessibility of AI tools for independent creators. With models like Gemma 4 and platforms such as Ollama, individual developers can experiment with large‑scale language generation on commodity hardware, opening new possibilities for personal and experimental game projects.