by Markus U. Wahl

Automating Product Listing Translation Using Large Language Models and Structured Outputs

If you’ve been using ChatGPT or other large language models (LLMs) to help create and translate your product listings for Amazon or other e-commerce platforms, you’ve probably noticed that the results can sometimes be inconsistent. This inconsistency can be a problem — especially when you need structured data for bulk uploads or want consistent output across multiple products and languages.

In this post, I’ll show you how to use LLM APIs — such as the OpenAI API or Google’s Gemini API — to generate structured outputs that can be easily parsed and used for Amazon product listings. This approach not only improves consistency but also makes it possible to automate and scale the translation process efficiently.

We’ll walk through translating a sample dataset from English to German using Python and the Gemini API. Once you understand the workflow, you can adapt it for any product category, language, or marketplace.


The Example Dataset

Here’s an example of a simplified Amazon product listing in English for demonstration purposes:

parent_child item_sku brand item_name bullet_point1 bullet_point2 bullet_point3 bullet_point4 bullet_point5 product_description color_name size_name
parent TSHIRT-CLASSIC My Cool Brand My Cool Brand Men's Crewneck T-Shirt Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric.
child TSHIRT-CLASSIC-RED-S My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Red Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Red, Size: S This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Red S
child TSHIRT-CLASSIC-RED-M My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Red Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Red, Size: M This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Red M
child TSHIRT-CLASSIC-RED-L My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Red Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Red, Size: L This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Red L
child TSHIRT-CLASSIC-ORN-S My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Orange Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Orange, Size: S This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Orange S
child TSHIRT-CLASSIC-ORN-M My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Orange Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Orange, Size: M This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Orange M
child TSHIRT-CLASSIC-ORN-L My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Orange Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Orange, Size: L This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Orange L
child TSHIRT-CLASSIC-YLW-S My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Yellow Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Yellow, Size: S This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Yellow S
child TSHIRT-CLASSIC-YLW-M My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Yellow Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Yellow, Size: M This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Yellow M
child TSHIRT-CLASSIC-YLW-L My Cool Brand My Cool Brand Men's Crewneck T-Shirt - Yellow Material: 100% Cotton Fit: Classic Regular Fit Comfort: Tagless Neck Label Quality: Durable Double Stitching Color: Yellow, Size: L This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. Yellow L

Retrieving a Gemini API Key

To access the Gemini API, we first must generate an API key through Google AI Studio or the Google Cloud Console.

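Once we have the API key, it is best practice to keep it out of the source code, for example in a .env file in the root directory of the project. The Gemini SDK reads the key from the GOOGLE_API_KEY (or GEMINI_API_KEY) environment variable when we initialize the client, so the contents of the .env file must be loaded into the environment first. Some editors and notebook environments do this automatically; otherwise, a package such as python-dotenv can do it.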
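The SDK looks for the key in the process environment, so something has to load the .env file: your editor, the python-dotenv package, or a few lines of your own code. Below is a minimal sketch of the last option using only the standard library. The helper name load_env_file is hypothetical, and it handles only simple KEY=VALUE lines, not the full .env syntax.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy simple KEY=VALUE pairs from a .env-style file into os.environ.

    Hypothetical helper for illustration: it skips blank lines and comments,
    strips surrounding quotes, and never overwrites variables that are
    already set in the environment.
    """
    env_path = Path(path)
    if not env_path.exists():
        return {}

    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded
```

Calling load_env_file() once at the top of the script, before the client is created, should be enough for a file that contains a line such as GOOGLE_API_KEY=your-key.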


Installing the Required Libraries

First, we need to ensure that we have the necessary Python libraries installed.

One of these is the Gemini SDK, which we need to make requests to the Gemini API. This installation includes the important pydantic library, which we will use later to define our structured output format. You can find more information about the installation in the Gemini API Quick Start Guide.

Next, we install the popular pandas library to read, edit, and display our dataset.

pip install google-genai pandas

Importing the Necessary Libraries

We’ll start our script by importing the required Python libraries.

  • json: Handles input/output in JSON format.
  • Enum: Defines fixed sets of valid values for specific fields.
  • pandas: Enables easy manipulation of tabular product data (similar to Excel).
  • google.genai: Provides the Gemini API client and configuration options.
  • pydantic: Defines and validates structured outputs returned by the API.

import json
from enum import Enum

import pandas as pd
from google import genai
from google.genai.types import HttpOptions
from pydantic import BaseModel

Reading the Dataset

Next, we’ll use pandas to load a sample dataset from a TSV (tab-separated) file and convert it into a list of dictionaries. Each dictionary represents one SKU or product variation. You can download the sample dataset here.

df_source = pd.read_csv(
    "https://raw.githubusercontent.com/muw78/automating-product-listing-translation-using-llms-and-structured-outputs/refs/heads/main/source_listing.tsv",
    sep="\t",
)
df_source.fillna("", inplace=True)  # Replace empty cells with empty strings
source_listing = df_source.to_dict(orient="records")
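One thing to watch when loading listing files with pandas is type inference: by default, numeric-looking values are converted to numbers and strings such as "NA" are treated as missing. The sample dataset is not affected, but other catalogs might be (for example, shoe sizes like 42). The sketch below uses a hypothetical two-row sample to show an alternative that keeps every cell as a string; whether you need it depends on your data.

```python
import io

import pandas as pd

# Hypothetical sample with an empty cell and a numeric-looking size.
sample_tsv = (
    "parent_child\titem_sku\tcolor_name\tsize_name\n"
    "parent\tSHOE-CLASSIC\t\t\n"
    "child\tSHOE-CLASSIC-RED-42\tRed\t42\n"
)

df_sample = pd.read_csv(
    io.StringIO(sample_tsv),
    sep="\t",
    dtype=str,              # keep "42" as text instead of the integer 42
    keep_default_na=False,  # read empty cells as "" rather than NaN
)
sample_records = df_sample.to_dict(orient="records")
```

With these two options the separate fillna("") step becomes unnecessary, because empty cells already arrive as empty strings.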

Defining the Structured Output Format

To ensure the API returns data in a predictable and uniform format, we define a structured output schema using pydantic. This guarantees that each product listing follows the same field structure, regardless of language.

Step 1: Define an Enumeration

We define an enumeration for the parent_child field, ensuring that it only accepts "parent" or "child" values.

class ParentChildEnum(str, Enum):
    PARENT = "parent"
    CHILD = "child"

Step 2: Define a SKU Model

We then create an AmazonSKU model representing a single SKU entry.

class AmazonSKU(BaseModel):
    parent_child: ParentChildEnum
    item_sku: str
    brand: str
    item_name: str
    bullet_point1: str
    bullet_point2: str
    bullet_point3: str
    bullet_point4: str
    bullet_point5: str
    product_description: str
    color_name: str
    size_name: str

Step 3: Define a Listing Model

Finally, we wrap multiple SKUs into an AmazonListing model.

class AmazonListing(BaseModel):
    skus: list[AmazonSKU]

This structure ensures that the API’s output is consistent and machine-readable — perfect for direct integration with Amazon Seller Central or other e-commerce backends.
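To get a feel for what the schema enforces, the models can be tried out locally before any API call. The following sketch repeats the three classes so that it runs on its own, then validates one made-up SKU and one with an invalid parent_child value. It assumes pydantic version 2 (model_validate); the sample values are illustrative only.

```python
from enum import Enum

from pydantic import BaseModel, ValidationError


class ParentChildEnum(str, Enum):
    PARENT = "parent"
    CHILD = "child"


class AmazonSKU(BaseModel):
    parent_child: ParentChildEnum
    item_sku: str
    brand: str
    item_name: str
    bullet_point1: str
    bullet_point2: str
    bullet_point3: str
    bullet_point4: str
    bullet_point5: str
    product_description: str
    color_name: str
    size_name: str


class AmazonListing(BaseModel):
    skus: list[AmazonSKU]


# Made-up SKU used only to exercise the schema.
sample_sku = {
    "parent_child": "child",
    "item_sku": "TSHIRT-CLASSIC-RED-S",
    "brand": "My Cool Brand",
    "item_name": "My Cool Brand Men's Crewneck T-Shirt - Red",
    "bullet_point1": "Material: 100% Cotton",
    "bullet_point2": "Fit: Classic Regular Fit",
    "bullet_point3": "Comfort: Tagless Neck Label",
    "bullet_point4": "Quality: Durable Double Stitching",
    "bullet_point5": "Color: Red, Size: S",
    "product_description": "This men's t-shirt is a staple item.",
    "color_name": "Red",
    "size_name": "S",
}

# A well-formed listing is accepted and converted into model instances.
listing = AmazonListing.model_validate({"skus": [sample_sku]})

# A value outside the enumeration is rejected with a ValidationError.
invalid_sku = {**sample_sku, "parent_child": "sibling"}
try:
    AmazonListing.model_validate({"skus": [invalid_sku]})
    rejected = False
except ValidationError:
    rejected = True
```

Note that the schema constrains the shape and types of the output, not its content: a model could still return a translated brand name as a perfectly valid string, which is why a content check remains useful.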


Generating the Prompt

Next, we create a dynamic prompt template that defines the translation task and formatting instructions.

target_language = "German"

prompt_template = """
Translate the following Amazon listing into **{target_language}**.

- CRITICAL: Do NOT translate the values for `parent_child`, `item_sku`, or `brand`.
- The `item_name` should always start with the brand name.
- Use clear, professional, and descriptive language appropriate for the product category.

```json
{source_listing}
```
"""

We fill the placeholders with the actual data and desired target language:

prompt = prompt_template.format(
    target_language=target_language,
    source_listing=json.dumps(source_listing, indent=4),
)
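A small detail matters when the source listing contains non-ASCII characters, for example when translating from German into English: by default, json.dumps escapes such characters as \uXXXX sequences. The sketch below compares the default with ensure_ascii=False on a made-up record. Both forms decode to the same data; the unescaped one is simply easier to read when inspecting the prompt, and whether it affects translation quality is not something this example establishes.

```python
import json

# Made-up record containing German umlauts.
sample = [{"bullet_point4": "Qualität: Strapazierfähige Doppelnähte"}]

escaped = json.dumps(sample, indent=4)                       # "Qualit\u00e4t: ..."
readable = json.dumps(sample, indent=4, ensure_ascii=False)  # "Qualität: ..."

# Both strings represent exactly the same data.
same_data = json.loads(escaped) == json.loads(readable)
```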

Initializing the Gemini Client

Now we initialize the Gemini client. If GOOGLE_API_KEY is set in the environment (for example, loaded from the .env file), the SDK picks up the API key automatically.

We also set a 3-minute timeout to accommodate long or complex translations.

GEMINI_TIMEOUT = 3 * 60 * 1000  # 3 minutes in milliseconds
genai_client = genai.Client(http_options=HttpOptions(timeout=GEMINI_TIMEOUT))

Sending the Request to the Gemini API

We can now send the prompt and schema to the Gemini model. The model generates structured JSON output that matches the schema we defined.

response = genai_client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[prompt],
    config={
        "response_mime_type": "application/json",
        "response_schema": AmazonListing,
        "temperature": 0.2,
    },
)
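Network calls can fail for temporary reasons such as timeouts or rate limits, so for larger catalogs it may be worth wrapping the request in a retry loop. The following is a generic sketch with exponential backoff. The helper call_with_retries is hypothetical and not part of the Gemini SDK, and it catches every exception for brevity; in practice you would probably narrow this to the errors you consider retryable.

```python
import time


def call_with_retries(request_fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request_fn until it succeeds or max_attempts is reached.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, and so on
    between attempts. The sleep function is injectable so the helper can be
    exercised without real waiting.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except Exception as error:  # deliberately broad in this sketch
            last_error = error
            if attempt == max_attempts:
                break
            sleep(base_delay * 2 ** (attempt - 1))
    raise last_error
```

It could be used by passing the request as a callable, for example call_with_retries(lambda: genai_client.models.generate_content(...)) with the same arguments as above.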

Parsing the Response

If the request was successful, the response contains a JSON string representing our translated listing. This string corresponds to the schema that we have defined.

We can now convert this string into a list of dictionaries that we can further inspect and export.

result_string = response.text.strip()
result = json.loads(result_string)
translated_listing = result["skus"]
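The three lines above assume that the response contains text and that the text is valid JSON with a "skus" key. A slightly more defensive variant could look like the sketch below. The function parse_listing is a hypothetical helper, and the empty-text case is an assumption about what an unsuccessful request might return.

```python
import json


def parse_listing(response_text):
    """Return the list of SKU dictionaries from a JSON response string.

    Raises ValueError with a descriptive message if the text is empty,
    is not valid JSON, or lacks a "skus" list.
    """
    if not response_text:
        raise ValueError("The response contained no text.")
    try:
        result = json.loads(response_text.strip())
    except json.JSONDecodeError as error:
        raise ValueError(f"Response is not valid JSON: {error}") from error

    skus = result.get("skus") if isinstance(result, dict) else None
    if not isinstance(skus, list):
        raise ValueError('Expected a JSON object with a "skus" list.')
    return skus
```

With this helper, the parsing step would become translated_listing = parse_listing(response.text).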

To verify that the critical fields item_sku, brand, and parent_child remain unchanged during translation, we can run the following check:

all(
    (source["item_sku"], source["brand"], source["parent_child"])
    == (translated["item_sku"], translated["brand"], translated["parent_child"])
    for source, translated in zip(source_listing, translated_listing)
) # Should return True
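One limitation of this check is that zip stops at the shorter of the two lists, so a response with missing rows could still pass, and a plain True or False does not say which row changed. The sketch below is a more talkative variant; the function name and the message format are hypothetical.

```python
PROTECTED_FIELDS = ("item_sku", "brand", "parent_child")


def find_protected_field_changes(source_rows, translated_rows, fields=PROTECTED_FIELDS):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []

    # zip() would silently ignore surplus rows, so compare the lengths first.
    if len(source_rows) != len(translated_rows):
        problems.append(
            f"Row count differs: {len(source_rows)} source "
            f"vs. {len(translated_rows)} translated"
        )

    for index, (source, translated) in enumerate(zip(source_rows, translated_rows)):
        for field in fields:
            if source.get(field) != translated.get(field):
                problems.append(
                    f"Row {index}: {field} changed from "
                    f"{source.get(field)!r} to {translated.get(field)!r}"
                )
    return problems
```

Calling find_protected_field_changes(source_listing, translated_listing) should return an empty list when the translation left the protected fields untouched.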

Finally, we convert the translated listing into a Pandas DataFrame so that it can be easily viewed and exported to CSV or Excel. It can then be uploaded to Amazon Seller Central, Shopify, or any other e-commerce platform.

df_translated = pd.DataFrame(translated_listing)

parent_child item_sku brand item_name bullet_point1 bullet_point2 bullet_point3 bullet_point4 bullet_point5 product_description color_name size_name
parent TSHIRT-CLASSIC-RED-S My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Rot, Größe: S Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Rot S
child TSHIRT-CLASSIC-RED-M My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Rot, Größe: M Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Rot M
child TSHIRT-CLASSIC-RED-L My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Rot, Größe: L Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Rot L
parent TSHIRT-CLASSIC-ORN-S My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Orange, Größe: S Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Orange S
child TSHIRT-CLASSIC-ORN-M My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Orange, Größe: M Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Orange M
child TSHIRT-CLASSIC-ORN-L My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Orange, Größe: L Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Orange L
parent TSHIRT-CLASSIC-YLW-S My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Gelb, Größe: S Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Gelb S
child TSHIRT-CLASSIC-YLW-M My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Gelb, Größe: M Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Gelb M
child TSHIRT-CLASSIC-YLW-L My Cool Brand My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb Material: 100% Baumwolle Passform: Klassischer Regular Fit Komfort: Etikettenloses Nackenlabel Qualität: Strapazierfähige Doppelnähte Farbe: Gelb, Größe: L Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. Gelb L
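The export step mentioned above could be sketched as follows. The example writes a made-up two-row listing to an in-memory buffer as tab-separated text so that it runs without touching the file system; the column subset and values are illustrative only.

```python
import io

import pandas as pd

# Made-up rows with a reduced set of columns.
translated_rows = [
    {"item_sku": "TSHIRT-CLASSIC-RED-S", "color_name": "Rot", "size_name": "S"},
    {"item_sku": "TSHIRT-CLASSIC-YLW-M", "color_name": "Gelb", "size_name": "M"},
]
df_export = pd.DataFrame(translated_rows)

# index=False leaves out the DataFrame's row numbers, which are not part of the listing.
buffer = io.StringIO()
df_export.to_csv(buffer, sep="\t", index=False)
tsv_text = buffer.getvalue()
```

To write a file instead, pass a path such as "translated_listing.tsv" in place of the buffer. If the file is meant to be opened in Excel, encoding="utf-8-sig" may help umlauts display correctly.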

Conclusion

In this post, we explored how to use large language models like Gemini to automate the translation of Amazon product listings into structured, consistent formats. By defining a clear schema and using pydantic, we ensured that the output is predictable and easy to integrate with e-commerce platforms. This approach can be adapted and scaled easily to process large catalogs and products with more complex attributes.

A Jupyter Notebook with the complete code is available on GitHub.

Need help implementing custom LLM solutions for your Amazon business? Let's discuss how I can help you automate your product listing translations and scale your multilingual operations.