Automating Product Listing Translation Using Large Language Models and Structured Outputs
If you’ve been using ChatGPT or other large language models (LLMs) to help create and translate your product listings for Amazon or other e-commerce platforms, you’ve probably noticed that the results can sometimes be inconsistent. This inconsistency can be a problem — especially when you need structured data for bulk uploads or want consistent output across multiple products and languages.
In this post, I’ll show you how to use LLM APIs — such as the OpenAI API or Google’s Gemini API — to generate structured outputs that can be easily parsed and used for Amazon product listings. This approach not only improves consistency but also makes it possible to automate and scale the translation process efficiently.
We’ll walk through translating a sample dataset from English to German using Python and the Gemini API. Once you understand the workflow, you can adapt it for any product category, language, or marketplace.
The Example Dataset
Here’s an example of a simplified Amazon product listing in English for demonstration purposes:
| parent_child | item_sku | brand | item_name | bullet_point1 | bullet_point2 | bullet_point3 | bullet_point4 | bullet_point5 | product_description | color_name | size_name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| parent | TSHIRT-CLASSIC | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | | |
| child | TSHIRT-CLASSIC-RED-S | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Red | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Red, Size: S | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Red | S |
| child | TSHIRT-CLASSIC-RED-M | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Red | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Red, Size: M | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Red | M |
| child | TSHIRT-CLASSIC-RED-L | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Red | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Red, Size: L | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Red | L |
| child | TSHIRT-CLASSIC-ORN-S | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Orange | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Orange, Size: S | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Orange | S |
| child | TSHIRT-CLASSIC-ORN-M | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Orange | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Orange, Size: M | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Orange | M |
| child | TSHIRT-CLASSIC-ORN-L | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Orange | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Orange, Size: L | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Orange | L |
| child | TSHIRT-CLASSIC-YLW-S | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Yellow | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Yellow, Size: S | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Yellow | S |
| child | TSHIRT-CLASSIC-YLW-M | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Yellow | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Yellow, Size: M | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Yellow | M |
| child | TSHIRT-CLASSIC-YLW-L | My Cool Brand | My Cool Brand Men's Crewneck T-Shirt - Yellow | Material: 100% Cotton | Fit: Classic Regular Fit | Comfort: Tagless Neck Label | Quality: Durable Double Stitching | Color: Yellow, Size: L | This men's t-shirt is a staple item for any wardrobe. It offers a comfortable fit and is made from soft, breathable fabric. | Yellow | L |
Retrieving a Gemini API Key
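To access the Gemini API, we must first generate an API key through Google AI Studio or the Google Cloud Console.
Once we have the API key, it is best practice to store it in a .env file in the root directory of the project rather than hard-coding it in the script. The Gemini SDK reads the key from the GOOGLE_API_KEY environment variable when we initialize the client, so the variable in the .env file should use that name. The file itself has to be loaded into the environment first, which some editors and tools such as python-dotenv can do for us.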
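If the key is not picked up automatically in your environment, the variables from the .env file can be loaded into the process environment before the client is created. The following is a dependency-free sketch; `load_env_file` is a hypothetical helper written for this post, not part of the Gemini SDK, and it only handles simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Load simple KEY=VALUE lines from a file into os.environ (hypothetical helper)."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        # Do not overwrite variables that are already set in the environment.
        os.environ.setdefault(key, value)
        # Report the value that is now in effect for this key.
        loaded[key] = os.environ[key]
    return loaded
```

Calling `load_env_file()` once at the top of the script, before the client is initialized, should be enough for a file that contains a line such as `GOOGLE_API_KEY=...`.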
Installing the Required Libraries
First, we need to ensure that we have the necessary Python libraries installed.
One of these is the Gemini SDK, which we need to make requests to the Gemini API. This installation includes the important pydantic library, which we will use later to define our structured output format. You can find more information about the installation in the Gemini API Quick Start Guide.
Next, we install the popular pandas library to read, edit, and display our dataset.
pip install google-genai pandas
Importing the Necessary Libraries
We’ll start our script by importing the required Python libraries.
- json: Handles input/output in JSON format.
- Enum: Defines fixed sets of valid values for specific fields.
- pandas: Enables easy manipulation of tabular product data (similar to Excel).
- google.genai: Provides the Gemini API client and configuration options.
- pydantic: Defines and validates structured outputs returned by the API.
import json
from enum import Enum

import pandas as pd
from google import genai
from google.genai.types import HttpOptions
from pydantic import BaseModel
Reading the Dataset
Next, we’ll use pandas to load a sample dataset from a TSV (tab-separated) file and convert it into a list of dictionaries. Each dictionary represents one SKU or product variation. You can download the sample dataset here.
df_source = pd.read_csv(
    "https://raw.githubusercontent.com/muw78/automating-product-listing-translation-using-llms-and-structured-outputs/refs/heads/main/source_listing.tsv",
    sep="\t",
)
df_source.fillna("", inplace=True) # Replace empty cells with empty strings
source_listing = df_source.to_dict(orient="records")
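To make the shape of `source_listing` concrete, here is a small sketch that runs the same steps on a two-row, in-memory stand-in for the TSV file. The sample rows and the names `sample_tsv`, `df_sample`, and `sample_records` are illustrative assumptions; the real file has more columns.

```python
import io

import pandas as pd

# Two-row stand-in for the real TSV file (values are illustrative only).
sample_tsv = (
    "parent_child\titem_sku\tcolor_name\tsize_name\n"
    "parent\tTSHIRT-CLASSIC\t\t\n"
    "child\tTSHIRT-CLASSIC-RED-S\tRed\tS\n"
)

df_sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")
df_sample = df_sample.fillna("")  # Empty cells become empty strings instead of NaN
sample_records = df_sample.to_dict(orient="records")

# Each record is a plain dict keyed by column name, one per SKU.
for record in sample_records:
    print(record)
```

The parent row ends up with empty strings for `color_name` and `size_name`, which is what the schema defined later expects for fields that do not apply.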
Defining the Structured Output Format
To ensure the API returns data in a predictable and uniform format, we define a structured output schema using pydantic.
The schema constrains the response so that each product listing follows the same field structure, regardless of language.
Step 1: Define an Enumeration
We define an enumeration for the parent_child field, ensuring that it only accepts "parent" or "child" values.
class ParentChildEnum(str, Enum):
    PARENT = "parent"
    CHILD = "child"
Step 2: Define a SKU Model
We then create an AmazonSKU model representing a single SKU entry.
class AmazonSKU(BaseModel):
    parent_child: ParentChildEnum
    item_sku: str
    brand: str
    item_name: str
    bullet_point1: str
    bullet_point2: str
    bullet_point3: str
    bullet_point4: str
    bullet_point5: str
    product_description: str
    color_name: str
    size_name: str
Step 3: Define a Listing Model
Finally, we wrap multiple SKUs into an AmazonListing model.
class AmazonListing(BaseModel):
    skus: list[AmazonSKU]
This structure keeps the API's output consistent and machine-readable, which makes it well suited for further processing and upload to Amazon Seller Central or other e-commerce backends.
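The same kind of model can also be used locally to check a JSON string against the schema. The sketch below uses a reduced, hypothetical `MiniSKU` and `MiniListing` (three fields instead of twelve) and a hypothetical helper `validate_listing`; it assumes pydantic version 2, where `model_validate_json` is available.

```python
import json
from enum import Enum
from typing import Optional

from pydantic import BaseModel, ValidationError


class ParentChildEnum(str, Enum):
    PARENT = "parent"
    CHILD = "child"


class MiniSKU(BaseModel):
    # Reduced, hypothetical version of AmazonSKU with only three fields.
    parent_child: ParentChildEnum
    item_sku: str
    item_name: str


class MiniListing(BaseModel):
    skus: list[MiniSKU]


def validate_listing(raw_json: str) -> Optional[MiniListing]:
    """Return the parsed listing, or None if the JSON does not match the schema."""
    try:
        return MiniListing.model_validate_json(raw_json)
    except ValidationError:
        return None


valid = json.dumps(
    {"skus": [{"parent_child": "child", "item_sku": "A-1", "item_name": "Shirt"}]}
)
# "variant" is not a member of ParentChildEnum, so validation fails.
invalid = json.dumps(
    {"skus": [{"parent_child": "variant", "item_sku": "A-1", "item_name": "Shirt"}]}
)

print(validate_listing(valid) is not None)
print(validate_listing(invalid) is None)
```

A check like this can be useful as a second line of defense after the API call, for example when a response is cached or edited before it is exported.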
Generating the Prompt
Next, we create a dynamic prompt template that defines the translation task and formatting instructions.
target_language = "German"
prompt_template = """
Translate the following Amazon listing into **{target_language}**.
- CRITICAL: Do NOT translate the values for `parent_child`, `item_sku`, or `brand`.
- The `item_name` should always start with the brand name.
- Use clear, professional, and descriptive language appropriate for the product category.
```json
{source_listing}
```
"""
We fill the placeholders with the actual data and desired target language:
prompt = prompt_template.format(
    target_language=target_language,
    source_listing=json.dumps(source_listing, indent=4),
)
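Because the target language is just a placeholder, the same template can produce prompts for several marketplaces at once. The sketch below assumes a shortened stand-in template (`PROMPT_TEMPLATE`) with the same two placeholders and a hypothetical helper `build_prompts`; neither is part of the original script.

```python
import json

# Shortened stand-in for the prompt template above (assumption: same placeholders).
PROMPT_TEMPLATE = (
    "Translate the following Amazon listing into **{target_language}**.\n"
    "- CRITICAL: Do NOT translate the values for `parent_child`, `item_sku`, or `brand`.\n"
    "{source_listing}\n"
)


def build_prompts(listing: list[dict], languages: list[str]) -> dict[str, str]:
    """Return one filled-in prompt per target language."""
    # ensure_ascii=False keeps characters such as umlauts readable in the prompt.
    listing_json = json.dumps(listing, indent=4, ensure_ascii=False)
    return {
        language: PROMPT_TEMPLATE.format(
            target_language=language, source_listing=listing_json
        )
        for language in languages
    }


prompts = build_prompts(
    [{"item_sku": "A-1", "item_name": "Crewneck T-Shirt"}],
    ["German", "French"],
)
print(prompts["German"])
```

The curly braces inside the JSON are safe here because `str.format` only interprets braces in the template itself, not in the values that are substituted into it.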
Initializing the Gemini Client
Now we initialize the Gemini client.
If GOOGLE_API_KEY is available in the environment, the SDK picks up your API key automatically.
We also set a 3-minute timeout to accommodate long or complex translations.
GEMINI_TIMEOUT = 3 * 60 * 1000  # 3 minutes in milliseconds
genai_client = genai.Client(http_options=HttpOptions(timeout=GEMINI_TIMEOUT))
Sending the Request to the Gemini API
We can now send the prompt and schema to the Gemini model. The model generates structured JSON output that matches the schema we defined.
response = genai_client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[prompt],
    config={
        "response_mime_type": "application/json",
        "response_schema": AmazonListing,
        "temperature": 0.2,
    },
)
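Long requests can occasionally time out or fail for transient reasons, so it may help to wrap the call in a small retry loop. The sketch below is a generic, hypothetical helper (`call_with_retries`) that is not part of the Gemini SDK. It catches every exception for simplicity; in real code you would probably narrow this to the error types you expect.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception:
            # Re-raise on the final attempt so the caller sees the real error.
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

In this post's script, the request could then be passed in as a lambda, for example `call_with_retries(lambda: genai_client.models.generate_content(...))` with the same arguments as above.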
Parsing the Response
If the request was successful, the response contains a JSON string representing our translated listing. This string corresponds to the schema that we have defined.
We can now convert this string into a list of dictionaries that we can further inspect and export.
result_string = response.text.strip()
result = json.loads(result_string)
translated_listing = result["skus"]
To verify that the critical fields item_sku, brand, and parent_child remain unchanged during translation, we can run the following check:
all(
    (source["item_sku"], source["brand"], source["parent_child"])
    == (translated["item_sku"], translated["brand"], translated["parent_child"])
    for source, translated in zip(source_listing, translated_listing)
)  # Should return True
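A single True or False does not say which row went wrong, and `zip` silently stops at the shorter list if the model dropped or added a row. As a sketch of a more verbose check, the hypothetical helper `find_protected_field_changes` below returns a list of human-readable problems instead.

```python
PROTECTED_FIELDS = ("item_sku", "brand", "parent_child")


def find_protected_field_changes(
    source: list[dict], translated: list[dict]
) -> list[str]:
    """Describe every difference in the protected fields between two listings."""
    problems: list[str] = []
    # zip() stops at the shorter list, so a length mismatch is reported separately.
    if len(source) != len(translated):
        problems.append(
            f"row count differs: {len(source)} source vs {len(translated)} translated"
        )
    for index, (src, out) in enumerate(zip(source, translated)):
        for field in PROTECTED_FIELDS:
            if src.get(field) != out.get(field):
                problems.append(
                    f"row {index}: {field} changed from "
                    f"{src.get(field)!r} to {out.get(field)!r}"
                )
    return problems
```

An empty list means the protected fields survived the translation unchanged; anything else can be logged or used to decide whether to retry the request.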
Finally, we convert the translated listing into a pandas DataFrame so that it can be easily viewed and exported to CSV or Excel. From there, it can be uploaded to Amazon Seller Central, Shopify, or any other e-commerce platform.
df_translated = pd.DataFrame(translated_listing)
| parent_child | item_sku | brand | item_name | bullet_point1 | bullet_point2 | bullet_point3 | bullet_point4 | bullet_point5 | product_description | color_name | size_name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| child | TSHIRT-CLASSIC-RED-S | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Rot, Größe: S | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Rot | S |
| child | TSHIRT-CLASSIC-RED-M | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Rot, Größe: M | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Rot | M |
| child | TSHIRT-CLASSIC-RED-L | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Rot | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Rot, Größe: L | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Rot | L |
| child | TSHIRT-CLASSIC-ORN-S | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Orange, Größe: S | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Orange | S |
| child | TSHIRT-CLASSIC-ORN-M | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Orange, Größe: M | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Orange | M |
| child | TSHIRT-CLASSIC-ORN-L | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Orange | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Orange, Größe: L | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Orange | L |
| child | TSHIRT-CLASSIC-YLW-S | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Gelb, Größe: S | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Gelb | S |
| child | TSHIRT-CLASSIC-YLW-M | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Gelb, Größe: M | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Gelb | M |
| child | TSHIRT-CLASSIC-YLW-L | My Cool Brand | My Cool Brand Herren T-Shirt mit Rundhalsausschnitt - Gelb | Material: 100% Baumwolle | Passform: Klassischer Regular Fit | Komfort: Etikettenloses Nackenlabel | Qualität: Strapazierfähige Doppelnähte | Farbe: Gelb, Größe: L | Dieses Herren-T-Shirt ist ein unverzichtbarer Bestandteil jeder Garderobe. Es bietet eine bequeme Passform und ist aus weichem, atmungsaktivem Stoff gefertigt. | Gelb | L |
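For the export step, a tab-separated file mirrors the format of the source dataset. The sketch below writes a one-row, illustrative stand-in for the translated listing to an in-memory buffer; the names `translated_sample`, `df_out`, and `tsv_text` are assumptions for this example.

```python
import io

import pandas as pd

# Illustrative single-row stand-in for translated_listing.
translated_sample = [
    {"item_sku": "TSHIRT-CLASSIC-RED-S", "color_name": "Rot", "size_name": "S"},
]
df_out = pd.DataFrame(translated_sample)

# Write to an in-memory buffer here; pass a file path instead to save to disk.
buffer = io.StringIO()
df_out.to_csv(buffer, sep="\t", index=False)  # index=False omits the row numbers
tsv_text = buffer.getvalue()
print(tsv_text)
```

When writing to a file that will be opened in Excel, passing `encoding="utf-8-sig"` to `to_csv` is a common way to keep umlauts and other non-ASCII characters intact.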
Conclusion
In this post, we explored how to use large language models like Gemini to automate the translation of Amazon product listings into structured, consistent formats. By defining a clear schema and using pydantic, we ensured that the output is predictable and easy to integrate with e-commerce platforms. This approach can be adapted and scaled easily to process large catalogs and products with more complex attributes.
A Jupyter Notebook with the complete code is available on GitHub.