Episode 1: Tinkering with Image Generation from Text with the OpenAI API

This post is a companion to the YouTube video DALL-E Text to Image Generation.

We will be exploring image generation using the DALL-E API. We'll be using Python to write our code.

To get started, sign up with OpenAI on the OpenAI Home page.

Once registered, it's a good idea to read through the GETTING STARTED introduction, including the Quickstart tutorial.

Next, generate an API key and save it to a text file on your local machine, named something like OPENAI_API_KEY.txt. Don't lose this file! Your code will need to point to it.
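As a rough sketch of what "pointing to this file" can look like, the helper below reads the key out of the text file. The name read_api_key is made up for illustration and is not part of the openai library. The GUI listing in this post takes a different route and hands the file path to openai.api_key_path instead.

```python
from pathlib import Path


def read_api_key(key_path):
    """Return the API key stored in a text file, without surrounding whitespace.

    read_api_key is a hypothetical helper for this post, not an openai function.
    """
    key = Path(key_path).read_text(encoding="utf-8").strip()
    if not key:
        # An empty file would otherwise only show up later as an authentication error.
        raise ValueError(f"No API key found in {key_path}")
    return key
```

With the key in hand, you could assign it with openai.api_key = read_api_key("OPENAI_API_KEY.txt") before making any requests.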

If you don't have Python installed on your computer, install it first. For Windows users, the easiest route is the Windows Store, for example Python 3.10 from the Python Software Foundation.

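Once you have Python installed, install the OpenAI API library from the command console:

$ pip install openai

Note that the code listing below calls openai.Image.create, which exists only in library versions before 1.0. If the command above installs a newer release, pinning an older one, for example with pip install "openai<1", should keep the listing working.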

To avoid editing the Python code and re-running it for every new prompt or parameter, I created a simple GUI front-end for the OpenAI API. The code listing is found below.

You will also need to install the following library for the listed code to work:

$ pip install PySimpleGUI

This is required for the GUI, which provides a convenient user interface.

For the rest, watch the video and follow along.

I had hoped to offer a Windows executable, generated using pyinstaller on the Python code, so that you could get straight to tinkering without the bother of installing Python. Sorry, it turns out I cannot share an executable. You can build one yourself the same way for convenience if desired; just note that Windows will likely complain about running an unsigned executable.
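If you would like to build the executable yourself, here is a minimal sketch that drives PyInstaller from Python. It assumes PyInstaller is installed (pip install pyinstaller) and that the listing is saved as DALL_gui.py in the current folder. The two function names are made up for illustration, and the options chosen here (a single-file build with no console window) are one reasonable guess, not necessarily the settings used for the original executable.

```python
def pyinstaller_args(script="DALL_gui.py"):
    """Return PyInstaller options for a single-file build without a console window."""
    return [script, "--onefile", "--windowed"]


def build_executable(script="DALL_gui.py"):
    """Run PyInstaller on the script; the result is written to the dist folder."""
    # Imported here so this module still loads when PyInstaller is not installed.
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script))
```

Calling build_executable() from a Python prompt in the folder containing DALL_gui.py should produce the executable under dist.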

# DALL_gui.py
# GUI interface to DALL-E api
# --- Tinker Foundry ---
import datetime
import PySimpleGUI as sg
import os.path
import openai

# we're using PySimpleGui
sg.theme('SandyBeach') # Window theme

# define api parameter choices
size_choices = ('256x256', '512x512', '1024x1024')
images_choices = (1,2,3,4,5,6,7,8,9,10)
layout = [
         [sg.Text('Enter DALL-E Image Creation Input Parameters')], 
         [sg.Text('OpenAI Key Path', size = (15,1)), sg.Input(expand_x=True, key='keypath'), sg.FileBrowse()],
         [sg.Text('Save Output File', size = (15,1)), sg.Input(expand_x=True, key='outfilepath'), sg.FileSaveAs()],
         [sg.Text('Prompt', size = (15,1)), sg.Multiline(expand_x=True, expand_y=True, key='prompt')],
         [sg.Text('Images Requested', size = (15,1)), sg.Combo(images_choices, size = (15, len(images_choices)), default_value = 1, readonly = True, key='n')],
         [sg.Text('Image Resolution', size = (15,1)), sg.Combo(size_choices, size=(15, len(size_choices)), default_value = '256x256', readonly = True, key='resolution')],
         [sg.Button('Generate')],
         [sg.HSeparator()],
         [sg.Text('Response output:')],
         [sg.Multiline(expand_x=True, expand_y=True, key='result')],
         [sg.HSeparator()]
]

window = sg.Window('DALL_Gui Image Generation Control Panel', layout, resizable=True)

while True:  # The Event Loop
    event, values = window.read() 
    #print(event, values['keypath'], values['outfilepath'], values['prompt'], values['n'], values['resolution'])
    if event == sg.WIN_CLOSED or event == 'Exit':
        break
    if event == 'Generate':
        window['result'].print("Submitting image creation request to OpenAI with the following parameters:")
        window['result'].print("Prompt: ", values['prompt'])
        window['result'].print("Number of images: ", values['n'])
        window['result'].print("Image resolution: ", values['resolution'])
        window['result'].print("")
        n_images = int(values['n'])  # the Combo value may come back as a string
        timestamp = datetime.datetime.now().strftime("%m/%d/%Y, %H:%M:%S")
        openai.api_key_path = values['keypath']

        # the with-block closes the output file even when the request fails
        with open(values['outfilepath'], "a") as f:
            # record a timestamp and the request parameters
            f.write("-----------------------------\n")
            f.write(timestamp + "\n")
            f.write("Submitting image creation request to OpenAI with the following parameters:\n")
            f.write("Prompt:\n" + values['prompt'] + "\n")
            f.write("Number of images: " + str(n_images) + "\n")
            f.write("Image resolution: " + values['resolution'] + "\n")

            try:
                # openai.Image.create is the interface of openai library versions before 1.0
                response = openai.Image.create(
                    prompt=values['prompt'],
                    n=n_images,
                    size=values['resolution']
                )
                window['result'].print("Response:")
                f.write("Response:\n")

                # go through the response URLs, print them and write them to the output file
                for idx, item in enumerate(response['data']):
                    window['result'].print("URL for Image #", idx, ":")
                    window['result'].print(item['url'])
                    window['result'].print("")
                    f.write(item['url'] + "\n")

            except openai.error.OpenAIError as e:
                # show and log the reason for the failure instead of a bare "Error"
                window['result'].print("Error: ", e)
                f.write("Error: " + str(e) + "\n")
window.close()
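
The file-logging part of the listing is a long run of write calls. One way to tidy it, sketched here under the assumption that you want the same text layout as above, is to build the whole log entry as a single string first. The function name format_log_entry and its parameters are made up for illustration.

```python
import datetime


def format_log_entry(prompt, n, resolution, urls, when=None):
    """Build the text block that the GUI appends to the output file.

    format_log_entry is a hypothetical helper; 'when' defaults to the current time.
    """
    when = when or datetime.datetime.now()
    lines = [
        "-----------------------------",
        when.strftime("%m/%d/%Y, %H:%M:%S"),
        "Submitting image creation request to OpenAI with the following parameters:",
        "Prompt:",
        prompt,
        f"Number of images: {n}",
        f"Image resolution: {resolution}",
        "Response:",
    ]
    lines.extend(urls)  # one image URL per line
    return "\n".join(lines) + "\n"
```

In the event loop you would collect the URLs from the response and then write the entry in one go, which also makes the log format easy to check without touching the GUI or the API.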
