Build an Image-to-Text Extractor in C# Using OpenAI Vision API (.NET Console Application)

Need to extract text from images using C#? In this tutorial, you'll build a simple Image to Text Generator using the OpenAI Vision API and .NET. The application reads an image, sends it to OpenAI, and returns all visible text from the image.

Prerequisites

Visual Studio 2022
.NET 8 SDK
OpenAI API Key
Basic knowledge of C#

Project Structure

ImageToTextGenerator
│
├── Services
│   ├── IImageToTextService.cs
│   └── ImageToTextService.cs
│
├── Program.cs
├── appsettings.json
└── ImageToTextGenerator.csproj

Step 1: Create Interface (IImageToTextService.cs)

Create a new interface inside the Services folder.

namespace ImageToTextGenerator.Services;

public interface IImageToTextService
{
    Task<string> ExtractTextAsync(
        string imagePath,
        CancellationToken cancellationToken = default);
}

Step 2: Create ImageToTextService

Create a new class named ImageToTextService.cs inside the Services folder.

using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Text.Json;
using Microsoft.Extensions.Configuration;

namespace ImageToTextGenerator.Services;

public class ImageToTextService : IImageToTextService
{
    private readonly HttpClient _httpClient;
    private readonly string _modelId;
    private readonly string _apiKey;

    public ImageToTextService(
        HttpClient httpClient,
        IConfiguration configuration)
    {
        _httpClient = httpClient;

        _modelId = configuration["OpenAI:ModelId"]
            ?? throw new InvalidOperationException(
                "OpenAI:ModelId not found in appsettings.json.");

        _apiKey = configuration["OpenAI:ApiKey"]
            ?? throw new InvalidOperationException(
                "OpenAI:ApiKey not found in appsettings.json.");
    }

    public async Task<string> ExtractTextAsync(
        string imagePath,
        CancellationToken cancellationToken = default)
    {
        if (!File.Exists(imagePath))
            throw new FileNotFoundException(
                $"Image not found: {imagePath}",
                imagePath);

        var bytes = await File.ReadAllBytesAsync(
            imagePath,
            cancellationToken);

        var base64 = Convert.ToBase64String(bytes);

        var mimeType = GetMimeType(imagePath);

        var dataUrl = $"data:{mimeType};base64,{base64}";

        var payload = new
        {
            model = _modelId,
            messages = new object[]
            {
                new
                {
                    role = "user",
                    content = new object[]
                    {
                        new
                        {
                            type = "text",
                            text = "Extract all text visible in this image. Return only the text, no commentary."
                        },
                        new
                        {
                            type = "image_url",
                            image_url = new
                            {
                                url = dataUrl
                            }
                        }
                    }
                }
            }
        };

        using var request = new HttpRequestMessage(
            HttpMethod.Post,
            "chat/completions")
        {
            Content = JsonContent.Create(payload)
        };

        request.Headers.Authorization =
            new AuthenticationHeaderValue("Bearer", _apiKey);

        using var response = await _httpClient.SendAsync(
            request,
            cancellationToken);

        var body = await response.Content.ReadAsStringAsync(
            cancellationToken);

        if (!response.IsSuccessStatusCode)
            throw new HttpRequestException(
                $"OpenAI endpoint returned {(int)response.StatusCode}: {body}");

        using var doc = JsonDocument.Parse(body);

        return doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString() ?? string.Empty;
    }

    private static string GetMimeType(string path)
        => Path.GetExtension(path).ToLowerInvariant() switch
        {
            ".jpg" or ".jpeg" => "image/jpeg",
            ".png" => "image/png",
            ".gif" => "image/gif",
            ".webp" => "image/webp",
            ".bmp" => "image/bmp",
            _ => throw new NotSupportedException(
                $"Unsupported image extension: {Path.GetExtension(path)}")
        };
}

Step 3: Program.cs

Configure dependency injection, load configuration, and call the image extraction service.

using ImageToTextGenerator.Services;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

try
{
    var config = new ConfigurationBuilder()
        .SetBasePath(AppContext.BaseDirectory)
        .AddJsonFile(
            "appsettings.json",
            optional: false,
            reloadOnChange: false)
        .Build();

    var endpoint = config["OpenAI:Endpoint"]
        ?? throw new InvalidOperationException(
            "OpenAI:Endpoint not found in appsettings.json.");

    if (!Uri.TryCreate(
            endpoint,
            UriKind.Absolute,
            out var endpointUri))
    {
        throw new InvalidOperationException(
            $"OpenAI:Endpoint '{endpoint}' is not a valid absolute URI.");
    }

    var baseAddress = endpoint.EndsWith('/')
        ? endpointUri
        : new Uri(endpoint + "/");

    var services = new ServiceCollection();

    services.AddSingleton(config);

    services.AddHttpClient<IImageToTextService, ImageToTextService>(http =>
    {
        http.BaseAddress = baseAddress;
        http.Timeout = TimeSpan.FromMinutes(3);
    });

    await using var provider =
        services.BuildServiceProvider();

    string? imagePath =
        @"C:\Screenshot 2026-07-02 143404.png";

    if (string.IsNullOrWhiteSpace(imagePath))
        throw new InvalidOperationException(
            "No image path provided.");

    imagePath = imagePath.Trim().Trim('"');

    var service =
        provider.GetRequiredService<IImageToTextService>();

    var text = await service.ExtractTextAsync(imagePath);

    Console.WriteLine();
    Console.WriteLine("---- Extracted text ----");
    Console.WriteLine(text);

    return 0;
}
catch (FileNotFoundException ex)
{
    Console.Error.WriteLine($"File error: {ex.Message}");
    return 1;
}
catch (NotSupportedException ex)
{
    Console.Error.WriteLine($"Unsupported input: {ex.Message}");
    return 1;
}
catch (InvalidOperationException ex)
{
    Console.Error.WriteLine($"Configuration error: {ex.Message}");
    return 1;
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"HTTP error: {ex.Message}");
    return 1;
}
catch (TaskCanceledException)
{
    Console.Error.WriteLine("Request timed out (3 minutes).");
    return 1;
}
catch (Exception ex)
{
    Console.Error.WriteLine(
        $"Fatal error: {ex.GetType().Name}: {ex.Message}");
    return 1;
}

appsettings.json

Create an appsettings.json file in your project.

{
  "OpenAI": {
    "Endpoint": "https://api.openai.com/v1/",
    "ModelId": "gpt-4.1-mini",
    "ApiKey": "YOUR_OPENAI_API_KEY"
  }
}

How It Works

Reads the image from disk.
Converts the image into Base64.
Creates a Data URL.
Sends the image to the OpenAI Chat Completions API.
Uses a Vision-capable model to read the image.
Returns only the extracted text.

Sample Output

---- Extracted text ----

Invoice Number: INV-2026-105
Customer Name: John Smith
Date: 02 July 2026
Total Amount: $325.00

Conclusion

You have successfully built an Image to Text Generator in C# using the OpenAI Vision API. This approach works for screenshots, scanned documents, invoices, forms, receipts, and many other image types. You can further enhance the application by adding batch processing, PDF support, drag-and-drop, or exporting extracted text to Word, Excel, or a database.

Arjun Singh Faguda

Thursday, July 2, 2026

Build an Image to Text Extractor using OpenAi