Optimize Object Storage Performance with the AWS Go SDK
Understanding Performance: Workers vs. Concurrency
To achieve high throughput when interacting with Object Storage, it is essential to understand how the AWS SDK for Go handles data transfer. Performance tuning involves two main dimensions:
- Workers (Horizontal Parallelism): This refers to the number of separate files being uploaded simultaneously. Increasing workers is the most effective way to improve performance for small files, where the overhead of the HTTP handshake is the primary bottleneck.
- Concurrency (Vertical Parallelism): This is managed by the S3 Transfer Manager. It determines how many parts of a single large file are uploaded in parallel using Multipart Upload. This is crucial for saturating bandwidth with large files.
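A minimal sketch of how these two dimensions map onto code, assuming the `feature/s3/manager` package from aws-sdk-go-v2 (the same package used in the full script later in this guide); `buildUploader` is an illustrative helper name, not an SDK function:

```go
package main

import (
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// buildUploader configures vertical parallelism: Concurrency controls how
// many parts of one file are in flight, PartSize how large each part is.
// Horizontal parallelism (workers) is not an SDK setting; it is simply the
// number of goroutines that call uploader.Upload at the same time.
func buildUploader(client *s3.Client, concurrency, partSizeMB int) *manager.Uploader {
	return manager.NewUploader(client, func(u *manager.Uploader) {
		u.Concurrency = concurrency
		u.PartSize = int64(partSizeMB) * 1024 * 1024
	})
}
```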
Configuration Guidelines
Based on our testing, here are example starting points for different workloads. These parameters should be adjusted based on your environment’s resources.
| Workload Type | File Size | Recommended Workers | Concurrency (per File) | Part Size |
|---|---|---|---|---|
| Small Files | ~1 MB | 400 | 1 | N/A |
| Medium Files | 10 - 100 MB | 40 - 200 | 4 - 8 | 5 MB |
| Large Files | > 1000 MB | 8 - 16 | 32 - 64 | 64 MB |
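The table above can be sketched as a small helper that picks a starting point from the average object size. The thresholds and values mirror the table; `settingsForSize` and its cut-off points are illustrative assumptions, not measured optima:

```go
package main

import "fmt"

// Settings bundles the three tuning parameters discussed in this guide.
type Settings struct {
	Workers     int
	Concurrency int
	PartSizeMB  int // 0 = single-part upload, no multipart needed
}

// settingsForSize returns a starting point based on average file size,
// following the workload table; measure and adjust in your own environment.
func settingsForSize(avgBytes int64) Settings {
	const mb = 1 << 20
	switch {
	case avgBytes <= 2*mb: // small files: parallelize across files
		return Settings{Workers: 400, Concurrency: 1, PartSizeMB: 0}
	case avgBytes <= 100*mb: // medium files
		return Settings{Workers: 100, Concurrency: 8, PartSizeMB: 5}
	default: // large files: parallelize within each file
		return Settings{Workers: 16, Concurrency: 32, PartSizeMB: 64}
	}
}

func main() {
	fmt.Printf("%+v\n", settingsForSize(50<<20))
}
```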
How to Determine Worker and Concurrency Values
There is no single magic formula for the optimal number of workers and concurrency, as it depends entirely on your environment. The key is to measure, tune, and repeat.
Guiding Principles
- Start with `workers` for file-level parallelism.
  - Goal: Keep the CPU and network busy by processing multiple files at once. This is most effective for workloads with many small-to-medium-sized files.
  - Starting Point: A good starting point is 2 to 4 times the number of CPU cores on your machine. For an 8-core machine, start with 16 to 32 workers.
  - Limiting Factors:
    - CPU: Too many workers can cause excessive context switching, where the CPU spends more time switching between tasks than doing actual work.
    - Memory: Each worker and its associated upload tasks consume memory.
    - File Handles: Your operating system has a limit on the number of open files.
- Tune `concurrency` for single-file throughput.
  - Goal: Saturate your network connection when uploading a single large file.
  - Starting Point: For large files (> 100 MB) on a fast network, values between 8 and 32 are common. The SDK’s default is 5.
  - Limiting Factors:
    - Network Bandwidth: If your network link is saturated, increasing concurrency further will not help and may even slightly degrade performance due to overhead.
    - Memory: Each concurrent part consumes a buffer of `PartSize` bytes. The total buffer memory across all in-flight uploads is roughly `Workers * Concurrency * PartSize`.
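The memory bound is worth making concrete. A rough upper bound on part-buffer memory is the product of the three parameters; plain Go, no SDK required (`estimateBufferMB` is an illustrative helper):

```go
package main

import "fmt"

// estimateBufferMB returns a rough upper bound (in MB) on the memory used
// by in-flight upload part buffers: Workers * Concurrency * PartSize.
func estimateBufferMB(workers, concurrency, partSizeMB int) int {
	return workers * concurrency * partSizeMB
}

func main() {
	// Using the large-file row of the table: even modest-looking values
	// can add up to many gigabytes of buffers.
	fmt.Printf("8 workers x 32 concurrency x 64 MB parts = ~%d MB\n",
		estimateBufferMB(8, 32, 64))
}
```

With the large-file settings from the table, this comes to roughly 16 GB, which is why memory is often the first limit you hit when raising both knobs at once.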
The Tuning Process
Follow this iterative process to find the right balance:

1. Establish a Baseline. Run the script with low, conservative values (e.g., `WORKERS=4`, `CONCURRENCY=4`). This is your baseline performance.
2. Increase Workers. Keep `CONCURRENCY` fixed and gradually increase `WORKERS` (e.g., 4, 8, 16, 32, 64). Monitor your CPU usage and total upload time. You will reach a point where adding more workers no longer improves performance or even makes it worse. This is your optimal worker count for that file set.
3. Increase Concurrency. Using your optimal `WORKERS` count, begin to increase `CONCURRENCY` (e.g., 4, 8, 16, 32). This primarily helps if you have large files in your dataset. Again, find the point of diminishing returns.
4. Adjust Part Size. For very large files (multiple GB), a larger `PART_SIZE_MB` (e.g., 64, 128, 256) can be more efficient, as it reduces the total number of parts and API calls required for an upload.
By methodically tuning these three parameters, you can tailor the performance to the specific characteristics of your hardware and workload.
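Part size also interacts with a hard protocol limit: S3-compatible multipart uploads accept at most 10,000 parts per object, so the part size caps both the number of API calls and the maximum object size. A quick ceiling-division check (`partsNeeded` is an illustrative helper):

```go
package main

import "fmt"

// partsNeeded returns how many multipart parts a file of sizeMB requires
// at a given part size, using ceiling division.
func partsNeeded(sizeMB, partSizeMB int) int {
	return (sizeMB + partSizeMB - 1) / partSizeMB
}

func main() {
	// A 100 GB object with 64 MB parts stays far below the 10,000-part cap,
	// while 5 MB parts would already need 20,480 parts and fail.
	fmt.Println(partsNeeded(100*1024, 64)) // parts at 64 MB
	fmt.Println(partsNeeded(100*1024, 5))  // parts at 5 MB
}
```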
Example Go Script
This script configures the S3 Client and the Upload service. It uses a Worker Pool to upload multiple files from a local directory in parallel. You can tune the performance directly within the CONFIGURATION block.
Save as main.go
```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"sync"
	"time"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	// --- CONFIGURATION ---
	// 1. Set your S3 Bucket and Region
	bucket := "your-s3-bucket-name"
	region := "eu01"

	// 2. Set Local Source and Performance Parameters
	srcDir := "./test-data" // Directory containing your test files

	// Performance Tuning
	workers := 8      // How many files to upload at the same time
	concurrency := 32 // How many chunks per file to upload at once
	partSizeMB := 64  // The size of each chunk in Megabytes
	// ---------------------

	ctx := context.TODO()

	// 3. Initialize SDK with a 15s Timeout
	cfg, err := config.LoadDefaultConfig(ctx,
		config.WithRegion(region),
		config.WithHTTPClient(&http.Client{
			Timeout: 15 * time.Second,
		}),
	)
	if err != nil {
		log.Fatalf("unable to load SDK config: %v", err)
	}

	client := s3.NewFromConfig(cfg)
	uploader := manager.NewUploader(client, func(u *manager.Uploader) {
		u.PartSize = int64(partSizeMB) * 1024 * 1024
		u.Concurrency = concurrency
	})

	// 4. Gather all files from the source directory
	files, err := os.ReadDir(srcDir)
	if err != nil {
		log.Fatalf("failed to read directory %q: %v", srcDir, err)
	}

	// 5. Worker Pool Logic
	jobs := make(chan string, len(files))
	var wg sync.WaitGroup
	start := time.Now()

	fmt.Printf("Starting benchmark: %d workers, %d concurrency per file, %dMB part size\n",
		workers, concurrency, partSizeMB)

	for w := 1; w <= workers; w++ {
		wg.Add(1)
		go func(workerID int) {
			defer wg.Done()
			for fileName := range jobs {
				fullPath := filepath.Join(srcDir, fileName)
				file, err := os.Open(fullPath)
				if err != nil {
					fmt.Printf("[Worker %d] Error opening %s: %v\n", workerID, fileName, err)
					continue
				}

				_, err = uploader.Upload(ctx, &s3.PutObjectInput{
					Bucket: &bucket,
					Key:    &fileName,
					Body:   file,
				})
				file.Close()

				if err != nil {
					fmt.Printf("[Worker %d] Error uploading %s: %v\n", workerID, fileName, err)
				}
			}
		}(w)
	}

	// Send files to the worker pool
	for _, f := range files {
		if !f.IsDir() {
			jobs <- f.Name()
		}
	}
	close(jobs)
	wg.Wait()

	fmt.Printf("\nFinished! Uploaded %d files in %v\n", len(files), time.Since(start))
}
```

How to Run the Test Script
1. Set Credentials

   Export your AWS access key and secret key in your terminal. The script will automatically use them to authenticate.

   ```shell
   export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY"
   export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_KEY"
   ```

2. Prepare Test Data

   The script uploads files from a local directory (default `./test-data`). First, create the directory:

   ```shell
   mkdir -p ./test-data
   ```

   Next, create some sample files to simulate a workload. The `dd` command is useful for this. Here are three examples of how to create test files of different sizes; each `for` loop creates 1 GB of test data at a different file size.

   ```shell
   # Create 4 large 256 MB files
   for i in {1..4}; do
     dd if=/dev/zero of=./test-data/large_256MB_$i.tmp bs=1M count=256 2>/dev/null
   done
   ```

   ```shell
   # Create 20 medium 50 MB files
   for i in {1..20}; do
     dd if=/dev/zero of=./test-data/medium_50MB_$i.tmp bs=1M count=50 2>/dev/null
   done
   ```

   ```shell
   # Create 1024 small 1 MB files
   for i in {1..1024}; do
     dd if=/dev/zero of=./test-data/small_1MB_$i.tmp bs=1M count=1 2>/dev/null
   done
   ```
3. Tune the Parameters

   Open `main.go` and set suitable values in the CONFIGURATION block. Change the following lines as needed:

   ```go
   // Performance Tuning
   workers := 8      // How many files to upload at the same time
   concurrency := 32 // How many chunks per file to upload at once
   partSizeMB := 64  // The size of each chunk in Megabytes
   ```

4. Run the Script

   ```shell
   go mod init main
   go mod tidy
   go run main.go
   ```