Performance Optimization Guide
This guide covers performance optimization techniques used in the Book Store project.
Garbage Collection (GC) Optimization
The API service is configured with optimal garbage collection settings for server workloads.
Configuration
See BookStore.ApiService.csproj:
<PropertyGroup>
<!-- Server GC: Uses multiple heaps and threads for better throughput -->
<ServerGarbageCollection>true</ServerGarbageCollection>
<!-- Concurrent GC: Reduces pause times by running GC concurrently -->
<ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
<!-- Retain VM: Keeps virtual memory allocated for better performance -->
<RetainVMGarbageCollection>true</RetainVMGarbageCollection>
<!-- Tiered Compilation: quick startup JIT, then hot paths recompiled with full optimizations (used by Dynamic PGO) -->
<TieredCompilation>true</TieredCompilation>
<TieredCompilationQuickJit>true</TieredCompilationQuickJit>
</PropertyGroup>
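To confirm the settings are active at runtime, you can log the GC mode at startup. A minimal sketch, assuming it is placed in the API service's Program.cs after the app is built (the app and logger variables are illustrative):
using System.Runtime; // GCSettings
var logger = app.Services.GetRequiredService<ILogger<Program>>();
// IsServerGC reflects <ServerGarbageCollection>; LatencyMode is Interactive when
// concurrent (background) GC is enabled and Batch when it is disabled.
logger.LogInformation(
    "GC mode: ServerGC={ServerGc}, LatencyMode={LatencyMode}, CPUs={Cpus}",
    GCSettings.IsServerGC,
    GCSettings.LatencyMode,
    Environment.ProcessorCount);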
Server GC vs Workstation GC
Server GC (Enabled)
How it works:
- Creates a separate heap and dedicated GC thread for each CPU core
- Performs garbage collection in parallel across all heaps
- Optimized for throughput over latency
Benefits:
- ✅ Higher throughput - Can handle more requests per second
- ✅ Better scalability - Utilizes all available CPU cores
- ✅ Larger heap sizes - Can manage more memory efficiently
- ✅ Ideal for server applications - ASP.NET Core, APIs, background services
Trade-offs:
- ⚠️ Uses more memory (separate heap per core)
- ⚠️ Slightly longer GC pauses (but less frequent)
Workstation GC (Default for console apps)
How it works:
- Single heap and single GC thread
- Optimized for low latency and UI responsiveness
When to use:
- Desktop applications
- Interactive tools
- Memory-constrained environments
Concurrent GC
Enabled: <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
How it works:
- Runs most of the Gen2 collection concurrently with application threads
- Application threads can continue allocating objects during GC
- Only brief pauses for critical GC phases
Benefits:
- ✅ Reduced pause times - Application remains responsive during GC
- ✅ Better user experience - Fewer noticeable delays
- ✅ Improved throughput - Less time blocked waiting for GC
Performance Impact:
- Typical Gen2 pause: 10-50ms (vs 100-500ms without concurrent GC)
- Background GC thread uses ~10-25% of one CPU core
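You can also check the pause times your own workload sees: GC.GetGCMemoryInfo reports the pause durations of the most recent collection. A minimal sketch (the GCKind.Background overload requires .NET 5 or later):
// Inspect the most recent background (concurrent Gen2) collection.
var info = GC.GetGCMemoryInfo(GCKind.Background);
foreach (var pause in info.PauseDurations)
{
    Console.WriteLine($"Background GC pause: {pause.TotalMilliseconds:F1} ms");
}
Console.WriteLine($"GC pause time percentage: {info.PauseTimePercentage:F1}%");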
Retain VM GC
Enabled: <RetainVMGarbageCollection>true</RetainVMGarbageCollection>
How it works:
- Keeps virtual memory pages allocated after GC
- Avoids frequent memory allocation/deallocation from OS
- Memory is decommitted but not released
Benefits:
- ✅ Faster allocations - No need to request memory from OS
- ✅ Reduced fragmentation - More consistent memory layout
- ✅ Better performance - Fewer system calls
When to use:
- Long-running server applications
- Applications with predictable memory patterns
- Environments with dedicated resources
Trade-off:
- Uses more virtual memory (but not physical memory)
- May not be ideal for memory-constrained containers
Tiered Compilation & Dynamic PGO
Enabled:
<TieredCompilation>true</TieredCompilation>
<TieredCompilationQuickJit>true</TieredCompilationQuickJit>
How it works:
Tier 0 (Quick JIT):
- Methods are compiled quickly with minimal optimizations
- Application starts faster
- Low compilation overhead
Tier 1 (Optimized JIT):
- Hot methods (frequently called) are recompiled with full optimizations
- Uses runtime profiling data (Dynamic PGO)
- Optimizes based on actual usage patterns
Benefits:
- ✅ Faster startup - Quick JIT gets app running immediately
- ✅ Better steady-state performance - Hot paths are fully optimized
- ✅ Adaptive optimization - Optimizes based on real workload
- ✅ Reduced memory - Only hot methods get full optimization
Dynamic PGO (Profile-Guided Optimization):
- Collects runtime profiling data
- Optimizes based on actual code paths taken
- Inlines hot methods
- Devirtualizes interface/virtual calls when possible
- Optimizes branch predictions
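As a concrete illustration of the devirtualization step, consider an interface call inside a hot loop. If profiling shows one concrete type dominates, the JIT can add a guarded fast path that calls (and inlines) that type directly. The types below are hypothetical and only show the call shape:
public interface IPriceRule
{
    decimal Apply(decimal price);
}
public sealed class NoDiscountRule : IPriceRule
{
    public decimal Apply(decimal price) => price;
}
public static class PricingEngine
{
    // Hot path: called through an interface. With Dynamic PGO, the JIT observes
    // that 'rule' is almost always NoDiscountRule and recompiles this method with
    // a type check plus a direct, inlineable call for that common case.
    public static decimal Total(IPriceRule rule, decimal[] prices)
    {
        decimal total = 0m;
        foreach (var price in prices)
        {
            total += rule.Apply(price); // devirtualization candidate
        }
        return total;
    }
}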
Performance Monitoring
Aspire Dashboard (Recommended)
The easiest way to monitor GC performance! Aspire provides a built-in dashboard with real-time metrics.
Access the Dashboard
# Start the application
aspire run
# Dashboard automatically opens at:
# https://localhost:17238 (or check console output)
Monitor GC Metrics
The Aspire Dashboard provides comprehensive .NET runtime metrics:
1. Navigate to Metrics Tab
- Select your service (e.g., apiservice)
- View real-time metrics
2. Key GC Metrics to Monitor
Garbage Collection Metrics:
process.runtime.dotnet.gc.collections.count
- Gen 0: Should be frequent (every few seconds)
- Gen 1: Moderate (every 10-30 seconds)
- Gen 2: Rare (every few minutes)
- ✅ Good: Gen2 collections are infrequent
- ❌ Bad: Frequent Gen2 collections indicate memory pressure
process.runtime.dotnet.gc.heap.size
- Total heap size across all generations
- Should stabilize after the warmup period
- With Server GC: Larger but more efficient
process.runtime.dotnet.gc.pause.time
- Target: <50ms for Gen2 collections
- Concurrent GC: Typically 10-30ms
- Without Concurrent GC: 100-500ms
- Lower is better!
process.runtime.dotnet.gc.duration
- Time spent in GC as a percentage of total time
- Target: <5% of total time
- Good: 1-3%
- Warning: >5% indicates GC pressure
Memory Metrics:
process.runtime.dotnet.gc.allocations.size
- Allocation rate (bytes/second)
- Lower is better for GC pressure
- Identify allocation hot spots
process.runtime.dotnet.gc.committed_memory.size
- Physical memory committed to the GC heap
- Should be stable under load
JIT Compilation Metrics:
process.runtime.dotnet.jit.compiled_methods.count
- Methods compiled by the JIT
- Should plateau after warmup
process.runtime.dotnet.jit.compilation_time
- Time spent in JIT compilation
- With Tiered Compilation: Initial spike, then low
3. Visualize Performance Impact
Create custom charts in Aspire Dashboard:
Chart 1: GC Pause Times
- Metric: process.runtime.dotnet.gc.pause.time
- Filter: generation=2
- Expected: <50ms with our optimizations
Chart 2: GC Frequency
- Metric: process.runtime.dotnet.gc.collections.count
- Group by: generation
- Expected: Gen0 >> Gen1 >> Gen2
Chart 3: Time in GC
- Metric: process.runtime.dotnet.gc.duration
- Expected: <5%
Chart 4: Heap Size
- Metric: process.runtime.dotnet.gc.heap.size
- Expected: Stable after warmup
4. Compare Before/After
To see the impact of GC optimizations:
# 1. Baseline (disable GC settings)
# Comment out GC settings in BookStore.ApiService.csproj
aspire run
# 2. Open Aspire Dashboard
# Note metrics for 5 minutes under load
# 3. Optimized (enable GC settings)
# Uncomment GC settings
aspire run
# 4. Compare in Dashboard
# You should see:
# - Lower GC pause times (-60-80%)
# - Higher throughput (+20-40%)
# - More stable heap size
Traces Tab
Monitor request performance:
- Navigate to Traces
- Filter by service: apiservice
- Look for:
- Request duration (P50, P95, P99)
- Database query times
- External service calls
Expected improvements with GC optimizations:
- Lower P99 latency (fewer GC pauses during requests)
- More consistent response times
- Better performance under load
Logs Tab
Monitor GC-related logs:
// Add GC logging to Program.cs if needed
var gcInfo = GC.GetGCMemoryInfo();
logger.LogInformation(
"GC: Gen0={Gen0} Gen1={Gen1} Gen2={Gen2} Heap={HeapMB}MB Pause={PauseMs}ms",
GC.CollectionCount(0),
GC.CollectionCount(1),
GC.CollectionCount(2),
gcInfo.HeapSizeBytes / 1024 / 1024,
gcInfo.PauseDurations.Length > 0 ? gcInfo.PauseDurations[0].TotalMilliseconds : 0);
View in Aspire Dashboard Logs tab with real-time filtering.
Structured Logs
Aspire automatically captures structured logs with:
- Correlation IDs
- Trace IDs
- Service names
- Timestamps
Perfect for correlating GC events with request performance!
Alternative: dotnet-counters (Command Line)
If you prefer command-line monitoring:
# Install dotnet-counters
dotnet tool install -g dotnet-counters
# Find process ID
aspire run
# Note the apiservice process ID from console
# Monitor GC metrics
dotnet-counters monitor --process-id <pid> --counters System.Runtime
# Look for:
# - gc-heap-size: Should be larger with Server GC
# - gc-pause-time: Should be lower with Concurrent GC
# - time-in-gc: Should be <5% for healthy application
# - gen-0-gc-count, gen-1-gc-count, gen-2-gc-count
Verify GC Settings
Check that GC settings are applied correctly:
# Run the application
dotnet run --project src/BookStore.ApiService
# In another terminal, check GC info
dotnet-counters monitor --process-id <pid> --counters System.Runtime
# Look for:
# - gc-heap-size: Should be larger with Server GC
# - gc-pause-time: Should be lower with Concurrent GC
# - time-in-gc: Should be <5% for healthy application
Key Metrics to Monitor
GC Metrics
- GC Pause Time - Should be <50ms for Gen2 collections
- GC Frequency - Gen0/Gen1 should be frequent, Gen2 should be rare
- Time in GC - Should be <5% of total CPU time
- Heap Size - Should stabilize after warmup
Application Metrics
- Request Throughput - Requests per second
- Response Time - P50, P95, P99 latencies
- CPU Usage - Should be <80% under normal load
- Memory Usage - Should not grow unbounded
Load Testing with Aspire Dashboard
The best way to see GC optimizations in action:
Step 1: Start Baseline Test
# 1. Temporarily disable GC optimizations
# Comment out the PropertyGroup in src/ApiService/BookStore.ApiService/BookStore.ApiService.csproj
# 2. Start Aspire
aspire run
# 3. Open Aspire Dashboard (URL in console output)
# Navigate to Metrics → apiservice
# 4. In another terminal, run load test
# Install wrk: brew install wrk (macOS) or apt-get install wrk (Linux)
wrk -t4 -c100 -d60s http://localhost:5000/api/books/search?query=code
# 5. Watch metrics in Aspire Dashboard for 60 seconds
# Note:
# - GC pause times (process.runtime.dotnet.gc.pause.time)
# - GC frequency (process.runtime.dotnet.gc.collections.count)
# - Time in GC (process.runtime.dotnet.gc.duration)
# - Request latency (in Traces tab)
Step 2: Test with Optimizations
# 1. Stop the application (Ctrl+C)
# 2. Re-enable GC optimizations
# Uncomment the PropertyGroup in src/ApiService/BookStore.ApiService/BookStore.ApiService.csproj
# 3. Start Aspire again
aspire run
# 4. Open Aspire Dashboard
# 5. Run same load test
wrk -t4 -c100 -d60s http://localhost:5000/api/books/search?query=code
# 6. Compare metrics in Dashboard
Step 3: Analyze Results
Expected Improvements in Aspire Dashboard:
| Metric | Before | After | Improvement |
|---|---|---|---|
| GC Pause Time (Gen2) | 100-500ms | 10-50ms | -60-80% ⚡ |
| Time in GC | 8-15% | 2-5% | -50-70% 📈 |
| Gen2 Collections | Frequent | Rare | -40-60% 🎯 |
| Request P99 Latency | Higher | Lower | -10-20% ⚡ |
| Throughput (req/s) | Baseline | Higher | +20-40% 🚀 |
Screenshots to Capture:
- GC pause time chart (before/after)
- GC collections count (before/after)
- Request latency distribution (Traces tab)
- Heap size over time
Alternative: Use Aspire's Built-in Load Testing
// Create a simple load test using HttpClient
// Run from a separate console app or test project
using var client = new HttpClient { BaseAddress = new Uri("http://localhost:5000") };
var tasks = Enumerable.Range(0, 100).Select(async _ =>
{
for (int i = 0; i < 100; i++)
{
await client.GetAsync("/api/books/search?query=code");
await Task.Delay(10); // 10ms between requests
}
});
await Task.WhenAll(tasks);
Watch the results in real-time in Aspire Dashboard!
Expected Performance Improvements
Based on typical ASP.NET Core applications with these settings:
Throughput
- Server GC: +20-40% more requests/second vs Workstation GC
- Concurrent GC: +10-15% throughput (less time blocked)
- Tiered Compilation: +15-25% steady-state performance
Latency
- Concurrent GC: -60-80% reduction in GC pause times
- Server GC: More consistent latency under load
- Dynamic PGO: -10-20% reduction in P99 latency
Memory
- Server GC: Uses ~2-3x more memory than Workstation GC
- Retain VM: +10-20% virtual memory usage
- Overall: Better memory efficiency under sustained load
Container Considerations
When running in containers (Docker, Kubernetes), consider:
Memory Limits
# Kubernetes manifest (docker-compose uses deploy.resources.limits/reservations)
resources:
limits:
memory: "2Gi" # Set appropriate limit
requests:
memory: "1Gi"
GC Heap Limit
.NET automatically respects container memory limits, but you can tune:
# Set GC heap limit to 75% of container memory
DOTNET_GCHeapHardLimit=0x60000000 # 1.5GB for 2GB container
CPU Limits
Server GC creates one heap per CPU. In containers:
resources:
limits:
cpu: "2000m" # 2 CPUs
.NET will create 2 GC heaps automatically.
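To verify what the runtime actually sees inside the container, log the processor count and the memory limit the GC detected. A minimal sketch (the logger variable is illustrative):
var gcInfo = GC.GetGCMemoryInfo();
// TotalAvailableMemoryBytes reflects the container memory limit (or physical
// memory when no limit is set); ProcessorCount reflects the CPU limit and,
// by default, the number of Server GC heaps.
logger.LogInformation(
    "Container view: CPUs={Cpus}, GC memory limit={LimitMB} MB",
    Environment.ProcessorCount,
    gcInfo.TotalAvailableMemoryBytes / 1024 / 1024);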
Advanced Tuning
GC Heap Count
Override automatic heap count:
# Force 4 GC heaps regardless of CPU count
DOTNET_GCHeapCount=4
GC LOH Threshold
Tune Large Object Heap threshold:
# Objects >85KB go to LOH by default
# Increase threshold to reduce LOH fragmentation
DOTNET_GCLOHThreshold=100000 # 100KB
GC Conserve Memory
For memory-constrained environments:
# More aggressive memory management
DOTNET_GCConserveMemory=9 # 0-9, higher = more aggressive
Monitoring in Production
Aspire Dashboard (Development & Staging)
Recommended for development and staging environments:
Aspire Dashboard provides the best developer experience with:
- ✅ Real-time metrics visualization
- ✅ Distributed tracing
- ✅ Structured logs with correlation
- ✅ No configuration needed
- ✅ Built-in .NET runtime metrics
# Development
aspire run
# Staging (with Aspire hosting)
# Deploy AppHost to staging environment
# Dashboard URL provided in deployment output
OpenTelemetry (Production)
For production, export metrics to your observability platform:
Already configured in the project via ServiceDefaults:
// ServiceDefaults automatically configures OpenTelemetry
// Exports to configured OTLP endpoint
// Runtime metrics tracked:
// - process.runtime.dotnet.gc.collections.count
// - process.runtime.dotnet.gc.heap.size
// - process.runtime.dotnet.gc.pause.time
// - process.runtime.dotnet.gc.duration
// - process.runtime.dotnet.jit.compiled_methods.count
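If you ever need to wire this up outside ServiceDefaults, the runtime metrics come from the OpenTelemetry.Instrumentation.Runtime package. A minimal registration sketch (the real ServiceDefaults code may differ):
// Program.cs - assumes the OpenTelemetry.Extensions.Hosting,
// OpenTelemetry.Instrumentation.Runtime and
// OpenTelemetry.Exporter.OpenTelemetryProtocol packages are referenced.
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics => metrics
        .AddRuntimeInstrumentation() // emits the process.runtime.dotnet.* metrics above
        .AddOtlpExporter());         // endpoint taken from OTEL_EXPORTER_OTLP_ENDPOINT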
Supported backends:
- Azure Monitor / Application Insights
- Prometheus + Grafana
- Datadog
- New Relic
- Honeycomb
- Any OTLP-compatible backend
Application Insights
For Azure deployments:
// Program.cs (already configured via ServiceDefaults)
builder.Services.AddApplicationInsightsTelemetry();
// Automatically tracks:
// - GC metrics
// - Memory usage
// - Request performance
// - Dependencies
// - Exceptions
View in Azure Portal:
- Performance → Server metrics
- Metrics Explorer → .NET CLR metrics
- Live Metrics for real-time monitoring
Custom Metrics
// Track GC metrics manually
var gcInfo = GC.GetGCMemoryInfo();
logger.LogInformation(
"GC Info: Gen0={Gen0}, Gen1={Gen1}, Gen2={Gen2}, Heap={HeapSize}MB",
GC.CollectionCount(0),
GC.CollectionCount(1),
GC.CollectionCount(2),
gcInfo.HeapSizeBytes / 1024 / 1024);
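If you prefer numeric metrics over log lines, the same values can be published as observable gauges with System.Diagnostics.Metrics, so they flow through the existing OpenTelemetry pipeline. A minimal sketch; the meter name BookStore.Gc is illustrative:
using System.Diagnostics.Metrics;
var meter = new Meter("BookStore.Gc");
// Sampled each time the metrics pipeline collects.
meter.CreateObservableGauge("bookstore.gc.gen2_collections", () => GC.CollectionCount(2));
meter.CreateObservableGauge("bookstore.gc.heap_size_mb",
    () => GC.GetGCMemoryInfo().HeapSizeBytes / 1024.0 / 1024.0);
// Remember to register the meter, e.g. .WithMetrics(m => m.AddMeter("BookStore.Gc"))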
Best Practices
DO
✅ Use Server GC for ASP.NET Core applications
✅ Enable Concurrent GC for better responsiveness
✅ Enable Tiered Compilation for optimal performance
✅ Monitor GC metrics in production
✅ Set appropriate container memory limits
✅ Use ArrayPool<T> and MemoryPool<T> for frequently allocated buffers (see the sketch after this list)
✅ Implement IDisposable for large objects
✅ Avoid allocations in hot paths
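A quick sketch of the ArrayPool<T> pattern mentioned above, for a buffer that would otherwise be allocated on every request (stream and the 16 KB size are illustrative, inside an async handler):
using System.Buffers;
// Rent a reusable buffer instead of allocating a new byte[] per request.
byte[] buffer = ArrayPool<byte>.Shared.Rent(16 * 1024); // may return a larger array
try
{
    int read = await stream.ReadAsync(buffer.AsMemory(0, 16 * 1024));
    // ... process buffer[0..read] ...
}
finally
{
    // Always return the buffer, otherwise the pool's benefit is lost.
    ArrayPool<byte>.Shared.Return(buffer);
}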
DON'T
❌ Use Workstation GC for server applications
❌ Disable Concurrent GC (unless you have specific requirements)
❌ Ignore GC metrics and warnings
❌ Set container limits too low for Server GC
❌ Create unnecessary allocations
❌ Hold references to large objects longer than needed
❌ Use finalizers unless absolutely necessary
Troubleshooting
High GC Pause Times
Symptom: GC pauses >100ms
Solutions:
- Enable Concurrent GC (should already be enabled)
- Reduce heap size by fixing memory leaks
- Use GC.TryStartNoGCRegion() for critical sections (see the sketch below)
- Profile with dotnet-trace to find allocation hot spots
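A minimal sketch of the GC.TryStartNoGCRegion pattern, for a short latency-critical section (the 16 MB budget and ProcessLatencyCriticalWork are illustrative):
using System.Runtime; // GCSettings, GCLatencyMode
// Ask the GC to suspend collections while the critical block runs.
if (GC.TryStartNoGCRegion(16 * 1024 * 1024)) // total allocation budget in bytes
{
    try
    {
        ProcessLatencyCriticalWork(); // must stay within the allocation budget
    }
    finally
    {
        // End the region only if it is still active - it ends automatically
        // if the budget is exceeded and a collection had to run.
        if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
        {
            GC.EndNoGCRegion();
        }
    }
}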
High Memory Usage
Symptom: Memory grows unbounded
Solutions:
- Check for memory leaks with dotnet-dump
- Review static collections and event handlers
- Ensure IDisposable objects are disposed
- Use WeakReference for caches (see the sketch below)
- Enable GCConserveMemory mode
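A sketch of the WeakReference caching idea from the list above: entries stay usable only until the GC reclaims them, so the cache never keeps large objects alive on its own. Names are illustrative:
using System.Collections.Concurrent;
public sealed class WeakCache<TKey, TValue>
    where TKey : notnull
    where TValue : class
{
    private readonly ConcurrentDictionary<TKey, WeakReference<TValue>> _entries = new();

    public void Set(TKey key, TValue value) =>
        _entries[key] = new WeakReference<TValue>(value);

    public bool TryGet(TKey key, out TValue? value)
    {
        if (_entries.TryGetValue(key, out var weak) && weak.TryGetTarget(out var target))
        {
            value = target;
            return true;
        }

        value = null; // either never cached or already collected
        return false;
    }
}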
Low Throughput
Symptom: Lower than expected requests/second
Solutions:
- Verify Server GC is enabled: dotnet-counters monitor
- Check CPU usage - should be <80%
- Profile with dotnet-trace to find bottlenecks
- Review async/await usage
- Check database connection pooling