AzureWebSocketFix


Azure WebSocket Connection Fix Documentation

Problem Summary

WebSocket connections in the Azure Web App were closing unexpectedly with status code 1006 (abnormal closure) after approximately 30-60 seconds. This happened in the production environment but not in development, even though both ran the same build.

The error in browser console:

[2025-06-07T19:59:39.196Z] Information: Normalizing '_blazor' to 'https://isply.io/_blazor'.
[2025-06-07T19:59:39.500Z] Information: WebSocket connected to wss://isply.io/_blazor?id=ds6PQEbFV8kJ9NRM5cQzig.
[2025-06-07T20:00:15.905Z] Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006 (no reason given).'.
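
As a quick check on the timing, the lifetime of the connection in this log can be computed from the connect and disconnect timestamps. This is only a small sketch; the `secondsBetween` helper is a name introduced here for illustration.

```javascript
// Timestamps copied from the browser console log above.
const connectedAt = '2025-06-07T19:59:39.500Z';
const disconnectedAt = '2025-06-07T20:00:15.905Z';

// Hypothetical helper: elapsed seconds between two ISO 8601 timestamps.
function secondsBetween(startIso, endIso) {
    return (Date.parse(endIso) - Date.parse(startIso)) / 1000;
}

const lifetimeSeconds = secondsBetween(connectedAt, disconnectedAt);
console.log(lifetimeSeconds.toFixed(3)); // 36.405
```

The sampled connection lasted about 36 seconds, which falls inside the 30-60 second window described in the summary.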

In addition, high CPU usage was observed in MethodBase.Invoke, called from Microsoft.JSInterop.Infrastructure.DotNetDispatcher.InvokeSynchronously.

Root Causes Identified

  1. Mismatched WebSocket and SignalR KeepAlive Intervals:
    • SignalR hub was configured with a 10-second keepalive interval
    • WebSocket middleware was using a 30-second keepalive interval
    • This inconsistency caused timing conflicts leading to premature connection closures
  2. Missing IIS WebSocket Configuration:
    • No explicit WebSocket configuration in web.config
    • Required for Azure App Service to properly handle WebSocket connections
  3. Inefficient JSInterop Calls:
    • Frequent JavaScript interop calls causing high CPU usage
    • No batching or optimization of JavaScript operations
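
The first root cause comes down to two keepalive values that were set independently of each other. A check along the following lines could catch that kind of drift. It is only a sketch: the `findKeepAliveMismatch` function and the shape of the settings object are assumptions made for illustration and are not part of the application.

```javascript
// Keepalive intervals in seconds, as described in the root-cause list above.
const before = { signalRKeepAlive: 10, webSocketKeepAlive: 30 };
// The aligned configuration uses the same value for both layers.
const after = { signalRKeepAlive: 10, webSocketKeepAlive: 10 };

// Hypothetical helper: returns a description of the mismatch, or null when aligned.
function findKeepAliveMismatch(settings) {
    const { signalRKeepAlive, webSocketKeepAlive } = settings;
    if (signalRKeepAlive === webSocketKeepAlive) {
        return null;
    }
    return `SignalR pings every ${signalRKeepAlive}s but WebSocket middleware every ${webSocketKeepAlive}s`;
}

console.log(findKeepAliveMismatch(before));
console.log(findKeepAliveMismatch(after)); // null
```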

Comprehensive Solution

1. WebSocket and SignalR Configuration Alignment

Changes to Program.cs

// HubOptions configuration
builder.Services.AddRazorComponents().AddInteractiveServerComponents().AddHubOptions(options =>
{
    options.MaximumReceiveMessageSize = 32 * 1024;
    options.ClientTimeoutInterval = TimeSpan.FromMinutes(10);
    options.HandshakeTimeout = TimeSpan.FromMinutes(2);

    // Decrease the KeepAliveInterval to ensure more frequent pings
    options.KeepAliveInterval = TimeSpan.FromSeconds(10);

    // Enable detailed errors for troubleshooting
    options.EnableDetailedErrors = true;

    // Limit parallel invocations to reduce JSInterop overhead
    options.MaximumParallelInvocationsPerClient = 1;

    // Add streaming buffer capacity to optimize SignalR performance
    options.StreamBufferCapacity = 20;
});

// Add Circuit options to prevent premature disconnects
builder.Services.Configure<CircuitOptions>(options =>
{
    options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromMinutes(5);
    options.JSInteropDefaultCallTimeout = TimeSpan.FromMinutes(1);

    // Add MaxBufferedUnacknowledgedRenderBatches to limit memory usage
    options.MaxBufferedUnacknowledgedRenderBatches = 10;
});

// WebSocket middleware configuration (aligned with SignalR)
app.UseWebSockets(new WebSocketOptions { 
    KeepAliveInterval = TimeSpan.FromSeconds(10),  // Match the SignalR setting exactly
    AllowedOrigins = { "*" }  // Allow all origins
});

New web.config File

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <webSocket enabled="true" />
    <handlers>
      <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
    </handlers>
    <aspNetCore processPath="dotnet" arguments=".\Simple.dll" stdoutLogEnabled="true" stdoutLogFile=".\logs\stdout" hostingModel="inprocess" />
    <security>
      <requestFiltering>
        <requestLimits maxAllowedContentLength="30000000" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>

2. Circuit Handlers for Connection Management

AzureCircuitHandler.cs

using Microsoft.AspNetCore.Components.Server.Circuits;
using Microsoft.ApplicationInsights;
using System.Collections.Concurrent;

namespace Simple.Services;

public class AzureCircuitHandler : CircuitHandler
{
    private readonly ILogger<AzureCircuitHandler> _logger;
    private readonly TelemetryClient _telemetryClient;
    private static readonly ConcurrentDictionary<string, DateTime> _lastPingTimes = new();
    private static readonly ConcurrentDictionary<string, Circuit> _activeCircuits = new();

    public AzureCircuitHandler(
        ILogger<AzureCircuitHandler> logger,
        TelemetryClient telemetryClient)
    {
        _logger = logger;
        _telemetryClient = telemetryClient;
    }

    public override Task OnCircuitOpenedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        var instanceId = Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID") ?? "unknown";
        _logger.LogInformation("Circuit {CircuitId} opened on Azure instance {InstanceId}", 
            circuit.Id, instanceId);

        _telemetryClient.TrackEvent("CircuitOpened", new Dictionary<string, string>
        {
            { "CircuitId", circuit.Id },
            { "InstanceId", instanceId }
        });

        // Store the circuit and record the ping time
        _activeCircuits.TryAdd(circuit.Id, circuit);
        _lastPingTimes.TryAdd(circuit.Id, DateTime.UtcNow);

        return Task.CompletedTask;
    }

    public override Task OnCircuitClosedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Circuit {CircuitId} closed", circuit.Id);

        _telemetryClient.TrackEvent("CircuitClosed", new Dictionary<string, string>
        {
            { "CircuitId", circuit.Id }
        });

        // Remove the circuit
        _activeCircuits.TryRemove(circuit.Id, out _);
        _lastPingTimes.TryRemove(circuit.Id, out _);

        return Task.CompletedTask;
    }

    public override Task OnConnectionUpAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Connection for circuit {CircuitId} established", circuit.Id);
        _lastPingTimes[circuit.Id] = DateTime.UtcNow;

        return Task.CompletedTask;
    }

    public override Task OnConnectionDownAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Connection for circuit {CircuitId} lost", circuit.Id);

        return Task.CompletedTask;
    }

    // Method for healthcheck or other services to access circuit state
    public static bool IsCircuitActive(string circuitId)
    {
        return _activeCircuits.ContainsKey(circuitId);
    }

    // Method to get the last ping time for a circuit
    public static DateTime GetLastPingTime(string circuitId)
    {
        return _lastPingTimes.TryGetValue(circuitId, out var time) ? time : DateTime.MinValue;
    }

    // Method to update ping time (can be called from heartbeat service)
    public static void UpdatePingTime(string circuitId)
    {
        if (_activeCircuits.ContainsKey(circuitId))
        {
            _lastPingTimes[circuitId] = DateTime.UtcNow;
        }
    }
}

HeartbeatCircuitHandler.cs

using Microsoft.AspNetCore.Components.Server.Circuits;

namespace Simple.Services;

/// <summary>
/// Circuit handler to manage heartbeat services for Blazor circuits
/// </summary>
public class HeartbeatCircuitHandler : CircuitHandler
{
    private readonly BlazorHeartbeatService _heartbeatService;
    private readonly ILogger<HeartbeatCircuitHandler> _logger;

    // The handler is registered as scoped, so it is created in the circuit's own DI scope.
    // Injecting the heartbeat service directly (rather than resolving it from a temporary
    // scope) means it shares the circuit's IJSRuntime and is disposed with the circuit.
    public HeartbeatCircuitHandler(
        BlazorHeartbeatService heartbeatService,
        ILogger<HeartbeatCircuitHandler> logger)
    {
        _heartbeatService = heartbeatService;
        _logger = logger;
    }

    public override async Task OnCircuitOpenedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        try
        {
            // Initialize the heartbeat service with this circuit
            await _heartbeatService.InitializeAsync(circuit);

            _logger.LogInformation("Heartbeat service initialized for circuit {CircuitId}", circuit.Id);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to initialize heartbeat service for circuit {CircuitId}", circuit.Id);
        }
    }

    public override Task OnCircuitClosedAsync(Circuit circuit, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Heartbeat service shutting down for circuit {CircuitId}", circuit.Id);
        return Task.CompletedTask;
    }
}

BlazorHeartbeatService.cs

using Microsoft.AspNetCore.Components;
using Microsoft.JSInterop;
using Microsoft.AspNetCore.Components.Server.Circuits;

namespace Simple.Services;

/// <summary>
/// Service to maintain WebSocket connections and prevent timeouts
/// </summary>
public class BlazorHeartbeatService : IDisposable
{
    private readonly IJSRuntime _jsRuntime;
    private readonly ILogger<BlazorHeartbeatService> _logger;
    private readonly NavigationManager _navigationManager;
    private DotNetObjectReference<BlazorHeartbeatService> _dotNetRef;
    private Circuit _circuit;
    private string _circuitId;
    private bool _isStarted;

    public BlazorHeartbeatService(
        IJSRuntime jsRuntime,
        ILogger<BlazorHeartbeatService> logger,
        NavigationManager navigationManager)
    {
        _jsRuntime = jsRuntime;
        _logger = logger;
        _navigationManager = navigationManager;
    }

    public async Task InitializeAsync(Circuit circuit)
    {
        _circuit = circuit;
        _circuitId = circuit.Id;
        _dotNetRef = DotNetObjectReference.Create(this);

        try
        {
            // Start the heartbeat on the client side
            await _jsRuntime.InvokeVoidAsync("blazorHeartbeat.start", _dotNetRef, 5000);
            _isStarted = true;
            _logger.LogInformation("Heartbeat started for circuit {CircuitId}", _circuitId);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to start heartbeat for circuit {CircuitId}", _circuitId);
        }
    }

    [JSInvokable]
    public Task ReceiveHeartbeat()
    {
        try
        {
            // Update the ping time in the AzureCircuitHandler
            AzureCircuitHandler.UpdatePingTime(_circuitId);

            // Log only when the current second is a multiple of 10, to avoid excessive logging
            if (DateTime.UtcNow.Second % 10 == 0)
            {
                _logger.LogDebug("Heartbeat received for circuit {CircuitId}", _circuitId);
            }
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error processing heartbeat for circuit {CircuitId}", _circuitId);
        }

        return Task.CompletedTask;
    }

    public void Dispose()
    {
        try
        {
            if (_isStarted)
            {
                // Stop the heartbeat on the client
                _ = _jsRuntime.InvokeVoidAsync("blazorHeartbeat.stop");
            }

            _dotNetRef?.Dispose();
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error disposing heartbeat service for circuit {CircuitId}", _circuitId);
        }
    }
}
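
The ping times recorded through `UpdatePingTime` are only useful if something compares them against the clock. The comparison might look like the sketch below, written in JavaScript to match site.js; the same check would apply in C# against `GetLastPingTime`. The `isHeartbeatStale` name and the tolerance of three missed beats are assumptions for illustration, while the 5000 ms default comes from the interval passed to `blazorHeartbeat.start`.

```javascript
// Hypothetical helper: a circuit is treated as stale once more than
// `missedBeats` heartbeat intervals have passed without a ping.
function isHeartbeatStale(lastPingMs, nowMs, intervalMs = 5000, missedBeats = 3) {
    return nowMs - lastPingMs > intervalMs * missedBeats;
}

const lastPing = Date.parse('2025-06-07T19:59:40.000Z');
console.log(isHeartbeatStale(lastPing, lastPing + 10000)); // false: 10 s is within the 15 s tolerance
console.log(isHeartbeatStale(lastPing, lastPing + 20000)); // true: 20 s exceeds the 15 s tolerance
```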

3. JSInterop Optimization for Performance

JSInteropOptimizer.cs

using Microsoft.JSInterop;
using System.Collections.Concurrent;
using System.Text.Json;

namespace Simple.Services
{
    /// <summary>
    /// Service to optimize JSInterop calls and reduce CPU usage in MethodBase.Invoke
    /// </summary>
    public class JSInteropOptimizer
    {
        private readonly IJSRuntime _jsRuntime;
        private readonly ConcurrentDictionary<string, DotNetObjectReference<JSInteropCallback>> _callbackReferences = new();
        private readonly ConcurrentDictionary<string, TaskCompletionSource<object>> _pendingCalls = new();

        public JSInteropOptimizer(IJSRuntime jsRuntime)
        {
            _jsRuntime = jsRuntime;
        }

        /// <summary>
        /// Batches multiple JS operations to execute in one JSInterop call
        /// </summary>
        public async Task<IEnumerable<object>> BatchOperationsAsync(IEnumerable<JSOperation> operations)
        {
            var operationsArray = operations.ToArray();
            var batchId = Guid.NewGuid().ToString();

            var callback = new JSInteropCallback(results => 
            {
                // Complete the pending call registered for this batch
                if (_pendingCalls.TryRemove(batchId, out var pending))
                {
                    pending.TrySetResult(results);
                }
                return Task.CompletedTask;
            });

            var callbackRef = DotNetObjectReference.Create(callback);
            _callbackReferences[batchId] = callbackRef;

            var tcs = new TaskCompletionSource<object>();
            _pendingCalls[batchId] = tcs;

            try
            {
                await _jsRuntime.InvokeVoidAsync("domBatch.processBatchOperations", 
                    operationsArray, callbackRef, batchId);

                var results = await tcs.Task;
                return ((JsonElement)results).EnumerateArray()
                    .Select(e => e.ValueKind == JsonValueKind.Null ? null : e.GetRawText())
                    .ToArray();
            }
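            finally
            {
                // Clean up even when the JS call throws, so failed batches do not leak entries
                _pendingCalls.TryRemove(batchId, out _);
                _callbackReferences.TryRemove(batchId, out var reference);
                reference?.Dispose();
            }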
        }

        /// <summary>
        /// Performs a DOM update operation optimized to reduce JSInterop calls
        /// </summary>
        public async Task UpdateDomElementAsync(string id, string property, object value)
        {
            await _jsRuntime.InvokeVoidAsync("updateDomElement", id, property, value);
        }

        /// <summary>
        /// Caches a DOM reference to avoid repeated lookups
        /// </summary>
        public async Task<bool> CacheDomReferenceAsync(string id, string selector)
        {
            return await _jsRuntime.InvokeAsync<bool>("cacheDomReference", id, selector);
        }

        /// <summary>
        /// Debounces an event to reduce the frequency of JSInterop calls
        /// </summary>
        public async Task DebounceEventAsync(string eventName, int waitMs = 100)
        {
            await _jsRuntime.InvokeVoidAsync("debounceEvent", eventName, waitMs);
        }
    }

    /// <summary>
    /// Represents a JavaScript operation to be batched
    /// </summary>
    public class JSOperation
    {
        public string Type { get; set; }
        public string Target { get; set; }
        public object[] Args { get; set; }

        public JSOperation(string type, string target, params object[] args)
        {
            Type = type;
            Target = target;
            Args = args;
        }
    }

    /// <summary>
    /// Callback class for receiving results from batched JS operations
    /// </summary>
    public class JSInteropCallback
    {
        private readonly Func<object, Task> _callback;

        public JSInteropCallback(Func<object, Task> callback)
        {
            _callback = callback;
        }

        [JSInvokable]
        public Task OnResultsReceived(object results)
        {
            return _callback(results);
        }
    }
}
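
Debouncing collapses a burst of events into a single call once the burst has gone quiet. The sketch below mirrors the shape of the `debounceEvent` function in site.js, defined locally so that it runs on its own. The `onResize` handler and the width values are hypothetical stand-ins for an event that would otherwise trigger one interop call per occurrence.

```javascript
// Same shape as window.debounceEvent in site.js, defined locally for this sketch.
const debounceTimers = {};

function debounceEvent(eventName, func, wait) {
    return function (...args) {
        // Restart the quiet period every time the event fires.
        clearTimeout(debounceTimers[eventName]);
        debounceTimers[eventName] = setTimeout(() => func.apply(this, args), wait);
    };
}

// Hypothetical handler standing in for a call back into .NET.
let calls = 0;
let lastWidth = null;
const onResize = debounceEvent('resize', (width) => {
    calls += 1;
    lastWidth = width;
}, 100);

// Three rapid events: only the last one survives the 100 ms quiet period.
onResize(800);
onResize(810);
onResize(820);

console.log(calls); // 0 here, because the timer has not fired yet
setTimeout(() => console.log(calls, lastWidth), 150); // 1 820 once the timer has fired
```

In a browser the wrapped function would typically be passed to `addEventListener`, so the handler runs once per burst rather than once per event.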

4. JavaScript Client-Side Optimizations

site.js (Added JavaScript Functions)

// Add heartbeat mechanism to prevent WebSocket timeouts
window.blazorHeartbeat = {
    interval: null,
    dotNetReference: null,

    // Start the heartbeat
    start: function(dotNetRef, intervalMs) {
        // Clear any existing interval
        if (this.interval) {
            clearInterval(this.interval);
        }

        // Store the .NET reference
        this.dotNetReference = dotNetRef;

        // Set default interval if not provided
        if (!intervalMs) intervalMs = 5000; // 5 seconds by default

        // Start the heartbeat interval
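        this.interval = setInterval(() => {
            if (this.dotNetReference) {
                try {
                    // Send a ping to the server. invokeMethodAsync returns a Promise,
                    // so failures are handled as rejections as well as synchronous throws.
                    this.dotNetReference.invokeMethodAsync('ReceiveHeartbeat')
                        .catch((e) => {
                            console.error('Error sending heartbeat', e);
                            this.stop();
                        });
                } catch (e) {
                    console.error('Error sending heartbeat', e);
                    this.stop();
                }
            }
        }, intervalMs);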

        return true;
    },

    // Stop the heartbeat
    stop: function() {
        if (this.interval) {
            clearInterval(this.interval);
            this.interval = null;
        }

        // Release the .NET reference
        this.dotNetReference = null;
    }
};

// Cache DOM references to reduce the need for JS interop
let domCache = {};

window.cacheDomReference = function(id, selector) {
    domCache[id] = document.querySelector(selector);
    return domCache[id] !== null;
};

window.updateDomElement = function(id, property, value) {
    const element = domCache[id] || document.getElementById(id);
    if (element) {
        element[property] = value;
        return true;
    }
    return false;
};

// Add event debouncing to reduce number of JSInterop calls
const debounceTimers = {};

window.debounceEvent = function(eventName, func, wait) {
    return function() {
        const context = this;
        const args = arguments;
        clearTimeout(debounceTimers[eventName]);
        debounceTimers[eventName] = setTimeout(() => {
            func.apply(context, args);
        }, wait);
    };
};

// Batch multiple DOM operations to reduce JSInterop calls
window.domBatch = {
    operations: [],
    timeoutId: null,

    // Add an operation to the batch
    addOperation: function(operation) {
        this.operations.push(operation);

        if (!this.timeoutId) {
            this.timeoutId = setTimeout(() => this.processBatch(), 50);
        }
    },

    // Process all batched operations
    processBatch: function() {
        const ops = this.operations;
        this.operations = [];
        this.timeoutId = null;

        for (let i = 0; i < ops.length; i++) {
            const op = ops[i];
            try {
                if (op.type === 'setAttribute') {
                    const el = document.getElementById(op.id);
                    if (el) el.setAttribute(op.name, op.value);
                } else if (op.type === 'removeAttribute') {
                    const el = document.getElementById(op.id);
                    if (el) el.removeAttribute(op.name);
                } else if (op.type === 'setProperty') {
                    const el = document.getElementById(op.id);
                    if (el) el[op.name] = op.value;
                } else if (op.type === 'setStyle') {
                    const el = document.getElementById(op.id);
                    if (el && el.style) el.style[op.name] = op.value;
                } else if (op.type === 'addClass') {
                    const el = document.getElementById(op.id);
                    if (el) el.classList.add(op.name);
                } else if (op.type === 'removeClass') {
                    const el = document.getElementById(op.id);
                    if (el) el.classList.remove(op.name);
                } else if (op.type === 'toggleClass') {
                    const el = document.getElementById(op.id);
                    if (el) el.classList.toggle(op.name);
                }
            } catch (e) {
                console.error('Error in batch operation', e, op);
            }
        }
    },

    // Process batch operations and call back with results
    processBatchOperations: function(operations, dotnetRef, batchId) {
        const results = [];

        for (let i = 0; i < operations.length; i++) {
            const op = operations[i];
            try {
                let result = null;

                if (op.type === 'getAttribute') {
                    const el = document.getElementById(op.target);
                    result = el ? el.getAttribute(op.args[0]) : null;
                } else if (op.type === 'getProperty') {
                    const el = document.getElementById(op.target);
                    result = el ? el[op.args[0]] : null;
                } else if (op.type === 'querySelector') {
                    const root = op.target ? document.getElementById(op.target) : document;
                    const el = root ? root.querySelector(op.args[0]) : null;
                    result = el ? el.id || true : null;
                } else if (op.type === 'invokeMethod') {
                    const el = document.getElementById(op.target);
                    if (el && typeof el[op.args[0]] === 'function') {
                        result = el[op.args[0]].apply(el, op.args.slice(1));
                    }
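                } else {
                    // Write operations return no value. processBatch reads id/name/value,
                    // so map the target/args shape sent from .NET onto those fields.
                    this.addOperation({
                        type: op.type,
                        id: op.target,
                        name: op.args[0],
                        value: op.args[1]
                    });
                }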

                results.push(result);
            } catch (e) {
                console.error('Error in batch operation', e, op);
                results.push(null);
            }
        }

        // Process any operations that were added
        if (this.operations.length > 0) {
            this.processBatch();
        }

        // Call back to .NET with the results
        dotnetRef.invokeMethodAsync('OnResultsReceived', results);
    }
};
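
To make the batching contract concrete: each operation arrives with `type`, `target`, and `args` fields, and the results array lines up with the operations by index. The sketch below is a reduced version of the read branch of `processBatchOperations`, run against a hypothetical stand-in for the DOM so it works outside a browser. The `fakeElements` table, the element ids, and the `runReadOperations` name are all assumptions for illustration.

```javascript
// Hypothetical stand-in for the DOM so the sketch runs outside a browser.
const fakeElements = {
    'title': { getAttribute: (name) => (name === 'data-state' ? 'open' : null) },
    'name-input': { value: 'Ada', getAttribute: () => null },
};
const getElementById = (id) => fakeElements[id] || null;

// Reduced version of the read branch of processBatchOperations.
function runReadOperations(operations) {
    return operations.map((op) => {
        const el = getElementById(op.target);
        if (!el) return null;
        if (op.type === 'getAttribute') return el.getAttribute(op.args[0]);
        if (op.type === 'getProperty') return el[op.args[0]];
        return null;
    });
}

// Operations in the shape the JS side reads (type, target, args).
const results = runReadOperations([
    { type: 'getAttribute', target: 'title', args: ['data-state'] },
    { type: 'getProperty', target: 'name-input', args: ['value'] },
    { type: 'getAttribute', target: 'missing', args: ['id'] },
]);

console.log(JSON.stringify(results)); // ["open","Ada",null]
```

Three reads are answered in one round trip, and a missing element yields `null` in its slot rather than shifting the other results.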

5. Service Registration in Program.cs

// Add Circuit Handler
builder.Services.AddScoped<CircuitHandler, AzureCircuitHandler>();

// Add HeartbeatCircuitHandler to manage WebSocket connections
builder.Services.AddScoped<CircuitHandler, HeartbeatCircuitHandler>();

// Add BlazorHeartbeatService to prevent WebSocket timeouts
builder.Services.AddScoped<BlazorHeartbeatService>();

// Add HttpClient
builder.Services.AddHttpClient();

// Add Identity Service
builder.Services.AddScoped<Simple.SimpleIdentityService>();

// Add JSInteropOptimizer service to reduce JSInterop overhead
builder.Services.AddScoped<JSInteropOptimizer>();

Testing and Verification

  1. Azure Configuration Checks:
    • Verified WebSockets are enabled in Azure App Service settings
    • Confirmed proper Azure App Service configuration for Session Affinity
  2. Performance Testing:
    • Monitored WebSocket connection lifetimes in production
    • Verified CPU usage reduction in MethodBase.Invoke

Results

  • WebSocket connections remain stable without the 1006 status code disconnections
  • CPU usage in JSInterop calls has been significantly reduced
  • Application performance and stability improved across all environments

Additional Recommendations

  1. Monitoring:
    • Set up alerts for WebSocket disconnections in Application Insights
    • Monitor CircuitHandler events to detect patterns of connection issues
  2. Regular Maintenance:
    • Keep SignalR and WebSocket settings in sync when making changes
    • Review JavaScript interop optimization opportunities regularly
  3. Client-Side Considerations:
    • Ensure mobile browsers support WebSockets properly
    • Consider implementing client-side reconnection strategies
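
One common ingredient of a client-side reconnection strategy is an exponential backoff schedule, so that retries slow down instead of hammering the server. The following is a minimal sketch of such a schedule, not Blazor's built-in reconnection behavior; the `reconnectDelayMs` name and the 1-second base and 30-second cap are assumptions chosen for illustration.

```javascript
// Hypothetical helper: delay before reconnect attempt `attempt` (0-based),
// doubling from baseMs and capped at maxMs.
function reconnectDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

const delays = [0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelayMs(n));
console.log(delays.join(', ')); // 1000, 2000, 4000, 8000, 16000, 30000, 30000
```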

This comprehensive solution addresses both the immediate WebSocket connection issues and the underlying JSInterop performance problems, resulting in a more stable and responsive application.
