In a stroke of bad timing that would be comical if it were not so annoying, Microsoft's multi-factor authentication (MFA) system, used for Azure, Office 365 and Dynamics, has gone down for the second time this month, only a few hours after the company published its findings on a 14-hour outage on November 19.
In that earlier outage, Azure Active Directory multi-factor authentication services went offline just before 05:00 UTC and did not recover until shortly before 19:00 UTC. The servers initially affected were those serving the Europe and Middle East region and the Asia-Pacific region; as those regions woke up and tried to authenticate, the servers became overloaded and failed. Microsoft tried to redirect some authentication attempts to its US servers, but this simply overloaded them too.
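The failover problem described above can be illustrated with a small back-of-the-envelope model. This is only a sketch: the region names, capacities, demand figures and the `overloaded_after_failover` helper are all hypothetical, chosen for illustration, and do not reflect Microsoft's actual infrastructure or numbers.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    capacity: float  # requests/sec the region can serve (hypothetical figure)
    demand: float    # requests/sec arriving locally (hypothetical figure)


def overloaded_after_failover(regions, failed, fraction=1.0):
    """Return the healthy regions pushed over capacity by redirected traffic.

    `fraction` is the share of the failed regions' demand that is redirected;
    it is spread evenly across the regions that are still healthy.
    """
    healthy = [r for r in regions if r.name not in failed]
    if not healthy:
        return []
    spill = fraction * sum(r.demand for r in regions if r.name in failed)
    extra = spill / len(healthy)
    return [r.name for r in healthy if r.demand + extra > r.capacity]


# Hypothetical numbers: two regions are already down, one is still healthy.
regions = [
    Region("EMEA", capacity=100, demand=120),
    Region("APAC", capacity=100, demand=110),
    Region("US", capacity=100, demand=60),
]

# Redirecting a quarter of the failed regions' traffic is enough to tip the
# remaining region over its capacity in this toy model.
print(overloaded_after_failover(regions, {"EMEA", "APAC"}, fraction=0.25))
```

In this toy model, redirecting even a modest share of traffic from failed regions can exceed the spare capacity of the region that is still healthy, which is the general shape of the cascade the article describes.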
The company's subsequent analysis showed that three separate bugs combined to cause the problems. On November 19, a code change that had been rolled out progressively over the previous six days triggered a cascade of failures. Above a certain level of traffic, the new code caused a significant increase in latency between the front-end servers and the cache servers. This, in turn, exposed a race condition in the back-end servers, which caused the application servers to reboot again and again. That then revealed a third problem: the back-end servers would create more and more processes, eventually running out of resources and becoming unresponsive.
Today's problems are still under investigation. The MFA servers have been timing out since 14:25 UTC, causing login attempts to fail when MFA is in use. For now, the company believes that the resolution of an earlier DNS error produced a barrage of authentication attempts, essentially flooding the MFA system with more requests than it can handle.