Intermittently, the Office 365 toolbar would simply not appear in SharePoint Online. In other cases it would appear, but Outlook Web Access would get stuck at logon and eventually fail with the message “something went wrong”.
This problem only affected users on the corporate network; users connecting to the sites externally were not affected.
Looking at a Wireshark capture, we could see the traffic was going “direct” instead of via the proxy.
We were also able to reproduce the issue while running Fiddler (http://www.telerik.com/fiddler).
With Fiddler we could easily identify the pattern:
- Whenever the toolbar didn’t appear, http://cdn.sharepointonline.com returned a 502 result
- Whenever Outlook Web Access didn’t load, http://r4.res.outlook.com and http://xsi.outlook.com returned a 502 result
In the details view, Fiddler showed the IP addresses involved in the failed requests.
Looking up these IP addresses, we found they were part of the Akamai content delivery network, which Microsoft uses.
Checking the proxy configuration, we noticed a PAC file was in use. When we tested with the proxy set manually to the server and port number, the site worked, so we needed to check the PAC file. At first glance it appeared the PAC file should have sent these URLs to the proxy:
/* Externally hosted sites to be proxied */
if (dnsDomainIs(host, "cdn.sharepointonline.com")
|| dnsDomainIs(host, ".outlook.com"))
The lesson here: don’t assume the comment “to be proxied” means the traffic is actually going to be proxied. As we can see, the rule returns the value of a function named ProxyChoice.
Using my PacDbg tool (https://chentiangemalc.wordpress.com/2013/09/30/pacdbg-custom-proxy-browser-set-proxy-cmd-line-tool/), I was able to quickly figure out why it was going direct…
Within the ProxyChoice function there was a statement that would return DIRECT.
These statements had been added to improve performance for heavily utilized sites by bypassing the proxy.
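To make the trap concrete, here is a minimal sketch of how a rule labelled “to be proxied” can still end up going DIRECT. This is not the actual PAC file: the body of ProxyChoice, its single `host` parameter, the bypass list, and the proxy address `proxy.example.corp:8080` are all assumptions for illustration. The small `dnsDomainIs` function is only a stand-in so the sketch runs outside a browser; a real PAC engine provides it.

```javascript
// Stand-in for the PAC built-in dnsDomainIs so this runs outside a browser.
function dnsDomainIs(host, domain) {
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

// Hypothetical reconstruction: the real ProxyChoice body is not shown in this post.
function ProxyChoice(host) {
  // Assumed form of the performance bypass added for heavily used sites.
  if (dnsDomainIs(host, ".outlook.com") ||
      dnsDomainIs(host, ".sharepointonline.com")) {
    return "DIRECT";
  }
  return "PROXY proxy.example.corp:8080"; // hypothetical proxy address
}

function FindProxyForURL(url, host) {
  /* Externally hosted sites to be proxied */
  if (dnsDomainIs(host, "cdn.sharepointonline.com") ||
      dnsDomainIs(host, ".outlook.com")) {
    // The comment says "proxied", but the helper makes the real decision.
    return ProxyChoice(host);
  }
  return "PROXY proxy.example.corp:8080"; // hypothetical default
}
```

Calling `FindProxyForURL("http://r4.res.outlook.com/", "r4.res.outlook.com")` in this sketch returns DIRECT, even though the rule that matched it sits under a “to be proxied” comment.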
Why did it work most of the time going DIRECT?
This is because the firewall had rules to allow outbound HTTP/S traffic to the documented Office 365 IP addresses: http://office.microsoft.com/en-us/office365-sharepoint-online-enterprise-help/sharepoint-online-urls-and-ip-addresses-HA102772748.aspx
However, the host names cdn.sharepointonline.com and r4.res.outlook.com are load balanced and constantly return different IP addresses. Whenever there was a failure, it was because the IP address returned at that moment was not permitted “direct” by the firewall rules.
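The intermittent behaviour can be pictured as a simple allow-list check: a direct connection succeeds only if the address DNS hands back happens to fall inside a range the firewall knows about. The sketch below is an illustration only. The range `192.0.2.0/24` and the sample addresses are reserved documentation addresses, not real Office 365 or Akamai IPs, and the function names are made up for this example.

```javascript
// Convert a dotted IPv4 address to a number.
function ipToInt(ip) {
  return ip.split(".").reduce(function (acc, octet) {
    return acc * 256 + Number(octet);
  }, 0);
}

// True if the IPv4 address falls inside the CIDR range.
function inCidr(ip, cidr) {
  var parts = cidr.split("/");
  var size = Math.pow(2, 32 - Number(parts[1]));          // addresses in the range
  var start = Math.floor(ipToInt(parts[0]) / size) * size; // first address in the range
  var addr = ipToInt(ip);
  return addr >= start && addr < start + size;
}

// Hypothetical firewall allow-list (documentation range, not a real rule).
var allowedDirect = ["192.0.2.0/24"];

function isAllowedDirect(ip) {
  return allowedDirect.some(function (cidr) {
    return inCidr(ip, cidr);
  });
}
```

With this list, `isAllowedDirect("192.0.2.17")` is true while `isAllowedDirect("198.51.100.9")` is false, which mirrors what we saw: the same host name works on one lookup and fails on the next, depending on which address comes back.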
Adding the new IPs to the firewall fixed the issue, but this risked breaking again whenever new IPs were introduced, so the PAC file was modified to ensure these hosts went via the proxy.
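For anyone making a similar change, one way to do it is to return the proxy explicitly for these hosts instead of delegating to a helper that may decide otherwise. This is a sketch under assumptions, not the exact change we deployed: the proxy address `proxy.example.corp:8080` is hypothetical, the final `return "DIRECT"` merely stands in for whatever other rules the PAC file has, and `dnsDomainIs` is again a stand-in for the PAC built-in so the sketch runs outside a browser.

```javascript
// Stand-in for the PAC built-in dnsDomainIs so this runs outside a browser.
function dnsDomainIs(host, domain) {
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

var PROXY = "PROXY proxy.example.corp:8080"; // hypothetical proxy address

function FindProxyForURL(url, host) {
  /* Externally hosted sites that must always be proxied */
  if (dnsDomainIs(host, "cdn.sharepointonline.com") ||
      dnsDomainIs(host, ".outlook.com")) {
    // Return the proxy directly; no helper can turn this into DIRECT.
    return PROXY;
  }
  return "DIRECT"; // placeholder for the remaining rules in the file
}
```

Because the proxy resolves and connects to the CDN host itself, the client no longer depends on the firewall knowing every IP address the load balancer might return.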