Tag Archives: SharePoint 2013

Changing the Distributed Cache Service Account

So you want to follow the principle of least privilege for your SharePoint 2013 farm and decide to create a dedicated service account for Distributed Cache. You head on over to TechNet and check out Manage the Distributed Cache service in SharePoint Server 2013: Change the service account, where you find the following script:

$farm = Get-SPFarm
$cacheService = $farm.Services | where {$_.Name -eq "AppFabricCachingService"}
$accnt = Get-SPManagedAccount -Identity domain_name\user_name
$cacheService.ProcessIdentity.CurrentIdentityType = "SpecificUser"
$cacheService.ProcessIdentity.ManagedAccount = $accnt
$cacheService.ProcessIdentity.Update() 
$cacheService.ProcessIdentity.Deploy()

Provided you’ve already added your dedicated service account as a Managed Account, the script works. The trouble is the documentation is missing one important piece of information: the service account needs to be a local machine administrator on all the cache hosts before running the Deploy() method (the last line).

If the account is not a local machine administrator, you’ll get this exception after waiting several minutes:

Exception calling "Deploy" with "0" argument(s): "Error occurred while performing the operation on host
CACHEHOST:22233 : ErrorCode<ERRCAdmin003>:SubStatus<ES0001>:Time-out occurred on
net.tcp://CACHEHOST.example.com:22233."
At line:1 char:1
+ $cacheservice.ProcessIdentity.deploy()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : CmdletInvocationException

What happens is the AppFabricCachingService Windows service gets stuck on starting because the service account doesn’t have the necessary rights on the server to set up the service for the first time. Grant it local admin and Deploy() goes off smoothly.

Remember to remove the local admin rights from the service account and restart the server once Distributed Cache is running. After all, you’re following least privilege, and the last thing you want is a service account running around as a local administrator.

Note as well that when you first set up the farm, Distributed Cache uses the farm service account, which also needs to be a local admin for the same reason (the AppFabricCachingService won’t start otherwise).

One last reminder: if you spin up a new server or want to turn on Distributed Cache on another server in the farm, you’ll first need to grant the current Distributed Cache service account local admin rights on the new server; otherwise you’ll encounter the same issue.
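The grant-then-revoke routine could be scripted along these lines. This is only a sketch: the account name and host names are hypothetical placeholders, and it assumes PowerShell remoting is enabled on the cache hosts and that you run it as someone who is already a local administrator there.

```powershell
# Hypothetical values: replace with your own managed account and cache host names.
$account    = 'EXAMPLE\svc-distcache'
$cacheHosts = 'CACHEHOST1', 'CACHEHOST2'

# Before Deploy(): add the account to the local Administrators group on every cache host.
Invoke-Command -ComputerName $cacheHosts -ArgumentList $account -ScriptBlock {
    param($acct)
    net localgroup Administrators $acct /add
}

# ... run the TechNet script here, including the final Deploy() call ...

# Once Distributed Cache is running: take the rights away again,
# then restart each server as described above.
Invoke-Command -ComputerName $cacheHosts -ArgumentList $account -ScriptBlock {
    param($acct)
    net localgroup Administrators $acct /delete
}
```

The same first step applies when adding a new cache host later: run the `/add` part against the new server before starting Distributed Cache on it.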


Distributed Cache Needs Ping

After setting up a number of SharePoint 2013 farms in different environments, I discovered that to set up the Distributed Cache service correctly you need to allow ICMPv4 (ping) traffic between the cache hosts. This requirement is partially documented at the bottom of a TechNet page.

Check out the full story in the Habanero Insight, Distributed Cache Needs Ping
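
A quick way to check whether ping gets through is to run something like the following on each cache host. The host names are hypothetical placeholders, and this sketch only tells you whether ICMP echo works from the machine you run it on; it doesn't change any firewall rules.

```powershell
# Hypothetical cache host names: replace with the servers in your own farm.
$cacheHosts = 'CACHEHOST1', 'CACHEHOST2'

foreach ($cacheHost in $cacheHosts) {
    # -Quiet collapses the result to $true (at least one reply) or $false (no reply).
    $answers = Test-Connection -ComputerName $cacheHost -Count 2 -Quiet
    Write-Output "$cacheHost answers ping: $answers"
}
```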


Are we using SharePoint Enterprise Features?

Here’s a quick PowerShell script (actually two scripts) that will go through your web applications, site collections, and sites and tell you which ones are using Enterprise features.

The first script is for SharePoint 2010 and SharePoint 2013:

Add-PSSnapin Microsoft.SharePoint.PowerShell
foreach ($webapp in Get-SPWebApplication) {

	foreach ($feature in $webapp.Features) {

		if ($feature.Definition.Displayname -eq "PremiumWebApplication") {
			Write-Output "$($Webapp.DisplayName) contains enterprise web application features"
		} # if enterprise web application feature

	} # foreach web application feature

	foreach ($site in $webapp.Sites) {

		foreach ($feature in $Site.Features) {

			if ($feature.Definition.Displayname -eq "PremiumSite") {
				Write-Output "$($Site.Url) contains enterprise site collection features"
			} # if enterprise site collection feature

		} # foreach site collection feature

		foreach ($web in $site.AllWebs) {

			foreach ($feature in $web.Features) {

				if ($feature.Definition.Displayname -eq "PremiumWeb") {
					Write-Output "$($web.Url) contains enterprise site features"
				} # if enterprise site feature

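			} # foreach site feature

			$web.Dispose() # release the SPWeb once its features have been checked

		} # foreach site

		$site.Dispose() # release the SPSite once all of its sites have been checked

	} # foreach site collection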

} # foreach web application

The second is for SharePoint 2007:

[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint") # load the object model without echoing the assembly details
$farm = [Microsoft.SharePoint.Administration.SPFarm]::Local
$websvcs = $farm.Services | where -FilterScript {$_.GetType() -eq [Microsoft.SharePoint.Administration.SPWebService]}
foreach ($websvc in $websvcs) {
    foreach ($webapp in $websvc.WebApplications) {  

		foreach ($feature in $webapp.Features) {
						
			if ($feature.Definition.Displayname -eq "PremiumWebApplication") {
				Write-Output "$($Webapp.DisplayName) contains enterprise web application features"
			} # if enterprise web application feature
			
		} # foreach web application feature
			
        foreach ($site in $webapp.Sites) {
		
			foreach ($feature in $Site.Features) {
				
				if ($feature.Definition.Displayname -eq "PremiumSite") {
					Write-Output "$($Site.Url) contains enterprise site collection features"
				} # if enterprise site collection feature
				
			} # foreach site collection feature
	
			foreach ($web in $site.AllWebs) {
			
				foreach ($feature in $web.Features) {
				
					if ($feature.Definition.Displayname -eq "PremiumWeb") {
						Write-Output "$($web.Url) contains enterprise site features"
					} # if enterprise site feature
					
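				} # foreach site feature

				$web.Dispose() # release the SPWeb once its features have been checked

			} # foreach site

			$site.Dispose() # release the SPSite once all of its sites have been checked

		} # foreach site collection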
		
	} # foreach web application
						
} # foreach web service

The output will look something like this:

WebAppExample1 contains enterprise web application features
http://webappexample1/sites/Test contains enterprise site collection features
http://webappexample1/sites/Test contains enterprise site features
http://webappexample1/sites/Test/Site1 contains enterprise site features
http://webappexample1/sites/Test/Site1/SubSite contains enterprise site features
http://webappexample1/sites/testteamsite contains enterprise site collection features
http://webappexample1/sites/testteamsite contains enterprise site features
WebAppExample2 contains enterprise web application features
MySites contains enterprise web application features
SSPAdmin contains enterprise web application features
http://sspadmin/ssp/admin contains enterprise site collection features
http://sspadmin/ssp/admin/Content contains enterprise site features

Where’s that pesky Correlation ID?

Sometimes when you’re kicking around on a SharePoint site you encounter something strange or you get an error. On some sites, for example a public web site, the error may not contain the correlation ID — SharePoint’s unique identifier for your request — that comes standard on error pages in SharePoint 2010 and SharePoint 2013. At some point the site’s architect decided outputting this to the user wasn’t something they wanted and the developers removed it from the error page. As in, the “error page” is “friendly” and contains no “useful” diagnostic information.

Not having the correlation ID makes troubleshooting difficult, because the ID helps enormously with tracking down exceptions and makes following the flow of a request within the ULS logs (relatively) simple.

[Got a correlation ID and not sure how to find the log entries? Check out An even better way to get the real SharePoint error from the ULS logs, featuring the always excellent Merge-SPLogFile SharePoint PowerShell cmdlet.]
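
As a rough idea of what that looks like, a Merge-SPLogFile call filtered by correlation ID might be written as below. The GUID and output path are hypothetical placeholders; run it from a farm server with sufficient rights.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical correlation ID and output path: replace with your own.
$correlationId = '00000000-0000-0000-0000-000000000000'
$outputPath    = 'C:\Temp\correlation.log'

# Collect matching ULS entries from every server in the farm into one file.
# Limiting the time window keeps the merge quick on busy farms.
Merge-SPLogFile -Path $outputPath -Correlation $correlationId -StartTime (Get-Date).AddHours(-1) -Overwrite
```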

Thankfully, SharePoint tells you the correlation ID on every request to the site, even if there weren’t any errors. Finally, reading ULS logs becomes a realistic hobby.

So where is the correlation ID? It’s in the SPRequestGuid HTTP response header (MSDN: SPRequestGuid). If you’re using a tool like Fiddler, it captures all the headers in its log. This is useful if you or your testers are doing lots of tests and you want to review a specific test in the logs later.

If you don’t use Fiddler, never fear. On Windows, Chrome, Firefox, and probably IE can show you the headers in their developer tools; other browsers and platforms may too.
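
If you’d rather script it, the header can also be read from PowerShell. The following is a minimal sketch assuming Windows PowerShell 3.0 or later and a site that accepts your Windows credentials; the URL is a hypothetical placeholder.

```powershell
# Hypothetical URL: replace with a page on your own site.
$url = 'http://sharepoint.example.com/SitePages/Home.aspx'
$correlationId = $null

try {
    $response = Invoke-WebRequest -Uri $url -UseDefaultCredentials -UseBasicParsing
    $correlationId = $response.Headers['SPRequestGuid']
}
catch [System.Net.WebException] {
    # In Windows PowerShell, error responses (for example HTTP 500) arrive as exceptions,
    # but the response and its headers are still attached when the server answered.
    if ($_.Exception.Response) {
        $correlationId = $_.Exception.Response.Headers['SPRequestGuid']
    }
}

Write-Output "Correlation ID: $correlationId"
```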

Correlation IDs were introduced in SharePoint 2010, so this only works in SharePoint 2010 or newer farms, including SharePoint Online (Office 365). You may wonder how useful it is to have the correlation ID in SPO (or any hosted SharePoint solution) since you don’t have physical access to the ULS logs, but consider providing the correlation ID to Microsoft Support when working with them so you can help them narrow down your request from the hojillion requests hitting their servers. The same goes if you’re not the farm administrator: for your SharePoint administrators, getting the correlation ID in your email is like winning the lottery. You’ll totally make their day.

Let’s see it in action. In Chrome it looks something like this for SharePoint 2010:

The SPRequestGuid HTTP response header holds the correlation ID for a request to a SharePoint 2010 site

SharePoint 2013:

The SPRequestGuid HTTP response header holds the correlation ID for a request to a SharePoint 2013 site

SharePoint Online:

The SPRequestGuid HTTP response header holds the correlation ID for a request to a SharePoint Online site

Wonderful.


Opening SharePoint Networking Ports Before Creating a Farm

There are a few articles up on TechNet about hardening SharePoint:

Plan security hardening for server roles within a server farm (Office SharePoint Server)
Plan security hardening (SharePoint Server 2010)
Plan security hardening for SharePoint 2013

These guides detail the networking ports SharePoint needs in order to function. And there are quite a few ports:

  • TCP 80, 443, and any custom ports: Web applications
  • TCP 16500-16519: Search index component
  • TCP 22233-22236 and ICMP: AppFabric Caching Service, which is used by the Distributed Cache SharePoint service
  • TCP 808: Windows Communication Foundation, used for communication between search components
  • TCP 32843 (HTTP), TCP 32844 (HTTPS), TCP 32845 (net.tcp): SharePoint web services
  • TCP 5725: Forefront Identity Manager (FIM), used by the user profile service
  • TCP/UDP 389 (LDAP), TCP/UDP 88 (Kerberos), TCP/UDP 53 (DNS), UDP 464 (Kerberos Change Password): Active Directory queries and integration
  • TCP 1433: SQL Server
  • UDP 1434: SQL Server Browser (if using a non-default SQL instance and you’re not specifying the port in the connection string)
  • TCP 25: Incoming/outgoing email

Anyway, my advice is to ensure these ports are open on your farm servers before installing SharePoint. Out of the box, Windows Firewall or a Group Policy Object (GPO) may block some of these ports. Also consider any network firewalls and ensure they are not blocking the ports between servers. There’s nothing like running into issues while setting up a farm and finally figuring out that the cause is one of these ports being blocked!
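
One way to spot-check TCP reachability between servers is a small helper like the one below. It is a sketch, not a full port scanner: `Test-TcpPort` is a name made up for this example, it only covers TCP (not UDP or ICMP), and a port can only report `Open` if something is already listening on it. Before SharePoint is installed that mostly means SQL Server and Active Directory; for SharePoint’s own ports, rerun the check once the services are up. As a rough reading, `Timeout` suggests something is silently dropping the traffic, while `Closed` means the connection attempt failed outright (for example, refused or the name didn’t resolve).

```powershell
function Test-TcpPort {
    param(
        [Parameter(Mandatory = $true)][string]$ComputerName,
        [Parameter(Mandatory = $true)][int]$Port,
        [int]$TimeoutMs = 2000
    )

    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Start the connection attempt and wait up to $TimeoutMs for it to finish.
        $pending = $client.BeginConnect($ComputerName, $Port, $null, $null)
        if (-not $pending.AsyncWaitHandle.WaitOne($TimeoutMs, $false)) {
            return 'Timeout'
        }
        # EndConnect throws if the attempt failed (for example, connection refused).
        $client.EndConnect($pending)
        return 'Open'
    }
    catch {
        return 'Closed'
    }
    finally {
        $client.Close()
    }
}

# Example usage with a hypothetical SQL Server host name:
# Test-TcpPort -ComputerName 'SQL01' -Port 1433
```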

I’m currently developing a SharePoint 2013 hardening guide which will look at these ports in more detail — why they’re needed, what they do, and on which servers they need to be opened. Stay tuned!


X of X slides that were being published to http://sharepoint/slide/library failed. Try publishing again.

If you receive the error,

X of X slides that were being published to http://sharepoint/slide/library failed. Try publishing again

when trying to publish a PowerPoint file or slide to a SharePoint Slide Library where X is the number of slides in the presentation, make sure the WebClient Windows service is present on your computer and started.

To see if it’s present and its status, open an elevated CMD prompt and use sc.exe:

sc query WebClient

If it’s stopped, start it:

net start WebClient
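
If you prefer PowerShell, the same check-and-start can be sketched as follows (run it from an elevated prompt; it only uses the built-in Get-Service and Start-Service cmdlets):

```powershell
# Look for the WebClient service without raising an error if it isn't installed.
$service = Get-Service -Name WebClient -ErrorAction SilentlyContinue

if ($null -eq $service) {
    Write-Output 'The WebClient service is not installed on this machine.'
}
elseif ($service.Status -ne 'Running') {
    Start-Service -Name WebClient
    Write-Output 'WebClient was not running and has been started.'
}
else {
    Write-Output 'WebClient is already running.'
}
```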

The service may not be on your system. The WebClient service is normally part of the client versions of Windows (XP, Vista, 7, 8). On Windows Server you need to add the Desktop Experience feature, which installs the service. You’ll need to reboot as well. Once the server is back up you can start the WebClient service.

That said, WebClient is a client service. It’s not needed on a SharePoint server; only the end users need it. You shouldn’t be installing the service on a SharePoint server unless you know what you’re doing. 😉


Distributed Cache bug in SharePoint Server 2013

Distributed Cache is a new component of SharePoint 2013 that is used to cache data for activity feeds, news feeds, search queries, authentication tokens, security trimming, Apps-related data, and views. Even though it’s making its debut, it’s a pretty critical component of a SharePoint farm.

The Distributed Cache service uses Windows AppFabric caching technology behind the scenes.

The cache can consume a lot of memory and needs constant access to the stored data, so for best performance Microsoft recommends including dedicated Distributed Cache servers in your farm. In large server farms this makes a lot of sense, though for smaller farms you can usually make do without the dedicated servers.

On a recent project, I ran into an issue with Distributed Cache: requests for items in the cache kept timing out, which caused delays to other components that were relying on the data from the cache. These weren’t occasional failures either; there were hundreds of timeouts every second. Something was up with the service.

Tracing through the logs, we saw that when a user accesses a page, SharePoint attempts to authorize the user to ensure they have access. SharePoint stores the user’s token in the user’s browser session and in the DistributedLogonTokenCache container. When SharePoint tried to retrieve the token from Distributed Cache, the connection would time out or a connection would be unavailable and the comparison would fail. Since it couldn’t validate the presented token, SharePoint had no choice but to log the user out and redirect them to the sign-in page.

One of the interesting things about this issue was that when I consulted MSDN about the timeout values, the documentation didn’t provide the units for the values. I had no idea if the timeouts were in milliseconds or seconds.

What are the units for ChannelOpenTimeOut and RequestTimeout? ChannelInitializationTimeout is much larger at 60000, so maybe it’s milliseconds. Are RequestTimeout and ChannelOpenTimeOut then 20 milliseconds? That seems really small. Maybe it’s 20 seconds? The MSDN page for RequestTimeout doesn’t provide an answer, so we initially had to guess. In our development environments we were able to reproduce the issue when we reduced the timeouts to a value of “5”. So we tried increasing them to 40 in the test environments. Then 60. Then 120. The issue persisted.
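
For reference, these timeouts are part of the Distributed Cache client settings kept for each cache container. The sketch below shows how they might be read and changed from the SharePoint 2013 Management Shell. It assumes the farm’s patch level includes the Get-SPDistributedCacheClientSetting and Set-SPDistributedCacheClientSetting cmdlets, and it targets the logon token container (container type DistributedLogonTokenCache) because that is the one involved in the sign-out behaviour described above. The value 40 is just the first value tried in the test environments, not a recommendation, and changing it did not resolve the issue.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Read the current client settings for the logon token cache container.
$settings = Get-SPDistributedCacheClientSetting -ContainerType DistributedLogonTokenCache
$settings | Format-List ChannelOpenTimeOut, RequestTimeout, ChannelInitializationTimeout

# Change the two timeouts and write the settings back.
$settings.ChannelOpenTimeOut = 40
$settings.RequestTimeout     = 40
Set-SPDistributedCacheClientSetting -ContainerType DistributedLogonTokenCache -DistributedCacheClientSettings $settings
```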

With the help of Microsoft Support I sorted out these initial questions but the issue continued even after increasing the timeouts to larger values. Microsoft called in help from their development support team and with some additional logging determined the issue was actually caused by the way AppFabric handles garbage collection. AppFabric 1.1 Cumulative Update 1 is a prerequisite for SharePoint 2013 and in this version garbage collection “takes too long.”

In AppFabric 1.1 CU1, imagine that the garbage collection happens with a little man who walks around the memory of the computer with one of those sticks with a nail on the end. When the man finds things lying around that AppFabric no longer needs he stabs the garbage with the nail-stick and takes it away. He continues looking for other pieces to clean up and for a room that is 14 GB in size this can take quite some time. He tells AppFabric once he’s done, and then AppFabric un-pauses and continues where it left off. Since everything is waiting for our garbage collector to finish checking everything lying around, other dependent services will get tired of waiting and move on. Sometimes this results in having to perform the original operation again (like a search query), and sometimes it means there is no data available to the requesting service. Sometimes it will result in an exception, and sometimes, as in our case the user gets logged out of the site.

So Microsoft wrote a hotfix that changes the way garbage collection happens in AppFabric. Instead of telling everything to wait for our garbage collector and asking him to go find all of the trash, the hotfix now tells the garbage collector to walk around looking for trash to pick up forever. With our man on the ground always tidying up, AppFabric can now just request things without waiting.

As of this writing, the most recent AppFabric CU is AppFabric Cumulative Update 4. I recommend applying this update to your SharePoint 2013 farms if you’re experiencing lots of timeouts when calling Distributed Cache. Once applied, you need to modify the Distributed Cache configuration file, which is typically found at C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe.config. Add the following section within the configuration element, between the configSections and dataCacheConfig elements:

<appSettings><add key="backgroundGC" value="true"/></appSettings>

So you end up with something like this:


<?xml version="1.0" encoding="utf-8"?>
<configuration>
   <configSections>
      ... other configurations ...
   </configSections>
   <appSettings><add key="backgroundGC" value="true"/></appSettings>
   <dataCacheConfig>
      ... other configurations ...
   </dataCacheConfig>
</configuration>

(Update January 30, 2014: HT to Aben Samuel and Gavin Barron for discovering that the appSettings element needs to go between configSections and dataCacheConfig)
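
If you have several cache hosts to update, the edit can be scripted. The function below is a sketch of one way to do it; `Add-BackgroundGCSetting` is a name made up for this example, and it assumes the file has a plain configuration root with a configSections element, as in the snippet above. It takes the file’s XML as a string, adds the appSettings element right after configSections if it isn’t there yet, sets backgroundGC to true, and returns the updated XML. Back up the config file before overwriting it.

```powershell
function Add-BackgroundGCSetting {
    param([Parameter(Mandatory = $true)][string]$ConfigXml)

    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    $doc.LoadXml($ConfigXml)
    $root = $doc.DocumentElement

    # Reuse an existing appSettings element, or create one directly after configSections.
    $appSettings = $root.SelectSingleNode('appSettings')
    if ($null -eq $appSettings) {
        $appSettings    = $doc.CreateElement('appSettings')
        $configSections = $root.SelectSingleNode('configSections')
        [void]$root.InsertAfter($appSettings, $configSections)
    }

    # Add the backgroundGC key if it is missing, then make sure it is set to true.
    $add = $appSettings.SelectSingleNode("add[@key='backgroundGC']")
    if ($null -eq $add) {
        $add = $doc.CreateElement('add')
        $add.SetAttribute('key', 'backgroundGC')
        [void]$appSettings.AppendChild($add)
    }
    $add.SetAttribute('value', 'true')

    return $doc.OuterXml
}

# Example usage against the typical path mentioned above (keeps a backup copy first):
# $path = 'C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe.config'
# Copy-Item -Path $path -Destination "$path.bak"
# $updated = Add-BackgroundGCSetting -ConfigXml (Get-Content -Path $path -Raw)
# Set-Content -Path $path -Value $updated -Encoding UTF8
```

Running the function on a file that already contains the setting leaves it with a single backgroundGC entry, so it is safe to run more than once.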
