SharePoint 2010 – 2013 – The Complete guide to starting the User Profile Synchronization Service

I’ve put off writing this post for a long time, hoping that the User Profile Synchronization service (aka: FIM Sync) would go away. And it is going away with the eventual retirement of SharePoint 2010 and 2013, but that’s not happening soon enough, and meanwhile we’re still seeing a lot of support cases on it.

First things first:

Do you really even need to use FIM Sync?

If you’re still using SharePoint 2010 (which you probably shouldn’t be, considering support for that product ends in April of 2021), then yes, you’re stuck with FIM as your only option.

If you’re using SharePoint 2013, you can use FIM Sync, but don’t necessarily need to. You should use AD Import instead, if possible. See this: https://joshroark.com/sharepoint-considerations-when-switching-from-fim-sync-to-ad-import/

 

Let’s say you must use FIM Sync (sorry to hear that). Where to start?

Best practices for successful startup:

  • Make sure the Farm service account (the one running the SharePoint Timer service and Central Admin app pool) is in the local Administrators group on the Sync server, and log on to the machine as that account.
  • Make sure you’re using a supported version of SQL Server to store your Sync database. For example, for SharePoint 2013, the highest version of SQL Server you can use is SQL Server 2014.
  • Make sure you’re at a supported build of SharePoint.
    • SharePoint 2013 minimum supported build is 15.0.5023.1001 (April 2018 Update).
    • SharePoint 2010 minimum supported build is 14.0.7015.1000 (Service Pack 2). — But come on, even that is from 2013. I’d recommend using a build released within the last year.
  • Make sure the “Manage servers in this farm” page in Central Admin shows the “Status” of all servers as “No Action Required”. If they say pretty much anything else, especially “Upgrade Required”, you’ll want to fix that first.
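
A couple of the checks above can be scripted from the SharePoint Management Shell. This is only a rough sketch: the minimum build is the SharePoint 2013 number quoted above (swap in the 2010 number if that’s your version), and I’m assuming that NeedsUpgrade being true is a reasonable stand-in for a server that is not at “No Action Required”. The farm BuildVersion is a coarse indicator, so treat the Central Admin pages as the final word.

```powershell
# Run on the Sync server in the SharePoint Management Shell (or load the snap-in first).
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Minimum supported SharePoint 2013 build, taken from the list above.
$minimumBuild = [version]"15.0.5023.1001"

$farm = Get-SPFarm
"Farm build:   {0}" -f $farm.BuildVersion
"Farm account: {0}" -f $farm.DefaultServiceAccount.Name

if ($farm.BuildVersion -lt $minimumBuild) {
    Write-Warning "Farm build is below $minimumBuild"
}

# Assumption: NeedsUpgrade = True roughly maps to a status other than "No Action Required".
Get-SPServer | Where-Object { $_.Role -ne "Invalid" } |
    Select-Object Address, Role, NeedsUpgrade

# Lists the members of the local Administrators group so you can confirm the Farm account is there.
net localgroup Administrators
```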

There are generally two different behaviors:

1. The User Profile Synchronization Service goes to “starting” for a few minutes and then goes back to “stopped”.

2. The User Profile Synchronization Service gets stuck at “starting” indefinitely.

Depending on which behavior you see, you’ll need to troubleshoot it slightly differently. The Sync startup process uses a one-time timer job called “ProfileSynchronizationSetupJob” to configure the service. The two behaviors indicate different problems with that timer job.

 

#1: Goes to “starting” for a few minutes and then goes back to “stopped”.

This is the more common behavior. It indicates that the ProfileSynchronizationSetupJob was created, ran, and failed with an error. In that case, we need to gather the SharePoint ULS log and check for the error.

The ProfileSynchronizationSetupJob should run on the server that you’re trying to start the Sync service on, and should start within a minute or so of attempting to start the Sync service. Check the Timer Job Status page in Central Admin for the timer job. We want to gather the log from the server that ran ProfileSynchronizationSetupJob, covering the entire duration of the timer job, which can be up to 10 minutes because it attempts to start the service multiple times.

If you don’t already have it, you’ll want to download ULSViewer. It makes analyzing ULS logs much easier. What I like to do is put it on the Sync server, and set it to display the logging in real time (File | Open From | ULS).

I usually filter on:

Category | Contains | User Profiles

OR
Message | Contains | ILM Configuration

Note: Because all of the important stuff is logged at the “medium” level, it does not do much good to turn logging up to “verbose” for this exercise.
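
If you can’t get ULSViewer onto the server, roughly the same filter can be applied with Get-SPLogEvent. This is a minimal sketch, assuming you run it on the Sync server right after a failed start attempt; the 15-minute window is my own choice to cover the timer job’s run of up to 10 minutes.

```powershell
# Run on the Sync server; Get-SPLogEvent reads the local server's ULS logs by default.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Look back 15 minutes (assumed window; widen it if the start attempt was earlier).
$since = (Get-Date).AddMinutes(-15)

# Same filter as above: Category contains "User Profiles" OR Message contains "ILM Configuration".
Get-SPLogEvent -StartTime $since |
    Where-Object { $_.Category -like "*User Profiles*" -or $_.Message -like "*ILM Configuration*" } |
    Sort-Object Timestamp |
    Format-Table Timestamp, Process, EventID, Level, Message -AutoSize -Wrap
```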

Here’s what the ULS logs should look like during a successful start of the Sync service. There are a number of “ILM Configuration” steps that it works through. From this example, you should be able to tell how far along in the process your server got before it failed.

I’ve seen the ProfileSynchronizationSetupJob fail at just about every step, and for multiple reasons per-step. I know I called this post the “complete” guide, but a complete list of every conceivable failure and solution would make this post far too long, and is a bit beyond the scope. I’ll list the most common problems:

1. ERR_CONFIG_DB

OWSTIMER.EXE (0x18F4)    0x6058    SharePoint Portal Server    User Profiles    9i1w    Medium    ILM Configuration: Error ‘ERR_CONFIG_DB’

‘ERR_CONFIG_DB’ is one of the more generic errors and means that the FIM service could not connect to the Synchronization database.  Here’s a list of possible causes:

https://blogs.msdn.microsoft.com/spses/2015/06/16/err_config_db-while-starting-the-upa-synchronization-service-in-sharepoint-20102013/

In my experience, it’s usually because either the Sync server has the wrong SQL Native Client installed (should be 2008 R2), or TLS 1.0 has been disabled on the SQL server, and the Sync server has not been fully configured to use TLS 1.2.  The two servers cannot agree on which version of TLS to use, so the connection fails. Reference: https://docs.microsoft.com/en-us/SharePoint/security-for-sharepoint-server/enable-tls-and-ssl-support-in-sharepoint-2013

You can use a UDL file to test connectivity between the Sync server and the SQL server that hosts the Sync database (reference: https://blogs.msdn.microsoft.com/chaitanya_medikonduri/2011/03/09/sql-server-connectivity-issuestroubleshooting-tips/). To do the test properly, you’d want to be logged on as the Farm service account (because that’s the account that runs the Sync service) and use “SQL Server Native Client 10.0” on the “Provider” tab, with “Windows NT Integrated security”. If the test fails with “The client and server cannot communicate, because they do not possess a common algorithm” or “SSL Security error”, you can be pretty sure you have a TLS configuration problem.
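
If you’d rather script the connectivity test than click through a UDL file, something like the following should exercise the same provider. It’s a sketch, not a replacement for the UDL test: the server and database names are hypothetical placeholders, and it assumes you run it in a PowerShell session started as the Farm service account on the Sync server.

```powershell
# Hypothetical names: replace with your SQL instance and your Sync database.
$sqlServer = "SQLSERVER01"
$syncDb    = "Sync DB"

# SQLNCLI10 is the OLE DB provider name for "SQL Server Native Client 10.0".
# Integrated Security=SSPI uses the Windows identity of the current session.
$connectionString = "Provider=SQLNCLI10;Data Source=$sqlServer;Initial Catalog=$syncDb;Integrated Security=SSPI;"
$connection = New-Object System.Data.OleDb.OleDbConnection $connectionString

try {
    $connection.Open()
    "Connected to $sqlServer (server version $($connection.ServerVersion))"
}
catch {
    # A TLS mismatch should surface here with one of the error messages quoted above.
    "Connection failed: $($_.Exception.Message)"
}
finally {
    $connection.Close()
}
```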

[Screenshot: TLS / SSL error with the “Microsoft SQL Server Native Client” provider]

[Screenshot: TLS / SSL error with the “Microsoft OLE DB Provider for SQL Server” provider]

[Screenshot: a successful UDL test]

 

2. ERR_INVALID_GROUPS

This error means the account configured to run the Sync service is wrong: it is not the Farm account. Someone probably changed it to some other service account.

Note: “Farm account” means the farm service account that is running the Timer service and the Central Admin application pool.

Go to Central Admin | Security | Configure Service Accounts | Windows Service – User Profile Synchronization Service. Change the account to the Farm account. In services.msc on the Sync server, change the log on account for “Forefront Identity Manager Synchronization Service” to the Farm account. It’s fine to leave the service disabled here. The Sync startup process will start it. Make sure the farm account is in the local Administrators group on the Sync server and make sure you can log on successfully to the Sync server as the farm account.
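
To see at a glance whether the Windows service is set to the right account, you can compare the two values from PowerShell on the Sync server. This is a quick sketch; “FIMSynchronizationService” is the service name behind the “Forefront Identity Manager Synchronization Service” display name, and the two account strings may be formatted differently, so check them by eye if the warning fires.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# The Farm account, as SharePoint knows it.
$farmAccount = (Get-SPFarm).DefaultServiceAccount.Name

# The Windows service for FIM Sync on this server.
$fim = Get-WmiObject Win32_Service -Filter "Name='FIMSynchronizationService'"

"Farm account:       $farmAccount"
"FIM Sync log on as: $($fim.StartName)"
"FIM Sync state:     $($fim.State) (start mode: $($fim.StartMode))"

# Simple string comparison; a mismatch in format alone will also trigger this.
if ($fim.StartName -ne $farmAccount) {
    Write-Warning "FIM Sync service log on account does not match the Farm account."
}
```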

 

3. ERR_START_SERVICE

UserProfileApplication.SynchronizeMIIS: Failed to configure MIIS post database, will attempt during next rerun. Exception: System.Configuration.ConfigurationErrorsException: ERR_START_SERVICE at Microsoft.Office.Server.UserProfiles.Synchronization.ILMPostSetupConfiguration.ValidateConfigurationResult(UInt32 result) at Microsoft.Office.Server.UserProfiles.Synchronization.ILMPostSetupConfiguration.ConfigureMiisStage2() at Microsoft.Office.Server.Administration.UserProfileApplication.SetupSynchronizationService(ProfileSynchronizationServiceInstance profileSyncInstance)

This can also be a permissions issue. Verify that the Farm service account (the one running the Central Admin app pool and Timer service) is a member of the local Administrators group on the Sync server. Log on to the Sync server as the Farm service account, and use that account to start the Sync service in Central Admin.

 

4. ILM Configuration: Configuring certificate — No error in the ULS logs, but the process gets stuck for a while at this step…

OWSTIMER.EXE (0x0828) 0x0878 SharePoint Portal Server User Profiles 9q1h Medium ILM Configuration: Configuring certificate.

In the Application Event Log, you might see this:

ILM Certificate could not be created: Cert step 2 could not be created: E:\Program Files\Microsoft Office Servers\15.0\Tools\MakeCert.exe -pe -sr LocalMachine -ss My -a sha1 -n CN="ForefrontIdentityManager" -sky exchange -pe -in "ForefrontIdentityManager" -ir localmachine -is root

You’ll want to clear all of the “ForefrontIdentityManager” certificates out of the local computer account certificate store. References:

https://aurramu.blogspot.com/2015/01/ilm-certificate-could-not-be-created.html

https://support.microsoft.com/en-us/help/2498715
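
Before deleting anything, it helps to see which stores actually hold these certificates. The sketch below only lists them; it searches every local computer store for a subject containing “ForefrontIdentityManager”, which is my assumption about how to match them based on the MakeCert command above. Follow the references for the actual cleanup.

```powershell
# List only; nothing is deleted here.
# PSParentPath shows which local computer store each certificate lives in.
Get-ChildItem Cert:\LocalMachine -Recurse |
    Where-Object {
        $_ -is [System.Security.Cryptography.X509Certificates.X509Certificate2] -and
        $_.Subject -like "*ForefrontIdentityManager*"
    } |
    Select-Object Subject, Thumbprint, NotAfter, PSParentPath
```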

 

5. ILM Configuration: Configuring database. External component has thrown an exception.

UserProfileApplication.SynchronizeMIIS: Failed to configure MIIS post database, will attempt during next rerun. Exception: System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception. at Microsoft.Office.Server.UserProfiles.Synchronization.ILMPostSetupConfiguration.ConfigureMiisStage2() at Microsoft.Office.Server.Administration.UserProfileApplication.SetupSynchronizationService(ProfileSynchronizationServiceInstance profileSyncInstance).

This can happen if you have PowerShell problems on the Sync server, for example a custom PowerShell profile for the Farm account. If you open the SharePoint Management Shell on the Sync server and it throws errors during initialization, it’s a pretty good bet you have a problem within a PowerShell profile.

References:

https://msftplayground.com/2014/02/unable-to-start-user-profile-synchronization-service/

https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_profiles?view=powershell-7
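
To check which profile scripts exist for the Farm account, you can list the four standard profile paths. This is a minimal sketch, assuming you run it while logged on as the Farm account on the Sync server; the “CurrentHost” paths depend on which host you run it in (console vs. ISE), so the “AllHosts” entries are the ones I’d look at first.

```powershell
# The four profile locations PowerShell exposes as properties of $PROFILE.
$profileNames = "AllUsersAllHosts", "AllUsersCurrentHost", "CurrentUserAllHosts", "CurrentUserCurrentHost"

# Report whether a profile script exists at each location.
$profileNames | ForEach-Object {
    $path = $PROFILE.$_
    New-Object PSObject -Property @{
        Profile = $_
        Path    = $path
        Exists  = (Test-Path $path)
    }
} | Select-Object Profile, Exists, Path
```

If any of them exist, temporarily renaming the file is a simple way to rule it out.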

 

 

#2: Gets stuck at “starting” indefinitely.

This is less common. It indicates that either the ProfileSynchronizationSetupJob timer job is not doing anything, or maybe it didn’t even get created.

We need to make sure that the Timer service is healthy across the farm. Start by verifying the “SharePoint Timer Service” Windows service (in services.msc) is started and running as the Farm service account on every server in the farm.
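
Checking services.msc on every server gets tedious in a larger farm. Here’s a rough sketch that does the same check remotely; it assumes remote WMI is allowed between the servers and that you’re running as an account with admin rights on each one. “SPTimerV4” is the service name of the SharePoint Timer Service.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Skip entries with the "Invalid" role (for example, the SQL server).
$servers = Get-SPServer | Where-Object { $_.Role -ne "Invalid" }

foreach ($server in $servers) {
    # State should be "Running" and StartName should be the Farm service account.
    Get-WmiObject Win32_Service -ComputerName $server.Address -Filter "Name='SPTimerV4'" |
        Select-Object SystemName, Name, State, StartMode, StartName
}
```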

Then we need to make sure the Timer Service instances within SharePoint are online. You can use my “CheckTimerAndAdminServices.ps1” PowerShell script from here: https://joshroark.com/sharepoint-all-about-one-time-timer-jobs/ for that. In fact, since the ProfileSynchronizationSetupJob is a one-time timer job, that entire article is applicable to this issue.

In order to attempt to start the service again, you need to get it stopped. If it’s stuck at “starting” in the UI, then you can only do that via PowerShell:

Get-SPServiceInstance | ? {$_.TypeName -match "Synchronization"} | Stop-SPServiceInstance

In some cases, the ProfileSynchronizationSetupJob will be created, but will not do anything. In that case, that one-time timer job needs to be deleted before you attempt to start the Sync service again. Failure to clean that up before another start attempt will result in this error:

SynchronizeMIIS encounters an exception: Microsoft.SharePoint.Administration.SPDuplicateObjectException: An object of the type Microsoft.Office.Server.Administration.ProfileSynchronizationSetupJob named “ProfileSynchronizationSetupJob” already exists under the parent Microsoft.SharePoint.Administration.SPTimerService named “SPTimerV4”. Rename your object or delete the existing object.

Reference: https://sharepointrelated.com/2016/08/09/stuck-on-starting/
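
Deleting the leftover job can be done from the SharePoint Management Shell. This is a minimal sketch, assuming the Sync service instance has already been stopped and the job is just sitting there doing nothing; it matches on the job name from the error message above.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Find any leftover one-time setup job by its internal name.
$staleJobs = Get-SPTimerJob | Where-Object { $_.Name -eq "ProfileSynchronizationSetupJob" }

foreach ($job in $staleJobs) {
    "Deleting timer job: {0}" -f $job.Name
    $job.Delete()
}
```

If nothing is returned, there’s no stale job to clean up and you can go ahead with another start attempt.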

If you’re finding that ProfileSynchronizationSetupJob is not even getting created, then you may have a problem with the “System Job to Manage User Profile Synchronization” timer job. That’s the timer job that is used to create and start the ProfileSynchronizationSetupJob one-time timer job. To troubleshoot that, you’ll need to review the ULS logs from the server that has been running that job. You can find which server has been running the “System Job” using this PowerShell:

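# Find the "System Job to Manage User Profile Synchronization" timer job(s)
$tjs = Get-SPTimerJob | ? {$_.DisplayName -match "System Job to Manage"}

foreach ($tj in $tjs) {
    $tj.Name
    $tj.DisplayName
    # Take the first 10 history entries returned and show them newest first, with the server that ran each
    $tj.HistoryEntries | select StartTime, EndTime, ServerName, Status -First 10 | sort StartTime -Descending
}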

 

There’s also this rare problem that may only be an issue for SharePoint 2010, but I’ve seen it enough times that I wrote a public KB article about it. Basically, the timer job never starts and gives you no feedback whatsoever in the ULS logs. Reference: https://support.microsoft.com/en-us/help/2719512/

 
