Friday, April 18, 2008

TFS Pain

We recently migrated the source code of one of our products (which I'll call ACME) from Visual SourceSafe to Team Foundation Version Control. This required a change to ACME's project configuration in CruiseControl.NET. While not as painful as surgery, the procedure was extraordinarily frustrating.

You might wonder why we're using CC.NET if the TFS Build Server is available. It's all a matter of timing. ACME is a 2+ year migration of an existing application written in Oracle Forms. We're near the end of the project, and CC.NET has performed admirably the entire time. It was disruptive enough to educate everyone on TFS in order to use Work Item Tracking and source control. I just want to finish the migration at this point, so TFS Build can wait a couple more months.

I started the process on 4/18/2008 by configuring the development CC.NET server to use our development TFS source control. Installing the CC.NET TFS plugin was easy enough. Next I created a local account on the TFS machine with the same name and password as the owner of the CC.NET service, and granted it read permissions so that it could access the ACME source code. Within a short time I had a successful build. It was now time for the real thing.

Repeating the process that I'd used in the development environment, I installed the CC.NET TFS plugin on the production server. The install consisted of copying the TFS plugin files from the development server to the production server. Upon starting the CC.NET service, I received the following exception in the Event Log:


Event Type: Error
Event Source: CCService
Event Category: None
Event ID: 0
Date: 4/18/2008
Time: 4:04:03 PM
User: N/A
Computer: CORP-TWDEV
Description:
Service cannot be started. System.BadImageFormatException: The format of the file 'ccnet.vsts.plugin.dll' is invalid.
File name: "ccnet.vsts.plugin.dll"
at System.Reflection.Assembly.nLoad(AssemblyName fileName, String codeBase, Boolean isStringized, Evidence assemblySecurity, Boolean throwOnFileNotFound, Assembly locationHint, StackCrawlMark& stackMark)
at System.Reflection.Assembly.InternalLoad(AssemblyName assemblyRef, Boolean stringized, Evidence assemblySecurity, StackCrawlMark& stackMark)
at System.Reflection.Assembly.LoadFrom(String assemblyFile, Evidence securityEvidence, Byte[] hashValue, AssemblyHashAlgorithm hashAlgorithm)
at Exortech.NetReflector.NetReflectorTypeTable.Add(String path, String searchPattern)
at ThoughtWorks.CruiseControl.Core.Config.NetReflectorConfigurationReader..ctor()
at ThoughtWorks.CruiseControl.Core.CruiseServerFactory.NewConfigurationService(String configFile)
at ThoughtWorks.CruiseControl.Core.CruiseServerFactory.Create(Boolean remote, String configFile)
at ThoughtWorks.CruiseControl.Service.CCService.CreateAndStartCruiseServer()
at ThoughtWorks.CruiseControl.Service.CCService.OnStart(String[] args)
at System.ServiceProcess.ServiceBase.ServiceQueuedMainCallback(Object state)

=== Pre-bind state information ===
LOG: Where-ref bind. Location = C:\CI\CruiseControl.NET\server\ccnet.vsts.plugin.dll
LOG: Appbase = C:\CI\CruiseControl.NET\server\
LOG: Initial PrivatePath = NULL
Calling assembly : (Unknown).
===

LOG: Policy not being applied to reference at this time (private, custom, partial, or location-based assembly bind).
LOG: Attempting download of new URL file:///C:/CI/CruiseControl.NET/server/ccnet.vsts.plugin.dll.


Here was my first problem. My development environment was running a newer version of CC.NET built on the .NET 2.0 framework, while our production environment was running a version built on .NET 1.1. The TFS plugin is a .NET 2.0 assembly, and the 1.1 runtime cannot load it, hence the BadImageFormatException. I would have to upgrade CC.NET on our production server before I could proceed. This was an unexpected development.
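In hindsight, a quick check of which runtime the plugin was compiled for would have caught this before I started the service. Here is a minimal Python sketch of that check. It relies on the fact that a managed assembly's metadata root starts with the signature "BSJB" followed by a length-prefixed version string such as "v2.0.50727". The function name clr_metadata_version is my own, and scanning for the signature is a shortcut rather than a proper walk of the PE headers, so treat the result as a hint.

```python
import struct


def clr_metadata_version(data: bytes):
    """Return the runtime version string from an assembly's metadata root.

    Heuristic: locate the 'BSJB' metadata signature instead of walking the
    PE headers. Returns None when no signature is found.
    """
    start = data.find(b"BSJB")
    if start < 0:
        return None
    # Layout after the 4-byte signature: major (2), minor (2), reserved (4),
    # then a 4-byte little-endian length of the version string that follows.
    header_end = start + 16
    if len(data) < header_end:
        return None
    (length,) = struct.unpack_from("<I", data, start + 12)
    raw = data[header_end:header_end + length]
    # The version string is padded with NUL bytes; keep the part before them.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
```

To use it, read the DLL in binary mode (for example the ccnet.vsts.plugin.dll from the Event Log entry above) and pass the bytes in. A result starting with "v2.0" on a server that only has the 1.1 runtime means the service will not be able to load the plugin.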

So I backed up all of my CC.NET configuration files and upgraded. The upgrade went smoothly and was fast. The service started this time. But my builds were failing with no explanation. I dove into the CC.NET log file.

Ah, here was the culprit!

2008-04-18 16:30:47,458 [5940:INFO] Force Build for project: ACME
2008-04-18 16:30:47,458 [5940:INFO] Project: 'ACME' is added to queue: 'ACME' in position 0.
2008-04-18 16:30:47,552 [ACME:INFO] Project: 'ACME' is first in queue: 'ACME' and shall start integration.
2008-04-18 16:30:47,645 [ACME:ERROR] INTERNAL ERROR: Could not load file or assembly 'Microsoft.TeamFoundation.VersionControl.Client, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.
----------
System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.TeamFoundation.VersionControl.Client, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.TeamFoundation.VersionControl.Client, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'
at ThoughtWorks.CruiseControl.Core.Sourcecontrol.Vsts.GetModifications(IIntegrationResult from, IIntegrationResult to)
at ThoughtWorks.CruiseControl.Core.Sourcecontrol.MultiSourceControl.GetModifications(IIntegrationResult from, IIntegrationResult to)
at ThoughtWorks.CruiseControl.Core.Sourcecontrol.QuietPeriod.GetModifications(ISourceControl sourceControl, IIntegrationResult lastBuild, IIntegrationResult thisBuild)
at ThoughtWorks.CruiseControl.Core.IntegrationRunner.Integrate(IntegrationRequest request)
at ThoughtWorks.CruiseControl.Core.Project.Integrate(IntegrationRequest request)
at ThoughtWorks.CruiseControl.Core.ProjectIntegrator.Integrate()
at ThoughtWorks.CruiseControl.Core.ProjectIntegrator.Run()

WRN: Assembly binding logging is turned OFF.
To enable assembly bind failure logging, set the registry value [HKLM\Software\Microsoft\Fusion!EnableLog] (DWORD) to 1.
Note: There is some performance penalty associated with assembly bind failure logging.
To turn this feature off, remove the registry value [HKLM\Software\Microsoft\Fusion!EnableLog].

The TFS version 9 assemblies, which were installed on the development box, couldn't be found by the CC.NET TFS plugin on production. Why not? We were using Visual Studio 2005 at the time, so I'd continued to use the version 8 assemblies on the production server. I now had to go back to the TFS plugin site on CodePlex and get the version built against the 8.0 binaries installed on my production build server. Could I have just copied the version 9 assemblies from dev to prod? Sure, but I was getting concerned about the number of changes I was suddenly making to the build server. (I've since upgraded to version 9.)
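The version the plugin wants is spelled out in the assembly display name inside the error message. A small Python sketch for pulling that name apart and comparing major versions follows. The helper names are my own, and the "8.0.0.0" in the usage note is an assumed value standing in for whatever is actually installed on the server.

```python
def parse_assembly_name(display_name: str) -> dict:
    """Split a .NET assembly display name into its named parts."""
    parts = [p.strip() for p in display_name.split(",")]
    info = {"Name": parts[0]}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            info[key.strip()] = value.strip()
    return info


def major_version(display_name: str):
    """Return the major version number as an int, or None if absent."""
    version = parse_assembly_name(display_name).get("Version")
    return int(version.split(".")[0]) if version else None
```

Feeding it the name from the log gives a major version of 9. Comparing that against a display name carrying the assumed installed version, "Version=8.0.0.0", makes the mismatch obvious without reading the whole stack trace.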

So back to CodePlex I went, or tried to. Some corporate firewall change had been made, and every time I tried to go to CodePlex I got this error message: "Internet Explorer cannot download View.aspx from www.codeplex.com." At least IE7 gave me an error message; Firefox just displayed a page of control characters. I used Remote Desktop to connect to some different machines and was finally able to access CodePlex using IE 6. I got the 1.3 version of the TFS plugin and was ready to build!

But not so fast...

2008-04-18 16:49:28,306 [ACME:INFO] Project: 'ACME' is first in queue: 'ACME' and shall start integration.
2008-04-18 16:49:28,306 [ACME:DEBUG] Checking Team Foundation Server for Modifications
2008-04-18 16:49:28,306 [ACME:DEBUG] From: 4/18/2008 9:28:47 AM - To: 4/18/2008 4:49:28 PM
2008-04-18 16:49:28,384 [ACME:ERROR] INTERNAL ERROR: TF30063: You are not authorized to access http://tfs:8080.

Nice. OK, it seems that although I had identically named accounts on both the build server and the TFS server, the passwords were different. So I synchronized the passwords, which then broke my Oracle continuous integration solution. The Oracle CI project still lives in VSS because the Visual Studio Team System 2008 Team Foundation Server MSSCCI Provider doesn't work correctly for us. This VSS instance is on yet another machine, so I had to sync another password, and then the Oracle project was back online.

And then this:

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Microsoft.Common.targets (3340,9): error MSB3482: SignTool reported an error 'Key not valid for use in specified state. '.

We're still using a temporary key to sign the ClickOnce manifest. Something somewhere changed and I had to supply a password. This required logging into the build server using the credentials of the local account used to run the CruiseControl.NET service. And to be able to do that, I had to log in to the machine and add the local account to the Remote Desktop Users group.

Ah, so finally I'm finished, right? No.

ThoughtWorks.CruiseControl.Core.CruiseControlException: Unable to load
transform: e:\dev\CruiseControl.NET\webdashboard\xsl\msbuild.xsl --->
System.Xml.Xsl.XslLoadException: XSLT compile error. An error occurred
at e:\dev\CruiseControl.NET\webdashboard\xsl\msbuild.xsl(0,0). --->
System.Xml.XmlException: For security reasons DTD is prohibited in
this XML document. To enable DTD processing set the ProhibitDtd
property on XmlReaderSettings to false and pass the settings into
XmlReader.

The previous version of CC.NET had been based on the .NET 1.1 framework, which allowed DTDs. The version of CC.NET that I had just upgraded to prohibits DTDs by default because it's based on the .NET 2.0 framework. Some of the XSL files that ship with CC.NET use DTDs. Google to the rescue again.
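For anyone hitting the same error, it helps to know up front which stylesheets carry a DTD rather than discovering them one failed transform at a time. Here is a minimal Python sketch that lists them. The function name find_xsl_with_dtd is hypothetical, and it simply looks for a DOCTYPE declaration in each .xsl file under a directory; it does not change the files or apply any fix.

```python
import re
from pathlib import Path

# A DTD is introduced by a DOCTYPE declaration near the top of the document.
DOCTYPE = re.compile(r"<!DOCTYPE\b")


def find_xsl_with_dtd(xsl_dir):
    """Return the .xsl files under xsl_dir that contain a DOCTYPE declaration."""
    hits = []
    for path in sorted(Path(xsl_dir).rglob("*.xsl")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if DOCTYPE.search(text):
            hits.append(path)
    return hits
```

Pointing it at the webdashboard xsl folder named in the exception prints a short list of files to review, which is easier than waiting for each dashboard page to fail.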