Docker agent ID doesn't persist, creating huge number of "distinct" agents

We've set up an Amazon EC2 agent, which on startup runs an image based on the jetbrains/teamcity-minimal-agent image (tag 2020.2.1). The agent is activated and deactivated by a TeamCity Cloud Profile.

Everything is working well, except: every time the agent is started, the TeamCity server registers it as a "new" agent (belonging to the same cloud image).

We run:

docker run -d -it -e SERVER_URL=$OUR_URL -e AGENT_NAME="Software-Linux-dockerAgent-01" -v /home/ubuntu/ci_agent/agent_config:/data/teamcity_agent/conf -v /home/ubuntu/ci_agent/work:/opt/buildagent/work -v /home/ubuntu/ci_agent/system:/opt/buildagent/system my_linux_ci_agent:1.4

I can see that agent_config/buildAgent.properties records the name and authorization token assigned by the server:

## The unique name of the agent used to identify this agent on the TeamCity server
## Use blank name to let server generate it.
## By default, this name would be created from the build agent's host name
name=Software-Linux-EC2-Agent-01-i-09307d6796743bda4-9

(...)

## A token which is used to identify this agent on the TeamCity server for agent authorization purposes.
## It is automatically generated and saved back on the first agent connection to the server.
authorizationToken=eb0b786556d8f12c1c8917cb10dfdd14

But the next time the container is started, these are overwritten with new values:

## The unique name of the agent used to identify this agent on the TeamCity server
## Use blank name to let server generate it.
## By default, this name would be created from the build agent's host name
name=Software-Linux-EC2-Agent-01-i-09307d6796743bda4-10

(...)

## A token which is used to identify this agent on the TeamCity server for agent authorization purposes.
## It is automatically generated and saved back on the first agent connection to the server.
authorizationToken=f966d0c9f6e44fe284243edd19cfba69
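
To make the before/after comparison repeatable, the two identity keys can be pulled out of buildAgent.properties and compared across runs. This is only a minimal sketch: the function names and the snapshot file are hypothetical helpers, not part of TeamCity, and the host path is assumed to be the one from the docker run line above.

```shell
# Print only the identity-related keys from a buildAgent.properties file.
# Comment lines start with '#', so they never match the anchored key names.
agent_identity() {
  # $1 = path to buildAgent.properties
  grep -E '^(name|authorizationToken)=' "$1"
}

# Store the current identity in a snapshot file (hypothetical helper).
snapshot_identity() {
  # $1 = path to buildAgent.properties, $2 = snapshot file to write
  agent_identity "$1" > "$2"
}

# Succeed (exit status 0) only if the identity still matches the snapshot.
identity_unchanged() {
  # $1 = path to buildAgent.properties, $2 = snapshot file to compare with
  [ "$(agent_identity "$1")" = "$(cat "$2")" ]
}
```

For example, calling snapshot_identity on /home/ubuntu/ci_agent/agent_config/buildAgent.properties before stopping the container, and identity_unchanged on the same file after the next start, would show whether the name and token were rewritten.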

Server logs show that each time the agent connects, it "has no name defined":

[2021-01-17 15:13:44,401]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-2" to agent "" {id=66} as it had no name defined.
[2021-01-17 16:29:54,529]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-3" to agent "" {id=67} as it had no name defined.
[2021-01-17 23:21:29,798]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-4" to agent "" {id=68} as it had no name defined.
[2021-01-18 09:27:10,853]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-5" to agent "" {id=69} as it had no name defined.
[2021-01-18 09:32:32,258]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-5" to agent "" {id=70} as it had no name defined.
[2021-01-18 09:35:42,878]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-6" to agent "" {id=71} as it had no name defined.
[2021-01-18 13:23:38,313]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-7" to agent "" {id=72} as it had no name defined.
[2021-01-18 14:39:46,703]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-8" to agent "" {id=73} as it had no name defined.
[2021-01-18 15:13:59,884]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-9" to agent "" {id=74} as it had no name defined.
[2021-01-18 15:15:51,545]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-10" to agent "" {id=75} as it had no name defined.
[2021-01-18 15:22:39,975]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-11" to agent "" {id=76} as it had no name defined.
[2021-01-18 15:22:59,652]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-12" to agent "" {id=77} as it had no name defined.
[2021-01-18 15:38:44,687]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-13" to agent "" {id=78} as it had no name defined.
[2021-01-18 15:39:04,313]   INFO -    jetbrains.buildServer.AGENT - Set generated name "Software-Linux-EC2-Agent-01-i-09307d6796743bda4-14" to agent "" {id=79} as it had no name defined.
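
A quick way to list every such registration is to filter the server log for this message and print the internal agent id next to the generated name. This is a rough sketch: the function name is made up for illustration, and the log file to pass in is an assumption (whichever server log contains the lines quoted above).

```shell
# List "<agent id> <generated name>" for every agent the server had to name itself.
generated_agents() {
  # $1 = path to the server log file
  grep 'as it had no name defined' "$1" \
    | sed -E 's/.*Set generated name "([^"]+)" to agent "[^"]*" \{id=([0-9]+)\}.*/\2 \1/'
}
```

Piping the result through `wc -l` gives the number of "new" agents registered so far.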

I've found one similar post here on the community forum, "Multiple TC build agents on a single AWS EC2 instance connected with the Agent Cloud plugin generate new agent names for each run", and the symptoms sound very similar. But in my case, I have only a single build agent running on the instance.



5 comments

Hi, I'd really appreciate an initial direction here; the problem is persisting, and I'm not finding any explanation for it.

Thanks!


Hi Ziv,

We would need to see how your cloud profile is configured. In general, if you first finished setting up the image and then configured it to be run as an instance instead of an image, the default behavior is to reuse the agent without assigning new names. If you set it up as an image, a new agent is supposed to be spawned every time.

The log message in particular seems to imply that your agent does not have an agent name defined. If the agent is not persisting its state for some reason, that could cause this behavior. Please share the settings of the cloud agent (feel free to obscure the private information) for review.


Hi Denis; thanks for the assistance!

I'm going to ask for a little more guidance on what information you need -- what files, what menu/screen on the server, etc.? I'm not entirely in my element here, and particularly where you mention instances and images, I'm not clear on whether you mean the Amazon EC2 images and instances, or the docker images and instances.

(Is this what you're looking for, in the Cloud Profile?)

In the meantime, to the best of my understanding:
1. I am working on multiple specific Amazon EC2 instances (they are listed under instances in the cloud profile image source, and they persist data between shutdowns).
2. On each instance, I have a tagged docker image (what I called "my_linux_ci_agent:1.4" in my docker-run line). On instance startup, docker-run is called with this image, using several persisting volumes for buildagent config and data:

docker run -d -it -e SERVER_URL=$OUR_URL -e AGENT_NAME="Software-Linux-dockerAgent-01" \
   -v /home/ubuntu/ci_agent/agent_config:/data/teamcity_agent/conf \
   -v /home/ubuntu/ci_agent/work:/opt/buildagent/work \
   -v /home/ubuntu/ci_agent/system:/opt/buildagent/system \
   my_linux_ci_agent:1.4

The volumes (and the agent configuration in them) clearly do persist between docker runs, and yet the configuration is being overwritten.
Anything outside these volumes, on the other hand, will not persist between runs of the container.


I've been looking into this further, and here's something that might clarify the matter.

When `/run-agent.sh` is run, we see that it successfully finds and uses the agent configuration file /data/teamcity_agent/conf/buildAgent.properties :

Wed 10 Feb 2021 01:02:14 PM GMT
File buildAgent.properties was found in /data/teamcity_agent/conf. Will start the agent using it.
name=Software-Linux-EC2-Agent-03-i-(REDACTED)-9
Starting TeamCity build agent...

And the agent log shows:

[2021-02-10 13:02:21,189] INFO - dAgentConfigurationInitializer - Loading build agent configuration from /data/teamcity_agent/conf/buildAgent.properties

Soon, though, I see this in the log:

[2021-02-10 13:02:25,461] INFO - n.AmazonInstanceMetadataReader - Cannot read EC2 Instance Metadata key: meta-data/public-hostname
[2021-02-10 13:02:25,462] INFO - n.AmazonInstanceMetadataReader - Cannot read EC2 Instance Metadata key: meta-data/public-ipv4
[2021-02-10 13:02:25,464] INFO - .agent.AmazonPropertiesUpdater - Fetched AmazonEC2 instance metadata: ami-id=ami-(REDACTED)
ami-launch-index=1
ami-manifest-path=(unknown)
instance-id=i-(REDACTED)
instance-type=c5.xlarge
local-hostname=ip-(REDACTED)
local-ipv4=(REDACTED)
reservation-id=r-(REDACTED)

[2021-02-10 13:02:25,479] INFO - .agent.AmazonPropertiesUpdater - TeamCity Build Agent was started by the TeamCity Server in Amazon EC2.
Server URL: (REDACTED).
Proposed agent name: <will be set on connection to the server>
Cloud profile:
Idle time to shutdown: 40 minutes
CustomParameters: {
cloud.amazon.agent-name-prefix = Software-Linux-EC2-Agent-03
system.cloud.profile_id = amazon-1
teamcity.cloud.agent.remove.policy = unauthorize
teamcity.cloud.instance.hash = (REDACTED)
}

Concluding with this:

[2021-02-10 13:02:25,481] INFO - .agent.AmazonPropertiesUpdater - Amazon marker file /opt/buildagent/conf/amazon-i-(REDACTED) does not exists. Agent name will be determined on the server

So this is the most direct indication I'm seeing -- the Amazon marker file indeed doesn't exist, because /opt/buildagent/conf isn't one of the persisting volumes. And this leads directly to a new marker file being created, and to the agent being understood as a "new" agent instead of a returning one.
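
One way to check this hypothesis without changing the setup is to look for the marker file in both configuration directories, once after a start and again after a restart. The sketch below assumes the marker is named amazon-<instance-id>, as the log line above suggests; the function names are hypothetical, and the instance id has to be supplied by hand.

```shell
# Succeed if the EC2 marker file for the given instance exists in a directory.
# The "amazon-<instance-id>" naming is inferred from the agent log, not from documentation.
has_amazon_marker() {
  # $1 = configuration directory, $2 = EC2 instance id
  [ -e "$1/amazon-$2" ]
}

# Print a one-line report for a directory.
report_marker() {
  # $1 = configuration directory, $2 = EC2 instance id
  if has_amazon_marker "$1" "$2"; then
    echo "$1: marker present"
  else
    echo "$1: marker missing"
  fi
}
```

Run inside the container (for instance through docker exec), report_marker could be called for /opt/buildagent/conf and for /data/teamcity_agent/conf. If the marker only ever appears in the non-persisted directory, that would support the explanation above.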


Hi; can I please get a response to this? The ticket has been open for a month now.
