Agent unable to publish artefacts: SocketException Broken Pipe
Hello!
For a few days, I've been experiencing issues with a TeamCity agent (OSX) being unable to publish artefacts to our TeamCity Server.
The builds these issues appear on have worked in the past with no sizeable change in artefact size.
There have been some changes in the environment, with both OS and TeamCity updates: OSX went from 14.4 to 14.4.1, and TeamCity from 2023.11.4 to 2024.03.
I have already tried restoring the old TeamCity version from a backup, but that made no noticeable difference.
Looking at the logs I can see the following two exceptions:
Server: Processing of multipart/form-data request failed. java.io.EOFException
Agent: java.net.SocketException: Broken pipe (Write failed)
From what I can tell, there are no network issues on either side, as other similar builds on a Windows Agent are going through without problems.
Additionally, the Activity Monitor in OSX shows more data being sent than the size of the artefact, with some minor dips in throughput now and then. No related log messages seem to appear when these dips occur.
I've uploaded both the server & agent log files including DEBUG messages.
Upload id: 2024_04_17_NMu5HabXZJuFRFqSvCa1aP
Could you also provide the full build log for the build with artifacts that can't be published? Please upload it and share the upload ID.
Best regards,
Anton
Hi Anton!
Here's the requested build log: 2024_04_18_3sgmnycmgPNi2x9nQjRZzJ
Best regards,
Jannik
It almost certainly looks like a network issue.
On the agent's side: the JDK library responsible for network connectivity fails.
On the server's side: end of file exception - the server was receiving a file, but the transfer stopped and the stream ended before the file was complete.
You can troubleshoot it further by running the agent and server on the same machine and checking whether the issue still reproduces. In general, please look into network connectivity, and check whether there are any firewalls/proxies between the agents and the server.
Best regards,
Anton
Hi Anton,
I've done some testing based on your previous message, and from what I can tell, the issue seems to be on the server's side.
A separate Windows agent within the same network as the OSX one is also running into the same issue (although it's a Connection reset by peer this time). Switching over to a different network also doesn't help.
Looking at the HTTP access log I've noticed the following popping up regularly related to artifact uploading:
"POST /httpAuth/artefactUpload.html HTTP/1.1" 499 21 60001ms
I've removed most of the unrelated information with the 21 being the Content Length.
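For anyone wanting to scan their own access log for the same pattern, here is a minimal sketch. It assumes lines in the trimmed format quoted above (request, status, content length, duration in ms); a real access log will likely carry more fields, so the pattern would need adjusting. The function name and the tolerance value are hypothetical, chosen only for illustration.

```python
import re

# Pattern for the trimmed format quoted above (an assumption, not the full log format):
# "<METHOD> <PATH> <PROTOCOL>" <status> <content length> <duration>ms
LINE_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<length>\d+) (?P<duration_ms>\d+)ms'
)


def find_timed_out_uploads(lines, timeout_ms=60000, tolerance_ms=500):
    """Return parsed entries for artifact uploads that ended with status 499
    within tolerance_ms of timeout_ms."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # line is not in the expected trimmed format
        entry = match.groupdict()
        duration = int(entry["duration_ms"])
        if (
            entry["path"].endswith("artefactUpload.html")
            and entry["status"] == "499"
            and abs(duration - timeout_ms) <= tolerance_ms
        ):
            hits.append(entry)
    return hits


sample = ['"POST /httpAuth/artefactUpload.html HTTP/1.1" 499 21 60001ms']
for hit in find_timed_out_uploads(sample):
    print(hit["path"], hit["status"], hit["duration_ms"])
```

Many hits clustered just above one round duration would point to a fixed timeout somewhere along the path rather than a flaky connection.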
As mentioned previously, similar builds work with a Windows Agent on a root server at the same hosting provider as the TeamCity Server.
The OSX Agent runs on residential internet with much slower upload speeds.
Could this be a timeout somewhere along the line?
I've noticed 60000ms being set in both the server.xml as well as the server-https-proxy.xml under /opt/teamcity/conf as the connectionTimeout for the active Connector.
Updating these values didn't seem to have any effect though, with the request still aborting after the initial 60000ms.
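A rough way to sanity-check the timeout theory is to estimate how long the upload should take on the slow link and compare that with the 60 s cutoff. The sketch below assumes an ideal transfer with no protocol overhead, and the artifact size and upload speed in it are hypothetical placeholders, not values from this setup.

```python
def upload_seconds(size_mb, upload_mbit_per_s):
    """Ideal transfer time in seconds for size_mb megabytes
    over a link with upload_mbit_per_s megabits per second."""
    return size_mb * 8 / upload_mbit_per_s


def exceeds_timeout(size_mb, upload_mbit_per_s, timeout_s=60.0):
    """True if the ideal transfer time is longer than timeout_s."""
    return upload_seconds(size_mb, upload_mbit_per_s) > timeout_s


# Hypothetical numbers: a 200 MB artifact over a 10 Mbit/s residential uplink.
print(upload_seconds(200, 10))    # 160.0
print(exceeds_timeout(200, 10))   # True
```

If the estimate lands well above the timeout on the slow link and well below it on the fast one, that would match uploads succeeding from the hosted agent and failing from the residential one.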
Best regards,
Jannik Weise
Publishing artifacts failing on agents installed on separate machines and not failing when installed on the same one confirms my assumption that it is a network issue. Do you use any proxy for the TeamCity server?
Best regards,
Anton
Hi Anton!
I've done some more digging over the weekend and found the cause of the issue.
There was a 60s timeout on the reverse proxy we're using (Traefik) that caused the upload to be cut off.
I assume those cut-offs were the small dips I observed on the network graph, with the agent retrying the upload each time?
This timeout was added a few weeks ago as stated here: https://doc.traefik.io/traefik/migration/v2/#v2112
That's also why this issue only started appearing now.
Thanks for all your assistance in resolving this issue!
Best regards,
Jannik
I'm glad to hear that you were able to resolve this issue.
Best regards,
Anton