Bug 1306 - Spurious : "chan_read_failed for istate 3" errors from sshd
Summary: Spurious : "chan_read_failed for istate 3" errors from sshd
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd (show other bugs)
Version: 4.6p1
Hardware: Other All
: P2 normal
Assignee: Assigned to nobody
URL:
Keywords:
: 1525 (view as bug list)
Depends on:
Blocks: V_4_7 V_4_6_P2
  Show dependency treegraph
 
Reported: 2007-04-10 09:55 AEST by Darren Tucker
Modified: 2023-01-13 13:57 AEDT (History)
7 users (show)

See Also:


Attachments
Damien's patch from http://marc.info/?l=openssh-unix-dev&m=117607220603593 (2.78 KB, patch)
2007-04-10 12:48 AEST, Darren Tucker
no flags Details | Diff
Suppress errors when errno==0 instead (3.25 KB, patch)
2007-04-10 12:50 AEST, Darren Tucker
no flags Details | Diff
Revert the changes that trigger the error messages - patch agains 4.6p1 (3.37 KB, patch)
2007-04-10 19:28 AEST, Darren Tucker
no flags Details | Diff
Update patch #1260 based on djm's comments (3.25 KB, patch)
2007-04-14 10:57 AEST, Darren Tucker
no flags Details | Diff
Revised diff (921 bytes, patch)
2007-05-17 15:27 AEST, Damien Miller
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darren Tucker 2007-04-10 09:55:36 AEST
On some platforms (at least Solaris, probably others) sshd in OpenSSH 4.6p1 will log the following error to syslog, but otherwise works OK.

   error: channel 0: chan_read_failed for istate 3

Mentioned on the mailing list:
http://marc.info/?l=openssh-unix-dev&m=117469473826677
http://marc.info/?l=openssh-unix-dev&m=117565843600487

This is a side effect of the changes from the infamous bug #52.
Comment 2 Darren Tucker 2007-04-10 12:50:30 AEST
Created attachment 1260 [details]
Suppress errors when errno==0 instead

This works for me on Solaris 8.
Comment 3 Damien Miller 2007-04-10 13:07:57 AEST
Comment on attachment 1260 [details]
Suppress errors when errno==0 instead

>Index: nchan.c
>===================================================================
>RCS file: /usr/local/src/security/openssh/cvs/openssh/nchan.c,v
>retrieving revision 1.56
>diff -u -p -r1.56 nchan.c
>--- nchan.c	5 Aug 2006 02:39:40 -0000	1.56
>+++ nchan.c	10 Apr 2007 02:41:02 -0000
>@@ -133,17 +133,19 @@ chan_rcvd_oclose1(Channel *c)
> 	}
> }
> void
>-chan_read_failed(Channel *c)
>+chan_read_failed(Channel *c, int err_no)
> {
>-	debug2("channel %d: read failed", c->self);
>+	debug2("channel %d: read failed%s, istate %d", c->self,
>+	    err_no ? "" : " (simulated)", c->istate);

maybe s/simulated/closed/ here

> 	switch (c->istate) {
> 	case CHAN_INPUT_OPEN:
> 		chan_shutdown_read(c);
> 		chan_set_istate(c, CHAN_INPUT_WAIT_DRAIN);
> 		break;
> 	default:
>-		error("channel %d: chan_read_failed for istate %d",
>-		    c->self, c->istate);
>+		if (err_no != 0)
>+			error("channel %d: chan_read_failed for istate %d: %s",
>+			    c->self, c->istate, strerror(errno));

s/errno/err_no/ on this line
Comment 4 Darren Tucker 2007-04-10 19:28:34 AEST
Created attachment 1261 [details]
Revert the changes that trigger the error messages - patch agains 4.6p1

This patch reverts the changes that trigger the error messages, but also reintroduces the hang-on-exit bug (#52).

It's intended for people who need a sure fix for the errors now, while we determine the best long-term fix that doesn't also un-fix #52.
Comment 5 Darren Tucker 2007-04-10 19:34:59 AEST
(In reply to comment #3)
> >+	debug2("channel %d: read failed%s, istate %d", c->self,
> >+	    err_no ? "" : " (simulated)", c->istate);
> maybe s/simulated/closed/ here

Sure.

> >+			    c->self, c->istate, strerror(errno));
> s/errno/err_no/ on this line

Already spotted and changed that.  Thanks anyway.
Comment 6 Darren Tucker 2007-04-14 10:57:43 AEST
Created attachment 1262 [details]
Update patch #1260 based on djm's comments
Comment 7 Damien Miller 2007-05-17 15:27:59 AEST
Created attachment 1281 [details]
Revised diff

This diff does not force a read when the channel is in istate CHAN_INPUT_CLOSED and so should silence these messages.

I'd appreciate test reports, as well as confirmation that it doesn't break the hang-on-exit fix :)
Comment 8 Jochen Kirn 2007-05-22 03:55:20 AEST
(In reply to comment #7)
> Created an attachment (id=1281) [details]
> Revised diff 
> 
> This diff does not force a read when the channel is in istate
> CHAN_INPUT_CLOSED and so should silence these messages.
> 
> I'd appreciate test reports, as well as confirmation that it doesn't
> break the hang-on-exit fix :)
> 


i've applied the patch #1281 on OpenSSH 4.6p1 
and succesfully build (using OpenSSL 0.98d) on following platforms

Solaris 8 / sparc
Solaris 9 / sparc
Solaris 10 / sparc
HP-UX 11.00 / parisc
HP-UX 11.23 / Itanium
AIX 5.2 / powerpc
AIX 5.3 / powerpc
SLES 9 / x86
SLES 9 / x86_64
SLES 10 / x86_84



result:
the spurious message " error: channel 0: chan_read_failed for istate 3" is now gone :-)
previously I've encountered this message especially for scp copy transactions in log

Scenario:
adminhost -> (via ssh) -> hostA 
On hostA I ran following command to verify on all platforms mentioned above
$ ssh -f <hostB> "sleep 120" ; exit

where hostA is AIX, HP-UX, Solaris and Linux running the OpenSSH 4.6p1 build with patch #1281 for bug #1306

The test succeeds on AIX,Solaris and Linux, ie. the ssh connection to hostB forks into background
and I am able to disconnect from hostA properly while the ssh session between hostA and hostB still
lives on as expected (until sleep cmd finishes after 120 sec).

I consider this bug as fixed while #52 on HP-UX is not finally resolved as it seems.

On both HP-UX machines the ssh session to hostA hangs until the sleep command finished (120 sec).
Only 'ssh -f <hostB> "sleep 120" </dev/null >/dev/null 2>&1' has helped to disconnect
properly from hostA (running HP-UX).
The login shell on HP-UX is ksh, please let me know if you need additional information.

thx
Jochen
Comment 9 Damien Miller 2007-06-11 13:03:49 AEST
Thanks for testing this. I'm committing the fix in attchment #1281 - it will be in OpenSSH 4.7p1. 

Please file a separate bug for the HP/UX problems.
Comment 10 Damien Miller 2007-06-11 14:17:05 AEST
Actually mark FIXED
Comment 11 Damien Miller 2008-04-04 09:59:21 AEDT
Close resolved bugs after release.
Comment 12 Damien Miller 2008-09-19 02:41:07 AEST
*** Bug 1525 has been marked as a duplicate of this bug. ***