View Issue Details

IDProjectCategoryView StatusLast Update
0000737LDMud 3.3Networkingpublic2018-01-30 04:59
Reporterzesstra Assigned Tozesstra  
PrioritynormalSeverityminorReproducibilityrandom
Status resolvedResolutionfixed 
Platformi686OSGNU/LinuxOS Version4.0
Target Version3.3.720Fixed in Version3.3.720 
Summary0000737: Disconnects of TLS connections with SSL_ERROR_WANT_READ.
DescriptionFew players here have sporadic TLS disconnects when much data is transfered, but one player also claims it happens with specific (but unknown) text. The debug log states:
TLS: Received corrupted data (2). Closing the connection.
2 is here SSL_ERROR_WANT_READ, which is IMHO the equivalent to EGAIN.

This error message originates in pkg-openssl:tls_read() and is called by get_message(). In case of SSL_ERROR_WANT_READ/SSL_ERROR_WANT_WRITE is retries to read up to 6 times and then calls tls_deinit_connection().

Gnomi commented: Is it necessary to close the connection in that case?
I also tend to change this to just wait until the next cycle to read again. But then we would have to change get_message() (and un-encrypted connections) as well, because they also disconnect upon any error in read()/tls_read(), including EAGAIN.

I think, usually this should not happen in the first place, because we do a select() before and read only sockets for which data should be available... But in rare cases select() does not seem to be sufficient for TLS connections to guarantee that...
TagsNo tags attached.

Activities

zesstra

2010-03-16 16:53

administrator   ~0001807

Some more digging into our debug log shows, that some of the disconnects are with errors SSL_ERROR_SSL and SSL_ERROR_SYSCALL as well. These look weird and may well be not our responsibility. But we should try do create better error messages than just issue to plain numbers by using ERR_error_string() and maybe process the error queue.

Gnomi

2010-03-16 17:04

manager   ~0001808

There is a difference between encrypted and unencrypted sessions regarding EAGAIN. While on unencrypted sockets select() should guarantee that there is at least one byte to read, this is not necessarily true for encrypted sockets, because the SSL layer may consume some bytes itself. That's why I think it should be legitimate for tls_read() to try again on an EAGAIN later (and not terminate the socket).

zesstra

2010-03-16 17:18

administrator   ~0001809

And additionally it might also be that upon read the SSL engine first has to write something and vice versa because the might be a handshake going on. Did some search in the net this evening and the SSL people seem to recommend just to repeat the operation some time later (they say one should use select()). So this seems to be the right thing to do. We could either set errno to EAGAIN and change get_message() to not disconnect in that case or return 0 as pretend a sucessful read of 0 bytes.

Oh, BTW: There seems to be an error: tls_read() returns ret. In case of errors it is <0, but ret = SSL_get_error(ip->tls_session, ret); changes it to positive numbers. And these are returned to get_message() which regards the tls_read() then successfull... And because a positive return value of tls_read() is regarded as the number of bytes read, this changes (falsely) the ip->text_end pointer (and the statistics). IMHO garbage will then be processed as player input, between 1 and 8 bytes.

zesstra

2010-03-17 17:55

administrator   ~0001810

Mhmm, we might have been wrong:

>When an SSL_read() operation has to be repeated because of SSL_ERROR_WANT_READ
>or SSL_ERROR_WANT_WRITE, it must be repeated with the same arguments.
I think, we can't guarantee that if we we defer back to get_message() or comm_send_buf() to try later again. The buffer addresses will be still the same, but the sizes to read/write maybe not... In this case, we may have no choice but to disconnect if we don't want to rewrite larger parts of the comm code?

zesstra

2011-01-05 21:54

administrator   ~0001940

Ok, some more research... SSL_write() needs to be called with the exact arguments, because it may already have committed a part of the buffer to the network even if it returns -1. Therefore one has to retry with the exact same arguments until SSL_write() returns > 0. The long explanation can be found in this thread:
http://readlist.com/lists/openssl.org/openssl-users/0/1726.html

If we can't send the data immediately we copy it into a queue of buffers, so the buffer address changes. However, we can make OpenSSL to accept a different buffer for the next call as long as the _content_ of the buffer is the same by setting SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER in the SSL context.

Additionally, we need to do a similar thing for pkg-gnutls.c as well. However in that case, we have to to call gnutls_record_send() and gnutls_record_recv() with either exactly the same parameters (impractical in our case, see above) or with NULL until the interrupted operation succeeds (NULL causes these functions to resume the last call). We might use a static var in the functions to record the success state of the last call.

In both cases, we have to ensure that we try to resend the very same content during retries. This should be the case because the buffers in the queue are not changed.

zesstra

2011-02-19 16:07

administrator   ~0001996

Addition to 0000737:0001940:
GnuTLS stores the content in an internal buffer in case of GNUTLS_E_INTERRUPTED or GNUTLS_E_AGAIN. If gnutls_record_send() is called and there is content in this buffer, it will send the content of this internal buffer instead the content of <buffer>. According to the documentation we should call again with NULL as buffer in this case, but GnuTLS does not care and we will anyway try to send the same content. It is way easier for us to not call with NULL, we just have to ensure to try to re-send the same content.

The only problem this has: GnuTLS might change since we are relying on non-documented, implementation defined behaviour. :-(

Issue History

Date Modified Username Field Change
2010-03-16 16:34 zesstra New Issue
2010-03-16 16:53 zesstra Note Added: 0001807
2010-03-16 17:04 Gnomi Note Added: 0001808
2010-03-16 17:18 zesstra Note Added: 0001809
2010-03-17 17:55 zesstra Note Added: 0001810
2011-01-05 21:54 zesstra Note Added: 0001940
2011-01-06 00:18 zesstra Source_changeset_attached => ldmud.git master 6d2fe77e
2011-01-06 00:18 zesstra Source_changeset_attached => ldmud.git master-3.3 32d5d714
2011-01-10 23:07 zesstra Source_changeset_attached => ldmud.git master 6b166be0
2011-01-10 23:07 zesstra Source_changeset_attached => ldmud.git master 0bba24f8
2011-01-10 23:07 zesstra Source_changeset_attached => ldmud.git master 64c16b54
2011-01-10 23:07 zesstra Source_changeset_attached => ldmud.git master a6652698
2011-01-10 23:07 zesstra Source_changeset_attached => ldmud.git master f991e84e
2011-01-11 09:15 zesstra Assigned To => zesstra
2011-01-11 09:15 zesstra Status new => assigned
2011-01-11 09:16 zesstra Target Version => 3.3.720
2011-02-19 16:07 zesstra Note Added: 0001996
2011-02-20 00:17 zesstra Source_changeset_attached => ldmud.git master-3.3 379bcdc0
2011-02-20 00:17 zesstra Source_changeset_attached => ldmud.git master-3.3 9cbafbca
2011-02-20 00:17 zesstra Source_changeset_attached => ldmud.git master-3.3 176263c4
2011-02-20 00:17 zesstra Source_changeset_attached => ldmud.git master-3.3 bdce6e49
2011-02-20 00:17 zesstra Source_changeset_attached => ldmud.git master-3.3 68eb40b9
2011-02-22 23:58 zesstra Status assigned => resolved
2011-02-22 23:58 zesstra Fixed in Version => 3.3.720
2011-02-22 23:58 zesstra Resolution open => fixed
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master 6b166be0
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master 0bba24f8
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master 64c16b54
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master a6652698
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master f991e84e
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master 6d2fe77e
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 379bcdc0
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 9cbafbca
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 176263c4
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 bdce6e49
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 68eb40b9
2018-01-29 19:59 zesstra Source_changeset_attached => ldmud.git master-3.3 32d5d714
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master 6b166be0
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master 0bba24f8
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master 64c16b54
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master a6652698
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master f991e84e
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master 6d2fe77e
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 379bcdc0
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 9cbafbca
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 176263c4
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 bdce6e49
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 68eb40b9
2018-01-29 22:57 zesstra Source_changeset_attached => ldmud.git master-3.3 32d5d714
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master 6b166be0
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master 0bba24f8
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master 64c16b54
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master a6652698
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master f991e84e
2018-01-30 04:59 zesstra Source_changeset_attached => ldmud.git master 6d2fe77e