Saturday, October 03, 2020

LuaSocket connection problems on Mac OS Catalina 10.15

UPDATE: If you are interested in the below, then you should check out the update.

Our game runs as either server or client. Recently, the game client on a Mac stopped connecting to a server running on the same Mac over localhost (a setup we use for testing).

We are using copas, which calls LuaSocket, but I reproduced the problem without copas, which rules it out as the cause.

So:

  • Linux Desktop Client to Linux Desktop Server - works fine
  • Mac Desktop Client to Linux on Raspberry Pi test server - works fine
  • Mac Desktop Client to same Mac Desktop server - doesn't work.

The error in the client (returned by the following code) is 'connection refused'. 

local success, err = copas.connect(master_socket, ServerIPAddress, ServerPort)

I also pointed the Mac Desktop Client at port 22 on the same machine (I have ssh enabled on it), and this line did not return 'connection refused'.

Tested with two sets of Python 3 code as the server:


import socket

# IPv6 listener
sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
server_address = ('localhost', 25561)
sock.bind(server_address)
sock.listen(1)
connection, client_address = sock.accept()

and

import socket

# IPv4 listener
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_address = ('localhost', 25561)
sock.bind(server_address)
sock.listen(1)
connection, client_address = sock.accept()
 
The client connected to the first (IPv6) server but not to the second (IPv4) one. This leads me to believe that the client is trying to connect over IPv6, and that the connection fails because the game server is not listening on IPv6.
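
The same check can be done from the client side. Below is a minimal Python sketch (not the game code; the function name probe and the 2 second timeout are my own choices for illustration) that resolves a host name, tries to connect to each address it gets back, and reports the outcome per address family.

```python
import socket

def probe(host, port, timeout=2.0):
    """Try every TCP address that `host` resolves to and report each outcome."""
    results = []
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM, proto=socket.IPPROTO_TCP):
        name = getattr(family, "name", str(family))
        try:
            # Creating the socket is inside the try so that an unsupported
            # address family is reported instead of raised.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "connected"
        except OSError as exc:
            outcome = exc.strerror or str(exc)
        results.append((name, sockaddr[0], outcome))
    return results
```

Calling probe("localhost", 25561) while one of the test servers above is running should show which of the localhost addresses accepts the connection and which one is refused.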

Looking at the server code in the game it uses:

self.Server = socket.bind("*", self.Port)

This socket.bind() is actually a sort of short form for socket.tcp or socket.tcp6 followed by sock.bind and sock.listen, as mentioned in the LuaSocket manual. It transforms "*" into "0.0.0.0", then calls getaddrinfo() and steps through the results. As soon as it gets a result, it returns the listening TCP server socket formed from that single entry.

You can view socket.bind() in socket.lua from the LuaSocket library - it's not in the C part of the library, but in the Lua part.

I see two problems here. First, it's not clear to me that "0.0.0.0" will return IPv6 results. In fact, I can see that on the Mac it doesn't, but maybe other systems do or, more likely, they bind to both IPv4 and IPv6 from a single entry. Second, if there are multiple interfaces on the server, maybe it's possible for getaddrinfo to return multiple results, in which case we would only form a server on some of them? I haven't checked this second item out yet.
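
The first point can be checked outside LuaSocket. Here is a small Python sketch (the function name is made up for illustration) that lists what getaddrinfo returns for a numeric wildcard address. A numeric IPv4 literal such as "0.0.0.0" resolves only to an AF_INET entry, so code that stops at the first result never sees an IPv6 candidate.

```python
import socket

def wildcard_families(host, port=25561):
    """Return (family, address) pairs that getaddrinfo reports for `host`."""
    infos = socket.getaddrinfo(host, port,
                               type=socket.SOCK_STREAM,
                               proto=socket.IPPROTO_TCP,
                               flags=socket.AI_PASSIVE)
    return [(getattr(family, "name", str(family)), sockaddr[0])
            for family, _type, _proto, _canon, sockaddr in infos]

print(wildcard_families("0.0.0.0"))
```

Passing "::" instead should list the IPv6 wildcard, assuming the machine has IPv6 support.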

Anyway, to me it looks like this behaviour has changed on the Mac, because our code has been stable for years, and it definitely worked up until relatively recently.
 
My workaround is to iterate over a list of addresses, something like this:

local server_addresses = { "0.0.0.0", "::"} -- previously "*"

function GulpNetworking:enableServer(ServerPort)

    local success = false

    -- Bind the host and port
    for _, server_address in ipairs(server_addresses) do
        
        local server, err = socket.bind(server_address, ServerPort)
        if (server) then
            table.insert(self.Servers, server)

            copas.addserver(server, ServerReceiveFunction)

    -- etc.,etc...


NOTE: when this code runs on a Raspberry Pi as a test server, I get:

> Started server 0.0.0.0 on port 25561
socket.bind error       address already in use  for     ::
Started server(s)

Which probably means that it has bound to both the IPv4 and IPv6 ports with one call.
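
Another possible reading of that error (an assumption on my part, not something I have verified inside LuaSocket) is that on Linux an IPv6 wildcard socket is dual-stack by default, so binding "::" clashes with the IPv4 wildcard already bound to the same port. The Python sketch below shows one way to keep the two listeners independent by setting IPV6_V6ONLY on the IPv6 socket before binding. The function name and return shape are hypothetical, and this is a standalone experiment rather than a description of what socket.bind() does.

```python
import socket

def bind_listeners(port, addresses=("0.0.0.0", "::")):
    """Bind one listening socket per address; return (listeners, errors)."""
    listeners, errors = [], []
    for address in addresses:
        family = socket.AF_INET6 if ":" in address else socket.AF_INET
        sock = None
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
            if family == socket.AF_INET6:
                # Restrict this socket to IPv6 so it does not also claim
                # the IPv4 side of the port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind((address, port))
            sock.listen(5)
            listeners.append(sock)
        except OSError as exc:
            # Report the failure for this address and carry on with the rest.
            if sock is not None:
                sock.close()
            errors.append((address, exc.strerror or str(exc)))
    return listeners, errors
```

With this arrangement each address either ends up in the listener list or in the error list, which makes it easy to log which families the server is really listening on.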

For testing, I added a 'print_table(addrinfo)' into socket.lua and got the following:
 
1:  
  family: inet
  addr: 0.0.0.0
Started server 0.0.0.0 on port 25561
1:  
  family: inet6
  addr: ::
socket.bind error       address already in use  for     ::
Started server(s)


These addrinfo results are the same as on the Mac. This means it's calling socket.tcp(), but I haven't debugged what that creates internally on the Mac and on Linux, or whether the difference is in bind. This will require further investigation.

Other Notes

In previous (Python-based) server code, I specifically split the IPv4 and IPv6 server receiving code for various reasons, mostly because, if I remember this correctly, Windows XP didn't support mapped addresses at the time (no IPV6_V6ONLY socket option).

I also used HOST = socket.gethostname() to pass into getaddrinfo() on Windows machines, although on Linux I passed in None. I'm not sure how this relates to the native C API... it probably maps to a NULL pointer, which appears to be supported on Unix/Linux.
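
For reference, the Python documentation says that passing None as the host hands NULL to the underlying C API. Combined with AI_PASSIVE, that asks for the wildcard addresses suitable for bind(). A minimal sketch (the function name is my own):

```python
import socket

def passive_addresses(port):
    """List the wildcard addresses getaddrinfo suggests for a listening socket."""
    infos = socket.getaddrinfo(None, port,
                               family=socket.AF_UNSPEC,
                               type=socket.SOCK_STREAM,
                               proto=socket.IPPROTO_TCP,
                               flags=socket.AI_PASSIVE)
    return [(getattr(family, "name", str(family)), sockaddr[0])
            for family, _type, _proto, _canon, sockaddr in infos]

print(passive_addresses(25561))
```

Which entries come back, and in what order, depends on the platform, so it is worth printing the list on each machine rather than assuming it.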

Happy networking!
