Discover uses processId to distuingish nodes, which is not unique across multiple servers

tobspr

I'm planning to use colyseus on a project with high numbers of ccu, for which I need it to scale on multiple servers.

However, the current proxy uses the processId to distinguish between nodes which is not unique across multiple servers:

Issue Description:

The getNodeAddress of the discovery looks like this (https://github.com/colyseus/colyseus/blob/master/src/discovery/index.ts#L12)

async function getNodeAddress(node: Node) {
  const host = process.env.SELF_HOSTNAME || await ip.v4();
  const port = process.env.SELF_PORT || node.port;
  return `${node.processId}/${host}:${port}`; // here
}

This generated address is then used to publish the presence in the discovery channel just below (https://github.com/colyseus/colyseus/blob/master/src/discovery/index.ts#L18):

export async function registerNode(presence: Presence, node: Node) {
  const nodeAddress = await getNodeAddress(node);
  await presence.sadd(NODES_SET, nodeAddress);
  await presence.publish(DISCOVERY_CHANNEL, `add,${nodeAddress}`); // here
}

The proxy then reconstructs this information (https://github.com/colyseus/proxy/blob/master/discovery.ts#L17)

function parseNode(data: string): Node {
    const [processId, address] = data.split("/");
    return { processId, address };
}

And uses it to index its internal mappings (https://github.com/colyseus/proxy/blob/master/proxy.ts#L44):

function register(node: Node) {
  // skip if already registered
  if (processIds[node.processId]) { return; }

What's the issue?

When the server runs on two instances, it could in theory have the same process id on both systems. The proxy would then forward the requests to the wrong server.

Suggestion

Instead of indexing by node.processId, I suggest to index by hash(processId + address) where hash could be the first 8 digits of sha256 for example.
I haven't actually started working with colyseus yet so this would require the review of a more experienced person. I'd be happy to make a PR though.

endel

Hi @tobspr, welcome! 👋 You did a great job digging the sources! Awesome!

It is very unlikely that the processId is going to collide when spawning a small number of servers, though. Ids are generated using nanoid which does a pretty decent job. Maybe ids could collide if you have an insane amount of ids to be generated at exactly the same time, like +10,000

Have you experienced some issue when deploying on multiple servers? If so, it may be that the issue you're experiencing has a different cause than processId collision.

Cheers!

tobspr

Hey,

I was actually assuming node.processId would equal to process.pid, but since nanoid is used then the whole thing is not an issue at all - thanks for the super quick response! (I should probably have digged a bit further ...)

Thanks!