One of the other odd things about working on a game with no game development experience, yet substantial systems and distributed systems experience, is that I think about what could be called "anti-cheat" more often than I probably should.
I have tried to look into how multi-player game state is typically managed but I only find articles and blog posts which explain the problem in terms I roughly already know (sending mutations versus snapshots, where to validate state, etc). What I want to know is how this is done in actual practice and whether there are any "clearly correct" or "clearly incorrect" approaches to parts of the problem, since knowing them would avoid a lot of duplicated stumbling about in the dark.
So, I have been looking at this as a more general distributed systems problem. In this model, there are two big issues I have to resolve: (1) only the server can be trusted and all clients should be considered cheaters (meaning the rest of the network cannot directly trust them) and (2) the game has real-time rules while operating on a naturally asynchronous network and having discrete "ticks" of logic.
Generally, I have been using a mutation-based design for many reasons: Mutations are small, all parties can validate the state transition of the mutation (assuming they agree on the starting state - the previous tick), and mutations can have consequent mutations (which is a problem most games don't have - at least not to this degree). For the most part, things have been going well (although I was wondering if a partial snapshot design may be required in the future, at least for other entities - since the other peers only need position, not all the other internal state which should probably be hidden) but dealing with gravity has created new questions.
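To make the "all parties can validate the state transition" point concrete, here is a minimal sketch of a mutation which carries its own expected starting state. The names (`MoveMutation`, `apply_move`) and the one-block-per-mutation rule are assumptions for illustration, not the game's actual code.

```python
from dataclasses import dataclass

# Hypothetical block coordinates (x, y, z).
Position = tuple[int, int, int]


@dataclass(frozen=True)
class MoveMutation:
    entity_id: int
    tick: int
    before: Position  # where the sender believes the entity was last tick
    after: Position   # where the entity should be once the mutation applies


def apply_move(positions: dict[int, Position], m: MoveMutation) -> bool:
    """Apply the mutation only if we agree with its claimed starting state."""
    if positions.get(m.entity_id) != m.before:
        return False  # we disagree about the previous state, so reject
    dx, dy, dz = (a - b for a, b in zip(m.after, m.before))
    if abs(dx) + abs(dy) + abs(dz) != 1:
        return False  # assumed rule: exactly one block per mutation
    positions[m.entity_id] = m.after
    return True


positions = {1: (0, 5, 0)}
accepted = apply_move(positions, MoveMutation(1, 7, (0, 5, 0), (1, 5, 0)))
# A second mutation built from the old position is now stale and is rejected.
stale = apply_move(positions, MoveMutation(1, 8, (0, 5, 0), (0, 5, 1)))
```

Any party holding the same previous-tick state can run the same check, which is what keeps the mutations small. The stale rejection at the end is also exactly the failure that shows up once two writers touch the same entity.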
Gravity is currently being applied by the server, while motion is applied by each client, for themselves. This can create a problem when trying to, for example, move to catch a ledge while falling. In this case, the movement will be impossible since the client and the server will never be absolutely in sync, so the server will reject all of the client's attempts to move (since the pre-location check will fail).
I was thinking that I could allow some finite number of "previous locations" and apply from there, but this does cause some problems in that consequent mutations will still happen, even if their cause was undone. This also creates an opportunity for cheating, but a pretty small one (if I haven't seen you move in 10 ticks, you could use knowledge from a recent tick to influence 10 ticks ago, but who is to say that isn't honestly what happened?).
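A rough sketch of the "previous locations" idea, assuming the server keeps a short per-entity history and accepts a move whose claimed starting position appears anywhere in it. `LocationHistory` and the window of 10 are hypothetical, and this deliberately ignores the consequent-mutation problem described above.

```python
from collections import deque
from typing import Optional

# Hypothetical block coordinates (x, y, z).
Position = tuple[int, int, int]


class LocationHistory:
    """Remember an entity's last `window` positions, newest last."""

    def __init__(self, start: Position, window: int = 10):
        # A bounded deque silently drops the oldest position when full.
        self._recent = deque([start], maxlen=window)

    def record(self, pos: Position) -> None:
        self._recent.append(pos)

    def ticks_ago(self, claimed: Position) -> Optional[int]:
        """Return how many ticks back `claimed` was current, else None."""
        for age, pos in enumerate(reversed(self._recent)):
            if pos == claimed:
                return age
        return None


history = LocationHistory((0, 5, 0))
history.record((0, 4, 0))  # server-applied gravity, tick 1
history.record((0, 3, 0))  # server-applied gravity, tick 2

# The client still thinks it is at (0, 5, 0); that is 2 ticks stale but
# inside the window, so the server could choose to apply the move from there.
age = history.ticks_ago((0, 5, 0))
```

A non-`None` age only says the claim is plausible. The server would still have to decide what to do with anything that happened in the intervening ticks, which is where the undone-cause problem comes back.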
Another possibility is to state that all data domains have strict owners, including the server, and the server otherwise only applies consensus rules onto the client's data domains. In this case, the client would apply gravity and the server would just check that it was applied correctly. This is a more complex transformation, would cause some bizarre visual behaviours (a client watching their peers fall would see them fall at times defined by their latencies), but may be ultimately the most correct way to do this since it avoids the current problem of the client and server both trying to write to the same data domain (the entity) while being typically out of sync. This would mean that the server doesn't apply the gravity but does require that the client does so in order to move (if you end a tick above air, you need to fall before moving in any other direction). This does mean that the client would be "stuck" while it tries to rationalize this, although a well-behaved client would rarely encounter this (a block is asynchronously broken below them while they are trying to walk on it).
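As a sketch of what "the server would just check that it was applied correctly" might look like, the function here only validates a client-owned move and never writes the position itself. The world model (a set of solid blocks, `y` as the vertical axis, falling one block per tick, walking one horizontal block per tick) is an assumption for illustration.

```python
# Hypothetical block coordinates (x, y, z), with y as the vertical axis.
Position = tuple[int, int, int]


def is_supported(pos: Position, solid: set[Position]) -> bool:
    """An entity is supported if the block directly beneath it is solid."""
    x, y, z = pos
    return (x, y - 1, z) in solid


def validate_client_move(
    before: Position, after: Position, solid: set[Position]
) -> bool:
    """Check a move the client applied to its own entity."""
    if after in solid:
        return False  # cannot move into a solid block
    if not is_supported(before, solid):
        # Ended the last tick above air: the only legal move is to fall.
        return after == (before[0], before[1] - 1, before[2])
    dx = after[0] - before[0]
    dy = after[1] - before[1]
    dz = after[2] - before[2]
    # Assumed walking rule: exactly one horizontal block, no vertical change.
    return dy == 0 and abs(dx) + abs(dz) == 1


solid = {(0, 0, 0), (1, 0, 0)}
walk_ok = validate_client_move((0, 1, 0), (1, 1, 0), solid)       # on ground
hover_walk = validate_client_move((2, 1, 0), (3, 1, 0), solid)    # above air
fall_ok = validate_client_move((2, 1, 0), (2, 0, 0), solid)       # falls first
```

With this shape only the client ever writes the entity's position, so the two-writers problem goes away. The cost is the "stuck" case: a client whose floor was just broken has every non-falling move rejected until it notices.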
Of course, this does still leave open the question of who rationalizes combat, so this may imply that the server always needs to rationalize access to the entity domain, so maybe gravity should be resolved there. However, if we are to consider "the field, not the entity", then combat and gravity might themselves be different domains, rather than both belonging to a single "entity domain".
If only I had someone with whom to talk through these matters. I suspect that I am over-complicating something.
Jeff