@wtfrank said in InterShardSegments for each shard rather than a shared one.:
That's absolutely false. Writes to the single asynchronous block are atomic, so it's 100% possible to do locking albeit with an infinite time horizon
There are two points that everyone keeps missing when I discuss this:
- Atomic writes are not the only requirement for safe locking. You also have to make sure that no one else can read or write the lock between your own read and your own write. A CPU can give you this guarantee (for example with an atomic compare-and-swap instruction), but Screeps cannot.
- Writes are atomic, but you can only write to the entire block. If shardA tries to change a lock flag and shardB tries to write some data, and shardA wins, the data that shardB wrote is gone. At first glance it seems like locking should prevent this situation from occurring at all, but you have to account for slow or glitchy ticks.
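To make the second point concrete, here is a minimal sketch in TypeScript. It is a toy in-memory model, not the game API: `SharedSegment`, its `read`/`write` methods and the `SegmentContents` layout are all names made up for illustration. The only thing it assumes is what is stated above, namely that a write replaces the whole block.

```typescript
// Toy model of one shared segment: the only operations are "read everything"
// and "replace everything". All names here are hypothetical.
class SharedSegment {
  private raw: string = JSON.stringify({ lock: null, data: {} });

  read(): string {
    return this.raw;
  }

  write(value: string): void {
    this.raw = value; // atomic, but it replaces the entire block
  }
}

interface SegmentContents {
  lock: string | null;
  data: { [key: string]: string };
}

const segment = new SharedSegment();

// Both shards take a snapshot during overlapping ticks.
const seenByA: SegmentContents = JSON.parse(segment.read());
const seenByB: SegmentContents = JSON.parse(segment.read());

// shardB stores some data and writes the whole block back.
seenByB.data["market"] = "sell energy";
segment.write(JSON.stringify(seenByB));

// shardA only wants to change the lock flag, but it also has to write the
// whole block, and its snapshot predates shardB's data.
seenByA.lock = "shardA";
segment.write(JSON.stringify(seenByA));

const finalState: SegmentContents = JSON.parse(segment.read());
console.log(finalState.lock);           // shardA
console.log(finalState.data["market"]); // undefined: shardB's write is gone
```

Neither write was torn or corrupted, so "writes are atomic" holds the whole time. The data is lost anyway, because shardA wrote back a stale copy of everything it did not mean to touch.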
Thus, you cannot implement an independent locking system that is 100% safe, because we do not have the guarantees required for this. As previously stated, locking systems which require cooperation between shards are possible, but will block communication if one shard is down (and any attempt to resolve this makes it unsafe again).
I contend that anyone who thinks they have implemented a 100% safe locking system has made one of the following assumptions:
- That shards all run at the same rate and no shard is ever down (not true)
- That only one shard is executing at a time (not true, shards are executing simultaneously on different servers)
- That you can read and write the segment atomically without any other shard reading or writing in between (not guaranteed, since shards are executing simultaneously)
- That there is a limit to the time between reading the segment and writing it (not true, there are no guarantees here)
- That you can atomically update one part of the segment (e.g. a flag) without changing the rest of the segment (not true, it's all one block)
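As a rough illustration of why the fourth assumption matters, here is a small TypeScript sketch of a "write my name, then re-read to confirm" lock. It is a simulation under stated assumptions, not anyone's actual implementation: `sharedBlock` and `ShardAttempt` are hypothetical names, and the block is reduced to a single owner string to keep it short.

```typescript
// Toy model: the shared block holds only the lock owner's name ("" = free).
let sharedBlock: string = "";

// One acquisition attempt, split into the steps a shard performs across
// ticks: read, write, then re-read to confirm its own write survived.
class ShardAttempt {
  private shard: string;
  private sawFree: boolean = false;

  constructor(shard: string) {
    this.shard = shard;
  }

  readStep(): void {
    this.sawFree = sharedBlock === "";
  }

  writeStep(): void {
    if (this.sawFree) {
      sharedBlock = this.shard; // acts on a possibly stale observation
    }
  }

  confirmStep(): boolean {
    return sharedBlock === this.shard;
  }
}

const attemptA = new ShardAttempt("shardA");
const attemptB = new ShardAttempt("shardB");

attemptA.readStep(); // shardA sees the lock free...
attemptB.readStep(); // ...and so does shardB

attemptB.writeStep();
const bBelievesItOwnsLock: boolean = attemptB.confirmStep(); // shardB proceeds

// shardA's write lands late (slow tick, paused shard). Nothing bounds this delay.
attemptA.writeStep();
const aBelievesItOwnsLock: boolean = attemptA.confirmStep(); // shardA proceeds too

console.log(bBelievesItOwnsLock && aBelievesItOwnsLock); // true
```

Waiting a few ticks before confirming makes this interleaving rarer, but without a guaranteed upper bound on the gap between shardA's read and its write, no fixed wait rules it out.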
Now I'm sure many people have locking systems implemented and working fine 99.9% of the time. The point is that it's not 100% safe, which means you have to think about what happens when it goes wrong. I suspect for most people the answer is that they don't care (so long as their locking system recovers without blocking everything). They'll lose some data, but it either wasn't important or will be re-written again soon enough. This is perfectly valid, but working 99.9% of the time and not caring when it goes wrong is not the same as provably 100% safe.
The interesting thing about this is that if any one of the assumptions above were actually guaranteed, we could implement 100% safe locking. It's entirely possible that the fourth one (a limit on the time between reading the segment and writing it) already holds in practice, but we would need guarantees from the devs before we could rely on it.
Of course, the new InterShardMemory scheme is much simpler and solves all these problems.
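For comparison, here is a toy TypeScript model of the one-segment-per-shard idea, assuming each shard may write only its own segment and may read everyone else's. `PerShardSegments` is a hypothetical class for local simulation. Its method names are loosely modelled on the game's `InterShardMemory.setLocal` and `InterShardMemory.getRemote`, but the signatures are simplified (the writer is passed in explicitly), so don't treat it as the real API.

```typescript
// Toy stand-in for per-shard segments: one block per shard, one writer per
// block. This is a local simulation, not the game API.
class PerShardSegments {
  private segments: { [shard: string]: string } = {};

  // A shard can only replace its own block.
  setLocal(writer: string, value: string): void {
    this.segments[writer] = value;
  }

  // Any shard can read any block; null if nothing has been written yet.
  getRemote(shard: string): string | null {
    const value: string | undefined = this.segments[shard];
    return value === undefined ? null : value;
  }
}

const shardSegments = new PerShardSegments();

// However these two writes interleave, neither can clobber the other,
// because they never target the same block.
shardSegments.setLocal("shard0", JSON.stringify({ request: "send energy" }));
shardSegments.setLocal("shard1", JSON.stringify({ state: "idle" }));

const fromShard0 = shardSegments.getRemote("shard0");
const fromShard1 = shardSegments.getRemote("shard1");
console.log(fromShard0, fromShard1);
```

With a single writer per block there is nothing to lock: the lost-update case needs two writers on the same block, and that can no longer happen.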