Optimizations roadmap
-
@dissi: it was not a technical question about code.. I know how to get public segments of other players.
The problem is the other player would not know when I put some data into my public segment... so the plan is to put something into a public segment and then tell somebody in slack to fetch it?
-
W4rl0ck: you have to work out the protocol you build on top of the segments with the players who are also using them. There are a number of different options for this. The Culture, for instance, is using terminals to transfer private keys as well as segment IDs to watch. At the moment, though, there is no "universal" messaging system (although one could easily be built by the community).
-
Yeah. But the devs said in another thread that they don't like using terminals for communication...
-
That was just one example, it was not intended to be the only one. You could easily share private keys in slack, or not use private keys at all. You can agree on communication segment ids and code them in yourself. The point is that the protocol itself is something that the group of people communicating need to come up with themselves.
This is a bit of a tangent though. If you are interested in creating a standard protocol for this you can join #diplomacy in slack where we're discussing these things.
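One way around the "how does the other player know" problem, without any notice in slack, is to keep a sequence number in the agreed segment and have the reader poll it. A minimal sketch of that idea in Python (the game itself runs JavaScript; the dict below only stands in for public segments, and `COMM_SEGMENT_ID`, `publish` and `poll` are made-up names, not game API):

```python
import json

COMM_SEGMENT_ID = 42  # hypothetical: agreed on by both players beforehand

# Stand-in for the game's public segments: {(player, segment_id): raw string}
public_segments = {}


def publish(player, payload, state):
    """Writer side: bump a sequence number and store the message as JSON."""
    state["seq"] = state.get("seq", 0) + 1
    message = {"seq": state["seq"], "payload": payload}
    public_segments[(player, COMM_SEGMENT_ID)] = json.dumps(message)


def poll(player, state):
    """Reader side: return the payload only if the sequence number advanced."""
    raw = public_segments.get((player, COMM_SEGMENT_ID))
    if raw is None:
        return None
    message = json.loads(raw)
    if message["seq"] <= state.get("last_seen", 0):
        return None  # nothing new since the last poll
    state["last_seen"] = message["seq"]
    return message["payload"]


writer_state, reader_state = {}, {}
publish("alice", {"type": "trade_offer", "resource": "energy"}, writer_state)
first = poll("alice", reader_state)   # a new message is returned
second = poll("alice", reader_state)  # same sequence number, so None
```

The reader pays for a check every time it polls, but nobody has to be told out of band that something changed.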
-
> Will the creep memory be passed along? I’d love to at least give it some purpose once it’s on the other side.
No it won’t, since the player may have custom Memory structure. But the creep’s name will be transferred, and you will be able to pass custom text messages to other shard as well (somehow).
> You could probably make a memory segment public per shard, which you can load in other segments. It could be loosely coupled
This would require connecting runtime workers to all shards’ memory storages, which creates too much coupling between all backend components. The idea is that inter-shard connectivity is handled by one single isolated component running in the background, passing creeps and text messages between shards.
-
I think that the concerns here regarding the competitive advantage of young vs old shards are well expressed. I think that it should be possible to address these concerns by scaling the core progression statistics based on each shard's time dilation.
Base resource scaling could be something like y = 2 - 2/x, where x is the shard's tick time in seconds, so that at a 2s tick there is no benefit and the bonus is not linear (a linear bonus would encourage DoS).
For example, if 3 shards exist: A (the origin, 5s tick), B (first new shard, 3s tick), and C (brand new shard, 2s tick), then several resources could be scaled:
* GCL gained on an account might be scaled up per y = 2 - 2/x. Shard A would contribute 60% additional GC per RC, B would contribute 33% additional GC per RC.
* Power processed could have limited scaling perhaps on the function y = 1.5 - 1/x. Shard A would contribute 30% additional Power per Power processed. B would contribute a 16.7% bonus.
* The 5% market surcharge may be reduced to y = 0.05 * (0.5/x + 0.75). Shard A would charge 4.25% on market orders. Shard B would charge 4.58%.
CPU expenditure would not be affected.
There are probably some edge cases I haven't considered, but scaling of this kind would slightly blunt the advantage players get from aggressively populating new shards, and provide some incentive to span new and old shards.
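To make the numbers easy to check, here is a small Python sketch that evaluates the three formulas as written for the 5s, 3s and 2s ticks. The function names are only illustrative and x is taken to be the tick time in seconds:

```python
def gcl_multiplier(tick_seconds):
    """GCL scaling: y = 2 - 2/x."""
    return 2 - 2 / tick_seconds


def power_multiplier(tick_seconds):
    """Power scaling: y = 1.5 - 1/x."""
    return 1.5 - 1 / tick_seconds


def market_fee(tick_seconds):
    """Market surcharge: y = 0.05 * (0.5/x + 0.75)."""
    return 0.05 * (0.5 / tick_seconds + 0.75)


shards = {"A": 5, "B": 3, "C": 2}  # shard name -> tick time in seconds

for name, tick in shards.items():
    gcl_bonus = (gcl_multiplier(tick) - 1) * 100
    power_bonus = (power_multiplier(tick) - 1) * 100
    fee = market_fee(tick) * 100
    print(f"{name} ({tick}s): GCL +{gcl_bonus:.1f}%, "
          f"power +{power_bonus:.1f}%, market fee {fee:.2f}%")
```

All three functions are neutral at a 2s tick (no bonus, 5% fee) and flatten out as ticks get slower.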
-
Dear Screeps devs,
I'm very happy to see the team's dedication to handling the performance issues in the game. This is certainly a core issue for the game to continue thriving and should take precedence over the development of new features.
On the subject of performance, I believe the first and most important aspect is CPU stability. From the various individuals I've spoken with, tick times are usually a secondary concern compared to tick variability. The VM proposition for this topic sounds fantastic. Looking very much forward to that.
Regarding the proposal of world sharding. I am honestly quite worried.
1. It fundamentally changes the game without specifically improving gameplay. It adds another layer of complexity which players have to take into account.
3. It adds a significant layer of game engine complexity which will likely require major dev investment and maintenance.
4. Its effects on the community are hard to quantify. It's incredibly risky to play around with such a small, fragile community.
5. It's hard to accept that all technical options have been explored with regards to optimizing performance. To me (not a specialist in the industry/field), it feels like it should be possible to parallelize and scale the processing of the game world. Is the DB the only bottleneck?
Afaik the concept of DB scalability is quite a studied field of computer science. What is fundamentally different about the Screeps world compared to other large-scale services/applications?
Afaik Screeps uses Mongodb. Why is sharding not a valid option?
Perhaps if additional information would be provided to the community about this problem, we might be able to assist. There's quite a number of individuals in the community with experience in the programming field :).
Kind regards,
Atavus
-
Artem,
Have you guys considered talking to CCP? They have a large single-shard game, and while they are using a different database platform they may have some helpful advice on how to scale a large single-instance database.
-
EVE Online is completely different. While everything is one cluster, every star system is a separate server that runs on its own. It doesn’t matter if Jita lags, your star system is fine. The same in WoW. It doesn’t matter if Ironforge lags, Stormwind will be fine. In Screeps the next tick can’t begin before the last one is finished with everyone and everything.
In Screeps there is stuff that can’t run in parallel. The market system for example... if you sell something in EVE Online, it is directly taken out of your possession. So all transactions can run in a separate database. In Screeps all terminal transfers and market transactions have to run after all rooms are processed... and they can’t really be run in parallel. It has to check if the resource is still available in the seller's terminal and if the destination terminal still has space free. That’s why they introduced the terminal cooldown and why they don’t like using terminals for messaging.
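The order dependence shows up in a small sketch. This is Python with made-up terminal capacities and a hypothetical `apply_transfers` helper, not the engine's actual code. Each transfer has to be validated against the state left behind by the previous one, so the batch cannot simply be split across workers:

```python
def apply_transfers(terminals, transfers):
    """Apply transfers one by one; each check sees the result of the previous one."""
    results = []
    for sender, receiver, amount in transfers:
        src, dst = terminals[sender], terminals[receiver]
        has_resource = src["stored"] >= amount
        has_space = dst["capacity"] - dst["stored"] >= amount
        if has_resource and has_space:
            src["stored"] -= amount
            dst["stored"] += amount
            results.append(True)
        else:
            results.append(False)  # rejected: seller ran dry or receiver is full
    return results


terminals = {
    "seller": {"stored": 100, "capacity": 300},
    "buyer1": {"stored": 0, "capacity": 300},
    "buyer2": {"stored": 0, "capacity": 300},
}

# Both buyers want 80 units, but the seller only has 100.
outcome = apply_transfers(
    terminals,
    [("seller", "buyer1", 80), ("seller", "buyer2", 80)],
)
```

Checked independently against the starting state, both transfers would look valid; only processing them in sequence shows that the second one has to fail.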
-
> 1. It fundamentally changes the game without specifically improving gameplay. It adds another layer of complexity which players have to take into account.
I actually think adding an extra dimension to the game could add some interesting gameplay mechanics. It would allow the devs to give out larger CPU rewards for higher GCL (with the limitation that each shard has a maximum of 300). The pathfinding challenge would be *amazing*, as would the ability to send surprise raids through other shards. I think it could be a cool way to expand the game- as long as each shard is roughly "equal" with the others in things like resources and tick rates.
Hell, this could bring my ultimate dream by creating a fully automated shard (no console or manual placement of flags or structures).
> 3. It adds a significant layer of game engine complexity which will likely require major dev investment and maintenance.
I don't see why this would be the case. Each shard is completely separate with the exception of a communication API for specific functions (transfer creeps, credits or "messages"). I don't see this as a huge project; the rewrite of the front end is probably a larger project.
> 4. Its effects on the community are hard to quantify. It's incredibly risky to play around with such a small, fragile community.
Agreed, although I think it can be mitigated by making sure the worlds interact as much as possible.
> 5. It's hard to accept that all technical options have been explored with regards to optimizing performance.
Also agreed.
-
@Dissi/Artem: have you considered switching to Apache Cassandra? We use it at my company, and I can say this is a database that is built for horizontal scalability from the ground up. Originating at Facebook, it is now used at other very large companies. Citing their homepage:
Some of the largest production deployments include Apple's, with over 75,000 nodes storing over 10 PB of data, Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million requests per day), and eBay (over 100 nodes, 250 TB).
I can't believe that Cassandra would really have any performance problems with handling all Screeps world data in a single database cluster.
-
If a big difference remains between the tick times of the different shards, those who don't utilize the fast shard will be at a disadvantage on the slower shards.
The fast shard could be used to create an attack force to send back to the slow shard. When power creeps are implemented, the people that utilize the fast shard will be at a huge advantage if they decide to attack someone that has not used the fast shard. They will level their power creep on the fast shard.
-
> To me (not a specialist in the industry/field), it feels like it should be possible to parallelize and scale the processing of the game world. Is the DB the only bottleneck?
This is correct. Processing of the game world is well-parallelized; it is the database that is the bottleneck, not processing.
> Afaik the concept of DB scalability is quite a studied field of computer science. What is fundamentally different about the Screeps world compared to other large-scale services/applications?
Because other services are quite different. W4rl0ck explained it quite well in the post above. This proposed world sharding change is what can make Screeps closer to traditional use cases, and thus more applicable for traditional solutions.
> Afaik Screeps uses Mongodb. Why is sharding not a valid option?
Because we tried it in all possible ways, and it made performance worse rather than better. Distributing every DB request (tens of thousands of them every second) among a cluster of shards incurs huge network and CPU overhead. This is not a topic we are unfamiliar with: we literally spent months studying this area and all possible options. Database sharding always comes at a cost. It is better to fix the flaw in our architecture than to keep looking for a solution to a man-made problem.
-
> If a big difference remains between the tick times of the different shards, those who don’t utilize the fast shard will be at a disadvantage on the slower shards.
On the other hand, people on the old shard can benefit from the well developed market and relations to established players. Either way has its pros and cons, and every player is free to choose on which shard he is willing to play.
-
> @Dissi/Artem: have you considered switching to Apache Cassandra?
Very interesting, have not considered it yet. We’ll make some benchmarks on our dataset and workload profile using it, thanks for the tip.
-
> Very interesting, have not considered it yet. We’ll make some benchmarks on our dataset and workload profile using it, thanks for the tip.
You may have to model your data differently than you're used to in order to really reap the benefits of Cassandra. Look closely at the way partition keys work. I'm willing to answer questions and give advice on data modelling, no strings attached, NDA is okay if needed.
-
It looks like Cassandra is a better fit than MongoDB for big data set cases, not for higher read/write throughput. See this benchmark for example. In our case the data set is relatively small and completely fits into RAM of one single machine, but it is the requests per second rate that is crucial.
-
> It looks like Cassandra is a better fit than MongoDB for big data set cases, not for higher read/write throughput.
Well, but read/write throughput in Cassandra scales linearly if you add more machines. So you don't have the problem of "overhead due to replication" killing the performance benefit of scaling horizontally.
Without going too much into detail, Cassandra achieves this because the partition key allows calculating which node(s) are responsible for a given query, and the driver will only ask these nodes. As a simple example (without duplicating data across nodes for fault tolerance), if I have 5 nodes, each of them will contain one fifth of the data, so each node will handle only one fifth of the queries. Thus, load is spread evenly, and adding more nodes helps improve throughput.
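The routing idea can be sketched in a few lines of Python. This is a deliberate simplification: it hashes the key and takes it modulo the node count, whereas Cassandra itself places nodes on a token ring (with a Murmur3-based partitioner by default). The node names, the `room-N` keys and `node_for` are made up for illustration:

```python
import hashlib
from collections import Counter

NODES = ["node0", "node1", "node2", "node3", "node4"]


def node_for(partition_key):
    """Map a partition key to the single node responsible for it."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return NODES[token % len(NODES)]


# Route 10,000 hypothetical per-room queries and count the load per node.
keys = [f"room-{i}" for i in range(10_000)]
load = Counter(node_for(key) for key in keys)

for node in NODES:
    print(node, load[node])
```

Each query touches exactly one node, and the counts come out at roughly a fifth of the total per node, which is the even spread described above.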
-
Fair points.
If the primary goal of the world shard is to isolate the data structures/processing better, then I would be in favor of forcing the tick rate of all shards to be synchronized.
Effectively, you split the world into shards for the necessary performance benefit, but it's still a single synchronized game world.
-
Synchronized ticks would mean all shards would have the tick rate of the slowest shard. I don't think that would make moving attractive: you would be starting in an empty world with a 5 sec tick.
I don't think the current plan is to crush the world into pieces ... but to add alternative worlds that are loosely connected. I don't know if / how shrinking the current world would work.