PTR Changelog 2016-09-29

stybbe

Expected usage is from 0.005 to 0.05 CPU per flag depending on how many rooms they are placed in.

Is the test realm cpu different? I just checked the cpu usage for flags over there and it's enormous. I have quite a lot of flags, mostly with just 2 characters, and the majority of them are probably flags I just forgot to remove in some random rooms.

[12:54:14] flags: 289 cpu: 58.924792

[12:54:19] flags: 289 cpu: 225.406025

[12:54:25] flags: 289 cpu: 54.946661

[12:54:31] flags: 289 cpu: 85.424826

[12:54:38] flags: 289 cpu: 62.086441

[12:54:46] flags: 289 cpu: 201.968857

[12:54:50] flags: 289 cpu: 63.095823

[12:54:59] flags: 289 cpu: 65.384927

[12:55:05] flags: 289 cpu: 157.434697

[12:55:11] flags: 289 cpu: 58.212785000000004

[12:55:19] flags: 289 cpu: 71.51562600000001

[12:55:23] flags: 289 cpu: 73.69439799999999

[12:55:31] flags: 289 cpu: 72.750881

[12:55:36] flags: 289 cpu: 85.90243

[12:55:47] flags: 289 cpu: 130.39085

Dissi

Before I start providing feedback, can we get an actual copy of the world on PTR?

To re-create all flags on the PTR will take way too much time and resources, I don't know how this will impact me otherwise.

____

Will memory be loaded simultaneously to accessing flags? (which might cause even more CPU-usage)

artch

I kinda like it, but can we at the same time remove the high creation cost of them, so they will be more in line with how memory works?

We may consider this change in the future.

Is the test realm cpu different?

It has the same hardware as live runtime workers, however it also has a lot of background processes going on. So you should treat your results carefully; it is better to run a long series of tests and take the minimum. I’ve copied your flags and I can see that the minimal usage for your particular case is 1-2 CPU on the PTR, which should be the expected usage on the live realm.
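
The advice to run a long series of tests and take the minimum can be sketched as a small helper. This is only an illustration: `summarizeCpu` is a hypothetical name, and the sample values are the fifteen readings stybbe posted above.

```javascript
// Summarize a series of per-tick CPU samples.
function summarizeCpu(samples) {
    if (samples.length === 0) {
        throw new Error('need at least one sample');
    }
    // Copy before sorting so the caller's array is left untouched.
    const sorted = [...samples].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    const median = sorted.length % 2 === 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2;
    return { min: sorted[0], median, max: sorted[sorted.length - 1] };
}

// The fifteen readings from stybbe's log.
const ptrSamples = [
    58.924792, 225.406025, 54.946661, 85.424826, 62.086441,
    201.968857, 63.095823, 65.384927, 157.434697, 58.212785000000004,
    71.51562600000001, 73.69439799999999, 72.750881, 85.90243, 130.39085,
];
const summary = summarizeCpu(ptrSamples);
console.log(summary.min, summary.max); // 54.946661 225.406025
```

The minimum is the value least affected by background spikes, while the median gives a feel for a typical tick.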

Before I start providing feedback, can we get an actual copy of the world on PTR?

Unfortunately, we are unable to copy the entire world data to the PTR now; its database cannot handle it anymore.

Will memory be loaded simultaneously to accessing flags? (which might cause even more CPU-usage)

I don’t get your question. Memory parsing has nothing to do with flags parsing, and they don’t affect each other.

Dissi

Just for this change, can we get a copy of world?

I can not test the load at all like this, or prepare for the change when needed. Can we get testing-space for this change? In my opinion a big change like this should be tested on PTR on production-like hardware. If I can't test this change, I can't write code for it.

n00bish

So, as some people may know, I have a lot of flags. I don't try to use them as a way to get around memory usage but as location markers, which leads me to a very important question - can we override how flag parsing is done similar to how we have RawMemory? I'm fine with this change going in, flags are definitely a bit too cheap as things stand, but what I really would like to have is the ability to change what happens during the parsing step. As an example, I don't usually find Game.flags itself useful for more than checking if a flag exists by name, what matters most to me is colors and color combinations. Currently, I have to process all of my flags in a secondary step to filter them by color - I'd like the ability to do this during the initial parse period as well to save on iteration cost. If we can add additional functionality to the base Game.flags parsing, I support this change - if we can't, this is a big problem since I'd essentially pay for parsing twice each tick.
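
The secondary filtering step described here can at least be kept to a single pass per tick. A minimal sketch, assuming flags expose `color` and `secondaryColor` as in the Screeps API; `indexFlagsByColor` and the sample flags are hypothetical.

```javascript
// Build a lookup of flags keyed by "color:secondaryColor" in one pass.
function indexFlagsByColor(flags) {
    const index = {};
    for (const name of Object.keys(flags)) {
        const flag = flags[name];
        const key = `${flag.color}:${flag.secondaryColor}`;
        if (!index[key]) {
            index[key] = [];
        }
        index[key].push(flag);
    }
    return index;
}

// Stand-in for Game.flags; the numeric colors are placeholder values.
const sampleFlags = {
    a1: { name: 'a1', color: 1, secondaryColor: 1 },
    a2: { name: 'a2', color: 1, secondaryColor: 2 },
    b1: { name: 'b1', color: 1, secondaryColor: 1 },
};
const byColor = indexFlagsByColor(sampleFlags);
console.log(Object.keys(byColor)); // [ '1:1', '1:2' ]
```

In a game loop this would be called as `indexFlagsByColor(Game.flags)` once per tick. It does not avoid the engine-side parsing cost, only repeated filtering of the parsed flags.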

artch

Just for this change, can we get a copy of world?

I can not test the load at all like this, or prepare for the change when needed. Can we get testing-space for this change? In my opinion a big change like this should be tested on PTR on production-like hardware. If I can’t test this change, I can’t write code for it.

Well, your flags on the PTR are the same, so you can benchmark their CPU usage and use this as a reference point for how much CPU you need to free up.

artch

I’d essentially pay for parsing twice each tick.

No, you pay for parsing and for iterating through the parsed flags. Iterating is an order of magnitude less expensive than parsing.

We cannot give access to the parsing mechanism since it is executed in the engine scope.

n00bish

Also, I don't have any of my flags on the PTR. I can't test this unless they are copied over, can you please ensure everyone has their flags copied over so we can check the cpu usage and adjust accordingly?

artch

The PTR has just been deployed with a fresh copy.

Dissi

http://i.imgur.com/4Ty0v9E.png

It seems this adds about 8 CPU of parsing for 1000 flags (purely coincidentally, I had 1000), but it varies wildly per tick!
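
For reference, the expected range of 0.005 to 0.05 CPU per flag quoted at the top of the thread can be turned into a quick estimate. A rough sketch; `expectedFlagCpu` is a hypothetical helper, and where a real account lands in the range depends on how many rooms the flags are placed in.

```javascript
// Per-flag bounds quoted from the changelog.
const MIN_CPU_PER_FLAG = 0.005;
const MAX_CPU_PER_FLAG = 0.05;

// Return the expected low/high CPU cost for a given number of flags.
function expectedFlagCpu(flagCount) {
    return {
        low: flagCount * MIN_CPU_PER_FLAG,
        high: flagCount * MAX_CPU_PER_FLAG,
    };
}

const estimate = expectedFlagCpu(1000);
console.log(`${estimate.low.toFixed(1)} to ${estimate.high.toFixed(1)} CPU`);
```

For 1000 flags this gives roughly 5 to 50 CPU, so a measured 8 CPU sits near the low end of the stated range.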

I don't mind the change too much, but the variations per tick are extremely bad. If you could look into making the parsing consistent per tick, or allowing us to hook into the parsing of the flags (as n00bish said) it wouldn't be so bad.

On another note

>> However, some players started to make use of them as a free memory storage. This has major impact on the game performance, and it is not their intended usage anyway

What is the intended usage for flags?

n00bish

Is it at all possible to enforce a constant cost on the flag parsing operations? I know for Memory this isn't possible as the number of objects within and the layout affects things much more than the size, but flags are a rather constant object. On the PTR the parsing cost is jumping anywhere from 19 to 83 cpu, which is a big difference. On good ticks the cost is fine, maybe even too low, but on bad ticks the cost is unreasonably high. What are your thoughts about making flag parsing cost a (low cost) constant amount per flag? It would make the calculation of impact *much* simpler and they'd still have a per-tick cost without the (sometimes absurd) fluctuations in parsing cost that I already see with Memory.

artch

On the PTR the parsing cost is jumping anywhere from 19 to 83 cpu, which is a big difference.

This is because of the PTR. It has background processes that shoot sometimes and affect some ticks.

Is it at all possible to enforce a constant cost on the flag parsing operations?

Flags parsing depends not only on flags count, but also on rooms count. The formula would be too complex.

What is the intended usage for flags?

“…to visualize your processes, debug things, and manually give orders”, or whatever else if you’re ready to pay the cost.

Dissi

What is causing the massive differences between ticks in flag parsing?

artch

What is causing the massive differences between ticks in flag parsing?

It is the third time (really) I write this sentence in this thread:

This is because of the PTR. It has background processes that shoot sometimes and affect some ticks.

Voronoi

"What is the intended usage for flags?"
@Dissi I guess you are somehow playing with words there.

The point is, there is some kind of cpu and memory limit pressure on every mechanism in this game.
It isn't right to have a mechanism that allows players to bypass that, which is the case with flags ATM.
This is definitely an exploit that must be fixed.

To make it clearer:
Let's assume that a player writes a ridiculously simple script that systematically fills every room with flags on every position.
The storage and parsing costs would be very high, but they would not be charged to the player who owns the flags, since they are incurred on the engine side.
The net effect is that every player's experience will degrade due to longer ticks.

And this can be done with a 10 CPU account.

So yeah, fix this.

Dissi

>> Memory parsing has nothing to do with flags parsing,

Game.flags['flag'].memory might be touched during initialization for all I know

_______________________________________________________________________

>> It is the third time (really) I write this sentence in this thread

There is a reason I ask the question so many times: you don't seem to see the problems the players are currently facing, and are waving away our concerns in a condescending/hostile way.

How can I test the changes made to my code when I don't even have my normal setup in a test environment?
How do you reliably test the change if CPU varies so wildly?
Can I use the flags as a pointer to some location in the game's map? (That seemed like a proper use for it; apparently it's not now.)

My own solution would be easy:

I can change flags to a memory-based layout at lower cost, but I cannot test this anywhere.

Other people's solutions are wildly different, and they may need a shitton of time to change it, you will probably hear their concerns later today.

_______________________________________________________________________

>> Flags parsing depends not only on flags count, but also on rooms count. The formula would be too complex

Why would rooms count even matter to flags? A flag is

{ name: "stuff", pos: new RoomPosition(1,1,'E5N5'), color: COLOR_WHITE, secondaryColor: COLOR_WHITE}

I see no reference (except RoomPosition) to a room. If you access `Game.flags["test"].room` it could be an alias to `Game.rooms[flag.pos.roomName]`.

This way you can eliminate the "but also on rooms count" part.
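
The alias suggested here could look roughly like the getter below. This is a sketch of the idea only, not how the engine implements `Flag`; `makeFlag` and the `game` object are hypothetical stand-ins for the engine's constructor and the `Game` global.

```javascript
// Create a flag-like object whose `room` is resolved lazily on access.
function makeFlag(game, name, x, y, roomName) {
    const flag = { name, pos: { x, y, roomName } };
    Object.defineProperty(flag, 'room', {
        // Looked up only when read, so building the flag never touches room objects.
        get() {
            return game.rooms[this.pos.roomName];
        },
        enumerable: true,
    });
    return flag;
}

// Stand-in for Game with a single visible room.
const game = { rooms: { E5N5: { name: 'E5N5' } } };
const visible = makeFlag(game, 'test', 1, 1, 'E5N5');
const hidden = makeFlag(game, 'far', 1, 1, 'W9S9');
console.log(visible.room.name); // E5N5
console.log(hidden.room);       // undefined
```

A flag in a room without visibility simply yields `undefined` for `room`, the same as a direct lookup in the rooms map would.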

Just slapping it to user runtime and saying "deal with it" when not providing an option to improve it (like RawMemory) is in my opinion a desperate move.

My hit seems to be only 8 CPU, but that's about 2~3 fully built rooms for me. I just freed up ~20 CPU this week; it seems this will be going towards flags now.

artch

There is a reason I ask the question so many times: you don’t seem to see the problems the players are currently facing, and are waving away our concerns in a condescending/hostile way.

I didn’t mean to offend you or to be hostile, and I’m sorry if it looks like that. You already have all the answers we can give, and asking the questions again doesn’t really help. We understand your concerns; this is why we have deployed this change on the PTR two weeks in advance.

Game.flags[‘flag’].memory might be touched during initialization for all I know

It is not touched.

How can I test the changes made to my code when I don’t even have my normal setup in a test environment?

Unfortunately, we cannot help with this currently. It would require a lot of new expensive hardware in order to scale the PTR to a size where it can handle the live world data.

How do you reliably test the change if CPU varies so wildly?

CPU varies on the PTR only due to its specific environment. It should not be the case when this change is deployed (the runtime workers don’t have any background processes there). If it is, we’ll figure it out then. It is not like memory parsing, it's a more stable algorithm.

Why would rooms count even matter to flags?

Flags are serialized and unserialized on a per-room basis. There are two nested loops in the flag parsing routine: one per room and one per flag in the room. Otherwise it would be a lot more expensive than 0.005 CPU per flag.

n00bish

Using your same benchmarking style of test for Memory on the production systems, I see these results:

[9:05:55 AM] Tick 14065006 Memory parse time result: 30.2501
[9:05:58 AM] Tick 14065007 Memory parse time result: 9.3850
[9:06:00 AM] Tick 14065008 Memory parse time result: 6.9617
[9:06:03 AM] Tick 14065009 Memory parse time result: 9.7145
[9:06:06 AM] Tick 14065010 Memory parse time result: 10.6524
[9:06:09 AM] Tick 14065011 Memory parse time result: 11.7271
[9:06:12 AM] Tick 14065012 Memory parse time result: 7.3918
[9:06:15 AM] Tick 14065013 Memory parse time result: 11.2062
[9:06:18 AM] Tick 14065014 Memory parse time result: 11.3516
[9:06:21 AM] Tick 14065015 Memory parse time result: 26.5043
[9:06:24 AM] Tick 14065016 Memory parse time result: 50.3858
[9:06:27 AM] Tick 14065017 Memory parse time result: 7.6152
[9:06:30 AM] Tick 14065018 Memory parse time result: 11.9079
[9:06:33 AM] Tick 14065019 Memory parse time result: 10.3699
[9:06:36 AM] Tick 14065020 Memory parse time result: 31.6772

Do the production servers also have background processes running? Because this is a fluctuation between 7.0 cpu and 50.4 cpu, just to access memory. This is the test code:

module.exports.loop = function () {
    // console.log(`------------------- tick start: ${Game.time} -------------------`);
    let preMemCpu = Game.cpu.getUsed();
    Memory;
    let postCpu = Game.cpu.getUsed() - preMemCpu;
    console.log(`Tick ${Game.time} Memory parse time result: ${postCpu.toFixed(4)}`);
...
}
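
The same before/after measurement can be wrapped in a reusable helper. A minimal sketch; `measureCpu` is a hypothetical name, and the fake counter below stands in for `Game.cpu.getUsed()` so the snippet runs outside the game.

```javascript
// Run `fn` and report how much the CPU counter advanced while it ran.
function measureCpu(getUsed, fn) {
    const before = getUsed();
    const result = fn();
    return { result, cpu: getUsed() - before };
}

// Fake CPU counter so the example is runnable outside Screeps.
let fakeUsed = 0;
const getUsed = () => fakeUsed;
const sample = measureCpu(getUsed, () => {
    fakeUsed += 7.5; // pretend the work cost 7.5 CPU
    return 'done';
});
console.log(sample.cpu); // 7.5
```

In the game loop the equivalent call would be `measureCpu(() => Game.cpu.getUsed(), () => Memory)`.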

Let's just call this what it is - variability due to system overhead, maybe garbage collection, I don't know for sure. If you can see this kind of variability in production, why is PTR written off as an invalid case when seeing those numbers? I'm happy to post more from production, maybe I'll get lucky again and have the Memory test report 200 cpu as I have seen in the past. Please take this seriously, because right now it just feels like you are writing our concerns off. Flag processing plus memory parsing on a single tick could (given only the numbers I've posted in this thread) cost 80 + 50 = 130 cpu, which is my current limit.

Oh, and some more results from the memory timing test:

[9:14:38 AM] Tick 14065185 Memory parse time result: 9.6542
[9:14:41 AM] Tick 14065186 Memory parse time result: 72.8470
[9:14:44 AM] Tick 14065187 Memory parse time result: 86.3037
[9:14:47 AM] Tick 14065188 Memory parse time result: 17.5044

What is the reason for this?

artch

Actually, it won’t hurt if I show the flag parsing snippet, since we’re going to open-source it soon anyway. Here it is:

serializedFlags.forEach(flagRoomData => {
    var data = flagRoomData.data.split("|");
    data.forEach(flagData => {
        if(!flagData) {
            return;
        }
        var info = flagData.split("~");
        var id = 'flag_'+info[0];
        register._objects[id] = new globals.Flag(info[0], info[1], info[2], flagRoomData.room, info[3], info[4]);
    });
});
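
To see why the cost depends on both room count and flag count, the same two loops can be run against made-up data. A sketch under assumptions: only `info[0]` being the flag name is clear from the snippet, the rest of the field order (color, secondary color, x, y) is a guess, and `parseFlags` plus the sample strings are hypothetical.

```javascript
// Parse per-room flag strings into plain objects, mirroring the two loops above.
function parseFlags(serializedFlags) {
    const flags = {};
    let splits = 0;
    serializedFlags.forEach(flagRoomData => {
        // One split per room...
        const entries = flagRoomData.data.split('|');
        splits++;
        entries.forEach(flagData => {
            if (!flagData) {
                return; // skip empty entries, e.g. from a trailing "|"
            }
            // ...and one split per flag in that room.
            const info = flagData.split('~');
            splits++;
            // Assumed field order: name, color, secondaryColor, x, y.
            flags[info[0]] = {
                name: info[0],
                color: Number(info[1]),
                secondaryColor: Number(info[2]),
                room: flagRoomData.room,
                x: Number(info[3]),
                y: Number(info[4]),
            };
        });
    });
    return { flags, splits };
}

// Made-up sample data: two rooms, three flags.
const sampleData = [
    { room: 'E5N5', data: 'A~10~10~1~1|B~1~5~20~30' },
    { room: 'E6N5', data: 'C~3~3~25~25|' },
];
const parsed = parseFlags(sampleData);
console.log(Object.keys(parsed.flags).length, parsed.splits); // 3 5
```

The split counter makes the two contributions visible: spreading the same three flags over more rooms raises the total, which fits the remark that a single per-flag constant would not capture the cost.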

artch

Let’s just call this what it is - variability due to system overhead, maybe garbage collection, I don’t know for sure.
…
What is the reason for this?

GC is the most likely cause. You may be hit by it at any point of execution, not only in Memory or flags parsing, but even in an empty while loop.

But the PTR has more than that, which is why it has a lot more spikes than the live server. It's a single machine with everything running on it - mongodb, redis, node processes, cronjobs, everything. The infrastructure of the live server is much more separated.