IVM heap usage & game objects



  • With 1000+ Game.creeps, 4450+ Game.structures and 220+ Game.rooms I'm definitely at the limit (all my rooms are on shard0). My heap usage [(heapStats.total_heap_size + heapStats.externally_allocated_size) / heapStats.heap_size_limit] runs between 76% and 105%. I only trigger gc() myself when it's above 110%, because only that would lead to the memory error. Normally it's fine for me, but during bad server times I get too much automatic garbage-collection time (~150-300 CPU) added to my tick time. With this I got ticks where Game.cpu.getUsed() > 600.

    This was a bad server day for me (March 20), starting at 7am: 0_1521578184240_6f3b26ab-3f11-428f-b3ff-c0201ce95379-image.png

    It normalized in the evening...
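The usage metric described above can be sketched as a small helper. This is an illustrative sketch, not part of the Screeps API; `heapUsageRatio` is a hypothetical name, and the 110% threshold is the poster's own choice:

```javascript
// Fraction of the heap limit currently in use, per the formula in the post:
// (total_heap_size + externally_allocated_size) / heap_size_limit.
// This can exceed 1.0 because externally allocated memory is counted
// on top of the V8 heap itself.
function heapUsageRatio(stats) {
    return (stats.total_heap_size + stats.externally_allocated_size) / stats.heap_size_limit;
}

// In a Screeps main loop one might trigger a manual collection only
// above the poster's 110% threshold (gc() is a Screeps/IVM global):
// const stats = Game.cpu.getHeapStatistics();
// if (heapUsageRatio(stats) > 1.10) gc();
```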



  • Currently the server has top performance; I haven't seen it this strong in a long time. 0_1521798995581_28bc0e6a-e6c9-439e-b2cf-c6e67cf0209e-image.png

    My script also runs very well with this. I can hardly detect any auto-GC within my tick time, which had lately been dragging down my script performance. So currently I don't have many problems with my big heap; it stays constant at ~91%. Nice server tweaks. Thanks!

    Only RoomHistory does not seem to work that well.



  • Since Monday I have massive "out-of-tick" heap spikes again. Even with an empty main file like:

    "use strict";
    module.exports.loop = function() {
        console.log(JSON.stringify(Game.cpu.getHeapStatistics()));
    };
    

    I'm getting this output:

    [02:21:50][shard0]{"total_heap_size":136630272,"total_heap_size_executable":3670016,"total_physical_size":132199368,"total_available_size":253438720,"used_heap_size":115367752,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":3157248,"does_zap_garbage":0,"externally_allocated_size":10403655}
    [02:21:54][shard0]{"total_heap_size":224628736,"total_heap_size_executable":3670016,"total_physical_size":223458688,"total_available_size":172878960,"used_heap_size":197190144,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":10404068}
    [02:22:00][shard0]{"total_heap_size":319496192,"total_heap_size_executable":3670016,"total_physical_size":317757464,"total_available_size":88642040,"used_heap_size":280341456,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":104044810}
    [02:22:04][shard0]{"total_heap_size":205230080,"total_heap_size_executable":3670016,"total_physical_size":203091360,"total_available_size":180387432,"used_heap_size":194743176,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":104040682}
    * [02:22:08][shard0]{"total_heap_size":304816128,"total_heap_size_executable":3670016,"total_physical_size":303294784,"total_available_size":87892272,"used_heap_size":281215952,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":104044810}
    [02:22:13][shard0]{"total_heap_size":202608640,"total_heap_size_executable":3670016,"total_physical_size":200794288,"total_available_size":180377760,"used_heap_size":194807280,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":104040682}
    [02:22:17][shard0]{"total_heap_size":304291840,"total_heap_size_executable":3670016,"total_physical_size":302934176,"total_available_size":88121808,"used_heap_size":280890640,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":8342992,"does_zap_garbage":0,"externally_allocated_size":104044810}
    [02:22:21][shard0]Error: Isolate was disposed during execution due to memory limit
        at ()
        at runUser (/opt/engine/dist/core/runtime/make.js:162:34)
        at _tickDomainCallback (internal/process/next_tick.js:228:7)
    [02:22:25][shard0]Script execution has been terminated: your isolate disposed unexpectedly, restarting virtual machine
    
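The sawtooth in those numbers (used heap swinging by roughly 80-90 MB between consecutive ticks, with no user code running) stands out if you diff successive `used_heap_size` samples. The helper below is a hypothetical illustration, not a Screeps API function:

```javascript
// Given a series of used_heap_size samples (one per tick), return the
// tick-to-tick growth in MB, so out-of-tick heap spikes stand out.
function heapDeltasMB(samples) {
    const deltas = [];
    for (let i = 1; i < samples.length; i++) {
        deltas.push((samples[i] - samples[i - 1]) / (1024 * 1024));
    }
    return deltas;
}
```

Feeding in the `used_heap_size` values from the log above yields alternating deltas of roughly +85 MB and -85 MB per tick, even though the main loop does nothing but log.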


  • I had a bit of an empire collapse due to a big war. I've recovered somewhat now, and recently added some code that produces quite a bit more room visibility (checking for power banks pre-observer with static creeps).

    This has led to my heap hitting the memory limit repeatedly once more. It seems that game objects are still a major issue when it comes to heap.

    @stuy486 mentioned in slack that maybe this is due to JS GC, and maybe could be tweaked. He said:

    Now armed with 15 minutes of reading on Node GC and nothing else... I wonder if, for the Screeps use case, the new-space heap size that's being used isn't large enough? This would cause a bunch of tick-only aged stuff to get moved to old-space, which might be the reason we see the heap size step up like that... This would affect larger players much more since the amount of heap used to generate Game is much higher.

    I do hope this is being looked at. I don't feel we should be penalized by resets/skipped ticks just because we can see a lot of game objects. Room visibility often seems to be the culprit.



  • I'm not sure about the new-space/old-space thing; as I mentioned in my comment, I just read a quick post about it. It seems like something that should be investigated if it hasn't been already.

    My current experience is that I'm getting 5-10 "CPU time limit reached" messages an hour, and they very clearly occur when GC hits in the middle of my tick. For a period of about a week, ending a couple of days ago, it seemed that the 500 CPU/tick limit was relaxed. During that time I never got errors, but I frequently saw ticks that took 700-800 CPU. Given that I short-circuit my code at 350 CPU and average ~215 CPU, this means GC takes 400-500 CPU when it hits my code mid-tick.

    I suspect these heap/GC issues have always been around and are just more obvious now with IVM and the addition of getHeapStatistics(). I also suspect that if we're able to figure out what's going on, the servers as a whole will really benefit. Similar to @demawi, I see my used heap climb ~70MB every tick before my code even runs. It only increases 3-5MB while my code is running.
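The mid-tick short-circuit mentioned above (abort remaining work once the tick's CPU budget is nearly spent, so GC overhead can't push the tick over the hard limit) can be sketched roughly like this. The helper name is illustrative, and 350 is the commenter's own cut-off:

```javascript
// Returns true once the tick's CPU budget is nearly spent, so the main
// loop can skip low-priority work. The default budget is the 350 CPU
// cut-off mentioned in the post.
function shouldShortCircuit(usedCpu, budget = 350) {
    return usedCpu >= budget;
}

// Typical use inside a Screeps loop (sketch):
// for (const name in Game.creeps) {
//     if (shouldShortCircuit(Game.cpu.getUsed())) break;
//     runCreep(Game.creeps[name]);
// }
```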



  • I'm having more problems with this (I don't know if it's also a heap problem):

    [00:06:38][shard0]Script execution timed out ungracefully
    [00:06:43][shard0]Script execution timed out ungracefully
    [00:06:48][shard0]Script execution timed out ungracefully
    [00:06:53][shard0]Script execution timed out ungracefully
    [00:06:58][shard0]Script execution timed out ungracefully
    [00:07:03][shard0]Script execution timed out ungracefully
    [00:07:07][shard0]Script execution timed out ungracefully
    [00:07:12][shard0]Script execution timed out ungracefully
    

    This is currently happening every tick, without ever reaching my code. I already had such a phase a few days ago, where it lasted over 200 ticks.



  • Since Wednesday it went back to the previously described heap problems. And it doesn't stabilize any more. Currently I have to call gc() myself every 3 ticks, but I'm still getting "Isolate was disposed during execution due to memory limit" errors regularly. Frustrating... 😢

    With an empty main file:

    [20:00:57][shard0]{"total_heap_size":140148736,"total_heap_size_executable":3670016,"total_physical_size":131834272,"total_available_size":252732968,"used_heap_size":115534184,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":3157248,"does_zap_garbage":0,"externally_allocated_size":104036698}
    [20:01:01][shard0]{"total_heap_size":234536960,"total_heap_size_executable":3670016,"total_physical_size":225970904,"total_available_size":175003304,"used_heap_size":195191992,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:06][shard0]{"total_heap_size":323624960,"total_heap_size_executable":3670016,"total_physical_size":322073392,"total_available_size":76740336,"used_heap_size":291719864,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104047668}
    [20:01:10][shard0]{"total_heap_size":206999552,"total_heap_size_executable":3670016,"total_physical_size":202776616,"total_available_size":182927992,"used_heap_size":192229592,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:14][shard0]{"total_heap_size":314101760,"total_heap_size_executable":3670016,"total_physical_size":303882056,"total_available_size":87456760,"used_heap_size":280090032,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104045242}
    [20:01:19][shard0]{"total_heap_size":205426688,"total_heap_size_executable":3670016,"total_physical_size":201582624,"total_available_size":182969808,"used_heap_size":192228672,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:23][shard0]{"total_heap_size":308334592,"total_heap_size_executable":3670016,"total_physical_size":301953816,"total_available_size":89031408,"used_heap_size":279926960,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104045242}
    [20:01:27][shard0]{"total_heap_size":205950976,"total_heap_size_executable":3670016,"total_physical_size":201367544,"total_available_size":182943552,"used_heap_size":192254952,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:31][shard0]{"total_heap_size":312528896,"total_heap_size_executable":3670016,"total_physical_size":302439016,"total_available_size":88842216,"used_heap_size":280037088,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104045242}
    [20:01:36][shard0]{"total_heap_size":206475264,"total_heap_size_executable":3670016,"total_physical_size":202223896,"total_available_size":182919864,"used_heap_size":192277880,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:40][shard0]{"total_heap_size":312528896,"total_heap_size_executable":3670016,"total_physical_size":302239240,"total_available_size":86718288,"used_heap_size":280431392,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104045242}
    [20:01:45][shard0]{"total_heap_size":208048128,"total_heap_size_executable":3670016,"total_physical_size":203117480,"total_available_size":182891712,"used_heap_size":192277784,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104040970}
    [20:01:49][shard0]{"total_heap_size":310681600,"total_heap_size_executable":3670016,"total_physical_size":301337400,"total_available_size":85588216,"used_heap_size":280931136,"heap_size_limit":380633088,"malloced_memory":8192,"peak_malloced_memory":7507824,"does_zap_garbage":0,"externally_allocated_size":104045242}
    


  • Me again!

    It's better than it was, but I'm still seeing heap limits hit when I get too much room visibility. Up to 200 rooms visible now before it starts crapping out.

    I still feel fairly strongly that we as users shouldn't be seeing errors just because we can see too much stuff. It's a really weird game mechanic.
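Since the crashes correlate with visible-room count, one workaround is to throttle visibility-generating actions (observer scans, scout creeps) once `Game.rooms` approaches the ~200-room danger zone reported in this thread. The function and its threshold are illustrative, not part of any API:

```javascript
// Decide whether it is safe to request more room visibility this tick.
// The default maxRooms sits just under the ~200-room point at which
// heap trouble reportedly starts.
function canAffordMoreVisibility(visibleRoomCount, maxRooms = 195) {
    return visibleRoomCount < maxRooms;
}

// Sketch: skip observer scans when already near the limit.
// if (canAffordMoreVisibility(Object.keys(Game.rooms).length)) {
//     observer.observeRoom(targetRoomName);
// }
```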


  • AYCE

    I've been moving rooms from shard0 to shard1/2 to reduce heap size, and have a similar experience to Tigga: my visible rooms were close to 200. I've now reduced that to below 180 and no longer have problems with heap size. It is probably a combination of your improvements and my room reduction.

    I'll add a graph of global age for the last month; most resets in the last 10 days have been code uploads.

    0_1524089825560_global.PNG



  • As long as you're tweaking the server and those heap spikes are a regular problem, it would be nice if you could grant more heap memory (in general, GCL-dependent, or per claimed room, whatever is possible), so nobody has to worry about being thrown out of the game after a server tweak.



  • Still having issues at about 200 rooms. It usually starts going wrong around 202/203.

    Is there an ETA for a fix? I heard rumours that persistent game objects are planned. It'd be nice to get this sooner rather than later, or maybe a +20% heap allocation while we're waiting.

    ☝

  • Dev Team

    @tigga

    maybe a +20% heap allocation while we're waiting

    That would just shift the visible-rooms limit to ~240. Do you think the 20% increase would solve the issue for you?



  • @artch I guess there are other things going into the equation (such as creeps/structures).

    I think it would probably be sufficient until we get persistent game objects. My understanding was that the ETA for those was not too far away. I'm GCL 24 right now, so there's only 20% more CPU I can get.


  • Dev Team

    Alright, we've temporarily increased the heap limit by 64 MB. We'll keep an eye on our servers to see if they can actually sustain such memory usage.

    👍


  • Just some individual feedback: for me 64 MB is not enough to withstand the ~100 MB spikes. Before the tweak my memory had magically stabilized; currently it doesn't. I no longer have to call gc() myself to prevent out-of-memory IVM resets, but in-tick auto-GC triggers regularly and consumes a significant amount of CPU. Before, I averaged ~265 CPU/tick; currently it's over 300.


  • Dev Team

    @demawi Please elaborate: which tweak are you referring to?



  • I meant the increase of the heap limit by 64 MB (I don't know of any other changes today).



  • 0_1524842925244_c161874d-3d06-4c3d-b866-377d52533186-image.png

    My heap is a lot more unstable since today. It used to last thousands of ticks on the same global (purple line), but now it's resetting a lot more often.



  • @geir1983 said in IVM heap usage & game objects:

    My heap is a lot more unstable since today. It used to last thousands of ticks on the same global (purple line), but now it's resetting a lot more often.

    I wonder what's going on there. The change was made at 12:30, right? I didn't see the 9:30 change on my plots, but I think you're on a different shard. It's the 9:30 change that seems to have messed something up for you.


