Changelog 2017-07-05



  • As kshepard mentioned, my script is seeing a large number of timeouts since this update. The performance drop is fairly dramatic unfortunately.



  • Same here. In my script, those ticks start with a massive memory loading time. I have measured 280-380 CPU for ~900 KB.


  • Culture

    On the opposite side, my code actually seems to be doing better, CPU usage staying at the usual average of 100, but bucket is staying a bit higher and more of my kernel processes are running every tick. (Sometimes even all of them)

    Looking at graphs, new globals seem to not be hitting me as hard as they used to either



  • I'm seeing many more timeouts than before the update (4 timeouts yesterday and ~50 since the update).

    Also, there have been no invaders since the update.



  • So far the CPU use seems more consistent and the compilation penalty is not as high.



  • Zoom out to the 7 day display on http://status.screeps.com ... can you guess when the update was done?!?

    From the looks of it, this is hitting most players' CPU pretty hard. I'm regularly seeing 100 - 230 CPU used already on the very first line of my loop. I only get 80... so at this point the game is completely broken. On normal turns I'm running 45 to 70 depending on whether I run my check deals code... but one in five or more of my turns start out as described. It does not specifically coincide with code reloading: my code detects those runs and displays their run-time usage as well, which also seems to be much higher than normal.

    I'm also seeing creeps regularly using 4x to 10x the CPU they were before... not every turn, but more often than not on the same turns as the issue above.


  • Culture

    Creep CPU usage is normal for me, most of my creeps are running at their usual ~0.25/t usage


  • Culture

    My code is very, very happy with this upgrade; things are working so much faster.

    One thing I am noticing, though, is that JSON is actually performing worse with these upgrades than it did before. I'm seeing this in two places: my memory parse time went from 18 CPU a tick to 26 CPU a tick, and the JSON stringify call for saving statistics into a segment is also significantly more expensive.

    Overall this is still a huge win for me, as the speed improvements in other places more than make up for the JSON changes.
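
    For anyone who wants to take similar measurements, here is a minimal sketch of timing a JSON parse. The helper name `timed` and the sample data are made up for illustration. In-game you would pass `() => Game.cpu.getUsed()` as the meter; the wall clock below is only a stand-in so the snippet runs outside the game.

```javascript
// Hypothetical helper: runs fn once and reports how much the supplied
// meter advanced while it ran. `getUsed` is any function returning a number
// (in-game: () => Game.cpu.getUsed(); here: a wall clock in milliseconds).
function timed(label, fn, getUsed) {
  const start = getUsed();
  const value = fn();
  const cost = getUsed() - start;
  return { label, value, cost };
}

// Illustrative payload standing in for the raw memory string.
const raw = JSON.stringify({ rooms: { W1N1: { sources: 2 } } });

// Stand-in meter (assumption): wall-clock milliseconds instead of game CPU.
const wallClock = () => Date.now();

const parsed = timed("memory parse", () => JSON.parse(raw), wallClock);
const saved = timed("stats stringify", () => JSON.stringify(parsed.value), wallClock);

console.log(parsed.label, parsed.cost, saved.label, saved.cost);
```

    Logging these two costs every tick is enough to compare JSON overhead before and after a runtime change.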


  • Dev Team

    Runtime workers will now be reset every 10 minutes. Let's see if it makes any difference.



  • It made a little bit of difference for me. I'm receiving fewer timeouts, but am still seeing on average one timeout every 10 minutes.


  • Culture

    The workers resetting more often has made my code perform worse.
    CPU is more erratic (http://i.imgur.com/8gIWJFA.png)
    Memory parse is on average more expensive too (http://i.imgur.com/Dws72zo.png)
    Ignore the long gap in the graphs; my stats agent died during that period.


  • Culture

    Did the recent change trigger a reset storm? I'm seeing a performance drop but it could just be the reset storm.



  • It seemed a little better after the reset frequency change, but now I'm back to getting a timeout every minute or two for the past hour or so.


  • Dev Team

    Node.js has been rolled back to 6.11.0 LTS. Node 8 doesn't seem to be stable enough. We'll get back to it when some fixes appear and/or the 8.x branch becomes the next LTS.



  • Sounds like a good decision, thanks! It was worth a try.


  • Culture

    Ever since you reverted the code back my system has been performing like absolute crap. Before the update to node8 I was running about 220 programs a tick, after the upgrade it went up to 260 programs a tick, and now that you've reverted it my system is running at 160 programs a tick.



  • My average CPU per tick dropped by about 10 while we were running Node 8. I really hope we don't have to wait too long for this upgrade.


  • Culture

    I am with the people who got a massive performance boost from the change. I saw an 80% increase in the number of processes my kernel runs per tick (and the lower priority processes are the ones which generally consume much more CPU).

    I think a lot of the negative reactions stemmed from the surprise nature of this change. At the same time I'm very disappointed that we didn't have a long enough test to really see how things shook out. I understand that we have to do this on live server because we don't have any way to really test this against actual code that players are using, so I might suggest a different test methodology for the next iteration...

    Declare a day to be an "alternate timeline" day. Back up the world state, then upgrade to node 8. Give everyone a free day sub extension (this is the most painful part, of course). Let it run for a day. Then roll back everything and ask everyone to weigh in.


  • Culture

    I'm also learning on Slack that a lot of people don't have the greatest protection against CPU spikes.

    For example, multiple people (including at least one in The Culture) have their code set so it will continue to run up to 450 CPU. That means that a single ill-timed garbage collection would be more than enough to throw them over the 500 CPU limit.

    From what I've seen of Node 8 so far, it seems that a lot of things are optimized that weren't before, but there does seem to be an increase in JSON times. I really think that overall performance has in fact improved but that occasional spikes are also a bit more common. If people just put more safety checks into their code and left a larger buffer before that 500 CPU mark, this would be a successful deployment.
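
    As a rough sketch of the kind of safety check meant here: stop starting new work once CPU used reaches the limit minus a buffer. The helper `runWithinBudget`, the task list, and the buffer of 250 are all made up for illustration. In-game the meter would be `() => Game.cpu.getUsed()` and the limit could come from `Game.cpu.tickLimit`; the simulated meter below just lets the snippet run on its own.

```javascript
// Hypothetical helper: runs tasks in order, but refuses to start a task once
// the meter has reached (limit - buffer). Skipped tasks are reported so they
// can be retried on a later tick.
function runWithinBudget(tasks, getUsed, limit, buffer) {
  const cutoff = limit - buffer;
  const completed = [];
  const skipped = [];
  for (const task of tasks) {
    if (getUsed() >= cutoff) {
      skipped.push(task.name);
      continue;
    }
    task.run();
    completed.push(task.name);
  }
  return { completed, skipped };
}

// Simulated CPU meter (assumption): each task adds a fixed cost.
let used = 0;
const tasks = [
  { name: "spawning", run: () => { used += 120; } },
  { name: "creeps", run: () => { used += 150; } },
  { name: "market", run: () => { used += 90; } },
];

// Limit of 500 with a buffer of 250: nothing new starts at or past 250 used.
const result = runWithinBudget(tasks, () => used, 500, 250);
console.log(result.completed, result.skipped);
```

    The check only runs between tasks, so the buffer has to be larger than the most expensive single task plus whatever spike you want to survive.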


  • Dev Team

    The rollback was done not because of slowdowns in particular players' code, but because of instability and a significant increase in overall tick duration.