Changelog 2017-07-05


  • Culture

    Did the recent change trigger a reset storm? I'm seeing a performance drop but it could just be the reset storm.



  • It seemed a little better after the reset frequency change, but now I'm back to getting a timeout every minute or two for the past hour or so.


  • Dev Team

    Node.js has been rolled back to 6.11.0 LTS. Node 8 doesn't seem to be stable enough yet. We'll get back to it when some fixes appear and/or the 8.x branch becomes the next LTS.



  • Sounds like a good decision, thanks! It was worth a try.


  • Culture

    Ever since you reverted the code, my system has been performing like absolute crap. Before the update to Node 8 I was running about 220 programs a tick. After the upgrade it went up to 260 programs a tick, and now that you've reverted it my system is running at 160 programs a tick.



  • My average CPU per tick dropped by about 10 while we were running Node 8. I really hope we don't have to wait too long for this upgrade.


  • Culture

    I am with the people who got a massive performance boost from the change. I saw an 80% increase in the number of processes my kernel runs per tick (and the lower-priority processes are the ones that generally consume much more CPU).

     

    I think a lot of the negative reactions stemmed from the surprise nature of this change. At the same time, I'm very disappointed that we didn't have a long enough test to really see how things shook out. I understand that we have to do this on the live server because we don't have any way to really test this against actual code that players are using, so I might suggest a different test methodology for the next iteration...

    Declare a day to be an "alternate timeline" day. Back up the world state, then upgrade to Node 8. Give everyone a free one-day subscription extension (this is the most painful part, of course). Let it run for a day. Then roll back everything and ask everyone to weigh in.


  • Culture

    I'm also learning in Slack that a lot of people don't have the greatest protection against CPU spikes.

    For example, multiple people (including at least one in The Culture) have their code set so it will continue to run up to 450 CPU. That means that a single ill-timed garbage collection would be more than enough to throw them over the 500 CPU mark.

    From what I've seen of Node 8 so far, it seems that a lot of things are optimized that weren't before, but there does seem to be an increase in JSON times. I really think that overall performance has in fact improved but that occasional spikes are also a bit more common. If people just put more safety checks into their code and left a larger buffer before that 500 CPU mark, this would be a successful deployment.
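
To make the "larger buffer" idea concrete, here is a minimal sketch of a per-tick budget guard. All names (`Task`, `runWithinBudget`, the 150 CPU margin) are hypothetical and only for illustration; the CPU meter is passed in as a function so the sketch runs outside the game, and in-game it would presumably be something like `() => Game.cpu.getUsed()`.

```typescript
// Hypothetical sketch: stop starting new work once usage crosses a soft limit,
// leaving headroom below the hard limit for spikes such as a garbage collection.
type Task = { name: string; run: () => void };

interface BudgetResult {
  ran: string[];     // tasks that were executed this tick
  skipped: string[]; // tasks deferred because the soft limit was reached
}

function runWithinBudget(
  tasks: Task[],              // assumed to be ordered from highest to lowest priority
  getUsed: () => number,      // CPU used so far this tick (assumption: supplied by caller)
  hardLimit: number,          // e.g. the 500 CPU mark discussed above
  safetyMargin: number,       // headroom reserved for unexpected spikes
): BudgetResult {
  const softLimit = hardLimit - safetyMargin;
  const ran: string[] = [];
  const skipped: string[] = [];
  let stopped = false;

  for (const task of tasks) {
    // Check the meter before each task; once we stop, everything else is deferred.
    if (!stopped && getUsed() < softLimit) {
      task.run();
      ran.push(task.name);
    } else {
      stopped = true;
      skipped.push(task.name);
    }
  }
  return { ran, skipped };
}

// Usage with a fake meter: each fake task just adds its cost to a counter.
let demoUsed = 0;
const demoTask = (name: string, cost: number): Task => ({
  name,
  run: () => { demoUsed += cost; },
});

const demoResult = runWithinBudget(
  [demoTask("spawns", 150), demoTask("creeps", 200), demoTask("market", 100)],
  () => demoUsed,
  500,
  150,
);
// demoResult.ran is ["spawns", "creeps"], demoResult.skipped is ["market"]
```

The guard only checks between tasks, so a single expensive task can still overshoot the soft limit. That is why the margin matters more than the exact stopping rule.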


  • Dev Team

    It was done not because of slowdowns in particular players' code, but because of instability and a significant overall increase in tick duration.


  • Culture

    Artem, is there any chance you forgot to turn off the ten-minute worker reset? I'm still seeing resets every ten minutes, and others have also reported a much higher than normal number of global resets.


  • Dev Team

    Yeah, that experiment was ongoing. Node 8 had shown decent tick times only with 10-minute resets, and I wanted to compare the difference with Node 6 as well. Back to once per hour now.