I think it's time to say goodbye



  • Screeps is a really fun game, both because it's kind of "programming lite" and because of its concepts. But I think it's time for me to say goodbye. There has been one single overriding frustration in the year-plus that I have played, and a couple of minor gripes, but now it's just gotten to the point that it's no fun anymore.

    Thank you to the community that has been overwhelmingly positive and has certainly contributed to all the fun.

    Below I list my main gripes, not because I think doing so will get them fixed, but because, in doing so, maybe they can add one more +1 to a column on the spreadsheet and, when the time comes to choose what to work on, the community will be better off.

    The major gripe is that I have no control over my CPU usage. It used to be that you got a spike or a little reset storm, and all was well, you just moved on. I even rewrote parts of my code base to be aware of this. But with the recent changes to the "lock out", now, through no fault of my own that I can find (even with an empty script), at least once a day I get:

    Your script is temporary blocked due to a hard reset inflicted to the runtime process.
    Please try to change your code in order to prevent causing hard timeout resets.

    I know in the grand scheme of things it probably doesn't matter, because we're all hit by this issue, but when a large part of the game revolves around CPU tuning, it just sucks to not have any way to actually affect your CPU usage.

    Minor gripes are smaller and honestly I could live with them, but here you go anyway.

    I don't like alliances and what they mean. 

    The way noob areas and respawn areas work (not the signs and so on, just the way they work in general) rubs me the wrong way. I don't have a suggested fix though.

    More and more the game seems to stifle creativity and funnel everyone down a single correct way to play. Over and over again I see the same solution to the same problems. Not because it's the "best" solution but because, whether by flaw or by design, it's the only solution.

    So at any rate, thanks for a great game. Hey it took me a year to get here, that should count for something. Thanks for a (mostly) awesome community. And thanks for all the fish.


  • Culture

    The "hard limit" issue is getting out of hand. There's been a huge effort in Slack to show that in most cases the issue is not caused by players, but shows up due to what we've started calling "bad nodes". Basically, sometimes you end up on a server where for some reason (machine load, other players' code, whatever) performance takes a huge hit. Even for people such as myself who autoscale their code down, this is a big deal because it means a lot less code runs while on those machines.

    I understand why the hard timeout needs to exist, but the punishment is out of proportion to the issue. It really does feel like players are being punished for something that is regularly out of their control.



  • I was trying to think of a solution, and the only one I can think of is to not "charge" for the execution time of the loop. Charge for the intents, and the stuff that we call the API for, but don't charge for the rest.

     

    Of course that would suck IRL, but it's the best I can come up with. Maybe you're "charged" every tick for your average CPU time over the last 1,000 ticks, minus API time.

     

    Like API time + your average code execution time over a long period = CPU. It's still massively exploitable, but the current situation just isn't any fun, and I think that with some checks of some kind it can be made better. I can "tune" all I want and I am still subject to these hard timeouts and "high" CPU usage from time to time. And it usually (though I have caused my own problems too, of course) has nothing to do with anything I have done.
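
    The formula above (API time plus the long-run average of code execution time) could be sketched roughly as follows. This is only an illustration of the proposal, not how the Screeps server bills CPU; `createBilling`, `charge`, and the millisecond inputs are hypothetical names, and the server would still need some way to separate API time from code time.

    ```javascript
    // Hypothetical billing sketch: none of these names exist in the Screeps API.
    function createBilling(windowSize = 1000) {
      const samples = []; // code-only time per tick, oldest first
      let sum = 0;
      return {
        // totalMs: measured time for the whole tick
        // apiMs:   portion of that time spent inside API calls
        charge(totalMs, apiMs) {
          const codeMs = Math.max(0, totalMs - apiMs);
          samples.push(codeMs);
          sum += codeMs;
          // Keep only the most recent `windowSize` ticks in the average.
          if (samples.length > windowSize) sum -= samples.shift();
          return apiMs + sum / samples.length;
        },
      };
    }

    // Small window so the effect is easy to follow by hand.
    const billing = createBilling(4);
    const ticks = [[10, 4], [10, 4], [10, 4], [310, 4]]; // last tick has a 300 ms spike
    const charges = ticks.map(([total, api]) => billing.charge(total, api));
    console.log(charges); // the spike is averaged instead of billed in full
    ```

    With a window of 4 the spike tick is billed at 85 rather than 310; with the proposed 1,000-tick window a single spike would barely register, which is also why the scheme would be exploitable without extra checks.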

     



  • At the very least, the hard timeout rules need to change.

    The devs are likely thinking that when a player is shown to use more than 1000 CPU in a tick, the player is automatically in the wrong and should get punished for 5 ticks. In actuality, the player has been averaging near their CPU limit, but this huge spike that happens once or twice a day is really caused by something on the server, and only ever lasts a single tick. The rule should be changed to instead look for several "hard resets" in a row or within a certain small period of time, and THEN punish them, perhaps even with harsher restrictions than just 5 ticks off.
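
    A minimal sketch of the rule proposed above, assuming the server can record the tick at which each hard reset happens for a player. The names `createResetPolicy` and `recordHardReset`, and the default threshold and window, are made up for illustration; the real server logic is not public in this thread.

    ```javascript
    // Hypothetical policy: penalize only when hard resets cluster together.
    function createResetPolicy({ threshold = 3, windowTicks = 100 } = {}) {
      let recent = []; // ticks of recent hard resets for one player
      return {
        // Returns true if this reset should trigger a penalty.
        recordHardReset(tick) {
          // Forget resets that fell out of the sliding window.
          recent = recent.filter((t) => tick - t < windowTicks);
          recent.push(tick);
          return recent.length >= threshold;
        },
      };
    }

    const policy = createResetPolicy();
    const isolated = policy.recordHardReset(1000); // one-off spike, no penalty
    const first = policy.recordHardReset(5000);
    const second = policy.recordHardReset(5010);
    const third = policy.recordHardReset(5020); // third reset within 100 ticks
    console.log({ isolated, first, second, third });
    ```

    An isolated reset is forgiven, while three resets inside the window trip the penalty, at which point a harsher punishment than 5 ticks could be justified.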

    We've already seen the devs err on the side of the players, for instance the reset storms that happen 4 times a day where we're constantly getting resets, and the server compensates the players with occasional 1000 CPU bucket gifts.  I think the same philosophy needs to come into play with these hard resets.

     

    The broader issue however is inconsistent CPU usage.  I can go hours with a full bucket, and I can go hours in or near a bucket crisis (which I define to be < 9000 bucket), and all without changing my code or any parameters in memory.  For a couple hours before the most recent series of resets, I had no problem keeping a full bucket, and every time I have a full bucket I do a full market query, and those queries were barely scratching my CPU.  After the resets, my empire fell into a bucket crisis, and while watching it try to recover, I'd see it make some headway only to have an unexplained 350 CPU event knock it back down into crisis mode.
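
    For readers who have not written this kind of throttling, here is a rough sketch of bucket-based scaling. The 9000 "crisis" threshold and the "market query only on a full bucket" rule come from the paragraph above; tying remote mining to the crisis threshold, and the names `chooseWorkload`, `remoteMining`, and `marketQuery`, are assumptions for illustration. In the game the input would be `Game.cpu.bucket`.

    ```javascript
    const BUCKET_FULL = 10000;  // the bucket cap in Screeps
    const BUCKET_CRISIS = 9000; // "bucket crisis" threshold used in this post

    // Decide which optional work to run this tick based on the CPU bucket.
    function chooseWorkload(bucket) {
      if (bucket >= BUCKET_FULL) {
        return { remoteMining: true, marketQuery: true };
      }
      if (bucket >= BUCKET_CRISIS) {
        return { remoteMining: true, marketQuery: false };
      }
      // Crisis mode: drop optional work until the bucket recovers.
      return { remoteMining: false, marketQuery: false };
    }

    // In the game loop this would be called as chooseWorkload(Game.cpu.bucket).
    console.log(chooseWorkload(10000), chooseWorkload(9500), chooseWorkload(8000));
    ```

    Scaling like this only reacts to the bucket after the fact; it cannot prevent the unexplained 350 CPU events described above.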

    I've actually had to reduce the amount of remote mining I've been doing in response to degrading server performance over the past month or two, and really hope that this is just something temporary, but I am close to being in coteyr's shoes here and moving on.  I really enjoy this game, but these random fluctuations have been rather frustrating.


  • YP

    When looking through the new documentation today, I found the following sentence in the server architecture article:

    > Though runInContext is invoked with an execution timeout specific for each player, it is not always able to gracefully finish script execution at certain workload types. If this situation occurs, the whole fork rather than vm is terminated when the time is out. All the players contexts in this process disappear and get re-created from scratch.

    As far as I understand, this is what is called a "hard" reset. It occurs when a process does not respond to the "soft" reset call to stop the execution after a certain amount of time. If I read that correctly, all players in this process will get the punishment? But it should be only the one current player, because Node is single-threaded, right? In which state is the VM not able to respond to the soft reset? When doing a GC? Or maybe stuck in the pathfinder?


  • Culture

    When they kill the runner itself, it restarts the node process itself, which can contain hundreds of users' globals, which is why it's a hard reset. (Only one user should ever be running at once per node.)


  • Culture

    Considering that most people get one of these hard timeouts and then nothing else after, it does seem like the issue isn't their code. Perhaps giving people one or two penalty-free hard resets per 20k ticks would solve the problem. This way, people who are constantly getting hard timeouts will still get 'punished', but for times when it's not the player causing the issues there is some leeway.



  • But the hard resets are a symptom, not the "problem". The problem is that I can't control my own CPU usage.

     

    module.exports.loop = function () {
        // Intentionally empty: the loop does no work at all.
        return true;
    };

     

    This may take 1 CPU or 100 CPU (and may cause a timeout), and there's nothing I can do about it.



  • And before we go down the path of "it's not that bad": YES IT IS!!!! That's my complaint. Sometimes things work exactly like they should and that code uses a tiny amount of CPU, but other times, for no reason that I have any control over, it doesn't, and it "uses" a ton of CPU.



  • Yeah, that's why I mentioned the broader issue. They should definitely look into the CPU fluctuations, but the current symptom should also be addressed because it could have rather catastrophic consequences, e.g., losing 5 ticks in a battle could be devastating.



  • I think each function, API call, instruction, etc. should have a virtual cost in CPU time, which should be the same whether the server is overloaded or not...
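
    A toy version of that idea, assuming a fixed price list per operation. The operation names and prices in `VIRTUAL_COST` are invented for illustration, and `createMeter` is a hypothetical helper; metering individual instructions inside a real JavaScript runtime would be much harder than this.

    ```javascript
    // Made-up cost table: operation names and values are illustrative only.
    const VIRTUAL_COST = { intent: 0.2, pathSearch: 1.0 };

    function createMeter(limit) {
      let used = 0;
      return {
        // Charge a fixed cost per operation, independent of server load.
        charge(op, count = 1) {
          if (!Object.prototype.hasOwnProperty.call(VIRTUAL_COST, op)) {
            throw new Error(`unknown operation: ${op}`);
          }
          used += VIRTUAL_COST[op] * count;
          return used <= limit; // false means the tick budget is exhausted
        },
        getUsed: () => used,
      };
    }

    const meter = createMeter(20);
    meter.charge("intent", 10);    // 10 intents
    meter.charge("pathSearch", 3); // 3 path searches
    console.log(meter.getUsed());
    ```

    The same script would then cost the same on every node, at the price of the cost no longer reflecting the real load on the server.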



  • My guess is that the current time measurement simply compares the time before and after the script run. If this is the case, then any kind of server overload will also block the thread and cause CPU time to spike, even though the thread was paused for most of the time. As long as you measure time like this, there is no solution for this issue, because server overloads can happen at any time.
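
    To illustrate the difference this guess relies on, here is a small Node.js sketch that measures the same function with a wall clock and with `process.cpuUsage()`. It is not taken from the Screeps server; `measure` and `pause` are hypothetical helpers, and the pause stands in for a thread that is blocked rather than computing.

    ```javascript
    // Measure a function with both wall-clock time and process CPU time.
    function measure(fn) {
      const wallStart = process.hrtime.bigint();
      const cpuStart = process.cpuUsage();
      fn();
      const cpu = process.cpuUsage(cpuStart); // delta, in microseconds
      const wallMs = Number(process.hrtime.bigint() - wallStart) / 1e6;
      const cpuMs = (cpu.user + cpu.system) / 1000;
      return { wallMs, cpuMs };
    }

    // Block the thread without doing any work, like a paused or starved process.
    function pause(ms) {
      Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
    }

    const result = measure(() => pause(200));
    console.log(result); // wall time is about 200 ms, CPU time is far lower
    ```

    Note that `process.cpuUsage()` covers the whole process, so on a runner that hosts many players it would still need per-player bookkeeping.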

    Part of the problem is definitely the choice of language, as JS comes with very poor threading support, lacks basic script-language features, is inherently slow on its own and comes with quirks like sudden GC hiccups. If another scripting language had been chosen, those issues probably wouldn't exist at all. E.g., Lua has an instruction counter mechanism that lets you run N instructions and then returns to the caller. Proper limiting of execution time would have been a piece of cake with Lua.

    But since the devs seem to only know JS, the game will certainly not be rewritten in a different language in the foreseeable future. Which also means that at least some of the issues will stay, as they are unfixable thanks to the choice of programming language. Which is in the end kind of a killer argument for me as well.

    But at least I get a mail every morning that my bots squished some random noob that thought "what a lovely free area around this guy", which makes me smile. 😉



  • @TwoThe I'm fairly certain the choice of JS as the language of the game was not influenced by how suitable the language is in terms of performance or reliability. The choice for JS was most likely influenced by practical considerations of the availability and accessibility of the language.

    @coteyr What do alliances mean?

    As far as I am aware, alliances/clans/communities are a natural occurrence in any MMO game. I also don't feel that the alliances in Screeps are particularly aggressive towards independent players. Most alliances I know of will leave independent players alone for the most part.

    Regarding CPU fluctuations, it is a valid point. I would certainly appreciate accurate CPU behavior, and this area is certainly worth the investment by the devs IMHO, perhaps even in preference to the development of power creeps or other game functionality.

    I've currently coded "around" it and I'm not particularly concerned, but this reality forces me to run my AI at reduced capacity to deal with the CPU fluctuations.

    Regardless, this game is still a fascinating experience for me on both the technical and community sides.



  • @Atavus If practicality had any influence on the choice of language, then JS would not have been chosen for server-side software. What other languages have to offer is significantly more advanced than Node.js, and they also run a lot faster. That is why I assume that the dev(s) know mostly JS and no alternatives.



  • @TwoThe I think you misread my statement. The choice of using JS was likely NOT influenced by predictability, reliability or performance. It was most likely chosen due to its accessibility and availability.

    I do not know the professional qualifications of Artem and his team, but if you are suggesting that their skills are lacking I strongly disagree.

    There are plenty of programming games out there using any number of different languages. I've been grabbing and abandoning such games for 20 years now and I've never found anything remotely as exciting and complex as Screeps. Perhaps you have had better fortune than I and can point me to better alternatives.

    For the moment I remain quite satisfied with Screeps, Artem and their efforts. My only wish is that the community would be a bit more supportive.


  • YP

    The guys doing Screeps are obviously JavaScript developers... without JavaScript developers there would not be a decent web client 😉

    Of course there would be much faster choices for the backend, but you also have to see where Screeps is coming from. When you look at the original Indiegogo campaign (which I backed btw 🙂 ) you will see that it wasn't really overrun. If they had chosen Lua, for example, I'm pretty sure far fewer people would have been interested in Screeps 😉

    They developed a prototype running completely in the browser, and from that they created the server.

    I'm very interested in different runtimes for the user code, and it should be possible to implement if you look at the server architecture page (a Lua runtime gets state from Redis, executes code, puts intentions back to Redis).

    But also I'm enjoying TypeScript and I'm not sure if I could do all the stuff I have done for creeps in Lua 😉

    It should be possible to replace parts of the backend chain with compiled modules... maybe the module that resolves all the intentions and processes the room. Maybe something that takes load off MongoDB. (I'm not really a fan of MongoDB.)

    But all that takes time .. they are looking for developers... I guess they have some stuff in the backlog.

    But .. we don't have the insight what would be most important or most useful to optimize .. since we don't have any benchmarking data 😉



  • @Atavus It's a minor gripe, but I just don't like alliances. I don't have an alternative. The largest thing I don't like about them is that they create a "red v blue" situation. If I go to poke a bear, I first make sure it's smaller than me, then I poke. Alliances mean the bear you just poked suddenly gets 500 friends. That's great for the pokee, but the only "counter" to that is to bring my 500 friends. Which leads to an EVE-like red v blue situation. Especially with the timers and cool-downs and tick lengths.

    It's been years since I played EVE, but all I see is a TCU and several POSes every time the "safe mode" kicks in. Alliances just aren't my thing, but, as I said, minor gripe, and totally on my end. What would be nice is an in-game representation of an alliance. At least that way I could make sure I was poking a small lonely bear and not 500 unknown bears.

     

    As to supportive, I try to be where I can, but IMHO the gameplay is just purely broken. Not broken in the teenager "matchmaking sucks, how come this noob is on my team" sense, but fundamentally flawed. It's not that you have to account for CPU, that's part of the game; it's that you CAN NOT account for CPU. For no reason at all that you the player can control, you not only use more CPU this tick but are also blocked for 5 ticks. That's a big deal in a fight. It's even a big deal in a game where the SCOREBOARD's margins come down to 0.00001% differences in resource collection.

    Now again, it's not that I care, like a couple of days ago when the servers were down. That happens. Everyone sucks the same. But the CPU issue seems to be isolated to "nodes". Every once in a while your code will get stuck on a "bad" "node" and then no matter what you do, you're screwed.

    Having to "work around" the issue is not fun, mostly because the workaround doesn't exist. You simply have to accept that through no fault of your own, your code just vomits this tick, and you get punished for it.

     

    Currently there is no way to "win". There may be a lot of objectives to aim for, but you simply cannot reach them. You have the scoreboard and its metrics, except when you get punished DAILY for something outside of your control and the top spots are all very close in terms of efficiency, you're not going to be doing things there. Mostly because the "punishment" is not handed out to everyone equally; it's "random".

    There are PVP aspects, which you may or may not be able to win, but in a fight, when you're actually using 80-90% of your CPU, you can be damn sure you're going to get locked out. How will that affect the fight? Your opponent gets 5 free hits? Well, if you're HEAL tanking a tower shot, there goes your expensive, possibly boosted creep.

    Maybe you just want to strive for efficiency, except there again, looking at the "top players", a whacking or two with the ban hammer is all the difference that exists between the top 10.

    So once again you have this thing (CPU usage) that is core to your gameplay (most of the game is actually managing CPU vs. your goals) and you get no control over it. It would be like me saying Tedvm and ags131 are entering a race from Florida to London, but I will dish out the gas. You can pick anything else you want, but you must use a car, boat, or plane, and I will give you the fuel. They ask how much fuel, and I say I will give you 1 gallon of fuel every 30 minutes. They start the race, but this time I give ags131 1 gallon of gas and Tedvm 0.8 gallons. Then for some unknown reason I give Tedvm 1 gallon of gas but tell ags131 he can't have any for 2.5 hours. Later I give Tedvm 0.5 gallons and tell ags131 he gets no gas again for 2.5 hours. When asked why, I just shrug and say it's the way gas works, and hand them both an empty gas can, but I continue to dish out gas in seemingly random amounts.

     

    I hope that makes sense.


  • Culture

    > What would be nice is an in game representation of an alliance. At least that way I could make sure I was poking a small lonely bear and not 500 unknown bears.

    You should install the Tampermonkey/Greasemonkey extension. It adds alliance data to the in-game map and the leaderboards.

    https://github.com/LeagueOfAutomatedNations/loan-browser-ext


  • Culture

    I keep getting hit with these every once in a while, and I can tell you it's extremely irritating.

    There is no way in hell my code can cause this, as it can run fine for weeks, then in one day I get hit with 20 of these for no reason whatsoever.

     

    My code is fine-tuned to do tasks with creeps as small as possible. Due to this issue these creeps get killed off because they can't fire commands (Source Keeper miners, invader defenders, power bank miners).

    I find the lack of communication quite disturbing, as it's affecting the people who can't do anything about it. People with shitty code who cause large timeouts should be punished. Not the people playing the game as intended, with proper code and without infinite loops.

     

    @artem, please fix this issue. It's seriously taking the joy out of the game at the current rate these things are happening.



  • You'll be missed; you were always an insightful part of the discussion in the forum and on Slack.

    Agreed about the hard resets. It is getting to the point where I don't really want to work on my code. Every time I've sat down to work on code over the last few months it has been trying to get to the bottom of that issue.