Auth Tokens
-
Any chance we can get an endpoint to query a token's access? Being able to determine what a token can access would be helpful in cases where, for example, a user selects an option to pull from segment 5 but has only granted access to segment 10.
-
@artch said in Auth Tokens:
…olve Auth Tokens and thus are NOT rate limited at all. They will work as before, without any limits, including code uploads. The documentation article has been updated to indicate that. I'll answer other comments and suggestions on Monday. Rate limit values will most probably be changed; they are not final in any sense.
Will this replace basic access authentication as well come February when the auth tokens replace the current system?
-
I have another request that I think will be super helpful. Right now the options are Full Access or the selection of various options. I think a Read Only option would be extremely useful, and since all of the write operations are POST requests it should be easy to define read-only access as just the GET requests. This would allow third-party developers to build really informative applications. Pretty much all of the League stuff can be handled with a read-only token (with the exception of populating the public segments, but that's just one of roughly three systems the League site uses). The Screeps Dashboard used by Quorum is also read-only. The backup tool could also be set up with a read-only key.
-
As someone who enjoys making tooling for the Screeps ecosystem, I'm pretty excited about this new feature, but wanted to come in to express my concerns about the rate limits. Since you've already said that the values will likely be changed, I'll just ask one question: Why do the rate limits exist? Here are my thoughts on potential answers to this:
The rate limits are intended to reduce demand on Screeps infrastructure.
In this case the limits should very likely be set so high that only problematic scripts would ever trigger them. For example, requesting a memory segment (100 KB) from each shard (3) once per tick (~0.3 Hz) works out to about 100 KB/s of bandwidth. If supporting that is tenable, the limit should be 0.3 Hz (or 1080 / hour).
For an even more stark example, the code upload limit should likely be closer to 720 / hour or more, given that the "baseline" is users editing code in the online editor, who might save every 5 seconds during active development, and we know the infrastructure can support that.
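The arithmetic above can be double-checked in a few lines (the 100 KB segment size, 3 shards, and ~0.3 Hz tick rate are taken from the example; everything else is just multiplication):

```javascript
// Back-of-envelope check of the numbers above.
const segmentKB = 100;  // maximum memory segment size
const shards = 3;       // one request per shard per tick
const tickHz = 0.3;     // ~one tick every 3.3 seconds

const bandwidthKBps = segmentKB * shards * tickHz; // ~90 KB/s, i.e. "about 100 KB/s"
const requestsPerHour = tickHz * 3600;             // 1080 requests/hour per shard

console.log(bandwidthKBps, requestsPerHour);
```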
The rate limits are intended to increase the challenge of the game.
This seems less likely to me, but if this is the case then browser and Steam clients should be rate limited as well. If you don't rate limit that authentication mechanism, then external tooling will just find ways to use it to bypass the rate limits. For example, instead of the tool saying "go here to get an API token", it would say "go here to log in, then run this user script to produce a cookie you can use to log in". It's also worth pointing out that the API method of accessing memory/market/map can be emulated using the console API.
Regardless of the motivation for rate limiting, I'd like to request that a few specific ones be increased to specific values:
POST /api/user/code should have a rate limit of at least 12 / minute = 720 / hour. This lets you update code every 5 seconds, which is the rate I'd expect an active coder on the site to be updating at during active development.
GET /api/user/memory-segment should have a rate limit of at least 1080 / hour. This will allow a script to collect per-tick stats from each shard in realtime.
-
100 KB/s would be 0.8 Mbit/s per user (or 8.2 GB / day) just for stats... how do you think that would be tenable if you want to support that for every active user?
For an even more stark example, the code upload limit should likely be closer to 720 / hour or more, given that the "baseline" is users editing code in the online editor, who might save every 5 seconds during active development, and we know the infrastructure can support that.
I would really like to see someone coding for an hour with an average save frequency of 5 seconds. That's like saving and uploading code every second tick.
If you don't rate limit that authentication mechanism, then the external tooling will just find ways to use it so it can bypass the rate limits.
That would only work if your script solves captchas. And if you do that actively to circumvent limits set by the game I would expect your account to get banned.
-
100 KB/s would be 0.8 Mbit/s per user (or 8.2 GB / day) just for stats... how do you think that would be tenable if you want to support that for every active user?
That's definitely not the case. It's a worst-case scenario that isn't likely. Assuming one segment per tick per shard, and three-second ticks with a player spread across three shards, the segment would have to be completely full of completely random data to hit that target. If the data isn't random, the compression used by the API would drop the number significantly.
Even without compression the segments are not likely to be completely full. Saving that much data (and thus paying for the JSON.stringify call) would use up a lot of CPU, so people have an incentive to only store what they are using. I'm a fairly high-GCL player who collects a lot of stats, and my segments tend to average around 50 KB for stats, which turns into 7 KB when compressed (which I just tested using real statistics segments).
-
Alright, now after reading some of the comments here, I'd like to make another clarification.
Tokens' purpose is to regulate automated use of API endpoints. Automated means human-less here. Such use may involve automated stats gathering or some automated actions during long (more than an hour) sessions. This explains such low limits for some endpoints, since they are not supposed to be automated in general.
However, if you use tokens in some third-party client or other software that involves human presence, then rate limiting shouldn't apply at all, just like in the official client. For that purpose we should probably develop a method to reset all token timers at any time in the official client. It would look like a "Reset" button in the "Auth Tokens" section with reCAPTCHA attached to it. If you (not your automated software) have hit some rate limit and it blocks you (not your automated software), then you can easily press that button and continue. We can even develop a UI-less page containing that reCAPTCHA that your client can embed in an <iframe> to handle this scenario easily.
Now to specific questions.
-
@jbyoshi No, private servers won't include this system.
-
The rate limit on reading segments will really hurt the Screeps stats programs, which currently store stats once per tick. This will be even worse for people who are on multiple shards. Even at three-second ticks, users are only going to be able to get less than a third of their statistics with this system. Combine it with the rate limiting on reading memory (only once per minute) and statistics programs are effectively dead.
Not dead, but needing a refactor. You have to collect per-tick stats in a memory segment and flush it once per minute to third-party software. Collecting something every tick is the load profile that we'd like to eliminate with this new system.
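A minimal sketch of that refactor inside a Screeps script might look like this (the segment number and the recorded fields are arbitrary example choices, not anything prescribed by the game):

```javascript
// Append this tick's stats to a buffer segment so external tooling
// can fetch a whole minute of data in a single request.
// Segment 30 and the recorded fields are arbitrary example choices.
const STATS_SEGMENT = 30;

function recordTickStats() {
  const raw = RawMemory.segments[STATS_SEGMENT] || '{}';
  const buffer = JSON.parse(raw);
  buffer[Game.time] = {
    cpu: Game.cpu.getUsed(),
    bucket: Game.cpu.bucket,
  };
  RawMemory.segments[STATS_SEGMENT] = JSON.stringify(buffer);
}
```

In a real script you would also need to request the segment with `RawMemory.setActiveSegments([STATS_SEGMENT])` on a previous tick, and trim old entries so the buffer stays under the 100 KB segment limit.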
I think the rate limiting on uploading code should be 240 per day, rather than 10 per hour. This would result in the same effective rate limit but would allow people to handle debugging a lot easier. I imagine there will be a lot of salt if people upload a bug but can't work around it due to the upload limit.
It's an option, but we have to consider the other side: with the new "Reset" button per-day limits would be easily circumvented by clicking it once a day, rather than once an hour (which is impossible for most human beings).
A new endpoint that allowed us to pull multiple segments at once would alleviate a lot of the pain for the stats programs. With it we could grab all the statistics segments in one go, so each stat read only costs 1 memory read, 1 segment read, and 1 console call regardless of how many ticks are being processed.
Makes sense, we'll consider.
It would be nice if we could request an exemption, or at least higher limits, for some third-party tools. Specifically, I would like to request a higher limit for the League of Automated Nations website and account (which is only used for completely public information). Otherwise it's going to take a pretty massive rewrite (which I will not have time for in January due to work and travel) to get it to fit within the limits.
We might disable CAPTCHA for the specific user, but we have to define the roadmap of when this rewrite is going to be done, we can't allow this exception to stay for good.
-
After looking at these rate limits, I'm not sure that's viable without spreading the requests over several IPs to counteract the rate limiting, which would be a headache to manage.
All rate limits are user-based, not IP-based.
Another impact is currently most users request stats every 15 seconds, these limits effectively reduce that to once per minute when pulling from Memory, making stats useless for monitoring anything other than long averages.
Reducing the pulling interval is our goal here, as explained above. Please consider aggregating the per-tick stats and pulling them at once.
-
Are there any plans to add additional endpoints in to the token system? Specifically I think it would be useful to add the "my orders" and "wallet" endpoints to the system so that people can still collect stats about them but not have to give out a full access token.
Yes, it's possible.
-
Any chance we can get an endpoint to query a token's access? Being able to determine what a token can access would be helpful in cases where, for example, a user selects an option to pull from segment 5 but has only granted access to segment 10.
Sure, makes sense.
-
@tedivm Including all GET endpoints might mean a bit more than normal third-party software needs to know. It would allow reading, for example, user email, subscription details, and other sensitive data. Is it really different from giving out a full-access token?
-
The rate limits are intended to reduce demand on Screeps infrastructure.
This.
In this case the limits should very likely be set so high that only problematic scripts would ever trigger them. For example, requesting a memory segment (100 KB) from each shard (3) once per tick (~0.3 Hz) works out to about 100 KB/s of bandwidth. If supporting that is tenable, the limit should be 0.3 Hz (or 1080 / hour).
We also should consider backend CPU overhead (e.g. for gzip compression) and internal LAN overhead for such operations.
-
Another side project here is how we should rate limit the websockets. This is not implemented here yet, but we need to come up with some solution eventually. One option which is currently being debated is to limit the connection rate itself:
- When you connect to a websocket using a token for the first time, you have 1 hour timeout. After the timer is expired, the websocket connection drops.
- After that all websocket sessions will drop in 15 seconds with a 60-second reconnect timeout.
- You can use the "Reset" button in your account settings to restore the 1-hour timeout again.
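For a sense of scale, the proposed post-first-hour regime (15 seconds connected, 60 seconds locked out) gives a connection duty cycle of only 20%:

```javascript
// Fraction of wall-clock time a socket stays connected once the
// proposed 15 s on / 60 s off regime kicks in.
function dutyCycle(sessionMs, reconnectMs) {
  return sessionMs / (sessionMs + reconnectMs);
}

console.log(dutyCycle(15 * 1000, 60 * 1000)); // 0.2
```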
-
@tedivm Including all GET endpoints might mean a bit more than normal third-party software needs to know. It would allow reading, for example, user email, subscription details, and other sensitive data. Is it really different from giving out a full-access token?
Take the Quorum dashboard as an example. It's a completely read-only application; there's nothing in it that lets a user change game state. It would be a perfect use case for a "read only" token. If someone were to hack the system, a read-only token means they wouldn't be able to affect the game. I do plan on adding messaging to the dashboard.
This obviously isn't critical, but would fall under the "nice to have" category.
Not dead, but needing a refactor. You have to collect per-tick stats in a memory segment and flush it once per minute to third-party software. Collecting something every tick is the load profile that we'd like to eliminate with this new system.
The other stats program (not the one made by me) already has support for sending statistics using the console instead of segments. I'd worry that most people are going to bypass this limit by switching away from segments to console-based stats (at least until that gets rate limited as well).
Another side project here is how we should rate limit the websockets. This is not implemented here yet, but we need to come up with some solution eventually. One option which is currently being debated is to limit the connection rate itself:
There are really two use cases for the websocket and third party applications (that aren't full on clients)-
- Recording console data, which the proposed rate limits would make impossible to do as each websocket would only be able to be used for 15 seconds before being disconnected for a minute. Without the ability to record console data it is very difficult for people to trace back older bugs that occur when they aren't at the system, and it would make systems like quorum a bit more difficult since it depends on an app to essentially mirror the console data.
This could also be frustrating for people who use the standalone console, at least after the first hour. If they connect for the first time and it works for an hour, that's great, but if their reconnects later in the day are rate limited until they hit a button in the UI, that could be frustrating. Rather than have the token stay in rate-limiting mode (15 seconds on, 60 seconds off) permanently after an hour of usage, you may want to have it reset back to "full hour" mode after a few hours. This would still effectively stop automated reading of the console, but would mean less frustration for people who are bouncing in and out of the custom console program.
- Getting room objects. The "battle reporter" bot uses this to determine the category of a battle and then report it to Slack and Twitter. This bot rarely takes more than a minute to run its tests, and only looks at rooms that the battle API endpoints say are active. I think this should be able to stay within those limits, but it's going to be really tight.
Why not rate limit based on usage, rather than pure timeouts? Only allow X number of console websockets at a time, and rate limit room objects so only Y amount can be locked up each minute?
To add to that if you created a new API endpoint for pulling in room object data I bet a ton of people would use that instead of using the websocket, which would allow you to rate limit it using the ratelimiting system from above. This may work better than a pure timeout based system, as players can queue their requests, dump out as many as possible during that 15 second window, and then pause until it can query again- resulting in roughly the same amount of load but condensing it to a spike instead of spreading it out.
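The queue-and-burst pattern described above might look something like this (entirely hypothetical client-side code, not an existing API):

```javascript
// Queue requests while rate-limited, then drain as many as the
// window allows when it opens.
class WindowedQueue {
  constructor() {
    this.pending = [];
  }
  enqueue(request) {
    this.pending.push(request);
  }
  // Drain up to `budget` queued requests; `send` actually issues them.
  drainWindow(budget, send) {
    const batch = this.pending.splice(0, budget);
    batch.forEach(send);
    return batch.length;
  }
}
```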
-
I agree with @tedivm. Personally, I don't see how limiting the time for web sockets would help much (as opposed to some other type of limit). An application that is always connected to the web socket (but is otherwise reasonable in its usage) would be no different than leaving your Screeps client on overnight.
-
@artch A "read-only" token doesn't need to be a "read-everything" token. I don't know if there's a better description that's widely used for a token that's only authorized for read access to non-sensitive data.
-
Another issue that's come up is that the rate limits are making development difficult. Would it be possible to have the rate limits lifted or removed for PTR?
-
For the moment I've reverted my code pushing to using username/password; I've hit the rate limit 3 times in a row this morning trying to work on cross-shard code.