Auth Tokens
-
Another side project here is how we should rate limit the websockets. This is not implemented here yet, but we need to come up with some solution eventually. One option which is currently being debated is to limit the connection rate itself:
- When you connect to a websocket using a token for the first time, you have a 1-hour timeout. After the timer expires, the websocket connection drops.
- After that all websocket sessions will drop in 15 seconds with a 60-second reconnect timeout.
- You can use the "Reset" button in your account settings to restore the 1-hour timeout again.
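As a rough illustration of the scheme in the list above, here is a minimal sketch of the per-token state it implies. The class and method names are hypothetical, and it assumes the 60-second reconnect timeout starts when the 15-second session drops, which the proposal does not spell out. It is not the server's actual implementation.

```python
from dataclasses import dataclass

FIRST_WINDOW = 60 * 60   # 1-hour initial timeout, from the proposal
SHORT_WINDOW = 15        # later sessions drop after 15 seconds
RECONNECT_WAIT = 60      # reconnect timeout after a short session


@dataclass
class TokenSocketPolicy:
    """Hypothetical per-token state for the proposed websocket limits."""
    first_window_used: bool = False
    next_connect_at: float = 0.0

    def connect(self, now: float) -> float:
        """Return the session length in seconds, or 0.0 if the connection is refused."""
        if now < self.next_connect_at:
            return 0.0
        if not self.first_window_used:
            self.first_window_used = True
            return float(FIRST_WINDOW)
        # Assumption: the reconnect wait is counted from the moment the session drops.
        self.next_connect_at = now + SHORT_WINDOW + RECONNECT_WAIT
        return float(SHORT_WINDOW)

    def reset(self) -> None:
        """Model of the "Reset" button: restore the 1-hour timeout."""
        self.first_window_used = False
        self.next_connect_at = 0.0
```

Under these assumptions, a token that has used its first hour gets 15 seconds of connection out of every 75 until the button is pressed.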
-
@tedivm Including all GET endpoints might expose more than a normal third-party tool needs to know. It would allow reading, for example, user email, subscription details, and other sensitive data. Is it really different from giving out a full access token?
Take the quorum dashboard as an example. It's a completely read only application- there's nothing in it that lets a user change game state. It would be a perfect use case for a "read only" token. If someone were to hack the system a read only token means they wouldn't be able to affect the game. I do plan on adding messaging to the dashboard.
This obviously isn't critical, but would fall under the "nice to have" category.
Not dead, but needing a refactor. You have to collect per-tick stats in a memory segment and flush it once per minute to third-party software. Collecting something every tick is the load profile that we'd like to eliminate with this new system.
The other stats program (not the one made by me) already has support for sending statistics using the console instead of segments. I'd worry that most people are going to bypass this limit by switching away from segments to console-based stats (at least until that gets rate limited as well).
Another side project here is how we should rate limit the websockets. This is not implemented here yet, but we need to come up with some solution eventually. One option which is currently being debated is to limit the connection rate itself:
There are really two use cases for the websocket and third party applications (that aren't full on clients)-
- Recording console data, which the proposed rate limits would make impossible to do as each websocket would only be able to be used for 15 seconds before being disconnected for a minute. Without the ability to record console data it is very difficult for people to trace back older bugs that occur when they aren't at the system, and it would make systems like quorum a bit more difficult since it depends on an app to essentially mirror the console data.
This could also be frustrating for people who use the standalone console, at least after the first hour. If they connect for the first time and it works for an hour, that's great, but if their reconnects later in the day are rate limited until they hit a button in the UI, that could be frustrating for them. Rather than have the token stay in rate-limiting mode (15 seconds on, 60 seconds off) permanently after an hour of usage, you may want to have it reset back to "full hour" mode after a few hours. This would effectively stop automated reading of the console, but would mean less frustration for people who are bouncing in and out of the custom console program.
- Getting room objects. The "battle reporter" bot uses this to define the category of a battle and then report it to slack and twitter. This bot rarely takes more than a minute to run its tests, and only looks at rooms that the battle api endpoints say are active. I think this should be able to stay within those limits, but it's going to be really tight.
Why not rate limit based on usage, rather than pure timeouts? Only allow X number of console websockets at a time, and rate limit room objects so only Y amount can be locked up each minute?
To add to that if you created a new API endpoint for pulling in room object data I bet a ton of people would use that instead of using the websocket, which would allow you to rate limit it using the ratelimiting system from above. This may work better than a pure timeout based system, as players can queue their requests, dump out as many as possible during that 15 second window, and then pause until it can query again- resulting in roughly the same amount of load but condensing it to a spike instead of spreading it out.
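The spike argument can be put in numbers with a small sketch. The function name and the request count are made up for illustration; only the 15-second and 60-second figures come from the proposal.

```python
ON_SECONDS = 15    # connection window from the proposal
OFF_SECONDS = 60   # reconnect timeout from the proposal


def peak_rate(requests_per_cycle: float, spread_out: bool) -> float:
    """Requests per second while the client is actually sending.

    spread_out=True models a client that paces itself over the whole cycle;
    spread_out=False models one that queues everything and sends it all
    during the 15-second window.
    """
    cycle = ON_SECONDS + OFF_SECONDS
    window = cycle if spread_out else ON_SECONDS
    return requests_per_cycle / window


# Hypothetical workload of 150 requests per 75-second cycle.
steady = peak_rate(150, spread_out=True)    # 2.0 requests/second
bursty = peak_rate(150, spread_out=False)   # 10.0 requests/second
```

The total work per cycle is the same in both cases, but the peak is five times higher when it is squeezed into the connection window.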
-
I agree with @tedivm. Personally, I don't see how limiting the time for web sockets would help much (as opposed to some other type of limit). An application that is always connected to the web socket (but is otherwise reasonable in its usage) would be no different than leaving your Screeps client on overnight.
-
@artch A "read-only" token doesn't need to be a "read-everything" token. I don't know if there's a better description that's widely used for a token that's only authorized for read access to non-sensitive data.
-
Another issue that's come up is that the rate limits are making development difficult. Would it be possible to have the rate limits lifted or removed for PTR?
-
For the moment I've reverted my code pushing to using username/password, I've hit the rate limiting 3 times in a row this morning trying to work on cross-shard code.
-
I have another minor request for the tokens: Add an option to add a short comment or label to the token in the UI, that would make it easy to tell which tokens are used where. For example, commenting with 'local dev', 'screepsplus', 'stats', etc
On that note, the ability to manually enter paths for the token would be nice too. Allows a bit more flexibility than the current full-access or limited selections.
-
Thanks for the reply!
Do you have anything you can share about the impact these proposed limits would have on current usage patterns? Perhaps it would be good to turn on these rate limits in "warning mode" for a few weeks to gather feedback on what reasonable limits feel like?
The websockets endpoint specifically I wish would be rate limited on bandwidth rather than on connection duration. My external console takes effectively no bandwidth but stays connected for hours, even a 0.25 KB/s bandwidth limit would be completely acceptable to me.
As far as the UI-less reCAPTCHA page to clear the rate limits: I actually think this is pretty fine. I'm imagining the deploy script would catch an error and print out a link for the user to click, then just keep refreshing in the background until the user did it.
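A minimal sketch of that deploy-script flow, for illustration only. The function name, the 429 status code, and the reset URL are assumptions, since the thread does not say what a rate-limited response looks like. The actual upload is passed in as a callable so the sketch stays independent of any HTTP library.

```python
import time
from typing import Callable

RATE_LIMITED = 429  # assumed status code for a rate-limited response


def push_with_reset_prompt(
    push: Callable[[], int],
    reset_url: str,
    poll_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry push() until it is no longer rate limited; return True on success.

    push() is a caller-supplied function that uploads the code and returns
    the HTTP status. The reset link is printed once, on the first
    rate-limited response, and the push is retried in the background.
    """
    prompted = False
    for _ in range(max_attempts):
        status = push()
        if status != RATE_LIMITED:
            return 200 <= status < 300
        if not prompted:
            print(f"Rate limited. Open this link to clear the limit: {reset_url}")
            prompted = True
        sleep(poll_seconds)
    return False
```

Whether retries themselves count against the limit is not stated in the thread, so a real script might prefer to poll a cheaper endpoint while it waits.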
Sounds like stats are a problem for you guys, so the rate limits on that make sense. I do wish it were more usage-based, like e.g. maybe these endpoints have a CPU cost associated with them that drain directly from your bucket. This creates an incentive for users to optimize their stats collection, and you can adjust the CPU cost of the endpoint as required.
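To make the CPU-cost idea concrete, here is a small sketch. The endpoint names and costs are invented for illustration, and nothing here reflects how the server actually accounts for CPU.

```python
# Hypothetical per-call CPU costs; the dev team could tune these per endpoint.
ENDPOINT_COST = {
    "memory-segment": 5,
    "room-objects": 2,
}


class CpuBucket:
    """Toy model of a player's CPU bucket being charged for API calls."""

    def __init__(self, level: int) -> None:
        self.level = level

    def charge(self, endpoint: str) -> bool:
        """Deduct the endpoint's cost; refuse the call if the bucket is too low."""
        cost = ENDPOINT_COST[endpoint]
        if cost > self.level:
            return False
        self.level -= cost
        return True
```

With a scheme like this, a stats collector that polls every tick drains the same bucket the player's own code runs on, which is the incentive to optimize.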
-
The screeps3D project was planning on an early release to showcase the work so far. Based on this discussion, it sounds like there are still some issues to be considered for 3rd-party clients (overall data use including websockets, resetting rate limits). In light of that, it probably seems best not to do a release when it is uncertain what kind of issues it might cause for the public server.
About the rate limits, I'd humbly request some other option than the manual reset. It would not be a very good user experience to be scrolling around the map and occasionally have to do a CAPTCHA. I'm not a web-dev so I looked up the invisible reCAPTCHA and I'm not sure that will be possible to do in a non-web environment like unity3d. Of course I understand that the dev-team has limited resources to accommodate the needs of a 3rd party client, so I'm not expecting it. It might be best to put the project on hold until there is something available.
-
Recording console data, which the proposed rate limits would make impossible to do as each websocket would only be able to be used for 15 seconds before being disconnected for a minute.
We need to come up with a solution that allows legit console usage like tracking errors and short messages, but disallow abusing it to send large amounts of data.
To add to that if you created a new API endpoint for pulling in room object data I bet a ton of people would use that instead of using the websocket, which would allow you to rate limit it using the ratelimiting system from above.
Makes sense, we'll look into that.
-
@artch A "read-only" token doesn't need to be a "read-everything" token. I don't know if there's a better description that's widely used for a token that's only authorized for read access to non-sensitive data.
I mean, the GET api/user/me endpoint contains some sensitive data, for example. Allowing all GET endpoints would include this one. Otherwise we have to develop some criteria other than "all GET".
-
Another issue that's come up is that the rate limits are making development difficult. Would it be possible to have the rate limits lifted or removed for PTR?
We may leave only the global 120 req/min limit and drop all per-endpoint limits on the PTR.
-
For the moment I've reverted my code pushing to using username/password, I've hit the rate limiting 3 times in a row this morning trying to work on cross-shard code.
Do you think the "Reset" button would help you with that? You can even set up your push script to open that URL automatically on a rate-limiting response; it will give you another 10 pushes in the current hour window (with an automatic reset to yet another 10 requests at minute 00 of the next hour).
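The hourly window described here behaves like a fixed-window counter. The following is a rough sketch under that reading; the class name is hypothetical and it is not the server's code.

```python
class HourlyPushLimit:
    """Fixed-window counter: `limit` pushes per clock hour, plus a manual reset."""

    def __init__(self, limit: int = 10) -> None:
        self.limit = limit
        self.window = None  # index of the clock hour currently being counted
        self.used = 0

    def allow(self, now: float) -> bool:
        """`now` is a Unix timestamp in seconds; hour boundaries fall at minute 00."""
        window = int(now // 3600)
        if window != self.window:
            # A new hour has started: automatic reset.
            self.window = window
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def reset(self) -> None:
        """Model of the "Reset" button: another full quota in the current hour."""
        self.used = 0
```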
-
I have another minor request for the tokens: Add an option to add a short comment or label to the token in the UI, that would make it easy to tell which tokens are used where. For example, commenting with 'local dev', 'screepsplus', 'stats', etc
Nice idea!
On that note, the ability to manually enter paths for the token would be nice too. Allows a bit more flexibility than the current full-access or limited selections.
It's technically difficult to implement.
-
The websockets endpoint specifically I wish would be rate limited on bandwidth rather than on connection duration. My external console takes effectively no bandwidth but stays connected for hours, even a 0.25 KB/s bandwidth limit would be completely acceptable to me.
How would you like to get it implemented personally? Truncating responses? Throttling/skipping them?
-
@bonzaiferroni Do you have any thoughts on what would be the best solution for your project, considering our needs with this new system?
-
Do you have any thoughts on what would be the best solution for your project, considering our needs with this new system?
Unfortunately I don't have any brilliant ideas. Clients are going to be in a whole different class of data use compared to an automated tool like a stats-checker. While a stats-checker will use a little bit of data constantly, a client will use potentially quite a bit more data except only when the player is active. That is why I thought the per-day limits might mitigate the problem, but it is only a partial solution and it isn't suitable for the reasons you've stated above. Another issue is meeting these limits with the client might lock a user out of other tools, which would be unacceptable to most players. The heart of the problem is that automated tools can be designed to stay within reasonable limits, but a client's data use will be intrinsically unpredictable and sporadic.
I can't think of any solution short of allowing clients to bypass the limits, as you've done with the official client. It might be that the best use for 3rd-party clients is with private servers. Since the Screeps3D project is being developed under the MIT license I suppose it would be possible for the dev-team to release their own version that has been modified to access the public server, but I realize that is probably unrealistic.
-
@bonzaiferroni I think the only option for clients is to handle Google Invisible reCAPTCHA somehow. This is how the official client will work, and the same principle should be applied to any other client. You can even continue to use the /api/auth/signin endpoint with the normal token exchange it generates, but you have to be ready for the server to ask you to confirm a CAPTCHA from time to time (once every few hours). I believe there must be some tool to embed a Web View in a Unity application these days.
-
That might be very workable, I'll definitely look into what it would take. I wonder if it would be possible to get a new subforum for asking questions related to 3rd party tools, sometimes a little direction from the devs can go a long way.
-
@artch Throttling would be ideal. A token bucket controlling the maximum amount of data that can be sent is probably easiest to implement. From my limited memory of the socket endpoints, I think dropping messages that would overflow the bucket is reasonable (I don't think any client will get confused if messages were dropped since they are all named ws events). You could even send a console message describing that you are being rate limited.
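For reference, a token bucket that drops overflowing messages could look roughly like this. It is a sketch only: the class name, the byte-based accounting, and the burst capacity are assumptions, and the 256 bytes/second rate is just the 0.25 KB/s figure mentioned earlier in the thread.

```python
class TokenBucket:
    """Byte-based token bucket; messages that do not fit are dropped, not queued."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float, now: float = 0.0) -> None:
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = now

    def try_send(self, size_bytes: int, now: float) -> bool:
        """Return True if the message may be sent, False if it should be dropped."""
        if now > self.last:
            # Refill for the elapsed time, never beyond the burst capacity.
            refill = (now - self.last) * self.rate
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
        if size_bytes > self.tokens:
            return False
        self.tokens -= size_bytes
        return True


# 0.25 KB/s sustained, with a hypothetical 1 KB burst allowance.
bucket = TokenBucket(rate_bytes_per_s=256, capacity_bytes=1024)
```

A server using something like this could send the suggested "you are being rate limited" console message the first time try_send returns False for a connection.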