Rate Limit APIs with Bucket4J via Java

Learn how to limit calls to your API in Java via Bucket4J

To keep APIs from being overwhelmed and to protect them from malicious attacks, API owners use several techniques. The one we will look at here is rate limiting.

Many API management platforms are available (we will write an article on those soon), but what if we have to build rate limiting ourselves with an open-source tool?

Introducing Bucket4J. Bucket4J is a Java rate-limiting library based on a token-bucket algorithm.

You can read about the token bucket algorithm over here.
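
To make the idea concrete, here is a minimal sketch of a token bucket using only the JDK. It is not Bucket4J's implementation: the class name `SimpleTokenBucket`, its constructor parameters, and the injectable clock are all assumptions made for illustration. The bucket starts full, each call takes one token, and tokens are added back once per elapsed refill period, never beyond capacity.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified token bucket for illustration; not Bucket4J's implementation. */
class SimpleTokenBucket {
    private final long capacity;          // maximum number of tokens the bucket can hold
    private final long refillTokens;      // tokens added per refill period
    private final long refillPeriodNanos; // length of one refill period in nanoseconds
    private final LongSupplier clock;     // nanosecond time source, injectable for testing
    private long tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, long refillPeriodNanos, LongSupplier clock) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.clock = clock;
        this.tokens = capacity;                   // assumption: the bucket starts full
        this.lastRefillNanos = clock.getAsLong(); // assumption: periods are counted from creation
    }

    /** Takes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryConsume() {
        refill();
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }

    /** Returns the number of tokens currently available. */
    synchronized long availableTokens() {
        refill();
        return tokens;
    }

    // Adds refillTokens for every whole period that has elapsed, capped at capacity.
    private void refill() {
        long now = clock.getAsLong();
        long periods = (now - lastRefillNanos) / refillPeriodNanos;
        if (periods > 0) {
            tokens = Math.min(capacity, tokens + periods * refillTokens);
            lastRefillNanos += periods * refillPeriodNanos;
        }
    }
}
```

For a limit of 3 calls per minute, such a bucket could be created as `new SimpleTokenBucket(3, 3, 60_000_000_000L, System::nanoTime)`.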

Pre-requisites:

  • JDK 8+
  • Maven (to pull in the Bucket4J library)

We will build the applications in a client-server architecture: a client that consumes an API hosted by a server, where the server's API is wrapped with Bucket4J to enforce rate limits.

Both the client and the server will be simple Spring Boot applications, for faster development and demo purposes. (Use Spring Initializr to get started.)

Setting Up the Client and Server Applications

Server

A simple resource that returns a random planet as a String from a list of planets we maintain. We start the server on port 8001 so that the client can consume it.

package in.virendra.oswal.server;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class ServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServerApplication.class, args);
    }

}

@RestController
class PlanetResource {

    private final List<String> PLANETS = Arrays.asList(
            new String[] { "Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", "Pluto" });

    @GetMapping(value = "/planet")
    public ResponseEntity<String> getPlanet() {
        Collections.shuffle(PLANETS);
        return ResponseEntity.ok(PLANETS.get(0));
    }
}

Client

We will consume the above service and, for now, just log the planet we get to the console. We call the service in a while loop with a delay of 10 seconds between calls, which equates to 6 calls per minute (so that we can demonstrate the Bucket4J limit later).

package in.virendra.oswal.client;

import java.net.URI;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
public class ClientApplication {

    public static void main(String[] args) {
        SpringApplication.run(ClientApplication.class, args);

    }

    @Bean
    RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

@Component
class Runner implements CommandLineRunner {

    Logger LOG = LoggerFactory.getLogger(Runner.class);

    @Autowired
    RestTemplate _rt;

    @Override
    public void run(String... args) throws Exception {
        while (true) {
            ResponseEntity<String> response = _rt.getForEntity(new URI("http://localhost:8001/planet"), String.class);
            if (response.getStatusCode() == HttpStatus.OK) {
                LOG.info(String.format("Planet received %s", response.getBody()));
            } else {
                // The {} placeholder is needed for SLF4J to include the status code in the message
                LOG.warn("Response Code Received {}", response.getStatusCodeValue());
            }
            Thread.sleep(10000);
        }
    }
}

On running the client, we call the server every 10 seconds and get a random planet each time, so our client and server are working as expected.

Set Up Bucket4J for Rate Limiting

Getting started with Bucket4J is as simple as adding the library to your classpath, which we will do via Maven. We use the latest version as per the official Bucket4J GitHub repository:

<dependency>
    <groupId>com.github.vladimir-bukhtoyarov</groupId>
    <artifactId>bucket4j-core</artifactId>
    <version>6.2.0</version>
</dependency>

We will set up a simple bucket that allows our client only 3 calls within a minute. Once those are used, further calls are rejected until the bucket is refilled when the 1-minute interval elapses.

The server resource is updated as below:

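// Additional imports for the server file (package names as of Bucket4J 6.x and Spring Boot 2.x)
import java.time.Duration;

import javax.annotation.PostConstruct;

import org.springframework.http.HttpStatus;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Bucket4j;
import io.github.bucket4j.Refill;

@RestController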
class PlanetResource {

    private Bucket bucket;

    private final List<String> PLANETS = Arrays.asList(
            new String[] { "Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", "Pluto" });

    @GetMapping(value = "/planet")
    public ResponseEntity<String> getPlanet() {
        if (bucket.tryConsume(1)) {
            Collections.shuffle(PLANETS);
            return ResponseEntity.ok(PLANETS.get(0));
        } else {
            return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
        }
    }

    @PostConstruct
    public void setupBucket() {
        Bandwidth limit = Bandwidth.classic(3, Refill.intervally(3, Duration.ofMinutes(1)));
        this.bucket = Bucket4j.builder().addLimit(limit).build();
    }
}

In the above code, we set up a bucket with a capacity of 3 tokens that is refilled with 3 more every minute.
When we receive a request, we try to consume a token. If a token is available in the current window, the caller gets a planet; otherwise we return HTTP code 429 (Too Many Requests), which tells the client it has sent more requests than allowed and needs to wait before trying again.
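
Note that the resource above keeps a single bucket, so every caller shares the same 3 tokens. If we wanted a separate limit per client, one possible approach is to keep one bucket per client key and create it on first use. The sketch below shows only that lookup, using the JDK alone; `PerClientRegistry`, `forClient`, and the idea of a string client key (an API key or IP address, say) are assumptions for illustration and not part of Bucket4J.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical helper that hands out one limiter instance per client key. */
class PerClientRegistry<T> {
    private final ConcurrentMap<String, T> limiters = new ConcurrentHashMap<>();
    private final Supplier<T> factory; // creates a fresh limiter for a client seen for the first time

    PerClientRegistry(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the limiter for this client, creating it atomically on first use. */
    T forClient(String clientKey) {
        return limiters.computeIfAbsent(clientKey, key -> factory.get());
    }

    /** Number of clients seen so far. */
    int size() {
        return limiters.size();
    }
}
```

In the server, the factory could be the same builder call used in `setupBucket()`, so that `forClient(key).tryConsume(1)` replaces `bucket.tryConsume(1)`. The map in this sketch grows without bound, so a real application would also need some way to evict idle clients.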

Let us re-run our client which is now changed to handle HTTP Code 429 as below:

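// Additional import for the client file
import org.springframework.web.client.HttpClientErrorException;

@Component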
class Runner implements CommandLineRunner {

    Logger LOG = LoggerFactory.getLogger(Runner.class);

    @Autowired
    RestTemplate _rt;

    @Override
    public void run(String... args) throws Exception {
        while (true) {
            Thread.sleep(10000);
            try {
                ResponseEntity<String> response = _rt.getForEntity(new URI("http://localhost:8001/planet"),
                        String.class);
                if (response.getStatusCode() == HttpStatus.OK) {
                    LOG.info(String.format("Planet received %s", response.getBody()));
                }
            } catch (HttpClientErrorException ex) {
                if (ex.getRawStatusCode() == 429) {
                    LOG.warn("Rate limit exhausted");
                }
            }

        }
    }
}

If we run it now, we see that the first 3 requests get a planet as expected, but the next 3 within that 1-minute window are not processed and we get HTTP code 429 instead. Voila, we have basic rate limiting in place.
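
The arithmetic behind this result can be replayed with a small simulation. It is a sketch under stated assumptions, not a trace of Bucket4J itself: it assumes the first call lands exactly at the start of a refill period and that the bucket is refilled to full capacity once per period. The class `RateLimitTimeline` and the method `simulate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of calls at a fixed spacing against a periodically refilled bucket. */
class RateLimitTimeline {

    /** Returns the HTTP status each call would see: 200 if a token was available, 429 otherwise. */
    static List<Integer> simulate(int calls, long intervalSeconds, long capacity, long periodSeconds) {
        List<Integer> statuses = new ArrayList<>();
        long tokens = capacity;
        long lastRefill = 0; // assumption: the first call coincides with the start of a period
        for (int i = 0; i < calls; i++) {
            long now = i * intervalSeconds;
            long periods = (now - lastRefill) / periodSeconds;
            if (periods > 0) {
                tokens = capacity; // refill to full once at least one whole period has passed
                lastRefill += periods * periodSeconds;
            }
            if (tokens > 0) {
                tokens--;
                statuses.add(200);
            } else {
                statuses.add(429);
            }
        }
        return statuses;
    }

    public static void main(String[] args) {
        // 7 calls, 10 seconds apart, 3 tokens per 60 seconds
        System.out.println(simulate(7, 10, 3, 60)); // prints [200, 200, 200, 429, 429, 429, 200]
    }
}
```

The calls at 0, 10, and 20 seconds use up the 3 tokens, the calls at 30, 40, and 50 seconds are rejected, and the call at 60 seconds succeeds again because the bucket has been refilled. In a real run the exact pattern depends on where the first call falls within the refill period.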

Bonus

However, once the rate limit is hit, a client should ideally avoid making further I/O calls altogether until a call can succeed.
So, like servers behind any API management tool, we will send a few response headers that the client can use to determine how many tokens remain in the current window and, if they are exhausted, how long to wait before trying again.
We won't do any handling on the client side for now; we will just log the headers.

We will pass the 2 headers below:

  • X-RATE-TOKEN-AVAILABLE: Number of tokens still available in the bucket for the current time window (1 minute in our case). Returned for a successful request, i.e. HTTP code 200.
  • X-RATE-REFILL-TIME: Time in seconds to wait for the refill before the API can be called again. Returned in case of HTTP code 429.

The server code that adds these headers to the response is below; we use the ConsumptionProbe API to get the required details:

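    // Header names as described above
    private static final String X_RATE_TOKEN_AVAILABLE = "X-RATE-TOKEN-AVAILABLE";
    private static final String X_RATE_REFILL_TIME = "X-RATE-REFILL-TIME";

    @GetMapping(value = "/planet")
    public ResponseEntity<String> getPlanet() {
        // Try to take one token; the probe reports the tokens left or the wait until the refill
        ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
        if (probe.isConsumed()) {
            Collections.shuffle(PLANETS);
            return ResponseEntity.ok().header(X_RATE_TOKEN_AVAILABLE, Long.toString(probe.getRemainingTokens()))
                    .body(PLANETS.get(0));
        } else {
            // Convert the wait from nanoseconds to whole seconds (integer division truncates)
            return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
                    .header(X_RATE_REFILL_TIME, Long.toString(probe.getNanosToWaitForRefill() / 1_000_000_000)).build();
        }
    }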
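
One detail worth noting in the conversion above: dividing nanoseconds by 1_000_000_000 truncates, so a wait of 1.5 seconds is reported as 1 and a wait of under a second as 0. If we would rather never tell a client to retry too early, one option is to round up instead. The sketch below uses only the JDK; the class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for turning a nanosecond wait into a header value in whole seconds. */
class RefillHeaderValue {

    /** Rounds a non-negative wait in nanoseconds up to whole seconds. */
    static long toWholeSecondsRoundedUp(long nanosToWait) {
        long nanosPerSecond = TimeUnit.SECONDS.toNanos(1);
        return (nanosToWait + nanosPerSecond - 1) / nanosPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(toWholeSecondsRoundedUp(1_500_000_000L)); // prints 2 (truncation gives 1)
        System.out.println(toWholeSecondsRoundedUp(400_000_000L));   // prints 1 (truncation gives 0)
        System.out.println(toWholeSecondsRoundedUp(2_000_000_000L)); // prints 2
    }
}
```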

Client Code to log headers for now:

@Component
class Runner implements CommandLineRunner {

    Logger LOG = LoggerFactory.getLogger(Runner.class);

    @Autowired
    RestTemplate _rt;

    @Override
    public void run(String... args) throws Exception {
        while (true) {
            Thread.sleep(10000);
            try {
                ResponseEntity<String> response = _rt.getForEntity(new URI("http://localhost:8001/planet"),
                        String.class);
                if (response.getStatusCode() == HttpStatus.OK) {
                    LOG.info(String.format("Planet received %s\n Headers: %s", response.getBody(),
                            response.getHeaders().toString()));
                }
            } catch (HttpClientErrorException ex) {
                if (ex.getRawStatusCode() == 429) {
                    LOG.warn("Rate limit exhausted");
                    // Parameterized logging also copes with getResponseHeaders() returning null
                    LOG.info("Await time before refill: {}", ex.getResponseHeaders());
                }
            }

        }
    }
}

Let's re-run the server and the client. If we look at the client logs now, we see that every request reduces the available tokens by 1 until they are exhausted, after which the client has to wait for the refill time.

In the logs, we can see that 3 calls go through, each telling us how many tokens are still available via the X-RATE-TOKEN-AVAILABLE header sent by the server. After that, 3 calls fail, with the refill time shown in the X-RATE-REFILL-TIME header, which has been converted to seconds.
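
Although our client only logs the headers, here is a rough sketch of how a client could turn the 429 response into a wait time instead of retrying blindly. It uses a plain `Map` for the headers so that it runs without Spring; the class `RetryDelay` and the method `secondsToWait` are hypothetical, and the lookup is case-sensitive, which is a simplification since HTTP header names are case-insensitive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical client-side helper that reads the refill time from a 429 response. */
class RetryDelay {
    static final String X_RATE_REFILL_TIME = "X-RATE-REFILL-TIME";

    /** Returns the seconds to wait, or empty if the response is not a 429 or the header is unusable. */
    static OptionalLong secondsToWait(int status, Map<String, String> headers) {
        if (status != 429) {
            return OptionalLong.empty();
        }
        String value = headers.get(X_RATE_REFILL_TIME);
        if (value == null) {
            return OptionalLong.empty();
        }
        try {
            // Negative values make no sense as a wait, so clamp them to zero
            return OptionalLong.of(Math.max(0L, Long.parseLong(value.trim())));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(X_RATE_REFILL_TIME, "29");
        System.out.println(secondsToWait(429, headers)); // prints OptionalLong[29]
    }
}
```

In our Runner, the header value could be read from `ex.getResponseHeaders()` in the catch block, and the loop could then sleep for that many seconds before the next call.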

That's it for this blog. This was a very basic usage of the Bucket4J library, but it has many more features that we can use to build a production-ready application. From distributed usage to thread safety to persisting tokens, it is all taken care of via the Bucket4J APIs.

Thank you for reading. If you have made it this far, please like the article; it will encourage me to write more like it. Do share your suggestions, I appreciate your honest feedback!

I would love to connect with you at Twitter | LinkedIn.

Support Virendra Oswal by becoming a sponsor. Any amount is appreciated!