Rate Limiting in Java: Implementing Per-User Throttling with Redis Buckets
To protect APIs from abuse and ensure fair usage, rate limiting is essential. It caps how often each user or client can call your APIs, so no single caller can overwhelm the system. In this guide, you’ll learn how to implement per-user throttling in Java with Spring Boot and Redis, based on the token bucket algorithm.
1. Why Use Redis for Rate Limiting?
- Low latency and high throughput
- Atomic operations using Redis scripts (Lua)
- Persistence (optional) and cluster support
- Suitable for distributed systems — state doesn’t live in application memory
2. The Token Bucket Algorithm
The token bucket algorithm is widely used for rate limiting. Here’s how it works:
- Each user has a bucket that holds up to a fixed number of tokens (its capacity).
- Each request consumes one token.
- Tokens are refilled at a fixed rate, never beyond the bucket's capacity.
- If there are no tokens, the request is denied or delayed.
This model allows short bursts of up to one full bucket of requests while capping the long-run average at the refill rate.
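The steps above can be sketched as a small in-memory class before moving the logic into Redis. This is only an illustration under stated assumptions: `InMemoryTokenBucket` and its methods are hypothetical names, time is passed in explicitly as epoch seconds so the behavior is easy to follow, and a single JVM holds the state.

```java
/**
 * Minimal in-memory token bucket, for illustration only.
 * InMemoryTokenBucket and its methods are hypothetical names.
 */
final class InMemoryTokenBucket {
    private final double maxTokens;     // bucket capacity, which is also the largest burst
    private final double refillPerSec;  // tokens added per second
    private double tokens;              // tokens currently available
    private long lastRefillSec;         // time of the last refill, in epoch seconds

    InMemoryTokenBucket(double maxTokens, double refillPerSec, long nowSec) {
        this.maxTokens = maxTokens;
        this.refillPerSec = refillPerSec;
        this.tokens = maxTokens;        // a new bucket starts full
        this.lastRefillSec = nowSec;
    }

    /** Refills for the elapsed time, then tries to take one token. */
    synchronized boolean tryConsume(long nowSec) {
        long elapsed = Math.max(0, nowSec - lastRefillSec);
        tokens = Math.min(maxTokens, tokens + elapsed * refillPerSec);
        lastRefillSec = Math.max(lastRefillSec, nowSec);
        if (tokens >= 1) {
            tokens -= 1;                // each request consumes one token
            return true;
        }
        return false;                   // empty bucket: deny (or delay) the request
    }

    public static void main(String[] args) {
        InMemoryTokenBucket bucket = new InMemoryTokenBucket(3, 1.0, 100);
        for (int i = 0; i < 4; i++) {
            System.out.println("t=100 request " + (i + 1) + ": " + bucket.tryConsume(100));
        }
        System.out.println("t=102 request: " + bucket.tryConsume(102));
    }
}
```

With a capacity of 3 and a refill rate of 1 token per second, the first three requests at t=100 pass, the fourth is denied, and a request two seconds later passes again.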
3. Redis Lua Script for Rate Limiting
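-- token_bucket.lua
-- KEYS[1]: bucket key for one user
-- ARGV: max_tokens, refill_rate (tokens/sec), now (epoch seconds), requested tokens
local key = KEYS[1]
local max_tokens = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

-- Load the stored state; a missing key means a new, full bucket.
local bucket = redis.call("HMGET", key, "tokens", "timestamp")
local tokens = tonumber(bucket[1]) or max_tokens
local timestamp = tonumber(bucket[2]) or now

-- Refill for the elapsed time, capped at the bucket capacity.
local delta = math.max(0, now - timestamp)
tokens = math.min(max_tokens, tokens + delta * refill_rate)

-- Consume only if enough tokens are available.
local allowed = tokens >= requested
if allowed then
    tokens = tokens - requested
end

-- HSET accepts multiple field/value pairs since Redis 4.0; HMSET is deprecated.
redis.call("HSET", key, "tokens", tokens, "timestamp", now)
-- Idle buckets expire; the TTL should be at least max_tokens / refill_rate seconds.
redis.call("EXPIRE", key, 60)
-- Return an explicit integer: a Lua false would reach the client as nil.
return allowed and 1 or 0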
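The script expires idle keys after a fixed 60 seconds. Because a missing key is treated as a full bucket, this is only safe while an empty bucket can refill completely within the TTL, that is, while max_tokens / refill_rate is at most 60 seconds. Otherwise an idle user would get a full bucket back earlier than the refill rate allows. A minimal sketch of that calculation follows; `BucketTtl` is a hypothetical helper, not part of Redis or Spring.

```java
/** Hypothetical helper for choosing the bucket key's TTL. */
final class BucketTtl {
    private BucketTtl() {}

    /** Seconds an empty bucket needs to refill completely, rounded up. */
    static long secondsToFullRefill(long maxTokens, double refillPerSec) {
        if (maxTokens <= 0 || refillPerSec <= 0) {
            throw new IllegalArgumentException("maxTokens and refillPerSec must be positive");
        }
        return (long) Math.ceil(maxTokens / refillPerSec);
    }

    public static void main(String[] args) {
        System.out.println(secondsToFullRefill(10, 1.0));   // 10: the values used in this guide
        System.out.println(secondsToFullRefill(100, 0.5));  // 200: would need a TTL above 60
    }
}
```

For the values used in this guide (10 tokens, 1 token per second) the 60-second TTL leaves a comfortable margin. For larger or slower buckets, one option is to pass the TTL to the script as an extra argument instead of hardcoding it.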
4. Spring Boot Integration
1. Redis Configuration
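import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {
    @Bean
    public RedisTemplate<String, String> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, String> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        // String serializers, so the Lua script receives plain-text keys and arguments.
        template.setDefaultSerializer(new StringRedisSerializer());
        return template;
    }
}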
2. RateLimiter Service
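import java.time.Instant;
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.ClassPathResource;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.scripting.support.ResourceScriptSource;
import org.springframework.stereotype.Service;

@Service
public class RateLimiterService {
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    private final DefaultRedisScript<Boolean> rateLimiterScript;

    public RateLimiterService() {
        // Load the script once from src/main/resources/token_bucket.lua.
        rateLimiterScript = new DefaultRedisScript<>();
        rateLimiterScript.setScriptSource(new ResourceScriptSource(new ClassPathResource("token_bucket.lua")));
        rateLimiterScript.setResultType(Boolean.class);
    }

    public boolean isAllowed(String userId) {
        String key = "bucket:" + userId;
        List<String> keys = List.of(key);
        // Epoch seconds from the application clock; instances should keep their clocks in sync.
        long now = Instant.now().getEpochSecond();
        return Boolean.TRUE.equals(redisTemplate.execute(
                rateLimiterScript,
                keys,
                "10",                // max_tokens
                "1",                 // refill_rate (tokens/sec)
                Long.toString(now),  // current time (epoch seconds)
                "1"                  // requested tokens
        ));
    }
}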
3. Middleware/Interceptor
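import jakarta.servlet.http.HttpServletRequest;   // javax.servlet.http on Spring Boot 2.x
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;

@Component
public class RateLimitingInterceptor implements HandlerInterceptor {
    @Autowired
    private RateLimiterService rateLimiterService;

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // X-User-Id is client-supplied; in production, derive the ID from the authenticated principal.
        String userId = request.getHeader("X-User-Id");
        if (userId == null) {
            // No identity to throttle on: a bad request, not a rate-limited one.
            response.setStatus(HttpStatus.BAD_REQUEST.value());
            return false;
        }
        if (!rateLimiterService.isAllowed(userId)) {
            response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
            return false;
        }
        return true;
    }
}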
4. Register the Interceptor
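import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {
    @Autowired
    private RateLimitingInterceptor rateLimitingInterceptor;

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Applies to every request path; narrow with addPathPatterns(...) if needed.
        registry.addInterceptor(rateLimitingInterceptor);
    }
}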
5. Benefits of This Approach
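- ⚙️ Scalable: Application instances stay stateless because bucket state lives in Redis, so the limiter works in distributed environments
- 🧪 Testable: Token bucket logic lives in a single Redis script that can be tested in isolation
- 🔐 Secure: Per-user rate limiting contains the impact of abusive clients
- 🕰️ Flexible: Adjust bucket size and refill rate per endpoint or user type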
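To illustrate the flexibility point, the hardcoded "10" and "1" script arguments could come from a per-tier lookup instead. The sketch below is an assumption-laden example: the `UserTier` enum, its tier names, and its limits are hypothetical placeholders, not recommended values.

```java
/** Hypothetical user tiers; the limits are placeholder values, not recommendations. */
enum UserTier {
    FREE(10, 1.0),
    PRO(100, 10.0);

    final int maxTokens;        // bucket capacity
    final double refillPerSec;  // refill rate in tokens per second

    UserTier(int maxTokens, double refillPerSec) {
        this.maxTokens = maxTokens;
        this.refillPerSec = refillPerSec;
    }

    /** Arguments in the order token_bucket.lua reads them from ARGV. */
    String[] scriptArgs(long nowSec, int requested) {
        return new String[] {
            Integer.toString(maxTokens),
            Double.toString(refillPerSec),
            Long.toString(nowSec),
            Integer.toString(requested)
        };
    }

    public static void main(String[] args) {
        // Prints: 100 10.0 1700000000 1
        System.out.println(String.join(" ", PRO.scriptArgs(1_700_000_000L, 1)));
    }
}
```

Because the script converts every argument with tonumber(), a decimal string such as "10.0" is accepted for the refill rate. How a request is mapped to a tier (for example from the authenticated user's plan) is left open here.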
6. Optional Enhancements
- Use a different refill rate per API key or user tier
- Smooth out bursts with a leaky bucket fallback where a steady request rate matters more than burst tolerance
- Store limits in Redis hashes or config maps
- Expose remaining quota in HTTP headers
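As a sketch of the last enhancement, the headers could be derived from the bucket state. This assumes the Lua script is extended to return the remaining token count, which the version in this guide does not do. `QuotaHeaders` is a hypothetical helper. `Retry-After` is a standard HTTP header, while the `X-RateLimit-*` names are a common convention rather than a standard.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that turns bucket state into response headers. */
final class QuotaHeaders {
    private QuotaHeaders() {}

    /**
     * @param maxTokens    bucket capacity
     * @param tokensLeft   tokens left after this request was evaluated
     * @param refillPerSec refill rate in tokens per second
     * @param allowed      whether the request was let through
     */
    static Map<String, String> build(int maxTokens, double tokensLeft,
                                     double refillPerSec, boolean allowed) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-RateLimit-Limit", Integer.toString(maxTokens));
        // Report whole tokens only; fractional tokens cannot pay for a request.
        long remaining = (long) Math.floor(Math.max(0, tokensLeft));
        headers.put("X-RateLimit-Remaining", Long.toString(remaining));
        if (!allowed) {
            // Seconds until one whole token is available, assuming one token per request.
            double missing = 1 - tokensLeft;
            long retryAfter = Math.max(1, (long) Math.ceil(missing / refillPerSec));
            headers.put("Retry-After", Long.toString(retryAfter));
        }
        return headers;
    }

    public static void main(String[] args) {
        // Prints: {X-RateLimit-Limit=10, X-RateLimit-Remaining=0, Retry-After=1}
        System.out.println(build(10, 0.0, 1.0, false));
    }
}
```

In the interceptor, each entry could then be copied onto the response with `response.setHeader(name, value)` before returning.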
7. Conclusion
Rate limiting is critical for building resilient APIs. By combining Spring Boot, Redis, and the token bucket algorithm, you get a per-user rate limiter that scales across multiple application instances, because every instance shares the same bucket state. Redis handles concurrency and the token math atomically inside the Lua script, while your application code stays small.

