Recently, during a Customer engagement, I encountered what seemed to be a fairly straightforward issue: a Security Group was running into the service quota for the number of inbound or outbound rules per security group.
Background to the Issue
To give a little bit of context, this particular security group was associated with a couple of Amazon EC2 Instances running Microsoft Windows Server 2019. They were acting as Active Directory Domain Controllers and were self-managed by the Customer.
Active Directory requires a large number of ports to be open, as per the following Microsoft Knowledge Base Article, so that it can support Replication, Authentication & DNS Resolution.
Specifically, in our use case we had deployed the underlying EC2 instances via CloudFormation, including the supporting Security Group and its inbound rules.
What was the problem?
Because of the Network environment that we were working in, we needed to narrow the source IP Addresses in the Security Group's inbound rules to be more specific, e.g. from a single /16 to multiple /24 CIDR Ranges.
Because each IP Address Range required 13 separate TCP and UDP ports, we soon hit the soft limit on the number of inbound or outbound rules per Security Group. When updating the CloudFormation Template via a ChangeSet, we encountered errors to the effect of “exceeding the service quota”.
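To illustrate how quickly the rule count grows, here is a minimal sketch in Python. The 13 ports per range and the quota of 60 come from this post; the function names and the example range counts are hypothetical and only there for illustration.

```python
# Illustrative arithmetic only; this does not call any AWS API.
PORTS_PER_RANGE = 13   # ports required per source range (from this post)
DEFAULT_QUOTA = 60     # inbound or outbound rules per security group before the increase


def inbound_rule_count(cidr_ranges: int, ports_per_range: int = PORTS_PER_RANGE) -> int:
    """One rule is needed for every (source range, port) combination."""
    return cidr_ranges * ports_per_range


def max_ranges_within(quota: int, ports_per_range: int = PORTS_PER_RANGE) -> int:
    """Largest number of source ranges that still fits inside the quota."""
    return quota // ports_per_range


if __name__ == "__main__":
    # Hypothetical range counts, chosen to show where the quota is crossed.
    for ranges in (1, 4, 5):
        total = inbound_rule_count(ranges)
        status = "within" if total <= DEFAULT_QUOTA else "over"
        print(f"{ranges} range(s) x {PORTS_PER_RANGE} ports = {total} rules ({status} quota of {DEFAULT_QUOTA})")
```

With 13 ports per range, a quota of 60 leaves room for only 4 source ranges, so splitting a /16 into even a handful of /24 ranges is enough to exceed it.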
What was the Solution?
The first part of the solution was obvious: we needed to raise a Service Quota increase. We therefore submitted a request via Service Quotas to increase the quota for the maximum number of inbound or outbound rules per VPC security group from 60 to 120.
However, once the increase had been approved and applied, we were still experiencing the same issue. Upon further investigation I noticed that the security group was utilising a custom prefix list to simplify the process of adding multiple IP Addresses to the Security Group.
This is where things got interesting. The description in the Service Quota read “A rule that references a security group or prefix list ID counts as one rule each for IPv4 and IPv6”. Our interpretation of this was that
1 prefix list * 13 inbound rules = 13 rules
and therefore, in theory, we should have been within the service quota. However, numerous attempts to update the CloudFormation Stack were failing with the same errors as we had previously experienced. After some further research, and coming across “Group CIDR blocks using managed prefix lists” in the AWS documentation, it transpired that each IP Address range within the prefix list actually counts towards the service quota (the documentation phrases this in terms of the prefix list's maximum number of entries), e.g.,
1 prefix list with 9 IP Address Ranges * 13 inbound rules = 117 rules.
As a test, we commented out all but 4 of the IP Address ranges in the Prefix List associated with the security group and then tried to update the CloudFormation Stack again. This time it was successful.
The prefix list that we were using had 9 different IP Address ranges, but we also had some other inbound rules that counted towards the quota. Although there were only 4 of them, they took us over the quota to 121 (117 + 4). Since these additional rules duplicated the rules already established via the prefix list, we simply removed all of them and carried out another Stack update, which this time was successful.
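The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting behaviour as we understood it, not an AWS API. The function and parameter names are hypothetical, and the numbers are the ones from this post.

```python
# Illustrative arithmetic only; names are hypothetical and no AWS API is called.
RAISED_QUOTA = 120  # quota after the approved increase (from this post)


def effective_rule_count(prefix_list_entries: int,
                         rules_referencing_list: int,
                         other_rules: int = 0) -> int:
    """Count rules the way the quota appeared to count them for us.

    Every rule that references the prefix list is weighted by the number of
    entries counted for that list (AWS documents this as the list's maximum
    number of entries), plus any rules that use plain CIDR sources.
    """
    return prefix_list_entries * rules_referencing_list + other_rules


def within_quota(count: int, quota: int = RAISED_QUOTA) -> bool:
    return count <= quota


if __name__ == "__main__":
    scenarios = {
        "original stack (9 entries, 13 rules, 4 extra rules)": effective_rule_count(9, 13, 4),
        "test with 4 entries (4 extra rules kept)": effective_rule_count(4, 13, 4),
        "final stack (9 entries, duplicates removed)": effective_rule_count(9, 13, 0),
    }
    for name, count in scenarios.items():
        print(f"{name}: {count} rules, within quota: {within_quota(count)}")
```

The original stack works out at 121 rules, one over the raised quota of 120, while removing the 4 duplicate rules brings it down to 117.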
Hope this helps others if they ever encounter a similar issue.