I have a rather complicated setup in my AWS console.

  1. I have an EC2 instance in region A with LAMP installed for what I will call my CRM.
  2. I have an RDS in the same region A for my CRM that contains the information from the orders / clients I have.
  3. I have an EC2 instance in a region B with LAMP installed that I will call my "Shopping cart".
  4. I have an RDS in the same region B with the database for my shopping cart.
  5. Somewhat minor detail (I think): I have two other EC2 instances in regions C and D with LAMP installed that are secondary "shopping carts". They also have their own RDS instances.

The two primary EC2 servers talk to each other via CURL calls. So when an order comes in on my EC2 server B, a curl call is made to my EC2 server A to insert the order, add client information, etc. Server A can also make CURL calls to my server B to update prices, etc., and server B can make CURL calls to server A to get current shipping prices to a city.
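
For context, each of those cross-server calls is just an HTTP request made from PHP. A rough sketch of what one looks like is below; the endpoint, field names and timeout values are placeholders rather than my actual code, but the relevant point is that without explicit timeouts a slow response on one side can leave Apache workers (and their database connections) hanging on the other:

<?php
// Illustrative sketch only: URL, fields and timeout values are placeholders.
function sendOrderToCrm(array $order)
{
    $ch = curl_init('https://crm-a.example.com/api/insert_order.php'); // hypothetical endpoint
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($order),
        CURLOPT_CONNECTTIMEOUT => 5,   // give up connecting after 5 seconds
        CURLOPT_TIMEOUT        => 15,  // abort the whole request after 15 seconds
    ]);
    $response = curl_exec($ch);
    if ($response === false) {
        error_log('CRM call failed: ' . curl_error($ch));
    }
    curl_close($ch);
    return $response;
}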

Now the problem I am having is that yesterday, around 4 AM, my RDS B instance started flooding with connections and bumped into its limit of 50 simultaneous connections. So I upgraded from t2.small to t2.medium and I now have 90 simultaneous connections, but the problem persists: it keeps hitting the 90-connection limit anywhere from every couple of minutes to every half hour.

I also upgraded my EC2 A instance, but again that changed nothing. When I run the following on my RDS B instance, I typically get 6-10 threads, but occasionally it starts to spike and, when it does, it typically reaches 90 connections within one or two minutes.

SHOW status LIKE 'Threads_connected';

+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_connected | 6     |
+-------------------+-------+
1 row in set (0.01 sec)
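
To catch the moment it starts spiking, something like the following could poll that counter and log it. This is only a rough sketch; the host, credentials and alert threshold are placeholders:

<?php
// Rough monitoring sketch: poll Threads_connected and log when it climbs,
// so the spike can be lined up with the Apache access logs.
// Host, credentials and the threshold are placeholders.
$mysqli = new mysqli('rds-b.example.com', 'app_user', 'secret');

while (true) {
    $result = $mysqli->query("SHOW STATUS LIKE 'Threads_connected'");
    if ($result === false) {
        break;                         // connection lost, stop polling
    }
    $row = $result->fetch_assoc();
    $connected = (int) $row['Value'];
    if ($connected > 30) {             // arbitrary alert threshold
        error_log(date('c') . " Threads_connected = $connected");
    }
    sleep(5);
}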

Running the following command on my RDS B instance shows that it is dropping connections when I hit the 90 simultaneous connection limit:

show status like 'Conn%';

+-----------------------------------+--------+
| Variable_name                     | Value  |
+-----------------------------------+--------+
| Connection_errors_accept          | 0      |
| Connection_errors_internal        | 0      |
| Connection_errors_max_connections | 6856   |
| Connection_errors_peer_address    | 0      |
| Connection_errors_select          | 0      |
| Connection_errors_tcpwrap         | 0      |
| Connections                       | 123258 |
+-----------------------------------+--------+
7 rows in set (0.03 sec)

Whenever I get to 90 connections on RDS B, my EC2 A instance slows to a crawl and the connections spike on the RDS A instance. My EC2 B instance then starts returning HTTP 500 errors because the mysqli connection fails with "too many connections".
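
For what it's worth, the 500s come from the connect step itself failing; the code on EC2 B follows the standard mysqli pattern, roughly like this sketch (host and credentials are placeholders):

<?php
// Sketch of the connect path on EC2 B; host and credentials are placeholders.
$mysqli = @new mysqli('rds-b.example.com', 'app_user', 'secret', 'cart');

if ($mysqli->connect_errno) {
    // MySQL error 1040 is "Too many connections" -- this is what starts
    // firing once RDS B hits its max_connections limit.
    error_log('DB connect failed (' . $mysqli->connect_errno . '): ' . $mysqli->connect_error);
    http_response_code(500);
    exit;
}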

Finally, if I run the following on either the RDS A or RDS B instance, I see lots of connections sleeping, but hardly ever any actually querying:

SHOW FULL PROCESSLIST;
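
To see which client host is actually holding all of those sleeping connections, something like this can group the process list by host and command. It is only a sketch; the host and credentials are placeholders, and the user needs the PROCESS privilege to see every row:

<?php
// Sketch: group the process list by client host and command to see which
// web server is holding the sleeping connections.
// Host and credentials are placeholders.
$mysqli = new mysqli('rds-b.example.com', 'admin_user', 'secret');

$sql = "SELECT SUBSTRING_INDEX(HOST, ':', 1) AS client,
               COMMAND,
               COUNT(*)  AS connections,
               MAX(TIME) AS longest_seconds
        FROM information_schema.PROCESSLIST
        GROUP BY client, COMMAND
        ORDER BY connections DESC";

foreach ($mysqli->query($sql) as $row) {
    printf("%-20s %-10s %5d %8d\n",
        $row['client'], $row['COMMAND'], $row['connections'], $row['longest_seconds']);
}
$mysqli->close();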

The temporary "solution" that I have come up with is to restart the Apache service on the EC2 A instance. As soon as I do that, all processes on RDS A and B clear up within a few seconds.

I don't understand how this could just suddenly start happening, or how it can keep happening even after upping the power of my instances. I am out of ideas about where to look next. The only "issue" I am having, as far as I can tell, is that my RDS connection limit is getting hit. EC2 load averages are very good (0.02 right now). I haven't changed any code in the last week that I can think of.

  • Unrelated, but you're always better off with a T3 instance instead of a T2. You'll get better performance for equal (RDS) or less (EC2) cost. Commented Jan 29, 2020 at 5:33
  • Thanks for that tip, @JasonFloyd, I'll definitely look into changing that. Also, I have finally pinpointed the problem: some rogue code introduced by a freelancer.
    – Jonathan
    Commented Jan 29, 2020 at 13:02

1 Answer

I finally found the issue after about 8 hours of searching. There was some rogue code, introduced to one of my websites by a freelancer, that was failing to close its MySQL connections.

Hopefully this will help someone else out. If you are experiencing a similar situation, check the server for files modified recently with:

find . -type f -mtime -$n

Where $n is an integer representing the number of days ago that you started experiencing problems. Run that command in the directory where you expect the change might have taken place.
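
In my case the offending scripts simply never released their connections. The pattern to look for (and the fix) is roughly this; the connection details are placeholders:

<?php
// The problem pattern: scripts (especially long-running ones, or loops that
// open a new connection per iteration) that never release the connection.
// Host and credentials are placeholders.
$mysqli = new mysqli('rds-b.example.com', 'app_user', 'secret', 'cart');

// ... do the work ...

$mysqli->close();   // explicitly release the connection as soon as you are done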
