[scale-infra] web to Centos8

Phil Dibowitz phil at ipom.com
Wed Jan 31 06:08:49 UTC 2024


Oh, and I missed the obvious one: the huge spike in user CPU at the same 
time.

Anyway, this dashboard is locked at a timeframe that shows the spike, 
the outage, and the recovery for anyone wanting to see:

https://app.datadoghq.com/dash/host/14232198117?refresh_mode=paused&view=spans&from_ts=1706629500000&to_ts=1706636700000&live=false
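
For anyone who would rather read the window off the link than open it, here is a minimal sketch that decodes the from_ts/to_ts parameters. It assumes those values are Unix epoch milliseconds (inferred from their size, not confirmed against Datadog's docs) and uses a fixed UTC-8 offset for Pacific time, which holds in January but not during daylight saving time. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

DASHBOARD_URL = (
    "https://app.datadoghq.com/dash/host/14232198117"
    "?refresh_mode=paused&view=spans"
    "&from_ts=1706629500000&to_ts=1706636700000&live=false"
)

# Assumption: a fixed UTC-8 offset stands in for Pacific time (valid for January).
PST = timezone(timedelta(hours=-8), "PST")


def dashboard_window(url, tz=PST):
    """Return (start, end) datetimes for the from_ts/to_ts query parameters.

    Assumes both parameters are Unix epoch timestamps in milliseconds.
    """
    query = parse_qs(urlparse(url).query)
    start_ms = int(query["from_ts"][0])
    end_ms = int(query["to_ts"][0])
    return (
        datetime.fromtimestamp(start_ms / 1000, tz=tz),
        datetime.fromtimestamp(end_ms / 1000, tz=tz),
    )


start, end = dashboard_window(DASHBOARD_URL)
print(start.strftime("%Y-%m-%d %H:%M %Z"), "to", end.strftime("%H:%M %Z"))
# prints "2024-01-30 07:45 PST to 09:45 PST"
```

Under those assumptions the window covers the 7:52 load spike and the 8:10am - 9:39am PT outage described in the quoted mail below.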

On 1/30/24 21:58, Phil Dibowitz wrote:
> Er, you're off by two hours: 8:10am - 9:39am PT, best I can tell.
> 
> I just got home from work, but I did some digging around.
> 
> I don't think it's apache using too much memory. I don't think it's 
> memory related at all.
> 
> Here's useful things I've found:
> 
> * Load average SPIKED a little while before the outage, at around 7:52. 
> It came back down before 8, though.
> 
> * We had a spike in network traffic at exactly the same time. Only 
> 7MiB/s, but way bigger than our usual traffic.
> 
> * There was a spike in memory usage at the same time, but the system was 
> nowhere near out of memory.
> 
> * The last datapoint DD got from Apache was at 7:54
> 
> * Odd thing that I don't think is related but is funny: the amount of 
> memory journald is using drops significantly when apache gets killed, 
> from 215MB to 160MB. At the beginning of the outage it was ~188MB. I 
> just restarted it and it's at 22MB. It was the top user of RSS memory 
> on the box until I restarted it.
> 
> Based on the fact that every OTHER process on the system was reporting 
> into DD, but apache was not, it seems pretty likely that every thread in 
> apache is tied up on _something_ - either talking to a DB, a bad client 
> (seems unlikely since we have Cloudflare in front of us), or something 
> else.
> 
> That's all I got for now.
> 
> 
> On 1/30/24 12:04, Ilan Rabinovitch wrote:
>> Looks like we had another outage for about an hour today.  10am - 
>> 11:36am PT.
>>
>> On Mon, Jan 8, 2024 at 9:53 AM Phil Dibowitz <phil at ipom.com> wrote:
>>
>>     No, I woke up, saw the alert, bounced httpd, and had to run out the
>>     door.
>>
>>     It's a reasonable guess that it's the same memory issue as before.
>>     We're currently bouncing apache in cron every-other-hour, which seems
>>     to mostly keep us up, but occasionally not, so my guess is that's
>>     RIGHT at the threshold.
>>
>>     I don't have the bandwidth to go spelunking - the httpd.conf
>>     adjustments I made in... November or whatever that was seemed to
>>     help, but obviously they didn't solve everything. I still suspect
>>     that Drupal is causing connections to be held open somewhere, but
>>     that's pure speculation.
>>
>>     We have the data, between the apache_status.log and DD, to figure it
>>     out; I'd just need time to do it, and I don't have the cycles right
>>     now.
>>
>>     On 1/7/24 17:11, Ilan Rabinovitch wrote:
>>      > Looks like we had another outage today. Any idea what's up?
>>      >
>>      > On Mon, Dec 25, 2023 at 1:17 AM Ilan Rabinovitch
>>      > <ilan at linuxfests.org> wrote:
>>      >
>>      >     Looks like another outage about 2 hours ago.
>>      >
>>      >     On Sun, Dec 24, 2023 at 8:45 PM Ilan Rabinovitch
>>      >     <ilan at linuxfests.org> wrote:
>>      >
>>      >         Website was down again, looks like for about an hour.
>>      >         I've just restarted httpd.
>>      >
>>      >         On Sat, Oct 21, 2023 at 5:00 PM Phil Dibowitz
>>      >         <phil at ipom.com> wrote:
>>      >
>>      >             Logs look good.
>>      >
>>      >             https://github.com/socallinuxexpo/scale-chef/pull/302
>>      >             drops restarts to every 2 hours. Merged.
>>      >
>>      >             On 10/21/23 13:47, Phil Dibowitz wrote:
>>      >              > I dropped it to less often a few days ago, but I haven't
>>      >              > yet looked at the logs to see what the status of httpd is
>>      >              > at those times. My plan is to do that, then drop it to
>>      >              > less often, and rinse and repeat.
>>      >              >
>>      >              > Sorry, new job is keeping me super busy.
>>      >              >
>>      >              > On 10/21/23 08:25, Ilan Rabinovitch wrote:
>>      >              >> Are we able to remove the cron job that's restarting httpd?
>>      >              >>
>>      >              >> On Sun, Oct 8, 2023 at 12:48 AM Phil Dibowitz
>>      >              >> <phil at ipom.com> wrote:
>>      >              >>>
>>      >              >>> On 10/7/23 14:30, Phillip Smith wrote:
>>      >              >>>> Yes, let's schedule something. I'm available tomorrow
>>      >              >>>> morning and Tuesday evening.
>>      >              >>>
>>      >              >>> What can I provide you in the meantime? I thought I saw
>>      >              >>> an email from you saying you wanted Datadog access, but I
>>      >              >>> can't seem to find it.
>>      >              >>>
>>      >              >>> Davide may have time Tuesday, I will have time Wed evening.
>>      >              >>> I'm out of town until Monday and have plans Mon/Tues
>>      >              >>> evening.
>>      >              >>>
>>      >              >>> - Phil
>>      >              >>>
>>      >              >>>
>>      >              >>> _______________________________________________
>>      >              >>> scale-infra mailing list
>>      >              >>> scale-infra at lists.linuxfests.org
>>      >              >>> https://lists.linuxfests.org/cgi-bin/mailman/listinfo/scale-infra
>>      >              >
>>      >
>>      >             --
>>      >             Phil Dibowitz                             phil at ipom.com
>>      >             Open Source software and tech docs        Insanity Palace of Metallica
>>      >             http://www.phildev.net/                   http://www.ipom.com/
>>      >
>>      >             "Be who you are and say what you feel, because those who mind don't
>>      >               matter and those who matter don't mind."
>>      >               - Dr. Seuss
>>      >
>>      >
>>
> 
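
The theory quoted above, that every apache thread was tied up on something, could be checked against worker-state snapshots. Below is a minimal sketch, assuming the apache_status.log mentioned above holds periodic captures of mod_status machine-readable (?auto) output, which includes a "Scoreboard:" line with one character per worker slot. The log's actual format is not shown in this thread, so treat that as an assumption; the function names and the sample text are made up for illustration.

```python
from collections import Counter

# Scoreboard key as documented for Apache mod_status.
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def summarize_scoreboard(status_text):
    """Count worker states from the first 'Scoreboard:' line in the text."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(board)
    raise ValueError("no Scoreboard line found")


def no_idle_workers(counts):
    """True when no worker is sitting idle ('_') waiting for a connection."""
    return counts["_"] == 0


# Made-up sample, not taken from the real apache_status.log.
SAMPLE = "BusyWorkers: 8\nIdleWorkers: 0\nScoreboard: WWWWKWRW\n"

counts = summarize_scoreboard(SAMPLE)
for state, n in counts.most_common():
    print(f"{n:3d}  {state}  {SCOREBOARD_KEY.get(state, 'unknown')}")
print("no idle workers:", no_idle_workers(counts))
```

If snapshots from just before 7:54 show no idle workers and mostly "W" or "R" states, that would fit the tied-up-workers theory; it would not by itself say what they were waiting on.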

-- 
Phil Dibowitz                             phil at ipom.com
Open Source software and tech docs        Insanity Palace of Metallica
http://www.phildev.net/                   http://www.ipom.com/

"Be who you are and say what you feel, because those who mind don't
  matter and those who matter don't mind."
  - Dr. Seuss



