During the monitoring period, the customer's task reported a large number of errors. The error description is "operation timeout (page)", and the errors lowered Availability. Can you help find the cause?
(1) Check whether the error reports have anything in common
① Region: The errors are scattered geographically; they occur on visits from all parts of the country.
② Performance indicators: The elapsed time is concentrated in the DNS phase, and the error is reported immediately after DNS.
③ Other common characteristics: All of the Probes that reported the error are running Chrome 90.
(2) View the captured packets: DNS resolved the IP successfully, so DNS is not the problem.
(3) Reproduce with Instant Testing: The test ran normally, so the problem does not reproduce every time.
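The commonality check in step (1) can be sketched in code. The following is a minimal Python sketch, assuming the error reports have been exported as a list of records; the field names (region, browser, last_phase), the sample values, and the 0.9 threshold are hypothetical and are not taken from the monitoring product.

```python
from collections import Counter


def dominant_values(records, fields, threshold=0.9):
    """Return {field: (value, share)} for every field in which a single
    value accounts for at least `threshold` of the records."""
    result = {}
    if not records:
        return result
    for field in fields:
        counts = Counter(r.get(field) for r in records)
        value, n = counts.most_common(1)[0]
        share = n / len(records)
        if share >= threshold:
            result[field] = (value, share)
    return result


# Hypothetical error records, for illustration only.
errors = [
    {"region": "Beijing", "browser": "Chrome 90", "last_phase": "DNS"},
    {"region": "Shanghai", "browser": "Chrome 90", "last_phase": "DNS"},
    {"region": "Guangzhou", "browser": "Chrome 90", "last_phase": "DNS"},
    {"region": "Chengdu", "browser": "Chrome 90", "last_phase": "DNS"},
]

# Fields shared by (almost) all errors are candidates for a common cause.
print(dominant_values(errors, ["region", "browser", "last_phase"]))
```

With these sample records, region does not appear in the result because no single region dominates, while browser and last_phase do, which mirrors the findings in ① to ③.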
(1) The common factor is that the errors all occurred on Chrome 90.
(2) Instant Testing does not always reproduce the error, so the problem is intermittent.
(3) The packet capture shows that DNS actually resolved successfully, so DNS is not the problem.
(4) The phase after DNS is TCP connection establishment, so the error is caused by a failure to establish the TCP connection.
(5) Once DNS resolution succeeds, the client must have sent a TCP connection request, so the error is most likely caused by the server not responding.
(6) The packet capture is for reference only. The absence of the TCP connection in the capture means only that the capture did not record it, not that the client could not send it.
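To check this reasoning independently of the browser, DNS resolution and the TCP handshake can be timed separately over repeated attempts. The following is a minimal Python sketch using only the standard socket module; the function names probe_once and probe_many, the attempt count, and the timeout are assumptions made for illustration, and the script is not part of the monitoring product.

```python
import socket
import time


def probe_once(host, port, timeout=5.0):
    """Time DNS resolution and the TCP handshake separately for one attempt."""
    result = {"dns_ok": False, "dns_ms": None,
              "tcp_ok": False, "tcp_ms": None, "error": None}

    # Phase 1: DNS resolution.
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS failed: {exc}"
        return result
    result["dns_ok"] = True
    result["dns_ms"] = (time.monotonic() - start) * 1000

    # Phase 2: TCP connection to the first resolved address.
    family, socktype, proto, _, sockaddr = infos[0]
    start = time.monotonic()
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        result["tcp_ok"] = True
    except OSError as exc:  # includes timeouts and refused connections
        result["error"] = f"TCP connect failed: {exc}"
    result["tcp_ms"] = (time.monotonic() - start) * 1000
    return result


def probe_many(host, port, attempts=20, timeout=5.0):
    """Repeat the probe and count attempts where DNS succeeded but TCP failed."""
    results = [probe_once(host, port, timeout) for _ in range(attempts)]
    tcp_failures = sum(1 for r in results if r["dns_ok"] and not r["tcp_ok"])
    return results, tcp_failures
```

Calling probe_many with the monitored host and port 443 from a machine close to an affected Probe would show how often DNS succeeds while the TCP connection fails or times out. A nonzero count on only some runs would be consistent with an intermittent server-side failure to respond, although it does not by itself rule out a network path issue.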