Score:4

How can I see the httpd log for outbound connections?

sa flag

The access log specified in httpd.conf for a website shows only incoming connections. How can I get a log of outbound connections, such as those made by PHP's file_get_contents function?

MonkeyZeus avatar
in flag
Does this answer your question? [Finding outbound connections from Apache/PHP](https://serverfault.com/questions/154601/finding-outbound-connections-from-apache-php)
Score:10
in flag

Modifying a third-party PHP application may not be feasible. An HTTP proxy should be used instead, for at least two reasons:

  1. Every time a PHP script accesses an external resource, the request goes through the proxy, which has its own access log.
  2. The proxy should have access control rules that allow only certain addresses and block all others.
peter avatar
sa flag
Your solution sounds promising. I added a proxy virtual host to httpd.conf and added http_proxy=localhost:3128 to /etc/environment to try to catch the outbound connections made by PHP's file_get_contents, but it does not work. (The proxy itself works: it logs requests made by wget.) See my other question: https://serverfault.com/questions/1115758/how-to-configure-a-system-wide-proxy-for-php-file-get-contents-on-centos
Score:5
us flag

There is no solution that can simply be switched on; this functionality has to be implemented.

One possible approach is:

Make a wrapper around the functions that make outgoing requests:

function log_file_get_contents( $url ) {
    log_request( $url ); // A separate logging function that you create
    return file_get_contents( $url ); // Return the response so the wrapper is a drop-in replacement
}

Then, use log_file_get_contents() for all requests that you want to be logged.

A similar wrapper function needs to be written for other functions that are used for outgoing requests.

peter avatar
sa flag
The problem is that I do not know which scripts made the outbound connections. I have several websites on one server. Using the top command, I found many httpd processes exhausting memory, and using the netstat command, I found that those httpd processes are connecting to an external IP address.
peter avatar
sa flag
Is it possible to locate the script that issues the outbound connections?
jcaron avatar
co flag
@peter `netstat -np` (running as root or as the user running `httpd`) will tell you which process owns which connection. You can then use `apache2ctl fullstatus` to find the current request associated with that PID.
peter avatar
sa flag
@jcaron I could not find any entry for outbound connections in the report of "apache2ctl fullstatus". It just shows information for incoming requests.
jcaron avatar
co flag
@peter You won't find the outbound connections listed. But `netstat -np` will give you the pid of the process which makes the outgoing request, and you can then look up that pid in the list of incoming requests to find which incoming request results in the outgoing request.
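
The lookup described above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `httpd_outbound` is made up, the process is assumed to be called `httpd` unless another name is passed, and only remote ports 80 and 443 are treated as outbound web requests. It reads `netstat -tnp` output on standard input and prints the owning PID and the remote address.

```shell
# Hypothetical helper: read `netstat -tnp` output on stdin and print
# "PID REMOTE_ADDRESS" for TCP connections owned by the web server
# whose remote (foreign) port is 80 or 443.
# $1 = process name to match (assumption: defaults to "httpd").
httpd_outbound() {
    awk -v name="${1:-httpd}" '
        # $5 is the foreign address, $7 is "PID/Program name"
        $1 ~ /^tcp/ && $5 ~ /:(80|443)$/ {
            split($7, owner, "/")
            if (owner[2] == name) print owner[1], $5
        }'
}
```

Run as root, `netstat -tnp | httpd_outbound` would list the candidate PIDs, and each PID can then be searched for in the output of `apache2ctl fullstatus` to see which incoming request that worker is serving. Incoming client connections are skipped because their remote port is normally not 80 or 443.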
Score:1
in flag

There is no readily available log. You would need a solution tailored to your OS.


If you are the programmer then you can look into redeclaring native PHP functions by making use of namespaces and the auto_prepend_file php.ini directive.

<?php
namespace override;
function file_get_contents( string $filename, $use_include_path = false, $context = null, $offset = 0, $length = null )
{
    // If $filename seems like a URL then do log stuff
    if( preg_match( '/^https?:\\/\\//i', $filename ))
    {
        // Do log stuff
        echo 'Doing log stuff for '.$filename;
    }
    
    return \file_get_contents( $filename, $use_include_path, $context, $offset, $length );
}

\override\file_get_contents( 'https://onlinephp.io/' );

Output:

Doing log stuff for https://onlinephp.io/

Additionally, you would have to make sure to do this with any other functions or classes which make web requests such as:

curl_init("http://www.example.com/");

curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");

curl_setopt_array($ch, array(CURLOPT_URL => 'http://www.example.com/'));

fopen('http://www.example.com/', 'r');

And possibly more which I am not aware of.

Score:1
us flag

Adapting my answer to How to easily get all HTTPS addresses that an application connects to externally?:

From Monitoring files continuously with lsof, you could use lsof in conjunction with the repeat (-r) option. The following repeats every two seconds

$ lsof -i TCP:80,443 -r 2

which will give you a progressive historical log every 2 seconds:

=======
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
firefox 9542 user   27u  IPv4 1068219      0t0  TCP user-300V3Z-300V4Z-300V5Z:37360->192.0.78.23:https (ESTABLISHED)
firefox 9542 user   48u  IPv4 1053405      0t0  TCP user-300V3Z-300V4Z-300V5Z:45948->ec2-54-213-37-69.us-west-2.compute.amazonaws.com:https (ESTABLISHED)
=======
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
firefox 9542 user   27u  IPv4 1068219      0t0  TCP user-300V3Z-300V4Z-300V5Z:37360->192.0.78.23:https (ESTABLISHED)
firefox 9542 user   48u  IPv4 1053405      0t0  TCP user-300V3Z-300V4Z-300V5Z:45948->ec2-54-213-37-69.us-west-2.compute.amazonaws.com:https (ESTABLISHED)
firefox 9542 user   52u  IPv4 1138942      0t0  TCP user-300V3Z-300V4Z-300V5Z:57602->kul08s01-in-f10.1e100.net:https (SYN_SENT)
firefox 9542 user  102u  IPv4 1139934      0t0  TCP user-300V3Z-300V4Z-300V5Z:49102->kul09s13-in-f14.1e100.net:https (ESTABLISHED)
firefox 9542 user  110u  IPv4 1138950      0t0  TCP user-300V3Z-300V4Z-300V5Z:49104->kul09s13-in-f14.1e100.net:https (SYN_SENT)
=======
...
=======
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
firefox 9542 user   27u  IPv4 1068219      0t0  TCP user-300V3Z-300V4Z-300V5Z:37360->192.0.78.23:https (ESTABLISHED)
firefox 9542 user   48u  IPv4 1053405      0t0  TCP user-300V3Z-300V4Z-300V5Z:45948->ec2-54-213-37-69.us-west-2.compute.amazonaws.com:https (ESTABLISHED)
firefox 9542 user   51u  IPv4 1140129      0t0  TCP user-300V3Z-300V4Z-300V5Z:52284->kul09s13-in-f10.1e100.net:https (ESTABLISHED)
firefox 9542 user  108u  IPv4 1137384      0t0  TCP user-300V3Z-300V4Z-300V5Z:55886->103.229.10.236:https (ESTABLISHED)
firefox 9542 user  122u  IPv4 1137399      0t0  TCP user-300V3Z-300V4Z-300V5Z:55870->kul08s12-in-f1.1e100.net:https (ESTABLISHED)
firefox 9542 user  126u  IPv4 1137402      0t0  TCP user-300V3Z-300V4Z-300V5Z:47370->stackoverflow.com:https (SYN_SENT)

Note: Each two-second interval is separated by =======.

You could then redirect the output to a file, like so:

$ lsof -i TCP:80,443 -r 2 > /tmp/http_out.log

If you don't want to log all outgoing HTTP(S) requests, you could grep for the name of your script/process:

$ lsof -i TCP:80,443 -r 2 | grep <name of your process>

I think that the grep should work, but I'm not able to test it.
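
As an alternative to a plain grep, here is a minimal sketch that matches on the COMMAND column only, so a hostname that happens to contain the process name does not produce false matches. The function name `lsof_by_command` is hypothetical, and the column positions assume the default lsof layout shown above (COMMAND in column 1, PID in column 2, NAME in column 9). `fflush()` is called after each line so that the log file fills in as lsof repeats rather than when a buffer fills.

```shell
# Hypothetical helper: read lsof output on stdin and print "PID NAME"
# for every row whose COMMAND column equals the given process name.
# $1 = process name as lsof shows it (lsof truncates long names by default).
lsof_by_command() {
    awk -v cmd="$1" '
        # Header rows and the "=======" separators never match cmd
        $1 == cmd { print $2, $9; fflush() }'
}
```

A possible use would be `lsof -i TCP:80,443 -r 2 | lsof_by_command httpd >> /tmp/http_out.log`, where `httpd` and the log path are placeholders for your own setup.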


Admittedly, the output isn't as pretty as using

watch -n1 lsof -i TCP:80,443 

but this would only give you an instantaneous snapshot of the current outgoing requests:

dropbox    3280 saml   23u  IPv4 56015285      0t0  TCP greeneggs.qmetricstech.local:56003->snt-re3-6c.sjc.dropbox.com:http (ESTABLISHED) 
thunderbi  3306 saml   60u  IPv4 56093767      0t0  TCP greeneggs.qmetricstech.local:34788->ord08s09-in-f20.1e100.net:https (ESTABLISHED) 
mono       3322 saml   15u  IPv4 56012349      0t0  TCP greeneggs.qmetricstech.local:54018->204-62-14-135.static.6sync.net:https (ESTABLISHED) 
chrome    11068 saml  175u  IPv4 56021419      0t0  TCP greeneggs.qmetricstech.local:42182->stackoverflow.com:http (ESTABLISHED) 

Again, to restrict the output to only the PHP process, you might be able to use grep

watch -n1 'lsof -i TCP:80,443 | grep <name of your process>'