PHP - Async cURL Requests
APIs are the hot new thing and we use them more and more frequently in our code. Sometimes we need to perform lots of API requests that aren't dependent on each other. For example, maybe you need to GET a bunch of information from multiple sources before stitching it together. You could massively improve the performance of such code by performing the requests asynchronously rather than sequentially. Your code would then take roughly as long as the slowest request, rather than the sum of all of them.
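For comparison, a purely sequential version might look like the sketch below (the localhost URLs are just placeholders): each curl_exec() call blocks until its response arrives, so the total runtime is the sum of all the individual request times.
<?php

// Sequential baseline (sketch): each request blocks until it completes,
// so the total time is roughly the sum of every individual request time.
$urls = array('http://localhost', 'http://localhost', 'http://localhost');

$results = array();

foreach ($urls as $i => $url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $results[$i] = curl_exec($ch);
    curl_close($ch);
}

print "response data: " . print_r($results, true);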
There are two main ways to perform asynchronous requests. You can either go "native" with the built-in curl_multi_* functions, or you can use the Guzzle library to simplify your life. Guzzle uses these curl_multi functions under the hood, but provides you with an easier interface and documentation.
Native
The code below will send off a bunch of requests at once and print out the response data. You will not be able to process any of the responses until they have all returned.
<?php

class ParallelGet
{
    function __construct($urls)
    {
        // Create GET requests for each URL
        $mh = curl_multi_init();
        $ch = array();

        foreach ($urls as $i => $url)
        {
            $ch[$i] = curl_init($url);
            curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
            curl_multi_add_handle($mh, $ch[$i]);
        }

        // Start performing the requests
        do {
            $execReturnValue = curl_multi_exec($mh, $runningHandles);
        } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);

        // Loop and continue processing the requests
        while ($runningHandles && $execReturnValue == CURLM_OK)
        {
            // Wait for activity on any of the handles. If select fails,
            // sleep briefly instead of busy-looping.
            if (curl_multi_select($mh) == -1)
            {
                usleep(100);
            }

            do {
                $execReturnValue = curl_multi_exec($mh, $runningHandles);
            } while ($execReturnValue == CURLM_CALL_MULTI_PERFORM);
        }

        // Check for any errors
        if ($execReturnValue != CURLM_OK)
        {
            trigger_error("Curl multi read error $execReturnValue\n", E_USER_WARNING);
        }

        // Extract the content
        $res = array();

        foreach ($urls as $i => $url)
        {
            // Check for errors
            $curlError = curl_error($ch[$i]);

            if ($curlError == "")
            {
                $responseContent = curl_multi_getcontent($ch[$i]);
                $res[$i] = $responseContent;
            }
            else
            {
                print "Curl error on handle $i: $curlError\n";
            }

            // Remove and close the handle
            curl_multi_remove_handle($mh, $ch[$i]);
            curl_close($ch[$i]);
        }

        // Clean up the curl_multi handle
        curl_multi_close($mh);

        // Print the response data
        print "response data: " . print_r($res, true);
    }
}

$urls = array(
    'http://localhost',
    'http://localhost',
    'http://localhost',
    'http://localhost',
);

$getter = new ParallelGet($urls);
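To see the speedup for yourself, you could replace the last line of the example with a rough wall-clock timer (a quick sketch; it simply times the constructor, which performs all of the requests). Against the test server described further down, the elapsed time should be close to the slowest single request rather than the sum of them all.
// Rough timing check (sketch): the ParallelGet constructor performs all of
// the requests, so the elapsed time should be close to the slowest request,
// not the sum of them.
$start = microtime(true);
$getter = new ParallelGet($urls);
print "Took " . round(microtime(true) - $start, 2) . " seconds." . PHP_EOL;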
Guzzle
The first thing you need to do is install Guzzle with:
composer require guzzlehttp/guzzle
There are two examples below, taken from the Guzzle documentation. With these, it is easy to process each response as it comes in, rather than waiting for them all to return.
Example 1
<?php
require_once(__DIR__ . '/vendor/autoload.php');

$client = new GuzzleHttp\Client();

$promises = [
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
    $client->getAsync('http://localhost')->then(function ($response) { echo $response->getBody(); }),
];

// Wait for the requests to complete; this throws an exception if any of them fail
$results = GuzzleHttp\Promise\unwrap($promises);

// Alternatively, wait for the requests to complete even if some of them fail
$results = GuzzleHttp\Promise\settle($promises)->wait();

print "finished." . PHP_EOL;
Example 2
This is another example that does the same thing, but is written in a slightly different way.
<?php
require_once(__DIR__ . '/vendor/autoload.php');

$client = new GuzzleHttp\Client([
    'base_uri' => 'http://localhost', // Base URI is used with relative requests
    'timeout'  => 20.0,               // You can set any number of default request options.
]);

$promises = array();

for ($i = 0; $i < 20; $i++)
{
    $promise = $client->getAsync('http://localhost');
    $promise->then(function ($response) { echo $response->getBody() . PHP_EOL; });
    $promises[] = $promise;
}

// Wait for each promise to resolve; the then() callbacks print the
// responses as they complete.
foreach ($promises as $promise)
{
    $response = $promise->wait();
}

print "finished." . PHP_EOL;
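If you have a large number of requests and want to cap how many are in flight at once, the Guzzle quickstart documentation (see the references below) also provides a GuzzleHttp\Pool helper. A minimal sketch, assuming the same localhost test server and a concurrency limit of 5:
<?php
require_once(__DIR__ . '/vendor/autoload.php');

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();

// Generator that lazily yields the requests to send
$requests = function ($total) {
    for ($i = 0; $i < $total; $i++)
    {
        yield new Request('GET', 'http://localhost');
    }
};

$pool = new Pool($client, $requests(20), [
    'concurrency' => 5, // At most 5 requests are in flight at any one time
    'fulfilled' => function ($response, $index) {
        echo $response->getBody() . PHP_EOL;
    },
    'rejected' => function ($reason, $index) {
        echo "Request $index failed: " . $reason->getMessage() . PHP_EOL;
    },
]);

// Start the transfers and wait for the pool to complete
$pool->promise()->wait();

print "finished." . PHP_EOL;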
Testing
To test that the code works, run an Apache or Nginx webserver serving the following index.php file:
<?php
print "Time start: " . time() . PHP_EOL;
$sleepAmount = rand(1, 10);
print "Sleeping for $sleepAmount seconds." . PHP_EOL;
sleep($sleepAmount);
print "Time end: " . time() . PHP_EOL;
Note that PHP's built-in webserver (php -S localhost:8000) will not work, as it can only handle one request at a time.
When you run the code, you should get output similar to below:
Time start: 1500815846
Sleeping for 2 seconds.
Time end: 1500815848
Time start: 1500815846
Sleeping for 2 seconds.
Time end: 1500815848
Time start: 1500815846
Sleeping for 3 seconds.
Time end: 1500815849
Time start: 1500815846
Sleeping for 4 seconds.
Time end: 1500815850
Time start: 1500815846
Sleeping for 4 seconds.
Time end: 1500815850
Time start: 1500815846
Sleeping for 7 seconds.
Time end: 1500815853
Time start: 1500815846
Sleeping for 8 seconds.
Time end: 1500815854
Time start: 1500815846
Sleeping for 8 seconds.
Time end: 1500815854
Time start: 1500815846
Sleeping for 9 seconds.
Time end: 1500815855
finished.
The key point here is that the "Sleeping for" messages are in order of duration, and the Time start values are all the same. This shows that the webserver was hit with all of the requests at the same time; the requests with the shorter sleep durations finished first and their responses were output first, which is why the output is in order of sleep duration.
References
- Guzzle Documentation - Async Requests
- Guzzle Documentation - Quickstart - Concurrent Requests
- Stack Overflow - PHP - curl_multi_exec never completes
First published: 16th August 2018