Streaming Import from PHP Apps
‘fluent-logger-php’ is used to import data from PHP applications to Treasure Data.
This article explains how to use the fluent-logger-php library.
Prerequisites
- Basic knowledge of PHP.
- Basic knowledge of Treasure Data, including the toolbelt.
- PHP 5.3 or higher (for local testing).
Note: The fluent-logger-php library does not work on Heroku (here's why) or EngineYard.
fluent-logger-php requires ‘td-agent’ to be installed on your application servers. td-agent is a daemon program dedicated to the streaming upload of any kind of time-series data. td-agent is developed and maintained by Treasure Data, Inc.
The fluent-logger-php library enables PHP applications to post records to their local td-agent. td-agent in turn uploads the data to the cloud every 5 minutes. Because the daemon runs on a local node, the logging latency is negligible.
To set up td-agent, please refer to the following articles; we provide deb/rpm packages for Linux systems.
| If you have... | Please look at... |
|---|---|
| MacOS X | Installing td-agent on MacOS X |
| Debian / Ubuntu System | Installing td-agent for Debian and Ubuntu |
| Redhat / CentOS System | Installing td-agent for Redhat and CentOS |
| Joyent SmartOS | Installing fluentd + td plugin on Joyent SmartOS |
| AWS Elastic Beanstalk | Installing td-agent on AWS Elastic Beanstalk |
Note: td-agent is fully open-sourced under the fluentd project. td-agent extends fluentd with custom plugins for Treasure Data.
Next, look up your API key, which td-agent uses as its authentication key. You can view it with the td apikey:show command.
Note: You must first authenticate your account using the ‘td account’ command.
$ td apikey:show
3b7118fd3ad7e35bbd3c0e4f607ec7263aa93c30
Then set the apikey option in your td-agent.conf file.
Note: YOUR_API_KEY should be your actual apikey string.
# Unix Domain Socket Input
<source>
  type unix
  path /var/run/td-agent/td-agent.sock
</source>

# Treasure Data Output
<match td.*.*>
  type tdlog
  apikey YOUR_API_KEY
  auto_create_table
  buffer_type file
  buffer_path /var/log/td-agent/buffer/td
  use_ssl true
</match>
Please restart your agent once these lines are in place.
$ sudo /etc/init.d/td-agent restart
td-agent will now accept data via the Unix domain socket (/var/run/td-agent/td-agent.sock), buffer it (/var/log/td-agent/buffer/td), and automatically upload it into the cloud.
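Before wiring up the logger, it can help to confirm from PHP that the socket is actually accepting connections. The following is a minimal sketch, not part of fluent-logger-php: the function name tdAgentSocketIsReachable is hypothetical, and the socket path is assumed to match the path set in the <source> section of td-agent.conf.

```php
<?php
// Hypothetical helper (not part of fluent-logger-php): returns true when a
// Unix domain socket at $path exists and accepts a connection.
function tdAgentSocketIsReachable($path, $timeoutSeconds = 1.0)
{
    if (!file_exists($path)) {
        // td-agent is not running, or it is configured with a different path.
        return false;
    }

    $errno = 0;
    $errstr = '';
    // "@" suppresses the warning PHP raises on a failed connection;
    // the return value is checked instead.
    $conn = @stream_socket_client('unix://' . $path, $errno, $errstr, $timeoutSeconds);
    if ($conn === false) {
        return false;
    }

    fclose($conn);
    return true;
}

// Assumed path: the one used in the <source> section of td-agent.conf.
$socketPath = '/var/run/td-agent/td-agent.sock';
echo tdAgentSocketIsReachable($socketPath)
    ? "td-agent socket is reachable\n"
    : "td-agent socket is NOT reachable\n";
```

If the check fails right after a restart, the usual suspects are a td-agent process that did not start or a path that differs from the one in your configuration.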
To use fluent-logger-php, copy the library into your project directory.
$ git clone https://github.com/fluent/fluent-logger-php.git
$ mkdir -p <path/to/your_project>/src
$ cp -r fluent-logger-php/src/Fluent <path/to/your_project>/src/
Next, initialize and post the records as follows.
<?php
require_once __DIR__.'/src/Fluent/Autoloader.php';

use Fluent\Logger\FluentLogger;

Fluent\Autoloader::register();

$logger = new FluentLogger("unix:///var/run/td-agent/td-agent.sock");
$logger->post("td.test_db.test_table", array("hello"=>"world"));
$logger->post("td.test_db.follow", array("from"=>"userA", "to"=>"userB"));
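To make the post() calls more concrete, the sketch below builds the kind of event that a tag and a record turn into. It assumes the [tag, time, record] message form used by Fluentd, serialized here as JSON. The function buildEvent is hypothetical and only for illustration; fluent-logger-php does its own packing, so you never need to call anything like this yourself.

```php
<?php
// Hypothetical illustration (not part of fluent-logger-php): combine a tag,
// a timestamp, and a record into a JSON-encoded [tag, time, record] event.
function buildEvent($tag, array $record, $time = null)
{
    if ($time === null) {
        $time = time(); // Unix timestamp in seconds
    }
    return json_encode(array($tag, $time, $record));
}

// A fixed timestamp keeps the output reproducible.
echo buildEvent('td.test_db.test_table', array('hello' => 'world'), 1357000000), "\n";
// prints: ["td.test_db.test_table",1357000000,{"hello":"world"}]
```

The point of the example is that each record carries its tag and a timestamp with it, which is what lets td-agent route it to the right table and treat it as time-series data.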
Confirming Data Import
Run your script (saved as test.php in this example), then send td-agent a SIGUSR1 signal. The signal flushes td-agent’s buffer, so the upload starts immediately.
$ php test.php
$ kill -USR1 `cat /var/run/td-agent/td-agent.pid`
To confirm that your data has been uploaded successfully, issue the td tables command as shown below.
$ td tables
+------------+------------+------+-----------+
| Database   | Table      | Type | Count     |
+------------+------------+------+-----------+
| test_db    | test_table | log  | 1         |
| test_db    | follow     | log  | 1         |
+------------+------------+------+-----------+
Note: The first argument of post() determines the database name and table name. If you specify `td.test_db.test_table`, the data will be imported into the table *test_table* within the database *test_db*. Both the database and the table are created automatically at upload time.
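Because the tag encodes both names, a small helper can keep tags consistent across an application. The sketch below is illustrative only: tdTag and parseTdTag are hypothetical names, not part of fluent-logger-php. It rejects names containing a dot, on the assumption that your td-agent.conf uses the `<match td.*.*>` pattern, which expects exactly three tag parts.

```php
<?php
// Hypothetical helpers, not part of fluent-logger-php.

// Build a "td.<database>.<table>" tag from its two names.
function tdTag($database, $table)
{
    foreach (array($database, $table) as $name) {
        // A dot inside a name would add a fourth tag part,
        // which the <match td.*.*> pattern would not match.
        if ($name === '' || strpos($name, '.') !== false) {
            throw new InvalidArgumentException("Invalid name: '$name'");
        }
    }
    return 'td.' . $database . '.' . $table;
}

// Split a tag back into its database and table names.
function parseTdTag($tag)
{
    $parts = explode('.', $tag);
    if (count($parts) !== 3 || $parts[0] !== 'td' || $parts[1] === '' || $parts[2] === '') {
        throw new InvalidArgumentException("Not a td.<database>.<table> tag: '$tag'");
    }
    return array('database' => $parts[1], 'table' => $parts[2]);
}

echo tdTag('test_db', 'follow'), "\n"; // prints: td.test_db.follow
print_r(parseTdTag('td.test_db.test_table'));
```

With a helper like this, a call such as `$logger->post(tdTag('test_db', 'follow'), $record)` keeps the database and table names in one place instead of scattering tag strings through the code.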
Tips on Production Deployment
Use Apache and mod_php
We recommend that you use Apache and mod_php. Other setups have not been fully validated.
Use Apache prefork MPM
Please use Apache prefork MPM. Other MPMs such as worker MPM should not be used. You can confirm your current settings with the apachectl -V command.
$ apachectl -V | grep MPM:
Server MPM: Prefork
We recommend that you periodically restart your PHP processes by setting MaxRequestsPerChild in your Apache conf.
<IfModule mpm_prefork_module>
  StartServers 32
  MinSpareServers 32
  MaxSpareServers 32
  MaxClients 32
  MaxRequestsPerChild 4096
</IfModule>
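Note: Do not set MaxRequestsPerChild to zero. A value of zero means child processes never expire, so your PHP processes would not be restarted periodically.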
High-Availability Configurations of td-agent
For high-traffic websites (more than 5 application nodes), we recommend using a high availability configuration of td-agent. This will improve data transfer reliability and query performance.
Monitoring td-agent itself is also important. Please refer to this document for general monitoring methods for td-agent.
We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive Query Language.
For more specific assistance, please visit our support resources: