Monitoring the Server¶
iFun Engine Counter¶
iFun Engine exports its internal state via a RESTful API. This feature is called Counter.
The predefined counters are listed in Counters. You can also add your own counters alongside the predefined ones.
How to use the Counter¶
There are various counter-related functions in counters.h. Suppose we have a code snippet like this:
#include <funapi.h>
void example() {
  // Sets item_count to 150.
  // You can choose a counter category, a counter name, and a value.
  UpdateCounter("server", "item_count", 150);
  // A description can also be added.
  // UpdateCounter("server", "item_count", "The number of items", 150);
  // Increases "item_count" in the "server" category by 1.
  IncreaseCounterBy("server", "item_count", 1);
  // Decreases "item_count" in the "server" category by 1.
  DecreaseCounterBy("server", "item_count", 1);
  // Reads "item_count" in the "server" category.
  int64_t item_count = ReadCounterAsInteger("server", "item_count");
  BOOST_ASSERT(item_count == 150);
  // Floating-point values are also possible.
  UpdateCounter("server", "connection_per_second", 77.7);
  // Another counter category and name example.
  UpdateCounter("billing", "purchase_per_second", 7.1);
}
After the C++ code is written, RESTful APIs are automatically generated for the defined counters:
GET http://localhost:8014/v1/counters/
: returns a list of all counter categories.
GET http://localhost:8014/v1/counters/funapi/
: returns a list of counters in the reserved “funapi” category.
GET http://localhost:8014/v1/counters/server/item_count/
: returns the counter “item_count” in the “server” category. It should be 150 in this example.
GET http://localhost:8014/v1/counters/billing/purchase_per_second/
: returns the counter “purchase_per_second” in the “billing” category. It should be 7.1 in this example.
GET http://localhost:8014/v1/counters/server/item_count/description/
: returns the description of the counter, if any. This is useful when you need to work with external engineers.
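The URL layout above follows a fixed pattern of category, counter name, and an optional description suffix. A minimal sketch of that pattern is shown below. CounterUrl is a hypothetical helper written for illustration, not an iFun Engine function, and the base address is simply the one used in the examples above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of iFun Engine): builds a counter URL that
// follows the /v1/counters/<category>/<name>/ pattern listed above.
std::string CounterUrl(const std::string &base,
                       const std::string &category = "",
                       const std::string &name = "",
                       bool description = false) {
  std::string url = base + "/v1/counters/";
  if (category.empty()) {
    return url;  // Lists all counter categories.
  }
  url += category + "/";
  if (name.empty()) {
    return url;  // Lists the counters in one category.
  }
  url += name + "/";
  if (description) {
    url += "description/";  // Asks for the counter's description.
  }
  return url;
}

// Usage example with the address from the documentation above.
void ExampleCounterUrls() {
  const std::string base = "http://localhost:8014";
  assert(CounterUrl(base, "server", "item_count") ==
         "http://localhost:8014/v1/counters/server/item_count/");
}
```

Keeping the pattern in one helper avoids hand-assembled URLs with a missing trailing slash when a monitoring script polls several counters.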
Complex Counters with Callbacks¶
Some counters are too complex to store as a plain value and have to be computed on demand. Suppose we store a sum and a count and want to compute the average dynamically. In this case, we can register a callback for the counter. The callback is invoked each time the counter is read. Please see the example below:
#include <funapi.h>
http::StatusCode OnAverageQueried(const string &counter_group,
                                  const string &counter_name, Json *ret) {
  // If multiple counters share the same callback, you can tell them apart
  // using the "counter_group" and "counter_name" parameters.
  // In this example,
  // "counter_group" must be "server" as we specified the group name below.
  // Similarly, "counter_name" must be "average_users_per_room".
  // Say, we are keeping track of total_users and total_rooms as global variables.
  // To make it more RESTful, we return kNoContent if there is no room.
  if (total_rooms == 0) {
    return http::kNoContent;
  }
  // Computes an average from the sum and the count.
  // The cast avoids integer division if the totals are integers.
  double average = static_cast<double>(total_users) / total_rooms;
  // Stores the result.
  ret->SetDouble(average);
  // Reports that the result is available.
  return http::kOk;
}
void example() {
  // Registers a counter "average_users_per_room" in the "server" category
  // and makes the counter trigger a callback when it is read.
  RegisterCallableCounter(
      "server", "average_users_per_room",
      "Returns the average number of users per game room", OnAverageQueried);
}
GET http://localhost:8014/v1/counters/server/average_users_per_room/
: This API invokes OnAverageQueried to get the average value.
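The idea of a counter whose value is computed at read time can be tried outside the engine with a small stand-in. The sketch below is a simplified model, not iFun Engine code: CallableCounters, RoomStats, and AverageUsersPerRoom are hypothetical names, and a bool return value stands in for the HTTP status code used by the real callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical model of a callable-counter registry (not the engine's API).
// A callback returns false when it has no value to report, which plays the
// role of http::kNoContent in the documentation above.
typedef std::function<bool(const std::string &category,
                           const std::string &name, double *out)>
    CounterCallback;

class CallableCounters {
 public:
  void Register(const std::string &category, const std::string &name,
                CounterCallback callback) {
    callbacks_[std::make_pair(category, name)] = std::move(callback);
  }

  // Runs the callback at read time, so the value is always computed fresh.
  bool Read(const std::string &category, const std::string &name,
            double *out) const {
    auto it = callbacks_.find(std::make_pair(category, name));
    if (it == callbacks_.end()) {
      return false;  // Unknown counter.
    }
    return it->second(category, name, out);
  }

 private:
  std::map<std::pair<std::string, std::string>, CounterCallback> callbacks_;
};

// The stored sum and count from which the average is derived.
struct RoomStats {
  int64_t total_users = 0;
  int64_t total_rooms = 0;
};

bool AverageUsersPerRoom(const RoomStats &stats, double *out) {
  if (stats.total_rooms == 0) {
    return false;  // No rooms, so there is no average to report.
  }
  // Cast first: dividing two integers would truncate the result.
  *out = static_cast<double>(stats.total_users) / stats.total_rooms;
  return true;
}

void ExampleCallableCounter() {
  RoomStats stats;
  CallableCounters counters;
  counters.Register("server", "average_users_per_room",
                    [&stats](const std::string &, const std::string &,
                             double *out) {
                      return AverageUsersPerRoom(stats, out);
                    });
  stats.total_users = 7;
  stats.total_rooms = 2;
  double value = 0.0;
  assert(counters.Read("server", "average_users_per_room", &value));
  assert(value == 3.5);
}
```

Storing only the sum and the count keeps updates cheap, while the division happens only when someone actually queries the counter.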
Monitoring the Counter Values¶
Sometimes we need to watch certain counters and be warned when they exceed a threshold. For example, a spike in the number of rare items can signal an item duplication bug. iFun Engine can set a threshold on a counter of integer or double type.
Here’s an example:
#include <funapi.h>
void OnResetGoldCounterTimerExpired(const Timer::Id &,
                                    const WallClock::Value &) {
  // Resets "gold_per_hour" to 0.
  UpdateCounter("game", "gold_per_hour", 0);
}
void Install() {
  // Registers "gold_per_hour" in the category "game".
  UpdateCounter("game", "gold_per_hour", 0);
  // Requests iFun Engine to report if "gold_per_hour" exceeds 100K.
  // Please note that counter registration is required before setting a threshold.
  MonitorCounter("game", "gold_per_hour", 100000);
  // Registers a timer to reset the counter every hour.
  Timer::ExpireRepeatedly(WallClock::FromSec(3600), OnResetGoldCounterTimerExpired);
}
// Say, this function is invoked when the player earns gold.
void PickGold(int64_t gold) {
  // Increases the amount of gold.
  IncreaseCounterBy("game", "gold_per_hour", gold);
}
Note
Counter registration by either UpdateCounter(), IncreaseCounterBy(), or DecreaseCounterBy() must be done before MonitorCounter().
If the amount of gold exceeds 100K in an hour, we will see a message like this:
W0818 11:03:06.520730 18324 counter.cc:160] The 'gold_per_hour of game' counter exceeded threshold: value=123456, threshold=100000
If MonitorCounter() logged a message every time a value exceeded its threshold, it would generate too many log messages.
Instead, the monitored counters are checked once per interval, and the interval in seconds is given by counter_monitoring_interval_in_sec in MANIFEST.json.
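The difference between checking on every update and checking once per interval can be sketched as follows. This is a simplified model under stated assumptions, not the engine's implementation: IntervalMonitor and its methods are hypothetical names, and CheckOnce() stands for the work done when the monitoring interval elapses.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of interval-based threshold monitoring.
class IntervalMonitor {
 public:
  // Updates never log, no matter how often they happen.
  void Increase(const std::string &name, int64_t delta) {
    values_[name] += delta;
  }

  void Monitor(const std::string &name, int64_t threshold) {
    thresholds_[name] = threshold;
  }

  // Meant to be called once per monitoring interval. Returns at most one
  // warning per monitored counter, regardless of the number of updates.
  std::vector<std::string> CheckOnce() const {
    std::vector<std::string> warnings;
    for (const auto &entry : thresholds_) {
      auto it = values_.find(entry.first);
      if (it != values_.end() && it->second > entry.second) {
        warnings.push_back(entry.first +
                           ": value=" + std::to_string(it->second) +
                           ", threshold=" + std::to_string(entry.second));
      }
    }
    return warnings;
  }

 private:
  std::map<std::string, int64_t> values_;
  std::map<std::string, int64_t> thresholds_;
};

void ExampleIntervalMonitor() {
  IntervalMonitor monitor;
  monitor.Monitor("gold_per_hour", 100000);
  // 1000 updates push the counter past the threshold many times over.
  for (int i = 0; i < 1000; ++i) {
    monitor.Increase("gold_per_hour", 200);
  }
  // A single check still yields a single warning.
  assert(monitor.CheckOnce().size() == 1);
}
```

The trade-off is latency: a counter that crosses its threshold is reported only at the next check, so a shorter interval gives faster warnings at the cost of more log output.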
In addition to the custom counters you register, iFun Engine itself monitors the following counters:
event
event_queue_length
: the number of events in the queue.
object
outstanding_fetch_query
: the number of DB read operations in the queue.
outstanding_update_query
: the number of DB write operations in the queue.
Regarding the threshold and the interval, please see the Counter MANIFEST.json.
Counters¶
These are the predefined counter categories and the counters in each category:
process
: Information about the running iFun Engine process. It’s the same as Linux ps output.
vsz
cpu
nivcsw
nswap
oublock
minflt
idrss
isrss
ixrss
nsignals
majflt
maxrss
msgsnd
msgrcv
nvcsw
stime
updated
utime
inblock
refresh_interval
: counter’s refresh interval
os
: Information about the server OS.
procs
: The number of processors.
totalswap
: Total swap size in bytes.
freeswap
: Available swap size in bytes.
bufferram
: Buffer memory size in bytes.
load15
: Load average seen during the last 15 minutes.
load5
: Load average seen during the last 5 minutes.
load1
: Load average seen during the last 1 minute.
uptime
: Server up time in seconds.
cpus
: The number of CPU cores.
totalram
: Total RAM size in bytes.
freeram
: Free RAM size in bytes.
sharedram
: Shared memory size in bytes.
type
: OS type string.
updated
: Time stamp of the counter.
refresh_interval
: Counter’s refresh interval.
funapi
: Information about the iFun Engine.
concurrent_user
: The number of concurrent players. Concurrent players are computed via AccountManager. Please refer to Linking / unlinking the client and the iFun session (login / logout) for details.
sessions
: The number of sessions on the server.
object_database_stat
: Per-DB query processing stats. Please see (Advanced) Profiling The ORM for more information.
object
: The number of objects cached on the server and the number of DB read / write operations in the queue.
rpc_stat
: RPC-related stats.
zookeeper_stat
: ZooKeeper performance stats. Please refer to Zookeeper profiling.
event/performance/queue
: Reports the server’s event inflow, event throughput, the number of pending events, and so on.
event/profiling/summary
: Event profiling data. Please refer to Event profiling: summary for more information.
event/profiling/all
: Per-event profiling data. Please refer to Event profiling: details for more information.
event/profiling/reset
: Flag for resetting the event profiling data.
event/profiling/outstanding
: The number of outstanding events.
funapi_object_model
: The number of objects cached on the server, per type.
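Many names in the process category (maxrss, minflt, majflt, nvcsw, nivcsw, inblock, oublock) match the fields of the POSIX struct rusage. The documentation does not say how the engine collects them, so the sketch below only assumes a similar source and shows how such values can be read for the current process with getrusage(). ReadProcessCounters is a hypothetical helper name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <sys/resource.h>

// Hypothetical helper: reads a few resource-usage values for the current
// process. Returns an empty map if getrusage() fails.
std::map<std::string, long> ReadProcessCounters() {
  std::map<std::string, long> counters;
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) {
    return counters;
  }
  counters["maxrss"] = usage.ru_maxrss;    // Peak resident set size.
  counters["minflt"] = usage.ru_minflt;    // Page faults without I/O.
  counters["majflt"] = usage.ru_majflt;    // Page faults requiring I/O.
  counters["nvcsw"] = usage.ru_nvcsw;      // Voluntary context switches.
  counters["nivcsw"] = usage.ru_nivcsw;    // Involuntary context switches.
  counters["inblock"] = usage.ru_inblock;  // Block input operations.
  counters["oublock"] = usage.ru_oublock;  // Block output operations.
  return counters;
}

void ExampleProcessCounters() {
  std::map<std::string, long> counters = ReadProcessCounters();
  assert(counters.count("maxrss") == 1);
}
```

Comparing these local readings with the values returned by the process counters can help when interpreting the REST output, bearing in mind that the mapping is an assumption.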
MANIFEST.json¶
Counter relies on the API Service, so please see the API Service MANIFEST in addition to the parameters in this section.
Component: CounterService
Arguments
counter_monitoring_interval_in_sec
: Interval in seconds at which the monitored counters are checked. Please see Monitoring the Counter Values for details on counter monitoring. It defaults to 30 seconds.
warning_threshold_event_queue_length
: Threshold for the event_queue_length counter in the event category. It defaults to 3000.
warning_threshold_outstanding_fetch_query
: Threshold for the outstanding_fetch_query counter. It defaults to 5000.
warning_threshold_outstanding_update_query
: Threshold for the outstanding_update_query counter. It defaults to 5000.
warning_threshold_slow_query_in_sec
: Outputs a warning message if a query takes longer than the specified number of seconds during counter monitoring. (type=uint64, default=1)
warning_threshold_slow_distribution_in_sec
: Outputs a warning message if the processing time of the distribution layer (Redis/ZooKeeper) is longer than the specified number of seconds during counter monitoring. (type=uint64, default=3)
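The arguments and defaults listed above can be summarized in one place as a plain struct. This is only an illustrative mirror of the documented values, not a type from iFun Engine. The struct name, the helper function, and the uint64_t type of the first four fields are assumptions, since the documentation states the type only for the last two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the CounterService arguments with their documented
// default values.
struct CounterServiceArgs {
  uint64_t counter_monitoring_interval_in_sec = 30;
  uint64_t warning_threshold_event_queue_length = 3000;
  uint64_t warning_threshold_outstanding_fetch_query = 5000;
  uint64_t warning_threshold_outstanding_update_query = 5000;
  uint64_t warning_threshold_slow_query_in_sec = 1;
  uint64_t warning_threshold_slow_distribution_in_sec = 3;
};

// Hypothetical check: true when the event queue length is above the
// configured warning threshold.
bool ExceedsEventQueueThreshold(const CounterServiceArgs &args,
                                uint64_t event_queue_length) {
  return event_queue_length > args.warning_threshold_event_queue_length;
}

void ExampleCounterServiceArgs() {
  CounterServiceArgs args;  // Uses the documented defaults.
  assert(!ExceedsEventQueueThreshold(args, 3000));
  assert(ExceedsEventQueueThreshold(args, 3001));
}
```

With the defaults, a queue of exactly 3000 events is treated here as not exceeding the threshold. Whether the engine uses a strict comparison is not stated in the documentation.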
iFun Engine Dashboard¶
iFun Engine provides a dashboard that collects various performance metrics and visualizes the status of a running server, including possible performance bottlenecks. The dashboard is implemented on top of iFun Engine Counter.
In short, iFun Engine Dashboard provides the following functionality:
OS-level resource monitoring¶
This monitoring includes the number of player sessions and the number of players authenticated, as well as CPU and RAM usage.
Performance of iFun Engine’s event subsystem¶
The dashboard visualizes how many events are generated per second, how long it takes to handle them, and how much the queue has grown.
Performance of iFun Engine’s ORM subsystem¶
For each DB (or each shard DB instance when iFun Engine’s sharding feature is used), the dashboard visualizes read/write processing times, the number of read/write requests in the queue, and the number of data objects cached in memory.
Performance of iFun Engine’s distribution subsystem¶
The dashboard depicts RPC traffic between iFun Engine servers.