When a problem occurs in a MongoDB instance, the log file can prove to be a key resource in resolving it. This may seem obvious, but there is more to it. By default, MongoDB logs any query that takes longer than 100 ms (the default value of the slowms parameter). These queries are logged with severity level ‘I’ (informational) and are generally accompanied by the message “Slow query”.
In this article, we will explore the importance of monitoring and analysing these logs to ensure the proper functioning of your database.
Innovative Log Analysis Through Scripting
Version 4.4 introduced a significant change: all MongoDB log output is now written in JSON format.
This greatly simplifies log analysis, as each line can be processed as a structured object rather than as plain text (which could be truncated or difficult to manipulate).
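To give an idea of what this looks like in practice, here is a minimal Python sketch; the log line is an abridged, illustrative example, and the namespace is a placeholder:

```python
import json

# A structured log line as produced by MongoDB 4.4+ (fields abridged).
line = ('{"t":{"$date":"2024-01-15T10:00:00.000+00:00"},"s":"I","c":"COMMAND",'
        '"ctx":"conn12","msg":"Slow query","attr":{"type":"command",'
        '"ns":"shop.orders","durationMillis":245}}')

entry = json.loads(line)  # each line is a self-contained JSON document
if entry.get("msg") == "Slow query":
    attr = entry["attr"]
    print(entry["s"], entry["c"], attr["ns"], attr["durationMillis"], "ms")
```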
A SORINTian created a script that reads each line of a specific group of log files, converts the data (in JSON format) into objects, masks sensitive data (e.g. the literal values in a find query’s filter), and groups similar statements by namespace, operation, query pattern and execution plan, along the lines of the sketch below. In this way, the team has access to detailed information on the most performance-intensive operations and can act accordingly to optimise the most problematic queries.
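The script itself is not reproduced here, but the masking and grouping idea can be sketched as follows. Operator keys (such as $gt) are preserved while literal values are replaced with a placeholder, so that queries differing only in their constants collapse into the same pattern; mask and record are hypothetical names used for illustration:

```python
import json
from collections import defaultdict

def mask(value):
    # Keep the document structure (keys and operators) but replace every
    # literal value with "?", producing a reusable "query pattern".
    if isinstance(value, dict):
        return {k: mask(v) for k, v in value.items()}
    if isinstance(value, list):
        return [mask(v) for v in value]
    return "?"  # literal (string, number, date, ...) -> placeholder

groups = defaultdict(list)

def record(attr, operation="find"):
    # Group a slow-query entry by namespace, operation, masked filter
    # and execution plan; keep its duration for later summarisation.
    cmd = attr.get("command", {})
    pattern = json.dumps(mask(cmd.get("filter", {})), sort_keys=True)
    key = (attr.get("ns"), operation, pattern, attr.get("planSummary"))
    groups[key].append(attr.get("durationMillis", 0))
```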
This approach gives us a summary of the slowest queries, enabling us to focus on those with the greatest impact on instance performance. In particular, the script handles commands such as find, getMore, aggregate, count, distinct, insert, update, findAndModify, remove and createIndexes. For example, for the find command, the component field (‘c’) is COMMAND, the type attribute (‘attr.type’) is set to command, and ‘attr.command’ contains the ‘find’ key.
From here, important details can be extracted (see the sketch after this list), such as:
- namespace: the value of “attr.command.find” (the collection name; the full namespace is also available in “attr.ns”)
- filter: the value of “attr.command.filter”; the filter is masked and referred to as the “query pattern”
- sort: the value of “attr.command.sort”
- limit: the value of “attr.command.limit”
- execution plan: the value of “attr.planSummary”
- execution statistics: “attr.keysExamined”, “attr.docsExamined” and “attr.nreturned”
- execution time: “attr.durationMillis”
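Putting the list above together, an extraction helper might look like this; summarize_find is a hypothetical name, and the field locations follow the 4.4+ structured-log layout described above:

```python
def summarize_find(entry):
    # Extract the fields discussed above from one parsed "Slow query" entry.
    attr = entry["attr"]
    cmd = attr.get("command", {})
    return {
        "namespace": attr.get("ns"),            # e.g. "shop.orders"
        "filter": cmd.get("filter"),            # to be masked into a query pattern
        "sort": cmd.get("sort"),
        "limit": cmd.get("limit"),
        "plan": attr.get("planSummary"),        # e.g. "COLLSCAN" or "IXSCAN {...}"
        "keysExamined": attr.get("keysExamined"),
        "docsExamined": attr.get("docsExamined"),
        "nreturned": attr.get("nreturned"),
        "durationMillis": attr.get("durationMillis"),
    }
```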
What we have seen for the ‘find’ command extends to a wide range of other commands, and customised scripts can be developed to meet specific needs. An interesting aspect is that the log also provides detailed storage information, such as data read and write times, as well as the wait times associated with each instruction. This gives us a complete view of operations and further opportunities to optimise system performance.
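As a small illustration, assuming the storage details sit under “attr.storage” as in recent MongoDB versions, they can be pulled out in the same way:

```python
def storage_metrics(attr):
    # Storage details for a slow operation; the "storage" sub-document is
    # only present when the operation actually touched the storage engine
    # (field names assume the layout of recent MongoDB versions).
    data = attr.get("storage", {}).get("data", {})
    return {
        "bytesRead": data.get("bytesRead"),                  # bytes read from disk
        "timeReadingMicros": data.get("timeReadingMicros"),  # time spent reading
    }
```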
Ken Chen’s scripts, known as ‘keyhole’, are useful tools for evaluating the performance of a MongoDB cluster. Although these tools are popular for guiding development, it is important to know that all the necessary information is already available in the log file. In addition, it is possible to enable the profiler to examine slow queries. The data recorded in the ‘system.profile’ collection follows a format similar to that of the log messages, which means that, with a couple of parameters, a single script can be created to collect and analyse all the necessary information.
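For example, with PyMongo the profiler can be enabled and queried along these lines; the connection URI and database name are placeholders:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
db = client["shop"]                                # placeholder database

# Profiling level 1 records operations slower than `slowms` in system.profile.
db.command("profile", 1, slowms=100)

# Entries in system.profile mirror the slow-query log fields, so the same
# summarisation logic can be reused with minor parameter changes.
for op in db["system.profile"].find({"millis": {"$gt": 100}}).limit(5):
    print(op.get("ns"), op.get("planSummary"), op.get("millis"), "ms")
```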
Having a summary of ‘time-consuming’ instructions is fundamental to effective performance analysis. The real value of a periodic check of the log message file lies in the ability to act proactively, identifying and resolving potential problems before they become critical. This approach becomes even more crucial in organisations managing thousands of instances: there, it can really make a difference and transform the way performance challenges are addressed.
Credits
Danilo Pala – DB Expert