Human and structured logs
=========================
:date: 2021-05-08
:summary: Output both human readable and structured logs

As anyone who's heard of the 12 factor app knows, logs are best written to the
standard output of the process, leaving the environment to handle them
accordingly. A while back I worked at a company where we collected logs with the
EFK (Elasticsearch, Fluentd, Kibana) stack. We ingested quite a bit of logs
daily, and tracking down issues in Kibana using regular expressions became
unwieldy. Structured logging (read: JSON lines, where every line is a JSON
object) was the answer. It would allow us to more easily find all of the logs
for a single transaction or account, and our EFK stack could parse it out of the
box.
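
For illustration (the field names here are hypothetical), a structured log entry
is simply one JSON object per line, which the log pipeline can parse without any
regular expressions:

```python
import json

# A hypothetical structured log entry: one JSON object per line.
entry = {
    "timestamp": "2021-05-08 14:30.00",
    "level": "error",
    "logger": "billing",
    "event": "Failed to send message.",
    "transaction_id": "abc123",
    "uid": 42,
}

# Each line is independently parseable, so finding every entry for a
# transaction or account is a field query instead of a regex search.
line = json.dumps(entry)
print(line)
```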

There was a previous attempt at structured logs, before I joined the team, that
failed to be merged. It was too invasive, the diff was too big, it never worked
correctly, and it required too much work to keep carrying the diff forward;
eventually the branch was dropped.

Another complaint I heard from the developers about that previous attempt was
that it made reading the logs when developing locally (reading the logs from the
Docker container) or for failing CI builds more difficult. Indeed, it was not
human-readable.

Others at the company (with experience in log4j and enterprise software) wanted
to output all of the logs to two files with different formats. I saw this as a
mistake: it would make handling logs inside of containers more difficult, and it
would require dealing with log rotation.

Another goal for me was to be able to gradually convert components to the new
log handling, so we wouldn't end up, again, with a huge diff that never gets
merged.

I went with the `structlog <http://www.structlog.org/>`_ package, and after
delving into it, I created a module that only needs to be imported to take
effect. The module configures the log format (human or JSON) and the log level
according to environment variables, but it doesn't require any change to the
logging calls themselves.

The structlog package also comes with a great-looking console renderer for
terminals where control characters are valid. Here's how it looks:

.. image:: http://structlog.readthedocs.io/en/stable/_images/console_renderer.png
   :alt: Example rendered logs in a console

And here's the module:

.. code:: python

    import logging
    import logging.config
    import os

    import structlog
    import structlog.dev
    import structlog.processors
    import structlog.stdlib

    # A bare LogRecord, used to tell built-in attributes apart from extras.
    emptyLogRecord = logging.makeLogRecord({})


    def add_extra_params(logger, log_method, event_dict):
        # Copy attributes added via the `extra` parameter into the event dict.
        record = event_dict['_record']
        for (k, v) in list(record.__dict__.items()):
            if k != 'stack_info' and k not in emptyLogRecord.__dict__ and k not in event_dict:  # noqa: E501
                event_dict[k] = v
        return event_dict


    pre_chain = [
        structlog.processors.StackInfoRenderer(),
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.stdlib.add_log_level,
        structlog.stdlib.add_logger_name,
        structlog.processors.TimeStamper(fmt='%Y-%m-%d %H:%M.%S'),
        structlog.processors.format_exc_info,
        add_extra_params,
    ]
    processor = {
        'human': structlog.dev.ConsoleRenderer(),
        'json': structlog.processors.JSONRenderer(),
    }
    logFormat = os.environ.get('LOG_FORMAT', 'human')
    assert logFormat in processor
    try:
        logLevel = getattr(logging, os.environ.get('LOG_LEVEL', 'WARN').upper())
    except AttributeError:
        logLevel = logging.WARN  # The default log level.
    config = {
        'version': 1,
        'disable_existing_loggers': True,
        'formatters': {
            'json': {
                '()': structlog.stdlib.ProcessorFormatter,
                'processor': processor['json'],
                'foreign_pre_chain': pre_chain,
                'format': '%(message)s',
            },
            'human': {
                '()': structlog.stdlib.ProcessorFormatter,
                'processor': processor['human'],
                'foreign_pre_chain': pre_chain,
            },
        },
        'handlers': {
            'default': {
                'formatter': logFormat,
                'class': 'logging.StreamHandler',
            },
        },
        'loggers': {
            '': {
                'handlers': ['default'],
                'level': logLevel,
                'propagate': True,
            },
        },
    }
    logging.config.dictConfig(config)
    structlog.configure(
        processors=[processor[logFormat]],
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
    )

I called this module :code:`logging_config`. If a module imported it, logs would
be human readable by default, and if the :code:`LOG_FORMAT` environment variable
was set to :code:`json` (as it was in the staging and production environments,
where we had an EFK stack running and collecting logs), the output was in JSON
format.
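
A minimal sketch of what that looked like from a component's point of view (the
import is commented out here since the module itself isn't on this page's import
path; everything else is plain stdlib logging):

```python
import logging
import os

# In staging and production the environment would set these; when absent,
# the module falls back to human-readable output at WARN level.
os.environ.setdefault("LOG_FORMAT", "json")
os.environ.setdefault("LOG_LEVEL", "info")

# import logging_config  # importing the module is the only required step

# Components keep using the stdlib logging API, completely unchanged.
logger = logging.getLogger(__name__)
logger.warning("Cache miss rate is high.")
```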

This was the first step, and it could be merged without breaking existing
components. However, most logging calls still used formatted strings, which
didn't expose all of the data in a structured format. We had to go module by
module and change each call to have a single, static log message, passing all of
the variables in the :code:`extra` parameter. It meant changing calls that
looked like this:

.. code:: python

    logger.error(f"Failed to send message {message} to user {uid}.")

to something like this:

.. code:: python

    logger.error("Failed to send message.", extra={"sent_message": message, "uid": uid})

(Note that :code:`message`, :code:`asctime` and the built-in :code:`LogRecord`
attribute names are reserved and can't be used as keys in :code:`extra`, hence
:code:`sent_message` here.)
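
This works because the stdlib turns every key in :code:`extra` into an attribute
on the :code:`LogRecord`, which is exactly what the :code:`add_extra_params`
processor above harvests. A quick self-contained demonstration, using a
throwaway handler and no structlog at all:

```python
import logging

captured = []

class CaptureHandler(logging.Handler):
    """Keeps emitted records in a list so we can inspect them."""
    def emit(self, record):
        captured.append(record)

logger = logging.getLogger("demo")
logger.addHandler(CaptureHandler())
logger.error("Failed to send message.", extra={"uid": 42})

# The extra field rides along as a plain attribute on the record; since it
# isn't one of the default LogRecord attributes, add_extra_params would copy
# it into the structured event dict.
record = captured[0]
print(record.uid)  # -> 42
print(record.getMessage())  # -> Failed to send message.
```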

We could add fields to the output easily, deal with missing parameters more
easily, output the Flask request context more easily, and so on. We could do
this piecemeal, going through one module at a time, without having to maintain
an ever-growing diff. It kept the logs human readable where needed and easily
searchable in Kibana where available. We kept to the 12 factor app methodology
and its benefits, it required little change to our setup and deployment, and it
had noticeable benefits out of the gate, which made the Dev team pitch in with
converting modules to the new logging calls convention.